
Deploy Voyage AI Models Using Google Cloud Model Garden

You can explore and deploy Voyage AI by MongoDB models from the Google Cloud Model Garden.

Model Garden manages licenses for Voyage AI by MongoDB models and provides deployment options using either on-demand hardware or your existing Compute Engine reservations.

Voyage AI by MongoDB models are self-deployed partner models, meaning you pay for both the model usage and the Vertex AI infrastructure consumed. Vertex AI handles deployment and provides endpoint management features.

To see which models you can deploy, search for "Voyage" in the Google Cloud Model Garden.

To learn more about Voyage AI models, see Models Overview.

Pricing for Voyage AI by MongoDB models in the Google Cloud Model Garden includes:

  • Model Usage fee: A cost for using the Voyage AI model container, billed at an hourly rate. The usage fee depends on the specific model and the hardware configuration you choose for deployment. For detailed pricing information, see the pricing section on the model's listing page in the Google Cloud Marketplace.

  • Google Cloud underlying instance in your region: The cost of the underlying Google Cloud GPU instance (such as NVIDIA L4, A100, or H100), which is specific to a region, is billed monthly and priced per vCPU. To learn more, see Google Cloud Compute Engine pricing.

All billing charges appear as the use of Vertex AI on your Google Cloud bill.

To view pricing for a specific Voyage AI model:

1

Go to the Model Garden console.

2

Search for the model in the Model Garden search box.

3

Click the model you want to view to open its details page.

4

In the Overview tab, scroll to the Pricing section.

5

Click the link that leads to the model's Google Cloud Marketplace listing. The Pricing tab in the listing entry displays detailed pricing information.

When you deploy Voyage AI models, you consume Vertex AI resources that are subject to quotas. You can view and manage your quotas in the Quotas section of the Google Cloud console's IAM page. For more information, see View the quotas for your project. On the same page, you can right-click any current quota, click Edit quota, and submit a request to increase your quota if needed.

Before you get started using the Voyage AI by MongoDB models through Google Cloud Vertex AI, review the following requirements and recommendations.

Each Voyage AI model's page in the Google Cloud Model Garden for Vertex AI lists its recommended hardware configuration.

For example, the Vertex AI Model Garden suggests the following instances for deploying the voyage-4 model. Because these recommendations may change, consult the official Google Cloud Model Garden page for a particular Voyage AI model to confirm its current recommended hardware. The sketch after this list shows how these recommendations map to Vertex AI SDK deployment parameters.

  • A2 instances, such as a2-highgpu-1g or a2-ultragpu-1g, with A100 GPUs are the default choice.

  • A3 instances, such as a3-highgpu-1g, with H100 GPUs are recommended for higher performance needs.
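
For reference, the following is a minimal sketch of how the default A2 recommendation above maps to deployment parameters in the Vertex AI SDK for Python. It reuses the model.deploy() call shown later in Deploy Using Code; the machine type and accelerator name here are assumptions based on the A2 recommendation and may not apply to other models, so confirm them on the model's Model Garden page before deploying.

from vertexai import model_garden

# Initialize the voyage-4 open model from Model Garden.
model = model_garden.OpenModel("mongodb/voyage-4@latest")

# Deploy on the default A2 recommendation: a2-highgpu-1g with one A100 GPU.
# The accelerator name assumes the 40 GB A100 used by a2-highgpu-1g; confirm
# the current recommendation on the model's Model Garden page.
endpoint = model.deploy(
    machine_type="a2-highgpu-1g",
    accelerator_type="NVIDIA_TESLA_A100",
    accelerator_count=1,
    accept_eula=True,             # accept the MongoDB Marketplace End User Agreement
    use_dedicated_endpoint=True,  # Voyage AI models require a dedicated public endpoint
)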

The Model Garden lists supported regions for each Voyage AI model. If you need support in another region for any of the models, contact MongoDB support.

  • Endpoint Type: All Voyage AI models require a dedicated public endpoint type. For more information, see Choose an endpoint type.

  • Understand input_type: Query vs. Document: The input_type parameter optimizes embeddings for retrieval tasks. Use "query" for search queries and "document" for content being searched. This optimization improves retrieval accuracy. To learn more about the input_type parameter, see the Embedding and Reranking API Overview.

  • Use Different Output Dimensions: Voyage 4 models support multiple output dimensions: 256, 512, 1024 (default), and 2048. Smaller dimensions reduce storage and computation costs, while larger dimensions may provide better accuracy. Choose the dimension that best balances your accuracy requirements with resource constraints.
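
As an illustration, the following is a minimal sketch of two request bodies in the /embeddings format used in the prediction example later on this page: one for documents that you index and one for a search query. The input text and field values are placeholders, not required settings.

# Sketch of /embeddings request bodies. Field values are illustrative.
document_request = {
    "input": ["MongoDB Atlas supports vector search over embedded documents."],
    "input_type": "document",    # optimize embeddings for content being searched
    "output_dimension": 1024,    # default; Voyage 4 models also support 256, 512, and 2048
}

query_request = {
    "input": ["How do I run vector search in MongoDB Atlas?"],
    "input_type": "query",       # optimize embeddings for the search query side
    "output_dimension": 1024,    # must match the dimension used for your stored documents
}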

To find Voyage AI by MongoDB models in the Model Garden:

1

Go to the Model Garden console.

2

In the Search Models field, enter "Voyage" to display the list of Voyage AI by MongoDB models.

Note

The Google Cloud Marketplace has two search boxes: one for the entire Marketplace and one within the Vertex AI Model Garden site. To locate Voyage AI by MongoDB models, use the search box on the Vertex AI Model Garden site.

Alternatively, you can navigate to Voyage AI models through Model Garden > Model Collections > Partner Models, and then select any of the Voyage AI models listed there.

You can also scroll down to Task-specific solutions to find Voyage AI models that you can use as-is or customize to your needs.

To make predictions using a Voyage AI by MongoDB model, you must deploy it to a dedicated public endpoint for online inferences. Deployment associates physical resources with the model for low-latency, high-throughput online predictions. You can deploy multiple models to one endpoint, or the same model to multiple endpoints.

When you deploy a model, consider the following options:

  • Endpoint location

  • Model container

  • Compute resources required to run the model

Once you deploy a model, you can't change these settings. If you need to modify any deployment configuration, you must undeploy the model and redeploy it with the new settings.

Voyage AI models require a dedicated public endpoint. For more information, see Create a public endpoint in the Google Cloud Vertex AI documentation.

To deploy a model in Google Cloud Vertex AI using the console:

1

Go to the Model Garden console and search for "Voyage" in the Search Models field to display the list of Voyage AI by MongoDB models.

2

Click the model that you want to deploy to open its details page.

3

Click Enable. The MongoDB Marketplace End User Agreement opens. Review and accept the agreement to enable the model and get the necessary commercial use licenses.

4

After you accept the agreement, the model page displays the following options:

  • Deploy a model: Saves the model to the Model Registry and deploys it to an endpoint in Google Cloud. Continue with the following steps to deploy using the console.

  • Create an Open Notebook for Voyage Embedding Models Family: Lets you fine-tune and customize your model in a collaborative environment, and mix and match models for optimal cost and performance. See Vertex AI Notebook Samples for Voyage AI.

  • View Code: Displays code samples for deploying and using the model. To deploy programmatically using code, see Deploy Using Code.

5

Review the model's regions, hardware requirements, considerations, use cases, and pricing information.

6

Click the Deploy model button to start the deployment process.

7

A form opens that allows you to review and edit the deployment options. Vertex AI provides default settings that are optimized for the model, but you can customize them as needed. For example, you can select the machine type, GPU type, and number of replicas. The following example shows default settings for the voyage-4 model, but these may change, so review the settings carefully before deploying.

  • Resource ID: Select from the dropdown menu (preselected).

  • Model Name: Select from the dropdown menu (preselected).

  • Region: Select your desired region, such as us-central1.

  • Endpoint name: Provide a name for your endpoint, such as mongodb_voyage-4_latest-mg-one-click-deploy.

  • Serving spec: Select the machine type, such as g2-standard-4.

  • Accelerator type: Select the GPU type, such as NVIDIA_L4.

  • Accelerator count: Specify the number of GPUs, such as 1.

  • Replica count: Specify the minimum and maximum number of replicas, such as 1 - 1.

  • Reservation type: Select the reservation type, such as No reservation.

  • VM provisioning model: Select the provisioning model, such as Standard.

  • Endpoint access: Select Public (Dedicated endpoint).

8

Vertex AI applies optimized settings by default. To customize your settings, click Edit settings. For example, you can select a more powerful machine type or GPU.

9

The configuration screen shows the quotas you have available. If needed, use the Quotas link to manage your quotas.

10

Click Deploy to start the deployment process.

11

You receive a notification when deployment completes. You can then navigate to Google Cloud Vertex AI > Deploy > Endpoints to find your deployment.
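
If you prefer to verify the deployment programmatically, the following is a minimal sketch that lists the endpoints in a region with the Vertex AI SDK for Python; the project ID and region are placeholders for your own values.

from google.cloud import aiplatform

# Placeholders: replace with your project ID and the region you deployed to.
aiplatform.init(project="your-project-id", location="us-central1")

# List the endpoints in the region and print their display names and resource names.
for ep in aiplatform.Endpoint.list():
    print(ep.display_name, ep.resource_name)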

If you selected View Code from the model details page, you can deploy a model programmatically using the Vertex AI SDK. This approach provides full control over deployment configuration through code.

For more information about the Google Cloud Vertex AI SDK, see the Vertex AI SDK for Python documentation.

Note

The code examples in this section are for the voyage-4 model and are subject to change. For the most current code examples, consult the View Code tab on the model's page in the Model Garden. For other Voyage AI models, the code is similar, but check that model's page in the Model Garden for model-specific details.

To deploy a model using code:

1

First, initialize the model from Model Garden and view deployment options:

from vertexai import model_garden

# The Model Garden identifier for the voyage-4 model.
MODEL_NAME = "mongodb/voyage-4@latest"

# Initialize the open model and list its supported deployment configurations.
model = model_garden.OpenModel(MODEL_NAME)
deploy_options = model.list_deploy_options(concise=True)
print(deploy_options)
2

Choose whether to deploy a new model or use an existing endpoint:

# Choose whether to deploy a new model or use an existing endpoint:
deployment_option = "deploy_new"  # ["deploy_new", "use_existing"]

# If using an existing endpoint, provide the endpoint ID:
ENDPOINT_ID = ""  # {type:"string"}

if deployment_option == "deploy_new":
    print("Deploying new model...")
    endpoint = model.deploy(
        machine_type="a3-highgpu-1g",
        accelerator_type="NVIDIA_H100_80GB",
        accelerator_count=1,
        accept_eula=True,
        use_dedicated_endpoint=True,
    )
    print(f"Endpoint deployed: {endpoint.display_name}")
    print(f"Endpoint resource name: {endpoint.resource_name}")
else:
    if not ENDPOINT_ID:
        raise ValueError("Please provide an ENDPOINT_ID when using an existing endpoint")

    from google.cloud import aiplatform

    # PROJECT_ID and LOCATION must be defined earlier in your environment.
    print(f"Connecting to existing endpoint: {ENDPOINT_ID}")
    endpoint = aiplatform.Endpoint(
        endpoint_name=f"projects/{PROJECT_ID}/locations/{LOCATION}/endpoints/{ENDPOINT_ID}"
    )
    print(f"Using endpoint: {endpoint.display_name}")
    print(f"Endpoint resource name: {endpoint.resource_name}")

Important

Set use_dedicated_endpoint to True as Voyage AI models require a dedicated public endpoint.

Vertex AI deploys the model to a managed endpoint that you can access to make online inferences or batch inferences through the Google Cloud console or the Vertex AI API.

For more information, see Deploy a model to an endpoint in the Google Cloud Vertex AI documentation.

3

After deployment, you can make predictions using the Vertex AI endpoint.

For all endpoint parameters and prediction options, see the Embedding and Reranking API Overview.

import json

# Multiple texts to embed
texts = [
    "Machine learning enables computers to learn from data.",
    "Natural language processing helps computers understand human language.",
    "Computer vision allows machines to interpret visual information.",
    "Deep learning uses neural networks with multiple layers."
]

# Prepare the batch request and make the invoke call
body = {
    "input": texts,
    "output_dimension": 1024,
    "input_type": "document"
}
response = endpoint.invoke(
    request_path="/embeddings",
    body=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"}
)

# Extract embeddings
result = response.json()
embeddings = [item["embedding"] for item in result["data"]]
print(f"Number of texts embedded: {len(embeddings)}")
print(f"Embedding dimension: {len(embeddings[0])}")
print(f"\nFirst embedding (first 5 values): {embeddings[0][:5]}")
print(f"Second embedding (first 5 values): {embeddings[1][:5]}")

To remove a deployed model and its endpoint:

  1. Undeploy the model from the endpoint.

  2. Optionally delete the endpoint itself.

For detailed instructions, see Undeploy a model and delete the endpoint in the Google Cloud Vertex AI documentation.

Important

You can delete the endpoint only after all models have been undeployed from it. Undeploying models and deleting the endpoint stops all inference services and billing for that endpoint.
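
If you prefer to clean up programmatically, the following is a minimal sketch using the Vertex AI SDK for Python. It assumes endpoint is the aiplatform.Endpoint object from the deployment steps above.

# Undeploy every model from the endpoint, then delete the endpoint itself.
# Deleting the endpoint is optional; skip endpoint.delete() to keep it for later deployments.
endpoint.undeploy_all()
endpoint.delete()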
