The Embedding and Reranking API provides programmatic access to the latest Voyage AI embedding and reranking models through a RESTful interface. This page provides an overview of the API and its features.
For detailed information and parameters, see the API specification.
API Key Management
You use MongoDB Atlas to manage API keys for the Embedding and Reranking API. This includes creating and managing your model API keys across your organization and projects, monitoring usage, and configuring rate limits.
To learn more, see Model API Keys.
Note
The key is called a model API key to distinguish it from other API keys in Atlas. You use this key the same way as API keys from other model providers.
Authentication
All requests to the Embedding and Reranking API must include an
Authorization header with your model API key using the Bearer token format.
Authorization: Bearer VOYAGE_API_KEY
When you use a client SDK, you set the API key when constructing a client, and the SDK sends the header on your behalf with every request. When you integrate directly with the API, you must send this header yourself.
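If you integrate directly with the API, the header can be built as in the following minimal sketch. The helper names and the choice to read the key from an environment variable called VOYAGE_API_KEY are assumptions for illustration, not part of the API.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header in Bearer token format."""
    if not api_key:
        raise ValueError("a model API key is required")
    return {"Authorization": f"Bearer {api_key}"}


def auth_headers_from_env(var_name: str = "VOYAGE_API_KEY") -> dict[str, str]:
    # Assumption: the key is stored in an environment variable.
    # Any secret store works, as long as the key stays out of source code.
    return auth_headers(os.environ.get(var_name, ""))
```

Merge the returned dictionary into the headers of every request that you send.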
JSON
All entities are represented in JSON. The following rules and conventions apply:
- Content Type Request Header
  When you send JSON to the server with a POST request, specify the Content-Type: application/json header. Client SDKs handle this automatically.
- Invalid Requests
  If you send a request with invalid JSON, incorrect data types, or constraint violations (such as exceeding token limits or batch sizes), the server responds with a 400 status code and an error message describing the issue.
- Field Names for Fields with Numbers
  Fields that contain numeric values are named to disambiguate the unit being used. For example, total_tokens reports a token count and output_dimension specifies a number of embedding dimensions.
Rate Limits and Usage Tiers
The Embedding and Reranking API implements rate limiting to ensure fair usage and optimal performance. Rate limits are applied per API key and increase as you advance through usage tiers. They are measured in two dimensions:
- TPM (Tokens Per Minute): Maximum number of tokens processed per minute
- RPM (Requests Per Minute): Maximum number of API requests per minute
If you exceed the rate limit, the API returns a 429 (Rate Limit Exceeded)
HTTP status code.
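One common way for a client to react to a 429 response is to retry with exponential backoff. The following is a minimal sketch of that pattern, not a documented recommendation: `send` is a hypothetical callable that performs one request and returns a `(status_code, body)` pair, and the retry count and delays are arbitrary assumptions.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a status other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            # Success, or an error such as 400 that retrying will not fix.
            return status, body
        if attempt < max_attempts - 1:
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
    # Every attempt was rate limited; return the last 429 response.
    return status, body
```

Only 429 responses are retried here. A 400 response indicates a problem with the request itself, so the sketch returns it to the caller immediately.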
Free trial rate limits without a payment method are 3 RPM and 10K TPM. To qualify for higher rate limits, add a payment method to your account.
The rate limits for Usage Tier 1 are as follows:
| Model | Tokens Per Min (TPM) | Requests Per Min (RPM) |
|---|---|---|
| | 16,000,000 | 2,000 |
| | 8,000,000 | 2,000 |
| | 3,000,000 | 2,000 |
| | 3,000,000 | 2,000 |
| | 2,000,000 | 2,000 |
| | 4,000,000 | 2,000 |
| | 2,000,000 | 2,000 |
The rate limits for Usage Tier 2 are twice those of Usage Tier 1.
| Model | Tokens Per Min (TPM) | Requests Per Min (RPM) |
|---|---|---|
| | 32,000,000 | 4,000 |
| | 16,000,000 | 4,000 |
| | 6,000,000 | 4,000 |
| | 6,000,000 | 4,000 |
| | 4,000,000 | 4,000 |
| | 8,000,000 | 4,000 |
| | 4,000,000 | 4,000 |
The rate limits for Usage Tier 3 are three times those of Usage Tier 1.
| Model | Tokens Per Min (TPM) | Requests Per Min (RPM) |
|---|---|---|
| | 48,000,000 | 6,000 |
| | 24,000,000 | 6,000 |
| | 9,000,000 | 6,000 |
| | 9,000,000 | 6,000 |
| | 6,000,000 | 6,000 |
| | 12,000,000 | 6,000 |
| | 6,000,000 | 6,000 |
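The tier multipliers stated above can be checked with a few lines of arithmetic. This sketch only encodes the relationships given on this page (Tier 2 is twice Tier 1, Tier 3 is three times Tier 1); the function name is hypothetical.

```python
# Multipliers relative to Usage Tier 1, as stated on this page.
TIER_MULTIPLIER = {1: 1, 2: 2, 3: 3}


def tier_limits(tier1_tpm: int, tier1_rpm: int, tier: int) -> tuple[int, int]:
    """Scale a model's Tier 1 limits to the given usage tier."""
    if tier not in TIER_MULTIPLIER:
        raise ValueError(f"unknown usage tier: {tier}")
    factor = TIER_MULTIPLIER[tier]
    return tier1_tpm * factor, tier1_rpm * factor


# First table row: 16,000,000 TPM and 2,000 RPM at Tier 1.
print(tier_limits(16_000_000, 2_000, 3))  # (48000000, 6000)
```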
To learn more about usage tiers, see Usage Tiers.
To set custom rate limits for your organization, use the Atlas UI. To learn more, see Manage Rate Limits.
Making Requests
The following example demonstrates how you can use cURL to make a request
to the embedding service. You can also use an HTTP client in any
programming language to access the API.
For additional usage examples, see the following resources:
- Accessing Voyage AI Models for HTTP request and client SDK examples.
- Model pages for model-specific usage.
- API specification for full details on all API endpoints.
curl \
  --request POST 'https://ai.mongodb.com/v1/embeddings' \
  --header "Authorization: Bearer $VOYAGE_API_KEY" \
  --header "Content-Type: application/json" \
  --data '{
    "input": [
      "MongoDB is redefining what a database is in the AI era.",
      "Voyage AI embedding and reranking models are state-of-the-art."
    ],
    "model": "voyage-4-large"
  }'
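The same request can be made from any HTTP client. As one possible illustration, the sketch below rebuilds the cURL call with Python's standard library only. The function names are hypothetical, and the response is returned as parsed JSON without assuming its exact shape; see the API specification for the response fields.

```python
import json
import urllib.request

API_URL = "https://ai.mongodb.com/v1/embeddings"


def build_embeddings_request(
    texts: list[str], model: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the cURL example."""
    body = json.dumps({"input": texts, "model": model}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the parsed JSON response."""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

To run it, pass your model API key and call `send(build_embeddings_request([...], "voyage-4-large", api_key))`.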
Errors
To learn more about errors returned by the API, see the API specification.
Best Practices
Consider the following best practices when you use the API:
Specifying Input Type
For semantic search and retrieval tasks, set the input_type to
query or document to optimize how Voyage AI models create the
vectors. Do not omit this parameter.
Depending on its value, the parameter prepends one of the following prompts to your input before generating embeddings:
- query: "Represent the query for retrieving supporting documents: "
- document: "Represent the document for retrieval: "
Example
input_type="query" transforms "When is Apple's
conference call scheduled?" into "Represent the query for retrieving
supporting documents: When is Apple's conference call scheduled?"
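The following sketch shows how a request body might carry input_type, and reproduces the prompt transformation described above purely for illustration. The helper names are hypothetical. The `apply_prompt` function only mirrors what this page says the parameter does; when you set input_type, do not prepend the prompt yourself.

```python
# Prompts as listed on this page for each input_type value.
PROMPTS = {
    "query": "Represent the query for retrieving supporting documents: ",
    "document": "Represent the document for retrieval: ",
}


def embeddings_payload(texts: list[str], model: str, input_type: str) -> dict:
    """Build a request body that sets input_type explicitly."""
    if input_type not in PROMPTS:
        raise ValueError("input_type must be 'query' or 'document'")
    return {"input": texts, "model": model, "input_type": input_type}


def apply_prompt(text: str, input_type: str) -> str:
    """Illustration only: show the text that results from the prepended prompt."""
    return PROMPTS[input_type] + text


question = "When is Apple's conference call scheduled?"
print(apply_prompt(question, "query"))
print(embeddings_payload([question], "voyage-4-large", "query"))
```

Use query for the search text and document for the content being indexed, so that both sides of a retrieval task are embedded with the matching prompt.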
Troubleshooting
If you're using the Python client, you must use version 0.3.7 or later. To check the version of your Python client installation, run the following command in your terminal:
python -c "import voyageai; print(voyageai.__version__)"