Voyage AI provides state-of-the-art embedding and reranking models. MongoDB's Embedding and Reranking API provides access to the latest Voyage AI models. This page describes the available models and when to use them.
For text embeddings, we recommend:
- voyage-4-large for the best quality
- voyage-4-lite for the lowest latency and cost
- voyage-4 for a balance between quality and performance
- A domain-specific model if your application is in one of the listed domains
For other use cases, we recommend:
- voyage-multimodal-3.5 for text, image, and video embeddings
- voyage-context-3 for chunk-level and document-level retrieval tasks
- rerank-2.5 for adding reranking to most applications
- rerank-2.5-lite for adding reranking to latency-sensitive applications
Text Embeddings
Voyage AI provides the following text embedding models to capture the semantic meaning of text.
For details and example usage, see Text Embeddings.
General-Purpose Models
Use the following models for most AI search and retrieval applications.
| Model | Context Length | Dimensions | Description |
|---|---|---|---|
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | The best general-purpose and multilingual retrieval quality. All embeddings created with the 4 series are compatible with each other. To learn more, see the blog post. |
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | Optimized for general-purpose and multilingual retrieval quality. All embeddings created with the 4 series are compatible with each other. To learn more, see the blog post. |
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | Optimized for latency and cost. All embeddings created with the 4 series are compatible with each other. To learn more, see the blog post. |
Domain-Specific Models
Use the following models for specialized domains to achieve better accuracy.
| Model | Context Length | Dimensions | Description |
|---|---|---|---|
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | Optimized for code retrieval and documentation. To learn more, see the blog post. |
| | 32,000 tokens | 1024 | Optimized for finance retrieval and RAG applications. To learn more, see the blog post. |
| | 16,000 tokens | 1024 | Optimized for legal retrieval and RAG applications. To learn more, see the blog post. |
Open Models
Voyage also provides the following open-weight models.
| Model | Context Length | Dimensions | Description |
|---|---|---|---|
| | 32,000 tokens | 512 (default), 128, 256 | Open-weight model available on Hugging Face. All embeddings created with the 4 series are compatible with each other. To learn more, see the blog post. |
The latest models perform better than the legacy models in all aspects, such as quality, context length, latency, and throughput.
| Model | Context Length | Dimensions | Description |
|---|---|---|---|
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | Previous generation of text embeddings for general-purpose and multilingual retrieval quality. To learn more, see the blog post. |
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | Previous generation of text embeddings optimized for general-purpose and multilingual retrieval quality. To learn more, see the blog post. |
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | Previous generation of text embeddings optimized for latency and cost. To learn more, see the blog post. |
| | 16,000 tokens | 1536 | Optimized for code retrieval (17% better than alternatives). Previous generation of code embeddings. To learn more, see the blog post. |
Contextualized Chunk Embeddings
Voyage AI provides the following models that generate embeddings while incorporating surrounding context for improved retrieval accuracy.
For details and example usage, see Contextualized Chunk Embeddings.
| Model | Context Length | Dimensions | Description |
|---|---|---|---|
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | Contextualized chunk embeddings optimized for general-purpose and multilingual retrieval quality. To learn more, see the blog post. |
Multimodal Embeddings
Voyage AI provides the following embedding models that process text, images, and video.
For details and example usage, see Multimodal Embeddings.
| Model | Context Length | Dimensions | Description |
|---|---|---|---|
| | 32,000 tokens | 1024 (default), 256, 512, 2048 | Rich multimodal embedding model that can vectorize interleaved text and visual data, such as screenshots of PDFs, slides, tables, figures, videos, and more. To learn more, see the blog post. |
The latest models perform better than the legacy models in all aspects, such as quality, context length, latency, and throughput.
| Model | Context Length | Dimensions | Description |
|---|---|---|---|
| | 32,000 tokens | 1024 | Processes text and images into unified embeddings. Supports images from 50,000 to 2 million pixels. To learn more, see the blog post. |
Rerankers
Voyage AI provides the following reranking models to refine your search results.
For details and example usage, see Rerankers.
| Model | Context Length | Description |
|---|---|---|
| | 32,000 tokens | Highest accuracy. Recommended for most applications. To learn more, see the blog post. |
| | 32,000 tokens | Fast and cost-effective model optimized for latency-sensitive applications. To learn more, see the blog post. |
The latest models perform better than the legacy models in all aspects, such as quality, context length, latency, and throughput.
| Model | Context Length | Description |
|---|---|---|
| | 16,000 tokens | Our generalist second-generation reranker optimized for quality with multilingual support. To learn more, see the blog post. |
| | 8,000 tokens | Our generalist second-generation reranker optimized for both latency and quality with multilingual support. To learn more, see the blog post. |
Pricing
Model pricing is usage-based, with charges billed to the Atlas account linked to the API key used for access. All models include a free tier. Get started with 200 million free tokens for most models, or 50 million tokens for specialized models.
Pricing is based on the number of tokens in your documents and queries.
The free tier includes 200 million tokens for most models, and 50 million tokens for the following specialized models: voyage-finance-2, voyage-law-2, and voyage-code-2.
| Model | Price per 1K tokens | Price per 1M tokens | Free tokens |
|---|---|---|---|
| | $0.00012 | $0.12 | 200 million |
| | $0.00006 | $0.06 | 200 million |
| | $0.00002 | $0.02 | 200 million |
| | $0.00018 | $0.18 | 200 million |
| | $0.00018 | $0.18 | 200 million |
| voyage-finance-2, voyage-law-2, voyage-code-2 | $0.00012 | $0.12 | 50 million |
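The per-token pricing above can be turned into a rough cost estimate. The following is a minimal sketch, not an official calculator: the function name `embedding_cost` is hypothetical, and it assumes that free-tier tokens are consumed first and that only tokens beyond the allowance are billed, which this page implies but does not state explicitly.

```python
def embedding_cost(total_tokens: int,
                   price_per_million: float,
                   free_tokens: int = 200_000_000) -> float:
    """Estimate the charge in USD for embedding `total_tokens` tokens.

    Assumption (not confirmed by the page): the free allowance is
    subtracted before any tokens are billed.
    """
    billable = max(0, total_tokens - free_tokens)
    return billable * price_per_million / 1_000_000


# 250 million tokens at $0.12 per 1M tokens with the 200M free tier:
# only the 50 million tokens beyond the allowance are billed.
general_cost = embedding_cost(250_000_000, 0.12)

# The specialized models listed in the table have a 50M free tier.
specialized_cost = embedding_cost(60_000_000, 0.12, free_tokens=50_000_000)

print(f"general: ${general_cost:.2f}, specialized: ${specialized_cost:.2f}")
```

Passing the price and the free allowance as parameters keeps the sketch independent of any particular model name.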
Pricing is based on the number of tokens in your documents and queries.
| Model | Price per 1K tokens | Price per 1M tokens | Free tokens |
|---|---|---|---|
| | $0.00018 | $0.18 | 200 million |
Pricing is based on text tokens and image pixels. The free tier includes 200 million text tokens and 150 billion pixels for multimodal models. Images are processed between 50,000 pixels (minimum) and 2 million pixels (maximum), with costs ranging from $0.00003 to $0.0012 per image. For pricing purposes, each video frame is considered an image.
Note
Images with fewer than 50,000 pixels are upscaled, processed, and charged as a 50,000-pixel image. Images containing over 2 million pixels are downsampled and charged as 2 million-pixel images.
| Model | Price per 1M tokens | Price per 1B pixels | Free tier |
|---|---|---|---|
| | $0.12 | $0.60 | 200M tokens, 150B pixels |
| Image resolution | Number of pixels | Price per image | Price per 1K images |
|---|---|---|---|
| 200px × 200px | 40,000 | $0.00003 | $0.03 |
| 1000px × 1000px | 1 million | $0.0006 | $0.60 |
| 2000px × 2000px | 4 million | $0.0012 | $1.20 |
| 4000px × 4000px | 16 million | $0.0012 | $1.20 |
Example
The cost to vectorize a single input with 1,000 text tokens ($0.00012) and two 4 million-pixel images (2 × $0.0012) would be $0.00252.
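The pixel limits and the worked example above can be sketched in code. This is an illustrative estimate only: the function and constant names are hypothetical, the prices are copied from the tables above, and the free tier is ignored.

```python
MIN_PIXELS = 50_000          # smaller images are charged as 50,000 pixels
MAX_PIXELS = 2_000_000       # larger images are charged as 2 million pixels
PRICE_PER_M_TOKENS = 0.12    # USD per 1M text tokens (from the table above)
PRICE_PER_B_PIXELS = 0.60    # USD per 1B pixels (from the table above)


def billed_pixels(width: int, height: int) -> int:
    """Clamp an image's pixel count to the billable range."""
    return min(max(width * height, MIN_PIXELS), MAX_PIXELS)


def multimodal_cost(text_tokens: int, image_sizes: list[tuple[int, int]]) -> float:
    """Estimate the USD cost of one multimodal input, ignoring the free tier."""
    token_cost = text_tokens * PRICE_PER_M_TOKENS / 1_000_000
    pixels = sum(billed_pixels(w, h) for w, h in image_sizes)
    pixel_cost = pixels * PRICE_PER_B_PIXELS / 1_000_000_000
    return token_cost + pixel_cost


# The example above: 1,000 text tokens plus two 2000px × 2000px images.
cost = multimodal_cost(1_000, [(2000, 2000), (2000, 2000)])
print(f"${cost:.5f}")
```

Because each video frame is priced as an image, a video could be estimated the same way by passing one `(width, height)` pair per frame.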
Pricing is based on total processed tokens, calculated as
(query tokens × number of documents) + sum of tokens in all documents.
The free tier includes 200 million tokens for the latest reranker models.
| Model | Price per 1K tokens | Price per 1M tokens | Est. price per request* | Free tokens |
|---|---|---|---|---|
| | $0.00005 | $0.05 | $0.0025 | 200 million |
| | $0.00002 | $0.02 | $0.001 | 200 million |
* Estimated price assumes 100 documents per request, with the sum of query tokens and tokens per document totaling 500.
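The reranker token formula and the per-request estimate can be checked with a short calculation. This is a sketch with hypothetical function names; the split of 100 query tokens and 400 tokens per document is an assumption chosen only so that the two sum to 500, as in the footnote.

```python
def rerank_tokens(query_tokens: int, doc_token_counts: list[int]) -> int:
    """Total processed tokens:
    (query tokens × number of documents) + sum of tokens in all documents."""
    return query_tokens * len(doc_token_counts) + sum(doc_token_counts)


def rerank_cost(query_tokens: int,
                doc_token_counts: list[int],
                price_per_million: float) -> float:
    """Estimate the USD cost of one rerank request, ignoring the free tier."""
    total = rerank_tokens(query_tokens, doc_token_counts)
    return total * price_per_million / 1_000_000


# Footnote scenario: 100 documents, query + document tokens totaling 500.
# The 100/400 split is an illustrative assumption.
docs = [400] * 100
total_tokens = rerank_tokens(100, docs)
print(total_tokens, rerank_cost(100, docs, 0.05), rerank_cost(100, docs, 0.02))
```

Note that the query is counted once per document, so the cost grows with both the query length and the number of candidates sent for reranking.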