Contextualized Chunk Embeddings

voyage-context-4 is a contextualized chunk embedding model: it produces a vector for each chunk that captures the context of the full document, with no manual metadata or context augmentation required. This yields higher retrieval accuracy than standard chunk embeddings, whether or not those embeddings are augmented with context. It is also simpler, faster, and cheaper than approaches that rely on such augmentation. The model serves as a drop-in replacement for standard embeddings, requires no downstream workflow changes, and reduces sensitivity to the chunking strategy.

To learn more, see the blog post.

Available Models

Model	Context Length	Dimensions	Description
In preview: `voyage-context-4`	120,000 tokens *	1024 (default), 256, 512, 2048	Contextualized chunk embeddings optimized for general-purpose and multilingual retrieval quality.
`voyage-context-3`	120,000 tokens *	1024 (default), 256, 512, 2048	Contextualized chunk embeddings optimized for general-purpose and multilingual retrieval quality. To learn more, see the blog post.
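
As a rough illustration of how the dimension options in the table might be handled in client code, the sketch below validates a requested dimension against the listed values and estimates raw vector storage. The helper names are hypothetical, and the storage figure assumes uncompressed 32-bit floats, which this page does not specify.

```python
# Values copied from the "Dimensions" column of the table.
SUPPORTED_DIMENSIONS = (256, 512, 1024, 2048)
DEFAULT_DIMENSION = 1024


def resolve_dimension(requested=None):
    """Return the dimension to request, falling back to the documented default."""
    if requested is None:
        return DEFAULT_DIMENSION
    if requested not in SUPPORTED_DIMENSIONS:
        raise ValueError(
            f"unsupported dimension {requested}; choose one of {SUPPORTED_DIMENSIONS}"
        )
    return requested


def float32_storage_bytes(num_chunks, dimension):
    """Raw bytes for num_chunks vectors, ASSUMING 4-byte floats and no index overhead."""
    return num_chunks * dimension * 4
```

Smaller dimensions shrink storage proportionally under this assumption; for example, moving from 1024 to 256 dimensions cuts the raw vector size to a quarter.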

Note

* The total number of tokens across all inputs must not exceed 120K if `enable_auto_chunk` is `true`; otherwise, it must not exceed 32K.
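
The limit in the note can be checked before sending a request. The following is a minimal sketch, not an official helper: the function names are hypothetical, token counts are assumed to come from whatever tokenizer you already use, and "32K" is read here as 32,000 tokens by analogy with the 120,000 figure in the table.

```python
AUTO_CHUNK_TOKEN_LIMIT = 120_000    # total tokens when enable_auto_chunk is true
MANUAL_CHUNK_TOKEN_LIMIT = 32_000   # total tokens otherwise (ASSUMED reading of "32K")


def token_limit(enable_auto_chunk):
    """Return the total-token limit that applies to a single request."""
    return AUTO_CHUNK_TOKEN_LIMIT if enable_auto_chunk else MANUAL_CHUNK_TOKEN_LIMIT


def fits_token_budget(token_counts, enable_auto_chunk):
    """Check a request against the limit.

    token_counts holds one list per document, each containing the
    token count of every chunk (or input) belonging to that document.
    """
    total = sum(sum(doc_counts) for doc_counts in token_counts)
    return total <= token_limit(enable_auto_chunk)
```

A request that fails this check would need to be split into several smaller requests, keeping each document's chunks together.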

Tutorial

For a tutorial on using contextualized chunk embeddings, see Semantic Search with Voyage AI Embeddings.

Usage
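
The sketch below shows one way a request might be prepared and sent from Python. It rests on assumptions that this page does not confirm: that the `voyageai` client exposes a `contextualized_embed` method accepting `inputs`, `model`, `input_type`, and `output_dimension`; that `inputs` is a list with one inner list of ordered chunks per document; and that the response carries a `results` list whose items have an `embeddings` attribute. The helper names are hypothetical.

```python
def group_chunks_by_document(chunks):
    """Group (doc_id, chunk_text) pairs into one ordered chunk list per document.

    Returns (doc_ids, inputs), where inputs[i] holds the chunks of doc_ids[i]
    in the order they were seen.
    """
    grouped = {}
    for doc_id, text in chunks:
        grouped.setdefault(doc_id, []).append(text)
    return list(grouped.keys()), list(grouped.values())


def embed_chunks(client, inputs, model, output_dimension=None):
    """Embed every chunk and return one list of vectors per document.

    ASSUMPTION: client behaves like voyageai.Client and provides
    contextualized_embed() with the keyword arguments used here.
    """
    response = client.contextualized_embed(
        inputs=inputs,
        model=model,
        input_type="document",
        output_dimension=output_dimension,
    )
    return [result.embeddings for result in response.results]
```

With the `voyageai` package installed and an API key configured, `client` would typically be `voyageai.Client()` and `model` one of the names in the table, such as `"voyage-context-3"`. Check the current client reference before relying on the parameter names used here.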
