Now GA: voyage-context-4

July 1, 2026

What it is: voyage-context-4 is the next-generation contextualized chunk embedding model and a drop-in replacement for voyage-context-3. It produces one vector per chunk that captures full document context, and it adds a new mixture-of-experts backbone, built-in auto-chunking, transparent handling of documents beyond the 32K-token window, and native overlapping-chunk support. It outperforms voyage-context-3 by 1.40% (document-level) and 2.08% (chunk-level) across 39 datasets and is priced at $0.12 per 1M tokens, down from $0.18 (a one-third reduction).

Who it's for: Developers and retrieval teams building semantic search, RAG, and agentic applications, especially those working with long documents and chunk-level retrieval. It's ideal for customers who want maximal retrieval accuracy without manually tuning their embedding pipeline.

Why it matters: By capturing full document context in every chunk embedding, voyage-context-4 improves retrieval quality across nearly every domain while removing chunking as a design concern, with no extra LLM calls or preprocessing logic. As a drop-in replacement priced below voyage-context-3, it raises accuracy and lowers cost at the same time.

How to get started: voyage-context-4 is available now through the Voyage AI API and the MongoDB Atlas Embedding and Reranking API. To use it, swap in the model name voyage-context-4 in an existing voyage-context-3 integration, or pass a full document with enable_auto_chunking=True. New users get 200 million free tokens.
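
The two options above can be sketched as a single request builder. This is a minimal sketch, not the official client: it assumes the REST endpoint and the "inputs" / "model" / "input_type" fields used for voyage-context-3 also apply to voyage-context-4, and it assumes enable_auto_chunking is sent as a top-level request field with the full document as a single-element input list. The helper names build_payload and embed are hypothetical; check the documentation for the exact request shape.

```python
import json
import urllib.request

# Assumption: same contextualized embeddings endpoint as voyage-context-3.
API_URL = "https://api.voyageai.com/v1/contextualizedembeddings"


def build_payload(documents, model="voyage-context-4", auto_chunk=False):
    """Build a request body for contextualized chunk embeddings.

    Without auto-chunking, `documents` is a list of documents, each given
    as a list of pre-split chunk strings. With auto_chunk=True, `documents`
    is a list of full document strings.
    """
    if auto_chunk:
        # Assumed shape: each full document is sent as a one-element list.
        inputs = [[doc] for doc in documents]
    else:
        inputs = [list(chunks) for chunks in documents]

    payload = {"inputs": inputs, "model": model, "input_type": "document"}
    if auto_chunk:
        # Parameter name comes from the announcement; its placement as a
        # top-level request field is an assumption.
        payload["enable_auto_chunking"] = True
    return payload


def embed(payload, api_key):
    """POST the payload and return the decoded JSON response."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For an existing voyage-context-3 pipeline, build_payload([["chunk one", "chunk two"]]) changes only the model name. For a new pipeline, build_payload(["full document text"], auto_chunk=True) leaves the chunking to the model. Either payload can then be passed to embed along with an API key.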

Related Content

Blog

voyage-context-4 Blog Post

Docs

Documentation