Introducing voyage-3.5 and voyage-3.5-lite: Improved Quality for a New Retrieval Frontier

MongoDB

TL;DR – We’re excited to introduce voyage-3.5 and voyage-3.5-lite, the latest generation of our embedding models. These models offer improved retrieval quality over voyage-3 and voyage-3-lite at the same price, setting a new frontier for price-performance. Both models support embeddings in 2048, 1024, 512, and 256 dimensions, with multiple quantization options enabled by Matryoshka learning and quantization-aware training. voyage-3.5 and voyage-3.5-lite outperform OpenAI-v3-large by 8.26% and 6.34%, respectively, on average across evaluated domains, with 2.2x and 6.5x lower respective costs and a 1.5x smaller embedding dimension. Compared with OpenAI-v3-large (float, 3072), voyage-3.5 (int8, 2048) and voyage-3.5-lite (int8, 2048) reduce vector database costs by 83%, while achieving higher retrieval quality.

Note to readers: voyage-3.5 and voyage-3.5-lite are currently available through the Voyage AI APIs directly or through the private preview of automated text embedding in Atlas Vector Search. For access, sign up for Voyage AI or register your interest in the Atlas Vector Search private preview.

We’re excited to introduce voyage-3.5 and voyage-3.5-lite, which maintain the same sizes as their predecessors—voyage-3 and voyage-3-lite—but offer improved quality for a new retrieval frontier.

As we see in the figure below, voyage-3.5 improves retrieval quality over voyage-3 by 2.66%, and voyage-3.5-lite improves over voyage-3-lite by 4.28%—both maintaining a 32K context length and their respective price points of $0.06 and $0.02 per 1M tokens.

Figure 1. voyage-3.5 and voyage-3.5-lite achieve a new frontier for price-performance.

voyage-3.5 and voyage-3.5-lite also outperform OpenAI-v3-large by 8.26% and 6.34%, respectively, with voyage-3.5 also outperforming Cohere-v4 by 1.63%. voyage-3.5-lite achieves retrieval quality within 0.3% of Cohere-v4 at 1/6 the cost. Both models advance the cost-performance ratio of embedding models to a new state-of-the-art through an improved mixture of training data, distillation from voyage-3-large, and the use of Voyage AI rerankers.

Matryoshka embeddings and quantization: voyage-3.5 and voyage-3.5-lite support 2048, 1024, 512, and 256-dimensional embeddings enabled by Matryoshka learning and multiple embedding quantization options—including 32-bit floating point, signed and unsigned 8-bit integer, and binary precision—while minimizing quality loss. Compared with OpenAI-v3-large (float, 3072), voyage-3.5 and voyage-3.5-lite (both int8, 2048) reduce vector database costs by 83%, while achieving outperformance of 8.25% and 6.35% respectively. Further, comparing OpenAI-v3-large (float, 3072) with voyage-3.5 and voyage-3.5-lite (both binary, 1024), vector database costs are reduced by 99%, with outperformance of 3.63% and 1.29% respectively.

Figure 2. voyage-3.5 and voyage-3.5-lite offer industry-leading performance.
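The dimension and precision trade-offs above can be illustrated with a minimal sketch (our own illustration, not Voyage's actual training or quantization scheme): a Matryoshka-trained embedding can be truncated to a prefix of its dimensions and renormalized, and each float coordinate can be mapped to a compact int8 or binary code.

```python
import math

def truncate_matryoshka(vec, dim):
    """Keep the first `dim` coordinates and re-normalize to unit length.

    Matryoshka-trained embeddings are designed so that prefixes of the
    full vector remain useful embeddings on their own."""
    prefix = vec[:dim]
    norm = math.sqrt(sum(x * x for x in prefix)) or 1.0
    return [x / norm for x in prefix]

def quantize_int8(vec):
    """Map floats in [-1, 1] to signed 8-bit integers (illustrative scaling)."""
    return [max(-128, min(127, round(x * 127))) for x in vec]

def quantize_binary(vec):
    """Keep only the sign of each coordinate (1 bit per dimension)."""
    return [1 if x > 0 else 0 for x in vec]

full = [0.5, -0.25, 0.25, -0.75]      # stand-in for a 2048-d embedding
short = truncate_matryoshka(full, 2)  # 2-d prefix, renormalized
print(short)
print(quantize_int8(short))
print(quantize_binary(full))          # [1, 0, 1, 0]
```

The storage arithmetic behind the figures above: an int8 vector at 2,048 dimensions takes 2,048 bytes, versus 12,288 bytes for float32 at 3,072 dimensions, which is where the roughly 83% reduction in vector database costs comes from.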

Evaluation details

Datasets: We evaluate on 100 datasets spanning eight domains: technical documentation, code, law, finance, web reviews, multilingual, long documents, and conversations. Each dataset consists of a corpus (e.g., technical documentation, court opinions) and queries (e.g., questions, summaries). The following table lists the datasets in the eight categories, except multilingual, which includes 62 datasets covering 26 languages. A list of all evaluation datasets is available in this spreadsheet.

- TECH (technical documentation): Cohere, 5G, OneSignal, LangChain, PyTorch
- CODE (code snippets, docstrings): LeetCodeCPP-rtl, LeetCodeJava-rtl, LeetCodePython-rtl, HumanEval-rtl, MBPP-rtl, DS1000-referenceonly-rtl, DS1000-rtl, APPS-rtl
- LAW (cases, court opinions, statutes, patents): LeCaRDv2, LegalQuAD, LegalSummarization, AILA casedocs, AILA statutes
- FINANCE (SEC filings, finance QA): RAG benchmark (Apple-10K-2022), FinanceBench, TAT-QA-rtl, Finance Alpaca, FiQA-Personal-Finance-rtl, Stock News Sentiment, ConvFinQA-rtl, FinQA-rtl, HC3 Finance
- WEB (reviews, forum posts, policy pages): Huffpostsports, Huffpostscience, Doordash, Health4CA
- LONG-CONTEXT (long documents on assorted topics: government reports, academic papers, and dialogues): NarrativeQA, QMSum, SummScreenFD, WikimQA
- CONVERSATION (meeting transcripts, dialogues): Dialog Sum, QA Conv, HQA

Models: We evaluate voyage-3.5 and voyage-3.5-lite alongside several alternatives: OpenAI-v3-small (text-embedding-3-small), OpenAI-v3-large (text-embedding-3-large), Cohere-v4 (embed-v4.0), voyage-3-large, voyage-3, and voyage-3-lite.

Metrics: Given a query, we retrieve the top 10 documents based on cosine similarity and report normalized discounted cumulative gain (NDCG@10), a standard metric for retrieval quality and a variant of recall.
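As a concrete illustration of the metric (our own sketch, using binary relevance labels for simplicity), here is a minimal NDCG@10 computation over a cosine-similarity ranking:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k with binary relevance labels (1 = relevant, 0 = not).

    DCG discounts each relevant hit by log2 of its rank; NDCG divides
    by the DCG of the ideal (best possible) ordering."""
    dcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ranked_relevances[:k]))
    ideal = sorted(ranked_relevances, reverse=True)
    idcg = sum(rel / math.log2(i + 2) for i, rel in enumerate(ideal[:k]))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example: rank four documents against a query by cosine similarity.
query = [1.0, 0.0]
docs = {"d1": [0.9, 0.1], "d2": [0.0, 1.0], "d3": [0.7, 0.7], "d4": [-1.0, 0.0]}
relevant = {"d1", "d3"}

ranking = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
rels = [1 if d in relevant else 0 for d in ranking]
print(ranking, ndcg_at_k(rels))  # a perfect ranking scores 1.0
```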

Results

All the evaluation results are available in this spreadsheet.

Domain-specific quality: The bar charts below illustrate the average retrieval quality of voyage-3.5 and voyage-3.5-lite with full precision and 2048 dimensions, both overall and for each domain. voyage-3.5 outperforms OpenAI-v3-large, voyage-3, and Cohere-v4 by an average of 8.26%, 2.66%, and 1.63%, respectively, across domains. voyage-3.5-lite outperforms OpenAI-v3-large and voyage-3-lite by an average of 6.34% and 4.28%, respectively, across domains.

Figure 3. voyage-3.5 and voyage-3.5-lite outperform across domains.

Binary rescoring: In some cases, users retrieve an initial set of documents using binary embeddings (e.g., 100 in our evaluation) and then rescore them with full-precision embeddings. For voyage-3.5 and voyage-3.5-lite, this binary rescoring approach yields up to 6.38% and 6.89% improvements, respectively, in retrieval quality over standard binary retrieval.
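The two-stage pipeline described above can be sketched as follows (our own illustrative implementation with random stand-in vectors): binary codes give a cheap first pass over the whole corpus, and the shortlist is then re-ranked with full-precision cosine similarity.

```python
import math
import random

random.seed(0)
DIM, N_DOCS, SHORTLIST = 64, 500, 100

def rand_vec(dim):
    return [random.gauss(0, 1) for _ in range(dim)]

def binarize(vec):
    """1 bit per dimension: keep only the sign of each coordinate."""
    return [1 if x > 0 else 0 for x in vec]

def hamming_sim(a, b):
    """Number of matching bits; higher means more similar."""
    return sum(x == y for x, y in zip(a, b))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

docs = [rand_vec(DIM) for _ in range(N_DOCS)]
doc_bits = [binarize(d) for d in docs]
query = rand_vec(DIM)
q_bits = binarize(query)

# Stage 1: cheap binary retrieval of a 100-document shortlist.
shortlist = sorted(range(N_DOCS),
                   key=lambda i: hamming_sim(q_bits, doc_bits[i]),
                   reverse=True)[:SHORTLIST]

# Stage 2: rescore only the shortlist with full-precision similarity.
top10 = sorted(shortlist,
               key=lambda i: cosine(query, docs[i]),
               reverse=True)[:10]
print(top10)
```

The design point is that stage 1 touches every document but uses only bit operations, while the expensive full-precision comparison in stage 2 runs on just 100 candidates.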

Try voyage-3.5 and voyage-3.5-lite!

Interested in getting started today via the Voyage API? The first 200 million tokens are free. Visit the docs to learn more.

Interested in using voyage-3.5 or voyage-3.5-lite alongside MongoDB Atlas? Register your interest in the private preview for automated embedding in Atlas Vector Search.