Artificial Intelligence

Introducing voyage-3.5 and voyage-3.5-lite: Improved Quality for a New Retrieval Frontier

May 20, 2025

How MongoDB Helps Your Brand Thrive in the Age of AI

The Zero Moment of Truth (ZMOT) was coined by Google to describe the moment when a user researches a product online before buying—typically through search, reviews, or videos. In a world where AI agents are intermediating shopping decisions (such as through assistant bots, personal agents, or even procurement AIs), the traditional concept of ZMOT starts to break down, because: The “moment” is no longer directly human. The “truth” might be algorithmically filtered. The user delegates the decision process (partially or fully) to an agent. For retailers, this isn't a minor trend—it’s a "change everything" moment. The traditional customer journey is being radically rewired. For decades, the battle was to win the top spot on a search engine results page. But what happens when the customer isn't a person searching, but is instead an AI agent executing a command like, "Buy me the best-value noise-canceling headphones"? If your brand isn't visible to that agent, you are, for all practical purposes, invisible. The brands that will win in this new landscape are the ones that can make their products and services discoverable and transactable not just by humans, but by AI. This shift presents a profound challenge that goes beyond marketing. Brands are shifting their direct relationship with the customer, handing it over to an AI intermediary. Traditional strategies built for human psychology and search engine algorithms become obsolete when the shopper is an AI agent. The core challenges are therefore immense: How do you build trust with an algorithm? How do you communicate your brand's value in a machine-readable format? And most importantly, how do you ensure your product is the one an agent selects from a sea of competitors? 
This article aims to clarify what the future of online shopping will look like, how your brand will be affected by this new paradigm, and why the MongoDB document model is the best underlying tool for organizing and exposing your product catalog in the coming era of agentic e-commerce. So, how might we rename or reframe ZMOT for this agent-mediated paradigm?

To understand this shift, let's first clarify what we mean by "agentic AI" and "agents." Agentic AI refers to artificial intelligence systems capable of acting autonomously to achieve specific goals on behalf of a user, often by interacting with various tools and services. An "agent" in this context is the specific AI entity that performs these actions. For example, imagine telling your AI assistant, "Book me a flight to London next month within a £500 budget, departing in the morning." An AI agent would then autonomously search, compare, and potentially book the flight for you, acting as your personal delegate.

Ever since reading the news of OpenAI naming Instacart’s CEO their new Head of Applications, I haven’t stopped thinking about what this will mean for the world of e-commerce and (yes, I’m a millennial) about how the term “googling” became part of our zeitgeist in the early 2000s. The world of e-commerce is on the brink of a similar paradigm shift. For years, brands have poured resources into search engine optimization (SEO), battling for coveted spots on search engine results pages. But what if the search engine as we know it gets disrupted? What if, instead of searching, customers simply ask an AI to find and buy for them? This isn't a far-off futuristic fantasy. It's happening now. With the rise of capabilities like OpenAI's improved shopping results in ChatGPT search and the new Operator agent, we are entering a new era of "agentic commerce."
This is the Agentic Moment of Truth (AMOT): the precise point at which an autonomous agent, acting on behalf of a user, synthesizes data, context, and intent to make or recommend a purchase decision. For retailers, this is a "change everything" moment. The traditional customer journey, from discovery to purchase, is being radically rewired. The brands that will win in this new landscape are the ones that can make their products and services discoverable and transactable not just by humans, but by AI agents.

Figure 1. Evolution of the customer journey thanks to agentic AI.

The new customer flow: From ZMOT to AMOT

For over a decade, marketers have been obsessed with the ZMOT. But AI agents are collapsing it. Instead of a human spending hours browsing websites, reading reviews, and comparing prices, an AI can do it in seconds. This new customer flow, driven by agents, looks something like this:

The prompt: A user gives a natural language command to their AI assistant, like, "Find me the best noise-canceling headphones for under $200 with good battery life."

The agent's work: The AI agent, like OpenAI's Operator, goes to work. It doesn't just crawl the web in the traditional sense. It interacts with various services and APIs to gather information, compare options, and make a recommendation.

The transaction: Once the user approves the recommendation, the agent can complete the purchase, all without the user ever visiting a traditional e-commerce website.

This shift has profound implications for retailers. If your brand isn't "agent-friendly," you're essentially invisible in this new world of commerce. So, how do you make your brand discoverable and transactable by AI agents? The answer is to build a remote MCP server. But what exactly is an MCP server, and what operational challenges does an e-commerce business face in deploying one?
The Model Context Protocol (MCP) is an open standard that allows AI models to connect to and interact with external tools and data sources, and an MCP server is a service that exposes tools and data through that protocol. In our context, think of it as a universal translator that enables AI agents to understand and use your product catalog, inventory, pricing, and even checkout functionality. While a locally run MCP server is suitable for internal agentic applications, how can you provide third-party online agents with real-time, up-to-date, and commercially strategic product data? This is where a remote MCP server, powered by technologies like MongoDB Atlas, becomes not just a nice-to-have but a mission-critical component of your tech stack.

However, creating and deploying such a server poses significant operational challenges for an e-commerce business. You need to manage complex, dynamic data structures for product information, rapidly adapt to new AI agent requirements, ensure your infrastructure can scale globally and reliably, and, critically, protect sensitive customer and product data. By creating your own remote MCP server, you can expose your product catalog, inventory, pricing, and even checkout functionality to AI agents in a structured, machine-readable format, and MongoDB Atlas directly addresses these operational hurdles:

Superior architecture (the document model): E-commerce data is inherently varied and complex, with products having diverse attributes. The flexible document model of MongoDB Atlas allows you to store product information in a rich, nested structure that mirrors real-world objects.

Innovate faster: With the agility of the document model and MongoDB Atlas's developer-friendly environment, your teams can respond quickly to the dynamic needs of agentic commerce. You can rapidly iterate on how your product data is exposed and consumed by AI agents, testing new features and optimizing agent interactions without time-consuming database migrations or refactoring. This speed is crucial in a fast-evolving AI landscape.

Build once, deploy everywhere: E-commerce demands low-latency access for agents and users across diverse geographic locations. MongoDB Atlas offers multi-cloud and multi-region deployment options, allowing you to deploy your remote MCP server and product catalog close to your agents and customers, wherever they are. This global distribution capability minimizes latency and supports high availability, reducing infrastructure management complexity and helping keep your brand transactable at all times.

Built-in enterprise security: Exposing your valuable product catalog and transactional capabilities to AI agents requires robust security. MongoDB Atlas provides comprehensive, built-in enterprise-grade security features, including encryption at rest and in transit, network isolation, fine-grained access controls, and auditing. These features help protect your data from unauthorized access and cyber threats, mitigating the significant security challenges associated with opening your systems to external AI interactions.

Why retailers must act now

The shift to agentic commerce is not a question of if, but when. The MCP Registry, a public directory for AI agents to discover MCP-compliant servers, is set to launch in the fall of 2025. This will be the "yellow pages" for AI agents, and if your brand isn't listed, you'll be left behind.

Discover how MongoDB powers the future of retail and helps brands thrive in the age of AI. Learn more about MongoDB for Retail. Ready to boost your MongoDB skills? Visit the Atlas Learning Hub to get started.
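To make the idea of a machine-readable catalog more concrete, here is a minimal sketch of the kind of function a remote MCP server might expose as a tool. It is not taken from the article or from any MCP SDK: the `CATALOG` documents, their field names, and the `search_products` function are all hypothetical, and an in-memory list stands in for a MongoDB collection. It only illustrates how nested, document-style product data can answer an agent's request such as "noise-canceling headphones under $200 with good battery life."

```python
from typing import Any, Dict, List, Optional

# Hypothetical catalog documents. The nested shape mirrors the document model
# described above; the field names are illustrative assumptions only.
CATALOG: List[Dict[str, Any]] = [
    {
        "sku": "HP-100",
        "name": "QuietWave Over-Ear",
        "category": "headphones",
        "price": {"amount": 179.0, "currency": "USD"},
        "attributes": {"noise_canceling": True, "battery_hours": 30},
        "inventory": {"in_stock": 12},
    },
    {
        "sku": "HP-200",
        "name": "StudioMax Pro",
        "category": "headphones",
        "price": {"amount": 349.0, "currency": "USD"},
        "attributes": {"noise_canceling": True, "battery_hours": 40},
        "inventory": {"in_stock": 3},
    },
    {
        "sku": "HP-300",
        "name": "BudgetBuds",
        "category": "headphones",
        "price": {"amount": 49.0, "currency": "USD"},
        "attributes": {"noise_canceling": False, "battery_hours": 8},
        "inventory": {"in_stock": 0},
    },
]


def search_products(
    category: str,
    max_price: float,
    min_battery_hours: int = 0,
    noise_canceling: Optional[bool] = None,
) -> List[Dict[str, Any]]:
    """Return in-stock products matching the constraints, longest battery life first."""
    matches = []
    for product in CATALOG:
        attrs = product["attributes"]
        if product["category"] != category:
            continue
        if product["inventory"]["in_stock"] <= 0:
            continue  # an agent should not recommend items it cannot buy
        if product["price"]["amount"] > max_price:
            continue
        if attrs.get("battery_hours", 0) < min_battery_hours:
            continue
        if noise_canceling is not None and attrs.get("noise_canceling") != noise_canceling:
            continue
        matches.append(product)
    return sorted(
        matches,
        key=lambda p: p["attributes"].get("battery_hours", 0),
        reverse=True,
    )


if __name__ == "__main__":
    results = search_products(
        "headphones", max_price=200, min_battery_hours=20, noise_canceling=True
    )
    for product in results:
        print(product["sku"], product["name"], product["price"]["amount"])
```

In a real deployment the filter would more likely be a database query than a Python loop, and the function would be registered with whatever MCP server framework you choose; both of those parts are left out here on purpose.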

September 9, 2025


Building an Interactive Manhattan Guide with Chatbot Demo Builder

Question: Know any good spots?

Response: Based on the context provided, I can recommend a few good pizza spots in Manhattan:

Joe's Pizza (7 Carmine St) - This is described as "one of the most beloved slice joints in the city" with a "thin, crispy crust and always fresh" pizza, perfect for a quick bite.

Prince Street Pizza (27 Prince St) - Known for its "thick, square pepperoni slice with those crispy-edged 'roni cups."

$1 Pizza Slices - The context mentions that $1 pizza slices can still be found, especially around 6th Ave and 8th St in Greenwich Village. This is described as a "quintessential NYC food experience."

That’s from a conversation with a chatbot we built in about 10 minutes. There was no coding and no database setup, just a PDF and the Chatbot Demo Builder tool. What made the experience so valuable wasn’t just getting a good response, but seeing exactly what was happening behind the scenes after submitting our question.

🤖 Meet Chatbot Demo Builder

Chatbot Demo Builder is the newest addition to the Atlas Search Playground. This no-code tool gives users an intuitive way to explore MongoDB’s vector search capabilities by building interactive Q&A bots with sample data or their own uploads. Powered by MongoDB Atlas Vector Search and Voyage AI, the Builder automatically handles index creation, query generation, and embedding, removing the common setup barriers that often slow down RAG (retrieval-augmented generation) experimentation.

What makes the tool particularly powerful is its accessibility. Like all tools in the Atlas Search Playground, it runs entirely in your browser without requiring a MongoDB Atlas account, cluster, or collection. This means you can test ideas, iterate quickly, and share prototypes with teammates and stakeholders, all without spinning up additional infrastructure.

With MongoDB.local NYC coming up on September 17, we thought it was the perfect time to put the Chatbot Demo Builder through its paces.
So we decided to create something practical: a Q&A chatbot to help visitors explore Manhattan.

🧑‍💻 Building in the browser

The entire process happened without leaving our browser. We started by uploading our Manhattan travel guide PDF to the Chatbot Demo Builder. Next, we configured how the chatbot would process our content using the Data Settings modal. For our chunking strategy, we chose recursive chunking with 500-token chunks and 50-token overlap. This preserves paragraph flow while ensuring important information isn’t split awkwardly across boundaries. For the embedding model, we selected voyage-3-large, which excels at general knowledge retrieval tasks. Once configured, we hit "Create Embeddings" and watched as the Builder processed our guide into a demo document collection containing metadata, chunked text, and vector embeddings.

Figure 1. Data Settings modal used for chunking configuration and embedding model selection.

📍 Testing like tourists

With embeddings generated, we started asking questions like curious visitors: "Where can I find a public restroom near Central Park?" "What are some good day trip ideas?" "What are some fun facts about New York City?"

Each query highlighted the Builder's most powerful feature: complete transparency. When we asked about pizza, we could see the exact vector search query that ran, which chunks scored highest, and how the LLM prompt was constructed. This visibility turned experimentation from guesswork into informed iteration. We could understand not just what answers we got, but also why we got them and how to improve them.

Figure 2. Vector search query and scored document results for the pizza recommendation question.

🧐 Optimizing for better results

Fine-tuning our retrieval settings produced even better outputs. The Builder made these optimizations easily accessible and provided insight into exactly how they would affect results.
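As a brief aside on the chunking settings chosen earlier, the effect of a 500-token chunk size with a 50-token overlap can be sketched in a few lines. This is a simplified sliding-window chunker, not the recursive strategy the Builder applies, and it treats whitespace-separated words as tokens, which is an assumption (a real embedding model's tokenizer would count differently). The function name `chunk_tokens` is hypothetical.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 500, overlap: int = 50) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    # Stand-in for a tokenized travel guide: 1,200 placeholder "tokens".
    guide = [f"tok{i}" for i in range(1200)]
    for number, chunk in enumerate(chunk_tokens(guide), start=1):
        print(number, chunk[0], chunk[-1], len(chunk))
```

The overlap means the tail of one chunk is repeated at the head of the next, so a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of embedding some text twice.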
We started by modifying numCandidates, which controls how many potential matches the system initially examines before selecting the best results. The recommended setting is at least 20 times higher than the limit value, allowing more potential matches to be evaluated before selecting the best ones; this trades a bit of latency for significantly better recall. For even higher precision, the Builder offers an exact nearest neighbor (ENN) search, which calculates distances to all indexed embeddings. While computationally intensive, it guarantees finding the exact nearest neighbors when accuracy is key. Since our Manhattan guide only had 25 documents, we could afford to use ENN without worrying about performance impact.

Figure 3. Retrieval settings panel for adjusting search parameters and enabling exact nearest neighbor search.

💡 Sharing and takeaways

Once we were happy with the responses, it was easy to share our work. The Builder generated a snapshot link that let the entire team test the chatbot for themselves without any additional setup. In just a few steps, we transformed a static travel PDF into a conversational guide for exploring Manhattan. Along the way, we saw how decisions about chunking strategies, embeddings, and retrieval settings directly affect answer quality. We also gained visibility into what was happening behind the scenes, giving us the insights we needed to optimize these decisions.

Figure 4. Output panel tabs displaying data source, vector index, search query, and prompt details.

By the end, we had a chatbot capable of providing helpful local insights about Manhattan, from day trip ideas to restaurant recommendations, all while giving us complete visibility into how it generated its answers.

🗽 Beyond the Big Apple

Chatbot Demo Builder makes it easy to explore RAG techniques.
Whether you're prototyping a customer support bot, building an internal knowledge assistant, or creating an interactive travel guide, Chatbot Demo Builder allows you to gain a clearer understanding of what works best for your AI use case. Ready to get started? Try the Chatbot Demo Builder in the Atlas Search Playground, or check out the official documentation to learn about the other tools available. And if you're joining us in New York for MongoDB.local, consider this your preview of what the city has to offer, courtesy of a chatbot that knows its way around Manhattan.
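For readers who want to see what the retrieval settings discussed above look like outside the Builder, the following is a minimal sketch of an aggregation pipeline using the `$vectorSearch` stage. It only builds the pipeline as a Python structure and does not connect to a database. The index name `vector_index`, the `embedding` path, and the `text` field are assumptions for illustration; the 20x multiplier follows the recommendation described in the walkthrough, and `numCandidates` is left out when `exact` is enabled because an ENN search does not use it.

```python
from typing import Any, Dict, List, Sequence


def build_vector_search_pipeline(
    query_vector: Sequence[float],
    limit: int = 5,
    exact: bool = False,
    index: str = "vector_index",  # hypothetical index name
    path: str = "embedding",      # hypothetical field holding the vectors
) -> List[Dict[str, Any]]:
    """Build a $vectorSearch pipeline for either ANN (default) or ENN (exact=True) search."""
    stage: Dict[str, Any] = {
        "index": index,
        "path": path,
        "queryVector": list(query_vector),
        "limit": limit,
    }
    if exact:
        # Exact nearest neighbor search compares against every indexed vector.
        stage["exact"] = True
    else:
        # Approximate search: examine at least 20x as many candidates as results returned.
        stage["numCandidates"] = 20 * limit

    return [
        {"$vectorSearch": stage},
        # Return the chunk text alongside its similarity score.
        {"$project": {"_id": 0, "text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


if __name__ == "__main__":
    print(build_vector_search_pipeline([0.1, 0.2, 0.3], limit=5))
    print(build_vector_search_pipeline([0.1, 0.2, 0.3], limit=5, exact=True))
```

With a driver such as PyMongo, a pipeline like this would typically be passed to a collection's `aggregate` method; for a corpus as small as the 25-document guide, the exact variant is the simpler choice.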

September 3, 2025


MongoDB and Hope AI: Craft Enterprise Code with AI

The world of software development is constantly evolving, and the demand for tools that streamline processes, increase efficiency, and enable developers to easily create robust applications continues to rise. For this context, MongoDB and Bit.dev have teamed up to bring a transformative integration to the table, combining MongoDB’s leading database platform with Bit Cloud’s AI-powered, component-based development platform, featuring the Hope AI agent. Bit Cloud is the platform that brings powerful AI and composability capabilities directly to developers through Hope AI. Designed to support smarter, faster development, Hope AI enables architecture planning with control, code generation, collaborative management, and production-ready output. This partnership showcases how Bit Cloud, with Hope AI, empowers developers to innovate efficiently without sacrificing control. Let’s explore the features of Hope AI and see how it transforms the development process. Figure 1. The release process. Going from concept to plan: AI-generated code architecture One of Hope AI’s standout features is its ability to create a code architecture based on user input. This isn’t just about diving straight into coding—Hope AI first provides developers with a clear, visual plan for implementation. Think of it as having an AI architect that listens to your idea, understands your goals, and crafts a tailored blueprint for your application. At this stage, Hope AI does not generate any code; this enables developers to focus on shaping the architecture to fit their unique requirements. Whether they need to make tweaks or add entirely new elements, this phase is highly customizable. Developers can approve the initial plan at their own pace, knowing that any changes are an integral part of the process. Generating code for new or existing applications Once the architecture is approved, Hope AI seamlessly transitions into code generation. 
This integration is ideal for developers building new applications as well as those enhancing existing ones. For existing applications, adding features is simple—Hope AI can work off the current application and produce code that integrates directly into the existing framework. This versatility positions Hope AI as an innovative tool for projects in all stages of development, helping developers spend less time working around limitations and more time realizing their creative visions. While its current focus is on web application development, Hope AI plans to expand its capabilities to mobile app development in future iterations, making this partnership even more promising for the broader developer community. Gaining full control and ongoing flexibility A major concern with AI-generated code is the possibility of losing control over what’s created. MongoDB and Hope AI address this head-on—developers maintain complete control and can review every line of the AI-generated code. If edits are required, users can make changes directly within the generated code, ensuring the final product aligns precisely with their vision and requirements. Beyond initial creation, Hope AI remains an active participant throughout the development process. Need additional components or features later? The AI is always available to assist, making sure your code evolves alongside your project. Achieving collaboration at the core One of the most exciting features introduced by Hope AI is the ability to “snap” the code. This functionality enables developers to share and manage code with teammates, promoting seamless collaboration. Teams can work together on the project, implement updates, and review progress without any barriers. This collaborative aspect is crucial for modern development teams that thrive on interconnected workflows. Taking a privacy-first approach to code creation Another remarkable feature of Hope AI is its emphasis on privacy. 
The code generated by Hope AI is 100% private—no sharing for AI model training, no public access to your work. The data is protected, and only the designated project members can access the code. In today’s data-sensitive landscape, this level of privacy is critical. Developers and organizations can trust that their intellectual property is secure and that the AI agent isn’t repurposing their unique codebase for other uses. Integrating MongoDB: Configuring MongoDB Atlas credentials MongoDB plays a crucial role in this collaboration. Hope AI is designed to use MongoDB Community Edition by default, offering developers access to MongoDB’s fast, efficient, and reliable database. However, developers and organizations have the flexibility to choose MongoDB Atlas if they prefer a fully managed, customizable solution. Hope AI fully supports integration with MongoDB Atlas, allowing seamless configuration for those who opt for it. MongoDB Atlas opens up possibilities for advanced database management, including automatic scaling, global data replication, and powerful analytics features. With this level of customization, developers can create applications that meet even the most complex infrastructure demands. Promoting the future of AI-assisted development The collaboration between MongoDB and Hope AI signals more than just a technical partnership; it represents a vision for the future of development. By harnessing the power of AI to streamline coding and by empowering developers to maintain control over their projects, this integration creates an environment where creativity meets efficiency. As Hope AI expands into mobile app support and MongoDB continues to innovate on its platform capabilities, this partnership will likely set the stage for groundbreaking developments that appeal to large-scale organizations. 
Choosing MongoDB and Hope AI For IT decision-makers, the MongoDB-Hope AI partnership is a win for teams wanting to build faster without compromising quality or security. Here’s why this collaboration stands out: Rapid prototyping and scaling: The AI-powered architecture design and code generation significantly reduce project timelines while safeguarding customization. Secure code management: Privacy-first code generation is designed to prioritize data protection and support security best practices. Enhanced team collaboration: The Snap functionality ensures teamwork thrives, making Hope AI suitable for distributed teams. On-premises availability: The product extends its flexibility by offering an on-premises deployment option, catering to businesses that require hosting within their own infrastructures. Integrated MongoDB services: MongoDB Atlas credentials provide unparalleled database management flexibility while MongoDB Community Edition caters to developers just getting started. Building the future together The integration of MongoDB and Hope AI empowers developers everywhere. Whether you’re designing from scratch, enhancing an existing app, or scaling your team’s efforts across platforms, this partnership promises the tools and capabilities to bring your ideas to life with unprecedented efficiency and control. MongoDB and Hope AI are building the future—and developers are at the center of this exciting transformation. Ready to unlock the full potential of AI-powered development? Visit the MongoDB AI Learning Hub to learn how to begin building AI applications with MongoDB. Connect with Hope AI today to see how MongoDB and Hope AI can transform your ideas into reality.

September 3, 2025


Streamlining Editorial Operations with Gen AI and MongoDB

Are you overwhelmed by the sheer volume of information and the constant pressure to produce content that truly resonates? Audiences constantly demand engaging and timely topics. As the daily influx of information grows, it’s becoming increasingly tough to identify what’s interesting and relevant. Consequently, teams are spending more time researching trends, verifying sources, and managing tools than actually creating compelling stories.

This is where artificial intelligence enters the media landscape to offer new possibilities. Tapping into AI capabilities calls for a flexible data infrastructure that can streamline content workflows, provide real-time insights, and help teams stay focused on what matters most. In this blog, we will explore how combining gen AI with modern databases, such as MongoDB, can improve editorial operations.

Why are your content ideas running dry?

Creative fatigue significantly impacts content production. Content leads face constant pressure to generate fresh ideas under tight deadlines, leading to creative blocks. In fact, according to a recent report from HubSpot, 16% of content marketers struggle with finding compelling new content ideas. This pressure often compromises work quality due to time constraints, leaving little room for delivering authentic content.

Another main hurdle is identifying credible and trending topics quickly. To find reliable information, teams spend a lot of time on research and discovery rather than actual creation. This leads to missed opportunities to identify what’s trending and reduces audience engagement as well. This presents a clear opportunity for AI, combined with modern databases, to deliver a transformative solution.

Using MongoDB to streamline content operations

MongoDB provides a flexible, unified storage solution through its collections for modern editorial workflows.
The need for a flexible data infrastructure

Developing an AI-driven publishing tool requires a system that can ingest, process, and structure a high volume of diverse content from multiple sources. Traditional databases often struggle with this complexity. Such a system demands the ability to ingest data from many sources, dynamically categorize content by industry, and perform advanced AI-enabled searches to scale applications. Combining flexible document-oriented databases with embedding techniques transforms varied content into structured, easily retrievable insights. Figure 1 below illustrates this integrated workflow, from raw data ingestion to semantic retrieval and AI-driven topic suggestions.

Figure 1. High-level architectural diagram of the Content Lab solution, showing the flow from the front-end through microservices, backend services, and MongoDB Atlas to AI-driven topic suggestions.

Raw data into actionable insights

We store a diverse mix of unstructured and semi-structured content in dedicated MongoDB collections such as news, Reddit posts, suggestions, userProfiles, and drafts, organized by topic, vertical (e.g., business, health), and source metadata for efficient retrieval and categorization. These collections are continuously updated from external APIs like NewsAPI and Reddit, alongside AI services (e.g., AWS Bedrock, Anthropic Claude) integrated via backend endpoints. By leveraging embedding models, we transform raw content into organized, meaningful data, stored as vectors within their specific categories (e.g., business, health). MongoDB Atlas Vector Search and the aggregation pipeline enable fast semantic retrieval, allowing users to query abstract ideas or keywords and get back the most relevant, trending topics ranked by a similarity score. Generative AI services then draw upon these results to automate the early stages of content development, suggesting topics and drafting initial articles to substantially reduce creative fatigue.
From a blank page to first draft: With gen AI and MongoDB

Once a user chooses a topic, they’re taken to a draft page, as depicted in the third step of Figure 2. Users are then guided by a large language model (LLM)-based writing assistant and supported by Tavily’s search agent, which pulls in additional contextual information. MongoDB continues to handle all associated metadata and draft state, ensuring the user’s entire journey stays connected and fast.

Figure 2. Customer flow pipeline & behind-the-scenes.

We also maintain a dedicated userProfiles collection, linked to both the drafts and chatbot systems. This enables dynamic personalization so that, for example, a Gen Z user receives writing suggestions aligned with their tone and preferences. This level of contextual adaptation improves user engagement and supports editorial consistency.

User-generated drafts are stored as new entries in a dedicated drafts collection. This facilitates persistent storage, version control, and later reuse, all of which are essential for editorial workflows. MongoDB’s flexible schema lets us evolve the data model as we add new content types or fields without migrating data.

Solving the content credibility challenge

Robust data management directly addresses content credibility. When we generate topic suggestions, we capture and store the source URLs within MongoDB, embedding these links directly into the suggestion cards shown in the UI. This allows users to quickly verify each topic’s origin and reliability. Additionally, by integrating Tavily, we retrieve related contextual information along with its URLs, further enriching each suggestion. MongoDB’s efficient handling of complex metadata and relational data helps editorial teams consistently and confidently vet content sources, delivering trustworthy, high-quality drafts.
By combining Atlas Vector Search, flexible collections, and real-time queries, MongoDB helps build an end-to-end content system that’s agile, adaptable, and intelligent. The next section shows how this translates into a working editorial experience.

From raw ideas to ready stories: Our system in action

With our current solution, editorial teams can rapidly transition from scattered ideas to structured, AI-assisted drafts, all within a smart, connected system. The combination of generative AI, semantic search, and flexible data handling makes the workflow faster, more spontaneous, and less dependent on manual effort. Consequently, the focus shifts back to creativity, as it becomes easier to discover relevant topics from verified sources and produce personalized drafts.

Adaptability and scalability are essential in developing intelligent systems that produce strong results for content teams. As editorial demands keep growing, teams need an infrastructure that can ingest diverse data, produce insights, and support real-time collaboration. This system illustrates how AI coupled with a flexible, document-oriented backend can help teams reduce fatigue, enhance quality, and accelerate production without adding complexity. It’s not just about automation; it’s about providing a more focused, efficient, and reliable path from idea to publication.

Here are a few next steps to help you explore the tools and techniques behind AI-powered editorial systems:

Dive Deeper with Atlas Vector Search: Explore our comprehensive tutorial to understand how Atlas Vector Search empowers semantic search and enables real-time insights from your data.

Discover Real-World Applications: Learn more about how MongoDB is transforming media operations by reading the AI-Powered Media article.
Check out the MongoDB for Media and Entertainment page to learn more about how we meet the dynamic needs of modern media workflows.
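To illustrate the retrieval idea described in this article, topics filtered by vertical and ranked by a similarity score with source URLs carried along for verification, here is a small self-contained sketch. It is an in-memory stand-in, not the Content Lab implementation: the documents, the three-dimensional toy embeddings, the field names, and the `rank_topics` function are all hypothetical, and plain cosine similarity substitutes for Atlas Vector Search scoring.

```python
import math
from typing import Any, Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_topics(
    query_vector: Sequence[float],
    docs: List[Dict[str, Any]],
    vertical: str,
    top_k: int = 3,
) -> List[Dict[str, Any]]:
    """Score documents in one vertical against the query and return the best matches."""
    scored = [
        {
            "title": doc["title"],
            "source_url": doc["source_url"],  # kept so editors can verify the origin
            "score": cosine_similarity(query_vector, doc["embedding"]),
        }
        for doc in docs
        if doc["vertical"] == vertical
    ]
    return sorted(scored, key=lambda d: d["score"], reverse=True)[:top_k]


# Hypothetical documents with toy embeddings; real embeddings have hundreds of dimensions.
DOCS = [
    {"title": "Rate cuts and small business loans", "vertical": "business",
     "source_url": "https://example.com/rates", "embedding": [0.9, 0.1, 0.0]},
    {"title": "Startup funding trends", "vertical": "business",
     "source_url": "https://example.com/funding", "embedding": [0.6, 0.8, 0.0]},
    {"title": "Sleep and heart health", "vertical": "health",
     "source_url": "https://example.com/sleep", "embedding": [0.0, 0.1, 0.9]},
]

if __name__ == "__main__":
    for topic in rank_topics([1.0, 0.2, 0.0], DOCS, vertical="business"):
        print(round(topic["score"], 3), topic["title"], topic["source_url"])
```

Keeping the source URL in every ranked result is the design point worth noting: the credibility check described above depends on that link surviving all the way from storage to the suggestion card.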

August 26, 2025


New Benchmark Tests Reveal Key Vector Search Performance Factors

Search at scale is challenging. As powerful as vector search is, it can be tough to know how to properly weigh key factors like accuracy, cost, and throughput for larger workloads. We recently released the MongoDB Benchmark for Atlas Vector Search, which outlines crucial performance optimization strategies for vector search, providing a comprehensive guide to achieving optimal results with large-scale datasets. The primary goal of our guide is to significantly reduce friction for your first vector test at scale (>10M vectors) when evaluating performance for Atlas Vector Search. With this new guide, our aim is to provide more context around how to use the benchmark, to explore the dataset (including factors considered), and to summarize and contextualize the results. Let’s take a closer look!

A note on benchmarking data

Every good presentation includes the requisite safe harbor slide, and the art and science of benchmarking is no different. Embarking on a large-scale vector workload can present significant hurdles stemming from a lack of accurate information and the inherent friction of initial benchmarks. Furthermore, the landscape of vector search and embedding models is rapidly evolving, and information can become outdated quickly, leading users down inefficient or incorrect paths. Without clear, up-to-date guidance, users can struggle to predict system behavior, optimize configurations, and confidently allocate resources.

It’s also worth noting that numerous factors (quantization, dimensionality, filtering, search node configuration, concurrency, sharding, and more) interact in complex ways. Understanding these interactions and their specific impact on a particular workload requires deep, accurate insights. Without this, users might optimize one aspect only to inadvertently degrade another.
This informational vacuum—coupled with the considerable setup overhead, complex parameter tuning, and the cost of experimentation involved in running the first benchmark—creates a substantial barrier to proving out and scaling a solution. Nonetheless, we feel that these benchmarks provide confidence in POCs for our customers and give them a starting point to work with (as opposed to having no compass to start with). With these factors in mind, let's jump into an overview of the dataset. A look at the dataset The core of this performance analysis revolves around tests conducted on subsets of the Amazon Reviews 2023 dataset, which contained 48M item descriptions across 33 product categories. The dataset was chosen due to the ability to provide a realistic, large-scale e-commerce scenario, as well as offering rich data, including user reviews (ratings, text, helpfulness votes), item metadata (price, images), and detailed item names and descriptions, which are ideal to search over. For the variable dimension tests, subsets of 5.5 million items were used, embedded with voyage-3-large to produce 2048-dimensional vectors. Views were then created to slice these into 1024, 512, and 256-dimensional vectors for testing different dimensionalities. For the large-scale, high-dimensional test, a 15.3 million-item subset—also embedded with 2048-dimensional vectors from voyage-3-large —was used. One of the key takeaways from the report is that at the highest dimensionality (15.3M vectors using voyage-3-large embeddings at 2048 dimensions), Atlas Vector Search with scalar or binary quantization configured retains 90–95% accuracy with less than 50ms of query latency. One item of note is that binary quantization can have higher latency when the number of candidates requested is in the hundreds due to the additional cost of rescoring with full-fidelity vectors, but still might be preferable for many large scale workloads due to cost effectiveness. Figure 1. 
Binary versus scalar quantization performance. Methodology: Benchmarking with the Amazon reviews dataset Now that we've talked a little bit about the data itself and the information included, let’s outline some of the key factors that impact performance for Atlas Vector Search, and how we configured our benchmark to test them. It's also important to acknowledge why these variables are critical: Not every customer will be optimizing their search for the same thing. With that in mind, we will also attempt to identify the interplay and trade-offs between them. While this list is not exhaustive (see the full report for more details), let’s review some of the key performance factors: Recall: Recall (a measure of search accuracy) is significantly impacted by quantization and vector dimensionality. The report highlights that while scalar quantization generally starts with higher recall, binary quantization can approach similar accuracy levels when numCandidates is increased, though this often incurs higher latency due to an additional rescoring step. Furthermore, higher-dimensional vectors (1024d and 2048d) consistently maintain better recall, especially with larger datasets and quantization, compared to lower dimensions (256d and 512d), which struggle to exceed 70-80% recall. Sizing and cost: The table in the benchmark details the resources required (RAM, storage) and associated costs for different search node tiers based on three different test cases involving varying dataset sizes, vector dimensions, and quantization methods (scalar or binary). The guide also works through a sample dataset, showing that resource requirements scale linearly and that quantization reduces memory requirements substantially. Concurrency and throughput: Throughput is evaluated with multiple requests issued concurrently. Scalar quantization generally achieves higher queries per second (QPS) across various limit values due to less work per query and no rescoring.
Concurrency bottlenecks are often observed and show up as higher query latency. Scaling out the number of search nodes or increasing available vCPUs is recommended to resolve these bottlenecks and achieve higher QPS. Figure 2. Node tiers for different test cases. Optimizing your vector search performance This benchmark report thoroughly examines the performance of MongoDB Atlas Vector Search across various configurations and large datasets, specifically the Amazon Reviews 2023 dataset. It explores the impact of factors such as quantization (scalar and binary), vector dimensionality, filtering, search node configurations, binData compression, concurrency, and sharding on recall, latency, and throughput. While there is never a “silver bullet,” because everyone defines search “success” differently, we wanted to highlight some of the various levers to consider, and methods to get the most out of your own deployment. Our goal is to provide some key considerations for how to evaluate and improve your own vector search performance, and help you to properly weigh and contextualize the key factors. Ready to optimize your vector search experience? Explore the guide in our documentation. Run it yourself with our GitHub repo.
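To make the sizing and recall discussion above concrete, here is a minimal sketch of the arithmetic involved. It assumes that full-fidelity vectors are stored as 32-bit floats, that scalar quantization stores one byte per dimension, and that binary quantization stores one bit per dimension. It counts raw vector payload only and ignores index graph overhead, so it is not a substitute for the sizing table in the report. The function names are hypothetical, and the recall definition is a common one rather than a statement of the benchmark's exact methodology.

```python
"""Back-of-the-envelope sizing and recall helpers (illustrative only)."""

# Bits stored per vector dimension for each representation (assumed encodings).
BITS_PER_DIMENSION = {
    "float32": 32,  # full-fidelity vectors
    "scalar": 8,    # assumption: one int8 value per dimension
    "binary": 1,    # assumption: one bit per dimension
}


def raw_vector_bytes(num_vectors: int, dimensions: int, encoding: str) -> int:
    """Return raw vector payload in bytes, ignoring index and graph overhead."""
    total_bits = num_vectors * dimensions * BITS_PER_DIMENSION[encoding]
    return total_bits // 8


def recall_at_k(approximate_ids: list, exact_ids: list, k: int) -> float:
    """Fraction of the exact top-k neighbors that the approximate search returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = set(approximate_ids[:k]) & set(exact_ids[:k])
    return len(hits) / k


if __name__ == "__main__":
    # The large-scale test case from the text: 15.3M vectors at 2048 dimensions.
    for encoding in BITS_PER_DIMENSION:
        size_gb = raw_vector_bytes(15_300_000, 2048, encoding) / 1e9
        print(f"{encoding:>8}: {size_gb:6.1f} GB")
    # Three of the four exact neighbors were found by the approximate search.
    print(recall_at_k(["a", "b", "c", "x"], ["a", "b", "c", "d"], k=4))
```

Under these assumptions, the 15.3M-vector, 2048-dimension case comes to roughly 125.3 GB of raw float32 data, 31.3 GB with scalar quantization, and 3.9 GB with binary quantization. The payload grows linearly with both vector count and dimensionality, which is the scaling behavior the guide describes.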

August 21, 2025

Artificial Intelligence

Powering Long-Term Memory for Agents With LangGraph and MongoDB

We're excited to introduce the MongoDB Store for LangGraph—a powerful integration that brings flexible and scalable long-term memory to AI agents. This new integration between MongoDB and LangGraph , LangChain’s open-source agent orchestration framework, allows agents to remember and build on previous interactions across multiple sessions instead of only retaining memory for the current session. The result is more intelligent, context-aware agentic systems that learn and improve over time. This new integration complements MongoDB’s existing checkpointer integration, which handles short-term memory and persistent conversation history. Together, the MongoDB Store for LangGraph and MongoDB’s support for checkpointers provide a complete solution for building production-ready, memory-enabled agents. The need for agent memory An AI agent is a system designed to take actions or make decisions based on input, often using tools and reasoning to complete tasks. By default, agents don’t retain memory between conversations, which severely constrains what they can accomplish. Agent memory (and memory management) is a computational exocortex for AI agents. It is a dynamic, systematic process that integrates an agent’s large language model (LLM) memory (context window and parametric weights) with a persistent memory management system to encode, store, retrieve, and synthesize knowledge and experiences. Agent memory is typically divided into two main types: short-term memory and long-term memory. In a memory context, LangGraph uses “threads” to represent individual conversations or sessions. Short-term memory, managed through thread-scoped checkpointers that MongoDB supports, maintains context within a given session. While this preserves conversation continuity and manages history, it doesn’t help agents learn continuously from the past across different conversations to adapt and optimize their behavior over time. 
This is why we introduced the MongoDB Store for LangGraph, enabling your agents to retain memories across conversations through a cross-thread memory store. Figure 1. Short and long-term memory integration between LangGraph and MongoDB. MongoDB Store: Enabling cross-thread long-term memory The new langgraph-store-mongodb package introduces a MongoDBStore class. Available now through PyPI, this production-ready integration provides: Cross-thread persistence: Store and recall information across different conversation sessions and user interactions, allowing agents to build on previous knowledge. Native JSON structure: LangGraph stores long-term memories as JSON documents, which map directly to MongoDB documents. Each memory is organized using namespaces and a key-value structure. This enables the usage of MongoDB’s native and optimized data formats and search capabilities for efficient retrieval. Vector Search capabilities: Leverage MongoDB Atlas Vector Search for semantic memory retrieval based on meaning, not just keyword matching. Asynchronous support: Support for both synchronous and asynchronous operations for high-performance applications. Automatic connection management: Robust connection pooling and error handling to ensure reliability. Optimized TTL indexes: MongoDB’s Time-to-Live (TTL) indexes are integrated with LangGraph’s TTL system, allowing automatic removal of stale or outdated data. This improves retrieval performance, reduces storage costs, and ensures the system "forgets" obsolete memories efficiently. Ready to give your AI agents persistent long-term memory? The langgraph-store-mongodb package is available now:
pip install langgraph-store-mongodb
The MongoDB Store for LangGraph enables developers to build more powerful agents for different use cases, including: Customer support agents: Build agents that remember customer preferences, past issues, and resolution patterns across multiple support channels.
Personal assistant applications: Build agents that learn user habits and preferences to provide increasingly personalized experiences. Enterprise knowledge management: Create agents that accumulate organizational knowledge and can retrieve relevant information semantically. Multi-agent systems: Enable agent teams to share learned experiences and coordinate through persistent memory. Why MongoDB for agent memory? Effective agentic memory requires comprehensive mechanisms for storing, retrieving, updating, and deleting memories. MongoDB Atlas provides a unified database that meets all these complex requirements: Flexible document model: Store complex, nested memories as rich JSON, matching how agents naturally read, organize, and update evolving information. Semantic search: Native vector search enables retrieval by meaning, not just exact matches. State-of-the-art models: Voyage AI provides embedding models and rerankers for cutting-edge memory retrieval. Scalable architecture: Distributed architecture, workload isolation, autoscaling, and automatic sharding capabilities for scaling AI agent memory. Enterprise security: Fine-grained role-based access control (RBAC) allows precise management of both access scope (specific services or databases) and access type (read-only or read-write). MongoDB Atlas and LangChain: A complete solution for AI agent memory Short-term memory provides an agent with immediate context, current conversation state, prior exchanges within that session, or shared memory for coordination in multi-agent systems. The most common form of short-term memory is working memory—an active, temporary context accessible during a session. MongoDB's integration with LangGraph checkpointers supports this by persisting and restoring conversation states. 
Other short-term memory implementations include semantic caches, such as using MongoDB's semantic cache integration with LangChain , which stores recent prompts and LLM responses for retrieval when similar queries occur. Shared memory is also used in multi-agent systems to provide a common space for coordination and information sharing. Long-term memory serves as the agent’s knowledge base, storing diverse kinds of information for future use. It includes several functional types, each requiring specific storage and retrieval strategies: Episodic memory: captures specific events and interactions, such as conversation history or summaries of key occurrences with metadata (e.g., timestamps, participants). For instance, a customer support agent can use this to recall a user’s past issues and offer personalized responses. Procedural memory: records instructions or rules for recurring tasks. A typical implementation is a social content generator agent that remembers past feedback on writing style and formatting to improve its process. Semantic memory: remembers general knowledge, facts, and concepts. This is often implemented through retrieval-augmented generation (RAG), where data is stored as vector embeddings and retrieved based on semantic similarity. Associative memory: stores key entities and relationships between different pieces of information, enabling an agent to identify patterns and make inferences by navigating these connections. It's often implemented using graph structures that support efficient exploration of relationships. One practical approach is GraphRAG . The MongoDB Store for LangGraph supports these memory types through flexible filtering and semantic search, making it a versatile approach for building reliable long-term memory in agents. LangChain also provides LangMem, a toolkit featuring pre-built tools designed specifically for extracting and managing procedural, episodic, and semantic memories. 
LangMem integrates natively with LangGraph, streamlining the memory engineering process. For developers seeking a straightforward approach to using various memory types with MongoDB, explore this comprehensive tutorial for implementing MongoDB alongside LangGraph and LangMem . The future of intelligent agents With the new MongoDB Store for LangGraph, we're enabling developers to build AI agents that can learn and adapt. Agents that remember user preferences, learn from mistakes, and build knowledge over time will transform how we interact with AI systems. The combination of LangGraph's sophisticated orchestration capabilities with MongoDB's flexible, scalable storage creates unprecedented opportunities for building intelligent, persistent AI agents that feel truly alive and responsive. Ready to build memory-enabled agents with LangGraph and MongoDB Atlas? Get started with the documentation .
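The namespace and key-value organization described earlier in this post can be illustrated with a small, self-contained sketch. The ToyMemoryStore class below is a hypothetical in-memory stand-in written for this example. It is not the MongoDBStore API, and its method names and parameters are assumptions chosen only to show why memories keyed by user, rather than by thread, remain visible across conversations. Expiry here is checked lazily on read, whereas a MongoDB TTL index removes documents in the background.

```python
import time


class ToyMemoryStore:
    """Hypothetical in-memory stand-in for a cross-thread memory store."""

    def __init__(self):
        # Maps (namespace tuple, key) -> (value dict, expiry timestamp or None).
        self._items = {}

    @staticmethod
    def _expired(expires_at):
        return expires_at is not None and time.monotonic() >= expires_at

    def put(self, namespace, key, value, ttl_seconds=None):
        """Store a JSON-like dict under a namespace and key, optionally with a TTL."""
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._items[(tuple(namespace), key)] = (dict(value), expires_at)

    def get(self, namespace, key):
        """Return the stored value, or None if it is missing or expired."""
        entry = self._items.get((tuple(namespace), key))
        if entry is None or self._expired(entry[1]):
            return None
        return entry[0]

    def search(self, namespace_prefix, match=None):
        """Return (key, value) pairs under a namespace prefix whose fields equal `match`."""
        prefix = tuple(namespace_prefix)
        results = []
        for (namespace, key), (value, expires_at) in self._items.items():
            if namespace[: len(prefix)] != prefix or self._expired(expires_at):
                continue
            if all(value.get(field) == wanted for field, wanted in (match or {}).items()):
                results.append((key, value))
        return results


if __name__ == "__main__":
    store = ToyMemoryStore()
    # Conversation thread A writes a memory scoped to the user, not to the thread.
    store.put(("user-123", "preferences"), "language", {"value": "Spanish", "kind": "profile"})
    # Conversation thread B, a different session, reads the same memory back.
    print(store.get(("user-123", "preferences"), "language"))
    print(store.search(("user-123",), match={"kind": "profile"}))
```

Because the namespace identifies the user and memory category instead of the conversation, any later session that knows the namespace can read the memory. This is the property that separates a cross-thread store from a thread-scoped checkpointer.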

August 20, 2025

Artificial Intelligence

Building an Agentic AI Fleet Management Solution

Artificial intelligence is revolutionizing the manufacturing and motion industry, with AI-powered solutions now capable of delivering precise, real-time insights that can optimize everything from route planning to predictive maintenance. Modern vehicles can generate an overwhelming amount of data (nearly 25 GB per hour) through a diverse range of sensors, according to an article from S&P Global Mobility. Contextualizing this data with user feedback, maintenance records, and technical knowledge becomes increasingly challenging as the system scales. These complexities can create inefficiencies, introduce overhead while processing data, and drive up operational costs, hindering the full potential of AI-driven systems. An efficient fleet management architecture can address these problems by reducing redundancies, optimizing data retrieval processes, and enabling the seamless integration and use of embeddings. MongoDB’s flexible document model is a natural fit for this approach. Unlike legacy SQL databases, MongoDB excels at managing unstructured, semi-structured, and structured data. This capability allows fleet management software to ingest and process diverse data types, including vehicle signal data, geospatial zones, fleet configurations, query logs, route telemetry, maintenance records, and real-time performance scores. In this post, we will use various MongoDB Atlas features, such as geospatial query operations, time-series collections, Atlas Charts, and aggregation pipelines, to create an agentic AI-powered fleet management system. This system demonstrates how an AI agent can enable intelligent data processing, providing real-time, context-aware responses to user queries in a streamlined manner.
Fleet management software with AI overview A traditional fleet management system provides features like resource planning, route optimization, and maintenance scheduling which work together to improve Cost Management, Regulatory Compliance, and Overall Operational Effectiveness (OEE). Our solution harnesses the power of MongoDB's flexible document schema, time-series collections, and geospatial query support to give fleet managers the ability to query, filter and operate on data effectively. Additionally, an AI Agent assists users in obtaining actionable insights through a chat-driven interface. Figure 1. Architecture of the solution. The AI agent has a chatbot UI. The data captured by the agent is used to trigger an orchestration service which then calls various tools as required and gets data from MongoDB in order to complete its task. In Figure 1, the telemetry data from our cars is stored in MongoDB in time series collections via microservices. In addition to the telemetry data we store stationary car information (e.g., brand, model, year, VIN, among others) and user configurations, such as past queries and fleet settings. All of this data is leveraged by the agentic system to answer user queries and provide deeper insights for future references to similar queries. Figure 2 shows the user interface of the agentic system where queries can be submitted directly. Filters allow users to narrow results by fleet, time range, or geozone, while the AI Agent delivers answers using real-time and historical data. Figure 2. Demo chat section. When a user inputs a question into the chat box, the AI Agent analyzes it by embedding the query into metadata and searching for similar prior questions in the historical recommendations collection. Depending on the tools required, the system accesses contextual data across collections, such as time-series metrics, geospatial locations, or maintenance logs, through aggregation pipelines. 
Once the relevant data is assembled, the AI synthesizes the information into actionable insights, providing the user with an accurate and informative response. MongoDB features for a fleet management system RAG framework with MongoDB Vector Search Agents powered by retrieval-augmented generation (RAG) are transforming fleet management systems by seamlessly integrating real-time contextual information during response generation. MongoDB’s flexible NoSQL model complements RAG by embedding fast, low-latency document data. Combined with Voyage AI’s cost-efficient embedding model, MongoDB accelerates vector search workflows for smarter decision-making. MongoDB Atlas Vector Search empowers the agent to operate proactively by connecting user queries with relevant insights stored in the database. For instance, when a fleet manager asks about the current positions of vehicles, the agent leverages MongoDB’s vector search to match the query against historical recommendations. If similar queries already exist, the agent retrieves pre-existing results instantly, reducing both latency and operational costs. In situations where no matching results are found, the agent complements vector search by invoking LLMs to dynamically generate answers, ensuring fleet managers receive accurate and actionable responses. This streamlined workflow, powered by MongoDB’s unique combination of vector search and flexible data modeling, allows fleet managers to act on real-time, context-aware insights. From analyzing geospatial patterns to addressing systemic vehicle issues, MongoDB enables the agent to simplify complex decision-making while maintaining efficiency. By combining predictive AI capabilities with an optimized, scalable database, this solution transforms fleet management into a more proactive, data-driven process.
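The lookup-then-generate flow described above can be sketched in a few lines. This is a simplified illustration, not the demo's implementation: it uses brute-force cosine similarity over an in-memory list instead of Atlas Vector Search, toy three-dimensional vectors instead of real embeddings, and a hypothetical similarity threshold of 0.9. The function names and the generate_answer callback are assumptions introduced for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def answer_query(question, query_embedding, history, generate_answer, threshold=0.9):
    """Reuse a stored recommendation when a similar question exists, else generate one.

    `history` is a list of dicts with "question", "embedding", and "recommendation".
    `generate_answer` is a hypothetical callback standing in for the LLM call.
    """
    best_entry, best_score = None, -1.0
    for entry in history:
        score = cosine_similarity(query_embedding, entry["embedding"])
        if score > best_score:
            best_entry, best_score = entry, score

    if best_entry is not None and best_score >= threshold:
        # A similar question was answered before: skip the LLM call entirely.
        return {"source": "history", "answer": best_entry["recommendation"]}

    # No close match: generate a fresh answer and remember it for next time.
    answer = generate_answer(question)
    history.append(
        {"question": question, "embedding": list(query_embedding), "recommendation": answer}
    )
    return {"source": "llm", "answer": answer}


if __name__ == "__main__":
    history = [
        {
            "question": "Where are my vehicles right now?",
            "embedding": [1.0, 0.0, 0.0],
            "recommendation": "All 12 vehicles are inside the downtown geofence.",
        }
    ]
    fake_llm = lambda q: f"Generated answer for: {q}"
    print(answer_query("Current vehicle positions?", [0.99, 0.05, 0.0], history, fake_llm))
    print(answer_query("Which cars need maintenance?", [0.0, 1.0, 0.0], history, fake_llm))
```

The threshold controls the trade-off: a higher value reuses fewer stored answers but reduces the risk of returning a recommendation for a question that only looks similar.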
Polymorphism MongoDB’s document model allows storing polymorphic data structures within the same collection, meaning documents can vary in structure and embed other documents. This flexibility enables our demo to optimize workflows by storing application-specific metadata tailored to fleet operations. For instance, the historical_recommendations collection stores query and recommendation histories generated by the system’s AI engine, with the ability to embed metadata dynamically, such as the initial question asked, the tool chosen, and the results it returned. This streamlines read operations and gives our AI agent more context for future queries. For example, a document in this collection might appear as follows: Figure 3. Document model of historical_recommendations. This variability in structure without sacrificing efficiency enables MongoDB to adapt to dynamic data storage requirements inherent in polymorphic workflows. By embedding detailed context and avoiding null values, the system can streamline read operations and provide richer context to the AI agent for future queries. Time series collections MongoDB's time series collections simplify working with time series data. These specialized collections provide several benefits, including automatic creation of compound indexes for faster data retrieval, reduced disk usage, and lower I/O overhead for read operations. This makes time series collections highly efficient for managing time-stamped data, such as a constant stream of sensor data from vehicles in our application. With these capabilities, fleet managers can enable near real-time access to data, empowering AI agents to rapidly extract actionable insights for fleet management. In this demo, MongoDB optimizes query efficiency in our time series collections using its bucketing mechanism.
This mechanism groups multiple data points within the same time range into compressed blocks, reducing the number of documents scanned during queries. Bucketing also minimizes read operations and disk usage, enabling faster range queries and ensuring sustained, optimized cluster performance, even under a humongous load. Geospatial queries MongoDB’s native support for geospatial queries enables seamless integration of robust location-based functionalities. The ability to handle complex geographic data is a powerful tool for industries relying on real-time location-based decision-making. In our demo, this capability is leveraged to locate vehicles under various conditions, such as identifying vehicles near or inside a specified geofence, while being able to filter by maximum or minimum distance. Also, geospatial queries can be incorporated directly into aggregation pipelines, enhancing AI-driven workflows powered by our AI Agent. Key takeaways MongoDB enables fleet managers to efficiently gather, process, and analyze data to uncover actionable insights. These capabilities empower managers to optimize operations, enhance vehicle oversight, and implement smarter, data-driven strategies that drive efficiency and performance. Visit MongoDB Atlas to start modernizing your fleet management system. Ready to transform your fleet management operations? Unlock real-time insights, optimize systems, and make smarter decisions with MongoDB’s advanced features. If you're interested in exploring how MongoDB enables intelligent fleet management, check out our Leafy Fleet GitHub repository. Access the Leafy Fleet on GitHub. Additionally, dive deeper into best practices for modeling connected vehicle signal data and learn how MongoDB’s flexible data model simplifies telemetry management at scale. Read the blog post.
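As an illustration of the geofence and distance filtering described in the geospatial section, the sketch below builds the query documents as plain Python dictionaries. It uses the $geoNear aggregation stage and the $geoWithin operator, both of which are MongoDB query features, but the field names (location, fleet_id, distance_m) and helper function names are hypothetical and not taken from the Leafy Fleet repository. It assumes the target collection stores GeoJSON points in a field covered by a 2dsphere index.

```python
def vehicles_near_pipeline(longitude, latitude, max_distance_m,
                           min_distance_m=0.0, fleet_id=None, limit=10):
    """Build an aggregation pipeline that returns vehicles ordered by distance.

    $geoNear must be the first stage of the pipeline; distances are in meters
    when `near` is a GeoJSON point.
    """
    geo_near = {
        "near": {"type": "Point", "coordinates": [longitude, latitude]},
        "distanceField": "distance_m",   # hypothetical output field name
        "maxDistance": max_distance_m,
        "minDistance": min_distance_m,
        "spherical": True,
    }
    if fleet_id is not None:
        geo_near["query"] = {"fleet_id": fleet_id}  # hypothetical field name
    return [{"$geoNear": geo_near}, {"$limit": limit}]


def vehicles_in_geofence_filter(polygon_ring):
    """Build a filter that matches vehicles located inside a polygonal geofence.

    `polygon_ring` is a list of [longitude, latitude] pairs; GeoJSON requires
    the ring to be closed, so the first and last positions must be identical.
    """
    if len(polygon_ring) < 4 or polygon_ring[0] != polygon_ring[-1]:
        raise ValueError("geofence ring must be closed and have at least 4 positions")
    geometry = {"type": "Polygon", "coordinates": [polygon_ring]}
    return {"location": {"$geoWithin": {"$geometry": geometry}}}


if __name__ == "__main__":
    pipeline = vehicles_near_pipeline(-73.99, 40.73, max_distance_m=5_000, fleet_id="fleet-1")
    geofence = vehicles_in_geofence_filter(
        [[-74.0, 40.7], [-74.0, 40.8], [-73.9, 40.8], [-73.9, 40.7], [-74.0, 40.7]]
    )
    print(pipeline)
    print(geofence)
    # With a PyMongo collection these could be passed to collection.aggregate(pipeline)
    # and collection.find(geofence); no database connection is made in this sketch.
```

Note the coordinate order: GeoJSON positions list longitude first, then latitude, which is a common source of mistakes when building these queries by hand.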

August 19, 2025

Artificial Intelligence

Unlock Multi-Agent AI Predictive Maintenance with MongoDB

The manufacturing sector is navigating a growing number of challenges: evolving customer demands, intricate software-mechanical product integrations, just-in-time global supply chains, and a shrinking skilled labor force. Meanwhile, the entire sector is working under intense pressure to improve productivity, manage energy consumption, and keep costs in check. To stay competitive, the industry is undergoing a digital transformation—and data is at the center of that shift. Data-driven manufacturing offers a powerful answer to many of these challenges. On the shop floor, one of the most critical and high-impact applications of these strategies is predictive maintenance. Downtime isn’t just inconvenient—it’s expensive. For example, every unproductive hour in the automotive sector now costs $2.3 million (according to Siemens "The True Cost of Downtime 2024" report). For manufacturers across all sectors, predictive maintenance is no longer optional. It’s a foundational pillar of operational excellence. At its core, predictive maintenance is about using data to anticipate machine failures before they happen. It began with traditional statistical models, evolved with machine learning, and is now entering a new era. As equipment ages and failure behaviors shift, models must adapt. This has led to the adoption of more advanced approaches, including generative AI with retrieval-augmented generation (RAG) capabilities. But the next frontier is multi-agent systems—AI-powered agents working together to monitor, reason, and act. We’ve explored how generative AI powers predictive maintenance in previous posts. In this blog post, we’ll go deeper into multi-agent systems and how MongoDB makes it easy to build and scale them for smart, responsive maintenance strategies. Advance your data-driven manufacturing strategy with Agentic AI AI agents combine large language models (LLMs) with tools, memory, and logic to autonomously handle complex tasks. 
On the shop floor, this means agents can automate inspections, reoptimize production schedules, assist with fault diagnostics, and more. According to a LangChain survey , 78% of companies are actively developing AI agents, and over half already have at least one agent in production. Manufacturing companies can especially benefit from agentic capabilities across a great variety of practical use cases, as shown in Figure 1. Figure 1. Agent capabilities and related practical use cases in manufacturing. But leveraging AI agents in industrial environments presents unique challenges. Integration with industrial protocols like Modbus or PROFINET is complex. Governance and security requirements are strict, especially when agents interact with production equipment. Latency is also a concern as AI models need fast, reliable data access to support real-time responses. And with agents generating and consuming large volumes of data, companies need a data foundation that is reliable and can scale without sacrificing performance. Many of these challenges are not new to manufacturers—and MongoDB has a proven track record of addressing them. Industry leaders in manufacturing and automotive trust MongoDB to power critical IoT and telemetry use cases. Bosch , for example, uses MongoDB to store, manage, and analyze huge amounts of data to power its Bosch IoT Insights solution. MongoDB’s flexible document model is ideal for diverse sensor inputs and machine telemetry, while allowing systems to iterate and evolve quickly. It’s important to remember that, at its core, MongoDB was built for change, so when it comes to integrating AI in the shopfloor, it’s no surprise that MongoDB is emerging as the ideal data layer foundation. Companies like Novo Nordisk and Cisco rely on MongoDB to build and scale their AI capabilities, and leading platforms like XMPro APEX AI leverage MongoDB Atlas to create and manage advanced AI agents for industrial applications. 
MongoDB Atlas makes it easy to build AI Agents and operate them at scale. As both a vector and a document database, Atlas supports various search methods for agentic RAG, while also enabling agents to store short and long-term memory in the same database. The result is a unified data layer that bridges industrial IoT and agentic AI. Predictive maintenance is a perfect example of how these capabilities come together to drive real impact on the shop floor. In the next section, we’ll walk through a practical blueprint for building a multi-agent predictive maintenance system using MongoDB Atlas. Building a multi-agent predictive maintenance system This solution demonstrates how to build a multi-agent predictive maintenance system using MongoDB Atlas, LangGraph, and Amazon Bedrock. This system can streamline complex processes, such as detecting equipment anomalies, diagnosing root causes, generating work orders, and scheduling maintenance. At a high level, this solution leverages MongoDB Atlas as the unified data layer. LangGraph provides the orchestration layer, enabling graph-based coordination among agents, while Amazon Bedrock powers the underlying foundational models used by the agents to reason and make decisions. The architecture follows a supervisor-agent pattern. The supervisor coordinates tasks and delegates to three specialized agents: Failure agent , which performs root cause analysis and generates incident reports. Work order agent , which drafts maintenance work orders with detailed requirements. Planning agent , which identifies the optimal time slot for the maintenance task based on availability and production constraints. Figure 2. High-level architecture of a multi-agent predictive maintenance system. This modular design enables the system to scale easily and adapt to different operational needs. Let’s walk through the full process in four key steps. 
Step 1: Failure prediction kicks off the agentic workflow The process begins with an alert—something unusual in the machine data or logs that could point to a potential failure. MongoDB provides a unified view of operational data, real-time processing capabilities, and seamless compatibility with machine learning tools. Sensor data is processed in real-time using Atlas Stream Processing integrated with ML inference models. Features like native support for Time Series data and Online Archive facilitate managing telemetry data at scale efficiently. All while the downstream applications remain up to date with the latest notifications and dashboards by using Atlas Triggers , Change Streams , and Atlas Charts . From there, the supervisor agent takes over and coordinates the next steps. Figure 3. End-to-end failure prediction process that generates the alerts. Step 2: Leverage your data for root cause analysis The supervisor notifies the Failure Agent about the alert. Manual diagnostics of a machine can take hours—sifting through manuals, historical logs, and environmental data. The AI agent automates this process. It collects relevant documents, retrieves contextual insights using Atlas vector search, and analyzes environmental conditions stored in the database—like temperature or humidity at the time of failure. With this data, the agent performs a root cause analysis and proposes corrective actions. It generates a concise incident report and shares it with the supervisor agent, which then moves the workflow forward. Figure 4. Failure Agent performing root cause analysis. Step 3: Work order process automation The Work Order Agent receives the incident report and drafts a comprehensive maintenance work order. It pulls from previous similar tasks to estimate time requirements, identify the necessary materials, and ensure the right skill sets are listed. All of this is pre-filled into a standardized work order template and saved back into MongoDB Atlas. 
This step also includes a human-in-the-loop checkpoint. Technicians or supervisors can review and modify the draft before it is finalized. Figure 5. Work Order Agent is generating a draft work order and routing it for human validation. Step 4: Finding the optimal maintenance schedule Once the work order is approved, the Planning Agent steps in. Its task is to schedule the maintenance activity without disrupting production. The agent queries the production calendar, checks staff shift schedules, and verifies inventory availability for required materials. It considers alert severity and rescheduling constraints to find the most efficient time slot. Once the optimal window is identified, the agent sends the updated plan to the scheduling system. Figure 6. Planning Agent is evaluating constraints to identify the optimal maintenance schedule. While we focused on a predictive maintenance workflow, this architecture can be easily extended. Need agents for compliance reporting, spare parts procurement, or shift planning? No problem. With the right foundation, the possibilities are endless. Unlocking manufacturing excellence with Agentic AI Agentic AI represents a new chapter in the evolution of predictive maintenance, enabling manufacturers to move from reactive responses to intelligent, autonomous decision-making. By combining AI agents with real-time telemetry and a unified data foundation, teams can reduce downtime, cut maintenance costs, and boost equipment reliability. But to work at scale, these systems need flexible, high-performance infrastructure. With native support for time series data, vector search, stream processing, and more, MongoDB makes it easier to build, operate, and evolve multi-agent solutions in complex industrial environments. The result is smarter operations, greater resilience, and a clear path to manufacturing excellence. Clone the GitHub repository if you are interested in trying out this solution yourself.
To learn more about MongoDB’s role in the manufacturing industry, please visit our manufacturing and automotive webpage.
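To make the Planning Agent's constraint check in Step 4 more concrete, here is a minimal sketch in Python. It is not the code from the repository: the `Slot` fields, the `pick_slot` helper, and the severity-to-delay mapping are all hypothetical names and values chosen for illustration. It assumes the calendar, shift, and inventory lookups have already been resolved into simple flags per candidate window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Slot:
    """A candidate maintenance window (hypothetical shape, not the demo's schema)."""
    start: datetime
    production_running: bool   # taken from the production calendar
    technician_on_shift: bool  # taken from the staff shift schedule
    parts_in_stock: bool       # taken from the inventory check


# Assumed mapping from alert severity to the longest acceptable delay, in hours.
MAX_DELAY_HOURS = {"high": 8, "medium": 48, "low": 168}


def pick_slot(slots, alert_time, max_delay_hours):
    """Return the earliest feasible slot within the allowed delay, or None."""
    deadline = alert_time + timedelta(hours=max_delay_hours)
    feasible = [
        s for s in slots
        if alert_time <= s.start <= deadline
        and not s.production_running
        and s.technician_on_shift
        and s.parts_in_stock
    ]
    return min(feasible, key=lambda s: s.start, default=None)


if __name__ == "__main__":
    alert = datetime(2025, 1, 6, 8, 0)
    candidates = [
        Slot(alert + timedelta(hours=2), True, True, True),
        Slot(alert + timedelta(hours=6), False, True, True),
    ]
    print(pick_slot(candidates, alert, MAX_DELAY_HOURS["high"]))
```

Picking the earliest feasible window is only one possible policy; a real planner might also weigh production cost or technician workload.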

August 18, 2025

Artificial Intelligence

rerank-2.5 and rerank-2.5-lite: Instruction-Following Rerankers

Note to readers: rerank-2.5 and rerank-2.5-lite are available through the Voyage AI APIs directly. For access, sign up for Voyage AI.

TL;DR: We are excited to introduce the rerank-2.5 series, which significantly improves upon rerank-2’s performance while also introducing instruction-following capabilities for the first time. On our standard suite of 93 retrieval datasets spanning multiple domains, rerank-2.5 and rerank-2.5-lite improve retrieval accuracy by 7.94% and 7.16%, respectively, over Cohere Rerank v3.5. Furthermore, the new instruction-following feature allows users to steer the model’s output relevance scores using natural language. On the Massive Instructed Retrieval Benchmark (MAIR), rerank-2.5 and rerank-2.5-lite outperform Cohere Rerank v3.5 by 12.70% and 10.36%, respectively, and by similar margins on our in-house evaluation datasets. Both models now support a 32K token context length (8x that of Cohere Rerank v3.5 and double that of rerank-2), enabling more accurate retrieval across longer documents.

Rerankers are a critical component in sophisticated retrieval systems, refining initial search results to deliver superior accuracy. Today, we are excited to announce rerank-2.5 and rerank-2.5-lite. Both models outperform LLMs as rerankers, a topic we will dive deeper into in an upcoming blog post. These models are the product of an improved mixture of training data and advanced distillation techniques from our larger, in-house instruction-following models. Both rerank-2.5 and rerank-2.5-lite now support a 32K token context length, an 8x increase over Cohere Rerank v3.5. This allows for the reranking of much longer documents without truncation and comes with no change in pricing. For an introduction to rerankers, check out our previous post.

Instruction-following capability

A key feature of the rerank-2.5 series is its instruction-following capability.
This allows users to dynamically steer the reranking process by providing explicit instructions alongside their query. These instructions can define the user’s notion of relevance or specify the desired characteristics of the documents to be retrieved. Leveraging the new instruction-following capability is straightforward: users simply append or prepend natural language instructions to their queries. The model is designed to understand these instructions and adjust the output relevance score accordingly.

Examples of instructions

Instructions can include, but are not limited to, the following examples:

Emphasizing query components: Specify which parts of a document are most important. For an application that searches academic papers, a standing instruction could be “Prioritize the title and ignore the abstract” to consistently surface the most relevant research based on titles.

Defining document types: Direct the reranker to retrieve a specific type of document. For the query “legal implications of AI,” an instruction could be “Retrieve regulatory documents and legal statutes, not court cases.” A legal research tool could be configured with this as a standing instruction so that all queries prioritize statutory law over case law.

Disambiguating queries with contexts: Provide complementary information so that ambiguous queries can be clarified. For example, an instruction could be “This is an e-commerce application about cars” so that the word “Jaguar” will be interpreted as the car brand rather than as an animal.

Concrete examples of instructions and the impact of instructions on search results are available in Appendix A.

Accuracy gains from instruction following: The instruction-following feature is particularly useful for search/retrieval tasks where user intent can be nuanced.
To demonstrate this, we built a set of in-house evaluation datasets composed of 24 domain-specific instruction-following datasets across 7 domains (web, tech, legal, finance, conversational, medical, and code). On domain-specific data, the accuracy of rerank-2.5 and rerank-2.5-lite increases by an average of 8.13% and 7.55%, respectively, when leveraging instructions.

Figure 1. Domain-specific instruction-following results: accuracy of rerank-2.5 with and without instructions.

Evaluation details

Datasets: For standard results without instruction following, we conducted an evaluation across 9 domains: technical documentation, code, law, finance, web reviews, multilingual, long documents, medical, and conversations. The multilingual domain is composed of 51 datasets from 31 languages. Detailed information about each of the domains and languages can be found in the rerank-2 release blog. To evaluate instruction-following capabilities, we use a set of in-house domain-specific and real-world instruction-following datasets (detailed in the previous section) as well as the MAIR (Massive Instructed Retrieval) benchmark, an academic benchmark with task-specific instructions in domains such as web, legal, and biomedical search.

Method and metrics: We evaluate the retrieval quality of various rerankers on top of four first-stage search methods: (1) lexical search with BM25, (2) OpenAI v3 large (text-embedding-3-large), (3) voyage-3-large, and (4) voyage-3.5. For each query, the first-stage method retrieves up to 100 candidate documents. The reranker then re-orders these documents, and we keep the top 10. We report the normalized discounted cumulative gain (NDCG@10), the standard metric for retrieval quality.

Baselines: We compare our models against rerank-2-lite, rerank-2, Cohere Rerank v3.5, and Qwen3-Reranker-8B.

Results

rerank-2.5 and rerank-2.5-lite collectively set a new cost-to-performance frontier.
Specifically, rerank-2.5 outperforms rerank-2 by 1.85% at the same price per token, while rerank-2.5-lite outperforms rerank-2-lite by 3.40% at the same price per token. Furthermore, rerank-2.5-lite performs better than Qwen3-Reranker-8B, the best open-source reranker, despite being over an order of magnitude smaller.

Figure 2. Retrieval quality versus price per million tokens for rerankers. We use $0.10 for Qwen3-Reranker-8B following the industry standard for 8B-parameter models.

Real-world instruction following: In addition to the 24 domain-specific instruction-following datasets, we also curated 3 instruction-following datasets from real-world applications. Evaluating on these datasets shows that the accuracy of rerank-2.5 and rerank-2.5-lite increases by an average of 11.48% and 7.83%, respectively, when leveraging instructions.

Figure 3. Real-world instruction-following results: accuracy of Voyage AI rerankers with and without instructions.

Results without instruction following: The first bar chart below shows the average accuracy of each reranker when evaluated across 9 domains without instruction following. rerank-2.5 and rerank-2.5-lite consistently emerge as the top-performing rerankers, regardless of the first-stage retrieval method used. This is not the case for Cohere Rerank v3.5, which hurts retrieval quality when applied on top of voyage-3-large (the most powerful first-stage retrieval method). In particular:

Averaged across the four first-stage retrieval methods, rerank-2.5 outperforms Cohere Rerank v3.5, Qwen3-Reranker-8B, and rerank-2 by 7.94%, 2.25%, and 1.85%, respectively.

rerank-2.5-lite, while optimized for latency, still outperforms Cohere Rerank v3.5, Qwen3-Reranker-8B, and rerank-2 by 7.16%, 1.47%, and 1.08%, respectively.

Both rerank-2.5 and rerank-2.5-lite provide a significant quality improvement on top of all first-stage retrieval results.

Figure 4. Reranker averages across domains without instruction.
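For readers unfamiliar with the NDCG@10 metric reported in these results, the following is a minimal sketch of one common way to compute it. It is an illustration, not the evaluation code used for this post: it assumes linear gains (the relevance label itself rather than 2^rel - 1) and assumes the ideal ordering can be derived from the labels of the candidate list alone. The function names are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with a log2 position discount (ranks start at 1)."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k for one query.

    ranked_relevances holds the relevance label of each candidate document,
    in the order the reranker returned them.
    """
    ideal = sorted(ranked_relevances, reverse=True)
    ideal_dcg = dcg(ideal[:k])
    if ideal_dcg == 0:
        return 0.0  # no relevant documents among the candidates
    return dcg(ranked_relevances[:k]) / ideal_dcg


if __name__ == "__main__":
    # The only relevant document was ranked second instead of first.
    print(ndcg_at_k([0, 1, 0]))
```

A dataset-level score would then be the mean of `ndcg_at_k` over all queries.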
The bar charts below illustrate NDCG@10 across different languages. Both rerank-2.5 and rerank-2.5-lite consistently increase performance across the board for all languages and first-stage retrieval methods. Specifically:

Averaged across the four first-stage retrieval methods, rerank-2.5 outperforms Cohere Rerank v3.5, Qwen3-Reranker-8B, and rerank-2 by 3.26%, 2.34%, and 1.35%, respectively.

Likewise, rerank-2.5-lite outperforms Cohere Rerank v3.5, Qwen3-Reranker-8B, and rerank-2-lite by 1.93%, 1.01%, and 2.70%, respectively.

Figure 5. Retrieval accuracy averages for reranker models across languages.

Detailed domain-specific and multilingual results using BM25, voyage-3-large, and voyage-3.5 as first-stage retrieval methods can be found in Appendix B.

MAIR benchmark: The figures below illustrate the accuracy gains attained by rerank-2.5 and rerank-2.5-lite on MAIR. Both rerank-2.5 and rerank-2.5-lite consistently improve atop all first-stage search results. Specifically:

rerank-2.5 outperforms Cohere Rerank v3.5 and rerank-2 by an average of 12.70% and 4.90%, respectively, when evaluated atop the four first-stage retrieval methods.

rerank-2.5-lite outperforms Cohere Rerank v3.5 and rerank-2 by an average of 10.36% and 2.57%, respectively, when evaluated atop the four first-stage retrieval methods.

Figure 6. Accuracy gains through MAIR.

Detailed results: Numeric results for all evaluations are available in this spreadsheet.

Try rerank-2.5 and rerank-2.5-lite today!

Both rerank-2.5 and rerank-2.5-lite are available today with flexible, token-based pricing. For existing rerank-2 and rerank-2-lite users, we recommend upgrading to rerank-2.5 and rerank-2.5-lite, respectively. This upgrade provides better quality and double the context length at the same cost. We will continue to offer the rerank-2 series for existing users who do not wish to upgrade to rerank-2.5. For new users, head over to our docs to get started and learn more; the first 200M tokens are free.
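As a starting point, here is a minimal sketch of passing an instruction to the reranker with the `voyageai` Python client. The post only says that instructions are appended or prepended to the query, so the newline separator and the helper names `with_instruction` and `rerank_with_instruction` are assumptions made for illustration; check the official docs for the recommended format.

```python
def with_instruction(instruction: str, query: str) -> str:
    """Prepend a natural-language instruction to the query text.

    The newline separator is an assumption, not a documented requirement.
    """
    return f"{instruction.strip()}\n{query.strip()}"


def rerank_with_instruction(instruction, query, documents, model="rerank-2.5", top_k=3):
    """Rerank documents with an instructed query.

    Requires the `voyageai` package and a VOYAGE_API_KEY environment variable.
    """
    import voyageai  # imported lazily so the helper above works without the package

    client = voyageai.Client()
    reranking = client.rerank(
        with_instruction(instruction, query), documents, model=model, top_k=top_k
    )
    # Each result carries the document's original index and its relevance score.
    return [(r.index, r.relevance_score) for r in reranking.results]


if __name__ == "__main__":
    print(with_instruction("This is an e-commerce application about cars", "Jaguar"))
```

Keeping the instruction separate from the user query until the last moment makes it easy to configure a standing instruction per application while queries vary.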
As our results show, combining Voyage embedding models with Voyage rerankers delivers the highest retrieval accuracy in our evaluations.

Appendix A: Examples of instruction following

Each example lists the query and instruction, followed by the top-1 document returned by each model.

Example 1
Query: Who is at highest risk for Hand-foot-and-mouth disease?
Instruction: Focus on age-related risk factors and the most common age group.
rerank-2 (No Instruction): Children are at the highest risk of getting the disease.
rerank-2.5 (With Instruction): The most important risk factor is age. The infection occurs most often in children under age 10, but can be seen in adolescents and occasionally adults. The outbreaks occur most often in the summer and early fall.

Example 2
Query: what does unlock my device mean?
Instruction: I am an international business person and need to unlock my work phone for use with multiple carriers. My search should prioritize the implications of unlocking a device provided by my employer, focusing on adherence to my company’s BYOD policies and the impact on global connectivity.
rerank-2 (No Instruction): The term “unlock my device” typically means making a mobile phone or tablet work with a different service provider’s network… It’s like having a key to open up your selection of providers.
rerank-2.5 (With Instruction): Gaining access to other network services on a phone initially configured for a specific telecommunications company… For a professional who travels across borders on business transactions, this process allows seamless switching to local service providers, ensuring constant connection to corporate networks and clients, given that this does not contravene any pre-established protocols…

Example 3
Query: Why are historical prices of stocks different on different websites? Which one should I believe?
Instruction: Explain the process and challenges of collecting and reconciling historical stock price data across different financial websites.
rerank-2 (No Instruction): I still can’t understand why there is a price discrepancy. There isn’t. It’s the same stock and price differences between such major exchanges will always be minimal…
rerank-2.5 (With Instruction): The cause of incomplete/inaccurate financial data’s appearing on free sites is that it is both complicated and expensive to obtain and parse these data. Even within a single country, different pieces of financial data are handled by different authorities… There are some companies (e.g. Bloomberg) whose entire business model is to do the above…

Appendix B: Figures for domain-specific and multilingual results

The figures below show results on domain-specific datasets without instructions for BM25, voyage-3-large, and voyage-3.5 as the first-stage retrieval method, respectively:

The figures below show results on multilingual datasets when using BM25, voyage-3-large, and voyage-3.5 as the first-stage retrieval method, respectively:

August 11, 2025

Artificial Intelligence

How Tavily Uses MongoDB to Enhance Agentic Workflows

As AI agents grow in popularity and are used in increasingly mission-critical ways, preventing hallucinations and giving agents up-to-date context is more important than ever. Context can come from many sources: prompts, documents, proprietary internal databases, and the internet itself. Among these sources, the internet stands out as uniquely valuable, a best-in-class resource for humans and LLMs alike due to its massive scale and constant updates. But how can large language models (LLMs) access the latest and greatest information from the internet?

Enter Tavily, one of the companies at the heart of this effort. Tavily provides an easy way to connect the web to LLMs, giving them the answers and context they need to be even more useful. MongoDB had the opportunity to sit down with Rotem Weiss, CEO of Tavily, and Eyal Ben Barouch, Tavily’s Head of Data and AI, to talk about the company’s history, how Tavily uses MongoDB, and the future of agentic workflows.

Tavily’s origins

Tavily began in 2023 with a simple but powerful idea. "We started with an open source project called GPT Researcher," Weiss said. "It did something pretty simple: go to the web, do some research, get content, and write a report." That simplicity struck a chord. The project exploded, getting over 20,000 GitHub stars in under two years, signaling to the team that they had tapped into something developers desperately needed.

The viral success revealed a fundamental gap in how AI systems access information. "So many use cases today require real-time search, whether it's from the web or from your users," Weiss noted. "And that is basically RAG (retrieval-augmented generation)."

"Developers are slowly realizing not everything is semantic, and that vector search alone cannot be the only solution for RAG," Weiss said. Indeed, for certain use cases, vector stores benefit from further context.
This insight, buttressed by breakthrough research around CRAG (Corrective RAG), pointed toward a future where systems automatically turn to the web to search when they lack sufficient information.

Solving the real-time knowledge problem

Consider the gap between static training data and our dynamic reality. Questions like "What is the weather today?" or "What was the score of the game last night?" require an injection of real-time information to answer accurately. Tavily's system fills this gap by providing AI agents with fresh, accurate data from the web, exactly when they need it.

The challenge Tavily addresses goes beyond information retrieval. “Even if your model ‘knows’ the answer, it still needs to be sent in the right direction with grounded results. Using Tavily makes your answers more robust,” Weiss explained.

The new internet graph

Weiss envisions a fundamental shift in how we think about the architecture of the web. "If you think about the new internet, it’s a fundamentally different thing. The internet used to be between people: you would send emails, you would search websites, etc. Now we have new players, the AI agents, who act as new nodes on the internet graph."

These new nodes change everything. As they improve, AI agents can perform many of the same actions as humans, but with different needs and expectations. "Agents want different things than people want," Weiss explained. "They want answers; they don't need fancy UIs and a regular browser experience. They need a quick, scalable system to give them answers in real time. That's what Tavily gives you."

The company's focus remains deliberately narrow and deep. "We always want to stick to the infrastructure layer compared to our competitors, since you don't know where the industry is going," Weiss said. "If we focus on optimizing the latency, the accuracy, the scalability, that's what is going to win, and that's what we're focused on."

Figure 1. The road to insightful responses for users with TavilyHybridClient.

MongoDB: The foundation for speed and scale

To build their infrastructure, Tavily needed a database that could meet their ambitious performance requirements. For Weiss, the choice was both practical and personal. "MongoDB is the first database I ever used as a professional in my previous company," he said. "That's how I started, and I fell in love with MongoDB. It's amazing how flexible it is; it's so easy to implement everything." The document model, the foundation upon which MongoDB is built, allowed Tavily to build and scale an enterprise-grade solution quickly.

But familiarity alone didn't drive the decision. MongoDB Atlas had the performance characteristics Tavily required. "Latency is one of the things that we always optimize for, and MongoDB delivers excellent price performance," Tavily’s Ben Barouch explained. "The performance is much more similar to a hot cache than a cold cache. It's almost like it's in memory!"

The managed service aspect proved equally crucial. "MongoDB Atlas also saves a lot of engineering time," Weiss noted. In a fast-moving startup environment, MongoDB Atlas enabled Weiss to focus on building Tavily and not worry about the underlying data infrastructure. "Today, companies need to move extremely fast, and at very lean startups, you need to only focus on what you are building. MongoDB allows Tavily to focus on what matters most, our customers and our business."

Three pillars of success

The Tavily team highlighted three specific MongoDB Atlas characteristics that have become essential to their operations:

Vector search: Perhaps most importantly for the AI era, MongoDB's vector search capabilities allow it to be "the memory for agents." As Weiss put it, "The only place where a company can have an edge is their proprietary data. Every company can access the best models, every company can search the web, every company can have good agent orchestration.
The only differentiation is utilizing your internal, proprietary data and injecting it in the fastest and most efficient way to the prompt." MongoDB, first with Atlas Vector Search and now with Hybrid Search, has effective ways of giving agents performant context, setting them apart from those built with other technologies.

Autoscaling: "Our system is built for a very fast-moving company, and we need to scale in a second," Weiss continued. "We don't need to waste time each week making changes that are done automatically by MongoDB Atlas."

Monitoring: "We have other systems where we need to do our own monitoring with other cloud providers, and it's a lot of work that MongoDB Atlas takes care of for us," Weiss explained. "MongoDB has great visibility."

Betting on proven innovation

Tavily has been impressed with the way MongoDB has kept a finger on the pulse of the evolving AI landscape and added features accordingly. “I believed that MongoDB would be up to date quickly, and I was right," Weiss said. "MongoDB quickly thought about vector search, about other features that I needed, and got them in the product. Not having to bolt on a separate vector database and having those capabilities natively in Atlas is a game changer for us."

Ben Barouch emphasized the strategic value of MongoDB’s entire ecosystem, including the community built around the database: "When everyone's offering the same solutions, they become the baseline, and then the things that MongoDB excels at, things like reliability and scalability, are really amplified. The community, especially, is great; MongoDB has excellent developer relations, so learning and using MongoDB is very easy."

The partnership between MongoDB and Tavily extends beyond technology to trust. "In this crazy market, where you have new tools every two hours and things are constantly changing, you want to make sure that you're choosing companies you trust to handle things correctly and fast," Weiss said. "I want a vendor where if I have feedback, I'm not afraid to say it, and they will listen."

Looking ahead: The multi-agent future

As Tavily continues building the infrastructure for AI agents to search the web, Weiss sees the next evolution already taking shape. "The future is going to be thinking about combining these one, two, three, four agents into a workflow that makes sense for specific use cases and specific companies. That will be the new developer experience."

This vision of orchestrated AI workflows represents just the beginning. With MongoDB Atlas providing the scalable, reliable foundation they need, Tavily is positioning itself at the center of a fundamental shift in how information flows through our digital world.

The internet welcomed people first, then connected them in revolutionary ways. Now, as AI agents join the network, companies like Tavily are building the infrastructure to ensure this next chapter of digital evolution is both powerful and accessible. With MongoDB as their foundation, they're not just adapting to the future; they're building it.

Interested in building with MongoDB Atlas yourself? Try it today! Use Tavily for working memory in this MongoDB tutorial. Explore Tavily’s Crawl to RAG example.

August 5, 2025

Artificial Intelligence

Automotive Document Intelligence with MongoDB Atlas Search

Picture two scenarios happening simultaneously across the automotive industry:

In a service bay, a technician searches frantically through multiple systems for the correct procedure to address an unfamiliar warning code. They need safety warnings, torque specifications, and part numbers immediately. Instead, they’re lost in hundreds of PDF pages, risking safety violations and extending repair times.

Meanwhile, a customer sits at home, trying to understand a dashboard warning light. They search their owner’s manual PDF, scroll through forums, and eventually call the dealership, waiting on hold just to ask a simple question about whether they can drive safely to their appointment.

Both scenarios represent massive inefficiencies in how automotive documentation is stored, accessed, and delivered. With technician shortages costing shops over $60,000 monthly per unfilled position, and 67% of customers preferring self-service options, the industry faces a critical gap between information availability and accessibility.

We prototyped a solution that shows how you can transform static automotive manuals into intelligent, searchable knowledge bases using MongoDB Atlas. By combining flexible document storage with semantic search capabilities, you can create platforms that serve both technicians seeking repair procedures and customers looking for quick answers.

Building intelligent documentation systems

Automotive technical documentation presents unique challenges. Most existing systems have fixed, unchangeable data formats designed primarily for compliance rather than usability. These systems often vary across locations, lack integration with user profiles, and don’t support rapid data access.

Organizations need to build custom ingestion pipelines that can process diverse documentation formats and create intelligent, searchable content. Success requires linking each interaction to user identity and storing information that supports immediate, personalized engagement.
MongoDB’s flexible document model enables developers to create highly enriched documentation chunks that go far beyond simple text storage. Each document can contain the original content alongside extensive metadata, including source references, safety classifications, procedural hierarchies, user permissions, version control, and contextual relationships. As your organizational needs evolve, you can add new fields and metadata structures without schema migrations or downtime, enabling documentation systems to adapt to changing business needs.

An alternative, or complementary, approach is using contextualized chunk embedding models like voyage-context-3. Instead of relying on manual metadata or context augmentation, this model generates vector embeddings that inherently capture full-document context for each chunk. This leads to higher retrieval accuracy, reduces sensitivity to chunking strategy, and simplifies the pipeline with no downstream changes. Whether you choose a metadata-rich approach, an embedding-first strategy, or both, MongoDB supports it all.

Figure 1. Document processing pipeline.

This flexibility proves essential when organizations have multiple documentation sources in different formats. Custom processing pipelines can normalize content from various systems while preserving the unique metadata and relationships that make each source valuable. MongoDB’s document structure naturally accommodates this complexity, storing structured technical specifications alongside unstructured procedural text and user interaction history, all queryable through a single interface.

Using a unified search that understands context

MongoDB Atlas provides three complementary search capabilities that work together to deliver intelligent responses:

MongoDB Atlas Search handles precise queries like part numbers and error codes. Technicians searching for a specific part number instantly find relevant diagnostic procedures, while customers typing “coolant warning light” get clear explanations.

MongoDB Atlas Vector Search understands intent and context. A customer asking “Why is my engine making a clicking noise?” finds relevant content even without using technical terminology. This approach enables semantic understanding of automotive diagnostic information, allowing queries to match meaning rather than exact keywords.

Hybrid search with $rankFusion combines both approaches, ensuring users find information whether they use technical terms or natural language:

    {
      $rankFusion: {
        input: {
          pipelines: {
            textSearch: [ { $search: ... } ],
            vectorSearch: [ { $vectorSearch: ... } ]
          }
        },
        combination: { weights: { textSearch: 1, vectorSearch: 1 } }
      }
    }

Setting up scalable architecture for dual-purpose knowledge delivery

The same MongoDB knowledge base serves both technicians and customers through tailored interfaces. Technicians access detailed procedures with safety warnings, technical specifications, and shop management system integration, while customers receive plain-language explanations, severity assessments, and service scheduling integration.

Figure 2. MongoDB Atlas servicing both the technician interface and the customer portal.

Custom-built processing pipelines can transform thousands of manual pages across multiple languages. MongoDB Atlas deployments can handle billions of documents while maintaining subsecond query performance. MongoDB Atlas Search and MongoDB Atlas Vector Search work together across this rich metadata, ensuring that whether users search for an error code or “Why won’t my car start?,” the system uses all available context to return relevant results quickly.
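To build intuition for how hybrid search merges the text and vector result lists, here is a minimal, self-contained sketch of weighted reciprocal rank fusion, the technique $rankFusion is based on. It runs in plain Python and is not MongoDB's implementation: the function name `reciprocal_rank_fusion`, the sample document IDs, and the smoothing constant `k=60` (a commonly used value) are assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings, weights=None, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: dict mapping a pipeline name to its ranked list of document IDs.
    weights:  optional dict mapping a pipeline name to its weight (default 1.0).
    k:        smoothing constant; 60 is an assumed, commonly used value.
    """
    weights = weights or {}
    scores = {}
    for name, ranked_ids in rankings.items():
        weight = weights.get(name, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Each appearance contributes weight / (k + rank) to the document's score.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    fused = reciprocal_rank_fusion(
        {
            "textSearch": ["coolant-dtc", "coolant-faq", "battery-faq"],
            "vectorSearch": ["coolant-faq", "battery-faq", "coolant-dtc"],
        },
        weights={"textSearch": 1, "vectorSearch": 1},
    )
    for doc_id, score in fused:
        print(doc_id, round(score, 5))
```

Because the fusion uses ranks rather than raw scores, the text and vector pipelines do not need comparable scoring scales; raising one pipeline's weight simply makes its ordering count for more.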
Having a real-world impact

When organizations replace static manuals with an AI-ready documentation platform, the upside reveals itself almost immediately: Customers find answers faster and adopt apps more readily, technicians spend less time hunting for information and more time generating revenue, and compliance teams rest easier knowing that critical warnings and audit trails live right inside every workflow.

Iron Mountain’s new InSight Digital Experience Platform (DXP), built on MongoDB Atlas and MongoDB Atlas Vector Search, is a great example of these benefits in action. By turning mountains of unstructured physical and digital content into searchable, structured data, Iron Mountain gives its customers powerful semantic search, context-aware recommendations, and AI-driven workflow automation, all while meeting strict regulatory requirements. Whether a user is looking for the latest repair bulletin, a decades-old loan document, or a region-specific compliance record, InSight DXP surfaces the right information instantly and tailors the guidance to each user’s expertise level.

Transform your technical documentation today

The automotive industry faces a clear inflection point. With McKinsey projecting $80 billion in automotive software market value by 2030 and technician shortages reaching crisis levels, organizations that turn their documentation systems from a cost center into a competitive advantage will capture disproportionate value.

Ready to revolutionize how your organization manages technical knowledge? Explore our automotive solutions and get started with MongoDB Atlas Vector Search today. Visit the MongoDB AI Learning Hub to learn more about building AI applications with MongoDB.

August 4, 2025

Artificial Intelligence

Fine-tune MongoDB Deployments with AppMap’s AI Tools and Diagrams

In a rapidly changing landscape, organizations that adapt for growth, efficiency, and competitiveness will be best positioned to succeed. Central to this effort is the continuous fine-tuning and troubleshooting of existing deployments, enabling companies to deliver high-performance applications that meet their business requirements. Yet, navigating application components often leads to long development cycles and high costs.

Developers spend valuable time deciphering various programming languages, frameworks, and infrastructures to optimize their systems. They may have to work with complicated, intertwined code, which makes updates difficult. Moreover, older architectures add to the information overload, with no institutional memory to help teams understand current workloads.

To help organizations overcome these challenges, AppMap partnered with MongoDB Atlas to fine-tune MongoDB deployments and achieve optimal performance, enabling developers to build more modern and efficient applications. The AppMap solution empowers developers with AI-driven insights and interactive diagrams that clarify application behavior, decode complex application architectures, and streamline troubleshooting. This integration delivers personalized recommendations for query optimization, proper indexing, and better database interactions. Complementing these capabilities, MongoDB Atlas offers the flexibility, performance, and security essential for building resilient applications and advancing AI-powered experiences.

AppMap’s technology stack

Founded in 2020 by CEO Elizabeth Lawler, AppMap empowers developers to visualize, understand, and optimize application behavior. By analyzing applications in action, AppMap delivers precise insights into interactions and performance dynamics, recording APIs, functions, and service behaviors. This information is then presented as interactive diagrams, as shown in Figure 1, which can be easily searched and navigated to streamline the development process.

Figure 1. Interactive diagram for a MongoDB query.

As shown below, AppMap also features Navie, an AI assistant. Navie offers customers advanced code architecture analysis and customized recommendations, derived from capturing application behavior at runtime. This rich data empowers Navie to deliver smarter suggestions, assisting teams in debugging complex issues, asking contextual questions about unfamiliar code, and making more informed code changes.

Figure 2. The AppMap Navie AI assistant.

With these tools, AppMap improves the quality of the code running with MongoDB, helping developers better understand the flow of their apps.

Using AppMap in a MongoDB application

Imagine that your team has developed a new e-commerce application running on MongoDB. But you're unfamiliar with how this application operates, so you'd like to gain insights into its behavior. In this scenario, you decide to analyze your application using AppMap by running the appmap-node package in front of your standard run command:

    npx appmap-node npm run dev

With this command, you use your application just like you normally would. But now, every time your app communicates through an API, it will create records. These records are used to create diagrams that help you see and understand how your application works. You can look at these diagrams to get more insights into your app's behavior and how it interacts with the MongoDB database.

Figure 3. Interaction diagram for an e-commerce application.

Next, you can use the Navie AI assistant to receive tailored insights and suggestions for your application. For instance, you can ask Navie to identify the MongoDB commands your application uses and to provide advice on optimizing query performance. Navie will identify the workflow of your application and may propose strategies to refine database queries, such as reindexing for improved efficiency or adjusting aggregation framework parameters.

Figure 4. Insights provided by the Navie AI assistant.
With this framework established, you can seamlessly interact with your MongoDB application, gain insights into its usage, enhance its performance, and achieve quicker time to market.

Enhancing MongoDB apps with AppMap

Troubleshooting and optimizing your MongoDB applications can be challenging because of the complexity of the related microservices that run your services. AppMap facilitates this process by providing in-depth insights into your application behavior with an AI-powered assistant, helping developers better understand their code. With faster root cause analysis and deeper code understanding, businesses can boost developer productivity, improve application performance, and enhance customer satisfaction. These benefits ultimately lead to greater agility and a stronger competitive position in the market.

Enhance your development experience with MongoDB Atlas and AppMap. To learn more about how to fine-tune apps with MongoDB, check out the best practices guide for MongoDB performance and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving partner ecosystem.

July 30, 2025

Artificial Intelligence
