Artificial Intelligence

Introducing voyage-3.5 and voyage-3.5-lite: Improved Quality for a New Retrieval Frontier

May 20, 2025

Streamlining Editorial Operations with Gen AI and MongoDB

Are you overwhelmed by the sheer volume of information and the constant pressure to produce content that truly resonates? Audiences constantly demand engaging and timely topics. As the daily influx of information grows, it’s becoming increasingly tough to identify what’s interesting and relevant. Consequently, teams are spending more time researching trends, verifying sources, and managing tools than actually creating compelling stories.

This is where artificial intelligence enters the media landscape to offer new possibilities. Tapping into AI capabilities calls for a flexible data infrastructure that can streamline content workflows, provide real-time insights, and help teams stay focused on what matters most. In this blog, we will explore how combining gen AI with modern databases, such as MongoDB, can improve editorial operations.

Why are your content ideas running dry?

Creative fatigue significantly impacts content production. Content leads face constant pressure to generate fresh ideas under tight deadlines, leading to creative blocks. In fact, according to a recent report from HubSpot, 16% of content marketers struggle with finding compelling new content ideas. This pressure often compromises work quality, leaving little room for delivering authentic content.

Another main hurdle is identifying credible and trending topics quickly. Finding reliable information takes a lot of time, and that time goes to research and discovery rather than actual creation. This leads to missed opportunities in identifying what’s trending and reduces audience engagement as well.

This presents a clear opportunity for AI, combined with modern databases, to deliver a transformative solution.

Using MongoDB to streamline content operations

MongoDB provides a flexible, unified storage solution through its collections for modern editorial workflows.
The need for a flexible data infrastructure

Developing an AI-driven publishing tool requires a system that can ingest, process, and structure a high volume of diverse content from multiple sources. Traditional databases often struggle with this complexity. Such a system must ingest data from many sources, dynamically categorize content by industry, and perform advanced AI-enabled searches at scale. Combining flexible document-oriented databases with embedding techniques transforms varied content into structured, easily retrievable insights. Figure 1 below illustrates this integrated workflow, from raw data ingestion to semantic retrieval and AI-driven topic suggestions.

Figure 1. High-level architectural diagram of the Content Lab solution, showing the flow from the front-end through microservices, backend services, and MongoDB Atlas to AI-driven topic suggestions.

Raw data into actionable insights

We store a diverse mix of unstructured and semi-structured content in dedicated MongoDB collections such as news, Reddit posts, suggestions, userProfiles, and drafts, organized by topic, vertical (e.g., business, health), and source metadata for efficient retrieval and categorization. These collections are continuously updated from external APIs like NewsAPI and Reddit, alongside AI services (e.g., AWS Bedrock, Anthropic Claude) integrated via backend endpoints. By leveraging embedding models, we transform raw content into organized, meaningful data, stored as vectors under its category (e.g., business, health). MongoDB Atlas Vector Search and the aggregation pipeline enable fast semantic retrieval, allowing users to query abstract ideas or keywords and get back the most relevant, trending topics ranked by a similarity score. Generative AI services then draw upon these results to automate the early stages of content development, suggesting topics and drafting initial articles to substantially reduce creative fatigue.
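To make the retrieval step more concrete, here is a minimal sketch of how such a semantic query could be expressed as an aggregation pipeline. It only builds the pipeline as plain Python dictionaries; the index name, the field names (embedding, vertical, title, url), and the default candidate counts are assumptions for illustration and are not taken from the Content Lab code. Filtering on vertical also assumes that field is declared as a filter field in the vector index.

```python
from typing import Any


def build_topic_search_pipeline(
    query_vector: list[float],
    vertical: str,
    limit: int = 5,
    num_candidates: int = 100,
) -> list[dict[str, Any]]:
    """Build a pipeline that ranks stored content by similarity to a query vector."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be >= limit")
    return [
        {
            "$vectorSearch": {
                "index": "content_vector_index",  # hypothetical index name
                "path": "embedding",              # hypothetical vector field
                "queryVector": query_vector,
                "numCandidates": num_candidates,  # candidates considered before ranking
                "limit": limit,                   # results returned
                "filter": {"vertical": {"$eq": vertical}},
            }
        },
        {
            # Keep only what a suggestion card needs, plus the similarity score.
            "$project": {
                "_id": 0,
                "title": 1,
                "url": 1,
                "vertical": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


if __name__ == "__main__":
    pipeline = build_topic_search_pipeline([0.1, 0.2, 0.3], "business", limit=3)
    print(pipeline)
```

With a driver such as PyMongo, a pipeline like this would typically be passed to the collection's aggregate method, with the query vector produced by the same embedding model used at ingestion time.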
From a blank page to first draft – With gen AI and MongoDB Once a user chooses a topic, they’re taken to a draft page, as depicted in the third step of Figure 2. Users are then guided by a large language model (LLM)-based writing assistant and supported by Tavily’s search agent, which pulls in additional contextual information. MongoDB continues to handle all associated metadata and draft state, ensuring the user’s entire journey stays connected and fast. Figure 2. Customer flow pipeline & behind-the-scenes. We also maintain a dedicated userProfiles collection, linked to both the drafts and chatbot systems. This enables dynamic personalization so, for example, a Gen Z user receives writing suggestions aligned with their tone and preferences. This level of contextual adaptation improves user engagement and supports editorial consistency. User-generated drafts are stored as new entries in a dedicated drafts collection. This facilitates persistent storage, version control, and later reuse which is essential for editorial workflows. MongoDB’s flexible schema lets us evolve the data model as we add new content types or fields without migrating data. Solving the content credibility challenge Robust data management directly addresses the content credibility. When we generate topic suggestions, we capture and store the source URLs within MongoDB, embedding these links directly into the suggestion cards shown in the UI. This allows users to quickly verify each topic’s origin and reliability. Additionally, by integrating Tavily, we retrieve related contextual information along with their URLs, further enriching each suggestion. MongoDB’s efficient handling of complex metadata and relational data ensures that editorial teams can consistently and confidently vet content sources, delivering trustworthy, high-quality drafts. 
By combining Atlas Vector Search, flexible collections, and real-time queries, MongoDB assists greatly in building an end-to-end content system that’s agile, adaptable and intelligent. The next section shows how this translates into a working editorial experience. From raw ideas to ready stories: Our system in action With our current solution, the editorial teams can rapidly transition from scattered ideas to structured, AI-assisted drafts, all within a smart, connected system. The combination of generative AI, semantic search, and flexible data handling enables the workflow to become faster, more spontaneous and less dependent on manual effort. Consequently, the system focuses back on creativity as it becomes convenient to discover relevant topics from verified sources and produce personalised drafts. Adaptability and scalability become the essential factors in developing intelligent systems that can produce great results within the content scope. As editorial demands grow constantly, it necessitates an infrastructure that can ingest diverse data, produce insights, and assist in real-time collaboration. This system illustrates how AI coupled with a flexible, document-oriented backend can assist teams to reduce fatigue, enhance quality and accelerate the production without increasing difficulty. It’s not just about automation; it’s about providing a more focused, efficient, and reliable path from idea to publication. Here are a few next steps to help you explore the tools and techniques behind AI-powered editorial systems: Dive Deeper with Atlas Vector Search : Explore our comprehensive tutorial to understand how Atlas Vector Search empowers semantic search and enables real-time insights from your data. Discover Real-World Applications: Learn more about how MongoDB is transforming media operations by reading the AI-Powered Media article. 
Check out the MongoDB for Media and Entertainment page to learn more about how we meet the dynamic needs of modern media workflows.

August 26, 2025

Artificial Intelligence

New Benchmark Tests Reveal Key Vector Search Performance Factors

Search at scale is challenging. As powerful as vector search is, it can be tough to know how to properly weigh key factors like accuracy, cost, and throughput for larger workloads. We recently released the MongoDB Benchmark for Atlas Vector Search , which outlines crucial performance optimization strategies for vector search, providing a comprehensive guide to achieving optimal results with large-scale datasets. The primary goal of our guide is to significantly reduce friction for your first vector test at scale (>10M vectors) when evaluating performance for Atlas Vector Search. With this new guide, our aim is to provide more context around how to use the benchmark, to explore the dataset (including factors considered), and to summarize and contextualize the results. Let’s take a closer look! A note on benchmarking data Every good presentation includes the requisite safe harbor slide, and the art and science of benchmarking is no different. Embarking on a large-scale vector workload can present significant hurdles stemming from a lack of accurate information and the inherent friction of initial benchmarks. Furthermore, the landscape of vector search and embedding models is rapidly evolving, and information can become outdated quickly, leading users down inefficient or incorrect paths. Without clear, up-to-date guidance, users can struggle to predict system behavior, optimize configurations, and confidently allocate resources. It’s also worth noting that numerous factors (quantization, dimensionality, filtering, search node configuration, concurrency, sharding, and more) interact in complex ways. Understanding these interactions and their specific impact on a particular workload requires deep, accurate insights. Without this, users might optimize one aspect only to inadvertently degrade another. 
This informational vacuum—coupled with the considerable setup overhead, complex parameter tuning, and the cost of experimentation involved in running the first benchmark—creates a substantial barrier to proving out and scaling a solution. Nonetheless, we feel that these benchmarks provide confidence in POCs for our customers and give them a starting point to work with (as opposed to having no compass to start with). With these factors in mind, let's jump into an overview of the dataset. A look at the dataset The core of this performance analysis revolves around tests conducted on subsets of the Amazon Reviews 2023 dataset, which contained 48M item descriptions across 33 product categories. The dataset was chosen due to the ability to provide a realistic, large-scale e-commerce scenario, as well as offering rich data, including user reviews (ratings, text, helpfulness votes), item metadata (price, images), and detailed item names and descriptions, which are ideal to search over. For the variable dimension tests, subsets of 5.5 million items were used, embedded with voyage-3-large to produce 2048-dimensional vectors. Views were then created to slice these into 1024, 512, and 256-dimensional vectors for testing different dimensionalities. For the large-scale, high-dimensional test, a 15.3 million-item subset—also embedded with 2048-dimensional vectors from voyage-3-large —was used. One of the key takeaways from the report is that at the highest dimensionality (15.3M vectors using voyage-3-large embeddings at 2048 dimensions), Atlas Vector Search with scalar or binary quantization configured retains 90–95% accuracy with less than 50ms of query latency. One item of note is that binary quantization can have higher latency when the number of candidates requested is in the hundreds due to the additional cost of rescoring with full-fidelity vectors, but still might be preferable for many large scale workloads due to cost effectiveness. Figure 1. 
Binary versus scalar quantization performance.

Methodology: Benchmarking with the Amazon reviews dataset

Now that we have talked a little bit about the data itself and the information included, let’s outline some of the key factors that impact performance for Atlas Vector Search, and how we configured our benchmark to test them. It's also important to acknowledge why these variables are critical: Not every customer will be optimizing their search for the same thing. With that in mind, we will also attempt to identify the interplay and trade-offs between them. While this list is not exhaustive (see the full report for more details), let’s review some of the key performance factors:

Recall: Recall (a measure of search accuracy) is significantly impacted by quantization and vector dimensionality. The report highlights that while scalar quantization generally starts with higher recall, binary quantization can approach similar accuracy levels by increasing numCandidates, though this often incurs higher latency due to an additional rescoring step. Furthermore, higher-dimensional vectors (1024d and 2048d) consistently maintain better recall, especially with larger datasets and quantization, compared to lower dimensions (256d and 512d), which struggle to exceed 70-80% recall.

Sizing and cost: The table in the benchmark details the resources required (RAM, storage) and associated costs for different search node tiers based on three different test cases involving varying dataset sizes, vector dimensions, and quantization methods (scalar or binary). The guide provides an example with a sample dataset, showing that resource requirements scale linearly and that quantization reduces memory requirements substantially.

Concurrency and throughput: Throughput is evaluated with multiple requests issued concurrently. Scalar quantization generally achieves higher queries per second (QPS) across various limit values due to less work per query and no rescoring.
Concurrency bottlenecks are often observed, indicating that higher latency can occur. Scaling out the number of search nodes or increasing available vCPUs is recommended to resolve these bottlenecks and achieve higher QPS. Figure 2. Node tiers for different test cases. Optimizing your vector search performance This benchmark report thoroughly examines the performance of MongoDB Atlas Vector Search across various configurations and large datasets, specifically the Amazon Reviews 2023 dataset. It explores the impact of factors such as quantization (scalar and binary), vector dimensionality, filtering, search node configurations, binData compression, concurrency, and sharding on recall, latency, and throughput. While there is never a “silver bullet” due to everyone’s definition of search “success” being different, we wanted to highlight some of the various levers to consider, and methods to get the most out of your own deployment. Our goal is to provide some key considerations for how to evaluate and improve your own vector search performance, and help you to properly weigh and contextualize the key factors. Ready to optimize your vector search experience? Explore the guide in our documentation . Run it yourself with our GitHub repo .
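To give a feel for why quantization matters for sizing, the following back-of-the-envelope sketch estimates the size of the vector payload alone for the dataset sizes mentioned above. It assumes 4 bytes per dimension for full-fidelity float32 vectors, 1 byte for scalar (int8) quantization, and 1 bit for binary quantization. It deliberately ignores index structures, document storage, and the full-fidelity vectors kept for rescoring, so treat the numbers as a rough lower bound rather than the figures from the benchmark report.

```python
# Bytes needed per vector dimension under each representation (assumed values).
BYTES_PER_DIMENSION = {
    "none": 4.0,      # float32, full fidelity
    "scalar": 1.0,    # int8
    "binary": 0.125,  # 1 bit
}


def vector_payload_gib(num_vectors: int, dimensions: int, quantization: str = "none") -> float:
    """Estimate the raw vector payload in GiB for a given quantization mode."""
    if quantization not in BYTES_PER_DIMENSION:
        raise ValueError(f"unknown quantization: {quantization!r}")
    total_bytes = num_vectors * dimensions * BYTES_PER_DIMENSION[quantization]
    return total_bytes / 2**30


if __name__ == "__main__":
    # Dataset sizes taken from the benchmark description: 15.3M vectors at 2048 dimensions.
    for mode in ("none", "scalar", "binary"):
        size = vector_payload_gib(15_300_000, 2048, mode)
        print(f"{mode:>6}: {size:8.2f} GiB")
```

Under these assumptions scalar quantization shrinks the in-memory vectors by a factor of 4 and binary quantization by a factor of 32, which is consistent with the cost trade-off discussed in the post.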

August 21, 2025

Artificial Intelligence

Powering Long-Term Memory for Agents With LangGraph and MongoDB

We're excited to introduce the MongoDB Store for LangGraph—a powerful integration that brings flexible and scalable long-term memory to AI agents. This new integration between MongoDB and LangGraph , LangChain’s open-source agent orchestration framework, allows agents to remember and build on previous interactions across multiple sessions instead of only retaining memory for the current session. The result is more intelligent, context-aware agentic systems that learn and improve over time. This new integration complements MongoDB’s existing checkpointer integration, which handles short-term memory and persistent conversation history. Together, the MongoDB Store for LangGraph and MongoDB’s support for checkpointers provide a complete solution for building production-ready, memory-enabled agents. The need for agent memory An AI agent is a system designed to take actions or make decisions based on input, often using tools and reasoning to complete tasks. By default, agents don’t retain memory between conversations, which severely constrains what they can accomplish. Agent memory (and memory management) is a computational exocortex for AI agents. It is a dynamic, systematic process that integrates an agent’s large language model (LLM) memory (context window and parametric weights) with a persistent memory management system to encode, store, retrieve, and synthesize knowledge and experiences. Agent memory is typically divided into two main types: short-term memory and long-term memory. In a memory context, LangGraph uses “threads” to represent individual conversations or sessions. Short-term memory, managed through thread-scoped checkpointers that MongoDB supports, maintains context within a given session. While this preserves conversation continuity and manages history, it doesn’t help agents learn continuously from the past across different conversations to adapt and optimize their behavior over time. 
This is why we introduced the MongoDB Store for LangGraph, enabling your agents to retain memories across conversations through a cross-thread memory store. Figure 1. Short and long-term memory integration between LangGraph and MongoDB. MongoDB Store: Enabling cross-thread long-term memory The new langgraph-store-mongodb package introduces a MongoDBStore class. Available now through PyPI , this production-ready integration provides: Cross-thread persistence: Store and recall information across different conversation sessions and user interactions, allowing agents to build on previous knowledge. Native JSON structure: LangGraph stores long-term memories as JSON documents, which map directly to MongoDB documents. Each memory is organized using namespaces and a key-value structure. This enables the usage of MongoDB’s native and optimized data formats and search capabilities for efficient retrieval. Vector Search capabilities: Leverage MongoDB Atlas Vector Search for semantic memory retrieval based on meaning, not just keyword matching. Asynchronous support: Support for both synchronous and asynchronous operations for high-performance applications. Automatic connection management: Robust connection pooling and error handling to ensure reliability. Optimized TTL indexes: MongoDB’s Time-to-Live (TTL) indexes are integrated with LangGraph’s TTL system, allowing automatic removal of stale or outdated data. This improves retrieval performance, reduces storage costs, and ensures the system "forgets" obsolete memories efficiently. Ready to give your AI agents persistent long-term memory? The langgraph-store-mongodb package is available now: pip install langgraph-store-mongodb The MongoDB Store for LangGraph enables developers to build more powerful agents for different use cases, including: Customer support agents: Build agents that remember customer preferences, past issues, and resolution patterns across multiple support channels. 
Personal assistant applications: Build agents that learn user habits and preferences to provide increasingly personalized experiences. Enterprise knowledge management: Create agents that accumulate organizational knowledge and can retrieve relevant information semantically. Multi-agent systems: Enable agent teams to share learned experiences and coordinate through persistent memory. Why MongoDB for agent memory? Effective agentic memory requires comprehensive mechanisms for storing, retrieving, updating, and deleting memories. MongoDB Atlas provides a unified database that meets all these complex requirements: Flexible document model: Store complex, nested memories as rich JSON, matching how agents naturally read, organize, and update evolving information. Semantic search: Native vector search enables retrieval by meaning, not just exact matches. State-of-the-art models: Voyage AI provides embedding models and rerankers for cutting-edge memory retrieval. Scalable architecture: Distributed architecture, workload isolation, autoscaling, and automatic sharding capabilities for scaling AI agent memory. Enterprise security: Fine-grained role-based access control (RBAC) allows precise management of both access scope (specific services or databases) and access type (read-only or read-write). MongoDB Atlas and LangChain: A complete solution for AI agent memory Short-term memory provides an agent with immediate context, current conversation state, prior exchanges within that session, or shared memory for coordination in multi-agent systems. The most common form of short-term memory is working memory—an active, temporary context accessible during a session. MongoDB's integration with LangGraph checkpointers supports this by persisting and restoring conversation states. 
Other short-term memory implementations include semantic caches, such as using MongoDB's semantic cache integration with LangChain , which stores recent prompts and LLM responses for retrieval when similar queries occur. Shared memory is also used in multi-agent systems to provide a common space for coordination and information sharing. Long-term memory serves as the agent’s knowledge base, storing diverse kinds of information for future use. It includes several functional types, each requiring specific storage and retrieval strategies: Episodic memory: captures specific events and interactions, such as conversation history or summaries of key occurrences with metadata (e.g., timestamps, participants). For instance, a customer support agent can use this to recall a user’s past issues and offer personalized responses. Procedural memory: records instructions or rules for recurring tasks. A typical implementation is a social content generator agent that remembers past feedback on writing style and formatting to improve its process. Semantic memory: remembers general knowledge, facts, and concepts. This is often implemented through retrieval-augmented generation (RAG), where data is stored as vector embeddings and retrieved based on semantic similarity. Associative memory: stores key entities and relationships between different pieces of information, enabling an agent to identify patterns and make inferences by navigating these connections. It's often implemented using graph structures that support efficient exploration of relationships. One practical approach is GraphRAG . The MongoDB Store for LangGraph supports these memory types through flexible filtering and semantic search, making it a versatile approach for building reliable long-term memory in agents. LangChain also provides LangMem, a toolkit featuring pre-built tools designed specifically for extracting and managing procedural, episodic, and semantic memories. 
LangMem integrates natively with LangGraph, streamlining the memory engineering process. For developers seeking a straightforward approach to using various memory types with MongoDB, explore this comprehensive tutorial for implementing MongoDB alongside LangGraph and LangMem . The future of intelligent agents With the new MongoDB Store for LangGraph, we're enabling developers to build AI agents that can learn and adapt. Agents that remember user preferences, learn from mistakes, and build knowledge over time will transform how we interact with AI systems. The combination of LangGraph's sophisticated orchestration capabilities with MongoDB's flexible, scalable storage creates unprecedented opportunities for building intelligent, persistent AI agents that feel truly alive and responsive. Ready to build memory-enabled agents with LangGraph and MongoDB Atlas? Get started with the documentation .
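To illustrate the namespace and key-value organization of long-term memories described above, along with TTL-based forgetting, here is a small in-memory sketch. It is not the langgraph-store-mongodb API: the ToyMemoryStore class, its method names, and the document fields are hypothetical stand-ins chosen only to show the shape of the data. In the real integration the documents live in a MongoDB collection and expiry is handled by TTL indexes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


@dataclass
class ToyMemoryStore:
    """Hypothetical in-memory stand-in for a cross-thread memory store."""

    ttl: Optional[timedelta] = None
    _docs: dict = field(default_factory=dict, init=False, repr=False)

    def _expired(self, doc: dict, now: datetime) -> bool:
        return self.ttl is not None and now - doc["updated_at"] > self.ttl

    def put(self, namespace: tuple, key: str, value: dict, now: Optional[datetime] = None) -> None:
        """Store one memory as a JSON-like document under (namespace, key)."""
        now = now or datetime.now(timezone.utc)
        namespace = tuple(namespace)
        self._docs[(namespace, key)] = {
            "namespace": list(namespace),
            "key": key,
            "value": value,
            "updated_at": now,
        }

    def get(self, namespace: tuple, key: str, now: Optional[datetime] = None) -> Optional[Any]:
        """Return the stored value, or None if it is missing or past its TTL."""
        now = now or datetime.now(timezone.utc)
        doc = self._docs.get((tuple(namespace), key))
        if doc is None or self._expired(doc, now):
            self._docs.pop((tuple(namespace), key), None)
            return None
        return doc["value"]

    def search(self, namespace_prefix: tuple, now: Optional[datetime] = None) -> list:
        """Return all live documents whose namespace starts with the prefix."""
        now = now or datetime.now(timezone.utc)
        prefix = tuple(namespace_prefix)
        return [
            doc
            for (namespace, _key), doc in self._docs.items()
            if namespace[: len(prefix)] == prefix and not self._expired(doc, now)
        ]


if __name__ == "__main__":
    store = ToyMemoryStore(ttl=timedelta(days=30))
    store.put(("users", "u1", "preferences"), "tone", {"style": "casual"})
    print(store.get(("users", "u1", "preferences"), "tone"))
```

Because the namespace is independent of any single conversation thread, a memory written during one session can be read back in a later one, which is the cross-thread behavior the store is meant to provide.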

August 20, 2025

Artificial Intelligence

Building an Agentic AI Fleet Management Solution

Artificial intelligence is revolutionizing the manufacturing and motion industry, with AI-powered solutions now capable of delivering precise, real-time insights that can optimize everything from route planning to predictive maintenance. Modern vehicles can generate an overwhelming amount of data—nearly 25 GB per hour, through a diverse range of sensors, according to an article from S&P Global Mobility. Contextualizing this data with user feedback, maintenance records, and technical knowledge becomes increasingly challenging as the system scales. These complexities can create inefficiencies, introduce overhead while processing data, and drive up operational costs, hindering the full potential of AI-driven systems. An efficient fleet management architecture can address these problems by reducing redundancies, optimizing data retrieval processes, and enabling the seamless integration and use of embeddings. MongoDB’s flexible document model fits perfectly to this approach. Unlike legacy SQL databases, MongoDB excels at managing unstructured, semi-structured, and structured data. This capability allows fleet management software to ingest and process diverse data types, including vehicle signal data, geospatial zones, fleet configurations, query logs, route telemetry, maintenance records, and real-time performance scores. In this post, we will use various MongoDB Atlas features—such as geospatial query operations, time-series collections, Atlas Charts, and aggregation pipelines—to create an agentic AI-powered fleet management system. This system demonstrates how an AI agent can enable intelligent data processing, providing real-time, context-aware responses to user queries in a streamlined manner. 
Fleet management software with AI overview

A traditional fleet management system provides features like resource planning, route optimization, and maintenance scheduling, which work together to improve cost management, regulatory compliance, and overall equipment effectiveness (OEE). Our solution harnesses the power of MongoDB's flexible document schema, time-series collections, and geospatial query support to give fleet managers the ability to query, filter, and operate on data effectively. Additionally, an AI agent assists users in obtaining actionable insights through a chat-driven interface.

Figure 1. Architecture of the solution. The AI agent has a chatbot UI. The data captured by the agent is used to trigger an orchestration service, which then calls various tools as required and gets data from MongoDB in order to complete its task.

In Figure 1, the telemetry data from our cars is stored in MongoDB in time series collections via microservices. In addition to the telemetry data, we store static car information (e.g., brand, model, year, VIN, among others) and user configurations, such as past queries and fleet settings. All of this data is leveraged by the agentic system to answer user queries and to provide deeper insights when similar queries come up in the future. Figure 2 shows the user interface of the agentic system, where queries can be submitted directly. Filters allow users to narrow results by fleet, time range, or geozone, while the AI agent delivers answers using real-time and historical data.

Figure 2. Demo chat section.

When a user inputs a question into the chat box, the AI agent analyzes it by embedding the query and searching for similar prior questions in the historical recommendations collection. Depending on the tools required, the system accesses contextual data across collections, such as time-series metrics, geospatial locations, or maintenance logs, through aggregation pipelines.
Once the relevant data is assembled, the AI synthesizes the information into actionable insights, providing the user with an accurate and informative response.

MongoDB features for a fleet management system

RAG framework with MongoDB Vector Search

Agents powered by retrieval-augmented generation (RAG) are transforming fleet management systems by integrating real-time contextual information during response generation. MongoDB’s flexible document model complements RAG by serving document data with low latency. Combined with Voyage AI’s cost-efficient embedding model, MongoDB accelerates vector search workflows for smarter decision-making.

MongoDB’s Atlas Vector Search empowers the agent to operate proactively by connecting user queries with relevant insights stored in the database. For instance, when a fleet manager asks about the current positions of vehicles, the agent leverages MongoDB’s vector search to match the query against historical recommendations. If similar queries already exist, the agent retrieves pre-existing results instantly, reducing both latency and operational costs. When no matching results are found, the agent complements vector search by invoking LLMs to dynamically generate answers, ensuring fleet managers receive accurate and actionable responses.

This streamlined workflow, powered by MongoDB’s combination of vector search and flexible data modeling, allows fleet managers to act on real-time, context-aware insights. From analyzing geospatial patterns to addressing systemic vehicle issues, MongoDB enables the agent to simplify complex decision-making while maintaining efficiency. By combining predictive AI capabilities with an optimized, scalable database, this solution transforms fleet management into a more proactive, data-driven process.
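The reuse-or-generate flow described above can be sketched in a few lines. This is an illustration only: the similarity search is done here with a plain in-memory loop, whereas the demo relies on Atlas Vector Search, and the field names (embedding, recommendation), the 0.9 threshold, and the generate callback are all assumptions rather than details from the Leafy Fleet code.

```python
import math
from typing import Callable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def answer_query(
    query_embedding: Sequence[float],
    history: list,
    generate: Callable[[], str],
    threshold: float = 0.9,  # assumed cut-off for "similar enough"
) -> tuple:
    """Return (answer, reused), reusing a stored recommendation when one is close enough."""
    best_doc, best_score = None, -1.0
    for doc in history:
        score = cosine_similarity(query_embedding, doc["embedding"])
        if score > best_score:
            best_doc, best_score = doc, score
    if best_doc is not None and best_score >= threshold:
        return best_doc["recommendation"], True
    # No close match: fall back to generating a fresh answer (e.g., via an LLM).
    return generate(), False


if __name__ == "__main__":
    past = [{"embedding": [1.0, 0.0], "recommendation": "Vehicles 12 and 40 are in zone A."}]
    print(answer_query([0.99, 0.05], past, generate=lambda: "fresh answer"))
```

The threshold is the main design choice: set too low, the agent returns stale or unrelated answers; set too high, it rarely benefits from the stored history.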
Polymorphism

MongoDB’s document model allows storing polymorphic data structures within the same collection, meaning documents can vary in structure and embed other documents. This flexibility enables our demo to optimize workflows by storing application-specific metadata tailored to fleet operations. For instance, the historical_recommendations collection stores query and recommendation histories generated by the system’s AI engine, with the ability to embed metadata dynamically, such as the initial question asked, the tool chosen, and the results it returned. This streamlines read operations and gives our AI agent more context for future queries. For example, a document in this collection might appear as follows:

Figure 3. Document model of historical_recommendations.

This variability in structure, without sacrificing efficiency, lets MongoDB adapt to the dynamic data storage requirements inherent in polymorphic workflows. Because each document embeds only the context it needs, the collection avoids null-valued placeholder fields.

Time series collections

MongoDB's time series collections simplify working with time series data. These specialized collections provide several benefits, including automatic creation of compound indexes for faster data retrieval, reduced disk usage, and lower I/O overhead for read operations. This makes time series collections highly efficient for managing time-stamped data, such as a constant stream of sensor data from vehicles in our application. With these capabilities, fleet managers get near real-time access to data, and AI agents can rapidly extract actionable insights for fleet management. In this demo, MongoDB optimizes query efficiency in our time series collections using its bucketing mechanism.
This mechanism groups multiple data points within the same time range into compressed blocks, reducing the number of documents scanned during queries. As a result, bucketing minimizes read operations and disk usage, enabling faster range queries and ensuring sustained, optimized cluster performance, even under a humongous load.

Geospatial queries

MongoDB’s native support for geospatial queries enables seamless integration of robust location-based functionalities. The ability to handle complex geographic data is a powerful tool for industries relying on real-time location-based decision-making. In our demo, this capability is used to locate vehicles under various conditions, such as identifying vehicles near or inside a specified geofence, with the option to filter by maximum or minimum distance. Geospatial queries can also be incorporated directly into aggregation pipelines, enhancing the workflows driven by our AI agent.

Key takeaways

MongoDB enables fleet managers to efficiently gather, process, and analyze data to uncover actionable insights. These capabilities empower managers to optimize operations, enhance vehicle oversight, and implement smarter, data-driven strategies that drive efficiency and performance. Visit MongoDB Atlas to start modernizing your fleet management system.

Ready to transform your fleet management operations? Unlock real-time insights, optimize systems, and make smarter decisions with MongoDB’s advanced features.

If you're interested in exploring how MongoDB enables intelligent fleet management, check out our Leafy Fleet GitHub repository. Access the Leafy Fleet on GitHub. Additionally, dive deeper into best practices for modeling connected vehicle signal data and learn how MongoDB’s flexible data model simplifies telemetry management at scale. Read the blog post.
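As a rough sketch of the geospatial queries described in this post, the helpers below build a $geoNear pipeline (vehicles within a minimum and maximum distance of a point) and a $geoWithin filter (vehicles inside a geofence polygon) as plain Python dictionaries. The field names location and fleet_id, the distance_m output field, and the idea of a collection holding the latest position per vehicle are assumptions for illustration. The $geoNear stage also assumes a 2dsphere index on the location field.

```python
from typing import Any, Optional


def build_nearby_vehicles_pipeline(
    longitude: float,
    latitude: float,
    max_distance_m: float,
    min_distance_m: float = 0.0,
    fleet_id: Optional[str] = None,
) -> list:
    """Pipeline returning vehicles between min and max distance (meters) from a point."""
    if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
        raise ValueError("coordinates out of range")
    geo_near: dict = {
        "near": {"type": "Point", "coordinates": [longitude, latitude]},  # GeoJSON is [lng, lat]
        "distanceField": "distance_m",  # hypothetical output field name
        "maxDistance": max_distance_m,
        "minDistance": min_distance_m,
        "spherical": True,
    }
    if fleet_id is not None:
        geo_near["query"] = {"fleet_id": fleet_id}  # hypothetical field name
    # $geoNear must be the first stage of the pipeline.
    return [{"$geoNear": geo_near}, {"$limit": 50}]


def build_geofence_filter(ring: list) -> dict:
    """Filter matching documents whose location lies inside a closed polygon ring."""
    if len(ring) < 4 or ring[0] != ring[-1]:
        raise ValueError("a polygon ring needs at least 4 points and must be closed")
    return {
        "location": {
            "$geoWithin": {"$geometry": {"type": "Polygon", "coordinates": [ring]}}
        }
    }


if __name__ == "__main__":
    print(build_nearby_vehicles_pipeline(-73.98, 40.75, max_distance_m=5000))
    print(build_geofence_filter([[0, 0], [0, 1], [1, 1], [0, 0]]))
```

Because both helpers return ordinary pipeline stages and filters, an agent tool could combine them with other stages, for example a $match on a time range, before running the aggregation.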

August 19, 2025

Artificial Intelligence

Unlock Multi-Agent AI Predictive Maintenance with MongoDB

The manufacturing sector is navigating a growing number of challenges: evolving customer demands, intricate software-mechanical product integrations, just-in-time global supply chains, and a shrinking skilled labor force. Meanwhile, the entire sector is working under intense pressure to improve productivity, manage energy consumption, and keep costs in check. To stay competitive, the industry is undergoing a digital transformation—and data is at the center of that shift. Data-driven manufacturing offers a powerful answer to many of these challenges. On the shop floor, one of the most critical and high-impact applications of these strategies is predictive maintenance. Downtime isn’t just inconvenient—it’s expensive. For example, every unproductive hour in the automotive sector now costs $2.3 million (according to Siemens "The True Cost of Downtime 2024" report). For manufacturers across all sectors, predictive maintenance is no longer optional. It’s a foundational pillar of operational excellence. At its core, predictive maintenance is about using data to anticipate machine failures before they happen. It began with traditional statistical models, evolved with machine learning, and is now entering a new era. As equipment ages and failure behaviors shift, models must adapt. This has led to the adoption of more advanced approaches, including generative AI with retrieval-augmented generation (RAG) capabilities. But the next frontier is multi-agent systems—AI-powered agents working together to monitor, reason, and act. We’ve explored how generative AI powers predictive maintenance in previous posts. In this blog post, we’ll go deeper into multi-agent systems and how MongoDB makes it easy to build and scale them for smart, responsive maintenance strategies. Advance your data-driven manufacturing strategy with Agentic AI AI agents combine large language models (LLMs) with tools, memory, and logic to autonomously handle complex tasks. 
On the shop floor, this means agents can automate inspections, reoptimize production schedules, assist with fault diagnostics, and more. According to a LangChain survey, 78% of companies are actively developing AI agents, and over half already have at least one agent in production. Manufacturing companies can especially benefit from agentic capabilities across a great variety of practical use cases, as shown in Figure 1. Figure 1. Agent capabilities and related practical use cases in manufacturing. But leveraging AI agents in industrial environments presents unique challenges. Integration with industrial protocols like Modbus or PROFINET is complex. Governance and security requirements are strict, especially when agents interact with production equipment. Latency is also a concern as AI models need fast, reliable data access to support real-time responses. And with agents generating and consuming large volumes of data, companies need a data foundation that is reliable and can scale without sacrificing performance. Many of these challenges are not new to manufacturers, and MongoDB has a proven track record of addressing them. Industry leaders in manufacturing and automotive trust MongoDB to power critical IoT and telemetry use cases. Bosch, for example, uses MongoDB to store, manage, and analyze huge amounts of data to power its Bosch IoT Insights solution. MongoDB’s flexible document model is ideal for diverse sensor inputs and machine telemetry, while allowing systems to iterate and evolve quickly. It’s important to remember that, at its core, MongoDB was built for change, so when it comes to integrating AI on the shop floor, it’s no surprise that MongoDB is emerging as the ideal data layer foundation. Companies like Novo Nordisk and Cisco rely on MongoDB to build and scale their AI capabilities, and leading platforms like XMPro APEX AI leverage MongoDB Atlas to create and manage advanced AI agents for industrial applications. 
MongoDB Atlas makes it easy to build AI Agents and operate them at scale. As both a vector and a document database, Atlas supports various search methods for agentic RAG, while also enabling agents to store short and long-term memory in the same database. The result is a unified data layer that bridges industrial IoT and agentic AI. Predictive maintenance is a perfect example of how these capabilities come together to drive real impact on the shop floor. In the next section, we’ll walk through a practical blueprint for building a multi-agent predictive maintenance system using MongoDB Atlas. Building a multi-agent predictive maintenance system This solution demonstrates how to build a multi-agent predictive maintenance system using MongoDB Atlas, LangGraph, and Amazon Bedrock. This system can streamline complex processes, such as detecting equipment anomalies, diagnosing root causes, generating work orders, and scheduling maintenance. At a high level, this solution leverages MongoDB Atlas as the unified data layer. LangGraph provides the orchestration layer, enabling graph-based coordination among agents, while Amazon Bedrock powers the underlying foundational models used by the agents to reason and make decisions. The architecture follows a supervisor-agent pattern. The supervisor coordinates tasks and delegates to three specialized agents: Failure agent , which performs root cause analysis and generates incident reports. Work order agent , which drafts maintenance work orders with detailed requirements. Planning agent , which identifies the optimal time slot for the maintenance task based on availability and production constraints. Figure 2. High-level architecture of a multi-agent predictive maintenance system. This modular design enables the system to scale easily and adapt to different operational needs. Let’s walk through the full process in four key steps. 
Step 1: Failure prediction kicks off the agentic workflow The process begins with an alert—something unusual in the machine data or logs that could point to a potential failure. MongoDB provides a unified view of operational data, real-time processing capabilities, and seamless compatibility with machine learning tools. Sensor data is processed in real-time using Atlas Stream Processing integrated with ML inference models. Features like native support for Time Series data and Online Archive facilitate managing telemetry data at scale efficiently. All while the downstream applications remain up to date with the latest notifications and dashboards by using Atlas Triggers , Change Streams , and Atlas Charts . From there, the supervisor agent takes over and coordinates the next steps. Figure 3. End-to-end failure prediction process that generates the alerts. Step 2: Leverage your data for root cause analysis The supervisor notifies the Failure Agent about the alert. Manual diagnostics of a machine can take hours—sifting through manuals, historical logs, and environmental data. The AI agent automates this process. It collects relevant documents, retrieves contextual insights using Atlas vector search, and analyzes environmental conditions stored in the database—like temperature or humidity at the time of failure. With this data, the agent performs a root cause analysis and proposes corrective actions. It generates a concise incident report and shares it with the supervisor agent, which then moves the workflow forward. Figure 4. Failure Agent performing root cause analysis. Step 3: Work order process automation The Work Order Agent receives the incident report and drafts a comprehensive maintenance work order. It pulls from previous similar tasks to estimate time requirements, identify the necessary materials, and ensure the right skill sets are listed. All of this is pre-filled into a standardized work order template and saved back into MongoDB Atlas. 
This step also includes a human-in-the-loop checkpoint. Technicians or supervisors can review and modify the draft before it is finalized. Figure 5. Work Order Agent is generating a draft work order and routing it for human validation. Step 4: Finding the optimal maintenance schedule Once the work order is approved, the Planning Agent steps in. Its task is to schedule the maintenance activity without disrupting production. The agent queries the production calendar, checks staff shift schedules, and verifies inventory availability for required materials. It considers alert severity and rescheduling constraints to find the most efficient time slot. Once the optimal window is identified, the agent sends the updated plan to the scheduling system. Figure 6. Planning Agent is evaluating constraints to identify the optimal maintenance schedule. While we focused on a predictive maintenance workflow, this architecture can be easily extended. Need agents for compliance reporting, spare parts procurement, or shift planning? No problem. With the right foundation, the possibilities are endless. Unlocking manufacturing excellence with Agentic AI Agentic AI represents a new chapter in the evolution of predictive maintenance, enabling manufacturers to move from reactive responses to intelligent, autonomous decision-making. By combining AI agents with real-time telemetry and a unified data foundation, teams can reduce downtime, cut maintenance costs, and boost equipment reliability. But to work at scale, these systems need flexible, high-performance infrastructure. With native support for time series data, vector search, stream processing, and more, MongoDB makes it easier to build, operate, and evolve multi-agent solutions in complex industrial environments. The result is smarter operations, greater resilience, and a clear path to manufacturing excellence. Clone the GitHub repository if you are interested in trying out this solution yourself. 
To learn more about MongoDB’s role in the manufacturing industry, please visit our manufacturing and automotive webpage .
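To make the supervisor-agent pattern from the four steps above more concrete, here is a deliberately simplified sketch in plain Python. It does not use LangGraph, Amazon Bedrock, or MongoDB Atlas, and it is not the code from the repository: every class and function name is hypothetical, and each agent is a stub standing in for retrieval and LLM reasoning. It only shows the control flow: failure analysis, work order drafting, a human-in-the-loop approval, then scheduling.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Alert:
    machine_id: str
    signal: str
    severity: int  # assumed scale: 1 (low) to 3 (high)


@dataclass
class IncidentReport:
    machine_id: str
    root_cause: str
    severity: int


@dataclass
class WorkOrder:
    machine_id: str
    task: str
    estimated_hours: float
    approved: bool = False


def failure_agent(alert: Alert) -> IncidentReport:
    # Stub for root cause analysis (document retrieval plus model reasoning).
    return IncidentReport(alert.machine_id, f"suspected cause of {alert.signal}", alert.severity)


def work_order_agent(report: IncidentReport) -> WorkOrder:
    # Stub for drafting a work order from the incident report.
    return WorkOrder(report.machine_id, f"Inspect and repair: {report.root_cause}", 2.0)


def planning_agent(order: WorkOrder, free_slots: List[str]) -> Optional[str]:
    # Stub scheduler: pick the earliest free slot (ISO 8601 strings sort chronologically).
    return min(free_slots) if free_slots else None


def supervisor(alert: Alert, free_slots: List[str],
               approve: Callable[[WorkOrder], bool]) -> Optional[str]:
    """Coordinate the three agents; return the chosen slot, or None if not scheduled."""
    report = failure_agent(alert)
    order = work_order_agent(report)
    order.approved = approve(order)  # human-in-the-loop checkpoint
    if not order.approved:
        return None
    return planning_agent(order, free_slots)


slot = supervisor(
    Alert("press-07", "abnormal vibration", 2),
    ["2025-08-20T14:00", "2025-08-20T06:00"],
    approve=lambda order: True,
)
```

In a real system each stub would read from and write to the shared data layer, and the orchestration framework would carry state between agents instead of passing return values directly.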

August 18, 2025

Artificial Intelligence

How Tavily Uses MongoDB to Enhance Agentic Workflows

As AI agents grow in popularity and are used in increasingly mission-critical ways, preventing hallucinations and giving agents up-to-date context is more important than ever. Context can come from many sources—prompts, documents, proprietary internal databases, and the internet itself. Among these sources, the internet stands out as uniquely valuable, a best-in-class resource for humans and LLMs alike due to its massive scale and constant updates. But how can large language models (LLMs) access the latest and greatest information from the internet? Enter Tavily , one of the companies at the heart of this effort. Tavily provides an easy way to connect the web to LLMs, giving them the answers and context they need to be even more useful. MongoDB had the opportunity to sit down with Rotem Weiss, CEO of Tavily, and Eyal Ben Barouch, Tavily’s Head of Data and AI, to talk about the company’s history, how Tavily uses MongoDB, and the future of agentic workflows. Tavily’s origins Tavily began in 2023 with a simple but powerful idea. "We started with an open source project called GPT Researcher ," Weiss said. "It did something pretty simple—go to the web, do some research, get content, and write a report." That simplicity struck a chord. The project exploded, getting over 20,000 GitHub stars in under two years, signaling to the team that they had tapped into something developers desperately needed. The viral success revealed a fundamental gap in how AI systems access information. "So many use cases today require real-time search, whether it's from the web or from your users," Weiss noted. "And that is basically RAG (retrieval-augmented generation) ." "Developers are slowly realizing not everything is semantic, and that vector search alone cannot be the only solution for RAG," Weiss said. Indeed, for certain use cases, vector stores benefit from further context. 
This insight, buttressed by breakthrough research around CRAG (Corrective RAG) , pointed toward a future where systems automatically turn to the web to search when they lack sufficient information. Solving the real-time knowledge problem Consider the gap between static training data and our dynamic reality. Questions like "What is the weather today?" or "What was the score of the game last night?" require an injection of real-time information to accurately answer. Tavily's system fills this gap by providing AI agents with fresh, accurate data from the web, exactly when they need it. The challenge Tavily addresses goes beyond information retrieval. “Even if your model ‘knows’ the answer, it still needs to be sent in the right direction with grounded results—using Tavily makes your answers more robust,” Weiss explained. The new internet graph Weiss envisions a fundamental shift in how we think about the architecture of the web. "If you think about the new internet, it’s a fundamentally different thing. The internet used to be between people—you would send emails, you would search websites, etc. Now we have new players, the AI agents, who act as new nodes on the internet graph." These new nodes change everything. As they improve, AI agents can perform many of the same actions as humans, but with different needs and expectations. "Agents want different things than people want," Weiss explained. "They want answers; they don't need fancy UIs and a regular browser experience. They need a quick, scalable system to give them answers in real time. That's what Tavily gives you." The company's focus remains deliberately narrow and deep. "We always want to stick to the infrastructure layer compared to our competitors, since you don't know where the industry is going," Weiss said. "If we focus on optimizing the latency, the accuracy, the scalability, that's what is going to win, and that's what we're focused on." Figure 1. 
The road to insightful responses for users with TavilyHybridClient. MongoDB: The foundation for speed and scale To build their infrastructure, Tavily needed a database that could meet their ambitious performance requirements. For Weiss, the choice was both practical and personal. "MongoDB is the first database I ever used as a professional in my previous company," he said. "That's how I started, and I fell in love with MongoDB. It's amazing how flexible it is–it's so easy to implement everything." The document model, the foundation upon which MongoDB is built, allowed Tavily to build and scale an enterprise-grade solution quickly. But familiarity alone didn't drive the decision. MongoDB Atlas had the performance characteristics Tavily required. "Latency is one of the things that we always optimize for, and MongoDB delivers excellent price performance," Tavily’s Ben Barouch explained. "The performance is much more similar to a hot cache than a cold cache. It's almost like it's in memory!" The managed service aspect proved equally crucial. "MongoDB Atlas also saves a lot of engineering time," Weiss noted. In a fast-moving startup environment, MongoDB Atlas enabled Weiss to focus on building Tavily and not worry about the underlying data infrastructure. "Today, companies need to move extremely fast, and at very lean startups, you need to only focus on what you are building. MongoDB allows Tavily to focus on what matters most, our customers and our business." Three pillars of success The Tavily team highlighted three specific MongoDB Atlas characteristics that have become essential to their operations: Vector search : Perhaps most importantly for the AI era, MongoDB's vector search capabilities allow it to be "the memory for agents." As Weiss put it, "The only place where a company can have an edge is their proprietary data. Every company can access the best models, every company can search the web, every company can have good agent orchestration. 
The only differentiation is utilizing your internal, proprietary data and injecting it in the fastest and most efficient way to the prompt." MongoDB, first with Atlas Vector Search and now with Hybrid Search , has effective ways of giving agents performant context, setting them apart from those built with other technologies. Autoscaling : "Our system is built for a very fast-moving company, and we need to scale in a second," Weiss continued. "We don't need to waste time each week making changes that are done automatically by MongoDB Atlas." Monitoring : "We have other systems where we need to do our own monitoring with other cloud providers, and it's a lot of work that MongoDB Atlas takes care of for us," Weiss explained. "MongoDB has great visibility." Betting on proven innovation Tavily has been impressed with the way MongoDB has kept a finger on the pulse of the evolving AI landscape and added features accordingly. “I believed that MongoDB would be up to date quickly, and I was right," Weiss said. "MongoDB quickly thought about vector search, about other features that I needed, and got them in the product. Not having to bolt-on a separate vector database and having those capabilities natively in Atlas is a game changer for us." Ben Barouch emphasized the strategic value of MongoDB’s entire ecosystem, including the community built around the database: "When everyone's offering the same solutions, they become the baseline, and then the things that MongoDB excels at, things like reliability and scalability, are really amplified. The community, especially, is great; MongoDB has excellent developer relations, so learning and using MongoDB is very easy." The partnership between MongoDB and Tavily extends beyond technology to trust. "In this crazy market, where you have new tools every two hours and things are constantly changing, you want to make sure that you're choosing companies you trust to handle things correctly and fast," Weiss said. 
"I want a vendor where if I have feedback, I'm not afraid to say it, and they will listen." Looking ahead: The multi-agent future As Tavily continues building the infrastructure for AI agents to search the web, Weiss sees the next evolution already taking shape. "The future is going to be thinking about combining these one, two, three, four agents into a workflow that makes sense for specific use cases and specific companies. That will be the new developer experience." This vision of orchestrated AI workflows represents just the beginning. With MongoDB Atlas providing the scalable, reliable foundation they need, Tavily is positioning itself at the center of a fundamental shift in how information flows through our digital world. The internet welcomed people first, then connected them in revolutionary ways. Now, as AI agents join the network, companies like Tavily are building the infrastructure to ensure this next chapter of digital evolution is both powerful and accessible. With MongoDB as their foundation, they're not just adapting to the future—they're building it. Interested in building with MongoDB Atlas yourself? Try it today ! Use Tavily for working memory in this MongoDB tutorial . Explore Tavily’s Crawl to RAG example.

August 5, 2025

Artificial Intelligence

Automotive Document Intelligence with MongoDB Atlas Search

Picture two scenarios happening simultaneously across the automotive industry: In a service bay, a technician searches frantically through multiple systems for the correct procedure to address an unfamiliar warning code. They need safety warnings, torque specifications, and part numbers—immediately. Instead, they’re lost in hundreds of PDF pages, risking safety violations and extending repair times. Meanwhile, a customer sits at home, trying to understand a dashboard warning light. They search their owner’s manual PDF, scroll through forums, and eventually call the dealership—waiting on hold just to ask a simple question about whether they can drive safely to their appointment. Both scenarios represent massive inefficiencies in how automotive documentation is stored, accessed, and delivered. With technician shortages costing shops over $60,000 monthly per unfilled position , and 67% of customers preferring self-service options , the industry faces a critical gap between information availability and accessibility. We prototyped a solution that shows how you can transform static automotive manuals into intelligent, searchable knowledge bases using MongoDB Atlas . By combining flexible document storage with semantic search capabilities, you can create platforms that serve both technicians seeking repair procedures and customers looking for quick answers. Building intelligent documentation systems Automotive technical documentation presents unique challenges. Most existing systems have fixed, unchangeable data formats designed primarily for compliance rather than usability. These systems often vary across locations, lack integration with user profiles, and don’t support rapid data access. Organizations need to build custom ingestion pipelines that can process diverse documentation formats and create intelligent, searchable content. Success requires linking each interaction to user identity and storing information that supports immediate, personalized engagement. 
MongoDB’s flexible document model enables developers to create highly enriched documentation chunks that go far beyond simple text storage. Each document can contain the original content alongside extensive metadata, including source references, safety classifications, procedural hierarchies, user permissions, version control, and contextual relationships. As your organizational needs evolve, you can add new fields and metadata structures without schema migrations or downtime, enabling documentation systems to adapt to changing business needs. An alternative—or complementary—approach is using contextualized chunk embedding models like voyage-context-3 . Instead of relying on manual metadata or context augmentation, this model generates vector embeddings that inherently capture full-document context for each chunk. It leads to higher retrieval accuracy, reduces sensitivity to chunking strategy, and simplifies the pipeline with no downstream changes. Whether you choose a metadata-rich approach, an embedding-first strategy, or both, MongoDB supports it all. Figure 1. Document processing pipeline. This flexibility proves essential when organizations have multiple documentation sources in different formats. Custom processing pipelines can normalize content from various systems while preserving the unique metadata and relationships that make each source valuable. MongoDB’s document structure naturally accommodates this complexity, storing structured technical specifications alongside unstructured procedural text and user interaction history—all queryable through a single interface. Using a unified search that understands context MongoDB Atlas provides three complementary search capabilities that work together to deliver intelligent responses: MongoDB Atlas Search handles precise queries like part numbers and error codes. 
Technicians searching for a specific part number instantly find relevant diagnostic procedures, while customers typing “coolant warning light” get clear explanations. MongoDB Atlas Vector Search understands intent and context. A customer asking “Why is my engine making a clicking noise?” finds relevant content even without using technical terminology. This approach enables semantic understanding of automotive diagnostic information, enabling queries to match meaning rather than exact keywords. Hybrid search with $rankFusion combines both approaches, ensuring users find information whether they use technical terms or natural language: { $rankFusion: { input: { pipelines: { textSearch: { $search: ... }, vectorSearch: { $vectorSearch: ... } } }, combination: { weights: { textSearch: 1, vectorSearch: 1 } } } } Setting up scalable architecture for dual-purpose knowledge delivery The same MongoDB knowledge base serves both technicians and customers through tailored interfaces. Technicians access detailed procedures with safety warnings, technical specifications, and shop management system integration, while customers receive plain-language explanations, severity assessments, and service scheduling integration. Figure 2. MongoDB Atlas servicing both the technician interface and the customer portal. Custom-built processing pipelines can transform thousands of manual pages across multiple languages. MongoDB Atlas deployments can handle billions of documents while maintaining subsecond query performance. MongoDB Atlas Search and MongoDB Atlas Vector Search work together across this rich metadata, ensuring that whether users search for an error code or “Why won’t my car start?,” the system uses all available context to return relevant results quickly. 
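The `$rankFusion` snippet above elides the two sub-pipelines. The sketch below shows one way they might be filled in, written as a Python function that only builds the aggregation pipeline. The index names, the `content` and `embedding` field names, the function name, and the limits are all assumptions chosen for illustration. To the best of our understanding, each named entry under `pipelines` is a list of stages.

```python
from typing import Any, Dict, List


def hybrid_search_pipeline(query_text: str, query_vector: List[float],
                           limit: int = 5) -> List[Dict[str, Any]]:
    """Build a hybrid search pipeline that fuses text and vector rankings."""
    # Full-text branch: exact terms such as part numbers or error codes.
    text_pipeline = [
        {"$search": {
            "index": "manual_text_index",  # hypothetical Atlas Search index
            "text": {"query": query_text, "path": "content"},
        }},
        {"$limit": 20},
    ]
    # Semantic branch: nearest neighbors of the query embedding.
    vector_pipeline = [
        {"$vectorSearch": {
            "index": "manual_vector_index",  # hypothetical Vector Search index
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": 100,
            "limit": 20,
        }},
    ]
    return [
        {"$rankFusion": {
            "input": {"pipelines": {
                "textSearch": text_pipeline,
                "vectorSearch": vector_pipeline,
            }},
            # Equal weights, as in the snippet above.
            "combination": {"weights": {"textSearch": 1, "vectorSearch": 1}},
        }},
        {"$limit": limit},
    ]


pipeline = hybrid_search_pipeline("coolant warning light", [0.1, 0.2, 0.3])
```

The query vector here is a three-element placeholder. In practice it would come from the same embedding model used to index the documentation chunks, and raising one weight would favor that branch in the fused ranking.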
Having a real-world impact When organizations replace static manuals with an AI-ready documentation platform, the upside reveals itself almost immediately: Customers find answers faster and adopt apps more readily, technicians spend less time hunting for information and more time generating revenue, and compliance teams rest easier knowing that critical warnings and audit trails live right inside every workflow. Iron Mountain’s new InSight Digital Experience Platform (DXP) , built on MongoDB Atlas and MongoDB Atlas Vector Search, is a great example of these benefits in action. By turning mountains of unstructured physical and digital content into searchable, structured data, Iron Mountain gives its customers powerful semantic search, context-aware recommendations, and AI-driven workflow automation—all while meeting strict regulatory requirements. Whether a user is looking for the latest repair bulletin, a decades-old loan document, or a region-specific compliance record, InSight DXP surfaces the right information instantly and tailors the guidance to each user’s expertise level. Transform your technical documentation today The automotive industry faces a clear inflection point. With McKinsey projecting $80 billion in automotive software market value by 2030 and technician shortages reaching crisis levels, organizations that modernize their documentation systems from a cost center into a competitive advantage will capture disproportionate value. Ready to revolutionize how your organization manages technical knowledge? Explore our automotive solutions and get started with MongoDB Atlas Vector Search today . Visit the MongoDB AI Learning Hub to learn more about building AI applications with MongoDB.

August 4, 2025

Artificial Intelligence

Fine-tune MongoDB Deployments with AppMap’s AI Tools and Diagrams

In a rapidly changing landscape, organizations that adapt for growth, efficiency, and competitiveness will be best positioned to succeed. Central to this effort is the continuous fine-tuning and troubleshooting of existing deployments, enabling companies to deliver high-performance applications that meet their business requirements. Yet, navigating application components often leads to long development cycles and high costs. Developers spend valuable time deciphering various programming languages, frameworks, and infrastructures to optimize their systems. They may have to work with complicated, intertwined code, which makes updates difficult. Moreover, older architectures increase information overload with no institutional memory to understand current workloads. To help organizations overcome these challenges, AppMap partnered with MongoDB Atlas to fine-tune MongoDB deployments and achieve optimal performance, enabling developers to build more modern and efficient applications. The AppMap solution empowers developers with AI-driven insights and interactive diagrams that clarify application behavior, decode complex application architectures, and streamline troubleshooting. This integration delivers personalized recommendations for query optimization, proper indexing, and better database interactions. Complementing these capabilities, MongoDB Atlas offers the flexibility, performance, and security essential for building resilient applications and advancing AI-powered experiences. AppMap’s technology stack Founded in 2020 by CEO Elizabeth Lawler, AppMap empowers developers to visualize, understand, and optimize application behavior. By analyzing applications in action, AppMap delivers precise insights into interactions and performance dynamics, recording APIs, functions, and service behaviors. This information is then presented as interactive diagrams, as shown in Figure 1, which can be easily searched and navigated to streamline the development process. Figure 1. 
Interactive diagram for a MongoDB query. As shown below, AppMap also features Navie, an AI assistant. Navie offers customers advanced code architecture analysis and customized recommendations, derived from capturing application behavior at runtime. This rich data empowers Navie to deliver smarter suggestions, assisting teams in debugging complex issues, asking contextual questions about unfamiliar code, and making more informed code changes. Figure 2. The AppMap Navie AI assistant. With these tools, AppMap improves the quality of the code running with MongoDB, helping developers better understand the flow of their apps. Using AppMap in a MongoDB application Imagine that your team has developed a new e-commerce application running on MongoDB. But you're unfamiliar with how this application operates, so you'd like to gain insights into its behavior. In this scenario, you decide to analyze your application using AppMap by executing the node package with your standard run command. npx appmap-node npm run dev With this command, you use your application just like you normally would. But now every time your app communicates through an API, it will create records. These records are used to create diagrams that help you see and understand how your application works. You can look at these diagrams to get more insights into your app's behavior and how it interacts with the MongoDB database. Figure 3. Interaction diagram for an e-commerce application. Next, you can use the Navie AI assistant to receive tailored insights and suggestions for your application. For instance, you can ask Navie to identify the MongoDB commands your application uses and to provide advice on optimizing query performance. Navie will identify the workflow of your application and may propose strategies to refine database queries, such as reindexing for improved efficiency or adjusting aggregation framework parameters. Figure 4. Insights provided by the Navie AI assistant. 
With this framework established, you can seamlessly interact with your MongoDB application, gain insights into its usage, enhance its performance, and achieve quicker time to market. Enhancing MongoDB apps with AppMap Troubleshooting and optimizing your MongoDB applications can be challenging, due to the complexity of related microservices that run your services. AppMap facilitates this process by providing in-depth insights into your application behavior with an AI-powered assistant, helping developers better understand your code. With faster root cause analysis and deeper code understanding, businesses can boost developer productivity, improve application performance, and enhance customer satisfaction. These benefits ultimately lead to greater agility and a stronger competitive position in the market. Enhance your development experience with MongoDB Atlas and AppMap . To learn more about how to fine-tune apps with MongoDB, check out the best practices guide for MongoDB performance and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving partner ecosystem.

July 30, 2025

Artificial Intelligence

Introducing voyage-context-3: Focused Chunk-Level Details with Global Document Context

Note to readers: voyage-context-3 is currently available through the Voyage AI API directly. For access, sign up for Voyage AI . TL;DR : We’re excited to introduce voyage-context-3, a contextualized chunk embedding model that produces vectors for chunks that capture the full document context without any manual metadata and context augmentation, leading to higher retrieval accuracies than with or without augmentation. It’s also simpler, faster, and cheaper, and is a drop-in replacement for standard embeddings without downstream workflow changes, also reducing chunking strategy sensitivity. On chunk-level and document-level retrieval tasks, voyage-context-3 outperforms OpenAI-v3-large by 14.24% and 12.56%, Cohere-v4 by 7.89% and 5.64%, Jina-v3 late chunking by 23.66% and 6.76%, and contextual retrieval by 20.54% and 2.40%, respectively. It also supports multiple dimensions and multiple quantization options enabled by Matryoshka learning and quantization-aware training, saving vectorDB costs while maintaining retrieval accuracy. For example, voyage-context-3 (binary, 512) outperforms OpenAI-v3-large (float, 3072) by 0.73% while reducing vector database storage costs by 99.48%—virtually the same performance at 0.5% of the cost. We’re excited to introduce voyage-context-3, a novel contextualized chunk embedding model, where chunk embedding encodes not only the chunk's own content, but also captures the contextual information from the full document. voyage-context-3 provides a seamless drop-in replacement for standard, context-agnostic embedding models used in existing retrieval-augmented generation (RAG) pipelines, while offering improved retrieval quality through its ability to capture relevant contextual information. 
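The storage figure in the TL;DR can be checked with simple arithmetic. The sketch below assumes that "float" means 32 bits per dimension and "binary" means 1 bit per dimension, and it counts only raw vector bytes, ignoring any index or metadata overhead a vector database adds. The function name is illustrative.

```python
def bytes_per_vector(dimensions: int, bits_per_dimension: int) -> float:
    """Raw storage for one embedding, in bytes."""
    return dimensions * bits_per_dimension / 8


# OpenAI-v3-large (float, 3072): 3072 dimensions at an assumed 32 bits each.
baseline_bytes = bytes_per_vector(3072, 32)

# voyage-context-3 (binary, 512): 512 dimensions at 1 bit each.
compact_bytes = bytes_per_vector(512, 1)

# Relative saving in raw vector storage.
reduction_pct = (1 - compact_bytes / baseline_bytes) * 100

print(f"{baseline_bytes:.0f} bytes vs {compact_bytes:.0f} bytes per vector")
print(f"storage reduction: {reduction_pct:.2f}%")
```

Under these assumptions, each vector shrinks from 12,288 bytes to 64 bytes, which matches the 99.48% reduction quoted above.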
Compared to both context-agnostic models with isolated chunking (e.g., OpenAI-v3-large, Cohere-v4) and existing methods that add context and metadata to chunks, such as overlapping chunks and attaching metadata, voyage-context-3 delivers significant gains in retrieval performance while simplifying the tech stack. On chunk-level retrieval (retrieving the most relevant chunk) and document-level retrieval (retrieving the document containing the most relevant chunk), voyage-context-3 outperforms on average:
- OpenAI-v3-large by 14.24% and 12.56%, and Cohere-v4 by 7.89% and 5.64%, respectively.
- The context augmentation methods Jina-v3 late chunking [1] and contextual retrieval [2] by 23.66% and 6.76%, and 20.54% and 2.40%, respectively.
- voyage-3-large by 7.96% and 2.70%, respectively.

Chunking challenges in RAG

Focused detail vs. global context. Chunking, which breaks large documents into smaller segments (chunks), is a common and often necessary step in RAG systems. Originally, chunking was primarily driven by the models’ limited context windows (which have recently been extended significantly by, e.g., Voyage’s models). More importantly, it allows the embeddings to contain precise, fine-grained information about the corresponding passages and, as a result, allows the search system to pinpoint the relevant passages. However, this focus can come at the expense of broader context. Finally, without chunking, users must pass complete documents to downstream large language models (LLMs), driving up costs because many tokens may be irrelevant to the query. For instance, if a 50-page legal document is vectorized into a single embedding, detailed information, such as the sentence “All data transmissions between the Client and the Service Provider’s infrastructure shall utilize AES-256 encryption in GCM mode,” is likely to be buried or lost in the aggregate.
By chunking the document into paragraphs and vectorizing each one separately, the resulting embeddings can better capture localized details like “AES-256 encryption.” However, such a paragraph may not contain global context, such as the Client’s name, which is necessary to answer queries like “What encryption methods does Client VoyageAI want to use?” Ideally, we want both focused detail and global context, without tradeoffs. Common workarounds, such as chunk overlaps, context summaries using LLMs (e.g., Anthropic’s contextual retrieval), or metadata augmentation, can introduce extra steps into an already complex AI application pipeline. These steps often require further experimentation to tune, resulting in increased development time and serving cost overhead.

Introducing contextualized chunk embeddings

We’re excited to introduce contextualized chunk embeddings that capture both focused detail and global context. Our model processes the entire document in a single pass and generates a distinct embedding for each chunk. Each vector encodes not only the specific information within its chunk but also coarse-grained, document-level context, enabling richer and more semantically aware retrieval. The key is that the neural network sees all the chunks at the same time and decides intelligently what global information from other chunks should be injected into the individual chunk embeddings.

Automatic full-document context awareness: Contextualized chunk embeddings capture the full context of the document without requiring the user to manually or explicitly provide contextual information. This leads to improved retrieval performance compared to isolated chunk embeddings, while remaining simpler, faster, and cheaper than other context-augmentation methods.

Seamless drop-in replacement and storage cost parity: voyage-context-3 is a seamless drop-in replacement for standard, context-agnostic embedding models used in existing search systems, RAG pipelines, and agentic systems.
It accepts the same input chunks and produces vectors with identical output dimensions and quantization, now enriched with document-level context for better retrieval performance. In contrast to ColBERT, which introduces a large number of vectors and correspondingly high storage costs, voyage-context-3 generates the same number of vectors and is fully compatible with any existing vector database.

Less sensitive to chunking strategy: While chunking strategy still influences RAG system behavior, and the optimal approach depends on data and downstream tasks, our contextualized chunk embeddings are empirically shown to reduce the system's sensitivity to these strategies, because the model intelligently supplements overly short chunks with global context.

Contextualized chunk embeddings outperform manual or LLM-based contextualization because neural networks are trained to capture context intelligently from large datasets, surpassing the limitations of ad hoc efforts. voyage-context-3 was trained using both document-level and chunk-level relevance labels, along with a dual objective that teaches the model to preserve chunk-level granularity while incorporating global context.

| Approach | Context Preservation | Engineering Complexity | Retrieval Accuracy |
| Standard Embeddings (e.g., OpenAI-v3-large) | None | Low | Moderate |
| Metadata Augmentation & Contextual Retrieval (e.g., Jina-v3 late chunking) | Partial | High | Moderate-High |
| Contextualized Chunk Embeddings (e.g., voyage-context-3) | Full, Principled | Low | Highest |

Evaluation details

Chunk-level and document-level retrieval

For a given query, chunk-level retrieval returns the most relevant chunks, while document-level retrieval returns the documents containing those chunks. The figure below illustrates both retrieval levels across chunks from n documents. The most relevant chunk, often referred to as the “golden chunk,” is bolded and shown in green.
Its corresponding parent document is shown in blue. Datasets We evaluate on 93 domain-specific retrieval datasets, spanning nine domains: web reviews, law, medical, long documents, technical documentation, code, finance, conversations, and multilingual, which are listed in this spreadsheet . Every dataset contains a set of queries and a set of documents. Each document consists of an ordered sequence of chunks, which are created by us via a reasonable chunking strategy. As usual, every query has a number of relevant documents with a potential score indicating the degree of relevance, which we call document-level relevance labels and can be used for the evaluation of document-level retrieval. Moreover, each query also has a list of most relevant chunks with relevance scores, which are curated through various ways, including labeling by LLMs. These are referred to as chunk-level relevance labels and are used for chunk-level retrieval evaluation. We also include proprietary real-world datasets, such as technical documentation and documents containing header metadata. Finally, we assess voyage-context-3 across different embedding dimensions and various quantization options, on standard single-embedding retrieval evaluation, using the same datasets as in our previous retrieval-quality-versus-storage-cost analysis . Models We evaluate voyage-context-3 alongside several alternatives, including: OpenAI-v3-large (text-embedding-3-large), Cohere-v4 (embed-v4.0), Jina-v3 late chunking (jina-embeddings-v3), contextual retrieval, voyage-3.5, and voyage-3-large. Metrics Given a query, we retrieve the top 10 documents based on cosine similarities and report the normalized discounted cumulative gain (NDCG@10), a standard metric for retrieval quality and a variant of the recall. Results All the evaluation results are available in this spreadsheet , and we analyze the data below. Domain-specific quality. 
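For reference, the evaluation protocol described under Metrics above (rank by cosine similarity, keep the top 10, score with NDCG@10) can be sketched in a few lines. This is a minimal illustration and not the evaluation harness used for the reported numbers: the toy vectors, the document IDs, and the choice of a linear gain (some NDCG variants use 2^rel - 1 instead) are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=10):
    """Return the IDs of the k documents most similar to the query."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:k]

def dcg_at_k(gains, k=10):
    """Discounted cumulative gain with a linear gain (an assumption)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))

def ndcg_at_k(ranked_ids, relevance, k=10):
    """NDCG@k: DCG of the returned ranking divided by DCG of the ideal ranking."""
    gains = [relevance.get(doc_id, 0.0) for doc_id in ranked_ids]
    ideal_dcg = dcg_at_k(sorted(relevance.values(), reverse=True), k)
    return dcg_at_k(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical toy data: three documents and graded relevance labels for one query.
doc_vecs = {"d1": [1.0, 0.0], "d2": [0.7, 0.7], "d3": [0.0, 1.0]}
relevance = {"d1": 3.0, "d2": 1.0}

ranking = top_k([1.0, 0.0], doc_vecs)
perfect = ndcg_at_k(ranking, relevance)
shuffled = ndcg_at_k(["d3", "d1", "d2"], relevance)
```

Here `perfect` is 1.0 because the cosine ranking matches the ideal ordering, while `shuffled` falls between 0 and 1 because the irrelevant document is ranked first. The same functions apply to chunk-level evaluation if chunk IDs and chunk-level labels are used instead.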
The bar charts below show the average retrieval quality of voyage-context-3 with full-precision 2048 embeddings for each domain. In the following chunk-level retrieval chart, we can see that voyage-context-3 outperforms all other models across all domains. As noted earlier, for chunk-level retrieval, voyage-context-3 outperforms on average OpenAI-v3-large, Cohere-v4, Jina-v3 late chunking, and contextual retrieval by 14.24%, 7.89%, 23.66%, and 20.54%, respectively. voyage-context-3 also outperforms all other models across all domains in document-level retrieval, as shown in the corresponding chart below. On average, voyage-context-3 outperforms OpenAI-v3-large, Cohere-v4, Jina-v3 late chunking, and contextual retrieval by 12.56%, 5.64%, 6.76%, and 2.40%, respectively. Real-world datasets. voyage-context-3 performs strongly on our proprietary real-world technical documentation and in-house datasets, outperforming all other models. The bar chart below shows chunk-level retrieval results. Document-level retrieval results are provided in the evaluation spreadsheet . Chunking sensitivity . Compared to standard, context-agnostic embeddings, voyage-context-3 is less sensitive to variations in chunk size and delivers stronger performance with smaller chunks. For example, on document-level retrieval, voyage-context-3 shows only a 2.06% variance, compared to 4.34% for voyage-3-large, and outperforms voyage-3-large by 6.63% when using 64-token chunks. Context metadata . We also evaluate performance when context metadata is prepended to chunks. Even with metadata prepended to chunks embedded by voyage-3-large, voyage-context-3 outperforms it by up to 5.53%, demonstrating better retrieval performance without the extra work and resources required to prepend metadata. Matryoshka embeddings and quantization . 
voyage-context-3 supports 2048-, 1024-, 512-, and 256-dimensional embeddings enabled by Matryoshka learning, along with multiple embedding quantization options (32-bit floating point, signed and unsigned 8-bit integer, and binary precision), while minimizing quality loss. To clarify in relation to the previous figures, the chart below illustrates single-embedding retrieval on documents. Compared with OpenAI-v3-large (float, 3072), voyage-context-3 (int8, 2048) reduces vector database costs by 83% with 8.60% better retrieval quality. Further, comparing OpenAI-v3-large (float, 3072) with voyage-context-3 (binary, 512), vector database costs are reduced by 99.48% with 0.73% better retrieval quality; that’s virtually the same retrieval performance at 0.5% of the cost.

Try voyage-context-3

voyage-context-3 is available today! The first 200 million tokens are free. Get started with this quickstart tutorial. You can swap voyage-context-3 into any existing RAG pipeline without requiring any downstream changes. Contextualized chunk embeddings are especially effective for:
- Long, unstructured documents such as white papers, legal contracts, and research reports.
- Cross-chunk reasoning, where queries require information that spans multiple sections.
- High-sensitivity retrieval tasks, such as in finance, medical, or legal domains, where missing context can lead to costly errors.

To learn more about building AI applications with MongoDB, visit the MongoDB AI Learning Hub.

[1] Jina. “Late Chunking in Long-Context Embedding Models.” August 22, 2024.
[2] Anthropic. “Introducing Contextual Retrieval.” September 19, 2024.
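The storage figures quoted above can be reproduced with simple arithmetic. The sketch below assumes that vector database cost scales linearly with the raw bytes per stored vector and ignores index overhead and metadata; that proportionality is an assumption, not something stated in the post.

```python
# Bits used per embedding value for each quantization option.
BITS_PER_VALUE = {"float": 32, "int8": 8, "uint8": 8, "binary": 1}

def bytes_per_vector(dimensions: int, precision: str) -> float:
    """Raw storage for one embedding vector, in bytes."""
    return dimensions * BITS_PER_VALUE[precision] / 8

def storage_reduction(baseline_bytes: float, candidate_bytes: float) -> float:
    """Percentage of storage saved relative to the baseline."""
    return (1 - candidate_bytes / baseline_bytes) * 100

baseline = bytes_per_vector(3072, "float")    # OpenAI-v3-large (float, 3072): 12288 bytes
int8_2048 = bytes_per_vector(2048, "int8")    # voyage-context-3 (int8, 2048): 2048 bytes
binary_512 = bytes_per_vector(512, "binary")  # voyage-context-3 (binary, 512): 64 bytes

int8_saving = round(storage_reduction(baseline, int8_2048), 2)     # 83.33
binary_saving = round(storage_reduction(baseline, binary_512), 2)  # 99.48
```

Under this assumption, a 64-byte binary vector is about 0.5% of the size of a 12,288-byte float vector, which matches the "0.5% of the cost" figure.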

July 23, 2025

Artificial Intelligence

Revolutionizing Inventory Classification with Generative AI

In today's volatile geopolitical environment, the global automotive industry faces compounding disruptions that require a fundamental rethink of data and operations strategy. After decades of low import taxes, the return of tariffs as a tool of economic negotiations has led the global automotive industry to delay model-year transitions and disrupt traditional production and release cycles. As of June 2025, only 3% of US automotive inventory comprises next-model-year vehicles —less than half the number seen at this time in previous years. This severe decline in new-model availability, compounded by a 12.2% year-over-year drop in overall inventory, is pressuring consumer pricing and challenging traditional dealer inventory management. In this environment of constrained supply, better tools are urgently needed to classify and control vehicle, spare part, and raw material inventories for both dealers and manufacturers. Traditionally, dealerships and automakers have relied on ABC analysis to segment and control inventory by value. This widely used method classifies items into Category A, B, or C. For example, Category A items typically represent just 20% of stock but drive 80% of sales, while Category C items might comprise half the inventory yet contribute only 5% to the bottom line. This approach effectively helps prioritize resource allocation and promotional efforts. Figure 1. ABC analysis for inventory classification. While ABC analysis is known for its ease of use, it has been criticized for its focus on dollar usage. For example, not all Category C items are necessarily low-priority, as some may be next-model-year units arriving early or aging stock affected by shifting consumer preferences. Other criteria—such as lead-time, commonality, obsolescence, durability, inventory cost, and order size requirements—have also been recognized as critical for inventory classification. 
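The ABC analysis described above can be sketched as a cumulative-share calculation. The 80% and 95% cutoffs, the item names, and the sales values below are illustrative assumptions; real cutoffs vary by business. An item is classified by the cumulative share reached before it is added, so the item that crosses a cutoff still falls in the higher class.

```python
def abc_classify(annual_value, a_cutoff=0.80, b_cutoff=0.95):
    """Assign each item to class A, B, or C by cumulative share of total value.

    Items are sorted by value (highest first). An item's class depends on the
    cumulative share already covered by higher-value items.
    """
    total = sum(annual_value.values())
    classes = {}
    cumulative = 0.0
    for item, value in sorted(annual_value.items(), key=lambda kv: kv[1], reverse=True):
        share_before = cumulative / total
        if share_before < a_cutoff:
            classes[item] = "A"
        elif share_before < b_cutoff:
            classes[item] = "B"
        else:
            classes[item] = "C"
        cumulative += value
    return classes

# Hypothetical annual sales value per inventory item.
sales = {"sedan": 500, "suv": 300, "wiper_blade": 100, "bulb": 60, "floor_mat": 40}
result = abc_classify(sales)
# {'sedan': 'A', 'suv': 'A', 'wiper_blade': 'B', 'bulb': 'B', 'floor_mat': 'C'}
```

Because the classification looks only at dollar value, a slow-selling but strategically important item lands in class C here, which is the limitation the multi-criteria approaches aim to address.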
A multi-criteria inventory classification (MCIC) methodology, therefore, adds additional criteria to dollar usage. MCIC can be achieved with methods like statistical clustering or unsupervised machine learning techniques. Yet, a significant blind spot remains: the vast amount of unstructured data that organizations must deal with; unstructured data accounts for an estimated 80% of the world's total. Traditional ABC analysis—and even MCIC—often overlook the growing influence of insights gleaned from unstructured sources like customer sentiment and product reviews on digital channels. But now, valuable intelligence from reviews, social media posts, and dealer feedback can be vectorized and transformed into actionable features using large language models (LLMs). For instance, analyzing product reviews can yield qualitative metrics like the probability of recommending or repurchasing a product, or insights into customer expectations vs. the reality of ownership. This textual analysis can also reveal customers' product perspectives, directly informing future demand. By integrating these signals into inventory classification models, businesses can gain a deeper understanding of true product value and demand elasticity. This fusion of structured and unstructured data represents a crucial shift from reactive inventory management to predictive and customer-centric decision-making. In this blog post, we propose a novel methodology to convert unstructured data into powerful feature sets for augmenting inventory classification models. Figure 2. Transforming unstructured data into features for machine learning models. How MongoDB enables AI-driven inventory classification So, how does MongoDB empower the next generation of AI-driven inventory classification? It all comes down to four crucial steps, and MongoDB provides the robust technology and features to support every single one. Figure 3. Methodology and requirements for gen AI-powered inventory classification. 
Step 1: Create and store vector embeddings from unstructured data

MongoDB Atlas enables modern vector search workflows. Unstructured data like product reviews, supplier notes, or customer support transcripts can be vectorized via embedding models (such as Voyage AI models) and ingested into MongoDB Atlas, where the embeddings are stored next to the original text chunks. This data then becomes searchable using MongoDB Atlas Vector Search, which allows you to run native semantic search queries directly inside the database. Unlike solutions that require separate databases for structured and vector data, MongoDB stores them side by side using the flexible document model, enabling unified access via one API. This reduces system complexity, technical debt, and infrastructure footprint, and allows for low-latency semantic searches.

Figure 4. Product reviews can be stored as vector embeddings in MongoDB Atlas.

Step 2: Design and store evaluation criteria

In a gen AI-powered inventory classification system, evaluation criteria are no longer a set of static rules stored in a spreadsheet. Instead, the criteria are dynamic and data-backed: they are generated by an AI agent using structured and unstructured data, and enriched by domain experts using business objectives and constraints. As shown in Figure 5, the criteria for features like “Product Durability” can be defined based on relevant unstructured data stored in MongoDB (product reviews, audit reports) as well as structured data like inventory turnover and sales history. Such criteria are not just instructions or rules, but knowledge objects with structure and semantic depth. The AI agent uses tools such as generate_criteria and embed_criteria and iterates over each product in the inventory. It leverages the LLM to create the criteria definition and uses an embedding model (e.g., voyage-3-large) to generate embeddings of each definition. MongoDB Atlas is uniquely suited to store these dynamic criteria.
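As a rough illustration of Steps 1 and 2, the sketch below builds a criteria document and an Atlas Vector Search aggregation pipeline as plain Python dictionaries. The field names, the index name `reviews_vector_index`, the `product_id` filter field, and the three-element placeholder vector are all hypothetical; in practice the embedding would come from an embedding model and the filter field would need to be declared in the vector index definition.

```python
from typing import Any

def build_criteria_document(feature: str, definition: str,
                            data_sources: list[str],
                            embedding: list[float]) -> dict[str, Any]:
    """Shape of one criteria 'knowledge object' (field names are assumptions)."""
    return {
        "feature": feature,
        "definition": definition,
        "data_sources": data_sources,
        "embedding": embedding,
    }

def build_review_search_pipeline(query_vector: list[float], product_id: str,
                                 limit: int = 5) -> list[dict[str, Any]]:
    """Aggregation pipeline that finds reviews semantically close to a criterion."""
    return [
        {
            "$vectorSearch": {
                "index": "reviews_vector_index",  # hypothetical index name
                "path": "embedding",              # field holding the review vectors
                "queryVector": query_vector,
                "numCandidates": limit * 20,      # candidates considered before ranking
                "limit": limit,
                "filter": {"product_id": {"$eq": product_id}},
            }
        },
        # Keep only the review text and the similarity score.
        {"$project": {"_id": 0, "text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]

# Placeholder vector standing in for a real embedding of the definition.
placeholder_vector = [0.12, -0.03, 0.57]

criteria = build_criteria_document(
    feature="Product Durability",
    definition="How well the part holds up over time according to owner reviews.",
    data_sources=["product_reviews", "audit_reports", "sales_history"],
    embedding=placeholder_vector,
)
pipeline = build_review_search_pipeline(criteria["embedding"], product_id="brake-pad-01")

# With a live cluster and PyMongo, the pipeline would be run roughly as:
#   results = db.reviews.aggregate(pipeline)
```

Keeping the criteria definition, its sources, and its embedding in one document means an agent can retrieve the rule and run the semantic search with a single query vector.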
Each rule is modeled as a flexible JSON document containing the name of the feature, the criteria definition, the data sources used, and the embeddings. Since there are different types of products (different car models/makes and different car parts), the documents can evolve over time without requiring schema migrations and can be queried and retrieved by the AI agent in real time. MongoDB Atlas provides all the necessary tools for this design (a flexible document model database, vector search, and full-text search) that the AI agent can leverage to create the criteria.

Figure 5. Unstructured and structured data are used by the AI agent to create criteria for feature generation.

Step 3: Create an agentic application to perform transformation based on the criteria

In the third step, another AI agent operates over products, criteria, and unstructured data to generate enriched feature sets. This agent iterates over every product, uses MongoDB Atlas Vector Search to find relevant customer reviews, applies the criteria to them, and calculates a numerical feature score. The new features are added to the original features JSON document in MongoDB. In Figure 6, the agent has created “durability” and “criticality” features from the product reviews. MongoDB Atlas is the ideal foundation for this agentic architecture. Again, it provides the agent with the tools it needs for features to evolve, adding new dimensions without requiring schema redesign. This results in an adaptive classification dataset that contains both structured and unstructured data.

Figure 6. An AI agent enriches product features with vectorized review data to generate new features.

Step 4: Rerun the inventory classification model with new features added

As a final step, the inventory classification domain experts can assign or balance weights to existing and new features, choose a classification technique, and rerun inventory classification to find new inventory classes.
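One simple way to picture this reweighting step is a weighted sum over normalized feature scores. The sketch below is an illustration under stated assumptions, not the classification technique used in the solution: the feature names, the weights, the score thresholds, and the assumption that every feature is already scaled to the range 0 to 1 are all hypothetical.

```python
def weighted_score(features, weights):
    """Weighted average of feature scores (each assumed to be in [0, 1])."""
    total_weight = sum(weights.values())
    return sum(features[name] * w for name, w in weights.items()) / total_weight

def classify(score, a_threshold=0.7, b_threshold=0.4):
    """Map a combined score to an inventory class (thresholds are assumptions)."""
    if score >= a_threshold:
        return "A"
    if score >= b_threshold:
        return "B"
    return "C"

# Hypothetical product: low dollar usage, but long lead time and strong reviews.
product = {"dollar_usage": 0.2, "lead_time": 0.9, "customer_satisfaction": 0.9}

# Classification on dollar usage alone, as in plain ABC analysis.
before = classify(weighted_score(product, {"dollar_usage": 1.0}))

# Classification after adding a generated feature with a weight of 0.2.
weights = {"dollar_usage": 0.5, "lead_time": 0.3, "customer_satisfaction": 0.2}
combined = weighted_score(product, weights)
after = classify(combined)
```

In this toy case the item moves from class C to class B once lead time and the review-derived score are taken into account, which is the kind of shift domain experts would look for when balancing weights.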
Figure 7 shows the process where generative AI features are used in the existing inventory classification algorithm.

Figure 7. Domain experts can rerun classification after balancing weights.

Figure 8 shows the solution in action. The customer satisfaction score is created by an LLM using a vectorized collection of customer reviews and is then used in the inventory classification model with a new weight of 0.2.

Figure 8. Inventory classification using generative AI.

Driving smarter inventory decisions

As the automotive industry navigates slowing sales and uneven inventory, traditional inventory classification techniques also need to evolve. Though such techniques provide a solid foundation, they fall short in the face of geopolitical uncertainty, tariff-driven supply shifts, and fast-evolving consumer expectations. By combining structured sales and consumption data with unstructured insights, and enabling agentic AI using MongoDB, the automotive industry can enable a new era of inventory intelligence where products are dynamically classified based on all available data, both structured and unstructured. Clone the GitHub repository if you are interested in trying out this solution yourself. To learn more about MongoDB’s role in the manufacturing industry, please visit our manufacturing and automotive webpage.

July 16, 2025

Artificial Intelligence

Build an AI-Ready Data Foundation with MongoDB Atlas on Azure

It’s time for a database reality check. While conversations around AI usually focus on its immense potential, these advancements are also bringing developers face to face with an immediate challenge: Their organizations’ data infrastructure isn’t ready for AI. Many developers now find themselves trying to build tomorrow’s applications on yesterday’s foundations. But what if your database could shift from bottleneck to breakthrough? Is your database holding you back? Traditional databases were built for structured data in a pre-AI world—they’re simply not designed to handle today’s need for flexible, real-time data processing. Rigid schemas force developers to spend time managing database structure instead of building features, while separate systems for operational data and analytics create costly delays and complexity. Your data architecture might be holding you back if: Your developers spend more time wrestling with data than innovating. AI implementation feels like forcing a square peg into a round hole. Real-time analytics are anything but real-time. Go from theory to practice: Examples of modern data architecture at work Now is the time to rethink your data foundation by moving from rigid to flexible schemas that adapt as applications evolve. Across industries, leading organizations are unifying operational and analytical structures to eliminate costly synchronization processes. Most importantly, they’re embracing databases that speak developers’ language. In the retail sector , business demands include dynamic pricing that responds to market conditions in real-time. Using MongoDB Atlas with Azure OpenAI from Microsoft Azure, retailers are implementing sophisticated pricing engines that analyze customer behavior and market conditions, enabling data-driven decisions at scale. 
In the healthcare sector , organizations can connect MongoDB Atlas to Microsoft Fabric for advanced imaging analysis and results management, streamlining the flow of critical diagnostic information while maintaining security and compliance. More specifically, when digital collaboration platform Mural faced a 1,700% surge in users, MongoDB Atlas on Azure handled its unstructured application data. The results aligned optimally with modern data principles: Mural’s small infrastructure team maintained performance during massive growth, while other engineers were able to focus on innovation rather than database management. As noted by Mural’s Director of DevOps, Guido Vilariño, this approach enabled Mural’s team to “build faster, ship faster, and ultimately provide more expeditious value to customers.” This is exactly what happens when your database becomes a catalyst rather than an obstacle. Shift from “database as storage” to “database as enabler” Modern databases do more than store information—they actively participate in application intelligence. When your database becomes a strategic asset rather than just a record-keeping necessity, development teams can focus on innovation instead of infrastructure management. What becomes possible when data and AI truly connect? Intelligent applications can combine operational data with Azure AI services. Vector search capabilities can enhance AI-driven features with contextual data. Applications can handle unpredictable workloads through automated scaling. Seamless integration occurs between data processing and AI model deployment. Take the path to a modern data architecture The deep integration between MongoDB Atlas and Microsoft’s Intelligent Data Platform eliminates complex middleware, so organizations can streamline their data architecture while maintaining enterprise-grade security. 
The platform unifies operational data, analytics, and AI capabilities—enabling developers to build modern applications without switching between multiple tools or managing separate systems. This unified approach means security and compliance aren’t bolt-on features—they’re core capabilities. From Microsoft Entra ID integration for access control to Azure Key Vault for data protection, the platform provides comprehensive security while simplifying the development experience. As your applications scale, the infrastructure scales with you, handling everything from routine workloads to unexpected traffic spikes without adding operational complexity. Make your first move Starting your modernization journey doesn’t require a complete infrastructure overhaul or the disruption of existing operations. You can follow a gradual migration path that prioritizes business continuity and addresses specific challenges. The key is having clear steps for moving from legacy to modern architecture. Make decisions that simplify rather than complicate: Choose platforms that reduce complexity rather than add to it. Focus on developer experience and productivity. Prioritize solutions that scale with your needs. For example, you can begin with a focused proof of concept that addresses a specific challenge—perhaps an AI feature that’s been difficult to implement or a data bottleneck that’s slowing development. Making small wins in these areas demonstrates value quickly and builds momentum for broader adoption. As you expand your implementation, focus on measurable results that matter to your organization. Tracking these metrics—whether they’re developer productivity, application performance, or new capabilities—helps justify further investment and refine your approach. Avoid these common pitfalls As you undertake your modernization journey, avoid these pitfalls: Attempting to modernize everything simultaneously: This often leads to project paralysis. 
Instead, prioritize applications based on business impact and technical feasibility. Creating new data silos: In your modernization efforts, the goal must be integration and simplification. Adding complexity: Remember that while simplicity scales, complexity compounds. Each decision should move you toward a more streamlined architecture, not a more convoluted one. The path to a modern, AI-ready data architecture is an evolution, not a revolution. Each step builds on the last, creating a foundation that supports not just today’s applications but also tomorrow’s innovations. Take the next step: Ready to modernize your data architecture for AI? Explore these capabilities further by watching the webinar “Enhance Developer Agility and AI-Readiness with MongoDB Atlas on Azure.” Then get started on your modernization journey! Visit the MongoDB AI Learning Hub to learn more about building AI applications with MongoDB.

July 8, 2025

Artificial Intelligence

Unified Commerce for Retail Innovation with MongoDB Atlas

Unified commerce is often touted as a transformative concept, yet it represents a long-standing challenge for retailers—disparate data sources and siloed systems. It’s less of a revolutionary concept and more of a necessary shift to make long-standing problems more manageable. Doing so provides a complete business overview—and enables personalized customer experiences—by breaking down silos and ensuring consistent interactions across online, in-store, and mobile channels. Real-time data analysis enables targeted content and recommendations. Unified commerce boosts operating efficiency by connecting systems and automating processes, reducing manual work, errors, and costs, while improving customer satisfaction. Positive customer experience results in repeat customers, improving revenue, and reducing the cost of customer acquisition. MongoDB Atlas offers a robust foundation for unified commerce, addressing critical challenges within the retail sector and providing capabilities that enhance customer experience, optimize operations, and foster business growth. Figure 1. Customer touchpoints in the retail ecosystem. Retail businesses are shifting to a customer-centric and data-driven approach by unifying the customer journey for a seamless, personalized experience that builds loyalty and growth. While retail has long relied on omnichannel strategies with stores, websites, apps, and social media, these often involve separate systems, causing fragmented experiences and inefficiencies. Unified commerce, integrating physical and digital retail via a unified data platform, is a necessary evolution for retailers facing challenges with diverse platforms and data silos. Cloud-based data architectures, AI, and event-driven processing can overcome these hurdles, enabling enhanced customer engagement, optimized operations, and revenue growth. This integration delivers a frictionless customer experience crucial in today's digital marketplace. Figure 2. 
Enabling a customer-centric approach with unified commerce.

MongoDB Atlas for unified commerce

MongoDB Atlas provides a strong foundation for unified commerce, addressing key challenges in the retail sector and offering capabilities that enhance customer experience, optimize operations, and drive business growth. MongoDB's flexible document model allows retailers to consolidate varied data, eliminating data silos. This provides consistent, real-time information across all channels for enhanced customer experiences and better decision-making. In MongoDB, diverse data can be stored without rigid schemas, enabling quick adaptation to changing needs and faster integration of siloed physical and digital systems.

Figure 3. Unified customer 360 using MongoDB.

Real-world adoption: Lidl, part of the Schwarz Group, implemented an automatic stock reordering application for branches and warehouses, addressing complex data and high volumes to improve supply chain efficiency through real-time data synchronization.

Real-time data synchronization for enhanced CX

In retail, real-time processing of customer interactions is crucial. MongoDB's Change Streams and event-driven architecture allow retailers to capture and react to customer behavior instantly. This enables personalized experiences like dynamic pricing, instant order updates, and tailored recommendations, fostering customer loyalty and driving conversions.

Figure 4. Real-time data in the operational data layer for enhanced customer experiences.

Atlas change streams and triggers enable real-time data synchronization across retail channels, ensuring consistent inventory information and preventing overselling on both physical and e-commerce platforms.

Real-world adoption: CarGurus uses MongoDB Atlas to manage vast amounts of real-time data across its platform and support seamless, personalized user experiences both online and in person.
The flexible document model helps them handle the diverse data structures required for their automotive marketplace.

Scalability & high-traffic retail

MongoDB Atlas's cloud-native architecture provides automatic horizontal scaling, enabling retailers to manage demand fluctuations like seasonal spikes and product expansions without impacting performance, which is crucial for scaling unified commerce. MongoDB Atlas's auto-scaling and multi-cloud features allow retailers to handle traffic spikes during peak periods (holidays, flash sales) without downtime or performance issues. The platform automatically adjusts resources based on demand, ensuring responsiveness and availability, which is vital for positive customer experiences and maximizing sales.

Figure 5. Highly scalable MongoDB Atlas for high-traffic retail.

Real-world adoption: Commercetools modernized its composable commerce platform using MongoDB Atlas and MACH architecture and achieved amazing throughput for Black Friday. This demonstrates Atlas's ability to handle high-volume retail events through its scalability features.

AI and analytics integration

MongoDB Atlas enables retailers to gain actionable insights from unified commerce data by integrating with AI and analytics tools. This facilitates personalized shopping, predictive inventory, and targeted marketing across online and offline channels through data-driven decisions. Personalization is a key driver of customer engagement and conversion in the retail industry. MongoDB Atlas Search, with its full-text and vector search capabilities, enables retailers to deliver intelligent product recommendations, visual search experiences, and AI-powered assistants. By leveraging these advanced search and AI capabilities, retailers can help customers find the products they're looking for quickly and easily, provide personalized recommendations based on their interests and preferences, and create a more intuitive and enjoyable shopping experience.
Real-world adoption: L'Oréal improved customer experiences through personalized, inclusive, and responsible beauty across several apps. Retailers on MongoDB Atlas can leverage its unstructured data capabilities, vector search, and AI integrations to create real-time, AI-driven applications.

Seamless data integration

Atlas offers ETL/CDC connectors and APIs to consolidate diverse retail data into a unified operational layer. This single source of truth combines inventory, customer, transaction, and digital data from legacy systems, enabling consistent omnichannel experiences and eliminating data silos that hinder unified commerce.

Figure 6. MongoDB Atlas for unified commerce.

Real-world adoption: MongoDB helps global retailers, like Adeo, unify cross-channel data into an operational layer for easy synchronization across online and physical platforms, enabling better customer experiences.

Advanced search capabilities

MongoDB Atlas provides built-in text and vector search capabilities, enabling retailers to create advanced search experiences for enhanced product discovery and personalization across online and physical channels.

Figure 7. Integrated search capabilities in MongoDB.

Real-world adoption: MongoDB's data platform with integrated search enables retailers to improve customer experience and unify commerce. Customers like Albertsons use this for both customer-facing and back-office operations.

Composable architecture with data mesh principles

MongoDB supports a composable architecture that aligns with data mesh principles, enabling retailers to build decentralized, scalable, and self-service data infrastructure. Using a domain-driven design approach, different teams within the organization can manage their own data products (e.g., customers, orders, inventory) as independent services. This approach promotes agility, scalability, and data ownership, allowing teams to innovate and iterate quickly while maintaining data integrity and governance.

Figure 7.
MongoDB Atlas enables domain-driven design for the retail enterprise data foundation.

Global distribution

For international retailers using unified commerce, Atlas provides low-latency global data access, ensuring fast performance and data sovereignty compliance across multiple markets. MongoDB Atlas enables retailers to distribute data globally across AWS, Google Cloud, and Azure regions as needed, building distributed and multi-cloud architectures for low-latency customer access worldwide.

Figure 8. Serving always-on, globally distributed, write-everywhere apps with MongoDB Atlas global clusters.

Use cases: How unified commerce transforms retail

Unified commerce streamlines the retail experience by integrating diverse channels into a cohesive system. This approach facilitates customer interactions across online and physical stores, enabling features such as real-time inventory checks, personalized recommendations based on purchase history regardless of where the transaction took place, and frictionless return processes. The objective is a seamless and efficient shopping journey, built on interconnected functionality and a modern data platform that makes such a data estate possible.

Always-stocked shelves and knowing what's where: With real-time inventory, retailers can offer online ordering with delivery or pickup and provide stock estimates. Store staff can use real-time inventory to help customers and place orders, minimizing out-of-stocks.

Treating customers as individuals is a key aspect of retail. Retail enterprises need a unified view of customer data to offer personalized recommendations, offers, and content, as well as dynamic pricing based on loyalty and market factors. Engaging customers on their preferred channels with consistent messaging and superior service builds lasting relationships.
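The real-time inventory use case above depends on reacting to data changes as they happen. The following minimal sketch shows one way an application might fold change events into a stock view and guard against overselling. The `sku` and `quantity` fields, the in-memory `stock_by_sku` dictionary, and the helper names are hypothetical, and the sample events only imitate the shape of MongoDB change events (`operationType`, `fullDocument`) rather than coming from a live cluster.

```python
from typing import Any, Dict

# A filter one might pass to a change stream so only writes that can
# affect stock levels are delivered.
INVENTORY_PIPELINE = [
    {"$match": {"operationType": {"$in": ["insert", "update", "replace"]}}}
]


def apply_inventory_change(event: Dict[str, Any], stock_by_sku: Dict[str, int]) -> None:
    """Update the in-memory stock view from a single change event."""
    doc = event.get("fullDocument")
    if doc is None:
        # Update events carry the full document only when the stream is
        # opened with full-document lookup enabled; skip events without it.
        return
    stock_by_sku[doc["sku"]] = doc["quantity"]


def can_fulfil(stock_by_sku: Dict[str, int], sku: str, requested: int) -> bool:
    """Return True only if the known stock covers the requested quantity."""
    return stock_by_sku.get(sku, 0) >= requested


# Simulated events standing in for what a change stream would deliver.
sample_events = [
    {"operationType": "insert", "fullDocument": {"sku": "TSHIRT-M", "quantity": 12}},
    {"operationType": "update", "fullDocument": {"sku": "TSHIRT-M", "quantity": 11}},
    {"operationType": "update", "updateDescription": {"updatedFields": {"quantity": 10}}},
]

stock_by_sku: Dict[str, int] = {}
for change in sample_events:
    apply_inventory_change(change, stock_by_sku)
```

In a real deployment with PyMongo, the events would come from something like `collection.watch(INVENTORY_PIPELINE, full_document="updateLookup")`, and the stock view would more likely live in a shared store than in a local dictionary.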
Seamless order orchestration is crucial, providing flexible fulfillment options (delivery, buy online and pick up in store (BOPIS), curbside pickup, direct shipping) and keeping customers informed with real-time updates. Optimizing inventory across stores and warehouses ensures speedy, accurate fulfillment. Along with fulfillment, frictionless returns are vital, offering in-store returns for online purchases, efficient tracking, and immediate refunds.

In the digital space, intelligent search and discovery are essential. Advanced search, image-based search, and AI chatbots simplify product discovery and support, boosting conversion rates and brand engagement. Leading retailers leverage MongoDB Atlas for these capabilities, powering AI recommendations, real-time inventory, and seamless omnichannel customer journeys to improve efficiency and satisfaction.

The future of unified commerce

To remain competitive, retailers should adopt flexible, cloud-based systems. MongoDB Atlas facilitates this transition, enabling unified commerce through real-time data, AI search, and scalable microservices for enhanced customer experiences and innovation. Visit our retail solutions page to learn more about how MongoDB Atlas can accelerate unified commerce.

June 26, 2025

Artificial Intelligence
