Artificial Intelligence

Introducing voyage-3.5 and voyage-3.5-lite: Improved Quality for a New Retrieval Frontier

May 20, 2025

Building an Interactive Manhattan Guide with Chatbot Demo Builder

Question: Know any good spots? Response: Based on the context provided, I can recommend a few good pizza spots in Manhattan: Joe's Pizza (7 Carmine St) - This is described as "one of the most beloved slice joints in the city" with a "thin, crispy crust and always fresh" pizza, perfect for a quick bite. Prince Street Pizza (27 Prince St) - Known for its "thick, square pepperoni slice with those crispy-edged 'roni cups." $1 Pizza Slices - The context mentions that $1 pizza slices can still be found, especially around 6th Ave and 8th St in Greenwich Village. This is described as a "quintessential NYC food experience." That’s from a conversation with a chatbot we built in about 10 minutes. There was no coding, no database setup—just a PDF and the Chatbot Demo Builder tool. What made the experience so valuable wasn’t just getting a good response, but seeing exactly what was happening behind the scenes after submitting our question. 🤖 Meet Chatbot Demo Builder Chatbot Demo Builder is the newest addition to the Atlas Search Playground . This no-code tool gives users an intuitive way to explore MongoDB’s vector search capabilities by building interactive Q&A bots with sample data or their own uploads. Powered by MongoDB Atlas Vector Search and Voyage AI , the Builder automatically handles index creation, query generation, and embedding, removing the common setup barriers that often slow down RAG ( retrieval-augmented generation ) experimentation. What makes the tool particularly powerful is its accessibility. Like all tools in the Atlas Search Playground, it runs entirely in your browser without requiring a MongoDB Atlas account, cluster, or collection. This means you can test ideas, iterate quickly, and share prototypes with teammates and stakeholders, all without spinning up additional infrastructure. With MongoDB.local NYC coming up on September 17, we thought it was the perfect time to put the Chatbot Demo Builder through its paces. 
So we decided to create something practical: a Q&A chatbot to help visitors explore Manhattan. 🧑‍💻 Building in the browser The entire process happened without leaving our browser. We started by uploading our Manhattan travel guide PDF to the Chatbot Demo Builder. Next, we configured how the chatbot would process our content using the Data Settings modal. For our chunking strategy, we chose recursive chunking with 500-token chunks and 50-token overlap. This preserves paragraph flow while ensuring important information isn’t split awkwardly across boundaries. For the embedding model, we selected voyage-3-large, which excels at general knowledge retrieval tasks. Once configured, we hit "Create Embeddings" and watched as the Builder processed our guide into a demo document collection containing metadata, chunked text, and vector embeddings. Figure 1. Data Settings modal used for chunking configuration and embedding model selection. 📍 Testing like tourists With embeddings generated, we started asking questions like curious visitors: "Where can I find a public restroom near Central Park?" "What are some good day trip ideas?" "What are some fun facts about New York City?" Each query highlighted the Builder's most powerful feature: complete transparency. When we asked about pizza, we could see the exact vector search query that ran, which chunks scored highest, and how the LLM prompt was constructed. This visibility turned experimentation from guesswork into informed iteration. We could understand not just what answers we got, but also why we got them and how to improve them. Figure 2. Vector search query and scored document results for the pizza recommendation question. 🧐 Optimizing for better results Fine-tuning our retrieval settings produced even better outputs. The Builder made these optimizations easily accessible and provided insight into exactly how they would affect results. 
We started by modifying numCandidates, which controls how many potential matches the system initially examines before selecting the best results. The recommended setting is at least 20 times higher than the limit value, allowing more potential matches to be evaluated before selecting the best ones; this trades a bit of latency for significantly better recall. For even higher precision, the Builder offers an exact nearest neighbor (ENN) search, which calculates distances to all indexed embeddings. While computationally intensive, it guarantees finding the exact nearest neighbors when accuracy is key. Since our Manhattan guide only had 25 documents, we could afford to use ENN without worrying about performance impact. Figure 3. Retrieval settings panel for adjusting search parameters and enabling exact nearest neighbor search. 💡 Sharing and takeaways Once we were happy with the responses, it was easy to share our work. The Builder generated a snapshot link that let the entire team test the chatbot for themselves without any additional setup. In just a few steps, we transformed a static travel PDF into a conversational guide for exploring Manhattan. Along the way, we saw how decisions about chunking strategies, embeddings, and retrieval settings directly affect answer quality. We also gained visibility into what was happening behind the scenes, giving us the insights we needed to optimize these decisions. Figure 4. Output panel tabs displaying data source, vector index, search query, and prompt details By the end, we had a chatbot capable of providing helpful local insights about Manhattan, from day trip ideas to restaurant recommendations, all while giving us complete visibility into how it generated its answers. 🗽 Beyond the big apple Chatbot Demo Builder makes it easy to explore RAG techniques. 
Whether you're prototyping a customer support bot, building an internal knowledge assistant, or creating an interactive travel guide, Chatbot Demo Builder allows you to gain a clearer understanding of what works best for your AI use case. Ready to get started? Try the Chatbot Demo Builder in the Atlas Search Playground, or check out the official documentation to learn about the other tools available. And if you're joining us in New York for MongoDB.local, consider this your preview of what the city has to offer, courtesy of a chatbot that knows its way around Manhattan.
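For readers who want to reproduce the settings from this walkthrough outside the Builder, here is a minimal Python sketch of the two ideas discussed above: chunks with overlap, and a `$vectorSearch` stage that either sets numCandidates to 20 times the limit or switches to ENN. It rests on several assumptions: whitespace-separated words stand in for tokens, a fixed-size window stands in for the Builder's recursive chunker, and the index name `vector_index` and field name `embedding` are hypothetical.

```python
from typing import Any


def chunk_with_overlap(tokens: list, size: int = 500, overlap: int = 50) -> list[list]:
    """Split tokens into fixed-size windows that share `overlap` tokens.

    A simplified stand-in for recursive chunking: it ignores paragraph
    boundaries and only demonstrates the size/overlap arithmetic.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    if not tokens:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]


def vector_search_stage(
    query_vector: list[float],
    limit: int = 5,
    exact: bool = False,
    index: str = "vector_index",  # hypothetical index name
    path: str = "embedding",      # hypothetical field holding the vectors
) -> dict[str, Any]:
    """Build a $vectorSearch aggregation stage as a plain dict."""
    stage: dict[str, Any] = {
        "index": index,
        "path": path,
        "queryVector": query_vector,
        "limit": limit,
    }
    if exact:
        # ENN: compare against every indexed vector; numCandidates is omitted.
        stage["exact"] = True
    else:
        # ANN: follow the "at least 20x the limit" guidance from the text.
        stage["numCandidates"] = 20 * limit
    return {"$vectorSearch": stage}


chunks = chunk_with_overlap(list(range(1000)))
ann_stage = vector_search_stage([0.1, 0.2, 0.3], limit=5)
enn_stage = vector_search_stage([0.1, 0.2, 0.3], limit=5, exact=True)
```

The stage dicts could be passed as the first element of an aggregation pipeline; with a collection as small as the 25-document guide, the `exact=True` variant is the one that matches what we ran.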

September 3, 2025

Artificial Intelligence

MongoDB and Hope AI: Craft Enterprise Code with AI

The world of software development is constantly evolving, and the demand for tools that streamline processes, increase efficiency, and enable developers to easily create robust applications continues to rise. In this context, MongoDB and Bit.dev have teamed up to bring a transformative integration to the table, combining MongoDB’s leading database platform with Bit Cloud’s AI-powered, component-based development platform, featuring the Hope AI agent. Bit Cloud is the platform that brings powerful AI and composability capabilities directly to developers through Hope AI. Designed to support smarter, faster development, Hope AI enables architecture planning with control, code generation, collaborative management, and production-ready output. This partnership showcases how Bit Cloud, with Hope AI, empowers developers to innovate efficiently without sacrificing control. Let’s explore the features of Hope AI and see how it transforms the development process. Figure 1. The release process. Going from concept to plan: AI-generated code architecture One of Hope AI’s standout features is its ability to create a code architecture based on user input. This isn’t just about diving straight into coding: Hope AI first provides developers with a clear, visual plan for implementation. Think of it as having an AI architect that listens to your idea, understands your goals, and crafts a tailored blueprint for your application. At this stage, Hope AI does not generate any code; this enables developers to focus on shaping the architecture to fit their unique requirements. Whether they need to make tweaks or add entirely new elements, this phase is highly customizable. Developers can approve the initial plan at their own pace, knowing that any changes are an integral part of the process. Generating code for new or existing applications Once the architecture is approved, Hope AI seamlessly transitions into code generation.
This integration is ideal for developers building new applications as well as those enhancing existing ones. For existing applications, adding features is simple—Hope AI can work off the current application and produce code that integrates directly into the existing framework. This versatility positions Hope AI as an innovative tool for projects in all stages of development, helping developers spend less time working around limitations and more time realizing their creative visions. While its current focus is on web application development, Hope AI plans to expand its capabilities to mobile app development in future iterations, making this partnership even more promising for the broader developer community. Gaining full control and ongoing flexibility A major concern with AI-generated code is the possibility of losing control over what’s created. MongoDB and Hope AI address this head-on—developers maintain complete control and can review every line of the AI-generated code. If edits are required, users can make changes directly within the generated code, ensuring the final product aligns precisely with their vision and requirements. Beyond initial creation, Hope AI remains an active participant throughout the development process. Need additional components or features later? The AI is always available to assist, making sure your code evolves alongside your project. Achieving collaboration at the core One of the most exciting features introduced by Hope AI is the ability to “snap” the code. This functionality enables developers to share and manage code with teammates, promoting seamless collaboration. Teams can work together on the project, implement updates, and review progress without any barriers. This collaborative aspect is crucial for modern development teams that thrive on interconnected workflows. Taking a privacy-first approach to code creation Another remarkable feature of Hope AI is its emphasis on privacy. 
The code generated by Hope AI is 100% private—no sharing for AI model training, no public access to your work. The data is protected, and only the designated project members can access the code. In today’s data-sensitive landscape, this level of privacy is critical. Developers and organizations can trust that their intellectual property is secure and that the AI agent isn’t repurposing their unique codebase for other uses. Integrating MongoDB: Configuring MongoDB Atlas credentials MongoDB plays a crucial role in this collaboration. Hope AI is designed to use MongoDB Community Edition by default, offering developers access to MongoDB’s fast, efficient, and reliable database. However, developers and organizations have the flexibility to choose MongoDB Atlas if they prefer a fully managed, customizable solution. Hope AI fully supports integration with MongoDB Atlas, allowing seamless configuration for those who opt for it. MongoDB Atlas opens up possibilities for advanced database management, including automatic scaling, global data replication, and powerful analytics features. With this level of customization, developers can create applications that meet even the most complex infrastructure demands. Promoting the future of AI-assisted development The collaboration between MongoDB and Hope AI signals more than just a technical partnership; it represents a vision for the future of development. By harnessing the power of AI to streamline coding and by empowering developers to maintain control over their projects, this integration creates an environment where creativity meets efficiency. As Hope AI expands into mobile app support and MongoDB continues to innovate on its platform capabilities, this partnership will likely set the stage for groundbreaking developments that appeal to large-scale organizations. 
Choosing MongoDB and Hope AI For IT decision-makers, the MongoDB-Hope AI partnership is a win for teams wanting to build faster without compromising quality or security. Here’s why this collaboration stands out: Rapid prototyping and scaling: The AI-powered architecture design and code generation significantly reduce project timelines while safeguarding customization. Secure code management: Privacy-first code generation is designed to prioritize data protection and support security best practices. Enhanced team collaboration: The Snap functionality ensures teamwork thrives, making Hope AI suitable for distributed teams. On-premises availability: The product extends its flexibility by offering an on-premises deployment option, catering to businesses that require hosting within their own infrastructure. Integrated MongoDB services: MongoDB Atlas credentials provide unparalleled database management flexibility while MongoDB Community Edition caters to developers just getting started. Building the future together The integration of MongoDB and Hope AI empowers developers everywhere. Whether you’re designing from scratch, enhancing an existing app, or scaling your team’s efforts across platforms, this partnership promises the tools and capabilities to bring your ideas to life with unprecedented efficiency and control. MongoDB and Hope AI are building the future, and developers are at the center of this exciting transformation. Ready to unlock the full potential of AI-powered development? Visit the MongoDB AI Learning Hub to learn how to begin building AI applications with MongoDB, and connect with Hope AI today to see how MongoDB and Hope AI can transform your ideas into reality.

September 3, 2025

Artificial Intelligence

Streamlining Editorial Operations with Gen AI and MongoDB

Are you overwhelmed by the sheer volume of information and the constant pressure to produce content that truly resonates? Audiences constantly demand engaging and timely topics. As the daily influx of information grows, it’s becoming increasingly tough to identify what’s interesting and relevant. Consequently, teams are spending more time researching trends, verifying sources, and managing tools than actually creating compelling stories. This is where artificial intelligence enters the media landscape to offer new possibilities. Tapping into AI capabilities calls for a flexible data infrastructure in order to streamline content workflows, provide real-time insights, and help teams stay focused on what matters most. In this blog, we will explore how combining gen AI with modern databases, such as MongoDB, can improve editorial operations. Why are your content ideas running dry? Creative fatigue significantly impacts content production. Content leads face constant pressure to generate fresh ideas under tight deadlines, leading to creative blocks. In fact, according to a recent report from HubSpot, 16% of content marketers struggle to find compelling new content ideas. This pressure often compromises work quality due to time constraints, leaving little room for delivering authentic content. Another main hurdle is identifying credible and trending topics quickly. Finding reliable information means spending a lot of time on research and discovery rather than on actual creation. This leads to missed opportunities in identifying what’s trending and reduces audience engagement as well. This presents a clear opportunity for AI, leveraged with modern databases, to deliver a transformative solution. Using MongoDB to streamline content operations MongoDB provides a flexible, unified storage solution through its collections for modern editorial workflows.
The need for a flexible data infrastructure Developing an AI-driven publishing tool necessitates a system that can ingest, process, and structure a high volume of diverse content from multiple sources. Traditional databases often struggle with this complexity. Such a system demands the ability to ingest data from many sources, dynamically categorize content by industry, and perform advanced AI-enabled searches to scale applications. Combining flexible document-oriented databases with embedding techniques transforms varied content into structured, easily retrievable insights. Figure 1 below illustrates this integrated workflow, from raw data ingestion to semantic retrieval and AI-driven topic suggestions. Figure 1. High-level architectural diagram of the Content Lab solution, showing the flow from the front-end through microservices, backend services, and MongoDB Atlas to AI-driven topic suggestions. Raw data into actionable insights We store a diverse mix of unstructured and semi-structured content in dedicated MongoDB collections such as news, Reddit posts, suggestions, userProfiles, and drafts, organized by topic, vertical (e.g., business, health), and source metadata for efficient retrieval and categorization. These collections are continuously updated from external APIs like NewsAPI and Reddit, alongside AI services (e.g., AWS Bedrock, Anthropic Claude) integrated via backend endpoints. By leveraging embedding models, we transform raw content into organized, meaningful data, stored as vectors under its category (e.g., business, health). MongoDB Atlas Vector Search and the aggregation pipeline enable fast semantic retrieval, allowing users to query abstract ideas or keywords and get back the most relevant, trending topics ranked by a similarity score. Generative AI services then draw upon these results to automate the early stages of content development, suggesting topics and drafting initial articles to substantially reduce creative fatigue.
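To make the retrieval step more concrete, the following Python sketch builds an aggregation pipeline that restricts a vector search to one vertical and returns each topic with its similarity score. It is an illustration under assumptions, not the Content Lab implementation: the index name `topic_index` and the fields `embedding`, `vertical`, `title`, and `url` are hypothetical, and the `vertical` field would need to be declared as a filter field in the vector index for the filter to work.

```python
from typing import Any


def topic_search_pipeline(
    query_vector: list[float],
    vertical: str,
    limit: int = 10,
) -> list[dict[str, Any]]:
    """Return a pipeline: semantic search within one vertical, ranked by score."""
    return [
        {
            "$vectorSearch": {
                "index": "topic_index",     # hypothetical index name
                "path": "embedding",        # hypothetical vector field
                "queryVector": query_vector,
                "numCandidates": 20 * limit,
                "limit": limit,
                # Pre-filter so only documents from the chosen vertical are searched.
                "filter": {"vertical": {"$eq": vertical}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "title": 1,                 # hypothetical field
                "url": 1,                   # hypothetical field (source link)
                # Expose the similarity score that $vectorSearch attaches to each hit.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = topic_search_pipeline([0.12, -0.03, 0.44], vertical="business", limit=5)
```

Results from `$vectorSearch` already come back ordered by score, so no extra `$sort` stage is included here. The pipeline could be passed to a collection's `aggregate` method in a MongoDB driver.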
From a blank page to first draft – With gen AI and MongoDB Once a user chooses a topic, they’re taken to a draft page, as depicted in the third step of Figure 2. Users are then guided by a large language model (LLM)-based writing assistant and supported by Tavily’s search agent, which pulls in additional contextual information. MongoDB continues to handle all associated metadata and draft state, ensuring the user’s entire journey stays connected and fast. Figure 2. Customer flow pipeline & behind-the-scenes. We also maintain a dedicated userProfiles collection, linked to both the drafts and chatbot systems. This enables dynamic personalization so, for example, a Gen Z user receives writing suggestions aligned with their tone and preferences. This level of contextual adaptation improves user engagement and supports editorial consistency. User-generated drafts are stored as new entries in a dedicated drafts collection. This facilitates persistent storage, version control, and later reuse, all of which are essential for editorial workflows. MongoDB’s flexible schema lets us evolve the data model as we add new content types or fields without migrating data. Solving the content credibility challenge Robust data management directly addresses the content credibility challenge. When we generate topic suggestions, we capture and store the source URLs within MongoDB, embedding these links directly into the suggestion cards shown in the UI. This allows users to quickly verify each topic’s origin and reliability. Additionally, by integrating Tavily, we retrieve related contextual information along with its source URLs, further enriching each suggestion. MongoDB’s efficient handling of complex metadata and relational data ensures that editorial teams can consistently and confidently vet content sources, delivering trustworthy, high-quality drafts.
By combining Atlas Vector Search, flexible collections, and real-time queries, MongoDB helps build an end-to-end content system that’s agile, adaptable, and intelligent. The next section shows how this translates into a working editorial experience. From raw ideas to ready stories: Our system in action With our current solution, editorial teams can move quickly from scattered ideas to structured, AI-assisted drafts, all within a smart, connected system. The combination of generative AI, semantic search, and flexible data handling makes the workflow faster, more spontaneous, and less dependent on manual effort. Consequently, teams can refocus on creativity, because discovering relevant topics from verified sources and producing personalized drafts becomes convenient. Adaptability and scalability are essential in developing intelligent systems that produce strong results for content workflows. As editorial demands keep growing, they call for an infrastructure that can ingest diverse data, produce insights, and assist in real-time collaboration. This system illustrates how AI coupled with a flexible, document-oriented backend can help teams reduce fatigue, enhance quality, and accelerate production without adding complexity. It’s not just about automation; it’s about providing a more focused, efficient, and reliable path from idea to publication. Here are a few next steps to help you explore the tools and techniques behind AI-powered editorial systems: Dive Deeper with Atlas Vector Search: Explore our comprehensive tutorial to understand how Atlas Vector Search empowers semantic search and enables real-time insights from your data. Discover Real-World Applications: Learn more about how MongoDB is transforming media operations by reading the AI-Powered Media article.
Check out the MongoDB for Media and Entertainment page to learn more about how we meet the dynamic needs of modern media workflows.

August 26, 2025

Artificial Intelligence

New Benchmark Tests Reveal Key Vector Search Performance Factors

Search at scale is challenging. As powerful as vector search is, it can be tough to know how to properly weigh key factors like accuracy, cost, and throughput for larger workloads. We recently released the MongoDB Benchmark for Atlas Vector Search , which outlines crucial performance optimization strategies for vector search, providing a comprehensive guide to achieving optimal results with large-scale datasets. The primary goal of our guide is to significantly reduce friction for your first vector test at scale (>10M vectors) when evaluating performance for Atlas Vector Search. With this new guide, our aim is to provide more context around how to use the benchmark, to explore the dataset (including factors considered), and to summarize and contextualize the results. Let’s take a closer look! A note on benchmarking data Every good presentation includes the requisite safe harbor slide, and the art and science of benchmarking is no different. Embarking on a large-scale vector workload can present significant hurdles stemming from a lack of accurate information and the inherent friction of initial benchmarks. Furthermore, the landscape of vector search and embedding models is rapidly evolving, and information can become outdated quickly, leading users down inefficient or incorrect paths. Without clear, up-to-date guidance, users can struggle to predict system behavior, optimize configurations, and confidently allocate resources. It’s also worth noting that numerous factors (quantization, dimensionality, filtering, search node configuration, concurrency, sharding, and more) interact in complex ways. Understanding these interactions and their specific impact on a particular workload requires deep, accurate insights. Without this, users might optimize one aspect only to inadvertently degrade another. 
This informational vacuum—coupled with the considerable setup overhead, complex parameter tuning, and the cost of experimentation involved in running the first benchmark—creates a substantial barrier to proving out and scaling a solution. Nonetheless, we feel that these benchmarks provide confidence in POCs for our customers and give them a starting point to work with (as opposed to having no compass to start with). With these factors in mind, let's jump into an overview of the dataset. A look at the dataset The core of this performance analysis revolves around tests conducted on subsets of the Amazon Reviews 2023 dataset, which contained 48M item descriptions across 33 product categories. The dataset was chosen due to the ability to provide a realistic, large-scale e-commerce scenario, as well as offering rich data, including user reviews (ratings, text, helpfulness votes), item metadata (price, images), and detailed item names and descriptions, which are ideal to search over. For the variable dimension tests, subsets of 5.5 million items were used, embedded with voyage-3-large to produce 2048-dimensional vectors. Views were then created to slice these into 1024, 512, and 256-dimensional vectors for testing different dimensionalities. For the large-scale, high-dimensional test, a 15.3 million-item subset—also embedded with 2048-dimensional vectors from voyage-3-large —was used. One of the key takeaways from the report is that at the highest dimensionality (15.3M vectors using voyage-3-large embeddings at 2048 dimensions), Atlas Vector Search with scalar or binary quantization configured retains 90–95% accuracy with less than 50ms of query latency. One item of note is that binary quantization can have higher latency when the number of candidates requested is in the hundreds due to the additional cost of rescoring with full-fidelity vectors, but still might be preferable for many large scale workloads due to cost effectiveness. Figure 1. 
Binary versus scalar quantization performance. Methodology: Benchmarking with the Amazon reviews dataset Now that we’ve talked a little bit about the data itself and the information included, let’s outline some of the key factors that impact performance for Atlas Vector Search, and how we configured our benchmark to test them. It's also important to acknowledge why these variables are critical: Not every customer will be optimizing their search for the same thing. With that in mind, we will also attempt to identify the interplay and trade-offs between them. While this list is not exhaustive (see the full report for more details), let’s review some of the key performance factors: Recall: Recall (a measure of search accuracy) is significantly impacted by quantization and vector dimensionality. The report highlights that while scalar quantization generally starts with higher recall, binary quantization can approach similar accuracy levels by increasing numCandidates, though this often incurs higher latency due to an additional rescoring step. Furthermore, higher-dimensional vectors (1024d and 2048d) consistently maintain better recall, especially with larger datasets and quantization, compared to lower dimensions (256d and 512d), which struggle to exceed 70-80% recall. Sizing and cost: The table in the benchmark details the resources required (RAM, storage) and associated costs for different search node tiers based on three different test cases involving varying dataset sizes, vector dimensions, and quantization methods (scalar or binary). The guide works through a sample dataset, noting that resource requirements scale linearly and that quantization substantially reduces memory requirements. Concurrency and throughput: Throughput is evaluated with multiple requests issued concurrently. Scalar quantization generally achieves higher queries per second (QPS) across various limit values due to less work per query and no rescoring.
Concurrency bottlenecks are often observed, indicating that higher latency can occur. Scaling out the number of search nodes or increasing available vCPUs is recommended to resolve these bottlenecks and achieve higher QPS. Figure 2. Node tiers for different test cases. Optimizing your vector search performance This benchmark report thoroughly examines the performance of MongoDB Atlas Vector Search across various configurations and large datasets, specifically the Amazon Reviews 2023 dataset. It explores the impact of factors such as quantization (scalar and binary), vector dimensionality, filtering, search node configurations, binData compression, concurrency, and sharding on recall, latency, and throughput. While there is never a “silver bullet,” because everyone defines search “success” differently, we wanted to highlight some of the various levers to consider, and methods to get the most out of your own deployment. Our goal is to provide some key considerations for how to evaluate and improve your own vector search performance, and help you to properly weigh and contextualize the key factors. Ready to optimize your vector search experience? Explore the guide in our documentation. Run it yourself with our GitHub repo.
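Two of the factors above, recall and memory sizing, lend themselves to quick back-of-the-envelope arithmetic. The Python sketch below is our own illustration rather than code from the benchmark: it measures recall as the fraction of exact nearest neighbors that an approximate search also returned, and it estimates the raw vector payload assuming 32 bits per dimension for float32, 8 bits for scalar quantization, and 1 bit for binary quantization. It ignores index graph overhead and the full-fidelity vectors kept for rescoring, so the sizing table in the benchmark remains the reference for real deployments.

```python
def recall_at_k(approx_ids: list, exact_ids: list) -> float:
    """Fraction of the exact (ENN) results that the approximate search found."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)


# Assumed storage cost per dimension for each representation.
BITS_PER_DIM = {"float32": 32, "scalar": 8, "binary": 1}


def raw_vector_gib(num_vectors: int, dims: int, kind: str) -> float:
    """Raw vector payload in GiB, excluding index structures and metadata."""
    total_bits = num_vectors * dims * BITS_PER_DIM[kind]
    return total_bits / 8 / 2**30


# The large-scale test case from the text: 15.3M vectors at 2048 dimensions.
sizes = {
    kind: raw_vector_gib(15_300_000, 2048, kind)
    for kind in ("float32", "scalar", "binary")
}

# Three of the four true neighbors were retrieved.
example_recall = recall_at_k(approx_ids=[1, 2, 3, 9], exact_ids=[1, 2, 3, 4])
```

Under these assumptions the payload shrinks by a factor of 4 with scalar quantization and 32 with binary quantization, which is the linear scaling and memory reduction the guide refers to.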

August 21, 2025

Artificial Intelligence

Powering Long-Term Memory for Agents With LangGraph and MongoDB

We're excited to introduce the MongoDB Store for LangGraph—a powerful integration that brings flexible and scalable long-term memory to AI agents. This new integration between MongoDB and LangGraph , LangChain’s open-source agent orchestration framework, allows agents to remember and build on previous interactions across multiple sessions instead of only retaining memory for the current session. The result is more intelligent, context-aware agentic systems that learn and improve over time. This new integration complements MongoDB’s existing checkpointer integration, which handles short-term memory and persistent conversation history. Together, the MongoDB Store for LangGraph and MongoDB’s support for checkpointers provide a complete solution for building production-ready, memory-enabled agents. The need for agent memory An AI agent is a system designed to take actions or make decisions based on input, often using tools and reasoning to complete tasks. By default, agents don’t retain memory between conversations, which severely constrains what they can accomplish. Agent memory (and memory management) is a computational exocortex for AI agents. It is a dynamic, systematic process that integrates an agent’s large language model (LLM) memory (context window and parametric weights) with a persistent memory management system to encode, store, retrieve, and synthesize knowledge and experiences. Agent memory is typically divided into two main types: short-term memory and long-term memory. In a memory context, LangGraph uses “threads” to represent individual conversations or sessions. Short-term memory, managed through thread-scoped checkpointers that MongoDB supports, maintains context within a given session. While this preserves conversation continuity and manages history, it doesn’t help agents learn continuously from the past across different conversations to adapt and optimize their behavior over time. 
This is why we introduced the MongoDB Store for LangGraph, enabling your agents to retain memories across conversations through a cross-thread memory store. Figure 1. Short and long-term memory integration between LangGraph and MongoDB. MongoDB Store: Enabling cross-thread long-term memory The new langgraph-store-mongodb package introduces a MongoDBStore class. Available now through PyPI , this production-ready integration provides: Cross-thread persistence: Store and recall information across different conversation sessions and user interactions, allowing agents to build on previous knowledge. Native JSON structure: LangGraph stores long-term memories as JSON documents, which map directly to MongoDB documents. Each memory is organized using namespaces and a key-value structure. This enables the usage of MongoDB’s native and optimized data formats and search capabilities for efficient retrieval. Vector Search capabilities: Leverage MongoDB Atlas Vector Search for semantic memory retrieval based on meaning, not just keyword matching. Asynchronous support: Support for both synchronous and asynchronous operations for high-performance applications. Automatic connection management: Robust connection pooling and error handling to ensure reliability. Optimized TTL indexes: MongoDB’s Time-to-Live (TTL) indexes are integrated with LangGraph’s TTL system, allowing automatic removal of stale or outdated data. This improves retrieval performance, reduces storage costs, and ensures the system "forgets" obsolete memories efficiently. Ready to give your AI agents persistent long-term memory? The langgraph-store-mongodb package is available now: pip install langgraph-store-mongodb The MongoDB Store for LangGraph enables developers to build more powerful agents for different use cases, including: Customer support agents: Build agents that remember customer preferences, past issues, and resolution patterns across multiple support channels. 
Personal assistant applications: Build agents that learn user habits and preferences to provide increasingly personalized experiences. Enterprise knowledge management: Create agents that accumulate organizational knowledge and can retrieve relevant information semantically. Multi-agent systems: Enable agent teams to share learned experiences and coordinate through persistent memory. Why MongoDB for agent memory? Effective agentic memory requires comprehensive mechanisms for storing, retrieving, updating, and deleting memories. MongoDB Atlas provides a unified database that meets all these complex requirements: Flexible document model: Store complex, nested memories as rich JSON, matching how agents naturally read, organize, and update evolving information. Semantic search: Native vector search enables retrieval by meaning, not just exact matches. State-of-the-art models: Voyage AI provides embedding models and rerankers for cutting-edge memory retrieval. Scalable architecture: Distributed architecture, workload isolation, autoscaling, and automatic sharding capabilities for scaling AI agent memory. Enterprise security: Fine-grained role-based access control (RBAC) allows precise management of both access scope (specific services or databases) and access type (read-only or read-write). MongoDB Atlas and LangChain: A complete solution for AI agent memory Short-term memory provides an agent with immediate context, current conversation state, prior exchanges within that session, or shared memory for coordination in multi-agent systems. The most common form of short-term memory is working memory—an active, temporary context accessible during a session. MongoDB's integration with LangGraph checkpointers supports this by persisting and restoring conversation states. 
Other short-term memory implementations include semantic caches, such as using MongoDB's semantic cache integration with LangChain , which stores recent prompts and LLM responses for retrieval when similar queries occur. Shared memory is also used in multi-agent systems to provide a common space for coordination and information sharing. Long-term memory serves as the agent’s knowledge base, storing diverse kinds of information for future use. It includes several functional types, each requiring specific storage and retrieval strategies: Episodic memory: captures specific events and interactions, such as conversation history or summaries of key occurrences with metadata (e.g., timestamps, participants). For instance, a customer support agent can use this to recall a user’s past issues and offer personalized responses. Procedural memory: records instructions or rules for recurring tasks. A typical implementation is a social content generator agent that remembers past feedback on writing style and formatting to improve its process. Semantic memory: remembers general knowledge, facts, and concepts. This is often implemented through retrieval-augmented generation (RAG), where data is stored as vector embeddings and retrieved based on semantic similarity. Associative memory: stores key entities and relationships between different pieces of information, enabling an agent to identify patterns and make inferences by navigating these connections. It's often implemented using graph structures that support efficient exploration of relationships. One practical approach is GraphRAG . The MongoDB Store for LangGraph supports these memory types through flexible filtering and semantic search, making it a versatile approach for building reliable long-term memory in agents. LangChain also provides LangMem, a toolkit featuring pre-built tools designed specifically for extracting and managing procedural, episodic, and semantic memories. 
LangMem integrates natively with LangGraph, streamlining the memory engineering process. For developers seeking a straightforward approach to using various memory types with MongoDB, explore this comprehensive tutorial for implementing MongoDB alongside LangGraph and LangMem . The future of intelligent agents With the new MongoDB Store for LangGraph, we're enabling developers to build AI agents that can learn and adapt. Agents that remember user preferences, learn from mistakes, and build knowledge over time will transform how we interact with AI systems. The combination of LangGraph's sophisticated orchestration capabilities with MongoDB's flexible, scalable storage creates unprecedented opportunities for building intelligent, persistent AI agents that feel truly alive and responsive. Ready to build memory-enabled agents with LangGraph and MongoDB Atlas? Get started with the documentation .
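As a rough illustration of the namespace and key-value organization described in this post, here is a minimal in-memory sketch. It is not the MongoDBStore class or any LangGraph API: the class ToyMemoryStore, its method names, and the example namespaces are all hypothetical, and the TTL handling only imitates what a MongoDB TTL index would do on the server side.

```python
import time
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple

Namespace = Tuple[str, ...]


@dataclass
class MemoryItem:
    namespace: Namespace
    key: str
    value: Dict[str, Any]
    expires_at: Optional[float] = None  # None means the memory never expires


class ToyMemoryStore:
    """Hypothetical in-memory stand-in for a cross-thread memory store."""

    def __init__(self) -> None:
        self._items: Dict[Tuple[Namespace, str], MemoryItem] = {}

    def put(self, namespace: Namespace, key: str, value: Dict[str, Any],
            ttl_seconds: Optional[float] = None, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        expires = None if ttl_seconds is None else now + ttl_seconds
        self._items[(namespace, key)] = MemoryItem(namespace, key, value, expires)

    def get(self, namespace: Namespace, key: str,
            now: Optional[float] = None) -> Optional[Dict[str, Any]]:
        now = time.time() if now is None else now
        item = self._items.get((namespace, key))
        if item is None:
            return None
        if item.expires_at is not None and item.expires_at <= now:
            # Stale memory: drop it, as a TTL index would eventually do.
            del self._items[(namespace, key)]
            return None
        return item.value

    def search(self, namespace_prefix: Namespace,
               now: Optional[float] = None) -> List[MemoryItem]:
        now = time.time() if now is None else now
        results = []
        for (namespace, _key), item in self._items.items():
            if namespace[:len(namespace_prefix)] != namespace_prefix:
                continue  # belongs to a different user or memory type
            if item.expires_at is not None and item.expires_at <= now:
                continue  # expired, skip without returning it
            results.append(item)
        return results
```

Because memories are keyed by namespace rather than by thread, any conversation that knows the namespace (for example, a hypothetical ("users", "u1", "preferences")) can read what an earlier conversation wrote.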

August 20, 2025

Artificial Intelligence

Building an Agentic AI Fleet Management Solution

Artificial intelligence is revolutionizing the manufacturing and motion industry, with AI-powered solutions now capable of delivering precise, real-time insights that can optimize everything from route planning to predictive maintenance. Modern vehicles can generate an overwhelming amount of data, nearly 25 GB per hour, through a diverse range of sensors, according to an article from S&P Global Mobility. Contextualizing this data with user feedback, maintenance records, and technical knowledge becomes increasingly challenging as the system scales. These complexities can create inefficiencies, introduce overhead while processing data, and drive up operational costs, hindering the full potential of AI-driven systems. An efficient fleet management architecture can address these problems by reducing redundancies, optimizing data retrieval processes, and enabling the seamless integration and use of embeddings. MongoDB’s flexible document model is a natural fit for this approach. Unlike legacy SQL databases, MongoDB excels at managing unstructured, semi-structured, and structured data. This capability allows fleet management software to ingest and process diverse data types, including vehicle signal data, geospatial zones, fleet configurations, query logs, route telemetry, maintenance records, and real-time performance scores. In this post, we will use various MongoDB Atlas features (such as geospatial query operations, time-series collections, Atlas Charts, and aggregation pipelines) to create an agentic AI-powered fleet management system. This system demonstrates how an AI agent can enable intelligent data processing, providing real-time, context-aware responses to user queries in a streamlined manner. 
Fleet management software with AI overview A traditional fleet management system provides features like resource planning, route optimization, and maintenance scheduling, which work together to improve cost management, regulatory compliance, and Overall Equipment Effectiveness (OEE). Our solution harnesses the power of MongoDB's flexible document schema, time-series collections, and geospatial query support to give fleet managers the ability to query, filter, and operate on data effectively. Additionally, an AI Agent assists users in obtaining actionable insights through a chat-driven interface. Figure 1. Architecture of the solution. The AI agent has a chatbot UI. The data captured by the agent is used to trigger an orchestration service, which then calls various tools as required and gets data from MongoDB in order to complete its task. In Figure 1, the telemetry data from our cars is stored in MongoDB in time series collections via microservices. In addition to the telemetry data, we store static car information (e.g., brand, model, year, VIN, among others) and user configurations, such as past queries and fleet settings. All of this data is leveraged by the agentic system to answer user queries and to provide deeper insights when similar queries come up in the future. Figure 2 shows the user interface of the agentic system, where queries can be submitted directly. Filters allow users to narrow results by fleet, time range, or geozone, while the AI Agent delivers answers using real-time and historical data. Figure 2. Demo chat section. When a user inputs a question into the chat box, the AI Agent analyzes it by generating an embedding of the query and searching for similar prior questions in the historical recommendations collection. Depending on the tools required, the system accesses contextual data across collections, such as time-series metrics, geospatial locations, or maintenance logs, through aggregation pipelines. 
Once the relevant data is assembled, the AI synthesizes the information into actionable insights, providing the user with an accurate and informative response. MongoDB features for a fleet management system RAG framework with MongoDB Vector Search Agents powered by retrieval-augmented generation (RAG) are transforming fleet management systems by seamlessly integrating real-time contextual information during response generation. MongoDB’s flexible NoSQL model complements RAG by embedding fast, low-latency document data. Combined with Voyage AI’s cost-efficient embedding model, MongoDB accelerates vector search workflows for smarter decision-making. MongoDB’s Atlas Vector Search empowers the agent to operate proactively by connecting user queries with relevant insights stored in the database. For instance, when a fleet manager asks about the current positions of vehicles, the agent leverages MongoDB’s vector search to match the query against historical recommendations. If similar queries already exist, the agent retrieves pre-existing results instantly, reducing both latency and operational costs. In situations where no matching results are found, the agent complements vector search by invoking LLMs to dynamically generate answers, ensuring fleet managers receive accurate and actionable responses. This streamlined workflow, powered by MongoDB’s unique combination of vector search and flexible data modeling, allows fleet managers to act on real-time, context-aware insights. From analyzing geospatial patterns to addressing systemic vehicle issues, MongoDB enables the agent to simplify complex decision-making while maintaining efficiency. By combining predictive AI capabilities with an optimized, scalable database, this solution transforms fleet management into a more proactive, data-driven process. 
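The reuse-or-generate flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the index name question_vector_index, the fields question_embedding, question, and recommendation, and the 0.9 score threshold are hypothetical and do not come from the demo. The code only builds the aggregation pipeline as plain Python data and decides whether a hit is close enough to reuse; running it against Atlas would require passing the pipeline to a driver's aggregate call.

```python
from typing import Any, Dict, List, Optional


def build_vector_search_pipeline(
    query_vector: List[float],
    index_name: str = "question_vector_index",  # hypothetical index name
    path: str = "question_embedding",           # hypothetical embedding field
    limit: int = 3,
    num_candidates: int = 50,
) -> List[Dict[str, Any]]:
    """Build a $vectorSearch pipeline that looks up similar past questions."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            "$project": {
                "_id": 0,
                "question": 1,
                "recommendation": 1,
                # Expose the similarity score so the caller can apply a threshold.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


def pick_cached_answer(hits: List[Dict[str, Any]], min_score: float = 0.9) -> Optional[str]:
    """Return a stored recommendation if the best hit is similar enough, else None."""
    best = max(hits, key=lambda hit: hit["score"], default=None)
    if best is None or best["score"] < min_score:
        return None  # caller falls back to generating a fresh answer with an LLM
    return best["recommendation"]
```

A None result is the signal to call the LLM and, presumably, to store the new question and answer so that later queries can reuse them.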
Polymorphism MongoDB’s document model allows storing polymorphic data structures within the same collection, meaning documents can vary in structure and embed other documents. This flexibility enables our demo to optimize workflows by storing application-specific metadata tailored to fleet operations. For instance, the historical_recommendations collection stores query and recommendation histories generated by the system’s AI engine, with the ability to embed metadata dynamically, such as the initial question asked, the tool chosen, and the results it returned. This streamlines read operations and gives our AI agent more context for future queries. For example, a document in this collection might appear as follows: Figure 3. Document model of historical_recommendations. This variability in structure without sacrificing efficiency enables MongoDB to adapt to dynamic data storage requirements inherent in polymorphic workflows. By embedding detailed context and avoiding null values, the system can streamline read operations and provide richer context to the AI agent for future queries. Time series collections MongoDB's time series collections simplify working with time series data. These specialized collections provide several benefits, including automatic creation of compound indexes for faster data retrieval, reduced disk usage, and lower I/O overhead for read operations. This makes time series collections highly efficient for managing time-stamped data, such as a constant stream of sensor data from vehicles in our application. With these capabilities, fleet managers can enable near real-time access to data, empowering AI agents to rapidly extract actionable insights for fleet management. In this demo, MongoDB optimizes query efficiency in our time series collections using its bucketing mechanism. 
This mechanism groups multiple data points within the same time range into compressed blocks, reducing the number of documents scanned during queries. This in turn minimizes read operations and disk usage, enabling faster range queries and ensuring sustained, optimized cluster performance, even under a humongous load. Geospatial queries MongoDB’s native support for geospatial queries enables seamless integration of robust location-based functionalities. The ability to handle complex geographic data is a powerful tool for industries relying on real-time location-based decision-making. In our demo, this capability is leveraged to locate vehicles under various conditions, such as identifying vehicles near or inside a specified geofence, while being able to filter by maximum or minimum distance. Also, geospatial queries can be incorporated directly into aggregation pipelines, enhancing AI-driven workflows powered by our AI Agent. Key takeaways MongoDB enables fleet managers to efficiently gather, process, and analyze data to uncover actionable insights. These capabilities empower managers to optimize operations, enhance vehicle oversight, and implement smarter, data-driven strategies that drive efficiency and performance. Visit MongoDB Atlas to start modernizing your fleet management system. Ready to transform your fleet management operations? Unlock real-time insights, optimize systems, and make smarter decisions with MongoDB’s advanced features. If you're interested in exploring how MongoDB enables intelligent fleet management, check out our Leafy Fleet GitHub repository. Access the Leafy Fleet on GitHub. Additionally, dive deeper into best practices for modeling connected vehicle signal data and learn how MongoDB’s flexible data model simplifies telemetry management at scale. Read the blog post.
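To make the geospatial discussion in this post more concrete, here is a minimal sketch of how a "vehicles near a point, within a distance range" query could be expressed as an aggregation pipeline. The field names location and fleet_id and the output field distance_m are assumptions for illustration, not taken from the demo. The sketch also assumes vehicle documents hold a GeoJSON point with a 2dsphere index, in which case distances are expressed in meters.

```python
from typing import Any, Dict, List, Optional


def geo_near_stage(
    lng: float,
    lat: float,
    max_m: Optional[float] = None,
    min_m: Optional[float] = None,
    extra_filter: Optional[Dict[str, Any]] = None,
    key: str = "location",  # hypothetical name of the indexed GeoJSON field
) -> Dict[str, Any]:
    """Build a $geoNear stage around a GeoJSON point (longitude first)."""
    if not (-180 <= lng <= 180 and -90 <= lat <= 90):
        raise ValueError("coordinates out of range; expected (longitude, latitude)")
    stage: Dict[str, Any] = {
        "near": {"type": "Point", "coordinates": [lng, lat]},
        "distanceField": "distance_m",  # computed distance is written to this field
        "spherical": True,
        "key": key,
    }
    if max_m is not None:
        stage["maxDistance"] = max_m
    if min_m is not None:
        stage["minDistance"] = min_m
    if extra_filter:
        stage["query"] = extra_filter  # ordinary match conditions applied alongside
    return {"$geoNear": stage}


def vehicles_near_pipeline(
    lng: float,
    lat: float,
    max_m: float,
    fleet_id: Optional[str] = None,
    limit: int = 10,
) -> List[Dict[str, Any]]:
    """Pipeline returning the closest vehicles; $geoNear has to be the first stage."""
    extra = {"fleet_id": fleet_id} if fleet_id is not None else None
    return [geo_near_stage(lng, lat, max_m=max_m, extra_filter=extra), {"$limit": limit}]
```

Because the result is an ordinary pipeline, further stages (grouping by fleet, joining maintenance records, and so on) can be appended after the $geoNear stage.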

August 19, 2025

Artificial Intelligence

Unlock Multi-Agent AI Predictive Maintenance with MongoDB

The manufacturing sector is navigating a growing number of challenges: evolving customer demands, intricate software-mechanical product integrations, just-in-time global supply chains, and a shrinking skilled labor force. Meanwhile, the entire sector is working under intense pressure to improve productivity, manage energy consumption, and keep costs in check. To stay competitive, the industry is undergoing a digital transformation—and data is at the center of that shift. Data-driven manufacturing offers a powerful answer to many of these challenges. On the shop floor, one of the most critical and high-impact applications of these strategies is predictive maintenance. Downtime isn’t just inconvenient—it’s expensive. For example, every unproductive hour in the automotive sector now costs $2.3 million (according to Siemens "The True Cost of Downtime 2024" report). For manufacturers across all sectors, predictive maintenance is no longer optional. It’s a foundational pillar of operational excellence. At its core, predictive maintenance is about using data to anticipate machine failures before they happen. It began with traditional statistical models, evolved with machine learning, and is now entering a new era. As equipment ages and failure behaviors shift, models must adapt. This has led to the adoption of more advanced approaches, including generative AI with retrieval-augmented generation (RAG) capabilities. But the next frontier is multi-agent systems—AI-powered agents working together to monitor, reason, and act. We’ve explored how generative AI powers predictive maintenance in previous posts. In this blog post, we’ll go deeper into multi-agent systems and how MongoDB makes it easy to build and scale them for smart, responsive maintenance strategies. Advance your data-driven manufacturing strategy with Agentic AI AI agents combine large language models (LLMs) with tools, memory, and logic to autonomously handle complex tasks. 
On the shop floor, this means agents can automate inspections, reoptimize production schedules, assist with fault diagnostics, and more. According to a LangChain survey, 78% of companies are actively developing AI agents, and over half already have at least one agent in production. Manufacturing companies can especially benefit from agentic capabilities across a great variety of practical use cases, as shown in Figure 1. Figure 1. Agent capabilities and related practical use cases in manufacturing. But leveraging AI agents in industrial environments presents unique challenges. Integration with industrial protocols like Modbus or PROFINET is complex. Governance and security requirements are strict, especially when agents interact with production equipment. Latency is also a concern, as AI models need fast, reliable data access to support real-time responses. And with agents generating and consuming large volumes of data, companies need a data foundation that is reliable and can scale without sacrificing performance. Many of these challenges are not new to manufacturers, and MongoDB has a proven track record of addressing them. Industry leaders in manufacturing and automotive trust MongoDB to power critical IoT and telemetry use cases. Bosch, for example, uses MongoDB to store, manage, and analyze huge amounts of data to power its Bosch IoT Insights solution. MongoDB’s flexible document model is ideal for diverse sensor inputs and machine telemetry, while allowing systems to iterate and evolve quickly. It’s important to remember that, at its core, MongoDB was built for change, so when it comes to integrating AI on the shop floor, it’s no surprise that MongoDB is emerging as the ideal data layer foundation. Companies like Novo Nordisk and Cisco rely on MongoDB to build and scale their AI capabilities, and leading platforms like XMPro APEX AI leverage MongoDB Atlas to create and manage advanced AI agents for industrial applications. 
MongoDB Atlas makes it easy to build AI Agents and operate them at scale. As both a vector and a document database, Atlas supports various search methods for agentic RAG, while also enabling agents to store short and long-term memory in the same database. The result is a unified data layer that bridges industrial IoT and agentic AI. Predictive maintenance is a perfect example of how these capabilities come together to drive real impact on the shop floor. In the next section, we’ll walk through a practical blueprint for building a multi-agent predictive maintenance system using MongoDB Atlas. Building a multi-agent predictive maintenance system This solution demonstrates how to build a multi-agent predictive maintenance system using MongoDB Atlas, LangGraph, and Amazon Bedrock. This system can streamline complex processes, such as detecting equipment anomalies, diagnosing root causes, generating work orders, and scheduling maintenance. At a high level, this solution leverages MongoDB Atlas as the unified data layer. LangGraph provides the orchestration layer, enabling graph-based coordination among agents, while Amazon Bedrock powers the underlying foundational models used by the agents to reason and make decisions. The architecture follows a supervisor-agent pattern. The supervisor coordinates tasks and delegates to three specialized agents: Failure agent , which performs root cause analysis and generates incident reports. Work order agent , which drafts maintenance work orders with detailed requirements. Planning agent , which identifies the optimal time slot for the maintenance task based on availability and production constraints. Figure 2. High-level architecture of a multi-agent predictive maintenance system. This modular design enables the system to scale easily and adapt to different operational needs. Let’s walk through the full process in four key steps. 
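Before the step-by-step walkthrough, the supervisor-agent pattern described above can be sketched in plain Python. This is a simplified stand-in, not the LangGraph or Amazon Bedrock implementation from the solution: the agents here are ordinary functions with placeholder rules instead of LLM calls, and every name (WorkflowState, the alert fields, the 80 °C threshold) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class WorkflowState:
    """Shared state passed between the supervisor and its agents."""
    alert: dict
    available_slots: List[str] = field(default_factory=list)
    incident_report: Optional[dict] = None
    work_order: Optional[dict] = None
    scheduled_slot: Optional[str] = None
    history: List[str] = field(default_factory=list)


def failure_agent(state: WorkflowState) -> None:
    # Placeholder rule standing in for LLM-driven root cause analysis.
    hot = state.alert.get("temperature_c", 0) > 80
    state.incident_report = {
        "machine_id": state.alert["machine_id"],
        "root_cause": "overheating" if hot else "unknown",
    }


def work_order_agent(state: WorkflowState) -> None:
    report = state.incident_report
    state.work_order = {
        "machine_id": report["machine_id"],
        "task": f"Inspect and repair: {report['root_cause']}",
        "approved": False,  # left for a human reviewer to flip
    }


def planning_agent(state: WorkflowState) -> None:
    if not state.available_slots:
        raise ValueError("no maintenance slots available")
    # ISO-8601 timestamps sort chronologically as strings, so min() is the earliest.
    state.scheduled_slot = min(state.available_slots)


AGENTS: Dict[str, Callable[[WorkflowState], None]] = {
    "failure": failure_agent,
    "work_order": work_order_agent,
    "planning": planning_agent,
}


def next_agent(state: WorkflowState) -> Optional[str]:
    """Supervisor routing: pick the agent whose output is still missing."""
    if state.incident_report is None:
        return "failure"
    if state.work_order is None:
        return "work_order"
    if state.scheduled_slot is None:
        return "planning"
    return None


def run_supervisor(state: WorkflowState) -> WorkflowState:
    while (name := next_agent(state)) is not None:
        AGENTS[name](state)
        state.history.append(name)
    return state
```

Keeping the routing decision in one function (next_agent) is what makes the design modular: adding, say, a compliance-reporting agent would mean registering one more function and one more routing condition.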
Step 1: Failure prediction kicks off the agentic workflow The process begins with an alert—something unusual in the machine data or logs that could point to a potential failure. MongoDB provides a unified view of operational data, real-time processing capabilities, and seamless compatibility with machine learning tools. Sensor data is processed in real-time using Atlas Stream Processing integrated with ML inference models. Features like native support for Time Series data and Online Archive facilitate managing telemetry data at scale efficiently. All while the downstream applications remain up to date with the latest notifications and dashboards by using Atlas Triggers , Change Streams , and Atlas Charts . From there, the supervisor agent takes over and coordinates the next steps. Figure 3. End-to-end failure prediction process that generates the alerts. Step 2: Leverage your data for root cause analysis The supervisor notifies the Failure Agent about the alert. Manual diagnostics of a machine can take hours—sifting through manuals, historical logs, and environmental data. The AI agent automates this process. It collects relevant documents, retrieves contextual insights using Atlas vector search, and analyzes environmental conditions stored in the database—like temperature or humidity at the time of failure. With this data, the agent performs a root cause analysis and proposes corrective actions. It generates a concise incident report and shares it with the supervisor agent, which then moves the workflow forward. Figure 4. Failure Agent performing root cause analysis. Step 3: Work order process automation The Work Order Agent receives the incident report and drafts a comprehensive maintenance work order. It pulls from previous similar tasks to estimate time requirements, identify the necessary materials, and ensure the right skill sets are listed. All of this is pre-filled into a standardized work order template and saved back into MongoDB Atlas. 
This step also includes a human-in-the-loop checkpoint. Technicians or supervisors can review and modify the draft before it is finalized. Figure 5. Work Order Agent generating a draft work order and routing it for human validation. Step 4: Finding the optimal maintenance schedule Once the work order is approved, the Planning Agent steps in. Its task is to schedule the maintenance activity without disrupting production. The agent queries the production calendar, checks staff shift schedules, and verifies inventory availability for required materials. It considers alert severity and rescheduling constraints to find the most efficient time slot. Once the optimal window is identified, the agent sends the updated plan to the scheduling system. Figure 6. Planning Agent evaluating constraints to identify the optimal maintenance schedule. While we focused on a predictive maintenance workflow, this architecture can be easily extended. Need agents for compliance reporting, spare parts procurement, or shift planning? No problem. With the right foundation, the possibilities are endless. Unlocking manufacturing excellence with Agentic AI Agentic AI represents a new chapter in the evolution of predictive maintenance, enabling manufacturers to move from reactive responses to intelligent, autonomous decision-making. By combining AI agents with real-time telemetry and a unified data foundation, teams can reduce downtime, cut maintenance costs, and boost equipment reliability. But to work at scale, these systems need flexible, high-performance infrastructure. With native support for time series data, vector search, stream processing, and more, MongoDB makes it easier to build, operate, and evolve multi-agent solutions in complex industrial environments. The result is smarter operations, greater resilience, and a clear path to manufacturing excellence. Clone the GitHub repository if you are interested in trying out this solution yourself. 
To learn more about MongoDB’s role in the manufacturing industry, please visit our manufacturing and automotive webpage .

August 18, 2025

Artificial Intelligence

How Tavily Uses MongoDB to Enhance Agentic Workflows

As AI agents grow in popularity and are used in increasingly mission-critical ways, preventing hallucinations and giving agents up-to-date context is more important than ever. Context can come from many sources—prompts, documents, proprietary internal databases, and the internet itself. Among these sources, the internet stands out as uniquely valuable, a best-in-class resource for humans and LLMs alike due to its massive scale and constant updates. But how can large language models (LLMs) access the latest and greatest information from the internet? Enter Tavily , one of the companies at the heart of this effort. Tavily provides an easy way to connect the web to LLMs, giving them the answers and context they need to be even more useful. MongoDB had the opportunity to sit down with Rotem Weiss, CEO of Tavily, and Eyal Ben Barouch, Tavily’s Head of Data and AI, to talk about the company’s history, how Tavily uses MongoDB, and the future of agentic workflows. Tavily’s origins Tavily began in 2023 with a simple but powerful idea. "We started with an open source project called GPT Researcher ," Weiss said. "It did something pretty simple—go to the web, do some research, get content, and write a report." That simplicity struck a chord. The project exploded, getting over 20,000 GitHub stars in under two years, signaling to the team that they had tapped into something developers desperately needed. The viral success revealed a fundamental gap in how AI systems access information. "So many use cases today require real-time search, whether it's from the web or from your users," Weiss noted. "And that is basically RAG (retrieval-augmented generation) ." "Developers are slowly realizing not everything is semantic, and that vector search alone cannot be the only solution for RAG," Weiss said. Indeed, for certain use cases, vector stores benefit from further context. 
This insight, buttressed by breakthrough research around CRAG (Corrective RAG) , pointed toward a future where systems automatically turn to the web to search when they lack sufficient information. Solving the real-time knowledge problem Consider the gap between static training data and our dynamic reality. Questions like "What is the weather today?" or "What was the score of the game last night?" require an injection of real-time information to accurately answer. Tavily's system fills this gap by providing AI agents with fresh, accurate data from the web, exactly when they need it. The challenge Tavily addresses goes beyond information retrieval. “Even if your model ‘knows’ the answer, it still needs to be sent in the right direction with grounded results—using Tavily makes your answers more robust,” Weiss explained. The new internet graph Weiss envisions a fundamental shift in how we think about the architecture of the web. "If you think about the new internet, it’s a fundamentally different thing. The internet used to be between people—you would send emails, you would search websites, etc. Now we have new players, the AI agents, who act as new nodes on the internet graph." These new nodes change everything. As they improve, AI agents can perform many of the same actions as humans, but with different needs and expectations. "Agents want different things than people want," Weiss explained. "They want answers; they don't need fancy UIs and a regular browser experience. They need a quick, scalable system to give them answers in real time. That's what Tavily gives you." The company's focus remains deliberately narrow and deep. "We always want to stick to the infrastructure layer compared to our competitors, since you don't know where the industry is going," Weiss said. "If we focus on optimizing the latency, the accuracy, the scalability, that's what is going to win, and that's what we're focused on." Figure 1. 
The road to insightful responses for users with TavilyHybridClient. MongoDB: The foundation for speed and scale To build their infrastructure, Tavily needed a database that could meet their ambitious performance requirements. For Weiss, the choice was both practical and personal. "MongoDB is the first database I ever used as a professional in my previous company," he said. "That's how I started, and I fell in love with MongoDB. It's amazing how flexible it is–it's so easy to implement everything." The document model, the foundation upon which MongoDB is built, allowed Tavily to build and scale an enterprise-grade solution quickly. But familiarity alone didn't drive the decision. MongoDB Atlas had the performance characteristics Tavily required. "Latency is one of the things that we always optimize for, and MongoDB delivers excellent price performance," Tavily’s Ben Barouch explained. "The performance is much more similar to a hot cache than a cold cache. It's almost like it's in memory!" The managed service aspect proved equally crucial. "MongoDB Atlas also saves a lot of engineering time," Weiss noted. In a fast-moving startup environment, MongoDB Atlas enabled Weiss to focus on building Tavily and not worry about the underlying data infrastructure. "Today, companies need to move extremely fast, and at very lean startups, you need to only focus on what you are building. MongoDB allows Tavily to focus on what matters most, our customers and our business." Three pillars of success The Tavily team highlighted three specific MongoDB Atlas characteristics that have become essential to their operations: Vector search : Perhaps most importantly for the AI era, MongoDB's vector search capabilities allow it to be "the memory for agents." As Weiss put it, "The only place where a company can have an edge is their proprietary data. Every company can access the best models, every company can search the web, every company can have good agent orchestration. 
The only differentiation is utilizing your internal, proprietary data and injecting it in the fastest and most efficient way to the prompt." MongoDB, first with Atlas Vector Search and now with Hybrid Search , has effective ways of giving agents performant context, setting them apart from those built with other technologies. Autoscaling : "Our system is built for a very fast-moving company, and we need to scale in a second," Weiss continued. "We don't need to waste time each week making changes that are done automatically by MongoDB Atlas." Monitoring : "We have other systems where we need to do our own monitoring with other cloud providers, and it's a lot of work that MongoDB Atlas takes care of for us," Weiss explained. "MongoDB has great visibility." Betting on proven innovation Tavily has been impressed with the way MongoDB has kept a finger on the pulse of the evolving AI landscape and added features accordingly. “I believed that MongoDB would be up to date quickly, and I was right," Weiss said. "MongoDB quickly thought about vector search, about other features that I needed, and got them in the product. Not having to bolt-on a separate vector database and having those capabilities natively in Atlas is a game changer for us." Ben Barouch emphasized the strategic value of MongoDB’s entire ecosystem, including the community built around the database: "When everyone's offering the same solutions, they become the baseline, and then the things that MongoDB excels at, things like reliability and scalability, are really amplified. The community, especially, is great; MongoDB has excellent developer relations, so learning and using MongoDB is very easy." The partnership between MongoDB and Tavily extends beyond technology to trust. "In this crazy market, where you have new tools every two hours and things are constantly changing, you want to make sure that you're choosing companies you trust to handle things correctly and fast," Weiss said. 
"I want a vendor where if I have feedback, I'm not afraid to say it, and they will listen." Looking ahead: The multi-agent future As Tavily continues building the infrastructure for AI agents to search the web, Weiss sees the next evolution already taking shape. "The future is going to be thinking about combining these one, two, three, four agents into a workflow that makes sense for specific use cases and specific companies. That will be the new developer experience." This vision of orchestrated AI workflows represents just the beginning. With MongoDB Atlas providing the scalable, reliable foundation they need, Tavily is positioning itself at the center of a fundamental shift in how information flows through our digital world. The internet welcomed people first, then connected them in revolutionary ways. Now, as AI agents join the network, companies like Tavily are building the infrastructure to ensure this next chapter of digital evolution is both powerful and accessible. With MongoDB as their foundation, they're not just adapting to the future—they're building it. Interested in building with MongoDB Atlas yourself? Try it today ! Use Tavily for working memory in this MongoDB tutorial . Explore Tavily’s Crawl to RAG example.

August 5, 2025

Artificial Intelligence

Automotive Document Intelligence with MongoDB Atlas Search

Picture two scenarios happening simultaneously across the automotive industry: In a service bay, a technician searches frantically through multiple systems for the correct procedure to address an unfamiliar warning code. They need safety warnings, torque specifications, and part numbers—immediately. Instead, they’re lost in hundreds of PDF pages, risking safety violations and extending repair times. Meanwhile, a customer sits at home, trying to understand a dashboard warning light. They search their owner’s manual PDF, scroll through forums, and eventually call the dealership—waiting on hold just to ask a simple question about whether they can drive safely to their appointment. Both scenarios represent massive inefficiencies in how automotive documentation is stored, accessed, and delivered. With technician shortages costing shops over $60,000 monthly per unfilled position , and 67% of customers preferring self-service options , the industry faces a critical gap between information availability and accessibility. We prototyped a solution that shows how you can transform static automotive manuals into intelligent, searchable knowledge bases using MongoDB Atlas . By combining flexible document storage with semantic search capabilities, you can create platforms that serve both technicians seeking repair procedures and customers looking for quick answers. Building intelligent documentation systems Automotive technical documentation presents unique challenges. Most existing systems have fixed, unchangeable data formats designed primarily for compliance rather than usability. These systems often vary across locations, lack integration with user profiles, and don’t support rapid data access. Organizations need to build custom ingestion pipelines that can process diverse documentation formats and create intelligent, searchable content. Success requires linking each interaction to user identity and storing information that supports immediate, personalized engagement. 
MongoDB’s flexible document model enables developers to create highly enriched documentation chunks that go far beyond simple text storage. Each document can contain the original content alongside extensive metadata, including source references, safety classifications, procedural hierarchies, user permissions, version control, and contextual relationships. As your organizational needs evolve, you can add new fields and metadata structures without schema migrations or downtime, enabling documentation systems to adapt to changing business needs. An alternative—or complementary—approach is using contextualized chunk embedding models like voyage-context-3 . Instead of relying on manual metadata or context augmentation, this model generates vector embeddings that inherently capture full-document context for each chunk. It leads to higher retrieval accuracy, reduces sensitivity to chunking strategy, and simplifies the pipeline with no downstream changes. Whether you choose a metadata-rich approach, an embedding-first strategy, or both, MongoDB supports it all. Figure 1. Document processing pipeline. This flexibility proves essential when organizations have multiple documentation sources in different formats. Custom processing pipelines can normalize content from various systems while preserving the unique metadata and relationships that make each source valuable. MongoDB’s document structure naturally accommodates this complexity, storing structured technical specifications alongside unstructured procedural text and user interaction history—all queryable through a single interface. Using a unified search that understands context MongoDB Atlas provides three complementary search capabilities that work together to deliver intelligent responses: MongoDB Atlas Search handles precise queries like part numbers and error codes. 
Technicians searching for a specific part number instantly find relevant diagnostic procedures, while customers typing “coolant warning light” get clear explanations. MongoDB Atlas Vector Search understands intent and context. A customer asking “Why is my engine making a clicking noise?” finds relevant content even without using technical terminology. This approach matches queries to automotive diagnostic information by meaning rather than by exact keywords. Hybrid search with $rankFusion combines both approaches, ensuring users find information whether they use technical terms or natural language:

    {
      $rankFusion: {
        input: {
          pipelines: {
            textSearch: { $search: ... },
            vectorSearch: { $vectorSearch: ... }
          }
        },
        combination: {
          weights: { textSearch: 1, vectorSearch: 1 }
        }
      }
    }

Setting up scalable architecture for dual-purpose knowledge delivery

The same MongoDB knowledge base serves both technicians and customers through tailored interfaces. Technicians access detailed procedures with safety warnings, technical specifications, and shop management system integration, while customers receive plain-language explanations, severity assessments, and service scheduling integration. Figure 2. MongoDB Atlas servicing both the technician interface and the customer portal. Custom-built processing pipelines can transform thousands of manual pages across multiple languages. MongoDB Atlas deployments can handle billions of documents while maintaining subsecond query performance. MongoDB Atlas Search and MongoDB Atlas Vector Search work together across this rich metadata, ensuring that whether users search for an error code or ask “Why won’t my car start?”, the system uses all available context to return relevant results quickly.
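As a rough illustration of the hybrid search described above, the following Python sketch assembles a $rankFusion aggregation pipeline as plain dictionaries. The index names (manual_text_index, manual_vector_index), the field names (content, embedding), and the candidate limits are hypothetical placeholders, not values from the prototype. The sketch also assumes that each named input pipeline is supplied as a list of stages.

```python
from typing import Any


def build_hybrid_pipeline(
    query_text: str,
    query_vector: list[float],
    text_weight: float = 1.0,
    vector_weight: float = 1.0,
    limit: int = 5,
) -> list[dict[str, Any]]:
    """Build a $rankFusion aggregation pipeline as plain Python dicts."""
    # Keyword branch: exact terms such as part numbers and error codes.
    text_pipeline = [
        {
            "$search": {
                "index": "manual_text_index",  # hypothetical index name
                "text": {"query": query_text, "path": "content"},  # hypothetical field
            }
        },
        {"$limit": 20},
    ]
    # Semantic branch: nearest neighbours of the query embedding.
    vector_pipeline = [
        {
            "$vectorSearch": {
                "index": "manual_vector_index",  # hypothetical index name
                "path": "embedding",  # hypothetical field
                "queryVector": query_vector,
                "numCandidates": 100,
                "limit": 20,
            }
        }
    ]
    return [
        {
            "$rankFusion": {
                "input": {
                    "pipelines": {
                        "textSearch": text_pipeline,
                        "vectorSearch": vector_pipeline,
                    }
                },
                "combination": {
                    "weights": {
                        "textSearch": text_weight,
                        "vectorSearch": vector_weight,
                    }
                },
            }
        },
        {"$limit": limit},
    ]


if __name__ == "__main__":
    # A toy three-dimensional vector stands in for a real query embedding.
    pipeline = build_hybrid_pipeline("coolant warning light", [0.1, 0.2, 0.3])
    print(pipeline[0]["$rankFusion"]["combination"])
```

The resulting list could be passed to a collection's aggregate call in a driver. Raising text_weight favors exact matches such as part numbers, while raising vector_weight favors natural-language questions.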
Having a real-world impact When organizations replace static manuals with an AI-ready documentation platform, the upside reveals itself almost immediately: Customers find answers faster and adopt apps more readily, technicians spend less time hunting for information and more time generating revenue, and compliance teams rest easier knowing that critical warnings and audit trails live right inside every workflow. Iron Mountain’s new InSight Digital Experience Platform (DXP) , built on MongoDB Atlas and MongoDB Atlas Vector Search, is a great example of these benefits in action. By turning mountains of unstructured physical and digital content into searchable, structured data, Iron Mountain gives its customers powerful semantic search, context-aware recommendations, and AI-driven workflow automation—all while meeting strict regulatory requirements. Whether a user is looking for the latest repair bulletin, a decades-old loan document, or a region-specific compliance record, InSight DXP surfaces the right information instantly and tailors the guidance to each user’s expertise level. Transform your technical documentation today The automotive industry faces a clear inflection point. With McKinsey projecting $80 billion in automotive software market value by 2030 and technician shortages reaching crisis levels, organizations that modernize their documentation systems from a cost center into a competitive advantage will capture disproportionate value. Ready to revolutionize how your organization manages technical knowledge? Explore our automotive solutions and get started with MongoDB Atlas Vector Search today . Visit the MongoDB AI Learning Hub to learn more about building AI applications with MongoDB.

August 4, 2025

Artificial Intelligence

Fine-tune MongoDB Deployments with AppMap’s AI Tools and Diagrams

In a rapidly changing landscape, organizations that adapt for growth, efficiency, and competitiveness will be best positioned to succeed. Central to this effort is the continuous fine-tuning and troubleshooting of existing deployments, enabling companies to deliver high-performance applications that meet their business requirements. Yet, navigating application components often leads to long development cycles and high costs. Developers spend valuable time deciphering various programming languages, frameworks, and infrastructures to optimize their systems. They may have to work with complicated, intertwined code, which makes updates difficult. Moreover, older architectures increase information overload with no institutional memory to understand current workloads. To help organizations overcome these challenges, AppMap partnered with MongoDB Atlas to fine-tune MongoDB deployments and achieve optimal performance, enabling developers to build more modern and efficient applications. The AppMap solution empowers developers with AI-driven insights and interactive diagrams that clarify application behavior, decode complex application architectures, and streamline troubleshooting. This integration delivers personalized recommendations for query optimization, proper indexing, and better database interactions. Complementing these capabilities, MongoDB Atlas offers the flexibility, performance, and security essential for building resilient applications and advancing AI-powered experiences. AppMap’s technology stack Founded in 2020 by CEO Elizabeth Lawler, AppMap empowers developers to visualize, understand, and optimize application behavior. By analyzing applications in action, AppMap delivers precise insights into interactions and performance dynamics, recording APIs, functions, and service behaviors. This information is then presented as interactive diagrams, as shown in Figure 1, which can be easily searched and navigated to streamline the development process. Figure 1. 
Interactive diagram for a MongoDB query. As shown below, AppMap also features Navie, an AI assistant. Navie offers customers advanced code architecture analysis and customized recommendations, derived from capturing application behavior at runtime. This rich data empowers Navie to deliver smarter suggestions, assisting teams in debugging complex issues, asking contextual questions about unfamiliar code, and making more informed code changes. Figure 2. The AppMap Navie AI assistant. With these tools, AppMap improves the quality of the code running with MongoDB, helping developers better understand the flow of their apps. Using AppMap in a MongoDB application Imagine that your team has developed a new e-commerce application running on MongoDB. But you're unfamiliar with how this application operates, so you'd like to gain insights into its behavior. In this scenario, you decide to analyze your application using AppMap by executing the node package with your standard run command. npx appmap-node npm run dev With this command, you use your application just like you normally would. But now every time your app communicates through an API, it will create records. These records are used to create diagrams that help you see and understand how your application works. You can look at these diagrams to get more insights into your app's behavior and how it interacts with the MongoDB database. Figure 3. Interaction diagram for an e-commerce application. Next, you can use the Navie AI assistant to receive tailored insights and suggestions for your application. For instance, you can ask Navie to identify the MongoDB commands your application uses and to provide advice on optimizing query performance. Navie will identify the workflow of your application and may propose strategies to refine database queries, such as reindexing for improved efficiency or adjusting aggregation framework parameters. Figure 4. Insights provided by the Navie AI assistant. 
With this framework established, you can seamlessly interact with your MongoDB application, gain insights into its usage, enhance its performance, and achieve quicker time to market. Enhancing MongoDB apps with AppMap Troubleshooting and optimizing your MongoDB applications can be challenging, due to the complexity of related microservices that run your services. AppMap facilitates this process by providing in-depth insights into your application behavior with an AI-powered assistant, helping developers better understand your code. With faster root cause analysis and deeper code understanding, businesses can boost developer productivity, improve application performance, and enhance customer satisfaction. These benefits ultimately lead to greater agility and a stronger competitive position in the market. Enhance your development experience with MongoDB Atlas and AppMap . To learn more about how to fine-tune apps with MongoDB, check out the best practices guide for MongoDB performance and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving partner ecosystem.

July 30, 2025

Artificial Intelligence

Introducing voyage-context-3: Focused Chunk-Level Details with Global Document Context

Note to readers: voyage-context-3 is currently available through the Voyage AI API directly. For access, sign up for Voyage AI. TL;DR: We’re excited to introduce voyage-context-3, a contextualized chunk embedding model that produces chunk vectors capturing the full document context without any manual metadata or context augmentation, leading to higher retrieval accuracy than standard embeddings achieve with or without augmentation. It’s also simpler, faster, and cheaper, it is a drop-in replacement for standard embeddings with no downstream workflow changes, and it reduces sensitivity to chunking strategy. On chunk-level and document-level retrieval tasks, voyage-context-3 outperforms OpenAI-v3-large by 14.24% and 12.56%, Cohere-v4 by 7.89% and 5.64%, Jina-v3 late chunking by 23.66% and 6.76%, and contextual retrieval by 20.54% and 2.40%, respectively. It also supports multiple dimensions and multiple quantization options enabled by Matryoshka learning and quantization-aware training, saving vector database costs while maintaining retrieval accuracy. For example, voyage-context-3 (binary, 512) outperforms OpenAI-v3-large (float, 3072) by 0.73% while reducing vector database storage costs by 99.48%: virtually the same performance at 0.5% of the cost. We’re excited to introduce voyage-context-3, a novel contextualized chunk embedding model in which each chunk embedding encodes not only the chunk's own content but also contextual information from the full document. voyage-context-3 provides a seamless drop-in replacement for standard, context-agnostic embedding models used in existing retrieval-augmented generation (RAG) pipelines, while offering improved retrieval quality through its ability to capture relevant contextual information.
Compared to both context-agnostic models with isolated chunking (e.g., OpenAI-v3-large, Cohere-v4) and existing methods that add context and metadata to chunks, including overlapping chunks and attaching metadata, voyage-context-3 delivers significant gains in retrieval performance while simplifying the tech stack. On chunk-level retrieval (retrieving the most relevant chunk) and document-level retrieval (retrieving the document containing the most relevant chunk), voyage-context-3 outperforms on average: OpenAI-v3-large and Cohere-v4 by 14.24% and 12.56%, and 7.89% and 5.64%, respectively; the context augmentation methods Jina-v3 late chunking 1 and contextual retrieval 2 by 23.66% and 6.76%, and 20.54% and 2.40%, respectively; and voyage-3-large by 7.96% and 2.70%, respectively.

Chunking challenges in RAG

Focused detail vs global context. Chunking (breaking large documents into smaller segments, or chunks) is a common and often necessary step in RAG systems. Originally, chunking was primarily driven by the models’ limited context window (which recent models, including Voyage’s, have significantly extended). More importantly, it allows the embeddings to contain precise, fine-grained information about the corresponding passages and, as a result, allows the search system to pinpoint the relevant passages. Finally, without chunking, users must pass complete documents to downstream large language models (LLMs), driving up costs as many tokens may be irrelevant to the query. However, this focus can come at the expense of broader context. For instance, if a 50-page legal document is vectorized into a single embedding, detailed information, such as the sentence “All data transmissions between the Client and the Service Provider’s infrastructure shall utilize AES-256 encryption in GCM mode”, is likely to be buried or lost in the aggregate.
By chunking the document into paragraphs and vectorizing each one separately, the resulting embeddings can better capture localized details like “AES-256 encryption.” However, such a paragraph may not contain global context—such as the Client’s name—which is necessary to answer queries like “What encryption methods does Client VoyageAI want to use?” Ideally, we want both focused detail and global context—without tradeoffs . Common workarounds—such as chunk overlaps, context summaries using LLMs (e.g., Anthropic’s contextual retrieval), or metadata augmentation—can introduce extra steps into an already complex AI application pipeline. These steps often require further experimentation to tune, resulting in increased development time and serving cost overhead. Introducing contextualized chunk embeddings We’re excited to introduce contextualized chunk embeddings that capture both focused detail and global context. Our model processes the entire document in a single pass and generates a distinct embedding for each chunk. Each vector encodes not only the specific information within its chunk but also coarse-grained, document-level context, enabling richer and more semantically aware retrieval. The key is that the neural network sees all the chunks at the same time and decides intelligently what global information from other chunks should be injected into the individual chunk embeddings. Full document automatic context aware: Contextualized chunk embeddings capture the full context of the document without requiring the user to manually or explicitly provide contextual information. This leads to improved retrieval performance compared to isolated chunk embeddings, while remaining simpler, faster, and cheaper than other context-augmentation methods. Seamless drop-in replacement and storage cost parity: voyage-context-3 is a seamless drop-in replacement for standard, context-agnostic embedding models used in existing search systems, RAG pipelines, and agentic systems. 
It accepts the same input chunks and produces vectors with identical output dimensions and quantization, now enriched with document-level context for better retrieval performance. In contrast to ColBERT, which introduces a large number of additional vectors and higher storage costs, voyage-context-3 generates the same number of vectors and is fully compatible with any existing vector database. Less sensitive to chunking strategy: While chunking strategy still influences RAG system behavior (and the optimal approach depends on data and downstream tasks), our contextualized chunk embeddings are empirically shown to reduce the system's sensitivity to these strategies, because the model intelligently supplements overly short chunks with global context. Contextualized chunk embeddings outperform manual or LLM-based contextualization because neural networks are trained to capture context intelligently from large datasets, surpassing the limitations of ad hoc efforts. voyage-context-3 was trained using both document-level and chunk-level relevance labels, along with a dual objective that teaches the model to preserve chunk-level granularity while incorporating global context.

| Approach | Context Preservation | Engineering Complexity | Retrieval Accuracy |
| Standard Embeddings (e.g., OpenAI-v3-large) | None | Low | Moderate |
| Metadata Augmentation & Contextual Retrieval (e.g., Jina-v3 late chunking) | Partial | High | Moderate-High |
| Contextualized Chunk Embeddings (e.g., voyage-context-3) | Full, Principled | Low | Highest |

Evaluation details

Chunk-level and document-level retrieval

For a given query, chunk-level retrieval returns the most relevant chunks, while document-level retrieval returns the documents containing those chunks. The figure below illustrates both retrieval levels across chunks from n documents. The most relevant chunk, often referred to as the “golden chunk,” is bolded and shown in green.
Its corresponding parent document is shown in blue. Datasets We evaluate on 93 domain-specific retrieval datasets, spanning nine domains: web reviews, law, medical, long documents, technical documentation, code, finance, conversations, and multilingual, which are listed in this spreadsheet . Every dataset contains a set of queries and a set of documents. Each document consists of an ordered sequence of chunks, which are created by us via a reasonable chunking strategy. As usual, every query has a number of relevant documents with a potential score indicating the degree of relevance, which we call document-level relevance labels and can be used for the evaluation of document-level retrieval. Moreover, each query also has a list of most relevant chunks with relevance scores, which are curated through various ways, including labeling by LLMs. These are referred to as chunk-level relevance labels and are used for chunk-level retrieval evaluation. We also include proprietary real-world datasets, such as technical documentation and documents containing header metadata. Finally, we assess voyage-context-3 across different embedding dimensions and various quantization options, on standard single-embedding retrieval evaluation, using the same datasets as in our previous retrieval-quality-versus-storage-cost analysis . Models We evaluate voyage-context-3 alongside several alternatives, including: OpenAI-v3-large (text-embedding-3-large), Cohere-v4 (embed-v4.0), Jina-v3 late chunking (jina-embeddings-v3), contextual retrieval, voyage-3.5, and voyage-3-large. Metrics Given a query, we retrieve the top 10 documents based on cosine similarities and report the normalized discounted cumulative gain (NDCG@10), a standard metric for retrieval quality and a variant of the recall. Results All the evaluation results are available in this spreadsheet , and we analyze the data below. Domain-specific quality. 
The bar charts below show the average retrieval quality of voyage-context-3 with full-precision 2048 embeddings for each domain. In the following chunk-level retrieval chart, we can see that voyage-context-3 outperforms all other models across all domains. As noted earlier, for chunk-level retrieval, voyage-context-3 outperforms on average OpenAI-v3-large, Cohere-v4, Jina-v3 late chunking, and contextual retrieval by 14.24%, 7.89%, 23.66%, and 20.54%, respectively. voyage-context-3 also outperforms all other models across all domains in document-level retrieval, as shown in the corresponding chart below. On average, voyage-context-3 outperforms OpenAI-v3-large, Cohere-v4, Jina-v3 late chunking, and contextual retrieval by 12.56%, 5.64%, 6.76%, and 2.40%, respectively. Real-world datasets. voyage-context-3 performs strongly on our proprietary real-world technical documentation and in-house datasets, outperforming all other models. The bar chart below shows chunk-level retrieval results. Document-level retrieval results are provided in the evaluation spreadsheet . Chunking sensitivity . Compared to standard, context-agnostic embeddings, voyage-context-3 is less sensitive to variations in chunk size and delivers stronger performance with smaller chunks. For example, on document-level retrieval, voyage-context-3 shows only a 2.06% variance, compared to 4.34% for voyage-3-large, and outperforms voyage-3-large by 6.63% when using 64-token chunks. Context metadata . We also evaluate performance when context metadata is prepended to chunks. Even with metadata prepended to chunks embedded by voyage-3-large, voyage-context-3 outperforms it by up to 5.53%, demonstrating better retrieval performance without the extra work and resources required to prepend metadata. Matryoshka embeddings and quantization . 
voyage-context-3 supports 2048-, 1024-, 512-, and 256-dimensional embeddings enabled by Matryoshka learning and multiple embedding quantization options (32-bit floating point, signed and unsigned 8-bit integer, and binary precision) while minimizing quality loss. To clarify in relation to the previous figures, the chart below illustrates single-embedding retrieval on documents. Compared with OpenAI-v3-large (float, 3072), voyage-context-3 (int8, 2048) reduces vector database costs by 83% with 8.60% better retrieval quality. Further, comparing OpenAI-v3-large (float, 3072) with voyage-context-3 (binary, 512), vector database costs are reduced by 99.48% with 0.73% better retrieval quality; that’s virtually the same retrieval performance at 0.5% of the cost.

Try voyage-context-3

voyage-context-3 is available today! The first 200 million tokens are free. Get started with this quickstart tutorial. You can swap voyage-context-3 into any existing RAG pipeline without any downstream changes. Contextualized chunk embeddings are especially effective for: Long, unstructured documents such as white papers, legal contracts, and research reports. Cross-chunk reasoning, where queries require information that spans multiple sections. High-sensitivity retrieval tasks (such as in finance, medical, or legal domains) where missing context can lead to costly errors. To learn more about building AI applications with MongoDB, visit the MongoDB AI Learning Hub.

1 Jina. “Late Chunking in Long-Context Embedding Models.” August 22, 2024.
2 Anthropic. “Introducing Contextual Retrieval.” September 19, 2024.
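The storage figures quoted in this post (83% and 99.48%) can be checked with a short calculation. The sketch below assumes that vector database cost scales with the raw size of each stored vector, that is, dimensions times bits per value, and ignores index overhead and metadata. The helper names are illustrative only.

```python
def bytes_per_vector(dimensions: int, bits_per_value: int) -> float:
    """Raw payload size of one embedding, ignoring index and metadata overhead."""
    return dimensions * bits_per_value / 8


def storage_reduction_pct(baseline_bytes: float, candidate_bytes: float) -> float:
    """Percentage of storage saved by the candidate relative to the baseline."""
    return (1 - candidate_bytes / baseline_bytes) * 100


# Baseline: OpenAI-v3-large with 3072 dimensions of 32-bit floats.
baseline = bytes_per_vector(3072, 32)

# voyage-context-3 with 2048 dimensions of 8-bit integers.
int8_2048 = bytes_per_vector(2048, 8)

# voyage-context-3 with 512 dimensions of 1-bit (binary) values.
binary_512 = bytes_per_vector(512, 1)

if __name__ == "__main__":
    print(baseline, int8_2048, binary_512)  # 12288.0 2048.0 64.0
    print(round(storage_reduction_pct(baseline, int8_2048), 2))  # 83.33
    print(round(storage_reduction_pct(baseline, binary_512), 2))  # 99.48
```

Under that assumption, a binary 512-dimensional vector takes 64 bytes against 12,288 bytes for a 3072-dimensional float vector, which is about 0.5% of the original size.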

July 23, 2025

Artificial Intelligence

Revolutionizing Inventory Classification with Generative AI

In today's volatile geopolitical environment, the global automotive industry faces compounding disruptions that require a fundamental rethink of data and operations strategy. After decades of low import taxes, the return of tariffs as a tool of economic negotiations has led the global automotive industry to delay model-year transitions and disrupt traditional production and release cycles. As of June 2025, only 3% of US automotive inventory comprises next-model-year vehicles —less than half the number seen at this time in previous years. This severe decline in new-model availability, compounded by a 12.2% year-over-year drop in overall inventory, is pressuring consumer pricing and challenging traditional dealer inventory management. In this environment of constrained supply, better tools are urgently needed to classify and control vehicle, spare part, and raw material inventories for both dealers and manufacturers. Traditionally, dealerships and automakers have relied on ABC analysis to segment and control inventory by value. This widely used method classifies items into Category A, B, or C. For example, Category A items typically represent just 20% of stock but drive 80% of sales, while Category C items might comprise half the inventory yet contribute only 5% to the bottom line. This approach effectively helps prioritize resource allocation and promotional efforts. Figure 1. ABC analysis for inventory classification. While ABC analysis is known for its ease of use, it has been criticized for its focus on dollar usage. For example, not all Category C items are necessarily low-priority, as some may be next-model-year units arriving early or aging stock affected by shifting consumer preferences. Other criteria—such as lead-time, commonality, obsolescence, durability, inventory cost, and order size requirements—have also been recognized as critical for inventory classification. 
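To make the ABC method described above concrete, here is a minimal Python sketch that ranks items by sales value and assigns classes by cumulative share of total sales. The 80% and 95% cutoffs follow the proportions mentioned in the text, but real cutoffs vary by business. The item names and sales figures are invented for illustration.

```python
def abc_classify(
    sales_by_item: dict[str, float],
    a_cutoff: float = 0.80,
    b_cutoff: float = 0.95,
) -> dict[str, str]:
    """Assign each item to class A, B, or C by cumulative share of total sales."""
    total = sum(sales_by_item.values())
    if total <= 0:
        raise ValueError("total sales must be positive")

    classes: dict[str, str] = {}
    cumulative = 0.0
    # Walk items from highest to lowest sales value.
    ranked = sorted(sales_by_item.items(), key=lambda kv: kv[1], reverse=True)
    for item, sales in ranked:
        # Use the share accumulated *before* this item so the top seller is always A.
        share_before = cumulative / total
        if share_before < a_cutoff:
            classes[item] = "A"
        elif share_before < b_cutoff:
            classes[item] = "B"
        else:
            classes[item] = "C"
        cumulative += sales
    return classes


if __name__ == "__main__":
    # Hypothetical annual sales values per inventory item.
    sales = {
        "sedan": 500.0,
        "suv": 300.0,
        "wipers": 100.0,
        "mats": 50.0,
        "bulbs": 30.0,
        "caps": 20.0,
    }
    print(abc_classify(sales))
```

Because the only input is dollar value, an early next-model-year unit with low sales so far would land in Category C here, which is the limitation that motivates adding further criteria.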
A multi-criteria inventory classification (MCIC) methodology, therefore, adds additional criteria to dollar usage. MCIC can be achieved with methods like statistical clustering or unsupervised machine learning techniques. Yet, a significant blind spot remains: the vast amount of unstructured data that organizations must deal with; unstructured data accounts for an estimated 80% of the world's total. Traditional ABC analysis—and even MCIC—often overlook the growing influence of insights gleaned from unstructured sources like customer sentiment and product reviews on digital channels. But now, valuable intelligence from reviews, social media posts, and dealer feedback can be vectorized and transformed into actionable features using large language models (LLMs). For instance, analyzing product reviews can yield qualitative metrics like the probability of recommending or repurchasing a product, or insights into customer expectations vs. the reality of ownership. This textual analysis can also reveal customers' product perspectives, directly informing future demand. By integrating these signals into inventory classification models, businesses can gain a deeper understanding of true product value and demand elasticity. This fusion of structured and unstructured data represents a crucial shift from reactive inventory management to predictive and customer-centric decision-making. In this blog post, we propose a novel methodology to convert unstructured data into powerful feature sets for augmenting inventory classification models. Figure 2. Transforming unstructured data into features for machine learning models. How MongoDB enables AI-driven inventory classification So, how does MongoDB empower the next generation of AI-driven inventory classification? It all comes down to four crucial steps, and MongoDB provides the robust technology and features to support every single one. Figure 3. Methodology and requirements for gen AI-powered inventory classification. 
Step 1: Create and store vector embeddings from unstructured data

MongoDB Atlas enables modern vector search workflows. Unstructured data like product reviews, supplier notes, or customer support transcripts can be vectorized via embedding models (such as Voyage AI models) and ingested into MongoDB Atlas, where the embeddings are stored next to the original text chunks. This data then becomes searchable using MongoDB Atlas Vector Search, which allows you to run native semantic search queries directly inside the database. Unlike solutions that require separate databases for structured and vector data, MongoDB stores them side by side using the flexible document model, enabling unified access via one API. This reduces system complexity, technical debt, and infrastructure footprint, and allows for low-latency semantic searches. Figure 4. Product reviews can be stored as vector embeddings in MongoDB Atlas.

Step 2: Design and store evaluation criteria

In a gen AI-powered inventory classification system, evaluation criteria are no longer a set of static rules stored in a spreadsheet. Instead, the criteria are dynamic and data-backed: they are generated by an AI agent using structured and unstructured data, and enriched by domain experts using business objectives and constraints. As shown in Figure 5, the criteria for features like “Product Durability” can be defined based on relevant unstructured data stored in MongoDB (product reviews, audit reports) as well as structured data like inventory turnover and sales history. Such criteria are not just instructions or rules, but are knowledge objects with structure and semantic depth. The AI agent uses tools such as generate_criteria and embed_criteria as it iterates over each product in the inventory. It leverages the LLM to create the criteria definition and uses an embedding model (e.g., voyage-3-large) to generate embeddings of each definition. MongoDB Atlas is uniquely suited to store these dynamic criteria.
Each rule is modeled as a flexible JSON document containing the name of the feature, the criteria definition, the data sources used, and the embeddings. Since there are different types of products (different car models/makes and different car parts), the documents can evolve over time without requiring schema migrations and can be queried and retrieved by the AI agent in real time. MongoDB Atlas provides all the necessary tools for this design (a flexible document model database, vector search, and full-text search) that can be leveraged by the AI agent to create the criteria. Figure 5. Unstructured and structured data are used by the AI agent to create criteria for feature generation.

Step 3: Create an agentic application to perform transformation based on the criteria

In the third step, we have another AI agent that operates over products, criteria, and unstructured data to generate enriched feature sets. This agent iterates over every product, uses MongoDB Atlas Vector Search to find relevant customer reviews, applies the criteria to them, and calculates a numerical feature score. The new features are added to the original features JSON document in MongoDB. In Figure 6, the agent has created “durability” and “criticality” features from the product reviews. MongoDB Atlas is the ideal foundation for this agentic architecture. Again, it provides the agent the tools it needs for features to evolve, adding new dimensions without requiring schema redesign. This results in an adaptive classification dataset that contains both structured and unstructured data. Figure 6. An AI agent enriches product features with vectorized review data to generate new features.

Step 4: Rerun the inventory classification model with new features added

As a final step, the inventory classification domain experts can assign or balance weights to existing and new features, choose a classification technique, and rerun inventory classification to find new inventory classes.
Figure 7 shows the process where generative AI features are used in the existing inventory classification algorithm. Figure 7. Domain experts can rerun classification after balancing weights. Figure 8 shows the solution in action. The customer satisfaction score is created by an LLM using a vectorized collection of customer reviews and is then used in the inventory classification model with a new weight of 0.2. Figure 8. Inventory classification using generative AI.

Driving smarter inventory decisions

As the automotive industry navigates slowing sales and uneven inventory, traditional inventory classification techniques also need to evolve. Though such techniques provide a solid foundation, they fall short in the face of geopolitical uncertainty, tariff-driven supply shifts, and fast-evolving consumer expectations. By combining structured sales and consumption data with unstructured insights, and enabling agentic AI using MongoDB, the automotive industry can enable a new era of inventory intelligence where products are dynamically classified based on all available data, both structured and unstructured. Clone the GitHub repository if you are interested in trying out this solution yourself. To learn more about MongoDB’s role in the manufacturing industry, please visit our manufacturing and automotive webpage.
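The rerun described in Step 4 can be pictured as a weighted score over normalized features. The Python sketch below is one simple way to do this, not the method used in the GitHub repository. Only the 0.2 weight for customer satisfaction comes from the text. The other weights, the feature names, the sample values, and the choice of min-max normalization are assumptions, as is treating higher satisfaction as raising an item's priority.

```python
def min_max_normalize(values: list[float]) -> list[float]:
    """Scale values to the range [0, 1]; constant columns map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def weighted_scores(
    items: dict[str, dict[str, float]],
    weights: dict[str, float],
) -> dict[str, float]:
    """Combine normalized feature values into one score per item."""
    names = list(items)
    scores = {name: 0.0 for name in names}
    for feature, weight in weights.items():
        # Normalize each feature across all items so that units do not dominate.
        normalized = min_max_normalize([items[name][feature] for name in names])
        for name, value in zip(names, normalized):
            scores[name] += weight * value
    return scores


if __name__ == "__main__":
    # Hypothetical inventory items; customer_satisfaction stands in for the LLM-derived feature.
    inventory = {
        "brake_pad": {"annual_dollar_usage": 120000.0, "lead_time_days": 30.0, "customer_satisfaction": 0.90},
        "wiper_blade": {"annual_dollar_usage": 20000.0, "lead_time_days": 5.0, "customer_satisfaction": 0.40},
        "infotainment_unit": {"annual_dollar_usage": 70000.0, "lead_time_days": 55.0, "customer_satisfaction": 0.65},
    }
    # The 0.2 weight mirrors the example in the text; the other two are placeholders.
    feature_weights = {"annual_dollar_usage": 0.5, "lead_time_days": 0.3, "customer_satisfaction": 0.2}

    ranked = sorted(weighted_scores(inventory, feature_weights).items(), key=lambda kv: kv[1], reverse=True)
    for name, score in ranked:
        print(f"{name}: {score:.2f}")
```

The ranked scores can then be cut into classes with whatever technique the domain experts prefer, for example cumulative-share thresholds or clustering.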

July 16, 2025

Artificial Intelligence
