MongoDB Developer Blog

Deep dives into technical concepts, architectures, and innovations with MongoDB.

You Don't Always Need Frontier Models to Power Your RAG Architecture

Frontier AI models are driving the widespread adoption of generative AI by demonstrating unprecedented capabilities, but deploying them often carries significant costs. The strategic partnership between MongoDB and Fireworks AI addresses these cost implications by combining MongoDB's efficient data management with Fireworks AI's model optimization tools to improve speed and efficiency while minimizing operational expenses. In the current AI environment, achieving high performance is crucial, but so is optimizing the total cost of ownership (TCO). Businesses must focus on the price-performance ratio, ensuring that improvements in speed and efficiency translate into real cost savings.

This article will address the following topics:

- How to build an agentic RAG application using a Fireworks AI-hosted LLM and MongoDB Atlas for retrieval.
- Strategies for optimizing retrieval-augmented generation (RAG) applications using MongoDB Atlas and large language models (LLMs) through effective query and response caching.
- Techniques on the Fireworks AI platform for fine-tuning models, accelerating LLM inference, and reducing hardware needs.
- Steps to fine-tune a pretrained SLM with PEFT techniques using the Fireworks platform.

Readers will gain a practical, in-depth strategy for improving AI performance while lowering costs, demonstrated with examples and performance data.

Unlocking efficiency and performance with MongoDB and Fireworks AI

MongoDB Atlas is renowned for its flexible schema, efficient indexing, and distributed architecture, allowing organizations to scale their data infrastructure on demand. It is a general-purpose database that combines flexibility, suitability for AI workloads, and ACID transactions, and it lets users run their applications anywhere without compromising on security. MongoDB offers a comprehensive, secure, and efficient database solution for modern applications, catering to a wide range of technical and strategic needs.

Fireworks AI is recognized for its suite of technologies focused on optimizing the performance and efficiency of large language models (LLMs). Its offerings span model optimization tools, the specialized FireOptimizer framework, and innovative attention mechanisms like FireAttention. These solutions aim to increase inference speed, reduce operational costs, and improve resource utilization. Fireworks AI also provides parameter-efficient fine-tuning methods and adaptive speculative execution to tailor models for specific applications, along with optimized processing for long-context tasks and techniques that maximize throughput and cost-effectiveness in model serving. In addition to serving a catalog of readily available models, Fireworks AI provides a platform for customers to host and serve custom LLM implementations.

Core capabilities: FireOptimizer and FireAttention

FireOptimizer is Fireworks AI's adaptation engine for customizing AI model performance in production environments. It automates latency and quality optimization for unique inference workloads, tailoring performance across the hardware, model, and software layers using techniques like customizable quantization, fine-tuning, and adaptive caching.
Its hallmark feature, adaptive speculative execution, automatically trains workload-specific draft models to parallelize token generation, achieving up to 3x latency improvements compared to generic speculative decoding. By increasing the hit rate, this method significantly boosts responsiveness without compromising accuracy.

Figure 1. FireOptimizer platform.

FireAttention, Fireworks AI's custom-built inference engine, significantly enhances LLM inference speed on GPUs. It achieves this by utilizing a novel micro-precision data format and rewriting key GPU kernels (such as attention and matrix multiplication) from scratch, aligning them with the underlying hardware instructions. While FireAttention prioritizes speed, potentially at the cost of initial accuracy, this is mitigated through quantization-aware training (QAT), which allows fine-tuned models to maintain high precision while reducing their memory footprint. Benchmarks demonstrate FireAttention V4's superior performance over SGLang on H200 and TRT-LLM on B200, particularly in MMLU Pro tests. Overall, FireAttention V4 represents a breakthrough in low-latency, high-efficiency LLM inference, especially beneficial for frontier models like DeepSeek R1.

Key benefits:

- Faster inference: FireOptimizer's adaptive speculative execution has demonstrated up to 3x latency improvements in production workloads across various models, ensuring highly responsive applications.
- Hassle-free optimization: FireOptimizer automates the complexities of optimization, allowing users to concentrate on application development.

FireOptimizer

FireOptimizer improves batch inference by integrating with MongoDB for efficient model fine-tuning and streamlined deployment. This multi-layered customization is vital for compound AI systems, ensuring consistent model alignment. Available for enterprise on-premises and own-cloud deployments, FireOptimizer enhances traditional inference performance through techniques like adaptive speculative execution, caching, customizable quantization, personalized fine-tuning at scale, and customizable hardware mapping.

In this blog post, we'll use FireOptimizer to perform parameter-efficient fine-tuning (PEFT) so that a small language model (SLM) can carry out personalized tasks such as RAG over a private dataset. This exercise demonstrates how generative AI can be adopted effectively at scale and in critical domains.

Survey of fine-tuning strategies for smaller, efficient models

Smaller language models present significant opportunities for tailored adaptation while using fewer resources. The ongoing evolution in this field is fueled by increasing demand for deploying optimized LLMs across diverse environments, including cloud platforms, edge devices, and specialized hardware. These fine-tuning approaches can be categorized as follows:

Additive parameter-efficient fine-tuning (PEFT): This class of methods augments pre-trained models with new trainable parameters without altering the original weights.

- Adapters: Small, trainable modules inserted within the pre-trained model's layers. These adapters learn task-specific adjustments, enabling adaptation to new tasks without changing the pre-existing parameters.
- Soft prompts: Trainable vector embeddings appended to the input sequence, acting as guiding signals that influence the model's output for a specific task.
- Prefix tuning: This technique adds a trainable prefix to the input sequence; the prefix learns task-specific information without requiring modifications to the core model architecture.
Reparametrization PEFT: This approach reduces the number of trainable parameters by reparameterizing existing model weights using low-rank approximations.

- Low-Rank Adaptation (LoRA): LoRA approximates weight updates in the attention layers of a pre-trained model using low-rank matrices, significantly decreasing the number of trainable parameters.
- Quantized LoRA (QLoRA): QLoRA builds upon LoRA by integrating quantization methods, further decreasing memory footprint and computational expense.

Selective fine-tuning: This category focuses on fine-tuning only specific parameters of the pre-trained model, improving computational efficiency.

- BitFit: This method fine-tunes only the bias terms, or other designated parameters, of the pre-trained model.
- DiffPruning: This technique identifies and removes parameters that have minimal impact on the model's performance, reducing the number of trainable parameters.

Layer freezing strategies: These strategies selectively freeze certain layers of the pre-trained model while fine-tuning others to optimize the adaptation process.

- Freeze and reconfigure (FAR): FAR freezes specific layers of the pre-trained model and fine-tunes the remaining layers.
- FishMask: This technique uses a mask to selectively freeze or fine-tune layers, optimizing adaptation for specific tasks.

Parameter-efficient fine-tuning (PEFT) is a popular technique for adapting small pre-trained models to niche tasks. By adjusting only a small portion of the model's parameters, PEFT prevents overfitting, especially on smaller datasets, and greatly reduces computational and memory demands compared to full fine-tuning. PEFT also helps mitigate catastrophic forgetting in LLMs. This approach allows for efficient model customization in resource-constrained environments without the need for complete retraining. Leveraging PEFT LoRA techniques in Fireworks AI, combined with the availability of trace data and labeled data, allows for efficient fine-tuning of smaller models.

To demonstrate the practical implications of using a small language model (SLM), we will build an agentic RAG application with MongoDB Atlas, showing how MongoDB can power semantic search and also serve as a semantic caching layer. The application is a step-by-step demonstration: we first build a simple, task-driven application using a frontier LLM such as Llama Maverick, then use the data generated in this setting to fine-tune an SLM that performs a similar operation satisfactorily while consuming fewer resources.

Step-by-step guide for building an agentic RAG application with MongoDB Atlas

The sample code below demonstrates an end-to-end agentic retrieval-augmented generation (RAG) workflow using LangChain, MongoDB Atlas Vector Search, and Fireworks LLMs. Below is a summary of the key steps and components:

1. Data loading & preprocessing
- PDF loading: The EU Act regulations PDF is loaded using PDFLoader.
- Text splitting: The document is split into manageable chunks using RecursiveCharacterTextSplitter for efficient retrieval and embedding.

2. Embedding & vector store setup
- Embeddings: A sentence-transformers MPNet model is used to generate vector embeddings for each text chunk.
- MongoDB Atlas Vector Search: The embeddings and text chunks are stored in MongoDB, and a vector search index is created for similarity search.

3. LLM & caching
- LLM setup: Meta Llama Maverick is used as the main LLM, with a custom output parser to clean up responses.
- Semantic cache: MongoDB Atlas Semantic Cache is configured to cache LLM responses and avoid redundant computation.

4. Agentic RAG workflow
- StateGraph construction: The workflow is modeled as a state machine with the following steps:
  - plan_step: Reformulates the user query for optimal retrieval.
  - retrieve_documents_step: Retrieves relevant documents from the vector store.
  - execute_step: Generates an answer using the LLM and the retrieved context.
  - validate_step: Uses the LLM to validate the relevance of the answer.
  - should_continue: Decides whether to proceed to the execute step or go back to the plan step.

Steps to build the agentic RAG application as described above are available in the notebook here. Once built, the graph for your agentic workflow looks as shown in Figure 2.

Figure 2. Agentic RAG workflow graph.

Running the agentic RAG workflow

Invoke the workflow with a user query:

query = "In the EU act what are the various biometric categorizations of data?"
app.invoke({"question": query}, config={"recursion_limit": 30})

Response:

In EU legislation, biometric data is categorized and regulated under several key frameworks, primarily focusing on its use, protection, and specific applications. Based on the provided context and relevant regulations:

### 1. **Biometric Verification (Authentication)**
- **Definition**: Automated one-to-one comparison of biometric data (e.g., fingerprints, facial recognition) to verify an individual's identity against previously stored data.
- **Purpose**: Authentication (e.g., unlocking devices, accessing services).
- **Legal Reference**: Article 3(36) of the cited regulations.

### 2. **Biometric Identification (Matching)**
- **Definition**: One-to-many comparison of biometric data against databases to establish identity (e.g., border control, law enforcement).
- **Examples**:
  - **Eurodac** (Regulation 2024/1358): Compares biometric data (e.g., fingerprints) to identify illegally staying third-country nationals or stateless persons.
  - **Law Enforcement**: Used to locate suspects, victims of trafficking, or missing persons under strict conditions (Article 3(38), 3(39), and provisions like point (i)–(iii)).

### 3. **Special Categories of Personal Data**
- **Status**: Biometric data is classified as a "special category" under:
  - **GDPR (Article 9(1), Regulation (EU) 2016/679)**: Requires enhanced protections due to its sensitivity.
  - **Directive (EU) 2016/680** and **Regulation (EU) 2018/1725**: Extend these protections to law enforcement and EU institutions.
- **Safeguards**: Pseudonymization, strict access controls, confidentiality obligations, and mandatory deletion after retention periods (points (c)–(e) in the context).

### 4. **Operational and Sensitive Data**
- **Sensitive Operational Data**: Biometric data used in criminal investigations or counter-terrorism, where disclosure could jeopardize proceedings (Article 3(38)).
- **Emotion Recognition Systems**: While not explicitly labeled as biometric, these systems infer emotions/intentions (Article 3(39)) and may intersect with biometric processing if tied to identifiable individuals.

### 5. **Law Enforcement Exceptions**
- Biometric data may be processed for:
  - Preventing terrorist attacks or imminent threats (point (ii)).
  - Investigating serious crimes (punishable by ≥4 years' imprisonment) under Annex II (point (iii)).

### Key Requirements:
- **Security**: State-of-the-art measures, pseudonymization, and access documentation (point (c)).
- **Restrictions**: Prohibition on unauthorized transfers (point (d)).
- **Retention**: Deletion after correcting bias or reaching retention limits (point (e)).

These categorizations ensure biometric data is used proportionally, with stringent safeguards to protect privacy and fundamental rights under EU law.

Validation score: 0.9

This notebook provides a modular, agentic RAG pipeline that can be adapted for various document retrieval and question-answering tasks using MongoDB and LLMs.

Step-by-step guide for fine-tuning a small language model with Fireworks AI

Current challenges with frontier models

The large language model used in the preceding example, accounts/fireworks/models/deepseek-r1, can result in slow application response times due to the significant computational resources required for its billions of parameters. An agentic RAG task involves multiple LLM invocations for steps such as generating retrieval questions, producing answers, and comparing user questions to the generated results. These repeated LLM queries extend the total response time to 30-40 seconds, with each query potentially taking 5 or more seconds. Additionally, deploying and scaling LLMs for a large user base can be complex and expensive. To mitigate this, the example code demonstrates the use of a semantic cache; however, this only helps with repeated queries to the system.

By leveraging small language models (SLMs), enterprises can achieve significant gains in processing speed and cost-efficiency. SLMs require less computational power, making them ideal for resource-constrained devices, while delivering faster response times and lower operational costs. There is a significant caveat to using SLMs, however: they come with limitations such as reduced generalization, limited context retention, and lower accuracy on complex tasks compared to larger models. They may struggle with nuanced reasoning, exhibit increased biases, and generate hallucinations due to their constrained training data and fewer parameters. While they are computationally efficient and well suited for lightweight applications, their ability to adapt across domains remains restricted. For example, a pretrained SLM such as accounts/fireworks/models/deepseek-r1-distill-qwen-1p5b does not produce satisfactory results in our agentic RAG setting out of the box: it is unable to perform validation scoring and tends to hallucinate, generating a response even when relevant context is provided.

Adapting a pre-trained small language model (SLM) for specialized applications, such as agentic retrieval-augmented generation (RAG) over a private knowledge base, offers a cost-effective alternative to frontier models while maintaining similar performance levels. This strategy also provides scalability for numerous clients, ensuring service level agreements (SLAs) are met. Parameter-efficient fine-tuning (PEFT) techniques such as Quantized Low-Rank Adaptation (QLoRA) substantially improve efficiency by focusing optimization on a limited set of parameters, lowering memory demands and operational expenses. Integrating with MongoDB streamlines data management and supports efficient model fine-tuning workflows.
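As a reference point, the semantic cache mentioned above can be attached to LangChain in a few lines so that every LLM call is first checked against semantically similar cached prompts in MongoDB Atlas. The following is a minimal sketch, assuming the langchain-mongodb and langchain-huggingface packages; the index name is illustrative, while the database and collection names mirror the agenticrag.cache collection that the next section reads from.

# Minimal sketch: MongoDB Atlas as a semantic cache for LLM calls (assumes the
# langchain-mongodb and langchain-huggingface packages; names are illustrative).
from langchain_core.globals import set_llm_cache
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_mongodb.cache import MongoDBAtlasSemanticCache

MONGODB_URI = "<mongodb_atlas_connection_string>"

# Same MPNet sentence-transformers family used for the vector store embeddings.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

# LLM calls made through LangChain are answered from the cache when a
# semantically similar prompt has already been stored; only misses hit the LLM.
set_llm_cache(
    MongoDBAtlasSemanticCache(
        connection_string=MONGODB_URI,
        embedding=embeddings,
        database_name="agenticrag",
        collection_name="cache",
        index_name="vector_index",
    )
)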
MongoDB's unique value

MongoDB is integral to this process, providing seamless data management and real-time integration that improve operational efficiency. By storing trace data as JSON and enabling efficient retrieval and storage, MongoDB adds substantial value to the model fine-tuning workflow. MongoDB also doubles as a caching layer, avoiding unnecessary LLM invocations for repeated requests on the same data. The following steps walk through how to use the platform to fine-tune an SLM.

Figure 3. The fine-tuning process explained.

To enhance RAG applications, the initial step involves collecting data relevant to the specific task for fine-tuning. MongoDB Atlas, a flexible database, can be used to store LLM responses in a cache. For example, in our agentic RAG approach, we can create questions using diverse datasets and store their corresponding answers in MongoDB Atlas. While a powerful LLM is useful for generating these initial responses or task-specific data during this simulation phase, even a smaller-scale fine-tuning run requires at least 1,000 examples. These generated responses then need to be converted into the format required by the Fireworks AI platform before fine-tuning can begin. The cache.jsonl file, used later in fine-tuning, can be created by executing the following code.

from pymongo import MongoClient
import pandas as pd
import json

# Read the cached prompt/response pairs collected by the agentic RAG application.
client = MongoClient("<mongodb_atlas_connection_string>")
cache_col = client["agenticrag"]["cache"]
df = pd.DataFrame.from_records(cache_col.find())

# Convert each cached prompt (text field) and LLM response (return_val field)
# into a chat-style user/assistant pair.
vals = list(zip(
    [{"role": "user", "content": json.loads(text)[0]["kwargs"]["content"]} for text in df.text],
    [{"role": "assistant", "content": json.loads(json.loads(text)[0])["kwargs"]["text"]} for text in df.return_val],
))

messages = []
for val in vals:
    messages += [{"messages": list(val)}]

# Write the pairs in the JSONL format expected by the Fireworks fine-tuning service.
with open("cache.jsonl", "w") as f:
    for item in messages:
        f.write(json.dumps(item) + "\n")

Now that we have prepared the dataset and generated our cache.jsonl file, we can fine-tune the pre-trained deepseek-r1-distill-qwen-1p5b model by following the steps below.

Prerequisites:

- Install firectl: Use the command pip install firectl to install the Fireworks command-line tool.
- Authenticate: Log in to your Fireworks account using firectl login.
- Prepare dataset: Ensure your fine-tuning dataset (created during the data generation process) is ready.

Steps:

1. Upload dataset: Upload your prepared dataset to the Fireworks platform using the following command, replacing <dataset_name> with your desired name and cache.jsonl with your dataset file:

firectl create dataset <dataset_name> cache.jsonl

2. Create fine-tuning job: Initiate a fine-tuning job by specifying the base model, dataset, output model name, LoRA rank, and number of epochs. For example:

firectl create sftj --base-model accounts/fireworks/models/deepseek-r1-distill-qwen-1p5b \
  --dataset <dataset_name> --output-model ragmodel --lora-rank 8 --epochs 1

The output will provide details about the job, including its name, creation time, dataset used, current state, and the name of the output model.

3. Monitor fine-tuning: Track the progress of your fine-tuning job using the Fireworks AI portal. This allows you to ensure the process is running as expected.

4. Deploy fine-tuned model: Once the fine-tuning is complete, deploy the model for inference on the Fireworks platform.
This involves two steps:

- Deploy the base model used for fine-tuning:

firectl create deployment accounts/fireworks/models/deepseek-r1-distill-qwen-1p5b --enable-addons --wait

- Deploy the fine-tuned LoRA adapter:

firectl load-lora ragmodel --deployment <deployment_id>

5. Use deployed model: After deployment, the model ID (e.g., models/ragmodel) can be used to invoke the fine-tuned language model via your preferred LLM framework, leveraging the Fireworks platform's serverless API (see the sketch at the end of this article).

Summary

Fine-tuning smaller language models (SLMs) for retrieval-augmented generation (RAG) using platforms like Fireworks AI offers significant advantages over relying solely on large frontier models. This approach drastically improves response times, reducing latency from around 5 seconds with a large LLM to 2.3 seconds with a fine-tuned SLM, while also substantially decreasing memory and hardware requirements. By leveraging parameter-efficient fine-tuning techniques and integrating with data management solutions like MongoDB, businesses can achieve faster, more cost-effective AI performance for RAG applications, making advanced AI capabilities more accessible and sustainable.

Conclusion

The collaboration between MongoDB and Fireworks AI offers a powerful synergy for enhancing the efficiency and affordability of large language model (LLM) training and deployment. Fireworks AI's use of parameter-efficient fine-tuning (PEFT) techniques like LoRA and QLoRA significantly curtails the computational resources necessary for fine-tuning LLMs by focusing on low-rank adaptation and quantization. This translates directly into substantial reductions in the costs associated with this crucial process. Complementarily, MongoDB's robust infrastructure, characterized by its distributed architecture, flexible schema, and efficient indexing capabilities, provides the ideal data management foundation. It allows for on-demand scaling of data infrastructure while minimizing storage expenses, thereby contributing to lower capital and operational expenditures.

This integration further fosters streamlined workflows between data and AI processes. MongoDB's capacity for real-time data integration ensures that AI models have immediate access to the most current information, improving operational efficiency and the relevance of the models' insights. When combined with Fireworks AI's fine-tuning tools, this creates a cohesive environment where AI models can be continuously updated and refined. Moreover, the partnership simplifies the development of robust retrieval-augmented generation (RAG) solutions. MongoDB Atlas offers a scalable platform for storing embeddings, while Fireworks AI provides managed LLM hosting and other essential features. This seamless combination enables the creation of scalable, intelligent systems that significantly enhance user experience through more effective and relevant information retrieval. Organizations adopting this strategy can achieve accelerated AI performance, resource savings, and future-proof solutions—driving innovation and competitive advantage across different sectors.

Further reading:

- Atlas Vector Search: Learn AI and vector search; generate, store, index, and search embeddings in MongoDB Atlas for semantic search. Build hybrid search with Atlas Search and Atlas Vector Search. Use vector search for a RAG chatbot. Manage indexes with the Atlas CLI and MongoDB Shell.
- FireAttention V4: Enables cost-effective GPU inference and provides industry-leading latency and cost efficiency with FP4.
- FireOptimizer: Allows users to customize latency and quality for production inference workloads.
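As referenced in the "Use deployed model" step above, the deployed adapter can be called like any other Fireworks-hosted model. The following is a minimal sketch that assumes Fireworks' OpenAI-compatible serverless endpoint and the openai Python package; the account ID, model ID, and API key are placeholders.

# Minimal sketch: invoking the fine-tuned SLM served by Fireworks through its
# OpenAI-compatible API (account ID, model ID, and API key are placeholders).
from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="<FIREWORKS_API_KEY>",
)

response = client.chat.completions.create(
    model="accounts/<account_id>/models/ragmodel",  # deployed LoRA adapter
    messages=[
        {
            "role": "user",
            "content": "In the EU act what are the various biometric categorizations of data?",
        }
    ],
    temperature=0.1,
)
print(response.choices[0].message.content)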

August 11, 2025
Developer Blog

Kubernetes, Crossplane, and Atlas—Better Together

More and more companies are moving away from the original approach to DevOps, where application developers were given access to—and often responsibility for—the tooling used to spin up infrastructure and deploy workloads. While this definitely ticked the self-service box, the overhead in cognitive load was very high for developers; they now had to learn and support what was previously centrally managed (even if that meant a slow ticket-ops approach to provisioning and deploying).

At MongoDB, we're seeing more and more customers moving towards the concept of an Internal Developer Platform. Tooling and governance are centrally managed and developed, often by central teams with titles like "platform engineering." Application developers retain the self-service that was the great value proposition of DevOps, but thanks to the central ownership, they don't suffer the overhead of maintaining (or even needing to fully understand) the tooling they leverage. The tooling often abstracts the developers from many of the optional or typically unchanged options and settings, leaving them a minimal number of decisions to make to meet their needs.

This also offers many benefits to the company: not only are developers empowered to move and deliver faster, but governance and compliance become easier thanks to centrally enforced settings (e.g., security best practices like TLS). Such tooling also typically makes it far easier for the company to roll out changes to the centrally owned templates used by application developers, whether that be new defaults or new enforced security settings. Properly enabling application developers makes their lives easier, and delivering customer value becomes faster than with either a fully centralized approach (with ticket-ops) or a totally decentralized 'classic' DevOps approach. The company retains security, governance, and monitoring oversight, helping meet business requirements. Everyone wins.

Atlas Kubernetes Operator

MongoDB Atlas is a fully managed cloud database service that simplifies deploying, managing, and scaling MongoDB clusters. With features like automated backups, advanced security controls, global clusters, and real-time performance metrics, Atlas is designed to help teams move faster while ensuring the reliability and security of their data infrastructure. MongoDB Atlas offers a wide range of programmatic management options.

Many MongoDB customers are leveraging Kubernetes-native workflows, either where applications are deployed to Kubernetes, or where a centrally managed Internal Developer Platform is run through Kubernetes (regardless of where the applications run). To support this, MongoDB provides the Atlas Kubernetes Operator—an open-source operator that lets you manage Atlas resources declaratively through Custom Resource Definitions (CRDs). This means you can define projects, clusters, database users, IP access lists, and more, directly in YAML files, and the operator will reconcile these specs with the actual state in MongoDB Atlas—in other words, apply your declarative configuration to Atlas. The value for customers and application development teams is that you can manage Atlas through the same workflow (often GitOps, leveraging tooling like ArgoCD) that you already use to configure your applications running in Kubernetes, or through a Kubernetes-based Internal Developer Platform.
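For illustration, the following is a minimal sketch of what such YAML can look like for a project and a small dedicated cluster; the resource names and sizing are placeholders, and the field names follow the AtlasProject and AtlasDeployment examples that appear later in this post.

# Minimal sketch of Atlas Kubernetes Operator resources (names and sizing are
# placeholders; field names follow the examples later in this post).
apiVersion: atlas.mongodb.com/v1
kind: AtlasProject
metadata:
  name: example-project
spec:
  name: example-project
---
apiVersion: atlas.mongodb.com/v1
kind: AtlasDeployment
metadata:
  name: example-cluster
spec:
  projectRef:
    name: example-project
  deploymentSpec:
    name: example-cluster
    clusterType: REPLICASET
    backupEnabled: false
    replicationSpecs:
      - regionConfigs:
          - providerName: AWS
            regionName: US_EAST_1
            priority: 7
            electableSpecs:
              instanceSize: M10
              nodeCount: 3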
By bridging the gap between GitOps and database management, the Atlas Kubernetes Operator empowers platform engineers to treat Atlas resources as first-class citizens in their Kubernetes ecosystem—just like pods, deployments, or services.

Crossplane

Various tools and solutions are competing to provide a base for a Kubernetes-native Internal Developer Platform. Crossplane is a tool at the forefront of this, and we're gradually hearing more interest in it among MongoDB customers. It is a powerful open-source framework that extends Kubernetes into a universal control plane for managing infrastructure and services across environments using standard Kubernetes APIs—whether those services run in Kubernetes or outside. Rather than relying on separate tools or external scripts, Crossplane lets users define and manage infrastructure and workloads in a consistent, declarative, and version-controlled way.

At its core, Crossplane works by installing custom controllers and CRDs into a Kubernetes cluster. These controllers can manage internal or external systems, including cloud services, databases, and more, via APIs, using native Kubernetes resources. This makes it possible to describe infrastructure as code, enforce organizational policies, and integrate infrastructure provisioning into existing GitOps workflows. The standardization even simplifies any further abstraction a company might implement for its application developers, for example, a GUI for simplified infrastructure provisioning.

Crossplane compositions

Crossplane is very well aligned with the value proposition of an Internal Developer Platform thanks to its ability to abstract infrastructure provisioning behind Compositions and Composite Resources (XRs). This allows platform engineers (with input from other teams, such as security) to define reusable blueprints (sometimes called "golden paths" in the context of Internal Developer Platforms) for common services. The standardization and simplification possible with these templates make things far easier for application developers—whether they use the simplified declarative configuration directly or it is abstracted behind a customer's user interface—all without exposing the underlying complexity or provider-specific details.

Crossplane's flexibility lies in its ability to connect to multiple external systems via providers, with custom templates that:

- Enforce standards through mandatory settings aligned with the organization's policies (e.g., TLS)
- Reduce cognitive load for application developers through abstraction that cuts down on what they have to think about when trying to provision something like a MongoDB Atlas cluster
- Enable centrally implemented changes through central management of the templates, addressing a common problem with other configuration tooling

Key concepts

- Managed Resource: This is the end result and the resulting configuration in Kubernetes that defines the thing actually being managed—this might be a server, a VM, or a much lower-level object like an IP access list. For Atlas, a managed resource could be a custom resource defining an Atlas project. This is the output of Crossplane and the configuration ultimately applied to do something, like create or update an Atlas project.
- Composite Resource (XR): This is the input to Crossplane. It's the template, created by a platform team and used (either directly or indirectly) by the application developers.
A composite resource might define a single resource in Kubernetes or a cloud service, or represent something more akin to a whole stack. For example, a composite resource might describe a full MongoDB Atlas setup, including a project, a cluster, and a user.
- Composite Resource Definition (XRD): This defines a new custom resource type for Crossplane—essentially the schema of the template. This enables customers to request infrastructure via a Composite Resource (XR). The central platform team creates this type, but application developers do not directly use it.
- Composition: This governs the translation of a Composite Resource (XR) into one or more Managed Resources. An example would be translating a Composite Resource defining an Atlas project, cluster, and user into distinct custom resources, which the application developer doesn't need to see but which are then applied to Atlas via the Atlas Kubernetes Operator.
- Claim (optional): A developer-friendly alias for an XR. Think of it as a simplified interface to create a Composite Resource without exposing platform-specific naming. An optional—but powerful and recommended—further abstraction and simplification.
- Crossplane Provider: Crossplane Providers are secondary to the inbuilt capabilities in the previous concepts. Providers are extensions to Crossplane that typically enable management of a specific service or workload. They're akin to Terraform Providers, and many Crossplane Providers are even built from an existing Terraform Provider. A Crossplane Provider is one option for actually applying Managed Resources—for example, applying the configuration to an external service via APIs.

Kubernetes Operators and Crossplane Kubernetes Providers

Kubernetes Operators are a well-established concept in the Kubernetes ecosystem. They can run services in Kubernetes (e.g., the MongoDB Controllers for Kubernetes Operator, which supports running MongoDB Community or Enterprise Advanced in Kubernetes) or manage external services via APIs (e.g., the MongoDB Atlas Kubernetes Operator, which supports managing Atlas). When using Crossplane, any Kubernetes Operator can be used thanks to the Crossplane Kubernetes Provider, which enables Crossplane to provide a consistent, simplified, and centrally managed interface in front of any number of existing Kubernetes Operators.

How it works

Below is a visual representation of how Crossplane Compositions work—from claim to managed infrastructure:

Figure 1. Crossplane compositions at work.

Workflow:

1. The application developer applies a Claim to Kubernetes, likely via a GitOps flow. (This may also be generated through some further abstraction for application developers, e.g., a GUI, and automatically applied to Kubernetes.)
2. Crossplane creates a Composite Resource (XR) from the claim.
3. Crossplane selects the matching Composition.
4. The Composition generates one or more Managed Resources, which are actioned/applied using Crossplane Providers—for example, the Crossplane Kubernetes Provider, which enables the use of the Atlas Kubernetes Operator to apply configuration to Atlas.

In essence, Crossplane shifts the responsibility of infrastructure design and lifecycle management to the central platform team, while giving application development teams a clean, consistent, simplified interface to request the services they need.

MongoDB's support of Crossplane

As described above, Crossplane provides a centrally managed self-service interface for a wide array of services thanks to its flexibility and native support.
Though not directly supported by MongoDB, Crossplane can be used in conjunction with tools like the MongoDB Atlas Kubernetes Operator, thanks to the already-mentioned Crossplane Kubernetes Provider, which enables Crossplane to work with any operator. The following is an example of how this can be done. Bear in mind that this blog is not official guidance on Crossplane (consult the official Crossplane documentation) and is not guaranteed to be kept up to date. It's meant as an illustration of using Crossplane with the Crossplane Kubernetes Provider and the Atlas Kubernetes Operator. As a result, MongoDB is not able to offer anything beyond best-efforts guidance on using the Atlas Kubernetes Operator with Crossplane, though MongoDB officially supports its use by Atlas customers.

Example use case: Self-service data platform for microservices teams

We are going to define a Composite Resource Definition called ProjectEnvironment (which allows users to define specific ProjectEnvironments) and a Composition that governs how a ProjectEnvironment is broken down into the underlying Managed Resources, which in this case define Atlas resources, including a project, a deployment, an IP access list, and database users. So, with one ProjectEnvironment Composite Resource (perhaps applied to Kubernetes via GitOps), our Crossplane Composition will generate several custom resources in Kubernetes. The Atlas Kubernetes Operator will then read these and apply them via the Atlas Admin API to create the various resources in Atlas.

Before diving into the examples, let's make sure you have the necessary setup in place.

Prerequisites

To follow the examples below, ensure the following:

- A Kubernetes cluster is available and running (local or managed).
- Crossplane is installed in the cluster. You can follow the official installation guide.
- The Crossplane Kubernetes Provider is installed.
- The functions patch-and-transform, template-go, and auto-ready are installed.
- The Atlas Kubernetes Operator is installed. Refer to the official quick start guide.
- An Atlas organization has been created, and API keys have been set up and put in place for the operator to use (all covered in the quickstart guide, but stop after step 4, as we'll be using the operator via Crossplane!). Ensure your API keys include permission to manage:
  - Projects
  - Deployments (clusters)
  - Database custom roles
  - Database users

Note: In order to allow the Crossplane provider-kubernetes to manage Atlas Kubernetes Operator objects, the following RBAC permissions must be granted:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: crossplane:provider:provider-kubernetes:mongodb-atlas
rules:
  - apiGroups:
      - ""
    resources:
      - serviceaccounts
    verbs:
      - '*'
  - apiGroups:
      - atlas.mongodb.com
    resources:
      - "*"
    verbs:
      - "*"
  - apiGroups:
      - rbac.authorization.k8s.io
    resources:
      - "*"
    verbs:
      - "*"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: crossplane:provider:provider-kubernetes:mongodb-atlas
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: crossplane:provider:provider-kubernetes:mongodb-atlas
subjects:
  - kind: ServiceAccount
    name: upbound-provider-kubernetes-beb1eef47cde
    namespace: crossplane-system

Step 1: Define the Composite Resource Definition (XRD)

A Composite Resource Definition (XRD) defines a custom API for your platform, offering a self-service interface for users to provision resources. It acts as a template for the Composite Resources (XRs) that your users will create.
While it looks like a Kubernetes Custom Resource Definition (CRD), an XRD provides a higher-level abstraction designed for Crossplane. The following XRD defines a ProjectEnvironment resource. This custom API allows development teams to provision a complete MongoDB Atlas environment, including a project, a database deployment, access lists, and users, by creating a single, simple Kubernetes object. apiVersion: apiextensions.crossplane.io/v1 kind: CompositeResourceDefinition metadata: name: projectenvironments.platform.example.org spec: group: platform.example.org names: kind: ProjectEnvironment plural: projectenvironments claimNames: kind: ProjectEnvironmentClaim plural: projectenvironmentsclaims defaultCompositeDeletePolicy: Foreground spec.group : Defines the API group for your composite resource, used for organizing related resources. spec.names : Sets the name for the Composite Resource (XR) that platform administrators will see. In this case, it's ProjectEnvironment. spec.claimNames : Defines the developer-facing resource, known as a Composite Resource Claim. Developers create a ProjectEnvironmentClaim , and the platform automatically provisions a corresponding ProjectEnvironment resource based on this definition. spec.defaultCompositeDeletePolicy : Specifies what happens when a ProjectEnvironment is deleted. Background : Deletes the Composite Resource first and cleans up the underlying cloud resources (like the Atlas project and database) in the background. This is faster but can leave resources orphaned if cleanup fails. Foreground : Ensures all underlying cloud resources are successfully deleted before removing the Composite Resource from Kubernetes. This is safer and prevents orphaned resources. The schema block defines the structure of your new API, including the fields that users can configure ( spec ) and the information that will be reported back ( status ). versions: - name: v1alpha1 served: true referenceable: true schema: openAPIV3Schema: type: object properties: spec: type: object required: - project - users properties: project: type: string environment: type: string enum: ["dev", "qa", "prod"] default: dev version: type: string enum: ["6.0", "7.0", "8.0"] default: "8.0" users: type: array items: type: object required: ["username", "secret"] properties: username: type: string secret: type: string status: type: object properties: id: type: string connectionStrings: type: object properties: standard: type: string standardSrv: type: string private: type: string privateSrv: type: string mongoDBVersion: type: string additionalPrinterColumns: - name: Project type: string jsonPath: ".spec.project" - name: MongoDB Version type: string jsonPath: ".status.mongoDBVersion" versions : Contains one or more versions of your API schema. served : When true, this API version is enabled and can be used on the cluster. referenceable : Allows other resources to reference this one, which is useful for building complex compositions where one resource depends on another. openAPIV3Schema : Defines the data structure for your API. spec : The fields that users configure when creating a ProjectEnvironmentClaim . This example includes required fields like project and users, along with optional fields like environment and MongoDB version that have default values. status : The fields that Crossplane will populate with information from the provisioned cloud resources, such as the Atlas project id and database connectionStrings. 
For a comprehensive list of all attributes of a CompositeResourceDefinition, see its API reference.

Step 2: Define the composition

This Composition defines the platform logic that translates a ProjectEnvironment into Managed Resources representing all the elements of a full MongoDB Atlas environment. The Managed Resources (representing project, deployment, access list, and users) are then consumed and applied to MongoDB Atlas by the MongoDB Atlas Kubernetes Operator. In other words, the Composition turns a high-level ProjectEnvironment resource into a complete, ready-to-use MongoDB Atlas environment. It acts as the "brain" of the platform, orchestrating the creation of several underlying resources in a specific order. It uses Pipeline mode, which allows for a sequence of steps, each powered by a Composition Function. This mode is ideal for complex scenarios that require conditional logic, looping, and custom processing beyond simple field mapping.

Step 2.1: Composition definition

This initial block declares that this Composition is responsible for fulfilling requests for ProjectEnvironment resources and that it will use Pipeline mode for its logic.

apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: projectenvironments.platform.example.org
spec:
  compositeTypeRef:
    apiVersion: platform.example.org/v1alpha1
    kind: ProjectEnvironment
  mode: Pipeline

- compositeTypeRef: Specifies the Composite Resource type this Composition applies to. In our case, it refers to ProjectEnvironment Composite Resources.
- mode: Defines how the composition is executed. Pipeline indicates that the Composition specifies a pipeline of Composition Functions, each of which is responsible for producing composed resources that Crossplane should create or update. Resource indicates that a Composition uses "Patch & Transform" (P&T) composition, with an array of resources, each a template for a composed resource.

The pipeline defines the sequence of operations required to build the full environment. Each step can use a different function and is executed in order.

Step 2.2: Atlas project and IP access list

These steps use the patch-and-transform function to create the foundational AtlasProject and a corresponding AtlasIPAccessList.
pipeline: - step: atlas-project-with-ip-access-list functionRef: name: function-patch-and-transform input: apiVersion: pt.fn.crossplane.io/v1beta1 kind: Resources patchSets: - name: project-ref patches: - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.name" toFieldPath: "spec.forProvider.manifest.spec.projectRef.name" transforms: - type: string string: type: Format fmt: "%s-project" - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.namespace" toFieldPath: "spec.forProvider.manifest.spec.projectRef.namespace" resources: - name: atlas-project base: apiVersion: kubernetes.crossplane.io/v1alpha2 kind: Object spec: readiness: policy: AllTrue deletionPolicy: Delete forProvider: manifest: apiVersion: atlas.mongodb.com/v1 kind: AtlasProject providerConfigRef: name: kubernetes-provider patches: - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.name" toFieldPath: "metadata.name" transforms: - type: string string: type: Format fmt: "%s-project-object" - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.name" toFieldPath: "spec.forProvider.manifest.metadata.name" transforms: - type: string string: type: Format fmt: "%s-project" - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.namespace" toFieldPath: "spec.forProvider.manifest.metadata.namespace" - type: FromCompositeFieldPath fromFieldPath: "spec.project" toFieldPath: "spec.forProvider.manifest.spec.name" - type: ToCompositeFieldPath fromFieldPath: "status.atProvider.manifest.status.id" toFieldPath: "status.id" - name: atlas-ip-access-list base: apiVersion: kubernetes.crossplane.io/v1alpha2 kind: Object spec: readiness: policy: AllTrue deletionPolicy: Delete references: - dependsOn: {} forProvider: manifest: apiVersion: atlas.mongodb.com/v1 kind: AtlasIPAccessList spec: entries: - cidrBlock: "10.0.16.0/20" comment: "Company Office Network" providerConfigRef: name: kubernetes-provider patches: - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.name" toFieldPath: "metadata.name" transforms: - type: string string: type: Format fmt: "%s-ial-object" - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.name" toFieldPath: "spec.references[0].dependsOn.name" transforms: - type: string string: type: Format fmt: "%s-project-object" - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.name" toFieldPath: "spec.forProvider.manifest.metadata.name" transforms: - type: string string: type: Format fmt: "%s-ial" - type: FromCompositeFieldPath fromFieldPath: "spec.claimRef.namespace" toFieldPath: "spec.forProvider.manifest.metadata.namespace" - type: PatchSet patchSetName: project-ref This step performs direct field mappings from the incoming ProjectEnvironmentClaim to the new AtlasProject and AtlasIPAccessList resources. It uses a PatchSet to define a reusable way to reference the project's name and namespace, ensuring consistency. For simplicity, the IP access list is hardcoded to a specific CIDR block, but could be parameterized by consuming an input from our composite or by applying custom logic using a custom function. 
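As an aside, a minimal sketch of that parameterization, assuming a hypothetical officeCidr field were added to the XRD's spec, would replace the hardcoded entry with a patch along these lines:

# Hypothetical sketch: patching the CIDR block from the composite instead of
# hardcoding it (assumes an officeCidr field has been added to the XRD schema).
- type: FromCompositeFieldPath
  fromFieldPath: "spec.officeCidr"
  toFieldPath: "spec.forProvider.manifest.spec.entries[0].cidrBlock"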
Step 2.3: Atlas deployment - step: atlas-deployment functionRef: name: function-template-go input: apiVersion: gotemplating.fn.crossplane.io/v1beta1 kind: GoTemplate source: Inline inline: template: | apiVersion: kubernetes.crossplane.io/v1alpha2 kind: Object metadata: name: {{ printf "%s-deployment-object" .observed.composite.resource.spec.claimRef.name }} annotations: {{ setResourceNameAnnotation (printf "%s-deployment" .observed.composite.resource.spec.claimRef.name) }} spec: readiness: policy: AllTrue deletionPolicy: Delete references: - dependsOn: name: {{ printf "%s-project-object" .observed.composite.resource.spec.claimRef.name }} providerConfigRef: name: kubernetes-provider forProvider: manifest: apiVersion: atlas.mongodb.com/v1 kind: AtlasDeployment metadata: name: {{ printf "%s-deployment" .observed.composite.resource.spec.claimRef.name }} namespace: {{ .observed.composite.resource.spec.claimRef.namespace }} labels: environment: {{ .observed.composite.resource.spec.environment }} spec: projectRef: name: {{ printf "%s-project" .observed.composite.resource.spec.claimRef.name }} {{- $env := .observed.composite.resource.spec.environment }} {{- if eq $env "dev" }} flexSpec: name: {{ printf "%s-deployment" .observed.composite.resource.spec.project }} terminationProtectionEnabled: false providerSettings: backingProviderName: AWS regionName: US_EAST_1 {{- else }} deploymentSpec: tags: - key: environment value: {{ $env }} name: {{ printf "%s-deployment" .observed.composite.resource.spec.project }} clusterType: REPLICASET mongoDBMajorVersion: "{{ .observed.composite.resource.spec.version }}" backupEnabled: false replicationSpecs: - regionConfigs: - providerName: AWS regionName: US_EAST_1 priority: 7 electableSpecs: instanceSize: M10 nodeCount: 3 {{- if eq $env "prod" }} readonlySpecs: instanceSize: M40 nodeCount: 2 autoscaling: compute: enabled: true maxInstanceSize: M50 diskGB: enabled: true {{- end }} {{- end }} A Go template generates the deployment's specification dynamically. It inspects the environment field from the user's claim and creates a cost-effective flexSpec deployment for dev environments, or a more robust deploymentSpec for qa and prod. It also uses references to ensure this deployment is only created after the project from Step 1 is ready. 
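To make the conditional logic concrete, for the dev claim shown in Step 3 below (claim name dev-env-claim, project payments), the template above renders roughly the following Object, abridged here for readability:

# Approximate rendering of the Go template for a dev environment
# (claim name dev-env-claim, project payments); abridged for readability.
apiVersion: kubernetes.crossplane.io/v1alpha2
kind: Object
metadata:
  name: dev-env-claim-deployment-object
spec:
  readiness:
    policy: AllTrue
  deletionPolicy: Delete
  references:
    - dependsOn:
        name: dev-env-claim-project-object
  providerConfigRef:
    name: kubernetes-provider
  forProvider:
    manifest:
      apiVersion: atlas.mongodb.com/v1
      kind: AtlasDeployment
      metadata:
        name: dev-env-claim-deployment
        labels:
          environment: dev
      spec:
        projectRef:
          name: dev-env-claim-project
        flexSpec:
          name: payments-deployment
          terminationProtectionEnabled: false
          providerSettings:
            backingProviderName: AWS
            regionName: US_EAST_1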
Step 2.4: Atlas database users - step: atlas-database-users functionRef: name: function-template-go input: apiVersion: gotemplating.fn.crossplane.io/v1beta1 kind: GoTemplate source: Inline inline: template: | {{- range $index, $user := .observed.composite.resource.spec.users }} --- apiVersion: kubernetes.crossplane.io/v1alpha2 kind: Object metadata: name: {{ printf "%s-user-%s" $.observed.composite.resource.spec.claimRef.name $user.username }} annotations: {{ setResourceNameAnnotation (printf "%s-user-%s" $.observed.composite.resource.spec.claimRef.name $user.username) }} spec: readiness: policy: AllTrue deletionPolicy: Delete references: - dependsOn: name: {{ printf "%s-project-object" $.observed.composite.resource.spec.claimRef.name }} providerConfigRef: name: kubernetes-provider forProvider: manifest: apiVersion: atlas.mongodb.com/v1 kind: AtlasDatabaseUser metadata: name: {{ printf "%s-user-%s" $.observed.composite.resource.spec.claimRef.name $user.username }} namespace: {{ $.observed.composite.resource.spec.claimRef.namespace }} spec: projectRef: name: {{ printf "%s-project" $.observed.composite.resource.spec.claimRef.name }} namespace: {{ $.observed.composite.resource.spec.claimRef.namespace }} username: {{ $user.username }} passwordSecretRef: name: {{ $user.secret }} namespace: {{ $.observed.composite.resource.spec.claimRef.namespace }} databaseName: admin roles: - roleName: readWriteAnyDatabase databaseName: admin scopes: - name: {{ printf "%s-deployment" $.observed.composite.resource.spec.project }} type: CLUSTER {{- end }} This step also uses the template-go function, but this time to loop through the user's request and create multiple database users. The template uses a range block to iterate over the users array specified in the ProjectEnvironmentClaim . For each entry in the array, it generates a complete AtlasDatabaseUser resource, linking it to the correct project and password secret. This allows a single claim to stamp out multiple, similar resources. Step 2.5: Status update - step: custom-status-update functionRef: name: function-template-go input: apiVersion: gotemplating.fn.crossplane.io/v1beta1 kind: GoTemplate source: Inline inline: template: | {{ if .observed.resources }} {{ $project := index .observed.resources "atlas-project" }} {{ $deployment := index .observed.resources (printf "%s-deployment" .observed.composite.resource.spec.claimRef.name) }} apiVersion: platform.example.org/v1alpha1 kind: ProjectEnvironment status: {{ if $project.resource.status.atProvider.manifest }} id: {{ $project.resource.status.atProvider.manifest.status.id }} {{ end }} {{ if $deployment.resource.status.atProvider.manifest }} mongoDBVersion: {{ $deployment.resource.status.atProvider.manifest.status.mongoDBVersion }} connectionStrings: {{ $deployment.resource.status.atProvider.manifest.status.connectionStrings | toJson }} {{ end }} {{ end }} This crucial step uses a Go template not to create a cloud resource, but to feed information back to the user. It inspects the resources created in the previous steps, extracts key information like the project ID, MongoDB version, and connection strings, and patches them into the status field of the ProjectEnvironment resource. This makes vital information available to the developer directly on their claim object. Step 2.6: Readiness - step: readiness-check functionRef: name: function-auto-ready The final step uses the function-auto-ready function to determine when the entire composition is complete and healthy. 
This function automatically inspects all the resources managed by the pipeline and updates the ProjectEnvironment's status conditions accordingly. It signals that the environment is fully provisioned and ready for use only when every component (project, deployment, users) reports a ready state.

For a comprehensive list of all attributes of a Composition resource, see its API reference.

Step 3: Create a composite resource

This is the final step. All of the previous elements are the cogs in the machine, and this is a specific instance of a Composite Resource that is our input to Crossplane. This is what an application development team would use (directly or with further abstraction) to request a MongoDB environment in a self-service manner, using a single claim and secret.

apiVersion: platform.example.org/v1alpha1
kind: ProjectEnvironmentClaim
metadata:
  name: dev-env-claim
spec:
  compositionRef:
    name: projectenvironments.platform.example.org
  project: payments
  environment: dev
  users:
    - username: user1
      secret: user-secret
    - username: user2
      secret: user-secret
    - username: user3
      secret: user-secret

The following is a standard Kubernetes Secret that securely stores the password for the database users.

apiVersion: v1
kind: Secret
metadata:
  name: user-secret
  labels:
    atlas.mongodb.com/type: credentials
type: Opaque
stringData:
  password: myH4rdP@ssw0rd

Once this ProjectEnvironmentClaim is applied to the Kubernetes cluster where Crossplane is running:

1. The Composition is selected by Crossplane via compositionRef.name in our ProjectEnvironmentClaim.
2. The Composition instructs Crossplane to create custom resources in Kubernetes for:
  - A MongoDB project
  - A dev environment deployment
  - A CIDR-bound IP access list
  - Three database users, sharing the same password by using the same secret (for simplicity)
3. Once created in Kubernetes, those custom resources are read by the MongoDB Atlas Kubernetes Operator and applied to Atlas via the Atlas Admin API.
4. The status of the claim reflects the readiness of the composed resources. This means that application developers have visibility into the success/failure/status of what's ultimately being created in MongoDB Atlas. How developers see the claim status depends on how they've applied the claim to Kubernetes, e.g., it might be surfaced in a GUI or through a tool like ArgoCD.

Conclusion

Crossplane is a powerful tool for providing central management and governance while enabling developer self-service with minimal cognitive load. Though MongoDB does not directly support Crossplane, it's entirely possible to leverage the MongoDB Atlas Kubernetes Operator through Crossplane and the Crossplane Kubernetes Provider. The self-service data platform design using Crossplane Compositions, CompositeResourceDefinitions, and Kubernetes Provider Object resources empowers microservices teams with declarative infrastructure provisioning for MongoDB Atlas. This approach offers several key benefits:

- Application developer self-service: Developers write a simplified claim without needing to understand the underlying mechanics or infrastructure.
- Central enablement and governance: Platform engineers define and version the Composition, while developers only need to interact with a simplified custom API tailored to their use case. Platform engineers can enforce mandatory settings (enabling central governance) and recommend sensible defaults that simplify application developer use.
Flexible environment profiles Environment-specific configurations (dev, qa, prod) can be baked into the templates, enabling consistent infrastructure provisioning with variable capacity and redundancy. Extensibility The use of template-go and patch-and-transform functions allows for rich customization logic, templating, and conditional behavior, going beyond what vanilla Crossplane patching can offer. Namespace isolation & multi-tenancy Claims can be namespace-scoped and operate safely in multi-tenant environments using namespaced secrets and resources. Despite its many strengths, there are known limitations to this approach, mainly due to the current state of interoperability between Crossplane and external operators: Deletion ordering & finalizer issues While the Crossplane Kubernetes Provider coordinates deletion in a dependency-aware manner, a bug in the provider prevents it from properly setting a finalizer on dependent resources. This means that a parent resource (e.g., AtlasProject) may be deleted before its child resources (e.g., AtlasDatabaseUser), and those child resources will become orphaned. This is especially problematic because: Some Atlas Kubernetes Operator resources do not support independent references, so the orphaned child resources may be left in an error state due to their dependence on the deleted parent resource. Manual cleanup may be required for such orphaned resources. Readiness may not be fully tracked When using go-template to generate the resource, readiness is not automatically checked. This means that the status of resources that a claim generates or manages might not be up to date by default. This must be mitigated by explicitly generating ClaimConditions to update the status, or by using the `auto-ready` function. With Resources mode, or when using Resources in the Pipeline mode, this is not an issue, as readiness is automatically checked. Visit our docs page to learn more about the MongoDB Atlas Kubernetes Operator.

July 28, 2025
Developer Blog

Build Scalable RAG With MongoDB Atlas and Cohere Command R+

Retrieval-augmented generation (RAG) is becoming increasingly vital for developing sophisticated AI applications that not only generate fluent text but also ensure precision and contextual relevance by grounding responses in real, factual data. This approach significantly mitigates hallucinations and enhances the reliability of AI outputs. This guide provides a detailed exploration of an open-source solution designed to facilitate the deployment of a production-ready RAG application by using the powerful combination of MongoDB Atlas and Cohere Command R+. This solution is built upon and extends the foundational principles demonstrated in the official Cohere plus MongoDB RAG documentation available at Build Chatbots with MongoDB and Cohere . To provide you with in-depth knowledge and practical skills in several key areas, this comprehensive walkthrough will: Show you how to build a complete RAG pipeline using MongoDB Atlas and Cohere APIs Focus on data flow, retrieval, and generation Enable you to enhance answer quality through reranking to improve relevance and accuracy Enable detailed, flexible deployment with Docker Compose for local or cloud environments Explain MongoDB’s dual role as a vector store and chat memory for a seamless RAG application Reasons to choose MongoDB and Cohere for RAG The convergence of powerful technologies— MongoDB Atlas and Cohere Command R+ —unlocks significant potential for creating sophisticated, scalable, and high-performance systems for grounded generative AI (gen AI). This synergistic approach provides a comprehensive toolkit to handle the unique demands of modern AI applications. MongoDB Atlas and Cohere Command R+ facilitate the development of scalable, high-performing, and grounded AI applications. MongoDB Atlas provides a scalable, flexible, reliable, and fast database for managing large datasets used to ground generative models. Cohere Command R+ offers a sophisticated large language model (LLM) for natural language understanding and generation, incorporating retrieved data for factual accuracy and rapid inference. The combined use of MongoDB Atlas and Cohere Command R+ results in applications with fast and accurate responses, scalable architectures, and outputs informed by real-world data. This powerful combination represents a compelling approach to building the next generation of gen AI applications, facilitating innovation and unlocking novel opportunities across various sectors. Architecture overview In this section, we’ll look at the implementation architecture of the application and how the mixture of Cohere and MongoDB components flow underneath. Figure 1. Reference architecture, with Cohere and MongoDB components. The following list divides and explains the architecture components: 1. Document ingestion, chunking, and embedding with Cohere The initial step involves loading your source documents, which can be in various formats. These documents are then intelligently segmented into smaller, semantically meaningful chunks to optimize retrieval and processing. Cohere’s powerful embedding models generate dense vector representations of these text chunks, capturing their underlying meaning and semantic relationships. 2. Scalable vector and text storage in MongoDB Atlas MongoDB Atlas , a fully managed and scalable database service, serves as the central repository for both the original text chunks and their corresponding vector embeddings. 
MongoDB Atlas's built-in vector search capabilities (with MongoDB Atlas Vector Search) enable efficient and high-performance similarity searches based on the generated embeddings. This enables the scalable storage and retrieval of vast amounts of textual data and their corresponding vector representations. 3. Query processing and semantic search with MongoDB Atlas When a user poses a query, it undergoes a similar embedding process, using Cohere to generate a vector representation of the search intent. MongoDB Atlas then uses this query vector to perform a semantic search within its vector index. MongoDB Atlas efficiently identifies the most relevant document chunks based on their vector similarity to the query vector, surpassing simple keyword matching to comprehend the underlying meaning. 4. Reranking with Cohere To further refine the relevance of the retrieved document chunks, you can employ Cohere's reranking models. The reranker analyzes the initially retrieved chunks in the context of the original query, scoring and ordering them based on a more nuanced understanding of their relevance. This step ensures that you're prioritizing the most pertinent information for the final answer generation. 5. Grounded answer generation with Cohere Command R+ The architecture then passes the top-ranked document chunks to Cohere's Command R+ LLM. Command R+ uses its extensive knowledge and understanding of language to generate a grounded and coherent answer to the user's query, with direct support from the information extracted from the retrieved documents. This ensures that the answers are accurate, contextually relevant, and traceable to the source material. 6. Context-aware interactions and memory with MongoDB To enable more natural and conversational interactions, you can store the history of the conversation in MongoDB. This enables the RAG application to maintain context across multiple turns, referencing previous queries and responses to provide more informed and relevant answers. By incorporating conversation history, the application gains memory and can engage in more meaningful dialogues with users. For a better understanding of what each technical component does, reference the following table, which shows how the architecture assigns roles to each component:
MongoDB Atlas: Stores text chunks, vector embeddings, and chat logs
Cohere Embed API: Converts text into dense vector representations
MongoDB Atlas Vector Search: Performs efficient semantic retrieval via cosine similarity
Cohere Rerank API: Prioritizes the most relevant results from the retrieval
Cohere Command R+: Generates final responses grounded in top documents
In summary, this architecture provides a robust and scalable framework for building RAG applications. It integrates the document processing and embedding capabilities of Cohere with the scalable storage and vector search functionalities of MongoDB Atlas. By combining this with the generative power of Command R+, developers can create intelligent applications that provide accurate, contextually relevant, and grounded answers to user queries, while also maintaining conversational context for an enhanced user experience. Application setup The application requires the following components, ideally prepared beforehand: A MongoDB Atlas cluster (free tier is fine) A Cohere account and API key Python 3.8+ Docker and Docker Compose A configured AWS CLI Deployment steps 1.
Clone the repository. git clone https://github.com/mongodb-partners/maap-cohere-qs.git cd maap-cohere-qs 2. Configure the one-click.ksh script: Open the script in a text editor and fill in the required values for various environment variables: AWS Auth: Specify the AWS_REGION, AWS_ACCESS_KEY_ID, and AWS_SECRET_ACCESS_KEY for deployment. EC2 instance types: Choose suitable instance types for your workload. Network configuration: Update key names, subnet IDs, security group IDs, etc. Authentication keys: Fetch the project ID and the API public and private keys for MongoDB Atlas cluster setup, and update the script file with the values for APIPUBLICKEY, APIPRIVATEKEY, and GROUPID accordingly. 3. Deploy the application. chmod +x one-click.ksh ./one-click.ksh 4. Access the application: http://<ec2-instance-ip>:8501 Core workflow 1. Load and chunk data: Currently, data is loaded from a static, dummy source. However, you can update this to a live data source to ensure the latest data and reports are always available. For details on data loading, refer to the documentation. 2. Embed and store: Each chunk is embedded using embed-english-v3.0, and both the original chunk and the vector are stored in a MongoDB collection: model = "embed-english-v3.0" response = self.co.embed( texts=[text], model=model, input_type=input_type, embedding_types=['float'] ) 3. Semantic retrieval with vector search: Create a vector search index on top of your collection: index_models = [ { "database": "asset_management_use_case", "collection": "market_reports", "index_model": SearchIndexModel( definition={ "fields": [ { "type": "vector", "path": "embedding", "numDimensions": 1024, "similarity": "cosine" }, { "type": "filter", "path": "key_metrics.p_e_ratio" }, { "type": "filter", "path": "key_metrics.market_cap" }, { "type": "filter", "path": "key_metrics.dividend_yield" }, { "type": "filter", "path": "key_metrics.current_stock_price" } ] }, name="vector_index", type="vectorSearch", ), } ] A vector index in MongoDB enables fast, cosine-similarity-based lookups. MongoDB Atlas returns the top-k semantically similar documents, on top of which you can apply additional post filters to get a more fine-grained result set within a bounded space. 4. Re-ranking for accuracy: Instead of relying solely on vector similarity, the retrieved documents are reranked using Cohere's Rerank API, which is trained to order results by relevance. This dramatically improves answer quality and prevents irrelevant context from polluting the response. response = self.co.rerank( query=query, documents=rerank_docs, top_n=top_n, model="rerank-english-v3.0", rank_fields=["company", "combined_attributes"] ) The importance of reranking A common limitation in RAG systems is that dense vector search alone may retrieve documents that are semantically close but not contextually relevant. The Cohere Rerank API solves this by using a lightweight model to score query-document pairs for relevance. The ability to combine everything The end application runs on a Streamlit UI, as displayed below. Figure 2. Working application with UI. To achieve more direct and nuanced responses in data retrieval and analysis, you'll find that the strategic implementation of prefilters is paramount. Prefilters act as an initial, critical layer of data reduction, sifting through larger datasets to present a more manageable and relevant subset for subsequent, more intensive processing (a minimal sketch follows below).
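The sketch below is not the quick-start's actual code; it is a hedged illustration of the prefiltering pattern that embeds a user query with Cohere and runs an Atlas $vectorSearch stage filtered on the key_metrics fields registered in the index above. The connection string, API key, and threshold values are placeholders.

```python
import cohere
from pymongo import MongoClient

co = cohere.Client("<COHERE_API_KEY>")  # placeholder API key
client = MongoClient("<your_mongodb_connection_string>")  # placeholder connection string
reports = client["asset_management_use_case"]["market_reports"]

# Embed the user's query with the same Cohere model used at ingestion time.
query = "Stable companies with attractive dividends"
query_vector = co.embed(
    texts=[query],
    model="embed-english-v3.0",
    input_type="search_query",
).embeddings[0]

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": 200,
            "limit": 5,
            # Prefilter on fields declared as filter fields in the index definition;
            # the thresholds here are illustrative only.
            "filter": {
                "key_metrics.p_e_ratio": {"$lt": 20},
                "key_metrics.dividend_yield": {"$gte": 0.03},
            },
        }
    },
    {"$project": {"_id": 0, "company": 1, "combined_attributes": 1}},
]

results = list(reports.aggregate(pipeline))
```

Because the filter is evaluated inside the vector search stage, only documents that satisfy the key_metrics constraints are considered for similarity scoring, which is what keeps the result set bounded.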
This not only significantly enhances the efficiency of queries but also refines the precision and interpretability of the results. For instance, instead of analyzing sales trends across an entire product catalogue, a prefilter can limit the analysis to a specific product line, thereby revealing more granular insights into its performance, customer demographics, or regional variations. This level of specificity enables the extraction of more subtle patterns and relationships that might otherwise be obscured within a broader, less filtered dataset. Figure 3. Prefilters to be applied on top of MongoDB Atlas Vector Search. Conclusion Just by using MongoDB Atlas and Cohere’s API suite, you can deploy a fully grounded, semantically aware RAG system that is cost effective, flexible, and production grade. This quick-start enables your developers to build AI assistants that reason with your data without requiring extensive infrastructure. Start building intelligent AI agents powered by MongoDB Atlas. Visit our GitHub repo to try out the quick-start and unlock the full potential of semantic search, secure automation, and real-time analytics. Your AI-agent journey starts now. Ready to learn more about building AI applications with MongoDB? Head over to our AI Learning Hub .

July 23, 2025
Developer Blog

Transforming Financial Services with MongoDB and IBM Watsonx.ai

Financial institutions around the world are increasingly adopting AI-driven solutions to enhance user experiences, streamline operations, and deliver personalized financial insights. As a part of the MongoDB AI Applications Program (MAAP), IBM's Watsonx.ai and MongoDB Atlas unite to deliver scalable, enterprise-grade AI development. By integrating MongoDB Atlas and IBM Watsonx.ai, we've built an intelligent finance assistant that combines cutting-edge database management and generative AI (gen AI) capabilities. Modern financial institutions face challenges in delivering personalized, real-time assistance to their customers. Generic chatbots or static systems often fail to address nuanced queries, limiting their utility and customer satisfaction. By using MongoDB Atlas Vector Search and IBM Watsonx.ai's gen AI models, we can create a finance assistant capable of handling complex queries, retrieving relevant financial data, and providing actionable insights. This blog post will walk you through: The core architecture behind the finance assistant. The ways that MongoDB Atlas and IBM Watsonx.ai complement each other in building AI-driven financial solutions. The method of building an intelligent finance assistant with MongoDB Atlas and IBM Watsonx.ai. Architecture overview The architecture of the finance assistant integrates advanced Vector Search capabilities with IBM Watsonx.ai's reasoning and language generation models. The system provides an end-to-end pipeline for handling user queries, from natural language understanding to intelligent data retrieval and response generation. Figure 1. Finance assistant architecture. Key components Each component plays a specific role in enabling natural language understanding, data retrieval, and intelligent response generation. User input: Users interact with the finance assistant using natural language queries like: "What are my last three transactions?" "How can I improve my savings?" IBM Watsonx.ai Granite embedding models: Convert user queries into high-dimensional vector embeddings that represent semantic meaning. Granite language models: Generate intelligent and context-aware responses by reasoning over retrieved data. MongoDB Atlas Vector search index: Stores vector embeddings of transactional and financial data for fast, accurate similarity-based retrieval. Hybrid search: Combines keyword search with vector similarity for holistic data retrieval. Operational data store: Maintains structured and unstructured financial data in a scalable and secure database. LangChain Orchestrates the flow between MongoDB Atlas and IBM Watsonx.ai. Implements retrieval-augmented generation (RAG) for real-time query handling and response generation. The flow of the architecture Describes how user queries are transformed into insights through embedding, retrieval, and AI-driven response generation. Preprocessing Financial data, such as customer transactions or private knowledge bases, is vectorized using IBM Watsonx.ai embedding models. Vector embeddings are stored in MongoDB Atlas alongside metadata. Query execution User input is processed into embeddings and matched against the Vector Search index. Relevant data is retrieved and passed to IBM Watsonx.ai for contextual reasoning. Response generation Watsonx.ai generates intelligent, explainable recommendations based on the retrieved data. The response is delivered to the user in natural language. Why MongoDB Atlas?
MongoDB Atlas provides a powerful platform for managing and querying large-scale data using its vector search capabilities. When building a RAG pipeline, it simplifies the process by enabling the storage of vectorized embeddings alongside metadata in a flexible schema. Its hybrid search capabilities—combining traditional keyword searches with vector similarity searches—make it ideal for efficiently retrieving relevant documents or financial data based on user input. MongoDB Atlas also provides scalability and real-time data updates, making it a robust operational data layer for dynamic RAG workflows. By seamlessly integrating vector search with existing data, MongoDB Atlas minimizes latency and complexity so that your gen AI applications can retrieve the right context every time. Why IBM Watsonx.ai? IBM Watsonx.ai brings enterprise-grade foundation models to power the reasoning and generative components of a RAG pipeline. Watsonx.ai’s foundation models, such as the Granite series, offer robust embeddings and advanced reasoning capabilities, enabling the system to process retrieved documents and generate natural language responses tailored to the user’s query. With its focus on transparency, security, and customization, Watsonx.ai is particularly suited for regulated industries like finance. Its integration with tools like LangChain facilitates seamless orchestration between retrieval and generation, enabling RAG systems to go beyond static responses by delivering personalized, insightful, and context-rich outputs. Method for building an intelligent finance assistant with MongoDB Atlas and IBM Watsonx.ai For this tutorial, we will be using a financial dataset containing customer details, transactions, spending insights, and metadata. These records represent real-world information such as payments, savings, and expenses, making the dataset highly relevant for building an intelligent finance assistant. To generate the vector embeddings for storing and retrieving this data, we will use the Granite embedding models from IBM Watsonx.ai. These embeddings capture the semantic meaning of financial data, enabling efficient similarity searches and contextual data retrieval. To follow along, you will need an integrated development environment, a MongoDB Atlas account for data storage and indexing, and an IBM Watsonx.ai account for generating embeddings. By the end of this tutorial, you’ll have a functional system ready to support real-time financial assistance and personalized recommendations. Prerequisites Before starting the implementation, ensure you have the following set up: MongoDB Atlas: Cluster with transaction and customer data collections. MongoDB Atlas will be the primary database for storing and querying transaction and customer data. Steps: Create a MongoDB Atlas account Visit MongoDB Atlas and click “ Get Started .” Sign up using your email or log in with Google, GitHub, or Microsoft. Set up a cluster Click “ Build a Cluster ” after logging in. Choose a free tier cluster or upgrade for more features. Select your cloud provider (AWS, Google Cloud, or Azure) and region . Click “ Create Cluster ” to deploy (this may take a few minutes). Configure your cluster Go to “ Database Access ” and create a user with a username, password, and role (e.g., “Read and Write to Any Database”). In “ Network Access ,” add your IP address or allow all IPs (0.0.0.0/0) for unrestricted development access. IBM Watsonx.ai: API key for accessing large language models (LLMs). 
IBM Watsonx.ai will handle the reasoning and generative tasks. Steps: Create an IBM Cloud account Visit IBM Cloud and sign up for a free account. Set up Watsonx.ai Log in and search for "Watsonx.ai" in the catalog. Create an instance; a sandbox environment will be set up automatically. Generate an API key Go to "Manage," then "Access (IAM)" in the IBM Cloud dashboard. Click "Create API Key," name it (e.g., "watsonx_key"), and save it securely. Retrieve the service URL Find the service URL (e.g., https://us-south.ml.cloud.ibm.com) in the Watsonx.ai instance dashboard. You're ready to start building your finance assistant! Implementation steps To set up and run your finance assistant, follow the steps below to clone, configure, and execute the code. Ensure that your MongoDB Atlas cluster and IBM Watsonx.ai configurations are ready before proceeding. Step 1: Clone the code repository. The demo code is available on GitHub. Clone the project repository from the provided GitHub link, using this command: git clone <repository_url> cd <repository_directory> #Install dependencies (requires Python version 3.11 or higher) pip install -r requirements.txt This repository contains all the necessary files, including preprocessing.py, processing.py, and the HTML templates. Step 2: Configure the preprocessing script. Open the preprocessing.py file. This script is responsible for ingesting and vectorizing the financial data into MongoDB Atlas. Locate the MONGO_CONN variable and replace it with your MongoDB Atlas connection string: MONGO_CONN = "<your_mongodb_connection_string>" Save the file. Step 3: Run the preprocessing script. Execute the preprocessing.py script to preprocess and ingest the financial data into MongoDB Atlas: python preprocessing.py If the script runs successfully: A new database named banking_quickstart will be created in your MongoDB Atlas cluster. The following collections will appear: faqs customers_details transactions_details spending_insight_details The script will also generate vector embeddings for textual data, enabling efficient similarity searches in MongoDB Atlas. Create a vector search index for each of the four collections, changing the embedding field name accordingly. Step 4: Configure the processing script. Open the processing.py file. This script integrates IBM Watsonx.ai for reasoning and query handling. Update the following variables: MONGO_CONN (your MongoDB Atlas connection string) and the Watsonx.ai configuration (API key and service URL). These configurations enable secure access to both MongoDB Atlas and Watsonx.ai for data retrieval and AI-powered query handling. Step 5: Run the processing script. Execute the processing.py file to start the backend Flask server: python processing.py If the server starts successfully, the application will be hosted locally at 127.0.0.1:5000. Step 6: Access the application. Open your browser and navigate to the following URL: http://127.0.0.1:5000/login You will see the finance assistant login page. Use the provided credentials (or modify the preprocessing.py script to create custom login data). Use any customer ID numbered between 1 and 1000 (e.g., CUST0571). Figure 2. Customer login portal. Figure 3. Finance assistant dashboard. Additional technical details Preprocessing with Watsonx.ai: During the execution of preprocessing.py, the Granite embedding models from Watsonx.ai are used to vectorize textual data (e.g., transaction descriptions). The generated embeddings are stored in MongoDB Atlas for similarity-based queries; a hedged sketch of this embed-and-store step follows below.
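The sketch below is a rough illustration of that preprocessing step, not the repository's actual preprocessing.py. It assumes the ibm-watsonx-ai SDK's Embeddings interface; the credentials, project ID, and exact Granite model ID are placeholders you would replace with your own values.

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Embeddings
from pymongo import MongoClient

# Placeholder credentials and project ID; the model ID is an assumed Granite embedding model.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="<IBM_CLOUD_API_KEY>")
embedder = Embeddings(
    model_id="ibm/granite-embedding-107m-multilingual",
    credentials=credentials,
    project_id="<WATSONX_PROJECT_ID>",
)

client = MongoClient("<your_mongodb_connection_string>")  # placeholder connection string
transactions = client["banking_quickstart"]["transactions_details"]

descriptions = ["Card payment of $42.10 at a grocery store on 2025-07-01"]
vectors = embedder.embed_documents(texts=descriptions)

# Store each description next to its embedding so a vector search index on the
# "embedding" field can serve similarity queries later.
transactions.insert_many(
    [{"description": d, "embedding": v} for d, v in zip(descriptions, vectors)]
)
```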
API configuration: The processing.py script integrates with IBM Watsonx.ai’s Granite language models to process natural language queries and generate meaningful responses. Server logs: Check the terminal logs for any errors or status updates during the execution of the Flask server. Logs provide insights into API calls, database interactions, and AI responses. The power of advanced vector search and enterprise AI Building a finance assistant using MongoDB Atlas and IBM Watsonx.ai demonstrates the power of combining advanced vector search capabilities with enterprise-grade AI models. This architecture not only provides real-time, accurate, and personalized financial insights but also highlights the scalability and flexibility needed for modern financial applications. In this tutorial, you’ve learned how to: Preprocess financial data using Watsonx.ai’s Granite embedding models to create vector embeddings. Store and query data efficiently in MongoDB Atlas using its Vector Search index and hybrid search capabilities. Integrate IBM Watsonx.ai’s foundation models for intelligent reasoning and natural language understanding. Build a seamless user interface to enable customers to access their financial information intuitively. And with this system, you can deliver: Personalized financial insights: Deliver tailored responses for individual users based on their financial data. Scalable performance: Effortlessly handle large datasets and complex queries. Enhanced user experiences: Provide customers with real-time, explainable, and context-aware recommendations. As the financial services sector continues to evolve, combining tools like MongoDB Atlas and IBM Watsonx.ai will become essential for delivering smarter AI-driven solutions. You can easily extend this architecture to include advanced analytics, fraud detection, or even investment forecasting, making it a robust foundation for future innovation. Ready to take your finance assistant to the next level? Start experimenting with more data, refining AI prompts, or exploring MongoDB Atlas and Watsonx.ai’s advanced features to unlock even greater potential! To fast-track your AI journey, explore the MongoDB AI Applications Program (MAAP). It brings together cutting-edge technologies and expert services from top AI and tech leaders including IBM to help your organization move seamlessly from concept to road map, prototype, and full-scale production.

July 21, 2025
Developer Blog

Embedded Objects and Other Index Gotchas

In a recent design review , the customer's application was in production, but performance had taken a nosedive as data volumes grew. It turned out that the issue was down to how they were indexing the embedded objects in their documents. This article explains why their indexes were causing problems, and how they could be fixed. Note that I've changed details for this use case to obfuscate the customer and application. All customer information shared in a design review is kept confidential. We looked at the schema, and things looked good. They'd correctly split their claim information across two documents: One contained a modest amount of queryable data (20 KB per claim). These documents included the _id of the second document in case the application needed to fetch it (which was relatively rare). The second contained the bulky raw data that's immutable, unindexed, and rarely read. They had 110K queryable documents in the first collection—claims. With 2.2 GB of documents (before compression, which only reduces on-disk size) and 4 GB of cache, there shouldn't have been any performance issues. We looked at some of the queries, and there was a pretty wide set of keys being filtered on and in different combinations, but none of them returned massive numbers of documents. Some queries were taking tens of seconds. It made no sense. Even a full collection scan should take well under a second for this configuration. And they'd even added indexes for their common queries. So then, we looked at the indexes… Figure 1. Collection size report in MongoDB Atlas. 15 indexes on one collection is on the high side and could slow down your writes, but it's the read performance that we were troubleshooting. But, those 15 indexes are consuming 85 GB of space. With the 4 GB of cache available on their M30 Atlas nodes, that’s a huge problem! There wasn't enough RAM in the system for the indexes to fit in cache. The result was that when MongoDB navigated an index, it would repeatedly hit branches that weren't yet in memory and then have to fetch them from disk. That’s slow. Taking a look at one of the indexes… Figure 2. Index definition in MongoDB Atlas. It's a compound index on six fields, but the first five of those fields are objects, and the sixth is an array of objects—this explains why the indexes were so large. Avoiding indexes on objects Even ignoring the size of the index, adding objects to an index can be problematic. Querying on embedded objects doesn't behave in the way that many people expect. If an index on an embedded object is to be used, then the query needs to include every field in the embedded object. 
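To see the difference in index definitions, here is a small PyMongo sketch (the article's own examples use mongosh; the connection string and database name are placeholders) that contrasts indexing the whole policy_holder object with indexing individual fields inside it.

```python
from pymongo import ASCENDING, MongoClient

client = MongoClient("<your_mongodb_connection_string>")  # placeholder
claims = client["insurance"]["claim"]  # assumed database name; collection name from the article

# Index on the embedded object itself: only queries that supply the entire
# policy_holder document, field for field, can use this index.
claims.create_index([("policy_holder", ASCENDING)], name="policy_holder_object")

# Index on individual fields within the object: usable by queries that filter
# on dotted paths such as policy_holder.first_name and policy_holder.last_name.
claims.create_index(
    [("policy_holder.first_name", ASCENDING), ("policy_holder.last_name", ASCENDING)],
    name="policy_holder_name",
)
```

The whole-object index only helps when the query matches the embedded document in its entirety, which is exactly the behavior demonstrated next.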
E.g., if I execute this query, then it matches exactly one of the documents in the database: db.getCollection('claim').findOne( { "policy_holder": { "first_name": "Janelle", "last_name": "Nienow", "dob": new Date("2024-12-16T23:56:49.643Z"), "location": { "street": "67628 Warren Road", "city": "Padbergstead", "state": "Minnesota", "zip_code": "44832-7187" }, "contact": { "email": "Janelle.Nienow@noxious-flood.org" } } } ); It delivers this result: { "_id": { "$oid": "67d801b7ad415ad6165ccd5f" }, "region": 12, "policy_holder": { "first_name": "Janelle", "last_name": "Nienow", "dob": { "$date": "2024-12-16T23:56:49.643Z" }, "location": { "street": "67628 Warren Road", "city": "Padbergstead", "state": "Minnesota", "zip_code": "44832-7187" }, "contact": { "email": "Janelle.Nienow@noxious-flood.org" } }, "policy_details": { "policy_number": "POL554359100", "type": "Home Insurance", "coverage": { "liability": 849000000, "collision": 512000, "comprehensive": 699000 } }, ... } The explain plan confirmed that MongoDB was able to use one of the defined indexes: Figure 3. The visual explain plan tool in MongoDB Atlas displaying that the compound index on policy_holder and messages was used. If just one field from the embedded object isn't included in the query, then no documents will match: db.getCollection('claim').findOne( { "policy_holder": { "first_name": "Janelle", "last_name": "Nienow", "dob": new Date("2024-12-16T23:56:49.643Z"), "location": { "street": "67628 Warren Road", "city": "Padbergstead", "state": "Minnesota", // "zip_code": "44832-7187" }, "contact": { "email": "Janelle.Nienow@noxious-flood.org" } } } ); This resulted in no matches—though the index is at least still used. If we instead pick out individual fields from the object to query on, then we get the results we expect: db.getCollection('claim').findOne( { "policy_holder.first_name": "Janelle", "policy_holder.last_name": "Nienow" } ); { "_id": { "$oid": "67d801b7ad415ad6165ccd5f" }, "region": 12, "policy_holder": { "first_name": "Janelle", "last_name": "Nienow", "dob": { "$date": "2024-12-16T23:56:49.643Z" }, "location": { "street": "67628 Warren Road", "city": "Padbergstead", "state": "Minnesota", "zip_code": "44832-7187" }, "contact": { "email": "Janelle.Nienow@noxious-flood.org" } }, "policy_details": { "policy_number": "POL554359100", "type": "Home Insurance", "coverage": { "liability": 849000000, "collision": 512000, "comprehensive": 699000 } }, ... } Unfortunately, none of the indexes that included policy_holder could be used as they were indexing the value of the complete embedded object, not the individual fields within it, and so a full collection scan was performed: Figure 4. The visual explain plan too warning that no index was available. Using compound indexes instead If we instead add a compound index that leads with the fields from the object we need to filter on, then that index will be used: Figure 5. Creating an index in MongoDB Atlas. Figure 6. Explain plan providing information for the compound index. As a quick refresher on using compound indexes, that index will be used if we query on just first_name: db.getCollection('claim').findOne( { "policy_holder.first_name": "Janelle", // "policy_holder.last_name": "Nienow" } ); Figure 7. Explain plan showing that the compound index was used. If we don't include the first key in the compound index, then it won't be used: db.getCollection('claim').findOne( { // "policy_holder.first_name": "Janelle", "policy_holder.last_name": "Nienow" } ); Figure 8. 
Explain plan providing more information on the query. However, you can use the index if you artificially include the leading keys in the query (though it would be more efficient if last_name had been the first key in the index): db.getCollection('claim').findOne( { "policy_holder.first_name": {$exists: true}, "policy_holder.last_name": "Nienow" } ); Figure 9. Explain plan showing the data for the index. Incompletely indexed queries While having indexes for your queries is critical, there is a cost to having too many, or to having indexes that include too many fields: writes get slower and pressure increases on cache occupancy. Sometimes, it's enough to have an index that does part of the work, and then rely on a scan of the documents found by the index to check the remaining keys. For example, the policy holder's home state isn't included in our compound index, but we can still query on it: db.getCollection('claim').findOne( { "policy_holder.first_name": "Janelle", "policy_holder.location.state": "Kentucky" } ); Figure 10. Explain plan showing that the index narrowed down the search. The explain plan shows that the index narrowed down the search from 110,000 documents to 111, which were then scanned to find the three matching documents. If it's rare for the state to be included in the query, then this can be a good solution. Partial indexes The main challenge in this design review was the size of the indexes, and so it's worth looking into another approach to limit the size of an index. Imagine that we need to be able to check on the names and email addresses of witnesses to accidents. We can add an index on the relevant fields: Figure 11. Adding an index to the relevant fields in Atlas. This index consumes 9.8 MB of cache space and must be updated when any document is added, or when any of these three fields are updated. Even if a document has null values for the indexed fields, or if the fields aren't even present in the document, the document will still be included in the index. If we look deeper into the requirements, we might establish that we only need to query this data for fraudulent claims. That means that we're wasting space in our index on entries for all of the other claims. We can exploit this requirement by creating a partial index, setting the partial filter expression to { "claim.status": "Fraud" }. Only documents that match that pattern will be included in the index. Figure 12. Creating a partial filter in Atlas. That reduces the size of the index to 57 KB (a saving of more than 99%): Figure 13. Index sizing report. Note that queries must include { "claim.status": "Fraud" } for this index to be used: db.getCollection('claim').findOne( { "witnesses.email": "Sammy.Bergstrom@hotmail.com", "claim.status": "Fraud" } ); Figure 14. Explain plan providing details on the index keys and documents examined. Conclusion Indexes are critical to database performance, whether you're using an RDBMS or MongoDB. MongoDB allows polymorphic documents, arrays, and embedded objects that aren't available in a traditional RDBMS. This leads to extra indexing opportunities, but also potential pitfalls. You should have indexes to optimize all of your frequent queries, but use the wrong type or too many of them and things could backfire. We saw that in this case with indexes taking up too much space and not being as general purpose as the developer believed. To compound problems, the database may perform well in development and for the early days in production.
Things go wrong over time as the collections grow and extra indexes are added. As soon as the working data set (indexes and documents) doesn’t fit in the cache, performance quickly declines. Well-informed use of compound and partial indexes will ensure that MongoDB delivers the performance your application needs, even as your database grows. Learn more about MongoDB design reviews Design reviews are a chance for a design expert from MongoDB to advise you on how best to use MongoDB for your application. The reviews are focused on making you successful using MongoDB. It's never too early to request a review. By engaging us early (perhaps before you've even decided to use MongoDB), we can advise you when you have the best opportunity to act on it. This article explained how using a MongoDB schema and set of indexes that match how your application works with data can meet your performance requirements. If you want help to come up with that schema, then a design review is how to get that help. Would your application benefit from a review? Schedule your design review today . Want to read more from Andrew? Head to his website .

July 16, 2025
Developer Blog

Matryoshka Embeddings: Smarter Embeddings with Voyage AI

In the realm of AI, embedding models are the bedrock of advanced applications like retrieval-augmented generation (RAG), semantic search, and recommendation systems. These models transform unstructured data (text, images, audio) into high-dimensional numerical vectors, allowing us to perform similarity searches and power intelligent features. However, traditional embedding models often generate fixed-size vectors, leading to trade-offs between performance and computational overhead. This post will dive deep into Matryoshka Representation Learning (MRL), a novel approach that creates flexible, multi-fidelity embeddings. We'll compare and contrast MRL with traditional embeddings and quantization, detailing its unique training process and showcasing how Voyage AI's voyage-3-large and the recently released voyage-3.5 models leverage MRL as well as quantization to deliver unparalleled efficiency with MongoDB Atlas Vector Search. Understanding embedding models At their core, embedding models learn to represent discrete items (words, sentences, documents) as continuous vectors in a multi-dimensional space. The key principle is that items with similar meanings or characteristics are mapped to points that are close to each other in this vector space. This spatial proximity then allows for efficient similarity comparisons using metrics like cosine similarity. For example, in a semantic search application, when a user queries "best vegan restaurants," the embedding model converts this query into a vector. It then compares this vector against a database of pre-computed embeddings for restaurant descriptions. Restaurants whose embeddings are "nearby" the query embedding are deemed relevant and returned to the user. Figure 1. Example embedding model. Image credit: Hugging Face Blog Challenges with traditional embeddings Historically, embedding models have generated vectors of a fixed size, for example, 768, 1024, or 4096 dimensions. While effective, this fixed-size nature presents challenges: Inflexibility: A model trained for, say, 768-dimensional embeddings will suffer a significant performance drop if you simply truncate its vectors to a smaller size, like 256 dimensions, without retraining. This means you're locked into a specific dimension size, even if a smaller representation would suffice for certain tasks. High computational load: Higher-dimensional vectors demand more computational resources for storage, transfer, and similarity calculations. In scenarios with large datasets or real-time inference, this can lead to increased latency and operational costs. Information loss on truncation: Without specific training, truncating traditional embeddings inevitably leads to substantial information loss, compromising the quality of downstream tasks. Matryoshka Representation Learning MRL, introduced by researchers from the University of Washington, Google Research, and Harvard University in 2022, offers an elegant solution to these challenges. Inspired by the Russian nesting dolls, MRL trains a single embedding model such that its full-dimensional output can be truncated to various smaller dimensions while still retaining high semantic quality. The magic lies in how the model is trained to ensure that the initial dimensions of the embedding are the most semantically rich, with subsequent dimensions adding progressively finer-grained information. This means you can train a model to produce, say, a 1024-dimensional embedding (a short slicing sketch appears below).
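The NumPy sketch below is a minimal illustration of truncation; the random vector simply stands in for a real MRL embedding, and the sliced vector is re-normalized so that cosine similarity remains well behaved.

```python
import numpy as np

# Stand-in for a full-length MRL embedding (e.g., 1024 dimensions).
full_embedding = np.random.default_rng(42).normal(size=1024)
full_embedding /= np.linalg.norm(full_embedding)

def truncate(embedding: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` dimensions and re-normalize the result."""
    head = embedding[:dims]
    return head / np.linalg.norm(head)

for dims in (256, 512, 1024):
    print(dims, truncate(full_embedding, dims).shape)
```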
Then, for different use cases or performance requirements, you can simply take the first 256, 512, or any other number of dimensions from that same 1024-dimensional vector. Each truncated vector is still a valid and semantically meaningful representation, just at a different level of detail. Figure 2. Matryoshka embedding model truncating the output. Image Credit: &nbsp; Hugging Face Blog Understanding MRL with an analogy Imagine a movie. A 2048-dimensional MRL embedding might represent the "Full Movie". Truncating it to: 1024 dimensions: Still provides enough information for a "Movie Trailer." 512 dimensions: Gives a "Plot Summary & Movie Details." 256 dimensions: Captures the "Movie Title & Plot One-liner." This "coarse-to-fine" property ensures that each prefix of the full vector remains semantically rich and usable. You simply keep the first N dimensions from the full vector to truncate it. Figure 3. Visualizing the Matryoshka doll analogy for MRL. The unseen hand: How the loss function shapes embedding quality To truly grasp what makes MRL distinct, we must first understand the pivotal role of the loss function in the training of any embedding model. This mathematical function is the core mechanism that teaches these sophisticated models to understand and represent meaning. During a typical training step, an embedding model processes a batch of input data, producing a set of predicted output vectors. The loss function (“J” in the below diagram) then steps in, comparing these predicted embeddings (“y_pred”) against known "ground truth" or expected target values (“y”). It quantifies the discrepancy between what the model predicts and what it should ideally produce, effectively gauging the "error" in its representations. A high loss value signifies a significant deviation – a large "penalty" indicating the model is failing to capture the intended relationships (e.g., placing semantically similar items far apart in the vector space). Conversely, a low loss value indicates accurate capture of these relationships, ensuring that similar concepts (like different images of cats) are mapped close together, while dissimilar ones remain distant. Figure 4. Training workflow including the loss function. The iterative training process, guided by an optimizer, continuously adjusts the model's internal weights with the sole aim of minimizing this loss value. This relentless pursuit of a lower loss is precisely how an embedding model learns to generate high-quality, semantically meaningful vectors. MRL training process The key differentiator for MRL lies in its training methodology. Unlike traditional embeddings, where a single loss value is computed for the full vector, MRL training involves: Multiple loss values: Separate loss values are computed for multiple truncated prefixes of the vector (e.g., at 256, 512, 1024, and 2048 dimensions). Loss averaging: These individual losses are averaged (or summed), to calculate a total loss. Incentivized information packing: The model is trained to minimize this total loss. This process penalizes even the smallest prefixes if their loss is high, strongly incentivizing the model to pack the most crucial information into the earliest dimensions of the vector. This results in a model where information is "front-loaded" into early dimensions, ensuring accuracy remains strong even with fewer dimensions, unlike traditional models where accuracy drops significantly upon truncation. Examples of MRL-trained models include voyage-3-large and voyage-3.5 . MRL vs. 
quantization It's important to differentiate MRL from quantization, another common technique for reducing embedding size. While both aim to make embeddings more efficient, their approaches and benefits differ fundamentally. Quantization techniques compress existing high-dimensional embeddings into a more compact form by reducing the precision of the numerical values (e.g., from float32 to int8). The following table describes the precise differences between MRL and quantization:
Goal: MRL reduces embedding dimensionality (e.g., keeping 256 out of 2048 dims); quantization reduces embedding precision (e.g., int8 or binary embeddings instead of fp32).
Output type: MRL produces float32 vectors of varying lengths; quantization produces fixed-length vectors with lower-bit representations.
Training awareness: MRL uses multi-loss training across dimensions; quantization often uses quantization-aware training (QAT).
Use case: MRL trades off accuracy against compute and memory at inference; quantization minimizes storage and accelerates vector math operations.
Example (Voyage AI): voyage-3-large @ 512-dim-fp32 (MRL) vs. voyage-3-large @ 2048-dim-int8 (quantization).
Flexibility and efficiency with MRL The core benefit of MRL is its unparalleled flexibility and efficiency. Instead of being locked into a single, large vector size, you can: Choose what you need: Generate a full 2048-dimensional vector and then slice it to 256, 512, or 1024 dimensions based on your specific needs. One vector, multiple fidelities: A single embedding provides multiple levels of detail and accuracy. Lower compute, bandwidth, and storage: By using smaller vector dimensions, you drastically reduce the computational load for indexing, query processing, and data transfer, as well as the storage footprint in your database. Efficient computation: The embedding is computed once, and then you simply slice it to the desired dimensions, making it highly efficient. Voyage AI, in particular, leverages MRL by default across its models, including voyage-3-large and the latest voyage-3.5, enabling scalable embeddings with one model and multiple dimensions. This allows you to dynamically choose between space/latency and quality at query time, leading to efficient retrieval with minimal accuracy loss. Voyage AI's dual approach: MRL and quantization for ultimate efficiency Voyage AI models maximize efficiency by combining MRL and quantization. MRL enables flexible embeddings by allowing you to select the optimal vector length (for instance, using 512 instead of 2048 dimensions), resulting in significant reductions in size and computational overhead with minimal accuracy loss. Quantization further compresses these vectors by reducing their bit precision, which cuts storage needs and speeds up similarity search operations. This synergy allows you to choose embeddings tailored to your application's requirements: a voyage-3-large embedding can be used as a compact 512-dimensional floating-point vector (leveraging MRL) or as a full 2048-dimensional 8-bit integer vector (via quantization). The dual approach empowers you to balance accuracy, storage, and performance, ensuring highly efficient, flexible embeddings for your workload. As a result, Voyage AI models deliver faster inferences and help reduce infrastructure costs when powering applications with MongoDB Atlas Vector Search. Head over to the MongoDB AI Learning Hub to learn how to build and deploy AI applications with MongoDB.

July 14, 2025
Developer Blog

Don’t Just Build Agents, Build Memory-Augmented AI Agents

Insight Breakdown: This piece aims to reveal that, regardless of architectural approach (whether Anthropic's multi-agent coordination or Cognition's single-threaded consolidation), sophisticated memory management emerges as the fundamental determinant of agent reliability, believability, and capability. It marks the evolution from stateless AI applications toward truly intelligent, memory-augmented systems that learn and adapt over time. AI agents are intelligent computational systems that can perceive their environment, make informed decisions, use tools, and, in some cases, maintain persistent memory across interactions, evolving beyond stateless chatbots toward autonomous action. Multi-agent systems coordinate multiple specialized agents to tackle complex tasks, like a research team where different agents handle searching, fact-checking, citations, and research synthesis. Recently, two major players in the AI space released different perspectives on how to build these systems. Anthropic released an insightful piece highlighting their learnings on building multi-agent systems for deep research use cases. Cognition also released a post titled "Don't Build Multi-Agents," which appears to contradict Anthropic's approach directly. Two things stand out: Both pieces are right Yes, this sounds contradictory, but working with customers building agents of all scales and sizes in production, we find that both the use case and the application mode, in particular, are key factors to consider when determining how to architect your agent(s). Anthropic's multi-agent approach makes sense for deep research scenarios where sustained, comprehensive analysis across multiple domains over extended periods is required. Cognition's single-agent approach is optimal for conversational agents or coding tasks where consistency and coherent decision-making are paramount. The application mode (whether research assistant, conversational agent, or coding assistant) fundamentally shapes the optimal memory architecture. Anthropic also highlights this point when discussing the downsides of multi-agent architecture. For instance, most coding tasks involve fewer truly parallelizable tasks than research, and LLM agents are not yet great at coordinating and delegating to other agents in real time. (Anthropic, Building Multi-Agent Research System) Both pieces are saying the same thing Memory is the foundational challenge that determines agent reliability, believability, and capability. Anthropic emphasizes sophisticated memory management techniques (compression, external storage, context handoffs) for multi-agent coordination. Cognition emphasizes context engineering and continuous memory flow to prevent the fragmentation that destroys agent reliability. Both teams arrived at the same core insight: agents fail without robust memory management. Anthropic chose to solve memory distribution across multiple agents, while Cognition chose to solve memory consolidation within single agents. The key takeaway from both pieces for AI Engineers, or anyone developing an agentic platform, is to not just build agents, but to build memory-augmented AI agents. With that out of the way, the rest of this piece will provide you with the essential insights from both pieces that we think are important, and point to the memory management principles and design patterns we've observed among our customers building agents.
The key insights If you are building your agentic platform from scratch, you can extract much value from Anthropic's approach to building multi-agent systems, particularly their sophisticated memory management principles, which are essential for effective agentic systems. Their implementation reveals critical design considerations, including techniques to overcome context window limitations through compression, function calling, and storage functions that enable sustained reasoning across extended multi-agent interactions: foundational elements that any serious agentic platform must address from the architecture phase. Key insights: Agents are overthinkers Multi-agent systems trade efficiency for capability Systematic agent observation reveals failure patterns Context windows remain insufficient for extended sessions Context compression enables distributed memory management Let's go a bit deeper into how these insights translate into practical implementation strategies. Agents are overthinkers Anthropic researchers mentioned using explicit guidelines to steer agents into allocating the right amount of resources (tool calls, sub-agent creation, etc.), or else they tend to overengineer solutions. Without proper constraints, the agents would spawn excessive subagents for simple queries, conduct endless searches for nonexistent information, and apply complex multi-step processes to tasks requiring straightforward responses. Explicit guidance for agent behavior isn't entirely new; system prompts and instructions are typical parameters in most agent frameworks. However, the key insight here goes deeper than traditional prompting approaches. When agents are given access to resources such as data, tools, and the ability to create sub-agents, there needs to be explicit, unambiguous direction on how these resources are expected to be leveraged to address specific tasks. This goes beyond system prompts and instructions into resource allocation guidance, operational constraints, and decision-making boundaries that prevent agents from overengineering solutions or misusing available capabilities. Take, for example, the OpenAI Agents SDK, which exposes several parameters that describe the expected behavior of resources to the agent. The handoff_description argument specifies how a subagent should be leveraged in a multi-agent system built with the SDK, and the tool_use_behavior argument describes how a tool should be used, as the name suggests (a hedged sketch of these parameters appears a little further below). The key takeaway for AI Engineers is that multi-agent system implementation requires an extensive thinking process that involves what tools the agents are expected to leverage, the subagents in the system, and how resource utilization is communicated to the calling agent in a multi-agent system. When implementing resource allocation constraints for your agents, consider that traditional approaches of managing multiple specialized databases (vector DB for embeddings, graph DB for relationships, relational DB for structured data) compound the complexity problem and introduce tech stack sprawl, an anti-pattern to rapid AI innovation. Multi-agent systems trade efficiency for capability While multi-agent architectures can utilize more tokens and parallel processing for complex tasks, Anthropic found operational costs significantly higher due to coordination overhead, context management, and the computational expense of maintaining a coherent state across multiple agents.
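Picking up the OpenAI Agents SDK parameters mentioned above, here is a hedged sketch of how handoff_description and tool_use_behavior communicate resource expectations to the calling agent; the agent names, instructions, and constraints are purely illustrative.

```python
from agents import Agent

# A specialist sub-agent; handoff_description tells the coordinating agent
# when this specialist is worth invoking, which helps curb overengineering.
search_agent = Agent(
    name="Literature Search Agent",
    handoff_description="Use only for multi-source research questions; never for simple factual lookups.",
    instructions="Run at most three focused searches, then return a short, sourced summary.",
    tool_use_behavior="run_llm_again",  # feed tool results back to the model (the SDK default)
)

coordinator = Agent(
    name="Research Coordinator",
    instructions="Answer directly when you can; hand off only when the task genuinely needs deep research.",
    handoffs=[search_agent],
)
```

With guardrails like these in place, the cost side of the trade-off is still worth examining closely.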
In some cases, two heads are better than one, but they are also expensive within multi-agent systems. One thing we note here is that the use case used in Anthropic's multi-agent system is deep research. This use case requires extensive exploration of resources, including heavily worded research papers, sites, and documentation, to accumulate enough information to formulate the result of this use case (which is typically a 2000+ word essay on the user’s starting prompt). In other use cases, such as automated workflow with agents representing processes within the workflow, there might not be as much token consumption, especially if the process encapsulates deterministic steps such as database reads and write operations, and its output is execution results that are sentences or short summaries. The coordination overhead challenge becomes particularly acute when agents need to share state across different storage systems. Rather than managing complex data synchronization between specialized databases, MongoDB's native ACID compliance ensures that multi-agent handoffs maintain data integrity without external coordination mechanisms. This unified approach reduces both the computational overhead of distributed state management and the engineering complexity of maintaining consistency across multiple storage systems. Context compression enables distributed memory management Beyond reducing inference costs, compression techniques allow multi-agent systems to maintain shared context across distributed agents. Anthropic's approach involves summarizing completed work phases and storing essential information in external memory before agents transition to new tasks. This, coupled with the insight that Context windows remain insufficient for extended sessions, points to the fact that prompt compression or compaction techniques are still relevant and useful in a world where LLMs have extensive context windows. Even with a 200K token (approximately 150,000 words) capacity, Anthropic’s agents in multi-round conversations require sophisticated context management strategies, including compression, external memory offloading, and spawning fresh agents when limits are reached. We previously partnered with Andrew Ng and DeepLearning AI on a course on prompt compression techniques and retrieval-augmented generation (RAG) optimization. Systematic agent observation reveals failure patterns Systematic agent observation represents one of Anthropic's most practical insights. Essentially, rather than relying on guesswork (or vibes), the team built detailed simulations using identical production prompts and tools and then systematically observed step-by-step execution to identify specific failure modes. This phase in an agentic system has an extensive operational cost. From our perspective, working with customers building agents in production, this methodology addresses a critical gap most teams face: understanding how your agents actually behave versus how you think they should behave . Anthropic's approach immediately revealed concrete failure patterns that many of us have encountered but struggled to diagnose systematically. Their observations uncovered agents overthinking simple tasks, like we mentioned earlier, using verbose search queries that reduced effectiveness, and selecting inappropriate tools for specific contexts. 
As they note in their piece: " This immediately revealed failure modes: agents continuing when they already had sufficient results, using overly verbose search queries, or selecting incorrect tools. Effective prompting relies on developing an accurate mental model of the agent. " The key insight here is moving beyond trial-and-error prompt engineering toward purposeful debugging . Instead of making assumptions about what should work, Anthropic demonstrates the value of systematic behavioral observation to identify the root causes of poor performance. This enables targeted prompt improvements based on actual evidence rather than intuition. We find that gathering, tracking, and storing agent process memory serves a dual critical purpose: not only is it vital for agent context and task performance, but it also provides engineers with the essential data needed to evolve and maintain agentic systems over time. Agent memory and behavioral logging remain the most reliable method for understanding system behavior patterns, debugging failures, and optimizing performance, regardless of whether you implement a single comprehensive agent or a system of specialized subagents collaborating to solve problems. MongoDB's flexible document model naturally accommodates the diverse logging requirements for both operational memory and engineering observability within a single, queryable system. One key piece that would be interesting to know from the Anthropic research team is what evaluation metrics they use. We’ve spoken extensively about evaluating LLMs in RAG pipelines, but what new agentic system evaluation metrics are developers working towards? We are answering these questions ourselves and have partnered with Galileo, a key player in the AI Stack, whose focus is purely on evaluating RAG and Agentic applications and making these systems reliable for production. Our learning will be shared in this upcoming webinar , taking place on July 17, 2025. However, for anyone building agentic systems, this represents a shift in development methodology—building agents requires building the infrastructure to understand them, and sandbox environments might become a key component of the evaluation and observability stack for Agents. Advanced implementation patterns Beyond the aforementioned core insights, Anthropic's research reveals several advanced patterns worth examining: The Anthropic piece hints at the implementation of advanced retrieval mechanisms that go beyond vector-based similarity between query vectors and stored information. Their multi-agent architecture enables sub-agents to call tools (an approach also seen in MemGPT ) to store their work in external systems, then pass lightweight references—presumably unique identification numbers of summarized memory components—back to the coordinator. We generally emphasize the importance of the multi-model retrieval approach to our customers and developers, where hybrid approaches combine multiple retrieval methods—using vector search to understand intent while simultaneously performing text search for specific product details. MongoDB's native support for vector similarity search and traditional indexing within a single system eliminates the need for complex reference management across multiple databases, simplifying the coordination mechanisms that Anthropic's multi-agent architecture requires. 
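A minimal sketch of that offload-and-reference pattern with PyMongo might look like the following. The database and collection names, document fields, and helper functions are assumptions for illustration; they are not Anthropic's actual implementation.

```python
from datetime import datetime, timezone
from bson import ObjectId
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster.mongodb.net")  # placeholder URI
memory_nodes = client["agent_memory"]["memory_nodes"]  # assumed shared memory collection

def offload_phase(agent_id: str, full_output: str, summary: str) -> str:
    """Persist a completed work phase and return a lightweight reference for the coordinator."""
    doc = {
        "agent_id": agent_id,
        "summary": summary,          # compressed form the coordinator keeps in its context window
        "full_output": full_output,  # complete work product, fetched only when actually needed
        "created_at": datetime.now(timezone.utc),
    }
    return str(memory_nodes.insert_one(doc).inserted_id)

def rehydrate(reference_id: str) -> dict:
    """Fetch the full work product behind a reference passed between agents."""
    return memory_nodes.find_one({"_id": ObjectId(reference_id)})

# The sub-agent stores its transcript and hands back only a short summary plus an ID.
ref = offload_phase(
    agent_id="search_subagent_1",
    full_output="...long tool-call transcript and source excerpts...",
    summary="Identified three relevant sources on the research question.",
)
```

Because the summary and the full output live in the same document, the same collection can later be indexed for the hybrid text-plus-vector retrieval described above.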
The Anthropic team implements continuity in the agent execution process by establishing clear boundaries between task completion and summarizing the current phase before moving to the next task. This creates a scalable system where memory constraints don't bottleneck the research process, allowing for truly deep and comprehensive analysis that spans beyond what any single context window could accommodate. In a multi-agent pipeline, each sub-agent produces partial results—intermediate summaries, tool outputs, extracted facts—and then hands them off into a shared “memory” database. Downstream agents will then read those entries, append their analyses, and write updated records back. Because these handoffs happen in parallel, you must ensure that one agent’s commit doesn’t overwrite another’s work or that a reader doesn’t pick up a half-written summary. Without atomic transactions and isolation guarantees, you risk: Lost updates , where two agents load the same document, independently modify it, and then write back, silently discarding one agent’s changes. Dirty or non-repeatable reads , where an agent reads another’s uncommitted or rolled-back write, leading to decisions based on phantom data. To coordinate these handoffs purely in application code would force you to build locking layers or distributed consensus, quickly becoming a brittle, error-prone web of external orchestrators. Instead, you want your database to provide those guarantees natively so that each read-modify-write cycle appears to execute in isolation and either fully succeeds or fully rolls back. MongoDB's ACID compliance becomes crucial here, ensuring that these boundary transitions maintain data integrity across multi-agent operations without requiring external coordination mechanisms that could introduce failure points. Application mode is crucial when discussing memory implementation . In Anthropic's case, the application functions as a research assistant, while in other implementations, like Cognition's approach, the application mode is conversational. This distinction significantly influences how agents operate and manage memory based on their specific application contexts. Through our internal work and customer engagements, we extend this insight to suggest that application mode affects not only agent architecture choices but also the distinct memory types used in the architecture. AI agents need augmented memory Anthropic’s research makes one thing abundantly clear: context window is not all you need. This extends to the key point that memory and agent engineering are two sides of the same coin. Reliable, believable, and truly capable agents depend on robust, persistent memory systems that can store, retrieve, and update knowledge over long, complex workflows. As the AI ecosystem continues to innovate on memory mechanisms, mastering sophisticated context and memory management approaches will be the key differentiator for the next generation of successful agentic applications. Looking ahead, we see “Memory Engineering” or “Memory Management” emerge as a key specialization within AI Engineering, focused on building the foundational infrastructure that lets agents remember, reason, and collaborate at scale. For hands-on guidance on memory management, check out our webinar on YouTube, which covers essential concepts and proven techniques for building memory-augmented agents. Head over to the MongoDB AI Learning Hub to learn how to build and deploy AI applications with MongoDB.
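As a hands-on coda to the handoff discussion above, here is a minimal sketch of a multi-document handoff wrapped in a MongoDB transaction with PyMongo. The collection names and document shape are assumptions for illustration, and transactions require a replica set or an Atlas cluster.

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster.mongodb.net")  # placeholder URI
db = client["agent_memory"]
memory_nodes = db["memory_nodes"]   # shared partial results written by sub-agents
plans = db["plans"]                 # coordinator's view of which phase each node is in

def hand_off(node_id, agent_id: str, phase_summary: str) -> None:
    """Append one agent's phase summary and advance the plan as a single atomic step."""
    with client.start_session() as session:
        with session.start_transaction():
            memory_nodes.update_one(
                {"_id": node_id},
                {"$push": {"phase_summaries": {"agent": agent_id, "summary": phase_summary}},
                 "$inc": {"revision": 1}},
                session=session,
            )
            plans.update_one(
                {"node_id": node_id},
                {"$set": {"status": "ready_for_next_agent", "last_writer": agent_id}},
                session=session,
            )
            # Leaving the block commits both writes together; an exception aborts both,
            # so a concurrent reader never observes a half-completed handoff.
```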

July 9, 2025
Developer Blog

Real-Time Threat Detection With MongoDB & PuppyGraph

Security operations teams face an increasingly complex environment. Cloud-native applications, identity sprawl, and continuous infrastructure changes generate a flood of logs and events. From API calls in AWS to lateral movement between virtual machines, the volume of telemetry is enormous—and it’s growing. The challenge isn’t just scale. It’s structure. Traditional security tooling often looks at events in isolation, relying on static rules or dashboards to highlight anomalies. But real attacks unfold as chains of related actions: A user assumes a role, launches a resource, accesses data, and then pivots again. These relationships are hard to capture with flat queries or disconnected logs. That’s where graph analytics comes in. By modeling your data as a network of users, sessions, identities, and events, you can trace how threats emerge and evolve. And with PuppyGraph, you don’t need a separate graph database or batch pipelines to get there. In this post, we’ll show how to combine MongoDB and PuppyGraph to analyze AWS CloudTrail data as a graph—without moving or duplicating data. You’ll see how to uncover privilege escalation chains, map user behavior across sessions, and detect suspicious access patterns in real time. Why MongoDB for cybersecurity data MongoDB is a popular choice for managing security telemetry. Its document-based model is ideal for ingesting unstructured and semi-structured logs like those generated by AWS CloudTrail, GuardDuty, or Kubernetes audit logging. Events are stored as flexible JSON documents, which evolve naturally as logging formats change. This flexibility matters in security, where schemas can shift as providers update APIs or teams add new context to events. MongoDB handles these changes without breaking pipelines or requiring schema migrations. It also supports high-throughput ingestion and horizontal scaling, making it well-suited for operational telemetry. Many security products and SIEM backends already support MongoDB as a destination for real-time event streams. That makes it a natural foundation for graph-based security analytics: The data is already there—rich, semi-structured, and continuously updated. Why graph analytics for threat detection Modern security incidents rarely unfold as isolated events. Attackers don’t just trip a single rule—they navigate through systems, identities, and resources, often blending in with legitimate activity. Understanding these behaviors means connecting the dots across multiple entities and actions. That’s precisely what graph analytics excels at. By modeling users, sessions, events, and assets as interconnected nodes and edges, analysts can trace how activity flows through a system. This structure makes it easy to ask questions that involve multiple hops or indirect relationships—something traditional queries often struggle to express. For example, imagine you’re investigating activity tied to a specific AWS account. You might start by counting how many sessions are associated with that account. Then, you might break those sessions down by whether they were authenticated using MFA. If some weren’t, the next question becomes: What resources were accessed during those sessions without MFA? This kind of multi-step investigation is where graph queries shine. Instead of scanning raw logs or filtering one table at a time, you can traverse the entire path from account to identity to session to event to resource, all in a single query.
You can also group results by attributes like resource type to identify which services were most affected. And when needed, you can go beyond metrics and pivot to visualization, mapping out full access paths to see how a specific user or session interacted with sensitive infrastructure. This helps surface lateral movement, track privilege escalation, and uncover patterns that static alerts might miss. Graph analytics doesn’t replace your existing detection rules; it complements them by revealing the structure behind security activity. It turns complex event relationships into something you can query directly, explore interactively, and act on with confidence. Query MongoDB data as a graph without ETL MongoDB is a popular choice for storing security event data, especially when working with logs that don’t always follow a fixed structure. Services like AWS CloudTrail produce large volumes of JSON-based records with fields that can differ across events. MongoDB’s flexible schema makes it easy to ingest and query that data as it evolves. PuppyGraph builds on this foundation by introducing graph analytics—without requiring any data movement. Through the MongoDB Atlas SQL Interface , PuppyGraph can connect directly to your collections and treat them as relational tables. From there, you define a graph model by mapping key fields into nodes and relationships. Figure 1. Architecture of the integration of MongoDB and PuppyGraph. This makes it possible to explore questions that involve multiple entities and steps, such as tracing how a session relates to an identity or which resources were accessed without MFA. The graph itself is virtual. There’s no ETL process or data duplication. Queries run in real time against the data already stored in MongoDB. While PuppyGraph works with tabular structures exposed through the SQL interface, many security logs already follow a relatively flat pattern: consistent fields like account IDs, event names, timestamps, and resource types. That makes it straightforward to build graphs that reflect how accounts, sessions, events, and resources are linked. By layering graph capabilities on top of MongoDB, teams can ask more connected questions of their security data, without changing their storage strategy or duplicating infrastructure. Investigating CloudTrail activity using graph queries To demonstrate how graph analytics can enhance security investigations, we’ll explore a real-world dataset of AWS CloudTrail logs. This dataset originates from flaws.cloud , a security training environment developed by Scott Piper. The dataset comprises anonymized CloudTrail logs collected over 3.5 years, capturing a wide range of simulated attack scenarios within a controlled AWS environment. It includes over 1.9 million events, featuring interactions from thousands of unique IP addresses and user agents. The logs encompass various AWS API calls, providing a comprehensive view of potential security events and misconfigurations. For our demonstration, we imported a subset of approximately 100,000 events into MongoDB Atlas. By importing this dataset into MongoDB Atlas and applying PuppyGraph’s graph analytics capabilities, we can model and analyze complex relationships between accounts, identities, sessions, events, and resources. Demo Let’s walk through the demo step by step! We have provided all the materials for this demo on GitHub . Please download the materials or clone the repository directly. 
If you’re new to integrating MongoDB Atlas with PuppyGraph, we recommend starting with the MongoDB Atlas + PuppyGraph Quickstart Demo to get familiar with the setup and core concepts. Prerequisites A MongoDB Atlas account (free tier is sufficient) Docker Python 3 Set up MongoDB Atlas Follow the MongoDB Atlas Getting Started guide to: Create a new cluster (free tier is fine). Add a database user. Configure IP access. Note your connection string for the MongoDB Python driver (you’ll need it shortly). Download and import CloudTrail logs Run the following commands to fetch and prepare the dataset: wget https://summitroute.com/downloads/flaws_cloudtrail_logs.tar mkdir -p ./raw_data tar -xvf flaws_cloudtrail_logs.tar --strip-components=1 -C ./raw_data gunzip ./raw_data/*.json.gz Create a virtual environment and install dependencies: # On some Linux distributions, install `python3-venv` first. sudo apt-get update sudo apt-get install python3-venv # Create a virtual environment, activate it, and install the necessary packages python -m venv venv source venv/bin/activate pip install ijson faker pandas pymongo Import the first chunk of CloudTrail data (replace the connection string with your Atlas URI): export MONGODB_CONNECTION_STRING="your_mongodb_connection_string" python import_data.py raw_data/flaws_cloudtrail00.json --database cloudtrail This creates a new cloudtrail database and loads the first chunk of data containing 100,000 structured events. Enable Atlas SQL interface and get JDBC URI To enable graph access: Create an Atlas SQL Federated Database instance. Ensure the schema is available (generate from sample, if needed). Copy the JDBC URI from the Atlas SQL interface. See PuppyGraph’s guide for setting up MongoDB Atlas SQL . Start PuppyGraph and upload the graph schema Start the PuppyGraph container: docker run -p 8081:8081 -p 8182:8182 -p 7687:7687 \ -e PUPPYGRAPH_PASSWORD=puppygraph123 \ -d --name puppy --rm --pull=always puppygraph/puppygraph:stable Log in to the web UI at http://localhost:8081 with: Username: puppygraph. Password: puppygraph123. Upload the schema: Open schema.json. Fill in your JDBC URI, username, and password. Upload via the Upload Graph Schema JSON section or run: curl -XPOST -H "content-type: application/json" \ --data-binary @./schema.json \ --user "puppygraph:puppygraph123" localhost:8081/schema Wait for the schema to upload and initialize (approximately five minutes). Figure 2: A graph visualization of the schema, which models the graph from relational data. Run graph queries to investigate security activity Once the graph is live, open the Query panel in PuppyGraph’s UI. Let's say we want to investigate the activity of a specific account. First, we count the number of sessions associated with the account. Cypher: MATCH (a:Account)-[:HasIdentity]->(i:Identity) -[:HasSession]->(s:Session) WHERE id(a) = "Account[811596193553]" RETURN count(s) Gremlin: g.V("Account[811596193553]") .out("HasIdentity").out("HasSession").count() Figure 3. Graph query in the PuppyGraph UI. Then, we want to see how many of these sessions are MFA-authenticated or not. Cypher: MATCH (a:Account)-[:HasIdentity]->(i:Identity) -[:HasSession]->(s:Session) WHERE id(a) = "Account[811596193553]" RETURN s.mfa_authenticated AS mfaStatus, count(s) AS count Gremlin: g.V("Account[811596193553]") .out("HasIdentity").out("HasSession") .groupCount().by("mfa_authenticated") Figure 4. Graph query results in the PuppyGraph UI. 
Next, we investigate those sessions that are not MFA authenticated and see what resources they accessed. Cypher: MATCH (a:Account)-[:HasIdentity]-> (i:Identity)-[:HasSession]-> (s:Session {mfa_authenticated: false}) -[:RecordsEvent]->(e:Event) -[:OperatesOn]->(r:Resource) WHERE id(a) = "Account[811596193553]" RETURN r.resource_type AS resourceType, count(r) AS count Gremlin: g.V("Account[811596193553]").out("HasIdentity") .out("HasSession") .has("mfa_authenticated", false) .out('RecordsEvent').out('OperatesOn') .groupCount().by("resource_type") Figure 5. PuppyGraph UI showing results that are not MFA authenticated. We show those access paths in a graph. Cypher: MATCH path = (a:Account)-[:HasIdentity]-> (i:Identity)-[:HasSession]-> (s:Session {mfa_authenticated: false}) -[:RecordsEvent]->(e:Event) -[:OperatesOn]->(r:Resource) WHERE id(a) = "Account[811596193553]" RETURN path Gremlin: g.V("Account[811596193553]").out("HasIdentity").out("HasSession").has("mfa_authenticated", false) .out('RecordsEvent').out('OperatesOn') .path() Figure 6. Graph visualization in PuppyGraph UI. Tear down the environment When you’re done: docker stop puppy Your MongoDB data will persist in Atlas, so you can revisit or expand the graph model at any time. Conclusion Security data is rich with relationships, between users, sessions, resources, and actions. Modeling these connections explicitly makes it easier to understand what’s happening in your environment, especially when investigating incidents or searching for hidden risks. By combining MongoDB Atlas and PuppyGraph, teams can analyze those relationships in real time without moving data or maintaining a separate graph database . MongoDB provides the flexibility and scalability to store complex, evolving security logs like AWS CloudTrail, while PuppyGraph adds a native graph layer for exploring that data as connected paths and patterns. In this post, we walked through how to import real-world audit logs, define a graph schema, and investigate access activity using graph queries. With just a few steps, you can transform a log collection into an interactive graph that reveals how activity flows across your cloud infrastructure. If you’re working with security data and want to explore graph analytics on MongoDB Atlas , try PuppyGraph’s free Developer Edition . It lets you query connected data, such as users, sessions, events, and resources, all without ETL or infrastructure changes.
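If you want to poke at the imported data directly before or after building the graph, a small PyMongo sanity check like the one below can help. The collection and field names are assumptions (inspect the cloudtrail database to see exactly what import_data.py created in your cluster), and flat queries like this complement, rather than replace, the graph traversals shown above.

```python
import os
from pymongo import MongoClient

client = MongoClient(os.environ["MONGODB_CONNECTION_STRING"])
db = client["cloudtrail"]  # database created during the import step

print("Collections:", db.list_collection_names())

events = db["events"]  # assumed collection name; adjust to match the import script's output
print("Event count:", events.estimated_document_count())

# Top API calls, assuming the standard CloudTrail eventName field survived the import.
pipeline = [
    {"$group": {"_id": "$eventName", "count": {"$sum": 1}}},
    {"$sort": {"count": -1}},
    {"$limit": 10},
]
for row in events.aggregate(pipeline):
    print(row["_id"], row["count"])
```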

July 7, 2025
Developer Blog

Natural-Language Agents: MongoDB Text-to-MQL + LangChain

The text-to-MQL capability available in the LangChain MongoDB package converts natural language into MongoDB Query Language, enabling applications to process queries like, "Show me movies from the 1990s with ratings above 8.0," and automatically generate the corresponding MongoDB operations. This guide demonstrates how to build production-ready applications that leverage text-to-MQL for conversational database interfaces, covering agent architectures, conversation memory, and reliable database interactions at scale. Understanding text-to-MQL: Beyond simple query translation Text-to-MQL shifts database interaction from manual query construction to natural language processing. Traditional database applications require developers to parse user intent, construct queries, handle validation, and format results. Text-to-MQL applications can accept natural language directly: # Traditional approach def get_top_movies_by_rating(min_rating, limit): return db.movies.aggregate([ {"$match": {"imdb.rating": {"$gte": min_rating}}}, {"$sort": {"imdb.rating": -1}}, {"$limit": limit} ]) # Text-to-MQL approach def process_natural_language_query(user_query): return agent.invoke({"messages": [("user", user_query)]}) This transformation enables natural language interfaces for complex database operations, making data access intuitive for end users while reducing development effort for database interaction logic. The MongoDB agent toolkit: Implementing text-to-MQL Install the agent toolkit and its companion packages: pip install langchain-mongodb langchain-openai langgraph The LangChain MongoDB agent_toolkit provides four core tools that work together to implement text-to-MQL functionality: from langchain_mongodb.agent_toolkit import MongoDBDatabase, MongoDBDatabaseToolkit db = MongoDBDatabase.from_connection_string(connection_string, database="sample_mflix") toolkit = MongoDBDatabaseToolkit(db=db, llm=llm) tools = toolkit.get_tools() Set up MongoDB Atlas with our sample movie dataset and start experimenting with text-to-MQL in minutes: get started with the Atlas Free Tier, load the sample MFlix dataset, and follow the full notebook demonstration. The text-to-MQL workflow When a user asks, "Which theaters are furthest west?" the text-to-MQL system follows this process:

| Step | Tool | What happens | Example |
| --- | --- | --- | --- |
| 1. Discovery | mongodb_list_collections | Agent identifies available data | Finds theaters, movies, and users collections |
| 2. Schema understanding | mongodb_schema | Agent examines relevant collection structure | Discovers location.geo field in theaters |
| 3. Query generation | LLM reasoning | Natural language converts to MongoDB syntax | Creates geospatial aggregation pipeline |
| 4. Validation | mongodb_query_checker | Agent verifies query correctness | Checks syntax and field references |
| 5. Execution | mongodb_query | Agent runs the validated query | Returns sorted theaters by longitude |

This workflow handles complex operations automatically—including geospatial queries, aggregations, and multi-collection operations—without requiring manual aggregation pipeline development. Building your first agent? Follow our step-by-step guide: Build Agents with LangGraph and MongoDB. Complex query examples Text-to-MQL handles sophisticated analytical queries: def demo_basic_queries(): queries = [ "List the top 5 movies with highest IMDb ratings", "Who are the top 10 most active commenters?", # ... additional queries for theaters, geographic analysis, director analytics ] for i, query in enumerate(queries): # Execute each text-to-MQL query in a separate conversation thread execute_graph_with_memory(f"demo_{i}", query) Query complexity examples: Temporal analysis: "Show me movie rating trends by decade for sci-fi films"—automatically filters by genre, groups by decade, and calculates statistical aggregations. Geographic intelligence: "Which states have the most theaters and what's their average capacity?"—discovers geographic fields, groups by state boundaries, and calculates regional statistics. Cross-collection analytics: "Find directors with at least 10 films who have the highest average ratings"—joins movie and director data, applies complex filtering and ranking logic. See these workflows in action Our interactive notebook demonstrates each step with live code examples you can run and modify. Explore the complete notebook in our Gen AI Showcase. Agent architecture patterns for text-to-MQL Two proven patterns address different text-to-MQL requirements based on your application's predictability needs. Pattern 1: ReAct agents for dynamic processing ReAct (Reasoning + Acting) agents provide flexible text-to-MQL processing where the optimal query strategy isn't predetermined: from langgraph.prebuilt import create_react_agent def create_flexible_text_to_mql_agent(): # Create ReAct agent with MongoDB tools and conversation memory checkpointer = MongoDBSaver(client) return create_react_agent(llm, toolkit.get_tools(), checkpointer=checkpointer) # Usage: Create agent and execute queries with conversation context agent = create_flexible_text_to_mql_agent() config = {"configurable": {"thread_id": "exploration_session"}} agent.invoke({"messages": [("user", "Find anomalies in user behavior patterns")]}, config) For more details, see how the MongoDBDatabaseToolkit can be used to develop ReAct-style agents. Pattern 2: Structured workflows for predictable operations For applications requiring consistent text-to-MQL behavior, implement deterministic workflows: def list_collections(state: MessagesState): # Call mongodb_list_collections tool to discover available data # Returns updated message state with collection list return {"messages": [call_msg, tool_response]} def generate_query(state: MessagesState): # Use LLM with MongoDB tools to convert natural language to MQL # Returns updated message state with generated query return {"messages": [llm_response]} # See notebook for complete node implementations def create_langgraph_agent_with_enhanced_memory(): summarizing_checkpointer = LLMSummarizingMongoDBSaver(client, llm) g = StateGraph(MessagesState) g.add_node("list_collections", list_collections) g.add_node("get_schema", schema_node) # ... add nodes for: generate_query, run_query, format_answer g.add_edge(START, "list_collections") g.add_edge("list_collections", "get_schema") # ...
connect remaining edges: get_schema → generate_query → run_query → format_answer → END return g.compile(checkpointer=summarizing_checkpointer) Choosing your agent pattern

| Considerations | ReAct agents | Structured workflows |
| --- | --- | --- |
| Exploratory analytics | ✅ Adapts to unpredictable queries by dynamically selecting and chaining appropriate tools at runtime | ❌ Too rigid for exploration, as a fixed workflow must be manually updated to support new "what-if" paths |
| Interactive dashboards | ✅ Flexible drill-down capabilities, enabling on-the-fly responses to any dashboard interaction | ❌ Fixed workflow is limiting, because a structured graph requires drill-down paths to be enumerated in advance |
| API endpoint optimization | ❌ Unpredictable response times, since ReAct's dynamic reasoning loops can lead to variable per-request latency | ✅ Consistent performance, as a structured agent runs the same sequence of steps |
| Customer-facing apps | ❌ Variable behavior, as ReAct may choose different tool paths for identical inputs | ✅ Predictable user experience, since a fixed workflow yields the same sequence and similar output the majority of the time |
| Automated systems | ❌ Hard to debug failures, as troubleshooting requires tracing through a dynamic chain of LLM decisions and tool calls | ✅ Clear failure isolation, where failures immediately point to the specific node that broke, speeding up diagnostics |

Conversational text-to-MQL: Maintaining query context Text-to-MQL's real power emerges in multi-turn conversations where users can build complex analytical workflows through natural dialogue. LangGraph's MongoDB checkpointing implementation preserves conversation context across interactions. LangGraph MongoDB checkpointer for stateful text-to-MQL Install the MongoDBSaver checkpointer with the following command: pip install -U langgraph-checkpoint-mongodb pymongo The MongoDBSaver checkpointer transforms text-to-MQL from isolated query translation into conversational analytics: from langgraph.checkpoint.mongodb import MongoDBSaver class LLMSummarizingMongoDBSaver(MongoDBSaver): def __init__(self, client, llm): super().__init__(client) self.llm = llm # ... initialize summary cache def put(self, config, checkpoint, metadata, new_versions): # Generate human-readable step summary using LLM step_summary = self.summarize_step(checkpoint) # Add summary to checkpoint metadata for debugging enhanced_metadata = metadata.copy() if metadata else {} enhanced_metadata['step_summary'] = step_summary # ... add timestamp and other metadata return super().put(config, checkpoint, enhanced_metadata, new_versions) def create_react_agent_with_enhanced_memory(): # Create ReAct agent with intelligent conversation memory summarizing_checkpointer = LLMSummarizingMongoDBSaver(client, llm) return create_react_agent(llm, toolkit.get_tools(), checkpointer=summarizing_checkpointer) Conversational workflows in practice The checkpointer enables sophisticated, multi-turn text-to-MQL conversations: def demo_conversation_memory(): thread_id = f"conversation_demo_{uuid.uuid4().hex[:8]}" conversation = [ "List the top 3 directors by movie count", "What was the movie count for the first director?", # ... additional contextual follow-up questions ] for query in conversation: # Execute each query in same thread to maintain conversation context execute_graph_with_memory(thread_id, query) What conversation memory enables: Contextual follow-ups: Users can ask "What about comedies?" after querying movie genres.
Progressive refinement: Each query builds on previous results for natural drill-down analysis. Session persistence: Conversations survive application restarts and resume exactly where they left off. Multi-user isolation: Different users maintain separate conversation threads. This creates readable execution logs for debugging: Step 1 [14:23:45] User asks about movie trends Step 2 [14:23:46] Text-to-MQL discovers movies collection Step 3 [14:23:47] Generated aggregation pipeline Step 4 [14:23:48] Query validation successful Step 5 [14:23:49] Returned 15 trend results Production implementation guide Moving text-to-MQL applications from development to production requires addressing performance, monitoring, testing, and integration concerns. Performance optimization Text-to-MQL applications face unique challenges: LLM API calls are expensive while generated queries can be inefficient. Implement comprehensive optimization: # Optimization ideas for production text-to-MQL systems: class OptimizedTextToMQLAgent: def __init__(self): # Cache frequently requested queries and schema information self.query_cache = {} self.schema_cache = {} def process_query(self, user_query): # Check cache for similar queries to reduce LLM API calls if cached_result := self.check_query_cache(user_query): return cached_result # Generate new query and cache result mql_result = self.agent.invoke({"messages": [("user", user_query)]}) # ... cache result for future use return mql_result def optimize_generated_mql(query, collection_name): # Add performance hints and limits to agent-generated queries # Example: Add index hints for known collections if collection_name == 'movies' and '$sort' in str(query): query.append({'$hint': {'imdb.rating': -1}}) # Always limit result sets to prevent runaway queries if not any('$limit' in stage for stage in query): query.append({'$limit': 1000}) return query Optimization strategies: Query caching: Cache based on semantic similarity rather than exact string matching. Index hints: Map common query patterns to existing indexes for better performance. Result limits: Always add limits to prevent runaway queries from returning entire collections. Schema caching: Cache collection schemas to reduce repeated discovery operations. Read more about how to implement caching using the MongoDBCache module. Monitoring and testing Unlike traditional database applications, text-to-MQL systems require monitoring conversation state and agent decision-making: def memory_system_stats(): # Monitor text-to-MQL conversation system health db_checkpoints = client['checkpointing_db'] total_checkpoints = checkpoints.count_documents({}) total_threads = len(checkpoints.distinct('thread_id')) # ... additional metrics like average session length, memory usage return {"checkpoints": total_checkpoints, "threads": total_threads} def test_enhanced_summarization(): # Test agent with variety of query patterns test_queries = [ "How many movies are in the database?", "Find the average rating of all movies", # ... 
additional test queries covering different analytical patterns ] # Execute all queries in same thread to test conversation flow for query in test_queries: execute_graph_with_memory(thread_id, query) # Inspect results to verify LLM summarization quality inspect_thread_history(thread_id) def compare_agents_with_memory(query: str): # Compare ReAct vs structured workflow performance # Execute same query with both agent types execute_react_with_memory(react_thread, query) execute_graph_with_memory(graph_thread, query) return {"react_thread": react_thread, "graph_thread": graph_thread} Essential monitoring Monitoring is crucial for maintaining the reliability and performance of your text-to-MQL agents. Start by tracking conversation thread growth and average session length to understand usage patterns and memory demands over time. Keep a close eye on query success rates, response times, and large language model (LLM) API usage to identify potential performance bottlenecks. Following MongoDB monitoring best practices can help you set up robust observability across your stack. Additionally, set alerts for any degradation in key text-to-MQL performance metrics, such as increased latency or failed query generation. Finally, implement automated cleanup policies to archive or delete stale conversation threads, ensuring that your system remains performant and storage-efficient. Testing strategies Thorough testing ensures your agents produce consistent and accurate results under real-world conditions. Begin by testing semantically similar natural language queries to validate that they generate equivalent MQL results. It's also helpful to regularly compare the behavior and output of different agent execution modes—such as ReAct-style agents versus structured workflow agents—to benchmark performance and consistency. Establish baseline metrics for success rates and response times so you can track regressions or improvements over time. Don’t forget to simulate concurrent conversations and introduce varying query complexity in your tests to evaluate how your system handles real-time load and edge cases. Integration patterns Text-to-MQL agents can be integrated into applications in several ways, depending on your architecture and latency requirements. One common pattern is exposing agent functionality via RESTful endpoints or WebSocket streams, allowing client apps to send natural language queries and receive real-time responses. Alternatively, you can deploy agents as dedicated microservices, making it easier to scale, monitor, and update them independently from the rest of your system. For deeper integration, agents can be embedded directly into existing data access layers, enabling seamless transitions between traditional query logic and natural language interfaces without major architectural changes. Security and access control To safely run text-to-MQL agents in production, robust security practices must be in place. Start by implementing role-based query restrictions so that different agents or user groups have tailored access to specific data. Logging all agent-generated queries—along with the user identities and corresponding natural language inputs—creates an audit trail for traceability and debugging. To prevent runaway queries or abuse, enforce limits on query complexity and result set size. Lastly, use connection pooling strategies that can scale with agent activity while maintaining session isolation, ensuring responsiveness and security across high-traffic workloads. 
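One way to implement the audit-trail recommendation above is a thin wrapper around agent invocation that records the user identity, the natural language input, and the outcome in a MongoDB collection. The database and collection names and the helper below are illustrative assumptions, not part of the LangChain MongoDB toolkit.

```python
from datetime import datetime, timezone
from pymongo import MongoClient

mongo = MongoClient("mongodb+srv://<user>:<password>@cluster.mongodb.net")  # placeholder URI
audit_log = mongo["observability"]["agent_query_audit"]  # assumed audit collection

def invoke_with_audit(agent, user_id: str, thread_id: str, question: str):
    """Run a text-to-MQL agent and persist an audit record of the interaction."""
    config = {"configurable": {"thread_id": thread_id}}
    started = datetime.now(timezone.utc)
    status, result = "ok", None
    try:
        result = agent.invoke({"messages": [("user", question)]}, config)
    except Exception as exc:  # keep a record of failed generations as well
        status, result = "error", {"error": str(exc)}
    audit_log.insert_one({
        "user_id": user_id,
        "thread_id": thread_id,
        "question": question,
        "status": status,
        "latency_ms": (datetime.now(timezone.utc) - started).total_seconds() * 1000,
        "timestamp": started,
    })
    return result
```

Because audit records are ordinary documents, the same collection can also feed the monitoring metrics discussed earlier (success rates, latency, per-user volume) through simple aggregations.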
Production Deployment Checklist Before deploying your text-to-MQL agent system to production, it’s important to implement safeguards and best practices that ensure reliability, security, and maintainability. Start by setting appropriate resource limits, such as timeouts for both LLM API calls and MongoDB queries, to prevent long-running or stalled requests from impacting performance. Incorporate robust error handling to ensure the system can gracefully degrade or return fallback messages when query generation or execution fails. To protect your system from abuse or unintentional overuse, enforce rate limiting with per-user query limits. Maintain clear environment separation by using different agents and database connections for development, staging, and production environments, reducing the risk of cross-environment interference. Adopt configuration management practices by externalizing critical parameters such as the LLM model being used, timeout thresholds, and database settings—making it easier to update or tune the system without redeploying code. Make sure your monitoring integration includes text-to-MQL-specific metrics, tracked alongside broader application health metrics. Finally, establish a robust backup strategy that ensures conversation history and agent memory are backed up according to your organization’s data retention and recovery policies. Together, these practices create a resilient foundation for deploying intelligent agents at scale. Atlas database features supporting agents Atlas offers powerful core database features that make it a strong foundation for LangChain text-to-MQL agents. While these features aren’t specific to text-to-MQL, they provide the performance, scalability, and flexibility needed to support production-grade agentic systems. 3-in-one backend architecture Atlas can serve as a unified backend that fulfills three critical roles in an agentic stack by acting as the: Primary data store , housing your queryable application collections—such as movies, users, or analytics. Vector store for embedding-based semantic search if you’re leveraging vector search capabilities. Memory store , enabling conversation history persistence and agent checkpointing across user interactions. This 3-in-one architecture reduces the need for external services and simplifies your overall infrastructure. Single connection benefits By using a single Atlas cluster to manage your data, vectors, and memory, you streamline the development and deployment process. This unified approach minimizes configuration complexity and makes it easier to maintain your system. It also provides performance advantages through data locality—allowing your agent to query related information efficiently without needing to switch between services or endpoints. Logical database organization To keep your agent system organized and maintainable, you can logically separate storage needs within your Atlas cluster. Application data can reside in collections like movies, users, or analytics. Agent-related infrastructure—such as conversation state and memory—can be stored in a dedicated checkpointing_db . If your agent uses semantic search, vector embeddings can be stored in purpose-built vector_search collections. This structure supports clear boundaries between functionality while maintaining the simplicity of a single database backend. 
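A minimal sketch of that 3-in-one organization from application code might look like the following, reusing the names mentioned above (sample_mflix, checkpointing_db, vector_search); the connection URI and collection names are placeholders to adapt.

```python
from pymongo import MongoClient
from langgraph.checkpoint.mongodb import MongoDBSaver

# One client, one Atlas cluster, three logical roles.
client = MongoClient("mongodb+srv://<user>:<password>@cluster.mongodb.net")  # placeholder URI

app_db = client["sample_mflix"]                      # primary application data (movies, users, comments)
checkpointer = MongoDBSaver(client)                  # agent memory and conversation state (checkpointing_db)
embeddings = client["vector_search"]["embeddings"]   # optional vector store for semantic search

# Application queries and agent memory share a single connection pool and security model.
top_movie = app_db["movies"].find_one(sort=[("imdb.rating", -1)])
threads = client["checkpointing_db"]["checkpoints"].distinct("thread_id")  # assumed checkpoint collection
```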
Future directions for text-to-MQL applications Text-to-MQL represents the foundation for several emerging application patterns: Multi-modal data interfaces: Applications that combine text-to-MQL with vector search and graph queries, enabling users to ask questions that span structured data, semantic search, and relationship analysis within single conversations. Autonomous data exploration: Text-to-MQL agents that can suggest follow-up questions and identify interesting patterns in data, guiding users through exploratory analysis workflows. Intelligent query optimization: Text-to-MQL systems that learn from usage patterns to automatically optimize query generation, suggest more efficient question phrasings, and recommend database schema improvements. Collaborative analytics: Multi-user text-to-MQL environments where teams can share conversation contexts and build on each other's analytical discoveries through natural language interfaces. These trends point toward a future where natural language becomes a powerful, flexible layer for interacting with data across every stage of the analytics and application lifecycle. Conclusion The text-to-MQL capabilities available in the LangChain MongoDB package provide the foundation for building data-driven applications with conversational interfaces. The architectural patterns shown here—ReAct agents for flexibility and structured workflows for predictability—address different technical requirements while sharing common patterns for memory management and error handling. When choosing between these patterns, consider your specific requirements: ReAct agents work well for flexible data exploration and dynamic query generation, while structured workflows provide predictable performance and easier debugging. The memory systems and production patterns demonstrated here help ensure these agents can operate reliably at scale. These implementation patterns show how to move beyond basic database APIs toward more natural, conversational data interfaces. The LangChain text-to-MQL toolkit provides the building blocks, and these patterns provide the architectural guidance for building reliable, production-ready systems. The future of application development increasingly lies in natural language interfaces for data. Text-to-MQL provides the technical foundation to build that future today, enabling applications that understand what users want to know and automatically translate those questions into precise database operations. Start building conversational database apps today The LangChain MongoDB text-to-MQL package gives you everything needed to build production-ready applications with natural language database interfaces. What's next? Get hands-on: Load the MFlix sample dataset and run your first text-to-MQL queries. Go deeper: Implement conversation memory and production patterns from our notebook . Get support: Join thousands of developers building AI-powered apps with MongoDB. Join the MongoDB Developer Community to learn about MongoDB events, discuss relevant topics, and meet other community members from around the world. Visit the MongoDB AI Learning Hub to learn more about how MongoDB can support your AI use case. Get implementation support through the MongoDB support portal. The complete implementation demonstrating these text-to-MQL patterns is available in our companion notebook, which includes both agent architectures with conversation memory and production-grade debugging capabilities specifically designed for natural language database interfaces.

June 30, 2025
Developer Blog

Dynamic Term-Based Boosting in MongoDB Atlas Search

Search relevance is the bedrock of any modern user experience. While MongoDB Atlas Search offers a fantastic out-of-the-box relevance model with BM25, its standard approach treats all search terms with a uniform level of importance. For applications that demand precision, this isn't enough. What if you need to boost content from an expert author? Or prioritize a trending topic for the next 48 hours? Or ensure a specific promotional product always appears at the top? Relying on query-time boosting alone can lead to complex, brittle queries that are a nightmare to maintain. There's a more elegant solution. Enter the embedded scoring pattern—an advanced technique in Atlas Search that allows you to embed term-level boosting logic directly within your documents. It's a powerful way to make your relevance scoring data-driven, adaptable, and incredibly precise without ever changing your query structure. Why you need embedded scoring: From uniform to granular The standard approach to boosting is like using a single volume knob for an entire orchestra. The embedded scoring pattern, on the other hand, gives you a mixing board with a dedicated slider for every single instrument. This enables application owners to seamlessly build business-focused use cases, such as: Prioritizing authority: Elevate content from verified experts or high-authority authors. Boosting trends: Dynamically increase the rank of time-sensitive or trending topics. Elevating promotions: Ensure seasonal or promotional products get the visibility they need. By encoding scoring logic alongside your content, you solve the "one-size-fits-all" limitation and give yourself unparalleled control. Under the hood: Building the embedded scoring pattern Let's get practical. Implementing this pattern involves two key steps: designing the index and structuring your documents. 1. The index design: Defining your boosts First, you need to tell Atlas Search how to understand your custom boosts. You do this by defining a field with the embeddedDocuments type in your search index. This creates a dedicated space for your term-boost pairs. { "mappings": { "dynamic": true, "fields": { "indexed_terms": { "type": "embeddedDocuments", "dynamic": false, "fields": { "term": { "type": "string" }, "boost": { "type": "number" } } } } } } This index definition creates a special array, indexed_terms , ready to hold our custom scoring rules. 2. The document structure: Encoding relevance With the index in place, you can now add the indexed_terms array to your documents. Each object in this array contains a term and a corresponding boost value. Consider this sample document: { "id": "content_12345", "title": "Advanced Machine Learning Techniques for Natural Language Processing", "description": "Comprehensive guide covering transformer models and neural networks", "tags": ["technology", "AI", "tutorial"], "author": "Dr. Sarah Chen", "indexed_terms": [ { "term": "machine learning", "boost": 25.0 }, // High boost for the primary topic { "term": "dr. sarah chen", "boost": 20.0 }, // High boost for an expert author { "term": "tutorial", "boost": 8.0 } // Lower boost for the content format ] } As you can see, we've assigned a high score to the core topic ("machine learning") and the expert author, ensuring this document ranks highly for those queries. The query: Putting embedded scores into action Now for the magic. The query below uses the compound operator to combine our new embedded scoring with traditional field-based search. 
[ { "$search": { "index": "default", "compound": { "should": [ { // Clause 1: Use our embedded scores "embeddedDocument": { "path": "indexed_terms", "operator": { "text": { "path": "indexed_terms.term", "query": "machine learning", "score": { // Use the boost value from the document! "function": { "path": { "value": "indexed_terms.boost", "undefined": 0.0 } } } } } } }, { // Clause 2: Standard search across other fields "text": { "path": ["title", "description"], "query": "machine learning", // Add a small constant score for matches in these fields "score": { "constant": { "value": 5 } } } } ] }, "scoreDetails": true } }, { "$project": { "_id": 0, "id": 1, "title": 1, "author": 1, "relevanceScore": { "$meta": "searchScore" }, "scoreDetails": { "$meta": "searchScoreDetails" } } } ] In this query, a user searches for "machine learning". If our sample document is part of the index, the final score is a combination of our boosts: 25 points from the indexed_terms match. 5 points from the match in the title field. Total Score: 30 This gives us precise, predictable, and highly tunable ranking behavior. Aggregation strategies You can even control how multiple matches within the indexed_terms array contribute to the score. The three main strategies are:

| Strategy | Use case |
| --- | --- |
| maximum | Highlights the single most relevant term that matched. |
| sum | Accumulates the score across all matching terms. |
| mean | Normalizes the score by averaging the boost of all matching terms. |

Power comes with responsibility: Performance considerations While powerful, this pattern requires foresight and planning. Embedding terms increases your index size. If 1 million documents each get five embedded terms, your index now has to manage 6 million entries. To keep things snappy and scalable, follow these best practices: Be selective: Only embed high-impact terms. Don't use it for your entire vocabulary. Quantize boosts: Use discrete boost levels (e.g., 5, 10, 15, 20) instead of hyper-specific decimals. This improves caching and consistency. Perform regular cleanup: Create processes to remove obsolete or low-performing terms from the indexed_terms arrays. Always monitor your index size, query latency, and memory usage in the Atlas UI to ensure your implementation remains performant. Take control of your search destiny The embedded scoring pattern in MongoDB Atlas Search is a game-changer for anyone serious about search relevance. It moves beyond static, one-size-fits-all ranking and gives you dynamic, context-aware control directly within your data. You can use this pattern to implement business-driven ranking logic, enable real-time personalization, and achieve full transparency for tuning and debugging your search scores. While this article gives you a powerful head start, your journey into advanced relevance doesn't end here. For more in-depth implementation examples, guidance on operational analytics, and best practices to ensure your embedded boost values stay aligned with business goals, we highly recommend diving into the official MongoDB Atlas Search documentation. It's the perfect resource for taking this pattern from concept to production. Stop letting your search engine make all the decisions. Try the embedded scoring pattern today and unlock a new level of precision and power in Atlas Search.
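To experiment with the pattern from application code, here is a minimal PyMongo sketch that runs a pipeline like the one above. The database, collection, and index names are assumptions; swap in the ones from your own Atlas project.

```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster.mongodb.net")  # placeholder URI
content = client["media"]["articles"]  # assumed collection indexed with the mapping shown earlier

def search_with_embedded_boosts(query_text: str, limit: int = 10):
    """Combine document-embedded term boosts with a standard text clause, as described above."""
    pipeline = [
        {"$search": {
            "index": "default",
            "compound": {"should": [
                {"embeddedDocument": {
                    "path": "indexed_terms",
                    "operator": {"text": {
                        "path": "indexed_terms.term",
                        "query": query_text,
                        "score": {"function": {"path": {"value": "indexed_terms.boost", "undefined": 0.0}}},
                    }},
                }},
                {"text": {
                    "path": ["title", "description"],
                    "query": query_text,
                    "score": {"constant": {"value": 5}},
                }},
            ]},
        }},
        {"$limit": limit},
        {"$project": {"_id": 0, "id": 1, "title": 1, "relevanceScore": {"$meta": "searchScore"}}},
    ]
    return list(content.aggregate(pipeline))

for doc in search_with_embedded_boosts("machine learning"):
    print(doc.get("relevanceScore"), doc.get("title"))
```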

June 24, 2025
Developer Blog

Build AI Memory Systems with MongoDB Atlas, AWS and Claude

When working with conversational AI, most developers fall into a familiar trap: They treat memory as simple storage—write data in, read data out. But human memory doesn't work this way. Our brains actively evaluate information importance, strengthen connections through repetition, and let irrelevant details fade over time. This disconnect creates AI systems that either remember too much (overwhelming users with irrelevant details) or too little (forgetting critical context). The stakes are significant: Without sophisticated memory management, AI assistants can't provide truly personalized experiences, maintain consistent personalities, or build meaningful relationships with users. The application we're exploring represents a paradigm shift—treating AI memory not as a database problem but as a cognitive architecture challenge. This transforms AI memory from passive storage into an active, evolving knowledge network. A truly intelligent cognitive memory isn't one that never forgets, but one that forgets with intention and remembers with purpose. Imagine an AI assistant that doesn't just store information but builds a living, adaptive memory system that carefully evaluates, reinforces, and connects knowledge just like a human brain. This isn't science fiction—it's achievable today by combining MongoDB Atlas Vector Search with AWS Bedrock and Anthropic's Claude. You'll move from struggling with fragmented AI memory systems to building sophisticated knowledge networks that evolve organically, prioritize important information, and recall relevant context exactly when needed. The cognitive architecture of AI memory At its simplest, our memory system mimics three core aspects of human memory: Importance-weighted storage: Not all memories are equally valuable. Reinforcement through repetition: Important concepts strengthen over time. Contextual retrieval: Memories are recalled based on relevance to current context. This approach differs fundamentally from traditional conversation storage:

| Traditional conversation storage | Cognitive memory architecture |
| --- | --- |
| Flat history retention | Hierarchical knowledge graph |
| Equal weighting of all information | Importance-based prioritization |
| Keyword or vector-only search | Hybrid semantic & keyword retrieval |
| Fixed memory lifetime | Dynamic reinforcement & decay |
| Isolated conversation fragments | Connected knowledge network |

The practical implication is an AI that "thinks" before remembering—evaluating what information to prioritize, how to connect it with existing knowledge, and when to let less important details fade. Let's build a minimum viable implementation of this cognitive memory architecture using MongoDB Atlas, AWS Bedrock, and Anthropic's Claude. Our focus will be on creating the fundamental components that make this system work. Service architecture The following service architecture defines the foundational components and their interactions that power the cognitive memory system. Figure 1. AI memory service architecture. Built on AWS infrastructure, this comprehensive architecture connects user interactions with sophisticated memory management processes. The User Interface (Client application) serves as the entry point where humans interact with the system, sending messages and receiving AI responses enriched with conversation summaries and relevant contextual memories.
At the centre sits the AI Memory Service, the critical processing hub that coordinates information flow, processes messages, and manages memory operations across the entire system. MongoDB Atlas provides a scalable, secure, multi-cloud database foundation. The system processes data through the following key functions: Bedrock Titan Embeddings for converting text to vector representations. Memory Reinforcement for strengthening important information. Relevance-based Retrieval for finding contextually appropriate memories. Anthropic’s Claude LLM handles the importance assessment to evaluate long-term storage value, memory merging for efficient information organization, and conversation summary generation. This architecture ultimately enables AI systems to maintain contextual awareness across conversations, providing more natural, consistent, and personalized interactions over time. Database structure The database structure organizes information storage with specialized collections and indexes that enable efficient semantic retrieval and importance-based memory management. Figure 2. Example of a database structure. The database design strategically separates raw conversation data from processed memory nodes to optimize performance and functionality. The Conversations Collection maintains chronological records of all interactions, preserving the complete historical context, while the Memory Nodes Collection stores higher-level semantic information with importance ratings that facilitate cognitive prioritization. Vector Search Indexes enable efficient semantic similarity searches with O(log n) performance, allowing the system to rapidly identify contextually relevant information regardless of database size. To manage storage growth automatically, TTL(Time-To-Live) Indexes expire older conversations based on configurable retention policies. Finally, Importance and User ID indexes optimize retrieval patterns critical to the system's function, ensuring that high-priority information and user-specific context can be accessed with minimal latency. Memory node structure The Memory node structure defines the data schemas that combine content with cognitive metadata to enable human-like memory operations. Figure 3. The memory node structure. Each node includes an importance score that enables memory prioritization similar to human memory processes, allowing the system to focus on what matters most. The structure tracks access count, which facilitates reinforcement learning by recording how frequently memories are retrieved. A critical feature is the summary field, providing quick semantic access without processing the full content, significantly improving efficiency. Vector embeddings within each node enable powerful semantic search capabilities that mirror human associative thought, connecting related concepts across the knowledge base. Complementing this, the ConversationMessage structure preserves raw conversational context without interpretation, maintaining the original exchange integrity. Both structures incorporate vector embeddings as a unifying feature, enabling sophisticated semantic operations that allow the system to navigate information based on meaning rather than just keywords, creating a more human-like cognitive architecture. 
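A minimal sketch of how these collections and indexes might be created with PyMongo is shown below. The database name, TTL window, index names, and embedding dimension are assumptions to adapt; the vector index requires an Atlas cluster and a recent PyMongo driver.

```python
from datetime import datetime, timezone
from pymongo import MongoClient, ASCENDING, DESCENDING
from pymongo.operations import SearchIndexModel

client = MongoClient("mongodb+srv://<user>:<password>@cluster.mongodb.net")  # placeholder URI
db = client["ai_memory"]  # assumed database name

conversations = db["conversations"]
memory_nodes = db["memory_nodes"]

# TTL index: expire raw conversation messages after a retention window (here 30 days).
conversations.create_index("created_at", expireAfterSeconds=30 * 24 * 3600)

# Retrieval-pattern indexes described above: per-user lookups ordered by importance.
memory_nodes.create_index([("user_id", ASCENDING), ("importance", DESCENDING)])

# Atlas Vector Search index on the embedding field for semantic retrieval.
memory_nodes.create_search_index(SearchIndexModel(
    name="memory_vector_index",
    type="vectorSearch",
    definition={"fields": [{
        "type": "vector",
        "path": "embedding",
        "numDimensions": 1024,   # match your embedding model's output size (assumption)
        "similarity": "cosine",
    }]},
))

# Example memory node following the structure described above.
memory_nodes.insert_one({
    "user_id": "user_123",
    "summary": "Prefers concise answers; working on a FastAPI project.",
    "content": "...full memory content...",
    "importance": 7,            # 1-10 score assigned by the LLM evaluator
    "access_count": 1,          # incremented on retrieval to support reinforcement
    "embedding": [0.0] * 1024,  # in practice, a vector from Bedrock Titan Embeddings
    "created_at": datetime.now(timezone.utc),
})
```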
Memory creation process

The memory creation process transforms conversational exchanges into structured memory nodes through a cognitive pipeline that mimics human memory formation, thoughtfully evaluating new information against existing knowledge rather than indiscriminately storing everything.

Figure 4. The memory creation process.

Through repetition, memories are strengthened via reinforcement, similar to human cognitive processes. At its core, the LLM functions as an "importance evaluator" that assigns each memory a value on a 1-10 scale, reflecting how humans naturally prioritize information based on relevance, uniqueness, and utility. This importance rating directly affects a memory's persistence, recall probability, and survival during pruning operations. As the system evolves, memory merging simulates the human brain's ability to consolidate related concepts over time, while importance updating reflects how new discoveries change our perception of existing knowledge. The framework's pruning mechanism mirrors our natural forgetting of less significant information. Rather than simply accumulating data, this dynamic system creates an evolving memory architecture that continuously refines itself through processes remarkably similar to human cognition.

Memory retrieval process

The memory retrieval process leverages multiple search methodologies that optimize both recall and precision to find and contextualize relevant information across conversations and memory nodes.

Figure 5. The memory retrieval process.

When initiated, the system converts user queries into vector embeddings while simultaneously executing parallel operations to enhance performance. The core of this system is its hybrid search methodology, which combines vector-based semantic understanding with traditional text-based keyword search, allowing it to capture both conceptual similarities and exact term matches. The process directly searches memory nodes and applies different weighting algorithms to combine scores from the various search methods, producing a comprehensive relevance ranking. After identifying relevant memories, the system fetches surrounding conversation context to ensure retrieved information maintains appropriate background, followed by generating concise summaries that distill essential insights. A key innovation is the effective importance calculation, which dynamically adjusts memory significance based on access patterns and other usage metrics. The final step involves building a comprehensive response package that integrates the original memories, their summaries, relevance scores, and contextual information, providing users with a complete understanding of retrieved information without requiring exhaustive reading of all content. This multi-faceted approach ensures that memory retrieval is both comprehensive and precisely tailored to user needs.

Code execution flowchart

The code execution flowchart provides a comprehensive mapping of how API requests navigate through the system architecture, illuminating the runtime path from initial client interaction to final response delivery.

Figure 6. The code execution flowchart.

When a request enters the system, it first encounters the FastAPI endpoint, which serves as the primary entry point for all client communications. From there, specialized API route handlers direct the request to appropriate processing functions based on its type and intent.
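Before following the request any further, it helps to make the importance-evaluator step from the memory creation process concrete. The following is a minimal sketch using the Amazon Bedrock Converse API; the prompt wording, model ID, and clamping logic are assumptions, not the reference implementation.

    import boto3

    # Hedged sketch of the LLM "importance evaluator" described above.
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

    def assess_importance(message: str) -> int:
        """Ask Claude for a single 1-10 importance score for a user message."""
        prompt = (
            "Rate the long-term importance of remembering the following user message "
            "on a scale of 1 to 10. Respond with a single integer only.\n\n"
            f"Message: {message}"
        )
        response = bedrock.converse(
            modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumed model ID
            messages=[{"role": "user", "content": [{"text": prompt}]}],
            inferenceConfig={"maxTokens": 5, "temperature": 0.0},
        )
        text = response["output"]["message"]["content"][0]["text"].strip()
        return max(1, min(10, int(text)))  # clamp to the 1-10 scale

    print(assess_importance("I prefer to be contacted by email."))

Constraining the model to a numeric-only reply is what makes the score easy to parse and store on the memory node.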
Returning to the request flow: during processing, the system creates and stores message objects in the database, ensuring a permanent record of all conversation interactions. For human-generated messages meeting specific significance criteria, a parallel memory creation branch activates, analyzing the content for long-term storage. This selective approach preserves only meaningful information while reducing storage overhead. The system then processes queries through embedding generation, transforming natural language into vector representations that enable semantic understanding. One of the most sophisticated aspects is the implementation of parallel search functions that simultaneously execute different retrieval strategies, dramatically improving response times while maintaining comprehensive result quality. These searches connect to MongoDB Atlas to perform complex database operations against the stored knowledge base. Retrieved information undergoes context enrichment and summary generation, where the AWS Bedrock (Anthropic's Claude) LLM augments raw data with contextual understanding and concise overviews of relevant conversation history. Finally, the response combination module assembles diverse data components—semantic matches, text-based results, contextual information, and generated summaries—into a coherent, tailored response that addresses the original request. The system's behavior can be fine-tuned through configurable parameters that govern memory processing, AI model selection, database structure, and service operations, allowing for optimization without code modifications.

Memory updating process

The memory updating process dynamically adjusts memory importance through sophisticated reinforcement and decay mechanisms that mimic human cognitive functions.

Figure 7. The memory updating process.

When new information arrives, the system first retrieves all existing user memories from the database, then methodically calculates similarity scores between this new content and each stored memory. Memories exceeding a predetermined similarity threshold are identified as conceptually related and undergo importance reinforcement and access count incrementation, strengthening their position in the memory hierarchy. Simultaneously, unrelated memories experience gradual decay as their importance values diminish over time, creating a naturally evolving memory landscape. This balanced approach prevents memory saturation by ensuring that frequently accessed topics remain prominent while less relevant information gracefully fades. The system maintains a comprehensive usage history through access counts, which informs more effective importance calculations and provides valuable metadata for memory management. All these adjustments are persistently stored in MongoDB Atlas, ensuring continuity across user sessions and maintaining a dynamic memory ecosystem that evolves with each interaction.

Client integration flow

The following diagram illustrates the complete interaction sequence between client applications and the memory system, from message processing to memory retrieval. This flow encompasses two primary pathways:

Message sending flow: When a client sends a message, it triggers a sophisticated processing chain where the API routes it to the Conversation Service, which generates embeddings via AWS Bedrock.
After storing the message in MongoDB Atlas, the Memory Service evaluates it for potential memory creation, performing importance assessment and summary generation before creating or updating a memory node in the database. The flow culminates with a confirmation response returning to the client. Check out the code reference on GitHub.

Memory retrieval flow: During retrieval, the client's request initiates parallel search operations where query embeddings are generated simultaneously across conversation history and memory nodes. These dual search paths—conversation search and memory node search—produce results that are intelligently combined and summarized to provide contextual understanding. The client ultimately receives a comprehensive memory package containing all relevant information. Check out the code reference on GitHub.

Figure 8. The client integration flow.

The architecture deliberately separates conversation storage from memory processing, with MongoDB Atlas serving as the central persistence layer. Each component maintains clear responsibilities and interfaces, ensuring that despite complex internal processing, clients receive unified, coherent responses.

Action plan: Bringing your AI memory system to life

To implement your own AI memory system:

Start with the core components: MongoDB Atlas, AWS Bedrock, and Anthropic's Claude.
Focus on cognitive functions: Importance assessment, memory reinforcement, relevance-based retrieval, and memory merging.
Tune parameters iteratively: Start with the defaults provided, then adjust based on your application's needs.
Measure the right metrics: Track uniqueness of memories, retrieval precision, and user satisfaction—not just storage efficiency.

To evaluate your implementation, ask these questions:

Does your system effectively prioritize truly important information?
Can it recall relevant context without excessive prompting?
Does it naturally form connections between related concepts?
Can users perceive the system's improving memory over time?

Real-world applications and insights

Case study: From repetitive Q&A to evolving knowledge

A customer service AI using traditional approaches typically needs to relearn user preferences repeatedly. With our cognitive memory architecture:

First interaction: User mentions they prefer email communication. The system stores this with moderate importance.
Second interaction: User confirms email preference. The system reinforces this memory, increasing its importance.
Future interactions: The system consistently recalls the email preference without asking again, but might still verify after long periods due to natural decay.

The result? A major reduction in repetitive questions, leading to a significantly better user experience.

Benefits

Applications implementing this approach achieved unexpected benefits:

Emergent knowledge graphs: Over time, the system naturally forms conceptual clusters of related information.
Insight mining: Analysis of high-importance memories across users reveals shared concerns and interests not obvious from raw conversation data.
Reduced compute costs: Despite the sophisticated architecture, the selective nature of memory storage reduces overall embedding and storage costs compared to retaining full conversation histories.

Limitations

When implementing this system, teams typically face three key challenges:

Configuration tuning: Finding the right balance of importance thresholds, decay rates, and reinforcement factors requires experimentation.
Prompt engineering: Getting consistent, numeric importance ratings from LLMs requires careful prompt design. Our implementation uses clear constraints and numeric-only output requirements.
Memory sizing: Determining the optimal memory depth per user depends on the application context. Too shallow and the AI seems forgetful; too deep and it becomes sluggish.

Future directions

The landscape for AI memory systems is evolving rapidly. Here are key developments on the horizon:

Short-term developments

Emotion-aware memory: Extending importance evaluation to include emotional salience, remembering experiences that evoke strong reactions.
Temporal awareness: Adding time-based decay that varies by information type (factual vs. preferential).
Multi-modal memory: Incorporating image and voice embeddings alongside text for unified memory systems.

Long-term possibilities

Self-supervised memory optimization: Systems that learn optimal importance ratings, decay rates, and memory structures based on user satisfaction.
Causal memory networks: Moving beyond associative memory to create causal models of user intent and preferences.
Privacy-preserving memory: Implementing differential privacy and selective forgetting capabilities to respect user privacy boundaries.

This approach to AI memory is still evolving. The future of AI isn't just about more parameters or faster inference—it's about creating systems that learn and remember more like humans do. With the cognitive memory architecture we've explored, you're well on your way to building AI that remembers what matters.

Transform your AI applications with cognitive memory capabilities today. Get started with MongoDB Atlas for free and implement vector search in minutes. For hands-on guidance, explore our GitHub repository containing complete implementation code and examples.

June 18, 2025
Developer Blog

Scaling Vector Search with MongoDB Atlas Quantization & Voyage AI Embeddings

Key takeaways

Vector quantization fundamentals: A technique that compresses high-dimensional embeddings from 32-bit floats to lower-precision formats (scalar/int8 or binary/1-bit), enabling significant performance gains while maintaining semantic search capabilities.
Performance vs. precision trade-offs: Binary quantization provides maximum speed (80% faster queries) with minimal resources; scalar quantization offers balanced performance and accuracy; float32 maintains the highest fidelity at significant resource cost.
Resource optimization: Vector quantization can reduce RAM usage by up to 24x (binary) or 3.75x (scalar); the storage footprint decreases by 38% using the BSON binary format.
Scaling benefits: Performance advantages multiply at scale and are most significant for vector databases exceeding 1M embeddings.
Semantic preservation: Quantization-aware models like Voyage AI's retain high representation capacity even after compression.
Search quality control: Binary quantization may require rescoring for maximum accuracy; scalar quantization typically maintains 90%+ retention of float32 results.
Implementation ease: MongoDB's automatic quantization requires minimal code changes to leverage quantization techniques.

As vector databases scale into the millions of embeddings, the computational and memory requirements of high-dimensional vector operations become critical bottlenecks in production AI systems. Without effective scaling strategies, organizations face:

Infrastructure costs that grow exponentially with data volume
Unacceptable query latency that degrades user experience and limits real-time applications
Limited deployment options, particularly on edge devices or in resource-constrained environments
Diminished competitive advantage as AI capabilities become limited by technical constraints and bottlenecks rather than use-case innovation

This technical guide demonstrates advanced techniques for optimizing vector search operations through precision-controlled quantization—transforming resource-intensive 32-bit float embeddings into performance-optimized representations while preserving semantic fidelity. By leveraging MongoDB Atlas Vector Search's automatic quantization capabilities with Voyage AI's quantization-aware embedding models, we'll implement systematic optimization strategies that dramatically reduce both computational overhead and memory footprint. This guide provides an empirical analysis of the critical performance metrics:

Retrieval latency benchmarking: Quantitative comparison of search performance across binary, scalar, and float32 precision levels, with controlled evaluation of HNSW (hierarchical navigable small world) graph exploration parameters and k-retrieval variations.
Representational capacity retention: Precise measurement of semantic information preservation through direct comparison of quantized vector search results against full-fidelity retrieval, with particular attention to retention curves across varying retrieval depths.

We'll present implementation strategies and evaluation methodologies for vector quantization that simultaneously optimize for both computational efficiency and semantic fidelity—enabling you to make evidence-based architectural decisions for production-scale AI retrieval systems handling millions of embeddings.
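Before diving into the implementation, the short sketch below illustrates what scalar and binary quantization do to a single 1024-dimensional float32 vector. It is purely conceptual; MongoDB Atlas applies its own quantization automatically at index time, so this is not its internal algorithm.

    import numpy as np

    # Conceptual illustration only, not MongoDB Atlas's internal quantization.
    embedding = np.random.uniform(-1, 1, 1024).astype(np.float32)  # a 1024-dim float32 vector

    # Scalar quantization: map each float into the 8-bit integer range.
    lo, hi = embedding.min(), embedding.max()
    scalar_q = np.round((embedding - lo) / (hi - lo) * 255 - 128).astype(np.int8)

    # Binary quantization: keep only the sign of each dimension, packed to 1 bit per value.
    binary_q = np.packbits(embedding > 0)

    print(embedding.nbytes, scalar_q.nbytes, binary_q.nbytes)  # 4096 vs. 1024 vs. 128 bytes

The per-vector footprint drops roughly 4x for int8 and 32x for binary, which is where the memory savings discussed throughout this guide come from.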
The techniques demonstrated here are directly applicable to enterprise-grade RAG architectures, recommendation engines, and semantic search applications where millisecond-level latency improvements and dramatic RAM reduction translate to significant infrastructure cost savings. The full end-to-end implementation for automatic vector quantization and the other operations involved in RAG/agent pipelines can be found in our GitHub repository.

Auto-quantization of Voyage AI embeddings with MongoDB

Our approach addresses the complete optimization cycle for vector search operations, covering:

Generating embeddings with quantization-aware models
Implementing automatic vector quantization in MongoDB Atlas
Creating and configuring specialized vector search indices
Measuring and comparing latency across different quantization strategies
Quantifying representational capacity retention
Analyzing performance trade-offs between binary, scalar, and float32 implementations
Making evidence-based architectural decisions for production AI retrieval systems

Figure 1. Vector quantization architecture with MongoDB Atlas and Voyage AI.

Using text data as an example, we convert documents into numerical vector embeddings that capture semantic relationships. MongoDB then indexes and stores these embeddings for efficient similarity searches. By comparing queries run against float32, int8, and binary embeddings, you can gauge the trade-offs between precision and performance and better understand which quantization strategy best suits large-scale, high-throughput workloads.

One key takeaway from this article is that representational capacity retention is highly dependent on the embedding model used. With quantization-aware models like Voyage AI's voyage-3-large at appropriate dimensionality (1024 dimensions), our tests demonstrate that we can achieve 95%+ recall retention at reasonable numCandidates values. This means organizations can significantly reduce memory and computational requirements while preserving semantic search quality, provided they select embedding models specifically designed to maintain their representation capacity after quantization. For more information on why vector quantization is crucial for AI workloads, refer to this blog post.

Dataset information

Our quantization evaluation framework leverages two complementary datasets designed specifically to benchmark semantic search performance across different precision levels.

Primary dataset (Wikipedia-22-12-en-voyage-embed): Contains approximately 300,000 Wikipedia article fragments with pre-generated 1024-dimensional embeddings from Voyage AI's voyage-3-large model. This dataset serves as a diverse vector corpus for testing vector quantization effects in semantic search. Throughout this tutorial, we'll use the primary dataset to demonstrate the technical implementation of quantization.

Embedding generation with Voyage AI

For generating new embeddings for AI search applications, we use Voyage AI's voyage-3-large model, which is specifically designed to be quantization-aware. The voyage-3-large model generates 1024-dimensional vectors and has been specifically trained to maintain semantic properties even after quantization, making it ideal for our AI retrieval optimization strategy. For more information on how MongoDB and Voyage AI work together for optimal retrieval, see our previous article, Rethinking Information Retrieval with MongoDB and Voyage AI.
import voyageai

# Initialize the Voyage AI client
client = voyageai.Client()

def get_embedding(text, task_prefix="document"):
    """
    Generate embeddings using the voyage-3-large model for AI Retrieval.

    Parameters:
        text (str): The input text to be embedded.
        task_prefix (str): A prefix describing the task; this is prepended to the text.

    Returns:
        list: The embedding vector (1024 dimensions).
    """
    if not text.strip():
        print("Attempted to get embedding for empty text.")
        return []

    # Call the Voyage API to generate the embedding
    result = client.embed([text], model="voyage-3-large", input_type=task_prefix)

    # Return the first embedding from the result
    return result.embeddings[0]

Converting embeddings to BSON BinData format

A critical optimization step is converting embeddings to MongoDB's BSON BinData format, which significantly reduces storage and memory requirements. The BinData vector format provides significant advantages:

Reduces disk space by approximately 3x compared to arrays
Enables more efficient indexing with alternate types (int8, binary)
Reduces RAM usage by 3.75x for scalar and 24x for binary quantization

from bson.binary import Binary, BinaryVectorDtype

def generate_bson_vector(array, data_type):
    return Binary.from_vector(array, BinaryVectorDtype(data_type))

# Convert embeddings to BSON BinData vector format
wikipedia_data_df["embedding"] = wikipedia_data_df["embedding"].apply(
    lambda x: generate_bson_vector(x, BinaryVectorDtype.FLOAT32)
)

Vector index creation with different quantization strategies

The cornerstone of our performance optimization framework lies in creating specialized vector indices with different quantization strategies. This process leverages MongoDB both as a general-purpose database and, more specifically, as a high-performance vector database capable of efficiently handling million-scale embedding collections. This implementation step sets up MongoDB's vector search capabilities with automatic quantization, focusing on two primary quantization strategies: scalar (int8) and binary. Three indices are created to measure and evaluate the retrieval latency and recall performance of the various precision data types, including the full-fidelity vector representation.

MongoDB Atlas Vector Search uses HNSW, a graph-based indexing algorithm that organizes vectors in a hierarchical structure of layers. In this structure, vector data points within a layer are contextually similar, while higher layers are sparse compared to lower layers, which are denser and contain more vector data points.

The code snippet below showcases the implementation of the two quantization strategies in parallel; this enables the systematic evaluation of the latency, memory usage, and representational capacity trade-offs across the precision spectrum, enabling data-driven decisions about the optimal approach for specific application requirements. MongoDB Atlas automatic quantization is activated entirely through the vector index definition. By including the "quantization" attribute and setting its value to either "scalar" or "binary", you enable automatic compression of your embeddings at index creation time. This declarative approach means no separate preprocessing of vectors is required—MongoDB handles the compression transparently while maintaining the original embeddings for potential rescoring operations.
from pymongo.operations import SearchIndexModel

def setup_vector_search_index(collection, index_definition, index_name="vector_index"):
    """Setup a vector search index with the specified configuration"""
    ...

# 1. Scalar Quantized Index (int8)
vector_index_definition_scalar_quantized = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",
            "quantization": "scalar",  # Uses int8 quantization
            "numDimensions": 1024,
            "similarity": "cosine",
        }
    ]
}

# 2. Binary Quantized Index (1-bit)
vector_index_definition_binary_quantized = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",
            "quantization": "binary",  # Uses binary (1-bit) quantization
            "numDimensions": 1024,
            "similarity": "cosine",
        }
    ]
}

# 3. Float32 ANN Index (no quantization)
vector_index_definition_float32_ann = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",
            "numDimensions": 1024,
            "similarity": "cosine",
        }
    ]
}

# Create the indices
setup_vector_search_index(
    wiki_data_collection,
    vector_index_definition_scalar_quantized,
    "vector_index_scalar_quantized",
)
setup_vector_search_index(
    wiki_data_collection,
    vector_index_definition_binary_quantized,
    "vector_index_binary_quantized",
)
setup_vector_search_index(
    wiki_data_collection,
    vector_index_definition_float32_ann,
    "vector_index_float32_ann",
)

Implementing vector search functionality

Vector search serves as the computational foundation of modern generative AI systems. While LLMs provide reasoning and generation capabilities, vector search delivers the contextual knowledge necessary for grounding these capabilities in relevant information. This semantic retrieval operation forms the backbone of RAG architectures that power enterprise-grade AI applications, such as knowledge-intensive chatbots and domain-specific assistants. In more advanced implementations, vector search enables agentic RAG systems where autonomous agents dynamically determine what information to retrieve, when to retrieve it, and how to incorporate it into complex reasoning chains. The implementation below provides the technical overview that transforms raw embedding vectors into intelligent search components that move beyond lexical matching to true semantic understanding.

Our implementation below supports both approximate nearest neighbor (ANN) search and exact nearest neighbor (ENN) search through the use_full_precision parameter:

Approximate nearest neighbor (ANN) search: When use_full_precision=False, the system performs an approximate search using:

The specified quantized index (binary or scalar)
The HNSW graph navigation algorithm
A controlled exploration breadth via numCandidates

This approach sacrifices perfect accuracy for dramatic performance gains, particularly at scale. The HNSW algorithm enables sub-linear time complexity by intelligently sampling the vector space, making it possible to search billions of vectors in milliseconds instead of seconds. When combined with quantization, ANN delivers order-of-magnitude improvements in both speed and memory efficiency.

Exact nearest neighbor (ENN) search: When use_full_precision=True, the system performs an exact search using:

The original float32 embeddings (regardless of the index specified)
An exhaustive comparison approach
The exact=True directive to bypass approximation techniques

ENN guarantees finding the mathematically optimal nearest neighbors by computing distances between the query vector and every single vector in the database.
This brute-force approach provides perfect recall but scales linearly with collection size, becoming prohibitively expensive as vector counts increase beyond millions. We include both search modes for several critical reasons:

Establishing ground truth: ENN provides the "perfect" baseline against which we measure the quality degradation of approximation techniques. The representational retention metrics discussed later directly compare ANN results against this ENN ground truth.
Varying application requirements: Not all AI applications prioritize the same metrics. Time-sensitive applications (real-time customer service) might favor ANN's speed, while high-stakes applications (legal document analysis) might require ENN's accuracy.

def custom_vector_search(
    user_query,
    collection,
    embedding_path,
    vector_search_index_name="vector_index",
    top_k=5,
    num_candidates=25,
    use_full_precision=False,
):
    """
    Perform vector search with configurable precision and parameters
    for AI Search applications.
    """
    # Generate embedding for the query
    query_embedding = get_embedding(user_query, task_prefix="query")

    # Define the vector search stage
    vector_search_stage = {
        "$vectorSearch": {
            "index": vector_search_index_name,
            "queryVector": query_embedding,
            "path": embedding_path,
            "limit": top_k,
        }
    }

    # Configure search precision approach
    if not use_full_precision:
        # For approximate nearest neighbor (ANN) search
        vector_search_stage["$vectorSearch"]["numCandidates"] = num_candidates
    else:
        # For exact nearest neighbor (ENN) search
        vector_search_stage["$vectorSearch"]["exact"] = True

    # Project only needed fields
    project_stage = {
        "$project": {
            "_id": 0,
            "title": 1,
            "text": 1,
            "wiki_id": 1,
            "url": 1,
            "score": {"$meta": "vectorSearchScore"},
        }
    }

    # Build and execute the pipeline
    pipeline = [vector_search_stage, project_stage]
    ...

    # Execute the query
    results = list(collection.aggregate(pipeline))

    return {"results": results, "execution_time_ms": execution_time_ms}

Measuring the retrieval latency of various quantized vectors

In production AI retrieval systems, query latency directly impacts user experience, operational costs, and system throughput capacity. Vector search operations typically constitute the primary performance bottleneck in RAG architectures, making latency optimization a critical engineering priority. Sub-100ms response times are often necessary for interactive and mission-critical applications, while batch processing systems may tolerate higher latencies but require consistent predictability for resource planning.

Our latency measurement methodology employs a systematic, parameterized approach that models real-world query patterns while isolating the performance characteristics of different quantization strategies. This parameterized benchmarking enables us to:

Construct detailed latency profiles across varying retrieval depths
Identify performance inflection points where quantization benefits become significant
Map the scaling curves of different precision levels as the data volume increases
Determine optimal configuration parameters for specific throughput targets

def measure_latency_with_varying_topk(
    user_query,
    collection,
    vector_search_index_name,
    use_full_precision=False,
    top_k_values=[5, 10, 50, 100],
    num_candidates_values=[25, 50, 100, 200, 500, 1000, 2000],
):
    """
    Measure search latency across different configurations.
""" results_data = [] for top_k in top_k_values: for num_candidates in num_candidates_values: # Skip invalid configurations if num_candidates < top_k: continue # Get precision type from index name precision_name = vector_search_index_name.split("vector_index")[1] precision_name = precision_name.replace("quantized", "").capitalize() if use_full_precision: precision_name = "_float32_ENN" # Perform search and measure latency vector_search_results = custom_vector_search( user_query=user_query, collection=collection, embedding_path="embedding", vector_search_index_name=vector_search_index_name, top_k=top_k, num_candidates=num_candidates, use_full_precision=use_full_precision, ) latency_ms = vector_search_results["execution_time_ms"] # Store results results_data.append({ "precision": precision_name, "top_k": top_k, "num_candidates": num_candidates, "latency_ms": latency_ms, }) print(f"Top-K: {top_k}, NumCandidates: {num_candidates}, " f"Latency: {latency_ms} ms, Precision: {precision_name}") return results_data Latency results analysis Our systematic benchmarking reveals dramatic performance differences between quantization strategies across different retrieval scenarios. The visualizations below capture these differences for top-k=10 and top-k=100 configurations. Figure 2. Search latency vs the number candidates for top-k=10 Figure 3. Search latency vs the number of candidates for top-k=100. Several critical patterns emerge from these latency profiles: Quantization delivers exponential performance gains: The float32_ENN approach (purple line) demonstrates latency measurements an order of magnitude higher than any quantized approach. At top-k=10, ENN latency starts at ~1600ms and never drops below 500ms, while quantized approaches maintain sub-100ms performance until extremely high candidate counts. This performance gap widens further as data volume scales. Scalar quantization offers the best performance profile: Somewhat surprisingly, scalar quantization (orange line) consistently outperforms both binary quantization and float32 ANN across most configurations. This is particularly evident at higher num_candidates values, where scalar quantization maintains near-flat latency scaling. This suggests scalar quantization achieves an optimal balance in the memory-computation trade-off for HNSW traversal. Binary quantization shows linear latency scaling: While binary quantization (red line) starts with excellent performance, its latency increases more steeply as num_candidates grows, eventually exceeding scalar quantization at very high exploration depths. This suggests that while binary vectors require less memory, their distance computation savings are partially offset by the need for more complex traversal patterns in the HNSW graph and rescoring. All quantization methods maintain interactive-grade performance: Even with 10,000 candidate explorations and top-k=100, all quantized approaches maintain sub-200ms latency, well within interactive application requirements. This demonstrates that quantization enables order-of-magnitude increases in exploration depth without sacrificing user experience, allowing for dramatic recall improvements while maintaining acceptable latency. These empirical results validate our theoretical understanding of quantization benefits and provide concrete guidance for production deployment: scalar quantization offers the best general-purpose performance profile, while binary quantization excels in memory-constrained environments with moderate exploration requirements. 
In the images below, we employ logarithmic scaling for both axes in our latency analysis because search performance data typically spans multiple orders of magnitude. When comparing different precision types (scalar, binary, float32_ann) across varying numbers of candidates, the latency values can range from milliseconds to seconds, while candidate counts may vary from hundreds to millions. Linear plots would compress smaller values and make it difficult to observe performance trends across the full range (as we see above). Logarithmic scaling transforms exponential relationships into linear ones, making it easier to identify proportional changes, compare relative performance improvements, and detect patterns that would otherwise be obscured. This visualization approach is particularly valuable for understanding how each precision type scales with increasing workload and for identifying the optimal operating ranges where certain methods outperform others (as shown below).

Figure 4. Search latency vs the number of candidates (log scale) for top-k=10.

Figure 5. Search latency vs the number of candidates (log scale) for top-k=100.

The performance characteristics observed in the logarithmic plots above directly reflect the architectural differences inherent in binary quantization's two-stage retrieval process. Binary quantization employs a coarse-to-fine search strategy: an initial fast retrieval phase using low-precision binary representations, followed by a refinement phase that rescores the top-k candidates using full-precision vectors to restore accuracy. This dual-phase approach creates a fundamental performance trade-off that manifests differently across varying candidate pool sizes. For smaller candidate sets, the computational savings from binary operations during the initial retrieval phase can offset the rescoring overhead, making binary quantization competitive with other methods. However, as the candidate pool expands, the rescoring phase—which must compute full-precision similarity scores for an increasing number of retrieved candidates—begins to dominate the total latency profile.

Measuring representational capacity retention

While latency optimization is critical for operational efficiency, the primary concern for AI applications remains semantic accuracy. Vector quantization introduces a fundamental trade-off: computational efficiency versus representational capacity. Even the most performant quantization approach is useless if it fails to maintain the semantic relationships encoded in the original embeddings. To quantify this critical quality dimension, we developed a systematic methodology for measuring representational capacity retention—the degree to which quantized vectors preserve the same nearest-neighbor relationships as their full-precision counterparts. This approach provides an objective, reproducible framework for evaluating semantic fidelity across different quantization strategies.

def measure_representational_capacity_retention_against_float_enn(
    ground_truth_collection,
    collection,
    quantized_index_name,
    top_k_values,
    num_candidates_values,
    num_queries_to_test=1,
):
    """
    Compare quantized search results against full-precision baseline.

    For each test query:
    1. Perform baseline search with float32 exact search
    2. Perform same search with quantized vectors
    3. Calculate retention as % of baseline results found in quantized results
    """
    retention_results = {"per_query_retention": {}}
    overall_retention = {}

    # Initialize tracking structures
    for top_k in top_k_values:
        overall_retention[top_k] = {}
        for num_candidates in num_candidates_values:
            if num_candidates < top_k:
                continue
            overall_retention[top_k][num_candidates] = []

    # Get precision type
    precision_name = quantized_index_name.split("vector_index")[1]
    precision_name = precision_name.replace("quantized", "").capitalize()

    # Load test queries from ground truth annotations
    ground_truth_annotations = list(
        ground_truth_collection.find().limit(num_queries_to_test)
    )

    # For each annotation, test all its questions
    for annotation in ground_truth_annotations:
        ground_truth_wiki_id = annotation["wiki_id"]
        ...

    # Calculate average retention for each configuration
    avg_overall_retention = {}
    for top_k, cand_dict in overall_retention.items():
        avg_overall_retention[top_k] = {}
        for num_candidates, retentions in cand_dict.items():
            if retentions:
                avg = sum(retentions) / len(retentions)
            else:
                avg = 0
            avg_overall_retention[top_k][num_candidates] = avg

    retention_results["average_retention"] = avg_overall_retention
    return retention_results

Our methodology takes a rigorous approach to retention measurement:

Establishing ground truth: We use float32 exact nearest neighbor (ENN) search as the baseline "perfect" result set, acknowledging that these are the mathematically optimal neighbors.
Controlled comparison: For each query in our annotation dataset, we perform parallel searches using different quantization strategies, carefully controlling for top-k and num_candidates parameters.
Retention calculation: We compute retention as the ratio of overlapping results between the quantized search and the ENN baseline: |quantized_results ∩ baseline_results| / |baseline_results|.
Statistical aggregation: We average retention scores across multiple queries to account for query-specific variations and produce robust, generalizable metrics.

This approach provides a direct, quantitative measure of how much semantic fidelity is preserved after quantization. A retention score of 1.0 indicates that the quantized search returns exactly the same results as the full-precision search, while lower scores indicate divergence.

Representational capacity results analysis

The findings from the representational capacity retention evaluation provide empirical validation that properly implemented quantization—particularly scalar quantization—can maintain semantic fidelity while dramatically reducing computational and memory requirements. Note that in the charts below, the scalar curve exactly matches the float32_ann performance—so much so that the float32_ann line is completely hidden beneath the scalar curve. The near-perfect retention of scalar quantization should alleviate concerns about quality degradation, while binary quantization's retention profile suggests it's suitable for applications with higher performance demands that can tolerate slight quality trade-offs or compensate with increased exploration depth.

Figure 6. Retention score vs the number of candidates for top-k=10.

Figure 7. Retention score vs the number of candidates for top-k=50.

Figure 8. Retention score vs the number of candidates for top-k=100.

Scalar quantization achieves near-perfect retention: The scalar quantization approach (orange line) demonstrates extraordinary representational capacity preservation, achieving 98-100% retention across nearly all configurations.
At top-k=10, it reaches perfect 1.0 retention with just 100 candidates, effectively matching full-precision ENN results while using 4x less memory. This remarkable performance validates the effectiveness of int8 quantization when implemented with MongoDB's automatic quantization.

Binary quantization shows a retention-exploration trade-off: Binary quantization (red line) exhibits a clear correlation between exploration depth and retention quality. At top-k=10, it starts at ~91% retention with minimal candidates but improves to 98% at 500 candidates. The effect is more pronounced at higher top-k values (50 and 100), where initial retention drops to ~74% but recovers substantially with increased exploration. This suggests that binary quantization's information loss can be effectively mitigated by exploring more of the vector space.

Retention dynamics change with retrieval depth: As top-k increases from 10 to 100, the retention patterns become more differentiated between quantization strategies. This reflects the increasing challenge of maintaining accurate rankings as more results are requested. While scalar quantization remains relatively stable across different top-k values, binary quantization shows more sensitivity, indicating it's better suited for targeted retrieval scenarios (low top-k) than for broad exploration.

Exploration depth compensates for precision loss: A fascinating pattern emerges across all quantization methods: increased num_candidates consistently improves retention. This demonstrates that reduced precision can be effectively counterbalanced by broader exploration of the vector space. For example, binary quantization at 500 candidates achieves better retention than scalar quantization at 25 candidates, despite using 32x less memory per vector.

Float32 ANN vs. scalar quantization: The float32 ANN approach (blue line) shows virtually identical retention to scalar quantization at higher top-k values, while consuming 4x more memory. This suggests scalar quantization represents an optimal balance point, offering full-precision quality with significantly reduced resource requirements.

Conclusion

This guide has demonstrated the powerful impact of vector quantization in optimizing vector search operations through MongoDB Atlas Vector Search's automatic quantization feature, using Voyage AI embeddings. The findings confirm that properly implemented quantization, particularly scalar quantization, maintains semantic fidelity while dramatically reducing computational and memory requirements:

Binary quantization achieves optimal latency and resource efficiency, particularly valuable for high-scale deployments where speed is critical.
Scalar quantization provides an effective balance between performance and precision, suitable for most production applications.
Float32 maintains maximum accuracy but incurs significant performance and memory costs.

Figure 9. Performance and memory usage metrics for binary quantization, scalar quantization, and float32 implementation.
Based on the figure above, our implementation demonstrated substantial efficiency gains:

Binary Quantized Index achieves the most compact disk footprint at 407.66MB, representing approximately 4KB per document. This compression comes from representing high-dimensional vectors as binary bits, dramatically reducing storage requirements while maintaining retrieval capability.

Float32 ANN Index requires 394.73MB of disk space, slightly less than binary due to optimized index structures, but demands the full storage footprint be loaded into memory for optimal performance.

Scalar Quantized Index shows the largest storage requirement at 492.83MB (approximately 5KB per document), suggesting this method maintains higher precision than binary while still applying compression techniques, resulting in a middle-ground approach between full precision and extreme quantization.

The most striking difference lies in memory requirements. Binary quantization demonstrates a 23:1 memory efficiency ratio, requiring only 16.99MB in RAM versus the 394.73MB needed by float32_ann. Scalar quantization provides a 3:1 memory optimization, requiring 131.42MB compared to float32_ann's full memory footprint.

For production AI retrieval implementations, general guidance is as follows:

Use scalar quantization for general use cases requiring a good balance of speed and accuracy.
Use binary quantization for large-scale applications (1M+ vectors) where speed is critical.
Use float32 only for applications requiring maximum precision, where accuracy is paramount.

Vector quantization becomes particularly valuable for databases exceeding 1M vectors, where it enables significant scalability improvements without compromising retrieval accuracy. When combined with MongoDB Atlas Search Nodes, this approach effectively addresses both cost and performance constraints in advanced vector search applications.

Boost your MongoDB skills today through our Atlas Learning Hub. Head over to our quick start guide to get started with Atlas Vector Search.

June 10, 2025
Developer Blog
