

Building for Developers—Not Imitators

At MongoDB we believe in fair competition, open collaboration, and innovation that empowers developers. Our work has popularized the document database and enabled millions of developers to build modern, scalable applications that power some of the world’s leading companies. We welcome competition that drives progress and expands developer choice. But that only works when everyone plays fair, which, unfortunately, isn’t always the case. On May 16th, we asked FerretDB to stop engaging in unfair business practices that harm both MongoDB and the broader developer community. We believe that FerretDB has crossed two distinct lines:

First, FerretDB misleads and deceives developers by falsely claiming that its product is a “replacement” for MongoDB “in every possible way.”[1] FerretDB compounds the issue by using MongoDB’s name and branding in ways that mislead developers and falsely suggest affiliation and equivalence.

Second, FerretDB has infringed upon MongoDB’s patents. Re-implementing MongoDB commands and functionality relies on misappropriation of MongoDB’s intellectual property without permission or investment. Specifically, FerretDB is infringing multiple MongoDB patents that cover how aggregation pipelines are processed and optimized, as well as other MongoDB functionality that increases the reliability of write operations.

FerretDB is trying to hide behind being an open source alternative, but at the end of the day this isn’t about open source; it’s about imitation, theft, and misappropriation masquerading as compatibility. FerretDB selectively claims compatibility to deceptively attract developers, while omitting key features and limiting deployment when it suits its purposes.[2] In fact, FerretDB’s CEO has acknowledged the confusion developers face when evaluating their product[3]—and rather than clarifying, FerretDB leans into that ambiguity, seeking to exploit it at MongoDB’s expense.
Rather than investing in new ideas, FerretDB has attempted to capitalize on over 15 years of MongoDB’s research and development—copying core innovations, misusing our name, and misrepresenting their product to developers as a drop-in replacement for MongoDB. Developers deserve better. They deserve clarity, transparency, and truly innovative tools they can rely on.

MongoDB takes no pleasure in raising these concerns. We remain committed to open development principles and welcome healthy competition. While we had sincerely hoped FerretDB would choose to compete fairly, their continued actions have left us no choice but to protect our reputation and intellectual property through legal action.

[1] https://www.ferretdb.com/
[2] For example: FerretDB positions itself as an open-source alternative to MongoDB, offering wire protocol compatibility to support MongoDB drivers and tools on a PostgreSQL backend. However, FerretDB acknowledges it does not offer full feature parity and advises developers to verify compatibility before migrating (Migration Guide).
[3] https://www.contributor.fyi/ferretdb

May 23, 2025

Future-Proof Your Apps with MongoDB and WeKan

We build where it matters most—filling capability gaps and evolving alongside our customers. When ecosystem partners offer strong solutions, we listen and adjust. As part of this focus, MongoDB will phase out Atlas Device Sync (ADS) by September 30, 2025. This shift allows us to double down on what we do best: delivering a secure, high-performance database platform that supports the next generation of modern applications.

Together, MongoDB and WeKan will help organizations navigate this transition seamlessly, offering structured guidance to ensure a smooth migration to advanced, future-proof solutions. This is an opportunity for organizations to future-proof their applications, maintain operational excellence, and adopt cutting-edge technology for the next generation of business needs.

Navigating next steps: Choosing the right path with WeKan

WeKan is a leading application modernization consultancy and, since YEAR, has been MongoDB’s trusted partner for mobile and IoT. WeKan’s team of expert MongoDB engineers has supported complex Realm and Atlas Device Sync implementations for some of MongoDB’s largest enterprise customers. Today, WeKan is actively guiding organizations through the Realm end-of-life transition—helping them assess, validate, and migrate to alternative technologies like Ditto, PowerSync, ObjectBox, and HiveMQ, among others. In many cases, WeKan also supports customers in building custom sync solutions tailored to their unique needs.

“MongoDB strategically invested in WeKan for their deep expertise in complex Edge and Mobile environments,” said Andrew Davidson, Senior Vice President of Products at MongoDB. “Since MongoDB’s acquisition of Realm in 2019, WeKan has played a pivotal role in modernizing mobile and IoT workloads for some of our largest enterprise customers using Atlas Device Sync and Realm.
As ADS approaches end-of-life, the specialized knowledge they’ve developed positions them as the ideal partner to support customers in their migration to alternative sync solutions.”

Here’s how WeKan supports and streamlines organizations’ transition from Realm and ADS. They provide services in the following areas:

- Assessment and consultancy
- Proof-of-concept development
- Structured migration plan
- Technical support and training

Let’s dive into each area!

Assessment and consultancy

A successful migration begins with a deep understanding of the current application landscape. WeKan’s experts conduct a two-week assessment—involving discovery workshops and technical deep dives—to evaluate dependencies, security, performance, scalability, and an organization’s future needs. The result is a customized migration roadmap with recommended solutions, architecture diagrams, and a strategy aligned with business goals.

Figure 1. Summary of activities from WeKan’s Assessment Framework.

Proof-of-concept service

To validate the migration strategy, WeKan provides a structured proof-of-concept (POC) service. This phase allows businesses to test solutions like PowerSync or Ditto against success criteria by building a sample mobile app that simulates the new environment. Through planning, implementation, testing, and documentation, WeKan assesses performance, integration, and feasibility—enabling informed decisions before proceeding with the full-scale migration.

Figure 2. WeKan’s steps for POC & technical validation.

Structured migration plan

Once the assessment and POC phases are complete, WeKan executes a structured migration plan designed to minimize disruptions. The migration process is broken into key phases, including integrating new SDKs, optimizing data models, transitioning queries, and deploying a pilot before a full rollout.
WeKan ensures that all code changes, data migrations, security configurations, and performance optimizations are handled efficiently, enabling a seamless transition with minimal downtime.

Figure 3. Sample WeKan migration plan and activities.

Technical support and training

Post-migration support is essential for a smooth transition, and WeKan provides dedicated technical assistance, training, and documentation to ensure teams can effectively manage and optimize their new systems. This support includes hands-on guidance for development teams, troubleshooting assistance, and best practices for maintaining the new infrastructure. With ongoing support, businesses can confidently adapt to their upgraded environment while maximizing the benefits of the migration.

Start your migration journey with confidence

As Atlas Device Sync approaches end-of-life, now is the time to act. WeKan brings deep expertise and a structured migration approach to help you transition seamlessly—whether you choose Ditto, PowerSync, or another alternative. This is more than a technology shift. It’s an opportunity to embrace digital transformation and build a future-ready, high-performance infrastructure that evolves with your business needs.

Partner with WeKan and MongoDB to ensure a smooth, expert-led migration from legacy sync solutions. With proven methodologies and deep technical know-how, WeKan minimizes disruption while maximizing long-term impact. Learn how WeKan can simplify your migration and set you up for scalable success. Let’s future-proof your digital foundation—together. Contact us today!

Boost your MongoDB Atlas skills today through our Atlas Learning Hub!

May 22, 2025

Strategic Database Architecture for AI - Unified vs. Split

Key takeaways:

- A unified architecture significantly reduces development complexity by eliminating synchronization challenges between separate vector and operational databases.
- Data consistency is guaranteed through atomic transactions in unified systems, preventing "ghost documents" and other split-architecture failures.
- The total cost of ownership is typically lower with unified architectures due to consolidated infrastructure and a reduced maintenance burden.
- Developer velocity increases with unified approaches, as teams can focus on building features rather than integration code and error handling.
- MongoDB Atlas provides future-proofing benefits with integrated AI capabilities like vector search, automatic quantization, and more.

AI demands more from databases, and the architectural decisions organizations make today directly affect their time-to-market and competitive edge. In the generative AI era, your database must support both high-dimensional vector searches and fast transactional workloads to keep pace with rapid business and technological change.

In this piece, we examine the architectural factors technology leaders and architects should weigh when managing AI applications’ diverse data requirements, including high-dimensional vector embeddings for semantic search alongside traditional operational data (user profiles, content metadata, etc.). This dichotomy presents two distinct architectural approaches—split versus unified—each with significant implications for application performance, consistency, and developer experience.

Note: For technical leaders who want to equip their teams with the nuts-and-bolts details—or who need solid evidence to win over skeptical developers—we've published a comprehensive implementation guide. While this article focuses on the strategic considerations, the guide dives into the code-level realities that your development team will appreciate.
Why data architecture matters

Building successful AI products and features involves thinking ahead about the speed and cost of intelligence at scale. Whether you’re implementing semantic search for a knowledge base or powering a real-time recommendation engine, your database architecture underpins how quickly and reliably you can bring those features to market.

In the AI era, success no longer hinges solely on having innovative algorithms—it's fundamentally determined by output accuracy and relevancy. This represents a profound shift: data architecture, once relegated to IT departments, has become everyone's strategic concern. It directly influences how quickly your developers can innovate (developer velocity), how rapidly you can introduce new capabilities to the market (time-to-market), and how reliably your systems perform under real-world conditions (operational reliability). In essence, your data architecture has become the foundation upon which your entire AI strategy either thrives or falters.

Unlike traditional applications that dealt mostly with structured data and simple CRUD queries, AI applications generate and query vector representations of unstructured data (like text, images, and audio) to find “similar” items. These vectors are often stored in dedicated vector databases or search engines optimized for similarity search. At the same time, applications still need traditional queries (exact lookups, aggregations, transactions on business data). This raises a fundamental architectural question: do we use separate specialized databases for these different workloads and data structures, or unify them in one system?

Let’s also take the opportunity to briefly address the concept of an “AI database,” a term that has emerged to describe a system that handles both standard operational workloads and AI-specific operations such as vector search.
In short, the AI search capabilities in modern AI applications rest on AI retrieval techniques enabled by databases optimized for AI workloads.

Split architecture: Integrating a separate vector store

In a split architecture, vector operations and transactional data management are delegated to separate, specialized systems. A general-purpose database (e.g., MongoDB, PostgreSQL) maintains operational data, while a dedicated vector store (e.g., Elasticsearch, Pinecone) manages embeddings and similarity search operations. On the surface, this divide-and-conquer approach lets each system do what it's best at. The search engine or dedicated vector store can specialize in vector similarity queries, while the operational database handles updates and persistence. This leverages specialized optimizations in each system but introduces synchronization requirements between data stores. Many AI teams have implemented semantic search and other AI functionality this way, using an external vector index alongside their application database, with both systems kept in sync through custom middleware or application-level logic.

Split architecture characteristics:

Specialized systems: Each database is optimized for its role (e.g., the operational DB ensures fast writes, ACID transactions, and rich queries; the vector search engine provides efficient similarity search using indexes like HNSW for approximate nearest neighbor).

Data duplication: Vector embeddings (and often some identifiers or metadata) are duplicated in the vector store. The primary ID or key exists in both systems to link results.

Synchronization logic: The application must handle synchronization—for every create/update/delete of a record, you must also update or delete the corresponding vector entry in the search index. This can be done via event streams, change data capture, or application code calling two systems.
Data querying: Multi-stage query patterns requiring cross-system coordination.

Example stack: One example is using MongoDB as the source of truth for product documents, and Elasticsearch as a vector search engine for product description embeddings. The app writes to MongoDB, then indexes the embedding into Elasticsearch; at query time it does a vector search in Elasticsearch, then fetches the full document from MongoDB by ID.

This system pattern is what we hear from a number of AI teams that leverage MongoDB and... well, just about anything else that promises to make vectors dance faster. It's the architectural equivalent of wearing both a belt and suspenders—sure, your pants aren't falling down, but you're working awfully hard to solve what could be a simpler problem. These teams often find themselves building more synchronization code than actual features, turning what should be AI innovation into a complex juggling act of database coordination.

Figure 1. Split architecture: MongoDB operational database + Elasticsearch vector store.

Putting belts and suspenders aside, the notable point is that splitting the architecture comes at a cost. You now have two sources of truth that need to stay in sync. Every time you add or update data, you must index the vector in the search engine. Every query involves multiple round trips—one to the search service to find relevant items, and another to the database to fetch full details. This added complexity can slow development and introduces potential points of failure. Operating a split system introduces challenges, as we’ll discuss, around consistency (e.g., “ghost” records when the two systems get out of sync) and added complexity in development and maintenance. In extremely high-scale or ultra-low-latency use cases (e.g., >1B vectors or <1 ms nearest-neighbor SLAs), a dedicated vector engine such as FAISS or Milvus may still outperform a general-purpose database on raw similarity-search throughput.
However, MongoDB Atlas’s Search Nodes isolate vector search workloads onto separate, memory-optimized instances—allowing you to scale and tune search performance independently of your database nodes, often delivering the low-latency guarantees modern AI applications require.

Unified architecture with MongoDB Atlas: One platform for AI data

In a unified architecture, a single database platform handles both operational data and vector search functionality. MongoDB Atlas Vector Search integrates vector indexing and search directly into the MongoDB database. This architectural pattern simplifies the data model by storing embeddings alongside associated data in the same document structure. The database system internally manages vector indexing (using algorithms like HNSW) and provides integrated query capabilities across both vector and traditional data patterns. In practice, this means your application can execute one query (to MongoDB) that filters and finds data based on vector similarity, without needing a second system. All data—your application’s documents and their vector representations—lives in one place, under one ACID-compliant transactional system for your AI workload.

Unified architecture characteristics:

Single source of truth: Both the raw data and the vector indexes reside in one database. For example, MongoDB Atlas allows storing vector fields in documents and querying them with integrated vector search operators. There is no need to duplicate or sync data between different systems.

Atomic operations: Updates to a document and its vector embedding occur in one atomic transaction or write operation. This guarantees strong consistency—your vector index can’t drift from your document data. If a transaction fails, none of the changes (neither the document nor its embedding) are committed.
This eliminates issues like “ghost documents” (we’ll define this shortly), because it's impossible to have an embedding without its corresponding document in the same database.

Unified query capabilities: The query language (e.g., MongoDB’s MQL) can combine traditional filters, full-text search, and vector similarity search in one query. This hybrid search capability means you can, for instance, find documents where category = "Tech" and the embedding is similar to a query vector—all in one go. You don’t have to run two queries against different systems and then merge the results in your application.

Operational simplicity: There’s only one system to manage, secure, scale, and monitor. In a managed cloud platform like MongoDB Atlas, you get a fully managed service that handles both operational and vector workloads, often with features to optimize each (for example, dedicated “search nodes” that handle search indexing and queries so that heavy vector searches don’t impact transactional workload performance).

Figure 2. Unified architecture: MongoDB Atlas with integrated Vector Search.

MongoDB Atlas integrates the Atlas Vector Search engine (built on Apache Lucene, the same technology used in some dedicated vector search engines) directly into the database. This allows developers to store high-dimensional vectors in documents and run similarity searches using indexes powered by algorithms like HNSW (Hierarchical Navigable Small World graphs) for approximate nearest neighbor (ANN) search. Additional features like vector quantization (to compress vectors for efficiency) and hybrid search (combining vector and text searches) are supported out of the box and expressed in the MongoDB Query Language (MQL). All of this occurs under the umbrella of the MongoDB Atlas database’s transaction engine and security architecture.
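As a concrete illustration of such a hybrid query, here is a minimal Python sketch that composes an aggregation pipeline combining a $vectorSearch stage with a metadata pre-filter. The index name ("vector_index") and field names ("embedding", "category", "title") are hypothetical; in a real application the pipeline would be passed to a driver call such as db.articles.aggregate(pipeline):

```python
def hybrid_search_pipeline(query_vector, category, limit=5):
    """Compose an MQL aggregation pipeline combining a $vectorSearch
    stage with a traditional metadata filter. The index name and
    field names are illustrative, not from a real deployment."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",       # Atlas Vector Search index
                "path": "embedding",           # document field holding the vector
                "queryVector": query_vector,
                "numCandidates": limit * 20,   # ANN candidate pool size
                "limit": limit,
                "filter": {"category": category},  # metadata pre-filter
            }
        },
        # Project only the fields the application needs, plus the score.
        {"$project": {"title": 1, "category": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

pipeline = hybrid_search_pipeline([0.1, 0.2, 0.3], "Tech")
# In a real app: results = db.articles.aggregate(pipeline)
```

The point of the sketch is that filtering and similarity search are expressed in one pipeline against one system, so there is no result-merging logic in the application.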
In short, the unified approach aims to provide the best of both worlds—the rich functionality of a specialized vector store and the reliability and consistency of a single operational datastore.

A strategic consideration for decision makers

For technical leaders managing both innovation and budgets, the unified approach presents a compelling financial case alongside its technical merits. If your organization is already leveraging MongoDB as your operational database—as thousands of enterprises worldwide do—the path to AI enablement becomes remarkably streamlined. Rather than allocating budget for an entirely new vector database system, with all the associated licensing, infrastructure, and staffing costs, you can extend your existing MongoDB investment to handle vector workloads. Your teams already understand MongoDB's architecture, security model, and operational characteristics, so adding vector capabilities becomes an incremental skill addition rather than a steep learning curve for an entirely new system. For projects already in flight, migrating vector data or generating new embeddings within your existing MongoDB infrastructure can be accomplished without disrupting ongoing operations.

Technical overview of split vs. unified architecture

To illustrate the practical implications of each architecture, let’s look at high-level implementation and operational considerations for a knowledge-base question-answering application. Both approaches enable vector similarity search, but with notable differences in implementation complexity and consistency guarantees.

Figure 3. Split architecture: the hidden cost.

In a split architecture (e.g., using MongoDB + Elasticsearch): We store the article content and metadata in MongoDB, and store the embedding vectors in an Elasticsearch index. At query time, we search the Elasticsearch index by vector similarity to get a list of top article IDs, then retrieve those articles from MongoDB by their IDs.
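To make the two round trips concrete, here is a self-contained Python sketch of the split read path, with plain in-memory dictionaries standing in for the two systems. A real deployment would use an Elasticsearch client and a MongoDB driver; the document IDs, titles, and two-dimensional embeddings are invented for illustration:

```python
import math

# In-memory stand-ins for the two systems in a split architecture.
documents = {  # "MongoDB": the source of truth for article content
    "a1": {"title": "Reset your router", "content": "Power-cycle steps."},
    "a2": {"title": "Update firmware", "content": "Download and flash."},
}
vector_index = {  # "Elasticsearch": article ID -> embedding vector
    "a1": [1.0, 0.0],
    "a2": [0.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def split_search(query_vec, k=1):
    # Round trip 1: similarity search in the vector store returns IDs only.
    ranked = sorted(vector_index,
                    key=lambda i: cosine(query_vec, vector_index[i]),
                    reverse=True)[:k]
    # Round trip 2: hydrate full documents from the operational database.
    # IDs present in the index but missing from the DB would be "ghost" hits,
    # so the application must guard against them explicitly.
    return [documents[i] for i in ranked if i in documents]

results = split_search([0.9, 0.1])
```

Notice that the `if i in documents` guard is application code written purely to paper over cross-system inconsistency; that mismatch is exactly the ghost-document failure mode discussed below.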
There are several key operations involved in a dual-database architecture:

Creation: During document creation, the application must coordinate insertions across both systems. First, the document is stored in MongoDB, then its vector embedding is generated and stored in Elasticsearch. If either operation fails, manual rollback logic is needed to maintain consistency. For example, if the MongoDB insertion succeeds but the Elasticsearch indexing fails, developers must implement custom cleanup code to delete the orphaned MongoDB document.

Read: Vector search becomes a multi-stage process in a split architecture. The application first queries Elasticsearch to find similar vectors, retrieves only the document IDs, then makes a second round trip to MongoDB to fetch the complete documents matching those IDs. This introduces additional network latency and requires error handling for cases where documents exist in one system but not the other.

Update: Updating content presents significant synchronization challenges. After updating a document in MongoDB, the application must also update the corresponding vector in Elasticsearch. If the Elasticsearch update fails after the MongoDB update succeeds, the systems become out of sync, with the vector search returning outdated or incorrect results. There's no atomic transaction spanning both systems, so complex recovery mechanisms are required.

Deletion: Deletion operations face similar synchronization issues. When a document is deleted from MongoDB but the corresponding deletion in Elasticsearch fails, "ghost documents" appear in search results: vectors pointing to documents that no longer exist. Users receive search results they cannot access, creating a confusing experience and potential security concerns if sensitive information remains indirectly accessible through preview content stored in Elasticsearch.
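The dual-write coordination and its failure modes can be sketched in a few lines of Python. This is a toy model under obvious assumptions (in-memory dicts instead of real MongoDB and Elasticsearch clients, a flag to simulate a failed index write); it is meant only to show where the manual rollback lives and how a ghost vector arises:

```python
class IndexingError(Exception):
    pass

class SplitStore:
    """Toy model of dual writes across an operational DB and a separate
    vector index, showing the manual rollback a split architecture
    forces onto application code."""
    def __init__(self):
        self.db = {}     # operational store (stand-in for MongoDB)
        self.index = {}  # vector store (stand-in for Elasticsearch)

    def create(self, doc_id, doc, embedding, index_fails=False):
        self.db[doc_id] = doc                 # write 1: source of truth
        try:
            if index_fails:                   # simulate a network timeout
                raise IndexingError("vector index write failed")
            self.index[doc_id] = embedding    # write 2: vector index
        except IndexingError:
            del self.db[doc_id]               # manual rollback / cleanup code
            raise

    def delete(self, doc_id, index_fails=False):
        del self.db[doc_id]                   # delete 1: source of truth
        if index_fails:
            return                            # silent failure -> ghost vector
        del self.index[doc_id]                # delete 2: vector index

store = SplitStore()
store.create("a1", {"title": "Guide"}, [0.1, 0.2])
store.delete("a1", index_fails=True)
# The vector now points at a document that no longer exists:
ghosts = [i for i in store.index if i not in store.db]
```

Every branch in this sketch is synchronization plumbing rather than product functionality, which is the hidden cost the figure above alludes to.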
Each of these operations requires careful error handling, retry mechanisms, monitoring systems, and background reconciliation processes to maintain consistency between the two databases. Notably, the complexity compounds over time: synchronization issues become more difficult to detect and resolve as the data volume grows, ultimately impacting both developer productivity and user experience.

Figure 4. CRUD operations in a unified architecture: MongoDB Atlas with vector search.

In a unified architecture (using MongoDB Atlas Vector Search): We store both the article data and its embedding vector in a single MongoDB document. An Atlas Vector Search index on the embedding field allows us to perform a similarity search directly within MongoDB using a single query. The database internally uses the vector index to find nearest neighbors and return the documents. Let's examine how the same operations simplify dramatically in a unified architecture:

Creation: Document creation becomes an atomic operation. The application stores both the document and its vector embedding in a single MongoDB document with one insert operation. Either the entire document (with its embedding) is stored successfully, or nothing is stored at all. There's no need for custom rollback logic or cleanup code, since MongoDB's transaction guarantees ensure data integrity without additional application code.

Read: Vector search is streamlined into a single step. Using MongoDB's aggregation pipeline with Atlas Vector Search, the application queries for similar vectors and retrieves the complete documents in a single round trip. There's no need to coordinate between separate systems or handle inconsistencies, as the vector search is directly integrated with document retrieval, substantially reducing both latency and code complexity.

Update: Document updates maintain perfect consistency.
When updating a document's content, the application can atomically update both the document and its vector embedding in a single operation. MongoDB's transactional guarantees ensure that either both are updated or neither is, eliminating the possibility of out-of-sync data representations. Developers no longer need to implement complex recovery mechanisms for partial failures.

Deletion: The ghost-document problem vanishes entirely. When a document is deleted, its vector embedding is automatically removed as well, since they exist in the same document. There's no possibility of orphaned vectors or inconsistent search results. This ensures that search results always reflect the current state of the database, improving both reliability and security.

This unified approach eliminates the entire category of synchronization challenges inherent in split architectures. Developers can focus on building features rather than synchronization mechanisms, monitoring tools, and recovery processes. The system naturally scales without increasing complexity, maintaining consistent performance and reliability even as data volumes grow. Beyond the technical benefits, this translates to faster development cycles, more reliable applications, and ultimately a better experience for end users who receive consistently accurate search results. The vector search and document retrieval happen in one round trip to the database, which fundamentally transforms both the performance characteristics and operational simplicity of AI-powered applications.

Syncing data: Challenges and "ghost documents"

One of the biggest challenges with the split architecture is data synchronization. Because there are two sources of truth (the operational DB and the vector index), any change to data must be propagated to both. In practice, perfect synchronization is hard—network glitches, bugs, or process failures can result in one store updating while the other doesn't.
This can lead to inconsistencies that are difficult to detect and resolve. A notorious example in a split setup is the "ghost document" scenario: the vector search returns a reference to a document that no longer exists (or no longer matches the criteria) in the primary database. For instance, suppose an article was deleted or marked private in MongoDB but its embedding was not removed from Elasticsearch. A vector search might still retrieve its ID as a top result, leading your application to try to fetch a document that isn't there or shouldn't be shown. From a user's perspective, this could surface a result that is broken or stale.

Let's go back to our earlier practical scenario: imagine a knowledge base system for customer support where articles are constantly being updated and occasionally removed when they become outdated. When a support agent deletes an article about a discontinued product, the deletion succeeds in MongoDB, but due to a network timeout, the corresponding vector deletion in Elasticsearch fails. And yes, that happens, especially with applications handling millions of requests daily. Later, when a customer searches for solutions related to that discontinued product, the vector search in Elasticsearch identifies the now-deleted article as highly relevant and returns its ID. When the application attempts to fetch the full content from MongoDB using this ID, it discovers the document no longer exists. The customer sees a broken link or an error message instead of helpful content, creating a confusing and frustrating experience.

What's particularly insidious about this problem is that it can manifest in various ways across the application. Beyond complete document deletion issues, you might encounter:

Stale embeddings: A document is updated in MongoDB with new content, but the vector in Elasticsearch still represents the old version, causing search results that don't match the actual content.
Permission inconsistencies: A document's access permissions change in MongoDB (e.g., from public to private), but it still appears in vector search results for users who shouldn't access it.

Partial updates: Only some fields get updated across the systems, leading to mismatched metadata between what's shown in search previews and the actual document.

In production environments, development teams often resort to implementing complex workarounds to mitigate these synchronization issues:

- Background reconciliation jobs that periodically compare documents across both systems and repair inconsistencies
- Outbox patterns where operations are logged to a separate store and retried until successful
- Custom monitoring systems specifically designed to detect and alert on cross-database inconsistencies
- Manual intervention processes for support teams to address user-reported discrepancies

All these mechanisms represent significant development effort that could otherwise be directed toward building features that deliver real business value. They also introduce additional points of failure and operational complexity.

Crucially, a unified architecture avoids this entire class of problems. Since there is only one database, a deleted document is automatically removed from any associated indexes within the same transaction. A unified data model makes it effectively impossible to have a vector without its document, because the two are held in the same document. As a result, issues like ghost documents, stale vector references, or the need to reconcile two datastores simply go away. No synchronization is needed: when documents and embeddings live in one database, you eliminate the risk of ghost documents and inconsistent reads.

Trade-offs and considerations

There are several key trade-offs to weigh when comparing split and unified architectures for AI data.
As mentioned, your choice will affect system complexity, performance characteristics, scalability, cost, and development agility. For AI project leads and enterprise AI leaders, it's vital to understand these considerations; below are a few:

Figure 5. Trade-off comparison: split vs. MongoDB unified architecture.

System Complexity vs. Data Consistency: Maintaining consistency in a split setup requires additional logic and increases system complexity. Every piece of data is effectively handled twice, introducing opportunities for inconsistency and complex failure modes. In a unified architecture, ACID transactions ensure that updates to data and its embedding vector occur together or not at all, simplifying the design and reducing custom error-handling code.

Operational Overhead vs. Performance: A split architecture can leverage specialized engines optimized for similarity queries, but introduces network latency with multiple round trips and increases operational overhead with two systems to monitor. Unified architectures eliminate the extra network hop, potentially reducing query latency. MongoDB Atlas offers optimizations like vector quantization and dedicated search-processing nodes that can match or exceed the performance of separate search engines.

Scalability vs. Cost Efficiency: Split architectures allow independent scaling of components but come with infrastructure cost duplication and data redundancy. A unified architecture consolidates resources while still enabling workload isolation through features like Atlas Search Nodes. This simplifies capacity planning and helps avoid over-provisioning multiple systems.

Maintenance Burden vs. Developer Velocity: Split architectures require substantial "glue code" for integration, dual writes, and synchronization, slowing development and complicating schema changes.
Unified architectures let developers focus on application logic with fewer moving parts and a single query language, potentially accelerating time-to-market for AI features.
Future-Proofing: Simpler unified architectures make it easier and faster to adopt new capabilities as AI technology evolves. Split systems accumulate technical debt with each component upgrade, while unified platforms can incorporate new features transparently without redesigning integration points.
While some organizations may initially choose a split approach due to legacy systems or specialized requirements, MongoDB's unified architecture with Atlas Vector Search now addresses many historical reasons for separate search engines, offering hybrid search capabilities, accuracy options, and optimization tools within a single database environment.
Choosing the right architecture for AI workloads
When should you choose a split architecture, and when does a unified architecture make more sense? The answer ultimately depends on your specific requirements and constraints.
Consider a Split Architecture if you already have significant infrastructure built around a specialized search or vector database and it’s meeting your needs. In some cases, extremely high-scale search applications might be deeply tuned on a separate engine, or regulatory requirements might dictate separate data stores. A split approach can also make sense if one type of workload far outstrips the other (e.g., you perform vector searches on billions of items but have relatively light transactional operations, though even then, a unified solution with the right indexing can handle surprising scale). Just be prepared to invest in the tooling and engineering effort to keep the two systems in harmony. If you go this route, design your sync processes carefully and consider using change streams or event buses to propagate changes reliably.
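If you do run a split architecture, the change-stream approach mentioned above can be sketched like this. The collection and field names are illustrative, and the event-to-action mapping is a deliberately simplified assumption, not a complete sync pipeline:

```python
def sync_action(event):
    """Translate a MongoDB change-stream event into an operation
    against the external vector store (split architecture only)."""
    op = event["operationType"]
    doc_id = event["documentKey"]["_id"]
    if op == "delete":
        return ("delete_vector", doc_id)
    if op in ("insert", "update", "replace"):
        # Re-embed and upsert; a real pipeline would also need retries
        # and dead-lettering for failed writes.
        return ("upsert_vector", doc_id)
    return ("ignore", doc_id)

# Against a live deployment this would plug into a watch() loop:
#   with db.articles.watch() as stream:
#       for event in stream:
#           action, doc_id = sync_action(event)
#           ...apply the action to the vector store...

action = sync_action({"operationType": "delete", "documentKey": {"_id": 42}})
```

Even in this tiny sketch, every branch is a place where the two systems can drift apart, which is exactly the operational cost being weighed here.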
Also, weigh the operational cost: maintaining expertise in two platforms and the integration between them is non-trivial.
Consider a Unified Architecture if you are building a new AI-powered application or modernizing an existing one, and you want simplicity, consistency, and speed of development. If avoiding the pitfalls of data sync and reducing operational complexity are priorities, unified is a great choice. A unified platform shines when your application needs tight integration between operational and vector data, for example, performing a semantic search with runtime filters on metadata, or updating content and immediately reflecting it in search results. With a solution like MongoDB’s modern data platform, you get a fully managed, cloud-ready database that can handle both your online application needs and AI search needs under one roof. This leads to faster development cycles (since your team can work with one system and one query language) and greater confidence that your search results reflect the true state of your data at any moment.
Figure 5. Unified architecture benefits in MongoDB Atlas Vector Search
Looking ahead, a unified architecture is arguably the more future-proof approach. AI capabilities evolve at an accelerated pace, so having your data in one place allows you to leverage new features immediately. We work with customers building sophisticated AI applications, and one key observation is the need to streamline data processing operations in applications that leverage RAG pipelines or agentic AI. Critical operations include chunking, embedding generation, vector search, and reranking. We've also brought Voyage AI’s state-of-the-art embedding models and rerankers to MongoDB. Soon, these models will reside within MongoDB Atlas, enabling the conversion of data objects into embeddings and adding a further layer of data management to retrieval pipelines, all within MongoDB Atlas.
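The "semantic search with runtime filters on metadata" pattern can be sketched as an Atlas Vector Search aggregation pipeline. The index name, field names, and the tenant/visibility filter below are illustrative assumptions, not a specific production schema:

```python
def vector_search_pipeline(query_vector, tenant_id, limit=5):
    """Build an aggregation pipeline that combines semantic search
    with a runtime metadata filter in a single query."""
    return [
        {
            "$vectorSearch": {
                "index": "embedding_index",   # illustrative index name
                "path": "embedding",
                "queryVector": query_vector,
                "numCandidates": limit * 20,
                "limit": limit,
                # The permission/metadata filter runs in the same query,
                # so there is no second system to keep consistent.
                "filter": {"tenant_id": tenant_id, "visibility": "public"},
            }
        },
        {"$project": {"content": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

pipeline = vector_search_pipeline([0.1] * 1536, tenant_id="acme")
# Against a live cluster: db.articles.aggregate(pipeline)
```

Because the filter is evaluated inside the same query as the similarity search, a permission change on a document takes effect on the very next search, with no sync lag.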
Bringing these models into the platform is one of the key ways MongoDB continues to bring intelligence to the data layer, creating a truly intelligent data foundation for AI applications. MongoDB's Atlas platform is continually expanding its AI-focused features, from vector search improvements to integration with data streams and real-time analytics, all while ensuring the core database guarantees (like ACID transactions and high availability) remain solid. This means you don't have to re-architect your data layer to adopt the next big advancement in AI; your existing platform grows to support it. Ultimately, the split vs. unified architecture debate is a classic example of balancing specialization against simplicity. Split systems can offer best-of-breed components for each task, but at the cost of complexity and potential inconsistency. Unified systems offer elegance and ease, bundling capabilities in one place, and have rapidly closed the gap in terms of features and performance. Let’s end on this: MongoDB was built for change, and that ethos is exactly what organizations need as they navigate the AI revolution. By consolidating your data infrastructure and embracing technologies that unify capabilities, you equip your teams with the freedom to experiment and the confidence to execute. The future will belong to those who can harness AI and data together seamlessly. It’s time to evaluate your own architecture and make sure it enables you to ride the wave of AI innovation, not be washed away by it. In an AI-first era, the ability to adapt quickly and execute with excellence is what separates and defines leaders. The choice of database infrastructure is a pivotal part of that execution. Choose wisely: your next breakthrough might depend on it.
Try MongoDB Atlas for free today, or head over to our Atlas Learning Hub to boost your MongoDB Atlas skills!

May 22, 2025

Agentic Workflows in Insurance Claim Processing

In 2025, agentic AI is transforming the insurance industry, enabling autonomous systems to perceive, reason, and act independently to achieve complex objectives. Insurers are heavily investing in these technologies to overcome legacy system limitations, deliver hyper-personalized customer experiences, and capitalize on an AI insurance market projected to reach $79.86 billion by 2032. Central to this transformation is efficient claim processing. AI tools like natural language processing, image classification, and vector embeddings help insurers effectively manage claim-related data. These capabilities generate precise catastrophe impact assessments, expedite claim routing with richer metadata, prevent litigation through better analysis, and minimize financial losses using more accurate risk evaluations. Because AI’s promises often sound compelling—but fall short when moving from experimentation to real-world production—this post explores how an AI agent can manage a multi-step claim processing workflow. In this workflow, the agent manages accident photos, assesses damage, and verifies insurance coverage to enhance process efficiency and improve customer satisfaction. The system employs large language models (LLMs) to analyze policy information and related documents retrieved by MongoDB Atlas Vector Search, with the outcomes stored in the Atlas database.
Creating a work order for claim handlers
The defining characteristic of AI agents, and what sets them apart from simply prompting an LLM, is autonomy. The ability to be goal-driven and to operate without precise instructions makes AI agents powerful allies for humans, who can now delegate tedious tasks like never before. But each agent has a different degree of autonomy, and building such systems involves a trade-off between reliability and prescriptiveness. Since LLMs—which can be thought of as the agent's brain—tend to hallucinate and behave nondeterministically, developers need to be very cautious.
Too much “freedom” can lead to unexpected outcomes. On the other hand, including too many constraints, instructions, or hardcoded steps defeats the purpose of building agents. To help agents understand their context, it is important to craft a prompt that describes their scope and goals. This is part of the prompt we’ve used for this exercise:
“You are a claims handler assistant for an insurance company. Your goal is to help claim handlers understand the scope of the current claim and provide relevant information to help them make an informed decision. In particular, based on the description of the accident, you need to fetch and summarize relevant insurance guidelines so that the handler can determine the coverage and process the claim accordingly. Present your findings in a clear and extremely concise manner.”
In addition to the definition of the tasks, it is also important to give instructions on the tools available to the agent and how to use them. Our system is pretty basic, featuring only two tools: Atlas Vector Search and write to the database (see Figure 1).
Figure 1. Agentic workflow.
The Vector Search step maps the vectorized image description to the vectorized related policy, which also contains the description of the coverages for that class of accident. The policy and the related coverages are used by the agent to figure out the recommended next actions and assign a work order to a claim handler. This information is persisted in the database using the second tool, write to the database.
Figure 2. Claim handler workflow.
What does the future hold?
In our example, the degree of autonomy is quite low; for the agent, it boils down to deciding when to use which tool. In real-life scenarios, such systems, even if simple, can save a lot of manual work.
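The two-tool workflow described above can be sketched as follows. Both tools are stubs and the decision step is hardcoded, so all names and canned values here are illustrative, not the article's actual implementation:

```python
saved_orders = []  # stands in for the MongoDB collection

def vector_search_tool(accident_description):
    """Stub for the Atlas Vector Search tool: map the accident
    description to the relevant policy and its coverages."""
    return {"policy": "P-123", "coverages": ["collision", "glass"]}

def write_to_db_tool(work_order):
    """Stub for the 'write to the database' tool."""
    saved_orders.append(work_order)
    return work_order

def run_agent(accident_description):
    # 1. The agent decides to retrieve the policy context first.
    context = vector_search_tool(accident_description)
    # 2. It drafts a work order from the retrieved coverages...
    work_order = {
        "summary": f"Claim under policy {context['policy']}",
        "coverages": context["coverages"],
        "next_action": "assign to claim handler",
    }
    # 3. ...and persists it with the second tool.
    return write_to_db_tool(work_order)

order = run_agent("rear-end collision, broken windshield")
```

In the real workflow, step 1 and the drafting in step 2 are delegated to the LLM, which chooses when to invoke each tool rather than following this fixed sequence.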
They eliminate the need for claim handlers to manually locate related policies and coverages, a cumbersome and error-prone process that involves searching multiple systems, reading lengthy PDFs, and summarizing all their findings. Agents are still in their infancy and require handholding, but they have the potential to act with a degree of autonomy never before seen in software. AI agents can reason, perceive, and act—and their performance is improving at a breakneck pace. The insurance industry (like everybody else!) needs to make sure it’s ready to start experimenting and to embrace change. This can only happen if systems and processes are aligned on one imperative: “make the data easier to work with.”
To learn more about integrating AI into insurance systems with MongoDB, check out the following resources:
- The GitHub repository for insurance solutions
- The MongoDB ebook: Innovate With AI: The Future Enterprise
- The MongoDB blog: AI-Powered Call Centers: A New Era of Customer Service
- The MongoDB YouTube channel: Unlock PDF Search in Insurance with MongoDB & SuperDuperDB

May 21, 2025

Innovating with MongoDB | Customer Successes, May 2025

Welcome back to MongoDB’s bi-monthly roundup of customer success stories! In this series, we’ll share inspirational examples of how organizations around the globe are working with MongoDB to succeed and address critical challenges in today’s multihyphenate (fast-paced, ever-evolving, always-on) world. This month’s theme—really, it could be every month’s theme—is adaptability. It’s almost cliché but true: adaptability has never been more essential to business success. Factors like the increasing amount of data in the world (currently almost 200 zettabytes) and the rise of AI mean that organizations everywhere have to adapt to fundamental changes—in what work looks like, how software is developed and managed, and what end-users expect. So this issue of “Innovating With MongoDB” includes stories of MongoDB customers leveraging our database platform’s flexible schema, seamless scalability, and fully integrated AI capabilities to adapt to what’s next, and to build the agile foundations needed for real-time innovation and dynamic problem-solving. Read on to learn how MongoDB customers like LG U+, Citizens Bank, and L’Oréal aren’t just adapting to change—they’re leading it.
LG U+
LG U+, a leader in mobile, internet, and AI transformation, operates one of Korea's largest customer service centers, handling 3.5 million calls per month. To tackle inefficiencies and improve consultation quality, LG U+ developed Agent Assist on MongoDB Atlas. Leveraging MongoDB Atlas Vector Search, LG U+ integrates vector and operational data, unlocking real-time insights such as customer intent detection and contextual response suggestions. Within four months, LG U+ increased resource efficiency by 30% and reduced processing time per call by 7%, resulting in smoother interactions between agents and customers. By paving the way for intelligent AI solutions, LG U+ can deliver more reliable and personalized experiences for its customers.
Citizens Bank
Citizens Bank, a 200-year-old financial institution, undertook a significant technological transformation to address evolving fraud challenges. In 2023, the bank initiated an 18-20 month overhaul of its fragmented fraud management systems, shifting from legacy, batch-oriented processes to a comprehensive, cloud-based platform on MongoDB Atlas on AWS. This transition enables real-time fraud detection, significantly reducing losses and false positives. Importantly, the new platform provides Citizens Bank customers with enhanced security and a smoother, more reliable banking experience. With Atlas’ flexible schema and cloud-based capabilities, Citizens Bank can quickly implement new fraud prevention strategies in minutes instead of weeks. The bank is now experimenting with MongoDB Atlas Search and generative AI to improve predictive accuracy and stay ahead of emerging fraud patterns. Through our partnership with The Stack, learn how our customers are achieving extraordinary results with MongoDB. This exclusive content could spark the insights you need to drive your business forward.
BioIntelliSense
BioIntelliSense is revolutionizing patient monitoring. Their BioButton® wearable device continuously captures vital signs and transmits the data to the BioDashboard™. This platform allows clinicians to monitor patients, access patient information, and receive near real-time alerts about potential medical conditions. After outgrowing its legacy SQL database, BioIntelliSense reengineered the end-to-end architecture of BioDashboard™ using MongoDB Atlas on AWS, Atlas Search, and MongoDB Time Series Collections. The new system now scales to support hundreds of thousands of concurrent patients while ensuring 100% uptime. By optimizing their use of MongoDB 8.0, BioIntelliSense also identified 25% of their spend that can be redirected to support future innovation.
Enpal
Enpal, a German start-up, is addressing climate change by developing one of Europe's largest renewable energy networks through solar panels, batteries, and EV chargers. Beyond infrastructure, Enpal fosters a community interconnected through data from over 65,000 devices. By utilizing MongoDB Atlas with native time series collections, Enpal efficiently manages 200+ real-time data streams from these devices. This innovative approach forms a virtual power plant that effectively supports the energy transition and is projected to reduce processing costs by nearly 60%. MongoDB enables Enpal to manage large data volumes cost-effectively while providing precise, real-time insights that empower individuals to make informed energy decisions.
Video spotlight: L’Oréal
Before you go, be sure to watch one of our recent customer videos featuring the world's largest cosmetics company, L’Oréal. See why L’Oréal’s Tech Accelerator team says migrating to MongoDB Atlas was like “switching from a family car to a Ferrari.” Want to get inspired by your peers and discover all the ways we empower businesses to innovate for the future? Visit our Customer Success Stories hub to see why these customers, and so many more, build modern applications with MongoDB.

May 20, 2025

Unlocking Literacy: Ojje’s Journey With MongoDB

In the rapidly evolving landscape of education technology, one startup is making waves with a bold mission to revolutionize how young minds learn to read. Ojje is redefining literacy education by combating one of the most pressing issues in education today—reading proficiency. To do so, Ojje leverages groundbreaking technology to ensure every child can access the world of stories, at their own pace, in their own language. That transformative change is powered by a strategic partnership with MongoDB.
Meet Ojje: A vision beyond words
From electric cars to diabetes apps, Adrian Chernoff has been at the forefront of breakthrough innovations. Now, as the Founder and CEO of Ojje, he's channeling his passion for invention and entrepreneurship into something deeply personal and universally important—literacy. At its core, Ojje is an adaptive literacy learning platform that offers stories in 15 different reading levels, available in both English and Spanish. Grounded in the science of reading, it features elements like read-aloud functionality and dyslexia-friendly fonts to engage every learner. Ojje is not just a tool—it’s a gateway to personalized literacy education. Ojje's mission is to reach every learner by providing materials that are leveled, accessible, and engaging. By doing so, Ojje aims to vastly improve reading outcomes across K-12 education.
Solving a literacy crisis with innovative solutions
With literacy rates in the U.S. alarmingly low—almost 70% of low-income fourth grade students cannot read at a basic level, according to the National Literacy Institute—Ojje's mission couldn't be more crucial. Chernoff and his team developed their platform in response to teachers' complaints about the stark lack of appropriate reading materials available to students. Schools needed a tool that could effortlessly cater to varying reading abilities within a single classroom.
Ojje fills this gap by offering a dynamic platform that adapts to individual students’ needs, allowing educators to personalize instruction. The potential to genuinely connect with every student is realized through Ojje’s innovative use of technology.
Powered by MongoDB
At the root of every great tech innovation is an infrastructure that allows it to flourish. For Ojje, MongoDB is that foundation. As a startup, speed and adaptability are vital, and MongoDB’s flexible document model provides just that. It allows the Ojje team to launch rapidly, scale efficiently, and handle a variety of data structures seamlessly—all without the cumbersome need for rigid schemas. “MongoDB handles everything from structured data to student performance tracking, without unnecessary overhead,” Chernoff said. “The platform scales with our needs, and the built-in monitoring tools give our team confidence as usage grows.” Why MongoDB? For Ojje, it was about the flexibility to handle educational content, ensure secure data handling for students, and offer scalability for thousands of classrooms. MongoDB proved to be the perfect fit, offering a balance of adaptability and comprehensive data management. Working with MongoDB also gave Ojje access to the MongoDB for Startups program, providing essential Atlas credits, valuable technical resources, and access to our vast network of partners. This support played a crucial role during Ojje’s developmental stages and early launch, helping to position the company for successful growth and innovation.
What’s next for Ojje?
With an eye towards broadening its impact, the Ojje team plans to expand its library to include STEM materials and engaging biographies, alongside enhancing existing content. Additionally, Ojje will introduce tools for educators to track each reader’s progress in real time, further personalizing instruction.
“We believe every student deserves the chance to love reading—and every teacher deserves tools that make that possible,” Chernoff said. “That’s why we’re building Ojje: to make literacy more accessible, engaging, and joyful. When students can learn to read and read to learn, it transforms not only their K–12 experience but their entire future.” In an exciting development, Ojje will soon unveil Ojje at Home. This initiative aims to extend literacy support beyond the classroom, providing families with valuable resources to join their children on the journey to literacy.
Building a future where every child reads
Ojje's combination of strategic foresight, cutting-edge technology, and genuine passion for educational impact makes it a standout player in the education sector. By partnering with MongoDB, the company has created a robust, adaptive platform that not only meets the demands of today’s classrooms but is poised to address future literacy challenges. As the digital landscape continues to evolve, so must our methods of teaching and learning. Ojje is leading the charge, ensuring that every child has the opportunity to love reading and reap the lifelong benefits it brings.
Interested in MongoDB but not sure where to start? Check out our quick start guides for detailed instructions on deploying and using MongoDB.

May 15, 2025

MongoDB Atlas is Now Available as a Microsoft Azure Native Integration

Since 2019, MongoDB and Microsoft Azure have striven to make it easy for enterprises to launch cutting-edge, modern applications. Key to this effort—and to enabling organizations everywhere to make an impact with AI—has been our work integrating MongoDB Atlas with the Microsoft Intelligent Data Platform. Our aim is to give developers a streamlined, fully integrated experience that they’ll love to use. So I’m very happy to announce the public preview of MongoDB Atlas as an Azure Native Integration (ANI). This latest step in MongoDB’s collaboration with Microsoft means that enterprise customers will be able to easily create and manage MongoDB Atlas organizations while also consolidating billing for Atlas within the Azure console, empowering them to interact with MongoDB Atlas as if it were a first-party service from Azure. I am also pleased to announce MongoDB Atlas on Azure Service Connector, one of several new integrations set to follow directly from the MongoDB Atlas as an Azure Native Integration announcement. Azure Service Connector makes it easy for developers to securely connect Azure compute services to backing services like databases, now including MongoDB Atlas. MongoDB’s mission has always been to empower our customers to move fast with data. With MongoDB Atlas as a native integrated service to Azure, we’re unlocking new possibilities for organizations to harness real-time insights, scale globally, and to accelerate their AI-driven roadmaps—all while reducing operational overhead. With Azure’s robust ecosystem of AI and analytics tools, teams can build and innovate with greater confidence, ultimately transforming how they serve their customers and shaping the future of software. "Integrating MongoDB Atlas as a Microsoft Azure Native Integration marks a significant milestone in our partnership with MongoDB. 
This integration empowers our customers to seamlessly manage their MongoDB Atlas resources within the Azure ecosystem, including unified billing and robust security features,” said Sandy Gupta, Vice President, Global Software Companies Ecosystem, Microsoft. “By simplifying operations and reducing technical complexity, we are enabling organizations to innovate faster and deliver exceptional value to their customers.”
Why this matters: Accelerated development & seamless operations
This streamlined approach reduces technical and organizational complexity, with organizations benefiting from integrated billing, consolidated support, and simplified deployment. Connecting a database platform to external services typically requires juggling multiple portals, credentials, and security configurations. Starting today, with MongoDB Atlas as an Azure Native Integration, organizations can:
- Create and manage Atlas organizations directly within Azure, including the Azure Portal UI and CLI/SDK/ARM.
- Enjoy consolidated billing for both Azure and MongoDB Atlas.
- Access Azure’s AI services, data analytics, and more—all while harnessing the flexible, scalable power of MongoDB Atlas.
It’s worth dwelling for a minute on the simplified onboarding and billing component of ANI, one of the biggest benefits of this integration for customers. As an Azure Native Integration, users can create their MongoDB Atlas organization and select their company billing plan directly from Azure, automatically applying the Azure billing plan to the Atlas Organization. This is made possible by leveraging Azure's comprehensive suite of billing and cost management tools, providing enterprises with enhanced control and visibility over their expenditures.
Benefits of using MongoDB Atlas and Microsoft Azure together
This latest MongoDB Atlas integration on Azure builds on a strong foundation of technical collaboration.
Together, MongoDB Atlas on Azure already delivers a powerful set of integrations that offer customers and development teams a wide range of benefits, including:
Unified workloads: MongoDB Atlas offers a single platform that supports a range of workloads, from transactional, time series, and search to real-time analytics. With native integration on Azure, teams can quickly build across a wide variety of data-driven use cases, from e-commerce transactions to generative AI applications, all without any re-architecting.
Streamlined AI integration: Accelerate machine learning (ML) workflows and generative AI projects with minimal configuration. Organizations can connect to Azure AI Foundry, Azure OpenAI Service, Microsoft Fabric, or Azure Databricks for advanced analytics, and MongoDB Atlas automatically scales in response to dynamic workloads.
End-to-end security and compliance: MongoDB Atlas integrates with Microsoft Entra ID (formerly Azure AD), Azure Key Vault, and Azure Private Link for secure single sign-on, encryption key management, and private networking, respectively. With Microsoft Purview, organizations can meet stringent governance and compliance requirements, and teams remain agile without sacrificing enterprise-grade security.
Scalability and global footprint: Azure’s extensive regional coverage enables organizations to deploy MongoDB Atlas in 40+ Azure regions worldwide. This ensures data remains close to users for low-latency, high-performance applications.
How to deploy MongoDB Atlas as an Azure Native Integration
1. Search for MongoDB Atlas in the Azure Portal or the Azure Marketplace.
2. Create a MongoDB Atlas Organization and choose an Azure billing plan.
That’s it! You’ve successfully created an Atlas Organization.
From your new Atlas Organization, you can start taking advantage of other Azure services already integrated into MongoDB Atlas:
- Configure security and network settings using existing Azure Virtual Networks and Azure Private Link, as required.
- Begin building AI capabilities into applications by connecting to Azure AI Foundry, Azure Databricks, or Microsoft Fabric.
Get started with deploying MongoDB Atlas as an Azure Native Integration through our quick start guide.

May 14, 2025

OrderOnline: AI Improves Conversion Rate by 56% with MongoDB

Established by Ordivo Group in 2018, OrderOnline has quickly become a driving force behind Indonesia’s thriving social commerce market. OrderOnline offers an end-to-end solution for organizations and individuals selling on social platforms like Facebook Marketplace, typically through social ads, landing pages, and storefronts. OrderOnline built its social commerce platform on the MongoDB Community Edition, and later migrated to MongoDB Atlas in 2022. The platform provides everything from managing orders to handling logistics for companies and individuals selling on social platforms. It addresses common social commerce pain points, such as complex logistics, failed deliveries, and unmanageable order processing due to scale. Speaking at MongoDB.local Jakarta 2024, Wafda Mufti, Vice President of Technology for Ordivo Group, explained how his slogan—“Simple Input, High Accuracy”—drove OrderOnline to become one of Indonesia’s leading social commerce companies. “We have sellers using storefronts, landing pages, and checkout forms. Thanks to MongoDB's flexibility, we can manage these unpredictable business processes. We even store our front-end structure in MongoDB,” said Mufti. “Thanks to MongoDB, we can ensure we have the highest quality of data.” Mufti also shared how the company is using MongoDB Atlas Search and MongoDB Atlas Vector Search to power innovative search and AI use cases.
Scaling social commerce with MongoDB Atlas
Five years after its launch, OrderOnline had grown to 40,000 users and was handling 1.5 million transactions each month. This fast growth led to challenges, particularly around managing data at scale and ensuring high success rates for sellers. Most of OrderOnline’s users drive orders from a wide range of sources, including social ads, landing pages, and storefronts. Many of OrderOnline’s orders are handled via WhatsApp through Click to WhatsApp Ads (CTWA). Initially, managing orders via platforms like WhatsApp was feasible.
However, as social commerce became more popular, the volume of orders increased, which quickly became overwhelming. Furthermore, for large sellers who do not handle their own products, OrderOnline had to manage order packing and shipping, as well as returns. “We were overwhelmed with orders, but we wanted to manage our SLAs,” said Mufti. “We wanted to ensure products were well-delivered.” MongoDB Atlas’s flexibility has enabled OrderOnline to manage unpredictable business processes and to efficiently handle the various complex tasks associated with order management and logistics. Because MongoDB Atlas is designed for fast iteration, it enables OrderOnline to swiftly adapt its platform in response to changing business needs and user demands. MongoDB Atlas also supports high scalability, empowering OrderOnline to manage a growing user base and increasing transaction volumes without compromising performance. Additionally, MongoDB Atlas's reliability under high transactional loads ensures that OrderOnline can maintain quick response times—a core part of its SLA. This is critical for maintaining the agility needed in the dynamic world of social commerce. “We have a monitoring system that triggers alarms if response times fall below one second,” noted Mufti. Another critical SLA that OrderOnline tracks is the delivery success rate. Previously, deliveries were only successful 94% of the time. Using MongoDB Atlas, OrderOnline built OExpress, a service that sellers can use to customize the number of delivery attempts based on specific service agreements, with a mandated cap of five delivery attempts. OExpress closely tracks delivery attempt data, ensuring packages are delivered and minimizing returns and damages. “Thanks to MongoDB, we have achieved a success rate of 98.4%,” said Mufti.
“We can manage multiple attempts to deliver to the customers, so sellers don’t have to worry about dealing with delivery issues anymore when using a marketplace.” Beyond deliveries, OrderOnline identified seamless search and customer support integrations as key operations that MongoDB could enhance.
AI and search: conversion rates jump by 56%
As OrderOnline’s business grew, scalability created specific challenges with CTWAs. In particular, OrderOnline’s platform struggled to manage and make sense of the growing volume of inconsistent data types it was receiving, such as locations, postal codes, and product details—accurate data input is vital to ensuring orders are processed and delivered. “People want [to be able to input] freeform text. They want things to be simple and easy, and not be restricted by rigid formats,” said Mufti. “But we still have to ensure data accuracy.” One of the standout features that helped OrderOnline improve search accuracy and management is MongoDB Atlas Search. Fuzzy search in MongoDB Atlas Search can handle typos when searching for districts. For example, if a user types “Surabya,” Atlas Search will still fetch results for “Surabaya.” Furthermore, synonyms in MongoDB Atlas Search can handle shortened names for provinces and districts in Indonesia—for example, “Jabar” for Jawa Barat or “Jateng” for Jawa Tengah. Acronyms are also handled. “Because there’s AI in the background, there’s no need to manually input zip codes, for example. Our engine can search for it,” said Mufti. “Someone clicks, then places an order, fills out the form, and it goes straight into our order management system, which supports fuzzy search.” As OrderOnline grew, it also needed to scale customer support with 24/7 availability and fast response times. MongoDB Atlas Vector Search supported the development of a seamless and user-friendly interface with the creation of an AI chatbot.
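The fuzzy and synonym behavior described here can be sketched as an Atlas Search `$search` stage. The index name, synonym mapping name, and field path below are illustrative assumptions; fuzzy matching and synonyms go in separate clauses, since the text operator applies one option at a time:

```python
def district_search_stage(user_input):
    """Build a $search stage that tolerates typos and expands
    regional abbreviations when matching district names."""
    return {
        "$search": {
            "index": "default",
            "compound": {
                "should": [
                    # Tolerate small typos, e.g. "Surabya" -> "Surabaya".
                    {"text": {"query": user_input, "path": "district",
                              "fuzzy": {"maxEdits": 1}}},
                    # Expand abbreviations via a synonym mapping defined
                    # on the index, e.g. "Jabar" -> "Jawa Barat".
                    {"text": {"query": user_input, "path": "district",
                              "synonyms": "region_synonyms"}},
                ]
            },
        }
    }

stage = district_search_stage("Surabya")
# Against a live cluster: db.addresses.aggregate([stage, {"$limit": 5}])
```

`maxEdits` controls how many character edits fuzzy matching tolerates (1 or 2), which is what lets a misspelled district still resolve to the right entry.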
This chatbot makes it easy for sellers to manage customer interactions, check stock availability, and calculate shipping costs. “If the ad contains a WhatsApp link, it will be directly managed by the chatbot. The chatbot even checks shipping costs, compares prices, and shows how much it would cost if you purchased five items,” explained Mufti. “The AI handles requests for photos, checks stock availability, and much more. And once a deal is closed, it goes directly into our order management system.”

Before the creation of the AI chatbot with MongoDB Atlas Vector Search, the WhatsApp conversion rate was 50%: out of 100 interactions, 50 would successfully close the deal. With the implementation of AI, this rate has increased to 78%.

Building on these successes, OrderOnline is now looking at further business and geographic expansion supported by MongoDB’s global reach, with the aim of helping more sellers throughout Indonesia make the most of social commerce.

Visit the MongoDB Atlas Learning Hub to boost your MongoDB skills. To learn more about MongoDB Atlas Search, visit our product page. Get started with Atlas Vector Search today through our quick start guide.

May 13, 2025

How MongoDB and Google Cloud Power the Future of In-Car Assistants

The automotive industry is evolving fast: electrification, the rise of autonomous driving, and advanced safety systems are reshaping vehicles from the inside out. But innovation isn’t just happening in the drivetrain. Drivers (and passengers) now expect more intelligent, intuitive, and personalized experiences whenever they get into a car.

That’s where things get tricky. While modern cars are packed with features, many of them are complex to use. Voice assistants were supposed to simplify things, but most still only handle basic tasks, like setting navigation or changing music. As consumers’ expectations of technology grow, so does pressure on automakers. Standing out in a competitive market, accelerating time to market, and managing rising development costs—all while delivering seamless digital experiences—is no small task.

The good news? Drivers are ready for something better. According to a SoundHound AI study, 79% of drivers in Europe would use voice assistants powered by generative AI. And 83% of those planning to buy a car in the next 12 months say they’d choose a model with AI features over one without. Gen AI is transforming voice assistants from simple command tools into dynamic copilots—able to answer questions, offer insights, and adapt to each user. At CES 2025, we saw major players like BMW, Honda, and HARMAN pushing the boundaries of AI-driven car assistants.

To truly make these experiences personalized, you need the right data infrastructure. Real-time signals from the car, user preferences, and access to unstructured content like manuals and FAQs are essential for building truly intelligent systems. By combining gen AI with powerful data infrastructure, we can create smarter, more responsive in-car assistants. With flexible, scalable data access and built-in vector search, MongoDB Atlas is an ideal solution. Together with partners like Google Cloud, MongoDB is helping automotive companies innovate faster and deliver better in-car experiences.
MongoDB as the data layer behind smarter assistants

Building intelligent in-car assistants isn't just about having cutting-edge AI models—it’s about what feeds them. A flexible, scalable data platform is the foundation. To deliver real-time insights, personalize interactions, and evolve with new vehicle features, automakers need a data layer that can keep up.

MongoDB gives developers the speed and simplicity they need to innovate. Its flexible document model lets teams store data the way applications use it—without rigid schemas or complex joins. That means faster development, fewer dependencies, and less architectural friction. Built-in capabilities like time series, full-text search, and real-time sync mean fewer moving parts and faster time to market. And because MongoDB Atlas is built for scale, availability, and security, automakers get the enterprise-grade reliability they need. Toyota Connected, for example, relies on MongoDB Atlas to power its Safety Connect platform across millions of vehicles, delivering real-time emergency support with 99.99% availability.

But what really sets MongoDB apart for gen AI use cases is the way it handles data. AI workloads thrive on diverse, often unstructured inputs—text, metadata, contextual signals, vector embeddings. MongoDB’s document model handles all of it, side by side, in a single, unified platform. That’s why companies like Cognigy use MongoDB to power leading conversational AI platforms that manage hundreds of queries per second across multiple channels and data types. With Atlas Vector Search, development teams in the automotive industry can bring semantic search to unstructured data like manuals, support docs, or historical interactions. And by keeping operational, metadata, and vector data together, MongoDB makes it easier to deploy and scale gen AI apps that go beyond analytics and actually transform in-car experiences.
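To make the "operational, metadata, and vector data together" point concrete, here is what a single document for one chunk of a car manual might look like, with descriptive fields and its embedding side by side (all field names and values are illustrative, and the embedding is truncated; real embeddings have hundreds of dimensions):

```python
# One document per manual chunk: metadata, searchable text, and the
# vector embedding live together, so no separate vector store is needed.
manual_chunk = {
    "_id": "manual-ch4-p12",                      # illustrative chunk id
    "vehicle_model": "Example EV 2025",           # operational metadata
    "section": "Dashboard warning lights",
    "text": "A red coolant warning light indicates the engine may be "
            "overheating. Pull over safely and let the engine cool.",
    "embedding": [0.12, -0.08, 0.33, 0.05],       # vector embedding (truncated)
}
```

Because the embedding sits next to the text and metadata, a vector search result already carries everything the assistant needs to answer, with no join against a second system.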
MongoDB is already widely adopted across the automotive industry, powering innovation from the factory floor to the finish line. With its ability to scale and adapt to complex, evolving needs, MongoDB is helping automakers accelerate digital transformation and deliver next-gen in-car experiences.

Architecture that drives intelligence at scale

To bring generative AI into the driver’s seat, we designed an architecture that shows how these systems can work together in the real world. At the core, we combined the power of MongoDB Atlas with Google Cloud’s AI capabilities to build a seamless, scalable solution. Google Cloud powers speech recognition and language understanding, while MongoDB provides the data layer with Atlas Database and Atlas Vector Search. MongoDB has also worked with PowerSync to keep vehicle data in sync across cloud and edge environments.

Imagine you're driving, and a red light pops up on your dashboard. You’re not sure what it means, so you ask the in-car assistant, “What is this red light on my dashboard?” The assistant transcribes your question, checks the real-time vehicle signals to identify the issue, and fetches relevant guidance from your car’s manual. It tells you what the warning means, whether it’s urgent, and what steps you should take. If it’s something that needs attention, it can suggest adding a service stop to your route. Or maybe switch your dashboard view to show more details. All of this happens through a natural voice interaction—no menus, no guesswork.

Figure 1. A gen AI in-car assistant in action.

Under the hood, this flow brings together several key technologies. Google Cloud’s Speech-to-Text and Text-to-Speech APIs handle the conversation. Document AI breaks the car manual into smaller, searchable chunks. Vertex AI generates text embeddings and powers the large language model. All of this connects to MongoDB Atlas, where Atlas Vector Search retrieves the most relevant content.
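The retrieval step of that flow can be sketched as an Atlas Vector Search aggregation stage. Assuming a vector index named "manual_vector_index" over an "embedding" field (both names are illustrative, not the reference architecture's actual configuration), the stage that fetches the most relevant manual chunks for an embedded question might look like:

```python
def manual_lookup_stage(query_embedding: list[float]) -> dict:
    """Build a $vectorSearch stage that returns the top manual chunks
    for an embedded driver question."""
    return {
        "$vectorSearch": {
            "index": "manual_vector_index",  # illustrative index name
            "path": "embedding",             # field holding the chunk embeddings
            "queryVector": query_embedding,  # e.g. from a Vertex AI embedding model
            "numCandidates": 100,            # ANN candidate pool to consider
            "limit": 5,                      # top chunks passed to the LLM as context
        }
    }
```

The five returned chunks would then be handed to the LLM as grounding context, alongside the live vehicle signals, to compose the spoken answer.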
Vehicle signals are kept up to date using PowerSync, which enables real-time, bidirectional data sync. And by using the Vehicle Signal Specification (VSS) from COVESA, we’re following a widely adopted standard that makes it easy to expand and integrate with more systems down the road.

Figure 2. Reference architecture overview.

This is just one example of how a flexible, future-ready architecture can unlock powerful, intuitive in-car experiences.

Reimagining the driver experience

Smarter in-car assistants start with smarter architectures. As generative AI becomes more capable, the real differentiator is how well it connects to the right data—securely, in real time, and at scale. With MongoDB Atlas, automakers can accelerate innovation, reduce architectural complexity, and cut development costs to deliver more intuitive, helpful experiences. It’s not just about adding features—it’s about making them work better together, so drivers get real value from the technology built into their cars.

Learn how to power end-to-end value chain optimization with AI/ML, advanced analytics, and real-time data processing for innovative automotive applications. Visit our manufacturing and automotive web page. Want to get hands-on experience? Explore our GitHub repository for an in-depth guide on implementing this solution.

May 13, 2025

Introducing Automated Risk Analysis in Relational Migrator

When planning a complex home renovation, homeowners often turn to a team of experts to evaluate the project. Architects sketch out designs, structural engineers assess a house’s structure and foundation, and contractors estimate renovation costs and timelines. This process can take weeks or even months before construction begins, consuming valuable time and precious resources.

The same is true for database migration projects. Solution architects planning a migration from legacy relational databases to modern platforms like MongoDB rely heavily on manual assessments by expert teams. These assessments—which involve analyzing database schemas to identify potential risks—can drain resources and delay progress.

That’s where the new Pre-Migration Analysis feature in MongoDB Relational Migrator comes in. First, it uses advanced algorithms to automate much of the pre-migration evaluation process by analyzing a database’s schema. It then provides a detailed, customized report that highlights inconsistencies, flags potential issues like incompatible data types, and recommends actionable steps to ensure a successful migration to MongoDB. The report is dynamic, allowing you to refine it by marking items as completed or triaged. By providing a clear roadmap, this feature empowers you to plan and execute migrations with confidence while saving time and minimizing risk.

Why use Pre-Migration Analysis?

There are a number of benefits associated with using the new Pre-Migration Analysis tool, from faster migrations to the convenience of reporting. Here are some of the areas Pre-Migration Analysis can help with:

Minimized disruption: Migration planning often feels overwhelming because it risks diverting focus from your core business operations. A precise, detailed analysis upfront helps you avoid unnecessary disruptions by providing recommendations for success, saving your team from spinning its wheels and wasting resources.
Accurate resource allocation: Without a proper assessment, it’s hard to gauge the time, skills, and budget needed for the migration. Automating the evaluation process gives you a head start and allows you to allocate resources more effectively.

Faster time to value: An automated assessment accelerates evaluation and decision-making, letting you kick-start the migration sooner and advance your application modernization initiative.

Reduced technical debt: Without a precise evaluation, migrations can introduce inefficiencies or unresolved issues. By analyzing your ecosystem upfront, you ensure a smooth transition with fewer hiccups and better long-term stability.

Stronger business case: Having a detailed, shareable assessment report in hand makes it easier to justify the migration to stakeholders, showing that the effort is well-planned, the risks are understood, and the potential ROI is worth the investment.

How Pre-Migration Analysis works

The Pre-Migration Analysis tool connects to your relational database and extracts the structure of tables, routines, and other components. It then applies automated rules to identify potential migration issues. These rules help flag areas that may need attention when transitioning to MongoDB. Each rule includes:

Category: The type of issue (such as incompatible data types).

Difficulty level: An estimate of the effort required to resolve the issue.

Required action: Guidance on which actions are necessary, optional, or unnecessary. These actions are categorized into “Tasks” (necessary actions), “Risks” (optional actions that may carry some risk), and “Notes” (how Relational Migrator will handle the migration).

Using these rules, the tool generates a detailed migration risk assessment report, complete with actionable recommendations to help ensure a successful migration.

Figure 1. Pre-migration impact analysis.

The impact analysis shows all objects that require action before performing the migration. Additionally, the tool provides a “traffic light” migration confidence level, indicating the overall readiness of the migration. The migration confidence level is based on the types of identified issues and their complexity, giving you a clear indication of how prepared you are for the migration.

Figure 2. Pre-migration analysis summary.

The pre-migration analysis summary gives an overview of your project’s compatibility and how many objects need attention before migrating.

You can learn more about how Pre-Migration Analysis works in our documentation. For a demo of Pre-Migration Analysis in action, check out this video from the MongoDB product team. Pre-Migration Analysis is now in Public Preview. Download Relational Migrator now to check it out!

May 12, 2025

People Who Ship: Building Centralized AI Tooling

Welcome to People Who Ship! In this new video and blog series, we'll be bringing you behind-the-scenes stories and hard-won insights from developers building and shipping production-grade AI applications using MongoDB. In each month's episode, your host—myself, Senior AI Developer Advocate at MongoDB—will chat with developers from both inside and outside MongoDB about their projects, tools, and lessons learned along the way. Are you a developer? Great! This is the place for you; People Who Ship is by developers, for developers. And if you're not (yet) a developer, that's great too! Stick around to learn how your favorite applications are built.

In this episode, John Ziegler, Engineering Lead on MongoDB's internal generative AI (gen AI) tooling team, shares technical decisions made and practical lessons learned while developing a centralized infrastructure called Central RAG (RAG = Retrieval-Augmented Generation), which enables teams at MongoDB to rapidly build RAG-based chatbots and copilots for diverse use cases.

John’s top three insights

During our conversation, John shared a number of insights learned during the Central RAG project. Here are the top three:

1. Enforce access controls across all operations

Maintaining data sensitivity and privacy is a key requirement when building enterprise-grade AI applications. This is especially important when curating data sources and building centralized infrastructure that teams and applications across the organization can use. In the context of Central RAG, for example, users should only be able to select or link data sources that they have access to as knowledge sources for their LLM applications. Even at query time, the LLM should only pull information that the querying user has access to as context to answer the user's query. Access controls are typically enforced by an authentication service using access control lists (ACLs) that define the relationships between users and resources.
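As an illustration of query-time enforcement (a deliberately simplified model; a real permissions service such as Credal's is far more sophisticated), retrieved chunks can be filtered against an ACL before they ever reach the LLM as context:

```python
# Toy ACL: user -> set of source ids the user is allowed to read.
# In practice this mapping would come from a permissions service.
ACL = {
    "alice": {"hr-handbook", "eng-wiki"},
    "bob": {"eng-wiki"},
}

def authorized_context(user: str, retrieved_chunks: list[dict]) -> list[dict]:
    """Keep only chunks whose source the querying user may read.

    Chunks from sources the user cannot access are dropped, so they
    never appear in the prompt sent to the LLM."""
    readable = ACL.get(user, set())
    return [chunk for chunk in retrieved_chunks if chunk["source"] in readable]
```

For instance, if retrieval returns chunks from both "hr-handbook" and "eng-wiki", a query from "bob" only passes the "eng-wiki" chunk onward, and an unknown user gets no context at all.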
In Central RAG, this is managed by Credal’s permissions service. You can check out this article that shows you how to build an authentication layer using Credal’s permissions service and other tools like OpenFGA.

2. Anchor your evaluations in the problem you are trying to solve

Evaluation is a critical aspect of shipping software, including LLM applications. It is not a one-and-done process—each time you change any component of the system, you need to ensure that the change does not adversely impact the system's performance. The evaluation metrics depend on your application's specific use cases. For Central RAG, which aims to help teams securely access relevant and up-to-date data sources for building LLM applications, the team incorporates the following checks as integration and end-to-end tests in its CI/CD pipeline:

Ensure access controls are enforced when adding data sources.

Ensure access controls are enforced when retrieving information from data sources.

Ensure that data retention policies are respected, so that removed data sources are no longer retrieved or referenced downstream.

Use LLM-as-a-judge to evaluate response quality across various use cases with a curated dataset of question-answer pairs.

If you would like to learn more about evaluating LLM applications, we have a detailed tutorial with code.

3. Educate your users on what’s possible and what’s not

User education is critical yet often overlooked when deploying software. This is especially true for this new generation of AI applications, where explaining best practices and setting clear expectations can prevent data security issues and user frustration. For Central RAG, teams must review the acceptable use policies, legal guidelines, and documentation on available data sources and appropriate use cases before gaining access to the platform.
These materials also highlight scenarios to avoid, such as connecting sensitive data sources, and provide guidance on prompting best practices to ensure users can effectively leverage the platform within its intended boundaries.

John’s AI tool recommendations

The backbone of Central RAG is a tool called Credal. Credal provides a platform for teams to quickly create AI applications on top of their data. As maintainers of Central RAG, John’s team uses Credal to create a curated list of data sources for teams to choose from and to manage applications created by different teams. Teams can choose from the curated list or connect custom data sources via connectors, select from an exhaustive list of large language models (LLMs), configure system prompts, and deploy their applications to platforms like Slack directly from the Credal UI or via its API.

Surprising and delighting users

Overall, John describes his team’s goal with Central RAG as “making it stunningly easy for teams to build RAG applications that surprise and delight people.” We see several organizations adopting this central RAG model both to democratize the development of AI applications and to reduce their teams' time to impact.

If you are working on similar problems and want to learn about how MongoDB can help, submit a request to speak with one of our specialists. If you would like to explore on your own, check out our self-paced AI Learning Hub and our gen AI examples GitHub repository.

May 12, 2025

Capgemini & MongoDB: Smarter AI and Data for Business

AI is reshaping the way enterprises operate, but one fundamental challenge still exists: Most applications were not built with AI in mind. Traditional enterprise systems are designed for transactions, not intelligent decision-making, making it difficult to integrate AI at scale. To bridge this gap, MongoDB and Capgemini are enabling businesses to modernize their infrastructure, unify data platforms, and power AI-driven applications. This blog explores the trends driving the AI revolution and the role that Capgemini and MongoDB play in powering AI solutions.

The Challenge: Outdated infrastructure is slowing AI innovation

In talking to many customers across industries, we have heard the following key challenges in adopting AI:

Data fragmentation: Organizations have long struggled with siloed data, where operational and analytical systems exist separately, making it difficult to unify data for AI-driven insights. In fact, according to the Workday Global survey, 59% of C-suite executives said their organizations' data is somewhat or completely siloed, which results in inefficiencies and lost opportunities. Moreover, AI workloads such as retrieval-augmented generation (RAG), semantic search, and recommendation engines require vector databases, yet most traditional data architectures fail to support these new AI-driven capabilities.

Lack of AI-ready data infrastructure: The lack of AI-ready data infrastructure forces developers to work with multiple disconnected systems, adding complexity to the development process. Instead of seamlessly integrating AI models, developers often have to manually sync data, join query results across multiple platforms, and ensure consistency between structured and unstructured data sources. This not only slows down AI adoption but also significantly increases the operational burden.
The solution: AI-ready data infrastructure with MongoDB and Capgemini

Together, MongoDB and Capgemini provide enterprises with the end-to-end capabilities needed to modernize their data infrastructure and harness AI's full potential. MongoDB provides a flexible document model that allows businesses to store and query structured, semi-structured, and unstructured data seamlessly, a critical need for AI-powered applications. Its vector search capabilities enable semantic search, recommendation engines, RAG, and anomaly detection, eliminating the need for complex data pipelines while reducing latency and operational overhead. Furthermore, MongoDB’s distributed and serverless architecture ensures scalability, allowing businesses to deploy real-time AI workloads like chatbots, intelligent search, and predictive analytics with the agility and efficiency needed to stay competitive.

Capgemini plays a crucial role in this transformation by leveraging AI-powered automation and migration frameworks to help enterprises restructure applications, optimize data workflows, and transition to AI-ready architectures built on MongoDB. Using generative AI, Capgemini enables organizations to analyze existing systems, define data migration scripts, and seamlessly integrate AI-driven capabilities into their operations.

Real-world use cases

Let's explore impactful real-world use cases where MongoDB and Capgemini have collaborated on cutting-edge AI projects.

AI-powered field operations for a global energy company: Workers in hazardous environments, such as oil rigs, previously had to complete complex 75-field forms, which slowed down operations and increased safety risks. To streamline this process, the company implemented a conversational AI interface, allowing workers to interact with the system using natural language instead of manual form-filling.
This AI-driven solution has been adopted by 120,000+ field workers, significantly reducing administrative workload, improving efficiency, and enhancing safety in high-risk conditions.

AI-assisted anomaly detection in the automotive industry: Manual vehicle inspections often led to delays in diagnostics and high maintenance costs, making it difficult to detect mechanical issues early. To address this, an automotive company implemented AI-powered engine sound analysis, which uses vector embeddings to identify anomalies and predict potential failures before they occur. This proactive approach has reduced breakdowns, optimized maintenance scheduling, and improved overall vehicle reliability, ensuring cost savings and enhanced operational efficiency.

Making insurance more efficient: GenYoda, an AI-driven solution developed by Capgemini, is revolutionizing the insurance industry by enhancing the efficiency of professionals through advanced data analysis. By harnessing the power of MongoDB Atlas Vector Search, GenYoda processes vast amounts of customer information, including policy statements, premiums, claims histories, and health records, to provide actionable insights. This comprehensive analysis enables insurance professionals to swiftly evaluate underwriters' reports, construct detailed health summaries, and optimize customer interactions, thereby improving contact center performance. Remarkably, GenYoda can ingest 100,000 documents within a few hours and deliver responses to user queries in just two to three seconds, matching the performance of leading AI models. The tangible benefits of this solution are evident; for instance, one insurer reported a 15% boost in productivity, a 25% acceleration in report generation—leading to faster decision-making—and a 10% reduction in manual efforts associated with PDF searches, culminating in enhanced operational efficiency.
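The engine-sound use case relies on a simple idea: embed a new sound and compare it to embeddings of known-healthy engines, and flag the sound as anomalous if it is not sufficiently similar to any of them. Here is a minimal sketch of that comparison, using toy two-dimensional embeddings and an illustrative similarity threshold (a production system would use high-dimensional audio embeddings and a tuned threshold, typically served through a vector search index):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def is_anomalous(sound_emb: list[float],
                 healthy_embs: list[list[float]],
                 threshold: float = 0.8) -> bool:
    """Flag the sound if it is not close to any known-healthy embedding."""
    best = max(cosine_similarity(sound_emb, h) for h in healthy_embs)
    return best < threshold
```

A sound whose best match against the healthy set scores below the threshold gets flagged for maintenance review before a failure occurs.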
Conclusion

As AI becomes operational, real-time, and mission-critical for enterprises, businesses must modernize their data infrastructure and integrate AI-driven capabilities into their core applications. With MongoDB and Capgemini, enterprises can move beyond legacy limitations, unify their data, and power the next generation of AI applications.

For more, watch this TechCrunch Disrupt session by Steve Jones (EVP, Data-Driven Business & Gen AI at Capgemini) and Will Shulman (former VP of Product at MongoDB) to learn about more real-world use cases. And discover how Capgemini and MongoDB are driving innovation with AI and data solutions.

May 8, 2025