MongoDB Blog
Announcements, updates, news, and more
Binary Quantization & Rescoring: 96% Less Memory, Faster Search
We are excited to share that several new vector quantization capabilities are now available in public preview in MongoDB Atlas Vector Search: support for binary quantized vector ingestion, automatic scalar quantization, and automatic binary quantization and rescoring. Together with our recently released support for scalar quantized vector ingestion, these capabilities will empower developers to scale semantic search and generative AI applications more cost-effectively. For a primer on vector quantization, check out our previous blog post.
Enhanced developer experience with native quantization in Atlas Vector Search
Effective quantization methods, specifically scalar and binary quantization, can now be applied automatically in Atlas Vector Search. This makes it easier and more cost-effective for developers to use Atlas Vector Search to unlock a wide range of applications, particularly those requiring over a million vectors. With the new “quantization” index definition parameters, developers can choose to use full-fidelity vectors by specifying “none,” or they can quantize vector embeddings by specifying the desired quantization type, “scalar” or “binary” (Figure 1). This native quantization capability supports vector embeddings from any model provider as well as MongoDB’s BinData float32 vector subtype.
Figure 1: New index definition parameters for specifying automatic quantization type in Atlas Vector Search
Scalar quantization, which converts each floating-point value into an integer, is generally used when it's crucial to maintain search accuracy on par with full-precision vectors. Meanwhile, binary quantization, which converts each floating-point value into a single bit (0 or 1), is more suitable for scenarios where storage and memory efficiency are paramount and a slight reduction in search accuracy is acceptable. If you’re interested in learning more about this process, check out our documentation.
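As a rough illustration of the ideas above, the following Python sketch builds a vector index definition that sets the “quantization” parameter and shows toy versions of scalar and binary quantization. The field path `embedding`, the 1536 dimensions, and the helper names are hypothetical, and the per-vector min/max scaling and zero threshold are simplifying assumptions rather than a description of how Atlas quantizes internally.

```python
from typing import Dict, List

# Quantization values named in the announcement.
ALLOWED_QUANTIZATION = ("none", "scalar", "binary")


def vector_index_definition(path: str, num_dimensions: int,
                            similarity: str = "cosine",
                            quantization: str = "none") -> Dict:
    """Build a vector index definition with a quantization setting."""
    if quantization not in ALLOWED_QUANTIZATION:
        raise ValueError(f"quantization must be one of {ALLOWED_QUANTIZATION}")
    return {
        "fields": [{
            "type": "vector",
            "path": path,                    # hypothetical field name
            "numDimensions": num_dimensions,
            "similarity": similarity,
            "quantization": quantization,
        }]
    }


def scalar_quantize(vector: List[float]) -> List[int]:
    """Toy scalar quantization: map each float onto an integer in 0..255.

    Assumption: the range is taken from this one vector for simplicity.
    """
    lo, hi = min(vector), max(vector)
    span = (hi - lo) or 1.0  # avoid division by zero for constant vectors
    return [round((x - lo) / span * 255) for x in vector]


def binary_quantize(vector: List[float], threshold: float = 0.0) -> List[int]:
    """Toy binary quantization: keep one bit per dimension."""
    return [1 if x > threshold else 0 for x in vector]


if __name__ == "__main__":
    embedding = [-1.0, -0.5, 0.0, 0.5, 1.0]
    print(vector_index_definition("embedding", 1536, quantization="binary"))
    print(scalar_quantize(embedding))
    print(binary_quantize(embedding))
```

The dictionary returned by `vector_index_definition` has the shape of an index definition and could be supplied wherever a vector search index is created; the two quantizers exist only to make the float-to-integer and float-to-bit conversions concrete.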
Binary quantization with rescoring: Balance cost and accuracy
Compared to scalar quantization, binary quantization further reduces memory usage, leading to lower costs and improved scalability, but it also reduces search accuracy. To mitigate this, when “binary” is chosen in the “quantization” index parameter, Atlas Vector Search adds an automatic rescoring step. Rescoring re-ranks a subset of the top binary vector search results using their full-precision counterparts, so the final search results remain highly accurate despite the initial vector compression. Empirical evidence shows that adding a rescoring step when working with binary quantized vectors can dramatically improve search accuracy, as shown in Figure 2 below.
Figure 2: Combining binary quantization and rescoring helps retain up to 95% of search accuracy
And as Figure 3 shows, in our tests, binary quantization reduced the memory required for processing by 96% while retaining up to 95% of search accuracy and improving query performance.
Figure 3: Improvements in Atlas Vector Search with the use of vector quantization
It’s worth noting that even though the quantized vectors are used for indexing and search, their full-fidelity vectors are still stored on disk to support rescoring. Retaining the full-fidelity vectors also lets developers perform exact vector search for experimental, high-precision use cases, such as evaluating the search accuracy of quantized vectors produced by different embedding model providers. For more on evaluating the accuracy of quantized vectors, please see our documentation.
So how can developers make the most of vector quantization? Here are some example use cases that can be made more efficient and scaled effectively with quantized vectors:
- Massive knowledge bases can be used efficiently and cost-effectively for analysis and insight-oriented use cases, such as content summarization and sentiment analysis. Unstructured data like customer reviews, articles, audio, and videos can be processed and analyzed at a much larger scale, at a lower cost and faster speed.
- Using quantized vectors can enhance the performance of retrieval-augmented generation (RAG) applications. The efficient processing can support query performance from large knowledge bases, and the cost-effectiveness advantage can enable a more scalable, robust RAG system, which can result in better customer and employee experience.
- Developers can easily A/B test different embedding models using multiple vectors produced from the same source field during prototyping. MongoDB’s flexible document model lets developers quickly deploy and compare embedding models’ results without the need to rebuild the index or provision an entirely new data model or set of infrastructure.
- The relevance of search results or context for large language models (LLMs) can be improved by incorporating larger volumes of vectors from multiple sources of relevance, such as different source fields (product descriptions, product images, etc.) embedded within the same or different models.
To get started with vector quantization in Atlas Vector Search, see the following developer resources:
- Documentation: Vector Quantization in Atlas Vector Search
- Documentation: How to Measure the Accuracy of Your Query Results
- Tutorial: How to Use Cohere's Quantized Vectors to Build Cost-effective AI Apps With MongoDB
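The two-stage search described in the rescoring section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Atlas implementation: the zero threshold, Hamming-distance candidate search, dot-product rescoring, and the `oversample` factor are all choices made for the example. The last helper shows the back-of-the-envelope arithmetic for raw vector storage only (1 bit instead of a 32-bit float per dimension), which is in line with the roughly 96% memory reduction reported from testing.

```python
from typing import List, Sequence


def to_bits(vector: Sequence[float]) -> List[int]:
    """Binary-quantize a vector: one bit per dimension (assumed threshold 0)."""
    return [1 if x > 0 else 0 for x in vector]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Full-precision similarity used for rescoring."""
    return sum(x * y for x, y in zip(a, b))


def search_with_rescoring(query: Sequence[float],
                          vectors: Sequence[Sequence[float]],
                          k: int, oversample: int = 4) -> List[int]:
    """Return the indices of the top-k vectors after rescoring."""
    q_bits = to_bits(query)
    # Stage 1: cheap candidate selection on the binary codes.
    # (A real index would store the codes instead of recomputing them.)
    candidates = sorted(
        range(len(vectors)),
        key=lambda i: hamming(q_bits, to_bits(vectors[i])),
    )[: k * oversample]
    # Stage 2: re-rank only the candidates with full-precision vectors.
    rescored = sorted(candidates, key=lambda i: dot(query, vectors[i]),
                      reverse=True)
    return rescored[:k]


def binary_memory_saving(bits_per_float: int = 32) -> float:
    """Fraction of raw vector storage saved by keeping 1 bit per dimension."""
    return 1 - 1 / bits_per_float


if __name__ == "__main__":
    docs = [[1.0, 0.1], [0.9, 0.9], [0.1, 1.0], [-1.0, -1.0]]
    print(search_with_rescoring([1.0, 0.2], docs, k=1, oversample=2))
    print(f"{binary_memory_saving():.3%}")
```

The binary stage cannot tell the first three documents apart because they share the same bit pattern; the full-precision stage is what orders them, which is the reason the full-fidelity vectors are kept.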
IntellectAI Unleashes AI at Scale With MongoDB
IntellectAI, a business unit of Intellect Design Arena, is a trailblazer in AI. Since 2019 the company has been using MongoDB to drive a number of innovative use cases in the banking, financial services, and insurance (BFSI) industry. For example, Intellect Design Arena’s broader insurance business has been using MongoDB Atlas as a foundation for its architecture. Atlas’s flexibility enables Intellect Design Arena to manage varied and constantly evolving datasets and increase operational performance. Building on this experience, the company looked at deepening its use of MongoDB Atlas’s unique AI and search capabilities for its new IntellectAI division. IntellectAI Partner and Chief Technology Officer Deepak Dastrala spoke on the MongoDB.local Mumbai stage in September 2024. Dastrala shared how the company has built a powerful, scalable, and highly accurate AI platform-as-a-service offering, Purple Fabric, using MongoDB Atlas and Atlas Vector Search.
Using AI to generate actionable compliance insights for clients
Purple Fabric helps transform enterprise data into actionable AI insights and solutions by making data ready for retrieval-augmented generation (RAG). The platform collects and analyzes structured and unstructured enterprise data, policies, market data, regulatory information, and tacit knowledge to enable its AI Expert Agent System to achieve precise, goal-driven outcomes with accuracy and speed. A significant part of IntellectAI’s work involves assessing environmental, social, and governance (ESG) compliance. This requires companies to monitor diverse nonfinancial factors such as child labor practices, supply chain ethics, and biodiversity. “Historically, 80% to 85% of AI projects fail because people are still worried about the quality of the data. With Generative AI, which is often unstructured, this concern becomes even more significant,” said Deepak Dastrala.
According to Deepak Dastrala, the challenge today is less about building AI tools than about operationalizing AI effectively. A prime example of this is IntellectAI’s work with one of the largest sovereign wealth funds in the world, which manages over $1.5 trillion across 9,000 companies. The fund sought to utilize AI for making responsible investment decisions based on millions of unique data points across those companies, including compliance, risk prediction, and impact assessment. This included processing both structured and unstructured data to enable the fund to make informed, real-time decisions. “We had to process almost 10 million documents in more than 30 different data formats—text and image—and correlate both structured and unstructured data to provide those particular hard-to-find insights,” said Dastrala. “We ingested hundreds of millions of vectors across these documents, and this is where we truly understood the power of MongoDB.” For example, by leveraging MongoDB's capabilities, including time series collections, IntellectAI simplifies the processing of unstructured and semi-structured data from companies' reports over various years, extracting key performance metrics and trends to enhance compliance insights. “MongoDB Atlas and Vector Search give us flexibility around the schema and how we can turn particular data into knowledge,” Dastrala said. For Dastrala, there are four unique advantages of working with MongoDB—particularly using MongoDB Atlas Vector Search—that other companies should consider when building long-term AI strategies: a unified data model, multimodality, dynamic data linking, and simplicity. “For me, the unified data model is a really big thing because a stand-alone vector database will not help you. The kind of data that you will continue to ingest will increase, and there are no limits. So whatever choices that you make, you need to make the choices from the long-term perspective,” said Dastrala. 
Delivering massive scale, driving more than 90% AI accuracy, and accelerating decision-making with MongoDB
Before IntellectAI built this ESG capability, its client relied on subject matter experts, but they could examine only a limited number of companies and datasets and were unable to scale their investigation of portfolios or information. “If you want to do it at scale, you need proper enterprise support, and that’s where MongoDB became really handy for us. We are able to give 100% coverage and do what the ESG analysts were able to do for this organization almost a thousand times faster,” said Dastrala. Previously, analysts could examine only between 100 and 150 companies. With MongoDB Atlas and Atlas Vector Search, Purple Fabric can now process information from over 8,000 companies across the world, covering different languages and delivering more than 90% accuracy. “Generally, RAG will probably give you 80% to 85% accuracy. But in our case, we are talking about a fund deciding whether to invest billions or not in a company, so the accuracy should be 90% minimum,” said Dastrala. “What we are doing is not ‘simple search’; it is very contextual, and MongoDB helps us provide that high-dimension data.” Concluding his presentation on the MongoDB.local stage, Dastrala reminded the audience why IntellectAI is using MongoDB’s unique capabilities to support its long-term vision: “Multimodality is very important because today we are using text and images, but tomorrow we might use audio, video, and more. And don’t forget, from a developer perspective, how important it is to keep the simplicity and leverage all the options that MongoDB provides.” This is just the beginning for IntellectAI and its Purple Fabric platform. “Because we are doing more and more with greater accuracy, our customers have started giving us more problems to solve. And this is absolutely happening at a scale [that] is unprecedented,” said Dastrala.
Using MongoDB Atlas to drive broader business benefits across Intellect Design
The success of the Purple Fabric platform is leading Intellect Design’s broader business to look at MongoDB Atlas for more use cases. Intellect Design is currently migrating more of its insurance and wealth platforms onto MongoDB Atlas, as well as leveraging the product family to support the next phase of its app modernization strategy. Using MongoDB Atlas, Intellect Design aims to improve resilience, support scalable growth, decrease time to market, and enhance data insights. Head over to our product page to learn more about MongoDB Atlas. To learn more about how MongoDB Atlas Vector Search can help you build or deepen your AI and search capabilities, visit our Vector Search page.
Away From the Keyboard: Everton Agner, Staff Software Engineer
We’re back with a new article in our ongoing “Away From the Keyboard” series, featuring in-depth interviews with people at MongoDB, discussing what they do, how they prioritize time away from their work, and how they approach coding. Everton Agner, Staff Software Engineer at MongoDB, talked to us about why team support, transparent communication, and having small rituals are important for creating healthy work-life boundaries.
Q: What do you do at MongoDB?
Ev: I’m a Staff Software Engineer on the Atlas Foundational Services team. In practice, that means that I develop systems, tools, frameworks, and processes, and provide guidance within our systems architecture to other engineering teams so they can deliver value and make their customers happy!
Q: What does work-life balance look like for you?
Ev: My team is hybrid and distributed. I enjoy going to our office a couple of times every week (but don’t have to), and all of our team processes are built with remote friendliness in mind, which is very helpful. Occasionally, I go on call for a week, and make sure that my laptop is reachable in case something happens and it needs my attention. On my team, when there’s an on-call shift during a particular day or weekend that is really inconvenient, we are very supportive, and usually someone is able to swap rotations.
Q: How do you ensure you set boundaries between work and personal life?
Ev: It’s very easy to fall into the trap of never really disconnecting, thinking about work, or really just working, all day when it’s just an open laptop away. As a rule of thumb, I tell myself that I only ever spend time outside of business hours doing anything work-related when I am not asked or expected to do so by anyone. When I do it, it’s because I want to and will likely have some fun! On the other hand, I’m very transparent when it comes to my personal life and responsibilities, as well as any work adjustments that are needed. Transparency is key, and I’m very lucky that all my managers at MongoDB have always been very accommodating.
Q: Has work/life balance always been a priority for you, or did you develop it later in your career?
Ev: It always was, but I struggled a bit during my first experience working from home in a hybrid model. Over time, I realized that the small rituals I did on the days I commuted to the office, like getting ready in the morning and driving back home after work, were essential for me “flipping the switch” in and out of work mode. Developing new rituals when I worked from home, like making sure I had breakfast, took care of my pets, or exercised after work, was essential for me to truly disconnect when I closed my laptop. Otherwise I would struggle to enjoy my personal time during the evening or would think about work right after waking up in the morning.
Q: What benefits has this balance given you in your career?
Ev: I feel like both my personal and professional lives benefited from that. On the personal side, it’s really nice to know that my work schedule accommodates me not being a big morning person, and that it can accommodate personal appointments that overlap with business hours, like language classes (I’m learning Japanese currently!). On the professional side, sometimes I personally find it productive to spend some time during off-hours to research, write experimental code or documents, or just get ready for the next day while everything’s quiet.
Q: What advice would you give to someone seeking to find a better balance?
Ev: For me, work-life balance means being able to fully dedicate myself to my personal life without affecting success at my job and vice-versa. Most importantly, make sure that it’s sustainable and not detrimental to your health. On a more practical note, if you have access to work emails or communication channels on your phone, learning how to set up meaningful notifications is critical. If your phone notifies you of anything work-related outside of working hours, it needs to be important and actionable!
Thank you to Everton Agner for sharing their insights! And thanks to all of you for reading. For past articles in this series, check out our interviews with:
- Senior AI Developer Advocate Apoorva Joshi
- Developer Advocate Anaiya Raisinghani
- Senior Partner Marketing Manager Rafa Liou
Interested in learning more about or connecting more with MongoDB? Join our MongoDB Community to meet other community members, hear about inspiring topics, and receive the latest MongoDB news and events. And let us know if you have any questions for our future guests when it comes to building a better work-life balance as developers. Tag us on social media: @/mongodb #LoveYourDevelopers #AwayFromTheKeyboard
Atlas Stream Processing Now Supports Azure and Azure Private Link
Today, we’re excited to announce that Atlas Stream Processing now supports Microsoft Azure! This update opens new possibilities for developers leveraging Azure’s cloud ecosystem, offering a way to:
- Seamlessly integrate MongoDB Atlas and Apache Kafka
- Effortlessly handle complex and rapidly changing data structures
- Use the familiarity of the MongoDB Query API for processing streaming data
- Benefit from a fully managed service that eliminates operational overhead
Azure support in four regions
At launch, we’re supporting four Azure regions spanning both the U.S. and Europe:
Azure Region    Location
US East         Virginia, US
US East 2       Virginia, US
US West         California, US
West Europe     Netherlands
We’ll continue adding more regions across cloud providers in the future. Let us know which regions you need next in UserVoice.
Atlas Stream Processing simplifies integrating MongoDB with Apache Kafka to build event-driven applications. New to Atlas Stream Processing? Watch our 3-minute explainer.
How it works
Working with Atlas Stream Processing on Azure will feel just like it does today on AWS. During the Stream Processing Instance (SPI) tier selection in the Atlas UI or CLI, simply select Azure as your provider and then choose your desired region.
Figure 1: Stream Processing instance setup via Atlas UI
$ atlas streams instances create AzureSPI --provider AZURE --region westus --tier SP10
Figure 2: Stream Processing instance setup via the Atlas CLI
Secure networking for Azure Event Hubs via Azure Private Link
In addition to adding support for Azure in multiple regions, we’re introducing Azure Private Link support for developers using Azure Event Hubs. Event Hubs is Azure’s native, Kafka-compatible data streaming service. As a reminder, Atlas Stream Processing supports any service that uses the Kafka Wire Protocol. That includes Azure Event Hubs, AWS Managed Service for Kafka (MSK), Redpanda, and Confluent Cloud.
As we have written before, security is critical for data services, and it’s especially important with stream processing systems, which must connect to technologies like Apache Kafka that are external to a database like MongoDB Atlas. For this reason, we’re engineering Atlas Stream Processing to leverage the advanced networking capabilities available through the major cloud providers (AWS, Azure, and GCP).
Networking
To better understand the value of support for private link, let’s summarize the three key ways that developers typically connect between data services:
- Public networking
- Private networking through VPC peering
- Private networking through private link
Public networking connects services using public IP addresses. It is the easiest approach to set up, but it is also the least secure of the three.
Private networking through VPC peering connects services across two virtual private clouds (VPCs). This improves security compared with public networking by keeping traffic off the public internet and is commonly used for testing and development purposes.
Private networking through private link is even more secure because it restricts connections to specific endpoints. While VPC peering lets resources from one VPC connect to all of the resources in the other VPC, private link ensures that each specific resource can only connect to defined services with specific associated endpoints. This connection method is important for use cases relying on sensitive data.
Figure 3: Private Link allows for connecting to specific endpoints
Ready to get started?
With support for Azure Private Link, Atlas Stream Processing now makes it simple to implement the most secure method for networking across MongoDB and Kafka on Azure Event Hubs. Log in today to get started, or check out our documentation to create your first private link connection.
Goodnotes Finds Marketplace Success Using MongoDB Atlas
In the fast-paced world of app development, creating a feature-rich digital marketplace that scales effectively can be challenging. Goodnotes was founded in 2010 with the aim of replacing traditional paper notebooks with a digital alternative that reimagines the note-taking experience. Since then, the app has gone through several iterations and grown in popularity, now with more than 24 million monthly active users and 2.5 billion notes. The team behind Goodnotes spoke at MongoDB.local Hong Kong in September 2024. They shared their journey of using MongoDB Atlas and MongoDB Atlas Search to build and run a comprehensive marketplace that expands the company’s offerings, catering to its growing number of content creators. “At the beginning of 2023, we launched a pop-up shop, which was a very simple version of the marketplace, to test the water, and we realized it got really popular,” said Xing Dai, Principal Backend Engineer at Goodnotes. The full Goodnotes Marketplace launched in August 2024 as a space where content creators can enhance their note-taking experience by purchasing additional content, such as planners, stickers, and textbooks.
Building a robust digital marketplace with MongoDB Atlas
“The first and the most difficult challenge [was] that we are a multiplatform app, and if you want to launch on multiple platforms, you need to support different app stores as well as [the] web,” said Dai. Using MongoDB Atlas, Dai’s team created a fully configurable marketplace that would be accessible on different mobile and desktop platforms and the web. The initial pop-up shop’s infrastructure consisted of a Payload content management system connected to a MongoDB database. However, building a full-fledged marketplace was more challenging. The architecture needed to be scalable and include search, ordering, and customization capabilities.
“With [MongoDB] Atlas, it was really easy to add the in-app purchase and build the subscription infrastructure to manage the purchase workflow,” said Dai. The Goodnotes team introduced NestJS, a JavaScript API framework, to build client APIs. It then developed a user-friendly portal for the operations team and for creators who want to upload new products. Finally, the team built a full event-based data pipeline on top of MongoDB. “What’s nice is that everything on the marketplace is actually configurable in the backend,” said Dai. “We don’t need to do anything other than configuring what we need to store in the database, and the iOS client will connect it to the backend.” “When we want to extend the marketplace to other platforms, nothing needs to be changed,” Dai added. “We only need to configure different shops for different platforms.” This means that Goodnotes can easily make its marketplace available on different app platforms, such as Apple and Android, and on the web.
Adding searches, charts, and soon AI
As Goodnotes added more products to its marketplace, users had difficulty finding what they wanted. Despite having limited resources, the Goodnotes team endeavored to build a comprehensive search function. Using MongoDB Atlas Search and MongoDB Atlas Triggers, the team built a function that generates a search view collection from products and attributes, combining them into one collection. The team then added an Atlas Search index for the search field, with an API exposing the search. “We also added an auto-complete function, which is very similar to search, in the sense that we just had to create a function to generate aggregated collections, trigger it using [MongoDB] Atlas Triggers, and add the index and expose it in the marketplace,” said Dai. The search function is now popular among marketplace users, making it quick and easy for them to find what they are looking for. Goodnotes also regularly uses MongoDB Atlas Charts.
For example, it creates charts showing how many products there are in the system over time. One of the key next steps for Goodnotes involves using generative AI to translate product descriptions and content into different languages (the app is currently available in 11 languages). The team also wants to introduce personalized recommendations for a more tailored experience for each user. Ending the MongoDB.local presentation, Dai said: “MongoDB made it very fast and easy to build the whole marketplace and our search feature on top of the database using [MongoDB] Atlas Search. The solution scales, and so far we haven’t had any performance issues.” Visit our product page to learn more about MongoDB Atlas.
Managing Data Corruption in the Cloud
Key Takeaways
- Silent data corruption (rare events in which data becomes corrupt in a way that is not readily detected) can impact systems of all kinds across the software industry.
- MongoDB Atlas, our global cloud database service, operates at petabyte scale, which requires sophisticated solutions to manage the risk of silent data corruption.
- Because MongoDB Atlas itself relies on cloud services, the systems we have engineered to safeguard customer data have to account for limited access to the physical hardware behind our infrastructure.
- The systems we have implemented consist of software-level techniques for proactively detecting and repairing instances of silent data corruption.
- These systems include monitoring for checksum failures and similar runtime evidence of corrupt data, methods of identifying corrupt documents by leveraging MongoDB indexes and replication, and processes for repairing corrupt data by utilizing the redundant replicas.
Introduction: Hardware corruption in the cloud
As a cloud platform, MongoDB Atlas promises to free its customers from the work of managing hardware. In our time developing Atlas, however, some hardware problems have been challenging to abstract away. One of the most notable of these is silent data corruption. No matter how well we design our software, in rare cases hardware can silently fail in ways that compromise data. Imagine, for example, a distance sensor that detects an obstacle 10 meters away. Even if the many layers of software handling the sensor’s data function flawlessly (application, network, database, operating system, etc.), if an unlucky memory chip has a single bit flipped by radiation, the value of this measurement could mutate to something like 26 [1]. The consequences of this botched measurement would depend on how this data is used: in some cases it may introduce a blip in a vast amount of research data [2], but in the wrong system it could be dangerous.
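To make the sensor example concrete, here is a small Python sketch of a single bit flip. It assumes the reading is stored as a plain binary integer, which is an assumption made for illustration: 10 is 01010 in binary, and inverting the bit worth 16 gives 11010, which is 26.

```python
def flip_bit(value: int, bit: int) -> int:
    """Return value with the given bit position inverted."""
    return value ^ (1 << bit)


reading = 10                      # 0b01010: obstacle 10 meters away
corrupted = flip_bit(reading, 4)  # bit 4 is worth 16, giving 0b11010

print(format(reading, "05b"), "->", format(corrupted, "05b"))
print(reading, "->", corrupted)   # prints "10 -> 26"
```

Nothing in the software stack failed here; the stored value simply changed, which is why this kind of corruption is silent unless something independently checks the data.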
Despite the rarity of events like this, global cloud systems like MongoDB Atlas operate at such a scale that these events become statistically inevitable over time, even in the presence of existing manufacturer defect screening. Our platform currently stores petabytes of data and operates nearly half a million virtual machines in cloud datacenters in dozens of countries; even random failures with odds as low as one in a hundred thousand become likely to appear in such an environment. Complicating this further is the reality that silent data corruption has many possible causes beyond rare, random failures like the example of radiation above. Recent research has identified notable rates of data corruption originating from defective CPUs in large-scale data centers, 3 and corruption can just as easily originate from bugs in operating systems or other software. Considering this scope, and with limited levels of access to the cloud hardware we use to power MongoDB Atlas, how can we best stay ahead of the inevitability of silent data corruption affecting customers on our platform? Our response to this problem has been to implement features both in the MongoDB database and the orchestration software that powers MongoDB Atlas for continuously detecting and repairing silent data corruption. Our systems are designed to be proactive, identifying potential issues before they ever affect customer data or operations. The way we use these systems can be described in three steps. First, Atlas proactively monitors for signals of corrupt data from across our fleet of databases by checking for certain logical truths about data in the course of runtime operations. Then, in the case that evidence of corruption is identified, we utilize features in MongoDB for scanning databases to pinpoint the location of corrupt data, narrowing down the site of corruption to specific documents. 
Finally, once we have enough information to identify a remediation plan, we repair corruption in coordination with our customers by leveraging the redundancy of MongoDB’s replica set deployment model. As a whole, this approach gives us early visibility into new types of data corruption that emerge in our fleet, as well as the tools we need to pinpoint and repair corruption when it occurs.
Proactively monitoring for evidence of corruption
Fortunately for anyone interested in managing silent data corruption at scale, databases tend to tell you a lot about what they observe. In the course of a single day, the hundreds of thousands of database processes in the Atlas fleet generate terabytes of database logs describing their operations: connections received from clients, the details of startup and shutdown procedures, queries that are performing poorly. At their worst, logs at this scale can be expensive to store and difficult to parse, but at their best they are an indispensable diagnostic tool. As such, the first step of our strategy for managing corruption in Atlas is strategic log management. There are several points in the course of a MongoDB database’s operations where we can proactively validate logical assumptions about the state of data and emit messages if something appears to be corrupt. The most fundamental form of this validation we perform is checksum validation. A checksum is a very small piece of information that is deterministically generated from a much larger piece of information by passing it through a mathematical function. When the storage engine for an Atlas cluster writes data to disk, each block (an individual unit of data written) is accompanied by a checksum of the data. When that block is later read from disk, the storage engine once again passes the data through the checksum function and verifies that the output matches the checksum that was originally stored.
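The write-then-verify cycle can be sketched in Python as follows. This is an illustrative sketch only: it uses `zlib.crc32` from the standard library as a stand-in checksum function, and the 4-byte header layout and the `ChecksumMismatch` exception are assumptions made for the example, not the storage engine's actual block format or error handling.

```python
import zlib


class ChecksumMismatch(Exception):
    """Raised when a block's stored checksum does not match its contents."""


def write_block(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte checksum, as if writing to disk."""
    checksum = zlib.crc32(payload)
    return checksum.to_bytes(4, "big") + payload


def read_block(block: bytes) -> bytes:
    """Recompute the checksum on read and compare it with the stored one."""
    stored = int.from_bytes(block[:4], "big")
    payload = block[4:]
    actual = zlib.crc32(payload)
    if actual != stored:
        raise ChecksumMismatch(
            f"stored checksum {stored:#010x} != computed {actual:#010x}"
        )
    return payload


if __name__ == "__main__":
    block = write_block(b"distance=10")
    print(read_block(block))      # intact block reads back normally

    damaged = bytearray(block)
    damaged[-1] ^= 0x10           # flip one bit in the stored payload
    try:
        read_block(bytes(damaged))
    except ChecksumMismatch as exc:
        print("corruption detected:", exc)
```

A checksum of this kind only says that a block no longer matches what was written; it does not say what the original bytes were, which is why repair has to draw on a redundant copy of the data.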
If there is a mismatch, we have a strong reason to suspect that the stored information is corrupt; the database process then emits a log line indicating this and halts further execution. You can see this behavior in the MongoDB source code here.
Figure 1: Checksum validation fails after corruption is introduced on the disk level.
In addition to checksums, there are other opportunities for checking basic assumptions about the data we are handling during routine MongoDB operations. For example, when iterating through a list of values that is expected to be in ascending order, if we find that a given value is actually less than the one that preceded it, we can also reasonably suspect that a piece of information is corrupt. Similar forms of validation exist in dozens of places in the MongoDB database. Successfully leveraging these types of runtime errors in the context of the entire Atlas fleet, however, comes with some challenges. We need to quickly detect these messages from among the flood of logs generated by the Atlas fleet, and, importantly, do so in a way that maintains data isolation for the many individual customer applications running on Atlas. Database logs, by design, reveal a lot about what is happening in a system; as the creators of a managed database service, exposing the full contents of database logs to our employees for corruption analysis is a non-starter, and so we need a more nuanced method of detecting these signals. To solve this problem, we implemented a system that detects certain classes of error as Atlas database logs are generated and emits high-level metadata that our teams can analyze internally without revealing sensitive information about the content or operations of a given database. To describe this system, it is first useful to understand a pair of concepts that we often reference internally, and which play an important role in the development of the Atlas platform: the data plane and the control plane.
The data plane describes the systems that manage the data that resides in a given Atlas cluster. It consists of the virtual network containing the cluster’s resources, the virtual machines and related resources hosting the cluster’s database processes, and storage for diagnostic information like database logs. As a whole, it consists of many thousands of individual private networks and virtual machines backing the Atlas fleet of database clusters. The control plane, on the other hand, is the environment in which the Atlas management application runs. It consists of separate networks hosting our own application processes and backing databases, and stores all of the operational metadata required for running Atlas including, for example, metadata about the configurations of the clusters that constitute the Atlas fleet. Figure 2: An Agent observes a log line indicative of data corruption and communicates this to the Atlas Control Plane. The flow of information between the two planes only occurs on a limited set of vectors, primary among these being the MongoDB Agent, a background process that runs locally on virtual machines in the Atlas data plane. The Agent serves as the primary orchestrator of a cluster’s database processes. Whenever an Atlas customer requests a change to their cluster–for example, upgrading the version of their database–their request results in some modification to metadata that resides in the control plane, which is then picked up by the Agents in the data plane. The Agents then begin to interact with the individual database processes of the cluster to bring them to the desired state. The Agent, with its ability to access database logs inside the data plane, provides the tool we need to detect critical logs in a controlled manner. In fact, at the time we implemented the feature for ingesting these logs, the Agent was already capable of tailing MongoDB logs in search of particular patterns.
This is how the Performance Advisor feature works, in which the Agent looks for slow query logs above a certain operation duration threshold to alert users of potentially inefficient data queries. For the purposes of corruption detection we introduced a new feature for defining additional log patterns for the Agent to look for: for example, a pattern that matches the log line indicating an invalid checksum when data is read from disk. If the Agent observes a line that matches a specified pattern it will send a message to the control plane reporting when the message was observed, in which process, along with high-level information–such as an error code–but without further information about the activity of the cluster. The next step of this process, once evidence of corruption is detected, is to assess the extent of the problem and gather additional information to inform our response. This brings us to the next piece of our corruption management system: once we become aware of the possibility of corruption, how do we pinpoint it and determine a remediation plan? Scanning databases to pinpoint identified corruption So far, we have outlined our system for detecting runtime errors symptomatic of corrupt data. However, the detection of these errors by itself is not sufficient to fully solve the problem of data corruption in a global cloud database platform. It enables early awareness of potential corruption within the Atlas fleet, but when it is time to diagnose and treat a specific case we often need more exhaustive tools. The overall picture here is not unlike the treatment of an illness: so far, what we have described is our system for detecting symptoms. Once symptoms have been identified, further testing may be needed to determine the correct course of treatment. In our case, we may need to perform further scanning of data to identify the extent of corruption and the specific information affected. 
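To make the Agent's log-pattern detection step more concrete, here is a minimal sketch of the idea: match each log line against a small set of patterns and report only high-level metadata. The patterns, error codes, and the `CorruptionSignal` type are all hypothetical; the real Agent's patterns and message format are not described in this post.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern table: error code -> regex for a suspicious log line.
PATTERNS = {
    "CHECKSUM_MISMATCH": re.compile(r"checksum (error|mismatch)", re.IGNORECASE),
    "OUT_OF_ORDER_KEYS": re.compile(r"keys? out of order", re.IGNORECASE),
}


@dataclass(frozen=True)
class CorruptionSignal:
    """Metadata sent onward; deliberately excludes the log message itself."""
    observed_at: str
    process: str
    error_code: str


def scan_line(observed_at: str, process: str, message: str) -> Optional[CorruptionSignal]:
    """Return a signal if the line matches a known pattern, else None."""
    for code, pattern in PATTERNS.items():
        if pattern.search(message):
            return CorruptionSignal(observed_at, process, code)
    return None


signal = scan_line(
    "2024-01-01T00:00:00Z", "mongod-1", "read checksum error for block at offset 4096"
)
ignored = scan_line("2024-01-01T00:00:01Z", "mongod-1", "connection accepted")
```

Only the timestamp, process name, and error code leave the function, which mirrors the data-isolation goal: the content of the log line stays where it was written.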
The ability to scan MongoDB databases for corruption relies on two of the most fundamental concepts of a database, indexes and replication . These two basic features of MongoDB each come with certain simple logical assumptions that they adhere to in a healthy state. By scanning for specific index entries or replicated data that violate these assumptions, we are able to pinpoint the location of corrupt data, a necessary step towards determining a remediation path. Indexes–data structures generated by a database to allow for quick lookup of information–relate to the contents of a database following specific logical constraints. For example, if a given collection is using an index on the lastName field and contains a document with a lastName value of “Turing,” the value “Turing” should be present in that collection’s index keys. A violation of this constraint, therefore, could point to the location of corrupt data; the absence of an in-use lastName value in the index would indicate that either the index has become corrupt or the lastName value on the document itself has become corrupt. Because almost all indexes are specified by the customer, Atlas does not have control over how the data in a cluster is indexed. In practice, though, the most frequently-accessed data tends to be highly indexed, making index scanning a valuable tool in validating the most critical data in a cluster. Replicated data, similarly, adheres to certain constraints in a healthy state: namely, that replicas of data representing the same point in time should be identical to one another. As such, replicated data within a database can also be scanned at a common point in time to identify places where data has diverged as a result of corruption. If two of three replicas in a cluster show a lastName value of “Turing” for a given document but the third shows “Toring” 4 , we have a clear reason to suspect that this particular document’s data has become corrupt. 
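Both constraints can be expressed as small checks. The sketch below is illustrative only: it models a collection as a list of dictionaries and an index as a set of keys, and the function names are hypothetical rather than part of any MongoDB tool.

```python
from collections import Counter


def missing_index_keys(documents, index_keys, field):
    """Return the _ids of documents whose field value is absent from the index."""
    return [
        doc["_id"]
        for doc in documents
        if field in doc and doc[field] not in index_keys
    ]


def divergent_replicas(values_by_node):
    """Return the nodes whose value differs from the value held by a strict majority."""
    majority_value, count = Counter(values_by_node.values()).most_common(1)[0]
    if count <= len(values_by_node) / 2:
        raise ValueError("no majority value; cannot single out a divergent node")
    return sorted(node for node, value in values_by_node.items() if value != majority_value)


documents = [{"_id": 1, "lastName": "Turing"}, {"_id": 2, "lastName": "Hopper"}]
index_keys = {"Turing"}  # "Hopper" is missing, so document 2 is flagged
flagged_ids = missing_index_keys(documents, index_keys, "lastName")

suspects = divergent_replicas(
    {"node-a": "Turing", "node-b": "Turing", "node-c": "Toring"}
)
```

Neither check says which side is wrong. A missing key may mean a corrupt index or a corrupt document, and agreement between two replicas is strong evidence rather than proof that they are healthy.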
Since all Atlas clusters are deployed with at least three redundant copies of data, replication is always available to be leveraged when repairing corruption on Atlas. This is, of course, easier said than done. In practice, performing integrity scanning of indexes and replicated data for a very large database requires processing a large amount of complex data. In the past, performing such an operation was often infeasible on a database cluster that was actively serving reads and writes. The db.collection.validate command was one of the first tools we developed at MongoDB for performing integrity scans of index data, but it comes with certain restrictions. The command obtains an exclusive lock on the collection it is validating, which means it will block reads and writes on the collection until it is finished. We still use this command as part of our corruption scanning strategy, but because of its limitations this means it is often only feasible to run on an offline copy of a database restored from a backup snapshot. This can be expensive, and comes with the overhead of managing additional hardware for performing validations on offline copies. With this in mind, we have been developing new tools for scanning for data corruption that are more feasible to run in the background of a cluster actively serving traffic. Our latest tools for detecting inconsistencies in replica sets utilize the replication process to perform background validation of data while a cluster is processing ordinary operations, and can be rate-limited based on the available resources on the cluster. When this process is performed, the primary node will begin by iterating through a collection in order, pulling a small batch of data into memory that stays within the bounds of a specified limit. 
It will make note of the range of the data being analyzed and produce an MD5 hash , writing this information to an entry in the oplog , a transaction log maintained by MongoDB that is replayed by secondary nodes in the database. When the secondaries of the cluster encounter this entry in their copies of the oplog, they perform the same calculation based on their replicas of the data, generating their own hash belonging to the indicated range of data at the specified point in time. By comparing a secondary’s hash with the original hash recorded by the primary, we can determine whether or not this batch of data is consistent between the two nodes. The result of this comparison (consistent or inconsistent) is then logged to an internal collection in the node’s local database . In this manner small batches of data are processed until the collection has been fully scanned. Figure 3: Data consistency between replicas is validated by leveraging the oplog. This form of online data consistency scanning has allowed us to scan our own internal databases without interruption to their ordinary operations, and is a promising tool for expanding the scale of our data corruption scanning without needing to manage large amounts of additional hardware for performing offline validations. We do, nonetheless, recognize there will be some cases where running an online validation may be unviable, as in the cases of clusters with very limited available CPU or memory. For this reason, we continue to utilize offline validations as part of our strategy, trading the cost of running additional hardware for the duration of the validation for complete isolation between the validation workload and the application workload. Overall, utilizing both online and offline approaches in different cases gives us the flexibility we need to handle the wide range of data characteristics we encounter. 
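The batch-and-compare flow can be sketched in a few lines. This is a simplified model under stated assumptions: documents are plain dictionaries ordered by `_id`, the "oplog entry" is a dictionary holding the range and its MD5 hash, and the function names are hypothetical. It leaves out rate limiting and point-in-time reads, both of which the real process depends on.

```python
import hashlib


def hash_docs(docs):
    """MD5 over the documents in _id order, using a stable text encoding."""
    digest = hashlib.md5()
    for doc in sorted(docs, key=lambda d: d["_id"]):
        digest.update(repr(sorted(doc.items())).encode("utf-8"))
    return digest.hexdigest()


def primary_entries(docs, batch_size):
    """Primary side: walk the collection in order and record one entry per batch."""
    ordered = sorted(docs, key=lambda d: d["_id"])
    entries = []
    for start in range(0, len(ordered), batch_size):
        batch = ordered[start:start + batch_size]
        entries.append(
            {"min_id": batch[0]["_id"], "max_id": batch[-1]["_id"], "md5": hash_docs(batch)}
        )
    return entries


def secondary_results(entries, docs):
    """Secondary side: recompute each recorded range and compare with the primary."""
    results = []
    for entry in entries:
        in_range = [d for d in docs if entry["min_id"] <= d["_id"] <= entry["max_id"]]
        results.append(
            {
                "min_id": entry["min_id"],
                "max_id": entry["max_id"],
                "consistent": hash_docs(in_range) == entry["md5"],
            }
        )
    return results


primary = [{"_id": i, "lastName": "Turing"} for i in range(1, 6)]
secondary = [dict(doc) for doc in primary]
secondary[3]["lastName"] = "Toring"  # simulate divergence in the document with _id 4

results = secondary_results(primary_entries(primary, batch_size=2), secondary)
```

Because each entry carries its own range, an inconsistent result narrows the search to one small batch instead of the whole collection.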
Repairing corruption Once the location of corrupt data has been identified, the last step in our process is to repair it. Having several redundant copies of data in an Atlas cluster means that more often than not it is straightforward to rebuild the corrupt data. If there is an inconsistency present on a given node in a cluster, that node can be rebuilt by triggering an initial sync and designating a known healthy member as the sync source. Triggering this type of resync is sufficient to remediate both index inconsistencies and replication inconsistencies 5 as long as there is at least one known, healthy copy of data in a cluster. While it is typically the case that it is straightforward to identify a healthy sync source when repairing corruption–truly random failures would be unlikely to happen on more than one copy of data–there are some additional considerations we have to make when identifying a sync source. A given node in an Atlas cluster may have already had its data resynced at some point in the past in the course of ordinary operations. For example, if the cluster was migrated to a new set of hardware in the past, some or all nodes in the cluster may have already been rebuilt at least once before. For this reason, it is important for us to consider the history of changes in the cluster in the time after corruption may have been introduced to rule out any nodes that may have copied corrupt data, and to separately validate the integrity of the sync source before performing any repair. Once we are confident in a remediation plan and have coordinated any necessary details with our customers, we leverage internal features for performing automated resyncs on Atlas clusters to rebuild corrupt data. Very often, these repairs can be done with little interruption to a database’s operations. Known healthy nodes can continue to serve application traffic while data is repaired in the background. 
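One way to picture the sync-source consideration is as a filter over a cluster's resync history. The sketch below is purely illustrative: the `Node` record, its fields, and the rule used are assumptions made for the example, not a description of Atlas's internal tooling. A node is ruled out if it has a known inconsistency, or if it was rebuilt from a ruled-out node after corruption may have begun.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Node:
    name: str
    has_inconsistency: bool
    last_resync: Optional[datetime] = None   # None if the node was never rebuilt
    resync_source: Optional[str] = None      # name of the node it was rebuilt from


def sync_source_candidates(nodes, corruption_may_have_started):
    """Return the names of nodes not ruled out as sync sources."""
    suspect = {n.name for n in nodes if n.has_inconsistency}
    changed = True
    while changed:  # propagate suspicion along resync history until stable
        changed = False
        for n in nodes:
            copied_after = (
                n.last_resync is not None
                and n.last_resync >= corruption_may_have_started
            )
            if n.name not in suspect and copied_after and n.resync_source in suspect:
                suspect.add(n.name)
                changed = True
    return [n.name for n in nodes if n.name not in suspect]


nodes = [
    Node("node-a", has_inconsistency=True),
    Node("node-b", False, last_resync=datetime(2024, 3, 1), resync_source="node-a"),
    Node("node-c", False),
]
candidates = sync_source_candidates(nodes, datetime(2024, 2, 1))
```

A node that survives this filter is only a candidate; as noted above, its data would still be validated separately before it is used for a repair.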
Internal-facing functionality for repairing Atlas clusters has existed since the early days of the platform, but in recent years we have added features and levels of control to facilitate corruption remediation. In particular, in many cases we are able to perform optimized versions of the initial sync process by using cloud provider snapshots of healthy nodes to circumvent the sometimes-lengthy process of copying data directly between replica set members, reducing the overall time it takes to repair a cluster. In the rarer event that we need to perform a full logical initial sync of data, we can continue to perform this mode of data synchronization as well. After repair has completed, we finish by performing follow-up scanning to validate that the repair succeeded. We are still hard at work refining our systems for detecting and remediating data corruption. At the moment, much of our focus is on making our scanning processes as performant and thorough as possible and continuing to reduce the time it takes to identify new instances of corruption when they occur. With these systems in place it is our intention to make silent data corruption yet another detail of hardware management that the customers of Atlas don’t need to lose any sleep over, no matter what kinds of rare failures may occur. Acknowledgments The systems described here are the work of dozens of engineers across several teams at MongoDB. Significant developments in these areas in recent years were led by Nathan Blinn, Xuerui Fa, Nan Gao, Rob George, Chris Kelly, and Eric Sedor-George. A special thanks to Eric Sedor-George for invaluable input throughout the process of writing this post. 1 Instances of alpha radiation, often in the form of cosmic rays, introducing silent data corruption have been explored in many studies since the 1970s. 
For a recent overview of literature on this topic, see Reghenzani, Federico, Zhishan Guo, and William Fornaciari. "Software fault tolerance in real-time systems: Identifying the future research questions." ACM Computing Surveys 55.14s (2023): 1-30 . 2 Five instances of silent data corruption introducing inaccurate results in scientific research were identified in a review of computing systems at Los Alamos National Laboratory in S. E. Michalak, et al , "Correctness Field Testing of Production and Decommissioned High Performance Computing Platforms at Los Alamos National Laboratory," SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2014 3 A recent survey of the CPU population of Alibaba Cloud identified a corruption rate of 3.61 per 10,000 CPUs, see Wang, Shaobu, et al. "Understanding Silent Data Corruptions in a Large Production CPU Population." Proceedings of the 29th Symposium on Operating Systems Principles. 2023. Research by Google on its own datacenters identified CPU corruption on “the order of a few mercurial cores per several thousand machines,” see Hochschild, Peter H., et al. "Cores that don't count." Proceedings of the Workshop on Hot Topics in Operating Systems. 2021. Research by Meta on its own datacenters found that “hundreds of CPUs” demonstrated silent data corruption across “hundreds of thousands of machines,” see Dixit, Harish Dattatraya, et al. "Silent data corruptions at scale." arXiv preprint arXiv:2102.11245 (2021). 4 In practice, it is rare that corrupt data results in a legible value; more often data in this state would be simply illegible. 5 There are other methods of rebuilding indexes in a MongoDB database beyond what is described here; see db.collection.reIndex for information on rebuilding indexes without triggering an initial sync.
What’s New From MongoDB at AWS re:Invent 2024
As thousands of attendees make their way home after a week in Vegas—a week packed with learning, product launches, and round-the-clock events—we thought we’d reflect on the show’s highlights. MongoDB was excited to showcase our latest integrations and solutions with Amazon Web Services (AWS), which range from new ways to optimize generative AI, to faster, more cost-effective methods for modernizing applications. But first, we want to thank our friends at AWS for recognizing MongoDB as the AWS Technology Partner of the Year NAMER! This prestigious award recognizes AWS Technology Partners that are using AWS to lower costs, increase agility, and innovate faster. Announced during the re:Invent Partner Awards Gala, the Technology Partner of the Year Award is a testament to the specialization, innovation, and cooperation MongoDB and AWS have jointly brought to customers this year. In addition, MongoDB also received AWS Partner of the Year awards for Italy, Turkey, and Iberia. These awards follow wins in the ASEAN Global Software Partner of the Year and Taiwan Technology Partner of the Year categories earlier in the year, further demonstrating the global reach and popularity of the world’s most popular document database! Harnessing the potential of gen AI If 2024 (and 2023, and 2022…) was the year of gen AI excitement, then 2025 may turn out to be marked by realistic gen AI implementation. Indeed, we’ve seen customers shift their 2025 AI focus toward optimizing resource-intensive gen AI workloads to drive down costs—and to get the most out of this groundbreaking technology. Retrieval-augmented generation (RAG), one of the main ways companies use their data to customize the output of foundation models, has become the focus of this push for optimization. 
Customers are looking for easier ways to fine-tune their RAG systems, asking questions like, “How do I evaluate the efficiency and accuracy of my current RAG workflow?” To that end, AWS and MongoDB are introducing new services and technologies for enterprises to optimize RAG architecture compute costs, while also maintaining accuracy. First up is vector quantization. By reducing vector storage and memory requirements while preserving performance, these capabilities empower developers to build AI-enriched applications with more scale—and at a lower cost. Leading foundation models like Amazon Titan are already compatible with vector quantization, helping to maintain high accuracy of generated responses while simultaneously reducing costs. You can read more about vector quantization on our blog. As for RAG evaluation, AWS has launched a new feature for Amazon Bedrock called, naturally, RAG Evaluator. This tool allows Bedrock users to evaluate and monitor RAG Apps natively within the Bedrock environment, eliminating the need for third-party frameworks to run tests and comparisons. As a Knowledge Base for Amazon Bedrock, MongoDB Atlas is ready on day one to take advantage of Bedrock RAG Evaluator, allowing companies to gauge and compare the quality of their RAG Apps across different applications. The RAG Evaluator, built on several joint integrations and solutions AWS and MongoDB released in 2024, and vector quantization together can streamline the deployment of enterprise generative AI. For example, in October MongoDB, Anthropic, and AWS announced a joint solution to create a memory-enhanced AI agent . Together, the three partners offer enterprise-grade, trusted, secure technologies to build generative AI apps quickly and flexibly using a family of foundation models in a fully managed environment. Overall, MongoDB and AWS are making it easier—and more cost-effective—for developers to build innovative applications that harness the full potential of generative AI on AWS. 
From cars to startups to glue MongoDB and AWS have been hard at work on a number of other solutions for developers across industries. Here’s a quick roundup: AWS Amplify + AppSync + MongoDB For startups, or for any organization looking to quickly test and launch applications, speed is everything. That’s why MongoDB teamed up with AWS to create a full-stack solution that provides developers with the same high standards of performance and scalability they would demand for any app. By combining AWS Amplify, AWS AppSync, and MongoDB Atlas, AWS and MongoDB have created a full-stack solution that enables seamless front-end development, robust and scalable backend services, out-of-the-box CI/CD, and a flexible and powerful database solution, allowing developers to drastically reduce the coding time required to launch new applications. Check out this tutorial and repository for a starter template . Digital twins on AWS CMS For those in the automotive sector, MongoDB and AWS have developed a connected mobility solution to help remove the undifferentiated integration, or “technical plumbing” work, of connecting vehicles to the cloud. When used together, Connected Mobility Solution (CMS) on AWS and MongoDB Atlas help accelerate the development of next-generation digital twin use cases and applications, including connected car use cases. MongoDB’s document model allows easy and flexible modeling and storage of connected vehicle sensor data. Read our joint blog with AWS to learn how the MongoDB Atlas document model helps with data modeling of connected vehicles data and how this capability can be leveraged via AWS Automotive Cloud Developer Portal (ACDP). AWS Glue + MongoDB Atlas Speaking of undifferentiated plumbing, MongoDB Atlas is now integrated into AWS Glue’s visual interface. The new integration simplifies data integration between MongoDB Atlas and AWS, making it easy to build efficient ETL (Extract, Transform, Load) pipelines with minimal effort. 
With its visual interface, AWS Glue allows users to seamlessly transfer, transform, and load data to and from MongoDB Atlas without needing deep technical expertise in Spark or SQL. In this blog post , we look at how AWS Glue and MongoDB Atlas can transform the way you manage data movement. Buy with AWS In the spirit of making things easier for our joint customers, in early 2025 MongoDB will also join the AWS ‘Buy with AWS’ program. Once up and running, Buy With AWS will allow customers to pay for Atlas using their AWS account directly from the Atlas UI, further reducing friction for customers wanting to get started with Atlas on AWS. New Atlas Updates Announced at re:Invent Aside from our joint endeavors with AWS, MongoDB has also been hard at work on improving the core Atlas platform. Here’s an overview of what we announced: Asymmetrical sharding support for Terraform Atlas Provider Customers are constantly seeking ways to optimize costs to ensure they get the best value for their resources. With Asymmetrical Sharding, now available in the Terraform MongoDB Atlas Provider, MongoDB Atlas users can customize the Cluster Tier and IOPS for each shard, encouraging better resource allocation, improved operational efficiency, and cost savings as customer needs evolve. Atlas Flex Tier Our new Atlas Flex tier offers the scaled flexibility of serverless, with the cost-capped assurance of shared tier clusters. With Atlas Flex Tier, developers can build and scale applications cost-effectively without worrying about runaway bills or resource provisioning. New test bench feature in Query Converter At MongoDB, we firmly believe that the document model is the best way for customers to build applications with their data. In our latest update to Relational Migrator , we’ve introduced Generative AI to automatically convert SQL database objects and validate them using the test bench in a fraction of the time, producing deployment-ready code up to 90% faster. 
This streamlined approach reduces migration risks and manual development effort, enabling fast, efficient, and precise migrations to MongoDB. For more about MongoDB’s work with AWS—including recent announcements and the latest product updates—please visit the MongoDB Atlas on AWS page ! Visit our product page to learn more about MongoDB Atlas .
New Course for Building AI Applications with MongoDB on AWS
Developers everywhere want to expand the limits of what they can build with new generative AI technologies. But the AI market and its offerings have evolved so quickly that for many developers, keeping up can feel overwhelming. As we’ve entered the AI era, MongoDB and Amazon Web Services (AWS) have built upon our eight year partnership to deliver technology integrations—like MongoDB Atlas’s integrations with Amazon Bedrock and Amazon Q Developer (formerly CodeWhisperer)—that simplify the process of building and deploying gen AI applications. By combining MongoDB’s integrated operational and vector database capabilities with AWS’s AI infrastructure solutions, our goal is to make it easier for our developer community to innovate with AI. So, to help developers get started, we’re launching a new, free MongoDB Learning Badge focused on Building AI Applications with MongoDB on AWS . Building AI with MongoDB on AWS This is MongoDB University’s first AWS Learning Badge, and with it, we’ve focused on teaching developers how Amazon Bedrock and Atlas work together—including how to create a knowledge base in Amazon Bedrock, configure a knowledge base to use Atlas, inspect how a query is answered, create an Agent to answer questions based on data in Atlas, and configure guardrails that support responsible agentic behavior. In short, developers will learn how to remove the heavy lifting of infrastructure configuration and integration so they can get up and running with innovative new semantic search and RAG applications faster. Amazon Bedrock is a fully managed service from AWS that offers a choice of high-performing foundation models from leading AI companies via a single API, along with a broad set of capabilities organizations need to build secure, high-performing AI applications. Developers can connect Bedrock to MongoDB Atlas for blazing-fast vector searches and secure vector storage with minimal coding. 
With the integration, developers can use their proprietary data alongside industry-leading foundation models to launch AI applications that deliver hyper-intelligent and hyper-relevant results. Tens of thousands of customers are running MongoDB Atlas on AWS, and many have already embarked successfully on cutting-edge AI journeys. Take Scalestack, for example, which used MongoDB Atlas Vector Search to build a RAG-powered AI copilot, named Spotlight, and is now using Bedrock’s customizable models to enhance Spotlight’s relevance and performance. Meanwhile, Base39, a Brazilian fintech provider, used MongoDB Atlas and Amazon Bedrock to automate loan analysis, decreasing decision time from three days to one hour and reducing cost per loan analysis by 96%. Badge up with MongoDB MongoDB Learning Badges are a powerful way to demonstrate your dedication to continuous learning. These digital credentials not only validate your educational accomplishments but also stand as a testament to your expertise and skill. Whether you're a seasoned developer, an aspiring data scientist, or an enthusiastic student, earning a MongoDB badge can elevate your professional profile and unlock new opportunities in your field. Learn, prepare, and earn Complete the Learning Badge Path and pass a brief assessment to earn your badge. Upon completion, you'll receive an email with your official Credly badge and digital certificate, ready to share on social media, in email signatures, or on your resume. Additionally, you'll gain inclusion in the Credly Talent Directory, where you will be visible to recruiters from top employers. Millions of builders have been trained through MongoDB University courses. Join them and get started building your AI future with MongoDB Atlas and AWS. And if you’re attending AWS re:Invent 2024, come find MongoDB at Booth #824. The first 100 people to receive their learning badge will receive a special gift!
AI-Powered Call Centers: A New Era of Customer Service
Customer satisfaction is critical for insurance companies. Studies have shown that companies with superior customer experiences consistently outperform their peers. In fact, McKinsey found that life and property/casualty insurers with superior customer experiences saw a significant 20% and 65% increase in Total Shareholder Return , respectively, over five years. A satisfied customer is a loyal customer. They are 80% more likely to renew their policies, directly contributing to sustainable growth. However, one major challenge faced by many insurance companies is the inefficiency of their call centers. Agents often struggle to quickly locate and deliver accurate information to customers, leading to frustration and dissatisfaction. This article explores how Dataworkz and MongoDB can transform call center operations. By converting call recordings into searchable vectors (numerical representations of data points in a multi-dimensional space), businesses can quickly access relevant information and improve customer service. We'll dig into how the integration of Amazon Transcribe, Cohere, and MongoDB Atlas Vector Search—as well as Dataworkz's RAG-as-a-service platform— is achieving this transformation. From call recordings to vectors: A data-driven approach Customer service interactions are goldmines of valuable insights. By analyzing call recordings, we can identify successful resolution strategies and uncover frequently asked questions. In turn, by making this information—which is often buried in audio files— accessible to agents, they can give customers faster and more accurate assistance. However, the vast volume and unstructured nature of these audio files make it challenging to extract actionable information efficiently. 
To address this challenge, we propose a pipeline that leverages AI and analytics to transform raw audio recordings into vectors, as shown in Figure 1:

- Storage of raw audio files: Past call recordings are stored in their original audio format.
- Processing of the audio files with AI and analytics services (such as Amazon Transcribe Call Analytics): speech-to-text conversion, summarization of content, and vectorization.
- Storage of vectors and metadata: The generated vectors and associated metadata (e.g., call timestamps, agent information) are stored in an operational data store.

Figure 1: Customer service call insight extraction and vectorization flow

Once the data is stored in vector format within the operational data store, it becomes accessible for real-time applications. This data can be consumed directly through vector search or integrated into a retrieval-augmented generation (RAG) architecture, a technique that combines the capabilities of large language models (LLMs) with external knowledge sources to generate more accurate and informative outputs.

Introducing Dataworkz: Simplifying RAG implementation

Building RAG pipelines can be cumbersome and time-consuming for developers who must learn yet another stack of technologies. Especially in this initial phase, where companies want to experiment and move fast, it is essential to leverage tools that allow us to abstract complexity and don’t require deep knowledge of each component in order to experiment with and realize the benefits of RAG quickly. Dataworkz offers a powerful and composable RAG-as-a-service platform that streamlines the process of building RAG applications for enterprises. To operationalize RAG effectively, organizations need to master five key capabilities:

- ETL for LLMs: Dataworkz connects with diverse data sources and formats, transforming the data to make it ready for consumption by generative AI applications.
- Indexing: The platform breaks down data into smaller chunks and creates embeddings that capture semantics, storing them in a vector database.
- Retrieval: Dataworkz ensures the retrieval of accurate information in response to user queries, a critical part of the RAG process.
- Synthesis: The retrieved information is then used to build the context for a foundational model, generating responses grounded in reality.
- Monitoring: With many moving parts in the RAG system, Dataworkz provides robust monitoring capabilities essential for production use cases.

Dataworkz's intuitive point-and-click interface (as seen in Video 1) simplifies RAG implementation, allowing enterprises to quickly operationalize AI applications. The platform offers flexibility and choice in data connectors, embedding models, vector stores, and language models. Additionally, tools like A/B testing ensure the quality and reliability of generated responses. This combination of ease of use, optionality, and quality assurance is a key tenet of Dataworkz's "RAG as a Service" offering.

Diving deeper: System architecture and functionalities

Now that we’ve looked at the components of the pre-processing pipeline, let’s explore the proposed real-time system architecture in detail. It comprises the following modules and functions (see Figure 2):

- Amazon Transcribe, which receives the audio coming from the customer’s phone and converts it into text.
- Cohere’s embedding model, served through Amazon Bedrock, which vectorizes the text coming from Transcribe.
- MongoDB Atlas Vector Search, which receives the query vector and returns a document that contains the most semantically similar FAQ in the database.
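The retrieval step performed by the last module can be sketched as follows. The first two functions show the underlying idea (nearest neighbor by cosine similarity) on toy three-dimensional vectors; the third builds an Atlas Vector Search aggregation pipeline using the `$vectorSearch` stage. The index name `faq_vector_index`, the field names `embedding` and `answer`, and the `numCandidates` value are assumptions made for illustration, not details taken from the demo.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar_faq(query_vector, faqs):
    """Return the FAQ whose embedding is closest to the query vector."""
    return max(faqs, key=lambda faq: cosine_similarity(query_vector, faq["embedding"]))


def vector_search_pipeline(query_vector, index="faq_vector_index", path="embedding"):
    """Build an aggregation pipeline for Atlas Vector Search (names are hypothetical)."""
    return [
        {
            "$vectorSearch": {
                "index": index,
                "path": path,
                "queryVector": query_vector,
                "numCandidates": 100,
                "limit": 1,
            }
        },
        {"$project": {"_id": 0, "answer": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


# Toy embeddings; real ones would come from the embedding model.
faqs = [
    {"answer": "Home insurance typically includes...", "embedding": [1.0, 0.0, 0.0]},
    {"answer": "To add a new driver to your auto insurance policy...", "embedding": [0.0, 1.0, 0.0]},
]
best = most_similar_faq([0.9, 0.1, 0.0], faqs)
pipeline = vector_search_pipeline([0.9, 0.1, 0.0])
```

With a driver such as PyMongo, a pipeline of this shape would be passed to a collection's `aggregate` method; the in-memory version exists only to show what the search is computing.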
Figure 2: System architecture and modules
Here are a couple of FAQs we used for the demo:
Q: “Can you explain the different types of coverage available for my home insurance?”
A: “Home insurance typically includes coverage for the structure of your home, your personal belongings, liability protection, and additional living expenses in case you need to temporarily relocate. I can provide more detailed information on each type if you'd like.”
Q: “What is the process for adding a new driver to my auto insurance policy?”
A: “To add a new driver to your auto insurance policy, I'll need some details about the driver, such as their name, date of birth, and driver's license number. We can add them to your policy over the phone, or you can do it through our online portal.”
Note that the stored question is shown only for reference; it is not used for retrieval. The actual question is provided by the user through the voice interface and then matched in real time against the answers in the database using Vector Search. The matching answer is finally presented to the customer service operator in text form (see Figure 3). The proposed architecture is simple yet powerful, easy to implement, and effective. Moreover, it can serve as a foundation for more advanced use cases that require complex interactions, such as agentic workflows and iterative, multi-step processes that combine LLMs and hybrid search to complete sophisticated tasks.
Figure 3: App interface, displaying what has been asked by the customer (left) and how the information is presented to the customer service operator (right)
This solution not only impacts human operator workflows but can also underpin chatbots and voicebots, enabling them to provide more relevant and contextual customer responses.
Building a better future for customer service
By seamlessly integrating analytical and operational data streams, insurance companies can significantly enhance both operational efficiency and customer satisfaction.
Our system empowers businesses to optimize staffing, accelerate inquiry resolution, and deliver superior customer service through data-driven, real-time insights. To embark on your own customer service transformation, explore our GitHub repository and take advantage of the Dataworkz free tier.
Better Digital Banking Experiences with AI and MongoDB
Interactive banking represents a new era in financial services where customers engage with digital platforms that anticipate, understand, and meet their needs in real-time. This approach encompasses AI-driven technologies such as chatbots, virtual assistants, and predictive analytics that allow banks to enhance digital self-service while delivering personalized, context-aware interactions. According to Accenture’s 2023 consumer banking study , 44% of consumers aged 18-44 reported difficulty accessing human support when needed, underscoring the demand for more responsive digital solutions that help bridge this gap between customers and financial services. Generative AI technologies like chatbots and virtual assistants can fill this need by instantly addressing inquiries, providing tailored financial advice, and anticipating future needs. This shift has tremendous growth potential; the global chatbot market is expected to grow at a CAGR of 23.3% from 2023 to 2030 , with the financial sector experiencing the fastest growth rate of 24.0%. This shift is more than just a convenience; it aims to create a smarter, more engaging, and intuitive banking journey for every user. Simplifying self-service banking with AI Navigating daily banking activities like transfers, payments, and withdrawals can often raise immediate questions for customers: “Can I overdraft my account?” “What will the penalties be?” or “How can I avoid these fees?” While the answers usually lie within the bank’s terms and conditions, these documents are often dense, complex, and overwhelming for the average user. At the same time, customers value their independence and want to handle their banking needs through self-service channels, but wading through extensive fine print isn't what they signed up for. By integrating AI-driven advisors into the digital banking experience, banks can provide a seamless, in-app solution that delivers instant, relevant answers. 
This removes the need for customers to leave the app to sift through pages of bank documentation in search of answers, or worse, endure the inconvenience of calling customer service. The result is a smoother, more user-friendly interaction, where customers feel supported in their self-service journey, free from the frustration of navigating traditional, cumbersome information sources. The entire experience remains within the application, enhancing convenience and efficiency.
Solution overview
This AI-driven solution enhances the self-service experience in digital banking by applying retrieval-augmented generation (RAG) principles, which combine the power of generative AI with reliable information retrieval, ensuring that the chatbot provides accurate, contextually relevant responses. The approach begins by processing dense, text-heavy documents, like terms and conditions, which are often the source of customer inquiries. These documents are divided into smaller, manageable chunks, which are then vectorized to create searchable data representations. Storing these vectorized chunks in MongoDB Atlas allows for efficient querying using MongoDB Atlas Vector Search, making it possible to instantly retrieve relevant information based on the customer’s question.
Figure 1: Detailed solution architecture
When a customer inputs a question in the banking app, the system quickly identifies and retrieves the most relevant chunks using semantic search. The AI then uses this information to generate clear, contextually relevant answers within the app, enabling a smooth, frustration-free experience without requiring customers to sift through dense documents or contact support.
Figure 2: Leafy Bank mock-up chatbot in action
How MongoDB supports AI-driven banking solutions
MongoDB offers unique capabilities that empower financial institutions to build and scale AI-driven applications.
Unified data model for flexibility: MongoDB’s flexible document model unifies structured and unstructured data, creating a consistent dataset that enhances the AI’s ability to understand and respond to complex queries. This model enables financial institutions to store and manage customer data, transaction history, and document content within a single system, streamlining interactions and making AI responses more contextually relevant.
Vector search for enhanced querying: MongoDB Atlas Vector Search makes it easy to perform semantic searches on vectorized document chunks, quickly retrieving the most relevant information to answer user questions. This capability allows the AI to find precise answers within dense documents, enhancing the self-service experience for customers.
Scalable integration with AI models: MongoDB is designed to work seamlessly with leading AI frameworks, allowing banks to integrate and scale AI applications quickly and efficiently. By aligning MongoDB Atlas with cloud-based LLM providers, banks can use the best tools available to interpret and respond to customer queries accurately, meeting demand with responsive, real-time answers.
High performance and cost efficiency: MongoDB’s multi-cloud, developer-friendly platform allows financial institutions to innovate without costly infrastructure changes. It’s built to scale as data and AI needs grow, ensuring banks can continually improve the customer experience with minimal disruptions. MongoDB’s built-in scalability allows banks to expand their AI capabilities effortlessly, offering a future-proof foundation for digital banking.
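The chunk-then-retrieve flow described in the solution overview can be illustrated with a small, self-contained sketch. It is a toy under stated assumptions, not the solution's implementation: `toy_embed` is a bag-of-words stand-in for a real embedding model, the in-memory ranking stands in for MongoDB Atlas Vector Search, and all function names and the chunk sizes are hypothetical.

```python
from __future__ import annotations

import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of `chunk_size` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a sparse word-count vector."""
    return Counter(word.strip(".,?!:;").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the `top_k` chunks most similar to the question."""
    query = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, toy_embed(c)), reverse=True)
    return ranked[:top_k]


terms = (
    "Overdraft fees apply when the balance falls below zero. "
    "Wire transfers are processed within two business days. "
    "Cards can be frozen in the app at any time."
)
chunks = chunk_text(terms, chunk_size=10, overlap=2)
best = retrieve("What are the overdraft fees?", chunks)
```

In a real deployment, each chunk would be stored alongside its embedding, and the ranking would be delegated to the vector index instead of being computed in application code. The overlap between chunks is a common way to avoid cutting a sentence's context in half at a chunk boundary.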
Building future-proof applications
Implementing generative AI presents several advantages, not only for end users of interactive banking applications but also for financial institutions. An enhanced user experience increases customer satisfaction, improves retention, boosts reputation, and reduces customer turnover, while unlocking new opportunities for cross-selling and up-selling to increase revenue, drive growth, and elevate customer value. Moreover, adopting AI-driven initiatives lays the groundwork for businesses to develop innovative, future-proof applications that address customer needs, and to upgrade business applications with features that are shaping the industry and will continue to do so. Here are some examples:
Summarize and categorize transactional information by powering applications with MongoDB’s Real-Time Analytics.
Understand and find trends in customer behavior that could benefit fraud prevention, anti-money laundering (AML), and credit card applications, to mention just a few.
Offer investing, budgeting, and loan assessments through an AI-powered conversational banking experience.
In today’s data-driven world, companies face increasing pressure to stay ahead of rapid technological advancements and ever-evolving customer demands. Now more than ever, businesses must deliver intuitive, robust, and high-performing services through their applications to remain competitive and meet user expectations. Luckily, MongoDB provides businesses with comprehensive reference architectures for building generative AI applications, an end-to-end technology stack that includes integrations with leading technology providers, professional services, and a coordinated support system through the MongoDB AI Applications Program (MAAP).
By building AI-enriched applications with the leading multi-cloud developer data platform, companies can leverage low-cost, efficient solutions through MongoDB’s flexible and scalable document model, which empowers businesses to unify real-time, operational, unstructured, and AI-related data, extending and customizing their applications to seize upcoming technological opportunities.
Check out these additional resources to get started on your AI journey with MongoDB:
How Leading Industries are Transforming with AI and MongoDB Atlas - E-book
Our Solutions Library, where you can learn about different use cases for gen AI and other topics applied to financial services and many other industries.
Influencing Product Strategy at MongoDB with Garaudy Etienne
Garaudy Etienne joined MongoDB as a Product Manager in October of 2019. Since then, he’s experienced tremendous growth. Successful deliveries of MongoDB 4.4 features and MongoDB 5.0 sharding features helped fuel Garaudy’s career development, as did his work establishing a long-term sharding vision, mentoring others, and successfully managing interns. Now, as a Director of Product, he’s defining the strategic direction across multiple products and helping grow our product management organization and culture. Read on to learn more about Garaudy’s experience at MongoDB and his expanding team. A team with impact My team focuses on distributed systems within MongoDB's core database functions, also known as the database engine. Our team ensures the database is reliable and scalable for our most demanding customers. We ensure the product consistently performs as promised, especially at scale. MongoDB's dependability drives greater usage, which enhances our revenue and brand perception. The problems my team works on are vast and relatively undefined. These include revamping our Go-To-Market strategy for new and existing features, guiding the engineering team on architectural decisions driven by customer demands, identifying target markets, and assisting customers in challenging situations. MongoDB and AI We’re in the early stages of the AI boom. MongoDB’s document model is particularly well-suited for this era, as it excels in handling unstructured data, which makes up the majority of today’s information. As AI increasingly relies on diverse formats like text, images, and videos, our flexible schema enables efficient storage and retrieval of unstructured data, enabling applications to extract valuable insights. Our vector search capability enables fast, complex data matching and retrieval, making it ideal for AI-powered applications. 
This synergy between MongoDB’s document model, Vector Search, and the needs of AI-driven applications positions us as a powerful foundation for companies looking to bring AI into their workflows. The beauty of working in the core database is that it has to support every workload, including the new and expanding Vector Search applications. This means we need to ensure the database remains robust and scalable as AI demands evolve. Examples include helping develop a more scalable architecture for Search or a new networking stack for Search. No matter what new capabilities MongoDB decides to deliver or the new markets we enter, everything must pass through the core database. This also allows you to meet lots of people and understand everything the company is doing instead of working in a silo.
A rewarding career in product
MongoDB is committed to career development, something I’ve experienced first-hand. The company has provided me with development opportunities through product management-specific training with Reforge, conferences, direct engagement with critical customers, and leadership training. As a product manager, I was offered mentorship and coaching with multiple experienced product leaders who provided guidance and support as I worked toward promotions. The company clearly communicates the expectations and requirements for advancement within the product management organization. Reflecting on my journey at MongoDB, I still remember the first two features I PM’d: Hedged Reads and Mirrored Reads. One of my first major highlights was presenting at the MongoDB 5.0 keynote to showcase resharding. Seeing genuine excitement from customers and internal teams about this new feature was incredibly fulfilling and reinforced its value. While the keynote was a public milestone, another personal highlight came when I finally visited one of my engineering teams in Barcelona after nearly two years of remote collaboration.
This in-person time was invaluable and helped us bring the groundbreaking sharding changes for MongoDB 6.0 to the finish line. Most recently, defining the key strategic pillars for MongoDB 8.0 and allowing other product managers to take ownership of key initiatives has been more rewarding than I imagined. MongoDB’s engineering team is extremely talented, and collaborating with them always brings me tremendous joy. The most recent highlight of my career has been building a diverse product team and helping other product managers make a larger impact than they previously envisioned. Why MongoDB What keeps me at MongoDB is the opportunity to tackle significant challenges, make autonomous decisions, own multiple products, and take on greater leadership responsibilities. MongoDB also rewards and recognizes product managers who drive meaningful impact across the organization and its products. If these opportunities excite you, you'll thrive as part of MongoDB’s product management team! For my team, I’m committed to providing the right balance of guidance and autonomy. Your decisions will have a lasting impact at the executive and organizational levels, creating continuous opportunities to excel and deliver meaningful results. Plus, I always try to make the job fun. Head to our careers site to apply for a role on Garaudy’s team and join our talent community to stay in the loop on all things #LifeAtMongoDB!
Customer Service Expert Wati.io Scales Up on MongoDB
Wati.io is a software-as-a-service (SaaS) platform that empowers businesses to develop conversation-driven strategies to boost growth. Founded by CEO Ken Yeung in 2019, Wati started as a chatbot solution for large enterprises, such as banks and insurance companies. However, over time, Yeung and his team noticed a growing need among small and medium-sized businesses (SMBs) to manage customer conversations more effectively. To address this need, Wati used MongoDB Atlas and built a solution based on the WhatsApp Business API. It enables businesses to manage and personalize conversations with customers, automate responses, improve commerce functions, and enhance customer engagement. Speaking at MongoDB.local Hong Kong in September 2024, Yeung said, “The current solutions on the market today are not good enough. Especially for SMBs [that] don’t have the same level of resources as enterprises to deal with the number of conversations and messages that need to be handled every day.” Supporting scale: From MongoDB Community Edition to MongoDB Atlas “From the beginning, we relied on MongoDB to handle high volumes of messaging data and enable businesses to manage and scale their customer interactions efficiently,” said Yeung. Wati originally used MongoDB Community Edition , as the company saw the benefits of a NoSQL model from the beginning. As the company grew, it realized it needed a scalable infrastructure, so Wati transitioned to MongoDB Atlas. “When we started reaching the 2 billion record threshold, we started having some issues. Our system slowed down, and we were not able to scale it,” said Yeung. Atlas has now become an essential part of Wati’s infrastructure, helping the company store and process millions of messages each month for over 10,000 customers in 165 countries. “Transitioning to a new platform—MongoDB Atlas—seamlessly was critical because our messaging system needs to be on 24/7,” said Yeung. 
Wati collaborated closely with the MongoDB Professional Services and MongoDB Support teams, and in a few months it was able to rearchitect the deployment and data model for future growth and demand. The work included optimizing Wati’s database by breaking it down into clusters. Wati then focused on extracting connections, such as conversations, and dividing and categorizing data within the clusters, for example by qualifying data as cold or hot based on read and write frequencies. This architecture underpins the platform’s core features, including automated customer engagement, lead qualification, and sales management.
Deepening search capabilities with MongoDB Atlas Search
For Wati’s customers, the ability to search through conversation histories and company documents to retrieve valuable information is a key function. This often requires searching through millions of records to rapidly find answers so that they can respond to customers in real time. By using MongoDB Atlas Search, Wati improved its search capabilities, ultimately helping its business customers perform more advanced analytics and improve their customer service agents’ efficiency and customer reporting. “[MongoDB] Atlas Search is really helpful because we don’t have to do a lot of technical integration, and minimal programming is required,” said Yeung.
Looking ahead: Using AI and integrating more channels
Wati expects to continue collaborating with MongoDB to add more features to its platform and keep innovating at speed. The company is currently exploring how to build more AI capabilities into Wati KnowBot, as well as how it can expand its integration with other conversation platforms and channels such as Instagram and Facebook.
To learn more about MongoDB Atlas, visit our product page. To get started with MongoDB Atlas Search, visit the Atlas Search product page.