Artificial Intelligence

LangChainGo and MongoDB: Powering RAG Applications in Go

March 31, 2025

Next-Generation Mobility Solutions with Agentic AI and MongoDB Atlas

Driven by advancements in vehicle connectivity, autonomous systems, and electrification, the automotive and mobility industry is undergoing a significant transformation. Vehicles today are sophisticated machines, computers on wheels, that generate massive amounts of data, driving demand for connected and electric vehicles. Automotive players are embracing artificial intelligence (AI), battery electric vehicles (BEVs), and software-defined vehicles (SDVs) to maintain their competitive advantage.

However, managing fleets of connected vehicles can be a challenge. As cars get more sophisticated and are increasingly integrated with internal and external systems, the volume of data they produce and receive greatly increases. This data needs to be stored, transferred, and consumed by various downstream applications to unlock new business opportunities. The challenge will only grow: the global fleet management market is projected to reach $65.7 billion by 2030, growing at a rate of almost 10.8% annually. A 2024 study conducted by Webfleet showed that 32% of fleet managers believe AI and machine learning will significantly impact fleet operations in the coming years; optimizing route planning and improving driver safety are the two most commonly cited use cases.

As fleet management software providers continue to invest in AI, the integration of agentic AI can significantly help with tasks like route optimization and driver safety enhancement. For example, AI agents can process real-time traffic updates and weather conditions to dynamically adjust routes, ensuring timely deliveries while advising drivers on the condition of their vehicles. This proactive approach contrasts with traditional reactive methods, improving vehicle utilization and reducing operational and maintenance costs.

But what are agents? In short, they are operational applications that attempt to achieve goals by observing the world and acting upon it using the data and tools the application has at its disposal.
The term "agentic" denotes having agency, as AI agents can proactively take steps to achieve objectives without constant human oversight. For example, rather than just reporting an anomaly based on telemetry data analysis, an agent for a connected fleet could autonomously cross-check that anomaly against known issues, decide whether it is critical, and schedule a maintenance appointment all on its own.

Why MongoDB for agentic AI

Agentic AI applications are dynamic by nature: they need to create a chain of thought, use external tools, and maintain context across their entire workflow. These applications generate and consume diverse data types, including structured and unstructured data. MongoDB’s flexible document model is uniquely suited to handle both structured and unstructured data, along with their vector embeddings. It allows all of an agent’s context, chain of thought, tool metadata, and short-term and long-term memory to be stored in a single database. This means that developers can spend more time on innovation and rapidly iterate on agent designs without being constrained by the rigid schemas of a legacy relational database.

Figure 1. Major components of an AI agent.

Figure 1 shows the major components of an AI agent. The agent first receives a task from a human or via an automated trigger, and then uses a large language model (LLM) to generate a chain of thought or follow a predetermined workflow. The agent uses various tools and models during its run and stores and retrieves data from a memory provider like MongoDB Atlas.

Tools: The agent uses tools to interact with its environment. These can include API methods, database queries, vector search, RAG applications, or anything else that supports the model.
Models: A model can be a large language model (LLM), a vision language model (VLM), or a simple supervised machine learning model. Models can be general purpose or specialized, and agents may use more than one.
Data: An agent requires different types of data to function.
MongoDB’s document model allows you to easily model all of this data in a single database.

An agentic AI application spans a wide range of functional tools and context. The underlying data structures evolve throughout the agentic workflow as the agent uses different tools to complete a task, and the agent builds up memory over time. The typical data types in an agentic AI application are:

Agent profile: This contains the identity of the agent. It includes instructions, goals, and constraints.
Short-term memory: This holds temporary, contextual information (recent data inputs or ongoing interactions) that the agent uses in real time. For example, short-term memory could store sensor data from the last few hours of vehicle activity. In certain agentic AI frameworks like LangGraph, short-term memory is implemented through a checkpointer. The checkpointer stores intermediate states of the agent’s actions and/or reasoning. This memory allows the agent to seamlessly pause and resume operations.
Long-term memory: This is where the agent stores knowledge accumulated over time. It may include patterns, trends, logs, and historical recommendations and decisions.

By storing each of these data types in rich, nested documents in MongoDB, AI developers can create a single-view representation of an agent’s state and behavior. This enables fast retrieval and simplifies development.

In addition to the document model advantage, building agentic AI solutions for mobility requires a robust data infrastructure. MongoDB Atlas offers several key advantages that make it an ideal foundation for these AI-driven architectures. These include:

Scalability and flexibility: Connected car platforms like fleet management systems need to handle extreme data volumes and variety. MongoDB Atlas is proven to scale horizontally across cloud clusters, letting you ingest millions of telemetry events per minute and store terabytes of telemetry data with ease.
For example, the German company ZF uses MongoDB to process 90,000 vehicle messages per minute (over 50 GB of data per day) from hundreds of thousands of connected cars. The flexibility of the document model accelerates development and ensures your data model stays aligned with the real-world entities it represents.
Built-in vector search: AI agents require a robust set of tools to work with. One of the most widely used tools is vector search, which allows agents to perform semantic searches on unstructured data like driver logs, error code descriptions, and repair manuals. MongoDB Atlas Vector Search allows you to store and index high-dimensional vectors alongside your documents and to perform semantic search over unstructured data. In practice, this means your AI embeddings live right next to the relevant vehicle telemetry and operational data in the database, simplifying architectures for use cases like the connected car incident advisor, in which a new issue can be matched against past issues before passing contextual information to the LLM. For more, check out this example of how an automotive OEM leverages vector search for audio-based diagnostics with MongoDB Atlas Vector Search.
Time series collections and real-time data processing: MongoDB Atlas is designed for real-time applications. It provides time series collections for storing connected car telemetry data, along with change streams and triggers that can react to new data instantly. This is crucial for agentic AI feedback loops, where data ingestion and learning happen continuously.
Best-in-class embedding models with Voyage AI: In early 2025, MongoDB acquired Voyage AI, a leader in embedding and reranking models. Voyage AI embedding models are currently being integrated into MongoDB Atlas, which means developers will no longer need to manage external embedding APIs, standalone vector stores, or complex search pipelines.
AI retrieval will be built into the database itself, making semantic search, vector retrieval, and ranking as seamless as traditional queries. This will reduce the time required to develop agentic AI applications.

Agentic AI in action: Connected fleet incident advisor

Figure 2 shows a list of use cases in the mobility sector, sorted by the capabilities that an agent might demonstrate. AI agents excel at managing multi-step tasks by maintaining context across tasks. They automate repetitive tasks better than robotic process automation (RPA), and they demonstrate human-like reasoning by revisiting and revising past decisions. These capabilities enable a wide range of applications both during the manufacturing of a vehicle and while it is on the road, connected and sending telemetry. We will review one use case in detail below and see how it can be implemented using MongoDB Atlas, LangGraph, OpenAI, and Voyage AI.

Figure 2. Major use cases of agentic AI in the mobility and manufacturing sectors.

In this use case, the AI agent connects to traditional fleet management software and supports the fleet manager in diagnosing issues and advising drivers. This is an example of a multi-step diagnostic workflow that is triggered when a driver submits a complaint about the vehicle's performance (for example, increased fuel consumption).

Figure 3 shows the sequence diagram of the agent. Upon receiving the driver complaint, it creates a chain of thought that follows a multi-step diagnostic workflow: the system ingests vehicle data such as engine codes and sensor readings, generates embeddings using the Voyage AI voyage-3-large embedding model, and performs a vector search using MongoDB Atlas to find similar past incidents. Once relevant cases are identified, those cases, along with selected telemetry data, are passed to the OpenAI gpt-4o LLM to generate a final recommendation for the driver (for example, to pull over immediately or to keep driving and schedule regular maintenance).
All data, including telemetry, past issues, session logs, agent profiles, and recommendations, is stored in MongoDB Atlas, ensuring traceability and the ability to refine diagnostics over time. Additionally, MongoDB Atlas is used as a checkpointer by LangGraph, which defines the agent's workflow.

Figure 3. Sequence diagram for a connected fleet advisor agentic workflow.

Figure 4 shows the agent in action, from receiving an issue to generating a recommendation. By leveraging MongoDB’s flexible data model and powerful Vector Search capabilities, agentic AI can transform fleet management through predictive maintenance and proactive decision-making.

Figure 4. The connected fleet advisor AI agent in action.

To set up the use case shown in this article, please visit our GitHub repository. And to learn more about MongoDB’s role in the automotive industry, please visit our manufacturing and automotive webpage.

Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO, Rick Houlihan.
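
The retrieval step of the diagnostic workflow (matching a new complaint against past incidents) can be sketched as an aggregation pipeline. This is a minimal sketch rather than the code from the GitHub repository: the index name `issues_vector_index`, the field names `embedding`, `issue`, and `recommendation`, and the helper `build_similar_incidents_pipeline` are all hypothetical. The block only builds the pipeline; running it would additionally require a collection with an Atlas Vector Search index and a query embedding produced by an embedding model.

```python
from typing import Any


def build_similar_incidents_pipeline(
    query_vector: list[float],
    limit: int = 3,
    num_candidates: int = 100,
) -> list[dict[str, Any]]:
    """Build an aggregation pipeline that finds past incidents similar to a new one."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be >= limit")
    return [
        {
            "$vectorSearch": {
                "index": "issues_vector_index",  # hypothetical index name
                "path": "embedding",             # hypothetical vector field
                "queryVector": query_vector,     # embedding of the new complaint
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Return only the fields the LLM prompt would need (hypothetical names),
            # plus the similarity score of each match.
            "$project": {
                "_id": 0,
                "issue": 1,
                "recommendation": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# A three-dimensional vector stands in for a real embedding here.
pipeline = build_similar_incidents_pipeline([0.1, 0.2, 0.3], limit=2)
print(pipeline[0]["$vectorSearch"]["limit"])
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)`, and the matched incidents would then be added to the LLM prompt together with the selected telemetry.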

April 4, 2025

Artificial Intelligence

MongoDB Powers M-DAQ’s Anti-Money Laundering Compliance Platform

Founded and headquartered in Singapore, M-DAQ Global is a fintech powerhouse providing seamless cross-border transactions for businesses worldwide. M-DAQ’s comprehensive suite of foreign exchange, collections, and payments solutions help organizations of all sizes navigate the complexities of global trade, offering FX clarity, certainty, and payment mobility. M-DAQ also offers AI-powered services like Know Your Business (KYB), onboarding, and advanced risk management tools. Amidst ever-evolving requirements, these enable business transactions across borders with ease, while staying compliant. One of M-DAQ's most innovative solutions, CheckGPT , is an AI-powered platform designed to streamline Anti-Money Laundering (AML) compliance. It was built on MongoDB Atlas , providing a strong foundation for designing multitenant data storage. This approach ensures that each client has a dedicated database, effectively preventing any data co-mingling. Traditional AML processes often involve tedious, time-consuming tasks, from document review, to background checks, to customer onboarding. By building CheckGPT, M-DAQ’s aim was to change this paradigm, and to leverage AI to automate (and speed) these manual processes. Today, CheckGPT allows businesses to process onboarding 30 times faster than traditional human processing. The platform also leverages MongoDB Atlas’s native Vector Search capabilities to power intelligent semantic searches across unstructured data. The challenge: Managing unstructured, sensitive data, and performing complex searches One of CheckGPT’s priorities was to improve processes around collecting, summarizing, and analyzing data, while flagging potential risks to customers quickly and accurately. Considering the vast number and complexity of data sets its AI platform had to handle, and the strict regulatory landscape the company operates in, it was crucial that M-DAQ chose a robust database. 
CheckGPT needed a database that could efficiently and accurately handle unstructured data, and adapt rapidly as the data evolved. The database also had to be highly secure; to function, the AI tool would have to handle highly sensitive data, and would need to be used by companies operating in highly regulated industries. Finally, CheckGPT needed to run complex, high-dimensional searches to support a wide range of queries and real-time information analysis.

MongoDB Atlas: A complete platform with unique features

According to M-DAQ, there are many benefits of using MongoDB Atlas’s document model:

Flexibility: MongoDB Atlas’s document model accommodates the evolving nature of compliance data, providing the flexibility needed to manage CheckGPT's dynamic data structures, such as onboarding documents and compliance workflows.
Security and performance: The MongoDB Atlas platform also ensures that data remains secure throughout its lifecycle. M-DAQ was able to implement a multi-tenancy architecture that securely isolates data across its diverse client base. This ensures that the platform can handle varying compliance demands while maintaining exceptional performance, giving M-DAQ’s customers the confidence that the AML processes handled by CheckGPT are compliant with stringent regulatory standards.
Vector search capabilities: MongoDB Atlas provides a unified development experience. In particular, MongoDB Atlas Vector Search enables real-time searches across a vast number of high-dimensional datasets. This makes it easier to verify documents, conduct background checks, and continuously monitor customer activity, ensuring fast and accurate results during AML processes.

“AI, together with the flexibility of MongoDB, has greatly impacted CheckGPT, enabling us to scale operations and automate complex AML compliance processes,” said Andrew Marchen, General Manager, Payments and Co-founder, Wallex at M-DAQ Global.
“This integration significantly reduces onboarding time, which typically took anywhere from four to eight hours up to three days depending on the document’s complexity, to less than 10 minutes. With MongoDB, M-DAQ is able to deliver faster and more accurate results while meeting customer needs in a secure and adaptable environment.”

The future of CheckGPT, powered by MongoDB

M-DAQ believes that AI and data-driven technologies and tools will continue to play a central role in automating complex processes. By employing AI, M-DAQ aims to improve operational efficiency, enhance customer experiences, and scale rapidly while maintaining high service standards. MongoDB’s flexibility and multi-cloud support will be key as M-DAQ plans to use single/multi-cluster and multi-region capabilities in the future.

M-DAQ aims to explore additional features that could enhance CheckGPT's scalability and performance. The company, for example, plans to expand its use of MongoDB in 2025 for future projects that automate complex processes like compliance, onboarding, and risk management.

Learn more about CheckGPT on their site. Visit our product page to learn more about MongoDB Atlas. Get started with MongoDB Atlas Vector Search today with our Atlas Vector Search Quick Start guide.
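
The database-per-client isolation described in this article can be illustrated with a small routing helper. This is a hedged sketch and not M-DAQ's implementation: the function `tenant_database_name`, the `tenant_` prefix, and the allowed-character rule are assumptions made for illustration. Invalid identifiers are rejected rather than rewritten, so that two different tenants can never be mapped to the same database name.

```python
import re

# Hypothetical rule: tenant ids may only contain letters, digits, "_" and "-".
_TENANT_ID_PATTERN = re.compile(r"^[A-Za-z0-9_-]+$")


def tenant_database_name(tenant_id: str, prefix: str = "tenant_") -> str:
    """Map a tenant identifier to the name of that tenant's dedicated database."""
    if not _TENANT_ID_PATTERN.match(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    name = f"{prefix}{tenant_id}"
    # MongoDB database names must have fewer than 64 characters.
    if len(name) >= 64:
        raise ValueError("resulting database name is too long")
    return name


print(tenant_database_name("acme-payments"))
```

With PyMongo, each request would then resolve its own database handle, for example `client[tenant_database_name(tenant_id)]`, so that queries for one client never touch another client's collections.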

April 1, 2025

Artificial Intelligence

Building Gen AI with MongoDB & AI Partners | February 2025

February was big for MongoDB—and, more importantly, for anyone looking to build AI applications that deliver highly accurate, relevant information (in other words, for everyone building AI apps). MongoDB announced the acquisition of Voyage AI , a pioneer in state-of-the-art embedding and reranking models that power next-generation AI applications. Because generative AI is by nature probabilistic, models can “hallucinate”, and generate false or misleading information. This can lead to serious risks, especially in cases or industries (e.g., financial services) where accurate information is paramount. To address this, organizations building AI apps need high-quality retrieval; they need to trust that the most relevant information is extracted from their data with precision. Voyage AI’s advanced embedding and reranking models enable applications to extract meaning from highly specialized and domain-specific text and unstructured data. With roots at Stanford and MIT, Voyage AI’s world-class team is trusted by AI innovators like Anthropic, LangChain, Harvey, and Replit. Integrating Voyage AI’s technology with MongoDB will enable organizations to easily build trustworthy, AI-powered applications by offering highly accurate and relevant information retrieval deeply integrated with operational data. For more, check out MongoDB CEO Dev Ittycheria’s blog post about Voyage AI , and what this means for developers and businesses (in short, delivering high-quality results at scale). Onward! P.S. If you’re in Vegas for HumanX this week, stop by booth 412 to say hi to MongoDB! Welcoming new AI and tech partners The Voyage AI news was hardly the only exciting development last month. In February 2025, MongoDB welcomed three new AI and tech partners that offer product integrations with MongoDB. Read on to learn more about each great new partner! CopilotKit Seattle-based CopilotKit provides open source infrastructure for in-app AI copilots. 
CopilotKit helps organizations build production-ready copilots and agents effortlessly. “We’re excited to be partnering with MongoDB to help companies build best-in-class copilots that leverage RAG & take action based on internal data,” said Uli Barkai, Co-Founder and Chief Marketing Officer at CopilotKit. “MongoDB made it dead simple to build a scalable vector database with operational data. This collaboration enables developers to easily ship production-grade RAG applications.” Varonis Varonis is the leader in data security, protecting data wherever it lives—across SaaS, IaaS, and hybrid cloud environments. Varonis’ cloud-native Data Security Platform continuously discovers and classifies critical data, removes exposures, and detects advanced threats with AI-powered automation. “Varonis’s mission is to protect data wherever it lives,” said David Bass, Executive Vice President of Engineering and Chief Technology Officer at Varonis. “We are thrilled to further advance our mission by offering AI-powered data security and compliance for MongoDB, the database of choice for high-performance application and AI development. With this integration, joint customers can automatically discover and classify sensitive data, detect abnormal activities, secure AI data pipelines, and prevent data leaks.” Xlrt Xlrt is an automated insight-generation platform that enables financial institutions to create innovative financial credit products at scale by simplifying the financial spreading process. “We are excited to partner with MongoDB Atlas to transform AI-driven financial workflows,” said Rupesh Chaudhuri, Chief Operating Officer and Co-Founder of Xlrt. “XLRT.ai leverages agentic AI, combining graph-based contextualization, vector search, and LLMs to redefine data-driven decision-making. 
With MongoDB's robust NoSQL and vector search capabilities, we’re delivering unparalleled efficiency, accuracy, and scalability in automating financial processes.” To learn more about building AI-powered apps with MongoDB, check out our AI Learning Hub and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving AI partner ecosystem. And visit the MongoDB AI Applications Program (MAAP) page to learn how MongoDB and the MAAP ecosystem helps organizations build applications with advanced AI capabilities.

March 12, 2025

Artificial Intelligence

ORiGAMi: A Machine Learning Architecture for the Document Model

The document model has proven to be the optimal paradigm for modern application schemas. At MongoDB, we've long understood that semi-structured data formats like JSON offer superior expressiveness compared to traditional tabular and relational representations. Their flexible schema accommodates dynamic and nested data structures, naturally representing complex relationships between data entities. However, the machine learning (ML) community has faced persistent challenges when working with semi-structured formats. Traditional ML algorithms, as implemented in popular libraries like scikit-learn and pandas , operate on the assumption of fixed-dimensional tabular data consisting of rows and columns. This fundamental mismatch forces data scientists to manually convert JSON documents into tabular form—a time-consuming process that requires significant domain expertise. Recent advances in natural language processing (NLP) demonstrate the power of Transformers in learning from unstructured data but their application to semi-structured data has been under-studied. To bridge this gap, MongoDB's ML research group has developed a novel Transformer-based architecture designed for supervised learning on semi-structured data (e.g., JSON data in a document model database). We call this new architecture ORiGAMi (Object Representation through Generative, Autoregressive Modelling), and we're excited to make it available to the community at github.com/mongodb-labs/origami . It includes components that make training a Transformer model feasible on datasets entailing as few as 200 labeled samples. By combining this data efficiency with the flexibility of Transformers, ORiGAMi enables prediction directly from semi-structured documents, without the cumbersome flattening and manual feature extraction required for tabular data representation. You can read more about our model on arXiv . 
Technical innovation

The key insight behind ORiGAMi lies in its tokenization strategy: documents are transformed into sequences of key-value pairs and special structural tokens that encode nested types like arrays and subdocuments. These token sequences serve as input to a Transformer model that is trained to predict the next token given a portion of the document, similar to how large language models (LLMs) are trained on text tokens.

What’s more, our modifications to the standard Transformer architecture include guardrails to ensure that the model only generates valid, well-formed documents, and a novel position encoding strategy that respects the order invariance of key/value pairs in JSON. These modifications also allow for much smaller models compared to LLMs, which can thus be trained on consumer hardware in minutes to hours depending on dataset size and complexity, versus days to weeks for LLMs.

By reformulating classification as a next-token prediction task, ORiGAMi can predict any field within a document, including complex types like arrays and nested subdocuments. This unified approach eliminates the need for separate models or preprocessing pipelines for different prediction tasks.

Example use case

Our initial focus has been supervised learning: training models from labeled data to make predictions on unseen documents. Let's explore a practical example of user segmentation.
Consider a collection where each document represents a user profile, containing both simple fields and complex nested structures:

    {
      "_id": "user_7842",
      "email": "sarah.chen@example.com",
      "signup_date": "2024-01-15",
      "device_history": [
        { "device": "mobile_ios", "first_seen": "2024-01-15", "last_seen": "2024-02-11" },
        { "device": "desktop_chrome", "first_seen": "2024-01-16", "last_seen": "2024-02-10" }
      ],
      "subscription": {
        "plan": "pro",
        "billing_cycle": "annual",
        "features_used": ["analytics", "api_access", "team_sharing"],
        "usage_metrics": {
          "storage_gb": 45.2,
          "api_calls_per_day": 1250,
          "active_projects": 8
        }
      },
      "user_segment": "enterprise_power_user" // <-- target field
    }

Suppose you want to automatically classify users into segments like "enterprise_power_user", "smb_growth", or "early_stage_startup" based on their behavior and characteristics. Some documents in your collection already have correct labels, perhaps assigned through manual analysis or customer interviews. Traditional ML approaches would require flattening this rich document structure, leading to very sparse tables and potentially losing important hierarchical relationships. With ORiGAMi, you can:

Train directly on the raw documents with existing labels
Preserve the full context of nested structures and arrays
Make predictions for the "user_segment" field on new users immediately after signup
Update predictions as user behavior evolves without rebuilding feature pipelines

Getting started with ORiGAMi

We're excited to be open-sourcing ORiGAMi (github.com/mongodb-labs/origami), and you can read more about our model on arXiv. We've also included a command-line interface that lets users make predictions without writing any code. Training a model is as simple as pointing ORiGAMi to your MongoDB collection:

    origami train <mongo-uri> -d app -c users

Once trained, you can generate predictions and seamlessly integrate them back into your MongoDB workflow.
For example, to predict user segments for new signups (from the analytics.signups collection) and write the resulting predictions back to an analytics.predicted collection in MongoDB:

    origami predict <mongo-uri> -d analytics -c signups --target user_segment --json | mongoimport -d analytics -c predicted

For those looking to dive deeper, we've also included several Jupyter notebooks in the repository that demonstrate advanced features and customization options. Model performance can be improved by adjusting the hyperparameters.

We're just scratching the surface of what's possible with document-native machine learning, and have many more use cases in mind. We invite you to explore the repository, contribute to the project, and share how you use ORiGAMi to solve real-world problems. Head over to the ORiGAMi GitHub repo, play around with it, and tell us about new ways of applying it and problems it’s well-suited to solving.
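
The tokenization idea described under "Technical innovation" (key-value pairs plus structural tokens for arrays and subdocuments) can be illustrated with a toy example. This is only a sketch of the general idea: the function `tokenize` and the token names (`<SUBDOC_START>`, `<ARRAY_START>`, `KEY:`, `VALUE:`) are hypothetical and are not ORiGAMi's actual vocabulary or code, and the sketch does not model the order-invariant position encoding mentioned above.

```python
from typing import Any


def tokenize(value: Any) -> list[str]:
    """Flatten a JSON-like value into a token sequence with structural markers."""
    tokens: list[str] = []
    if isinstance(value, dict):
        # A subdocument is wrapped in start/end markers; each key precedes its value.
        tokens.append("<SUBDOC_START>")
        for key, child in value.items():
            tokens.append(f"KEY:{key}")
            tokens.extend(tokenize(child))
        tokens.append("<SUBDOC_END>")
    elif isinstance(value, list):
        # An array is wrapped in its own pair of markers.
        tokens.append("<ARRAY_START>")
        for item in value:
            tokens.extend(tokenize(item))
        tokens.append("<ARRAY_END>")
    else:
        # Scalars become a single value token.
        tokens.append(f"VALUE:{value!r}")
    return tokens


document = {"plan": "pro", "features_used": ["analytics", "api_access"]}
for token in tokenize(document):
    print(token)
```

A sequence like this is the kind of input a next-token model could be trained on; predicting a target field then amounts to predicting the value token that follows that field's key token.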

March 11, 2025

Artificial Intelligence

AI-Powered Java Applications With MongoDB and LangChain4j

MongoDB is pleased to introduce its integration with LangChain4j , a popular framework for integrating large language models (LLMs) into Java applications. This collaboration simplifies the integration of MongoDB Atlas Vector Search into Java applications for building AI applications. The advent of generative AI has opened up many new possibilities for developing novel applications. These advancements have led to the development of AI frameworks that simplify the complexities of orchestrating and integrating LLMs and the various components of the AI stack , where MongoDB plays a key role as an operational and vector database. Simplifying AI development for Java The first AI frameworks to emerge were developed for Python and JavaScript, which were favored by early AI developers. However, Java remains widespread in enterprise software. This has led to the development of LangChain4j to address the needs of the Java ecosystem. While largely inspired by LangChain and other popular AI frameworks, LangChain4j is independently developed. As with other LLM frameworks, LangChain4j offers several advantages for developing AI systems and applications by providing: A unified API for integrating LLM providers and vector stores. This enables developers to adopt a modular approach with an interchangeable stack while ensuring a consistent developer experience. Common abstractions for LLM-powered applications, such as prompt templating, chat memory management, and function calling, offering ready-to-use building blocks for common AI applications like retrieval-augmented generation (RAG) and agents. Powering RAG and agentic systems with MongoDB and LangChain4j MongoDB worked with the LangChain4j open-source community to integrate MongoDB Atlas Vector Search into the framework, enabling Java developers to develop AI-powered applications from simple RAG to agentic applications. 
In practice, this means developers can now use the unified LangChain4j API to store vector embeddings in MongoDB Atlas and use Atlas Vector Search capabilities for retrieving relevant context data. These capabilities are essential for enabling RAG pipelines, where private, often enterprise data is retrieved based on relevancy and combined with the original prompt to get more accurate results in LLM-based applications. LangChain4j supports various levels of RAG, from basic to advanced implementations, making it easy to prototype and experiment before customizing and scaling your solution to your needs. A basic RAG setup with LangChain4j typically involves loading and parsing unstructured data from documents stored locally or on remote services like Amazon S3 or Azure Storage using the Document API. The process then transforms and splits the data, then embeds it to capture the semantic meaning of the content. For more details, check out the documentation on core RAG APIs . However, real-world use cases often demand solutions with advanced RAG and agentic systems. LangChain4j optimizes RAG pipelines with predefined components designed to enhance accuracy, latency, and overall efficiency through techniques like query transformation, routing, content aggregation, and reranking. It also supports AI agent implementation through dedicated APIs, such as AI Services and Tools , with function calling and RAG integration, among others. Learn more about the MongoDB Atlas Vector Search integration in LangChain4j’s documentation . MongoDB’s dedication to providing the best developer experience for building AI applications across different ecosystems remains strong, and this integration reinforces that commitment. We will continue strengthening our integration with LLM frameworks enabling developers to build more-innovative AI applications, agentic systems, and AI agents. Ready to start building AI applications with Java? 
Learn how to create your first RAG system by visiting our tutorial: How to Make a RAG Application With LangChain4j .
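
The basic RAG flow outlined in this article (split the source text, embed the pieces, retrieve the most relevant one, and combine it with the original prompt) can be sketched in a few lines. The sketch is written in Python purely for illustration and does not use the LangChain4j API: the helper names are hypothetical, and the bag-of-words "embedding" is a stand-in for a real embedding model and a vector store such as MongoDB Atlas Vector Search.

```python
import math
from collections import Counter


def split_text(text: str, chunk_size: int) -> list[str]:
    """Split text into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real pipeline would call an embedding model."""
    return Counter(word.strip(".,!?").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the retrieved context with the original question."""
    return "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"


text = (
    "Atlas Vector Search stores embeddings next to operational data. "
    "LangChain4j offers a unified API for Java developers."
)
chunks = split_text(text, chunk_size=9)
question = "Which API does LangChain4j offer?"
print(build_prompt(question, retrieve(question, chunks)))
```

In a real application, each of these steps would be replaced by the corresponding framework component, and the assembled prompt would be sent to an LLM.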

March 4, 2025

Artificial Intelligence

Why Vector Quantization Matters for AI Workloads

Key takeaways

As vector embeddings scale into the millions, memory usage and query latency surge, leading to inflated costs and poor user experience.
By storing embeddings in reduced-precision formats (int8 or binary), you can dramatically cut memory requirements and speed up retrieval.
Voyage AI's quantization-aware embedding models are specifically tuned to handle compressed vectors without significant loss of accuracy.
MongoDB Atlas streamlines the workflow by handling the creation, storage, and indexing of compressed vectors, enabling easier scaling and management.
MongoDB is built for change, allowing users to effortlessly scale AI workloads as resource demands evolve.

Organizations are now scaling AI applications from proofs of concept to production systems serving millions of users. This shift creates scalability, latency, and resource challenges for mission-critical applications leveraging recommendation engines, semantic search, and retrieval-augmented generation (RAG) systems. At scale, minor inefficiencies compound and become major bottlenecks, increasing latency, memory usage, and infrastructure costs. This guide explains how vector quantization enables high-performance, cost-effective AI applications at scale.

The challenge: Scaling vector search in production

Let’s start by considering a modern voice assistance platform that combines semantic search with natural language understanding. During development, the system only needs to process a few hundred queries per day, converting speech to text and matching the resulting embeddings against a modest database of responses. The initial implementation is straightforward: each query generates a 32-bit floating-point embedding vector that's matched against a database of similar vectors using cosine similarity. This approach works smoothly in the prototype phase: response times are quick, memory usage is manageable, and the development team can focus on improving accuracy and adding features.
However, as the platform gains traction and scales to processing thousands of queries per second against millions of document embeddings, the simple approach begins to break down. Each incoming query now requires loading massive amounts of high-precision floating-point vectors into memory, computing similarity scores across a dataset that is orders of magnitude larger, and maintaining increasingly complex vector indexes for efficient retrieval. Without proper optimization, memory usage balloons, query latency increases, and infrastructure costs spiral upward. What started as a responsive, efficient prototype has become a production bottleneck that struggles to meet its performance requirements while serving a growing user base.

The key challenges are:

- Loading high-precision 32-bit floating-point vectors into memory
- Computing similarity scores across massive embedding collections
- Maintaining large vector indexes for efficient retrieval

These can lead to critical issues like:

- High memory usage as vector databases struggle to keep float32 embeddings in RAM
- Increased latency as systems process large volumes of high-precision data
- Growing infrastructure costs as organizations scale their vector operations
- Reduced query throughput due to computational overhead

AI workloads with tens or hundreds of millions of high-dimensional vectors (e.g., 80M+ documents at 1536 dimensions) face soaring RAM and CPU requirements. Storing float32 embeddings for these workloads can become prohibitively expensive.

Vector quantization: A path to efficient scaling

The obvious question is: How can you maintain the accuracy of your recommendations, semantic matches, and search queries, while drastically cutting down on compute and memory usage and reducing retrieval latency? Vector quantization is how. It helps you store embeddings more compactly, reduce retrieval times, and keep costs under control.
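For a sense of scale, the 80M-document, 1536-dimension example above can be worked through with simple arithmetic. This sketch counts only the raw vector values; index structures, metadata, and runtime overhead are deliberately ignored, so treat the numbers as a lower bound.

```python
def raw_vector_bytes(num_vectors, dimensions, bits_per_value):
    """Size of the vector values alone, ignoring indexes and metadata."""
    return num_vectors * dimensions * bits_per_value // 8

NUM_VECTORS = 80_000_000
DIMENSIONS = 1536

for label, bits in [("float32", 32), ("int8", 8), ("binary", 1)]:
    gigabytes = raw_vector_bytes(NUM_VECTORS, DIMENSIONS, bits) / 1e9
    print(f"{label}: {gigabytes:.2f} GB")
```

Under these assumptions the float32 values alone take roughly 491.5 GB, compared with about 122.9 GB as int8 and about 15.4 GB as binary.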
Vector quantization offers a powerful solution to scalability, latency, and resource utilization challenges by compressing high-dimensional embeddings into compact representations while preserving their essential characteristics. This technique can dramatically reduce memory requirements and accelerate similarity computations while largely preserving retrieval accuracy.

What is vector quantization?

Vector quantization is a compression technique widely applied in digital signal processing and machine learning. Its core idea is to represent numerical data using fewer bits, reducing storage requirements without entirely sacrificing the data’s informative value. In the context of AI workloads, quantization commonly involves converting embeddings, originally stored as 32-bit floating-point values, into formats like 8-bit integers. By doing so, you can substantially decrease memory and storage consumption while maintaining a level of precision suitable for similarity search tasks.

Quantization is especially suitable for use cases that involve over 1 million vector embeddings, such as RAG applications, semantic search, or recommendation systems that require tight control of operational costs without a compromise on retrieval accuracy. Smaller datasets with fewer than 1 million embeddings might not see significant gains; for them, the overhead of implementing quantization might outweigh its benefits.

Understanding vector quantization

Vector quantization operates by mapping high-dimensional vectors to a discrete set of prototype vectors or converting them to lower-precision formats. There are three main approaches:

- Scalar quantization: Converts individual 32-bit floating-point values to 8-bit integers, reducing memory usage of vector values by 75% while maintaining reasonable precision.
- Product quantization: Splits each vector into sub-vectors and maps each sub-vector to the nearest entry in a learned codebook of representative vectors, offering better compression than scalar quantization at the cost of more complex encoding/decoding.
- Binary quantization: Transforms vectors into binary (0/1) representations, achieving maximum compression but with more significant information loss.

A vector database that applies these compression techniques must effectively manage multiple data structures:

- Hierarchical navigable small world (HNSW) graph for navigable search
- Full-fidelity vectors (32-bit float embeddings)
- Quantized vectors (int8 or binary)

When quantization is defined in the vector index, the system builds quantized vectors and constructs the HNSW graph from these compressed vectors. Both structures are placed in memory for efficient search operations, significantly reducing the RAM footprint compared to storing full-fidelity vectors alone.

The table below illustrates how different quantization mechanisms impact memory usage and disk consumption. This example focuses on HNSW indexes storing 30 GB of original float32 embeddings alongside a 0.1 GB HNSW graph structure. Our RAM usage estimates include a 10% overhead factor (1.1 multiplier) to account for JVM memory requirements with indexes loaded into page cache, reflecting typical production deployment conditions. Actual overhead may vary based on specific configurations.

Here are key attributes to consider based on the table below:

- Estimated RAM usage: Combines HNSW graph size with either full or quantized vectors, plus a small overhead factor (1.1 for index overhead).
- Disk usage: Includes storage for full-fidelity vectors, HNSW graph, and quantized vectors when applicable.
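The scalar and binary approaches described above can be illustrated with a small sketch. This is a simplified, per-vector scheme chosen for readability (min/max range mapping for int8, sign thresholding for binary); it is an assumption for illustration and not necessarily the exact method any particular database uses.

```python
def scalar_quantize(vector):
    """Map each float to an int8 code in [-128, 127] using the vector's min/max range."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = [round((x - lo) / scale) - 128 for x in vector]
    return codes, lo, scale

def scalar_dequantize(codes, lo, scale):
    """Approximate reconstruction of the original floats from int8 codes."""
    return [(c + 128) * scale + lo for c in codes]

def binary_quantize(vector):
    """Keep one bit per dimension: 1 for positive values, 0 otherwise."""
    return [1 if x > 0 else 0 for x in vector]

# Hypothetical 5-dimensional embedding.
embedding = [-1.0, -0.5, 0.0, 0.5, 1.0]

codes, lo, scale = scalar_quantize(embedding)
print(codes)                                # int8 codes, 1 byte per dimension
print(scalar_dequantize(codes, lo, scale))  # close to the original values
print(binary_quantize(embedding))           # 1 bit per dimension
```

The int8 codes can be mapped back to values within half a quantization step of the originals, while the binary form keeps only the sign of each dimension, which is why it compresses more but loses more detail.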
Notice that enabling quantization increases total disk usage, because you still store full-fidelity vectors (for exact nearest neighbor queries in both cases, and for rescoring in the case of binary quantization), but it dramatically decreases RAM requirements and speeds up initial retrieval.

MongoDB Atlas Vector Search offers powerful scaling capabilities through its automatic quantization system. As illustrated in Figure 1 below, MongoDB Atlas supports multiple vector search indexes with varying precision levels: Float32 for maximum accuracy, Scalar Quantized (int8) for balanced performance with 3.75× RAM reduction, and Binary Quantized (1-bit) for maximum speed with 24× RAM reduction. The quantization variety provided by MongoDB Atlas allows users to optimize their vector search workloads based on specific requirements. For collections exceeding 1M vectors, Atlas automatically applies the appropriate quantization mechanism, with binary quantization particularly effective when combined with Float32 rescoring for final refinement.

Figure 1: MongoDB Atlas Vector Search Architecture with Automatic Quantization. Data flow through embedding generation, storage, and tiered vector indexing with binary rescoring.

Binary quantization with rescoring

A particularly effective strategy is to combine binary quantization with a rescoring step using full-fidelity vectors. This approach offers the best of both worlds: extremely fast lookups thanks to binary data formats, plus more precise final rankings from higher-fidelity embeddings.

- Initial retrieval (binary): Embeddings are stored as binary to minimize memory usage and accelerate the approximate nearest neighbor (ANN) search. Hamming distance (via XOR + population count) is used, which is computationally faster than Euclidean or cosine similarity on floats.
- Rescoring: The top candidate results from the binary pass are re-evaluated using their float or int8 vectors to refine the ranking.
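The two passes can be sketched in a few lines of plain Python. This is an illustrative toy, not Atlas internals: it scans every binary code instead of walking an HNSW graph, and the document ids, vectors, and parameter names (`num_candidates`, `limit`) are assumptions made for the example.

```python
import math

def pack_bits(vector):
    """Binary-quantize by sign and pack the bits into a single Python int."""
    bits = 0
    for x in vector:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing bits: XOR followed by a population count."""
    return bin(a ^ b).count("1")

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def search(query, vectors, num_candidates=3, limit=2):
    q_bits = pack_bits(query)
    # In a real system the binary codes are computed once, at index time.
    codes = {doc_id: pack_bits(vec) for doc_id, vec in vectors.items()}
    # Pass 1: cheap Hamming-distance shortlist over the binary codes.
    candidates = sorted(codes, key=lambda d: hamming(q_bits, codes[d]))[:num_candidates]
    # Pass 2: rescore only the shortlist with the full-fidelity vectors.
    reranked = sorted(candidates, key=lambda d: cosine(query, vectors[d]), reverse=True)
    return reranked[:limit]

# Hypothetical full-fidelity embeddings.
vectors = {
    "b": [0.2, 0.1, -0.9, 0.1],
    "a": [0.9, 0.8, -0.1, 0.7],
    "c": [-0.5, 0.6, 0.4, -0.3],
    "d": [-0.7, -0.8, 0.2, -0.9],
}
query = [0.8, 0.9, -0.2, 0.6]

print(search(query, vectors))
```

In this toy data, documents "a" and "b" share the same binary code as the query, so the Hamming pass cannot tell them apart; the rescoring pass then ranks "a" first because its full-precision vector is much closer to the query.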
This step mitigates the loss of detail in binary vectors, balancing result accuracy with the speed of the initial retrieval.

By pairing binary vectors for rapid recall with full-fidelity embeddings for final refinement, you can keep your system highly performant and maintain strong relevance.

The need for quantization-aware models

Not all embedding models perform equally well under quantization. Models need to be specifically trained with quantization in mind to maintain their effectiveness when compressed. Some models, especially those trained purely for high-precision scenarios, suffer significant accuracy drops when their embeddings are represented with fewer bits.

Quantization-aware training (QAT) involves:

- Simulating quantization effects during the training process
- Adjusting model weights to minimize information loss
- Ensuring robust performance across different precision levels

This is particularly important for production applications where maintaining high accuracy is crucial. Embedding models like those from Voyage AI, which recently joined MongoDB, are specifically designed with quantization awareness, making them more suitable for scaled deployments. These models preserve more of their essential feature information even under aggressive compression. Voyage AI provides a suite of embedding models specifically designed with QAT in mind, ensuring minimal loss in semantic quality when shifting to 8-bit integer or even binary representations.

Figure 2: Embedding model performance comparing retrieval quality (NDCG@10) versus storage costs. Voyage AI models (green) maintain superior retrieval quality even with binary quantization (triangles) and int8 compression (squares), achieving up to 100x storage efficiency compared to standard float embeddings (circles).

The graph above shows several important patterns that demonstrate why quantization-aware training (QAT) is crucial for maintaining performance under aggressive compression.
The Voyage AI family of models (shown in green) demonstrates strong performance in retrieval quality even under extreme compression. The voyage-3-large model demonstrates this dramatically: when using int8 precision at 1024 dimensions, it performs nearly identically to its float precision, 2048-dimensional counterpart, showing only a minimal 0.31% quality reduction despite using 8 times less storage. This showcases how models specifically designed with quantization in mind can preserve their semantic understanding even under substantial compression.

Even more impressive is how QAT models maintain their edge over larger, uncompressed models. The voyage-3-large model with int8 precision and 1024 dimensions outperforms OpenAI-v3-large (using float precision and 3072 dimensions) by 9.44% while requiring 12 times less storage. This performance gap highlights that raw model size and dimension count aren't the decisive factors; it's the intelligent design for quantization that matters.

The cost implications become truly striking when we examine binary quantization. Using voyage-3-large with 512-dimensional binary embeddings, we still achieve better retrieval quality than OpenAI-v3-large with its full 3072-dimensional float embeddings while using 200 times less storage. To put this in practical terms: what would have cost $20,000 in monthly storage can be reduced to just $100 while actually improving performance.

In contrast, models not specifically trained for quantization, such as OpenAI's v3-small (shown in gray), show a more dramatic drop in retrieval quality as compression increases. While these models perform well in their full floating-point representation (at 1x storage cost), their effectiveness deteriorates more sharply when quantized, especially with binary quantization.
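The storage ratios quoted above follow from counting bits per embedding (dimensions times bits per value). The short calculation below reproduces them, assuming float means 32 bits per value and ignoring any per-vector metadata.

```python
def bits_per_embedding(dimensions, bits_per_value):
    """Raw storage for one embedding, in bits."""
    return dimensions * bits_per_value

float_2048 = bits_per_embedding(2048, 32)   # float, 2048 dimensions
float_3072 = bits_per_embedding(3072, 32)   # float, 3072 dimensions
int8_1024 = bits_per_embedding(1024, 8)     # int8, 1024 dimensions
binary_512 = bits_per_embedding(512, 1)     # binary, 512 dimensions

print(float_2048 / int8_1024)    # 8.0
print(float_3072 / int8_1024)    # 12.0
print(float_3072 / binary_512)   # 192.0
print(20_000 / 200)              # 100.0
```

By raw bit count the binary case works out to 192 times smaller, which the text rounds to 200; at that rounded ratio, a $20,000 monthly storage bill becomes $100.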
For production applications where both accuracy and efficiency are crucial, choosing a model that has undergone quantization-aware training can make the difference between a system that degrades under compression and one that maintains its effectiveness while dramatically reducing resource requirements. Read more on the Voyage AI blog.

Impact: Memory, retrieval latency, and cost

Vector quantization addresses the three core challenges of large-scale AI workloads (memory, retrieval latency, and cost) by compressing full-precision embeddings into more compact representations. Below is a breakdown of how quantization drives efficiency in each area.

Figure 3: Quantization Performance Metrics: Memory Savings with Minimal Accuracy Trade-offs. Comparison of scalar vs. binary quantization showing RAM reduction (75%/96%), query accuracy retention (99%/95%), and performance gains (>100%) for vector search operations.

Memory and storage optimization

Quantization techniques dramatically reduce compute resource requirements while maintaining search accuracy for vector embeddings at scale.

Lower RAM footprint

- Storage in RAM is often the primary bottleneck for vector search systems.
- Embeddings stored as 8-bit integers or binary reduce overall memory usage, allowing significantly more vectors to remain in memory.
- This compression directly shrinks vector indexes (e.g., HNSW), leading to faster lookups and fewer disk I/O operations.

Reduced disk usage in collection with binData

- binData (binary) formats can cut raw storage needs by up to 66%.
- Some disk overhead may remain when storing both quantized and original vectors, but the performance benefits justify this tradeoff.

Practical gains

- 3.75× reduction in RAM usage with scalar (int8) quantization.
- Up to 24× reduction with binary quantization, especially when combined with rescoring to preserve accuracy.
- Significantly more efficient vector indexes, enabling large-scale deployments without prohibitive hardware upgrades.
Retrieval latency

Quantization methods leverage CPU cache optimizations and efficient distance calculations to accelerate vector search operations beyond what's possible with standard float32 embeddings.

Faster similarity computations

- Smaller data types are more CPU-cache-friendly, which speeds up distance calculations.
- Binary quantization uses Hamming distance (XOR + popcount), yielding dramatically faster top-k candidate retrieval.

Improved throughput

- With reduced memory overhead, the system can handle more concurrent queries at lower latencies.
- In internal benchmarks, query performance for large-scale retrievals improved by up to 80% when adopting quantized vectors.

Cost efficiency

Vector quantization provides substantial infrastructure savings by reducing memory and computation requirements while maintaining retrieval quality through compression and rescoring techniques.

Lower infrastructure costs

- Smaller vectors consume fewer hardware resources, enabling deployments on less expensive instances or tiers.
- Reduced CPU/GPU time per query allows resource reallocation to other critical parts of the application.

Better scalability

- As data volumes grow, memory and compute requirements don’t escalate as sharply.
- Quantization-aware training (QAT) models, such as those from Voyage AI, help maintain accuracy while reaping cost savings at scale.

By compressing vectors into int8 or binary formats, you tackle memory constraints, accelerate lookups, and curb infrastructure expenses, making vector quantization an indispensable strategy for high-volume AI applications.

MongoDB Atlas: Built for Changing Workloads with Automatic Vector Quantization

The good news for developers is that MongoDB Atlas supports automatic scalar quantization and automatic binary quantization in index definitions, reducing the need for external scripts or manual data preprocessing.
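As a rough illustration of what enabling quantization in an index definition can look like, the sketch below builds an Atlas Vector Search index definition as a plain Python dictionary. The field path `embedding`, the dimension count, and the helper name `vector_index_definition` are placeholders chosen for this example; check the current Atlas documentation for the options your cluster version supports.

```python
import json

ALLOWED_QUANTIZATION = {"none", "scalar", "binary"}

def vector_index_definition(path, num_dimensions, similarity="cosine", quantization="scalar"):
    """Build a vector search index definition with automatic quantization enabled."""
    if quantization not in ALLOWED_QUANTIZATION:
        raise ValueError(f"quantization must be one of {sorted(ALLOWED_QUANTIZATION)}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,                     # document field holding the embedding
                "numDimensions": num_dimensions,  # must match the embedding model's output
                "similarity": similarity,
                "quantization": quantization,
            }
        ]
    }

# Hypothetical field name and dimension count.
definition = vector_index_definition("embedding", 1024, quantization="binary")
print(json.dumps(definition, indent=2))

# With a driver such as PyMongo, a definition like this is typically passed to the
# collection's search-index creation call with the index type set to "vectorSearch".
```

Switching between full-fidelity, scalar, and binary indexes then comes down to changing one value in the definition rather than preprocessing the stored vectors yourself.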
By quantizing at index build time and query time, organizations can run large-scale vector workloads on smaller, more cost-effective clusters.

A common question developers ask is when to use quantization. Quantization becomes most valuable once you reach substantial data volumes, on the order of a million or more embeddings. At this scale, memory and compute demands can skyrocket, making reduced memory footprints and faster retrieval speeds essential. Examples of cases that call for quantization include:

- High-volume scenarios: Datasets with millions of vector embeddings where you must tightly control memory and disk usage.
- Real-time responses: Systems needing low-latency queries under high user concurrency.
- High query throughput: Environments with numerous concurrent requests demanding both speed and cost-efficiency.

For smaller datasets (under 1 million vectors), the added complexity of quantization may not justify the benefits. However, for large-scale deployments, it becomes a critical optimization that can dramatically improve both performance and cost-effectiveness.

Now that we have established a strong foundation on the advantages of quantization, specifically the benefits of binary quantization with rescoring, feel free to refer to the MongoDB documentation to learn more about implementing vector quantization. You can also learn more about Voyage AI’s state-of-the-art embedding models on our product page.

February 27, 2025

Artificial Intelligence

Redefining the Database for AI: Why MongoDB Acquired Voyage AI

AI is reshaping industries, redefining customer experiences, and transforming how businesses innovate, operate, and compete. While much of the focus is on frontier models, a fundamental challenge lies in data—how it is stored, retrieved, and made useful for AI applications. The democratization of AI-powered software depends on building on top of the right abstractions, yet today, creating useful, real-time AI applications at scale is not feasible for most organizations. The challenge isn’t just complexity—it’s trust. AI models are probabilistic, meaning their outputs aren’t deterministic and predictable. This is easily evident in the hallucination problem in chatbots today, and becomes even more critical with the rise of agents, where AI systems make autonomous decisions. Development teams need the ability to control, shape, and ground generated outputs to align with their objectives and ensure accuracy. AI-powered search and retrieval is a powerful tool that extracts relevant contextual data from specific sources, augmenting AI models to generate reliable and accurate responses or take responsible and safe actions, as seen in the prominent retrieval augmented generation (RAG) approach. At the core of AI-powered retrieval are embedding generation and reranking—two key AI components that capture the semantic meaning of data and assess the relevance of queries and results. We believe embedding generation and reranking, as well as AI-powered search, belong in the database layer, simplifying the stack and creating a more reliable foundation for AI applications. By bringing more intelligence into the database, we help businesses mitigate hallucinations, improve trustworthiness, and unlock AI’s full potential at scale. The most impactful applications require a flexible, intelligent, and scalable data foundation. 
That’s why we’re excited to announce the acquisition of Voyage AI, a leader in embedding and reranking models that dramatically improve accuracy through AI-powered search and retrieval. This move isn’t just about adding AI capabilities; it’s about redefining the database for the AI era.

Why this matters: The future of AI is built on better relevance and accuracy in data

AI is probabilistic; it’s not built like traditional software with pre-defined rules and logic. Instead, it generates responses or takes action based on how the AI model is trained and what data is retrieved. However, due to the probabilistic nature of the technology, AI can hallucinate. Hallucinations are often a direct consequence of poor or imprecise retrieval: when AI lacks access to the right data, it generates plausible but incorrect information. This is a critical barrier to AI adoption, especially in enterprises and for mission-critical use cases where accuracy is non-negotiable. This makes retrieving the most relevant data essential for AI applications to deliver high-quality, contextually accurate results.

Today, developers rely on a patchwork of separate components to build AI-powered applications. Sub-optimal choices of these components, such as embedding models, can yield low-relevancy data retrieval and low-quality generated outputs. This fragmented approach is complex, costly, inefficient, and cumbersome for developers.

With Voyage AI, MongoDB solves this challenge by making AI-powered search and retrieval native to the database. Instead of implementing workarounds or managing separate systems, developers can generate high-quality embeddings from real-time operational data, store vectors, perform semantic search, and refine results, all within MongoDB. This eliminates complexity and delivers higher accuracy, lower latency, and a streamlined developer experience.
What Voyage AI brings to MongoDB

Voyage AI has built a world-class AI research team with roots at Stanford, MIT, UC Berkeley, and Princeton and has rapidly become a leader in high-precision AI retrieval. Their technology is already trusted by some of the most advanced AI startups, including Anthropic, LangChain, Harvey, and Replit. Notably, Voyage AI’s embedding models are the highest-rated zero-shot models in the Hugging Face community.

Voyage AI’s models are designed to increase the quality of generated output by:

- Enhancing vector search by creating embeddings that better capture meaning across text, images, PDFs, and structured data.
- Improving retrieval accuracy through advanced reranking models that refine search results for AI-powered applications.
- Enabling domain-specific AI with fine-tuned models optimized for different industries such as financial services, healthcare, and law, and use cases such as code generation.

By integrating Voyage AI’s retrieval capabilities into MongoDB, we’re helping organizations more easily build AI applications with greater accuracy and reliability, without unnecessary complexity.

How Voyage AI will be integrated into MongoDB

We are integrating Voyage AI with MongoDB in three phases.

In the first phase, Voyage AI’s text embedding, multi-modal embedding, and reranking models will remain widely available through Voyage AI’s current APIs and via the AWS and Azure Marketplaces, ensuring developers can continue to use their best-in-class embedding and reranking capabilities. We will also invest in the scalability and enterprise readiness of the platform to support the increased adoption of Voyage AI’s models.

Next, we will seamlessly embed Voyage AI’s capabilities into MongoDB Atlas, starting with an auto-embedding service for Vector Search, which will handle embedding generation automatically. Native reranking will follow, allowing developers to boost retrieval accuracy instantly.
We also plan to expand domain-specific AI capabilities to better support different industries (e.g., financial services, legal, etc.) or use cases (e.g., code generation).

Finally, we will advance AI-powered retrieval with enhanced multi-modal capabilities, enabling seamless retrieval and ranking of text, images, and video. We also plan to introduce instruction-tuned models, allowing developers to refine search behavior using simple prompts instead of complex fine-tuning. This will be complemented by embedding lifecycle management in MongoDB Atlas, ensuring continuous updates and real-time optimization for AI applications.

What this means for developers and businesses

AI-powered applications need more than a database that just stores, processes, and persists data; they need a database that actively improves retrieval accuracy, scales seamlessly, and eliminates operational friction. With Voyage AI, MongoDB redefines what’s required for a database to underpin mission-critical AI-powered applications.

Developers will no longer need to manage external embedding APIs, standalone vector stores, or complex search pipelines. AI retrieval will be built into the database itself, making semantic search, vector retrieval, and ranking as seamless as traditional queries.

For businesses, this translates to faster time-to-value and greater confidence in scaling AI applications. By delivering high-quality results at scale, enterprises can seamlessly integrate AI into their most critical use cases, ensuring reliability, performance, and real-world impact.

Looking ahead: What comes next

This is just the beginning. Our vision is to make MongoDB the most powerful and intuitive database for modern, AI-driven applications. Voyage AI’s models will soon be natively available in MongoDB Atlas. We will continue evolving MongoDB’s AI retrieval capabilities, making it smarter, more adaptable, and capable of handling a wider range of data types and use cases.
Stay tuned for more details on how you can start using Voyage AI’s capabilities in MongoDB. To learn more about how MongoDB and Voyage AI are powering state-of-the-art AI search and retrieval for building, scaling, and deploying intelligent applications, visit our product page.

February 24, 2025

Artificial Intelligence

Multi-Agent Collaboration for Manufacturing Operations Optimization

While there are some naysayers across the media landscape who doubt the potential impact of AI innovations, for those of us immersed in implementing AI on a daily basis, there’s wide agreement that its potential is huge and world-altering. It’s now generally accepted that Large Language Models (LLMs) will eventually be able to perform tasks as well as, if not better than, a human. And the size of the potential AI market is truly staggering. Bain’s AI analysis estimates that the total addressable market (TAM) for AI and gen AI-related hardware and software will grow between 40% and 55% annually, reaching between $780 billion and $990 billion by 2027.

This growth is especially relevant to industries like manufacturing, where generative AI can be applied across the value chain. From inventory categorization to product risk assessments, knowledge management, and predictive maintenance strategy generation, AI's potential to optimize manufacturing operations cannot be overstated.

But in order to realize the transformative economic potential of AI, applications powered by LLMs need to evolve beyond chatbots that leverage retrieval-augmented generation (RAG). Truly transformative AI-powered applications need to be objective-driven, not just responding to user queries but also taking action on behalf of the user. This is crucial in complex manufacturing processes. In other words, they need to act like agents.

Agentic systems, or compound AI systems, are currently emerging as the next frontier of generative AI applications. These systems consist of one or more AI agents that collaborate with each other and use tools to provide value. An AI agent is a computational entity containing short- and long-term memory, which enables it to provide context to an LLM. It also has access to tools, such as web search and function calling, that enable it to act upon the response from an LLM or provide additional information to the LLM.

Figure 1. Basic components of an agentic system.
An agentic system can have more than one AI agent. In most cases, AI agents may be required to interact with other agents within the same system or with external systems. They’re also expected to engage with humans for feedback or review of outputs from execution steps. AI agents can also comprehend the context of outputs from other agents and humans, and change their course of action and next steps. For example, agents can monitor and optimize various facets of manufacturing operations simultaneously, such as supply chain logistics and production line efficiency.

A multi-agent collaboration system has several benefits over a single agent. You can have each agent customized to do one thing and do it well. For example, one agent can create meeting minutes while another agent writes follow-up emails. The same pattern can be applied to predictive maintenance, with one agent analyzing machine data to find mechanical issues before they occur while another optimizes resource allocation, ensuring materials and labor are utilized efficiently. You can also provision dedicated resources and tools for different agents. For example, one agent uses a model to analyze and transcribe videos while the other uses models for natural language processing (NLP) and answering questions about the video.

Figure 2. Multi-agent collaboration system.

MongoDB can act as the memory provider for an agentic system. Conversation history alongside vector embeddings can be stored in MongoDB leveraging the flexible document model. Atlas Vector Search can be used to run semantic search on stored vector embeddings, and our sharding capabilities allow for horizontal scaling without compromising on performance. Our clients across industries have been leveraging MongoDB Atlas for their generative AI use cases, including agentic AI use cases such as Questflow, which is transforming work by using multi-agent AI to handle repetitive tasks in strategic roles.
Supported by MiraclePlus and MongoDB Atlas, it enables startups to automate workflows efficiently. As it expands to larger enterprises, it aims to boost AI collaboration and streamline task automation, paving the way for seamless human-AI integration.

The concept of a multi-agent collaboration system is new, and it can be challenging for manufacturing organizations to identify the right use case to apply this cutting-edge technology. Below, we propose a use case where three agents collaborate with each other to optimize the performance of a machine.

Multi-agent collaboration use case in manufacturing

In manufacturing operations, leveraging multi-agent collaboration for predictive maintenance can significantly boost operational efficiency. For instance, consider a production environment where three distinct agents (predictive maintenance, process optimization, and quality assurance) collaborate in real time to refine machine operations and maintain the factory at peak performance.

In Figure 3, the predictive maintenance agent is focused on machinery maintenance. Its main task is to monitor equipment health by analyzing sensor data generated from the machines. It predicts machine failures and recommends maintenance actions to extend machinery lifespan and prevent downtime as much as possible.

Figure 3. A multi-agent system for production optimization.

The process optimization agent is designed to enhance production efficiency. It analyzes production parameters (speed, vibration, etc.) to identify inefficiencies and bottlenecks, and it adjusts those parameters to maintain product quality and production efficiency. This agent also incorporates feedback from the other two agents while making decisions on what production parameter to tune.
For instance, if the predictive maintenance agent flags an anomaly in a milling machine's temperature sensor readings, such as rising temperature values, the process optimization agent can review the cutting speed parameter for adjustment.

The quality assurance agent is responsible for evaluating product quality. It analyzes optimized production parameters and checks how those parameters can affect the quality of the product being fabricated. It also provides feedback for the other two agents.

The three agents constantly exchange feedback with each other, and this feedback is also stored in the MongoDB Atlas database as agent short-term memory. In contrast, vector embeddings and sensor data are persisted as long-term memory. MongoDB is an ideal memory provider for agentic AI use case development thanks to its flexible document model, extensive security and data governance features, and horizontal scalability.

All three agents have access to a "search_documents" tool, which leverages Atlas Vector Search to query vector embeddings of machine repair manuals and old maintenance work orders. The predictive maintenance agent leverages this tool to gather additional insights while performing machine root cause diagnostics.

Set up the use case shown in this article using our repo. To learn more about MongoDB’s role in the manufacturing industry, please visit our manufacturing and automotive webpage. To learn more about AI agents, visit our Demystifying AI Agents guide.
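As an illustration of how a tool like "search_documents" might query Atlas Vector Search, the sketch below builds a `$vectorSearch` aggregation pipeline as plain Python data. It is not taken from the repo: the index name `vector_index`, the field names `embedding` and `text`, and the function name are hypothetical placeholders, and the query vector would come from whichever embedding model the agents use.

```python
def search_documents_pipeline(query_vector, limit=3, num_candidates=50):
    """Build an aggregation pipeline that retrieves the most similar manual or work-order chunks."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",          # hypothetical index name
                "path": "embedding",              # hypothetical field holding the vectors
                "queryVector": query_vector,
                "numCandidates": num_candidates,  # candidates considered before the final cut
                "limit": limit,                   # results returned to the agent
            }
        },
        {
            # Return only the text and the similarity score to keep the agent's context small.
            "$project": {"_id": 0, "text": 1, "score": {"$meta": "vectorSearchScore"}}
        },
    ]

# Hypothetical 4-dimensional query embedding for demonstration.
pipeline = search_documents_pipeline([0.12, -0.03, 0.88, 0.41])
print(pipeline[0]["$vectorSearch"]["limit"])
```

An agent would pass a pipeline like this to the collection's aggregate call and feed the returned text snippets to the LLM as context for its diagnosis.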

February 19, 2025

Artificial Intelligence

Smarter Care: MongoDB & Microsoft

Healthcare is on the cusp of a revolution powered by data and AI. Microsoft, with innovations like Azure OpenAI, Microsoft Fabric, and Power BI, has become a leading force in this transformation. MongoDB Atlas complements these advancements with a flexible and scalable platform for unifying operational, metadata, and AI data, enabling seamless integration into healthcare workflows. By combining these technologies, healthcare providers can enhance diagnostics, streamline operations, and deliver exceptional patient care. In this blog post, we explore how MongoDB and Microsoft AI technologies converge to create cutting-edge healthcare solutions through our “Leafy Hospital” demo—a showcase of possibilities in breast cancer diagnosis. The healthcare data challenge The healthcare industry faces unique challenges in managing and utilizing massive datasets. From mammograms and biopsy images to patient histories and medical literature, making sense of this data is often time-intensive and error-prone. Radiologists, for instance, must analyze vast amounts of information to deliver accurate diagnoses, while ensuring sensitive patient data is handled securely. MongoDB Atlas addresses these challenges by providing a unified view of disparate data sources, offering scalability, flexibility, and advanced features like Search and Vector search. When paired with Microsoft AI technologies, the potential to revolutionize healthcare workflows becomes limitless. The leafy hospital solution: A unified ecosystem Our example integrated solution, Leafy Hospital, showcases the transformative potential of MongoDB Atlas and Microsoft AI capabilities in healthcare. Focused on breast cancer diagnostics, this demo explores how the integration of MongoDB’s flexible data platform with Microsoft’s cutting-edge features—such as Azure OpenAI, Microsoft Fabric, and Power BI—can revolutionize patient care and streamline healthcare workflows. 
The solution takes a three-pronged approach to improve breast cancer diagnosis and patient care: predictive AI for early detection, generative AI for workflow automation, and advanced BI and analytics for actionable insights.

Figure 1. Leafy Hospital solution architecture.

If you’re interested in discovering how this solution could be applied to your organization’s unique needs, we invite you to connect with your MongoDB account representative. We’d be delighted to provide a personalized demonstration of the Leafy Hospital solution and collaborate on tailoring it for your specific use case.

Key capabilities

Predictive AI for early detection

Accurate diagnosis is critical in breast cancer care. Traditional methods rely heavily on radiologists manually analyzing mammograms and biopsies, which increases the risk of errors. Predictive AI transforms this process by automating data analysis and improving accuracy.

BI-RADS prediction

BI-RADS (Breast Imaging-Reporting and Data System) is a standardized classification for mammogram findings, ranging from 0 (incomplete) to 6 (known, biopsy-proven malignancy). To predict BI-RADS scores, deep learning models like VGG16 and EfficientNetV2L are trained on a dataset of mammogram images. Fabric Data Science simplifies training and experimentation by enabling direct data uploads to OneLake for model training, easy comparison of multiple ML experiments and metrics, and auto-logging of parameters with MLflow for lifecycle management. The models are trained for as many epochs as needed to reach reliable accuracy, giving radiologists dependable predictions.

Biopsy classification

For biopsy analysis, classification models such as random forest classifiers are trained on biopsy features like cell size, shape uniformity, and mitosis counts. Such models attain high accuracy when trained on scalar data, making them highly effective for classifying cancers as malignant or benign.
Data ingestion, training, and prediction cycles are managed using Fabric Data Science and the MongoDB Spark Connector, ensuring a seamless flow of metadata and results between Azure and MongoDB Atlas.

Generative AI for workflow automation

Radiologists often spend hours documenting findings, time that could be better spent analyzing cases. Generative AI streamlines this process by automating report generation and enabling intelligent chatbot interactions.

Vector search: The foundation of semantic understanding

At the heart of these innovations lies MongoDB Atlas Vector Search, which changes how medical data is stored, accessed, and analyzed. By leveraging Azure OpenAI’s embedding models, clinical notes and other unstructured data are transformed into vector embeddings: mathematical representations that capture the meaning of the text in a high-dimensional space. Similarity search is a key use case, enabling radiologists to query the system with natural language prompts like “Show me cases where additional tests were recommended.” The system interprets the intent behind the question, retrieves relevant documents, and delivers precise, context-aware results. This ensures that radiologists can quickly access information without sifting through irrelevant data. Beyond similarity search, vector search facilitates the development of RAG architectures, which combine semantic understanding with external contextual data. This architecture allows for the creation of advanced features like automated report generation and intelligent chatbots, which further streamline decision-making and enhance productivity.
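To give a feel for what "similarity" means here, the sketch below ranks a few clinical notes against a query by cosine similarity. It is a toy illustration only: the three-dimensional vectors and note texts are made up, real embeddings from an embedding model have hundreds or thousands of dimensions, and Atlas Vector Search uses an index rather than the brute-force loop shown here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_notes(query_vector, notes):
    """Sort (text, vector) pairs by similarity to the query, best match first."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in notes]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Made-up embeddings for illustration; real ones come from an embedding model.
notes = [
    ("Additional ultrasound recommended", [0.9, 0.1, 0.0]),
    ("Routine screening, no findings", [0.0, 0.2, 0.9]),
]
query = [1.0, 0.0, 0.0]  # stands in for the embedded natural language prompt
ranked = rank_notes(query, notes)
```

In this toy data the first note points in nearly the same direction as the query, so it ranks first.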
Automated report generation

Once a mammogram or biopsy is analyzed, Azure OpenAI’s large language models can be used to generate detailed clinical notes, including:

Findings: Key observations from the analysis
Conclusions: Diagnoses and suggested next steps
Standardized codes: Using SNOMED terms for consistency

This automation enhances productivity by allowing radiologists to focus on verification rather than manual documentation.

Chatbots with retrieval-augmented generation

Chatbots offer another way to support radiologists who need quick access to historical patient data or medical research. Traditional methods can be inefficient, particularly when dealing with older records or specialized cases. Our retrieval-augmented generation-based chatbot, powered by Azure OpenAI, Semantic Kernel, and MongoDB Atlas, provides:

Patient-specific insights: Querying MongoDB for 10 years of patient history, summarized and provided as context to the chatbot
Medical literature searches: Using vector search to retrieve relevant documents from indexed journals and studies
Secure responses: Ensuring all answers are grounded in validated patient data and research

The chatbot improves decision-making and enhances the user experience by delivering accurate, context-aware responses in real time.

Advanced BI and analytics for actionable insights

In healthcare, data is only as valuable as the insights it provides. MongoDB Atlas bridges real-time transactional analytics and long-term data analysis, empowering healthcare providers with tools for informed decision-making at every stage.

Transactional analytics

Transactional, or in-app, analytics deliver insights directly within applications. For example, MongoDB Atlas enables radiologists to instantly access historical BI-RADS scores and correlate them with new findings, streamlining the diagnostic process. This ensures decisions are based on accurate, real-time data.
Advanced clinical decision support (CDS) systems benefit from integrating predictive analytics into workflows. For instance, biopsy results stored in MongoDB are enriched with machine learning predictions generated in Microsoft Fabric, helping radiologists make faster, more precise decisions.

Long-term analytics

While transactional analytics focus on operational efficiency, long-term analytics enable healthcare providers to step back and evaluate broader trends. MongoDB Atlas, integrated with Microsoft Power BI and Fabric, facilitates this critical analysis of historical data. For instance, patient cohort studies become more insightful when powered by a unified dataset that combines MongoDB Atlas’ operational data with historical trends stored in Microsoft OneLake. Long-term analytics also shine in operational efficiency assessments. By integrating MongoDB Atlas data with Power BI, hospitals can create dashboards that track key performance indicators such as average time to diagnosis, wait times for imaging, and treatment start times. These insights help identify bottlenecks, streamline processes, and ultimately improve the patient experience. Furthermore, historical data stored in OneLake can be combined with MongoDB’s real-time data to train machine learning models, enhancing future predictive analytics.

OLTP vs OLAP

This unified approach is exemplified by the distinction between OLTP and OLAP workloads. On the OLTP side, MongoDB Atlas handles real-time data processing, supporting immediate tasks like alerting radiologists to anomalies. On the OLAP side, data stored in Microsoft OneLake supports long-term analysis, enabling hospitals to identify trends, evaluate efficiency, and train advanced AI models. This dual capability allows healthcare providers to “run the business” through operational insights and “analyze the business” by uncovering long-term patterns.

Figure 2. Real-time analytics data pipeline.

MongoDB’s Atlas SQL Connector plays a crucial role in bridging these two worlds. By converting MongoDB’s flexible document model into a relational format, it allows tools like Power BI to work seamlessly with MongoDB data.

Next steps

For a detailed, technical exploration of the architecture, including ML notebooks, chatbot implementation code, and dataset resources, visit our Solution Library entry, Building Advanced Healthcare Solutions with MongoDB and Microsoft. Whether you’re a developer, data scientist, or healthcare professional, you’ll find valuable insights to replicate and expand upon this solution! To learn more about how MongoDB can power healthcare solutions, visit our solutions page. Check out our Atlas Vector Search Quick Start guide to get started with MongoDB Atlas Vector Search today.

February 18, 2025

Artificial Intelligence

Supercharge AI Data Management With Knowledge Graphs

WhyHow.AI has built and open-sourced a platform using MongoDB, enhancing how organizations leverage knowledge graphs for data management and insights. Integrated with MongoDB, this solution offers a scalable foundation with features like vector search and aggregation to support organizations in their AI journey. Knowledge graphs address the limitations of traditional retrieval-augmented generation (RAG) systems, which can struggle to capture intricate relationships and contextual nuances in enterprise data. By embedding rules and relationships into a graph structure, knowledge graphs enable accurate and deterministic retrieval processes. This functionality extends beyond information retrieval: knowledge graphs also serve as foundational elements for enterprise memory, helping organizations maintain structured datasets that support future model training and insights. WhyHow.AI enhances this process by offering tools designed to combine large language model (LLM) workflows with Python- and JSON-native graph management. Using MongoDB’s robust capabilities, these tools help combine structured and unstructured data and search capabilities, enabling efficient querying and insights across diverse datasets. MongoDB’s modular architecture seamlessly integrates vector retrieval, full-text search, and graph structures, making it an ideal platform for RAG and unlocking the full potential of contextual data. Check out our AI Learning Hub to learn more about building AI-powered apps with MongoDB. Creating and storing knowledge graphs with WhyHow.AI and MongoDB Creating effective knowledge graphs for RAG requires a structured approach that combines workflows from LLMs, developers, and nontechnical domain experts. Simply capturing all entities and relationships from text and relying on an LLM to organize the data can lead to a messy retrieval process that lacks utility. 
Instead, WhyHow.AI advocates for a schema-constrained graph creation method, emphasizing the importance of developing a context-specific schema tailored to the user’s use case. This approach ensures that the knowledge graphs focus on the specific relationships that matter most to the user’s workflow. Once the knowledge graphs are created, the flexibility of MongoDB’s schema design ensures that users are not confined to rigid structures. This adaptability enables seamless expansion and evolution of knowledge graphs as data and use cases develop. Organizations can rapidly iterate during early application development without being restricted by predefined schemas. In instances where additional structure is required, MongoDB supports schema enforcement, offering a balance between flexibility and data integrity. For instance, aligning external research with patient records is crucial to delivering personalized healthcare. Knowledge graphs bridge the gap between clinical trials, best practices, and individual patient histories. New clinical guidelines can be integrated with patient records to identify which patients would benefit most from updated treatments, ensuring that the latest practices are applied to individual care plans. Optimizing knowledge graph storage and retrieval with MongoDB Harnessing the full potential of knowledge graphs requires both effective creation tools and robust systems for storage and retrieval. Here’s how WhyHow.AI and MongoDB work together to optimize the management of knowledge graphs. Storing data in MongoDB WhyHow.AI relies on MongoDB’s document-oriented structure to organize knowledge graph data into modular, purpose-specific collections, enabling efficient and flexible queries. This approach is crucial for managing complex entity relationships and ensuring accurate provenance tracking. 
To support this functionality, the WhyHow.AI Knowledge Graph Studio comprises several key components: Workspaces separate documents, schemas, graphs, and associated data by project or domain, maintaining clarity and focus. Chunks are raw text segments with embeddings for similarity searches, linked to triples and documents to provide evidence and provenance. Graph collection stores the knowledge graph along with metadata and schema associations, all organized by workspace for centralized data management. Schemas define the entities, relationships, and patterns within graphs, adapting dynamically to reflect new data and keep the graph relevant. Nodes represent entities like people, locations, or concepts, each with unique identifiers and properties, forming the graph’s foundation. Triples define subject-predicate-object relationships and store embedded vectors for similarity searches, enabling reliable retrieval of relevant facts. Queries log user queries, including triple results and metadata, providing an immutable history for analysis and optimization. Figure 1. WhyHow.AI platform and knowledge graph illustration. To enhance data interoperability, MongoDB’s aggregation framework enables efficient linking across collections. For instance, retrieving chunks associated with a specific triple can be seamlessly achieved through an aggregation pipeline, connecting workspaces, graphs, chunks, and document collections into a cohesive data flow. Querying knowledge graphs With the representation established, users can perform both structured and unstructured queries with the WhyHow.AI querying system. Structured queries enable the selection of specific entity types and relationships, while unstructured queries enable natural language questions to return related nodes, triples, and linked vector chunks. WhyHow.AI’s query engine embeds triples to enhance retrieval accuracy, bypassing traditional Text2Cypher methods. 
Through a retrieval engine that embeds triples and enables users to retrieve embedded triples with chunks tied to them, WhyHow.AI uses the best of both structured and unstructured data structures and retrieval patterns. And, with MongoDB’s built-in vector search, users can store and query vectorized text chunks alongside their graph and application data in a single, unified location. Enabling scalability, portability, and aggregations MongoDB’s horizontal scalability ensures that knowledge graphs can grow effortlessly alongside expanding datasets. Users can also easily utilize WhyHow.AI's platform to create modular multiagent and multigraph workflows. They can deploy MongoDB Atlas on their preferred cloud provider or maintain control by running it in their own environments, gaining flexibility and reliability. As graph complexity increases, MongoDB’s aggregation framework facilitates diverse queries, extracting meaningful insights from multiple datasets with ease. Providing familiarity and ease of use MongoDB’s familiarity enables developers to apply their existing expertise without the need to learn new technologies or workflows. With WhyHow.AI and MongoDB, developers can build graphs with JSON data and Python-native APIs, which are perfect for LLM-driven workflows. The same database trusted for years in application development can now manage knowledge graphs, streamlining onboarding and accelerating development timelines. Taking the next steps WhyHow.AI’s knowledge graphs overcome the limitations of traditional RAG systems by structuring data into meaningful entities, relationships, and contexts. This enhances retrieval accuracy and decision-making in complex fields. Integrated with MongoDB, these capabilities are amplified through a flexible, scalable foundation featuring modular architecture, vector search, and powerful aggregation. 
Together, WhyHow.AI and MongoDB help organizations unlock their data’s potential, driving insights and enabling innovative knowledge management solutions. No matter where you are in your AI journey, MongoDB can help! You can get started with your AI-powered apps by registering for MongoDB Atlas and exploring the tutorials available in our AI Learning Hub. Otherwise, head over to our quick-start guide to get started with MongoDB Atlas Vector Search today. Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO Rick Houlihan. If your company is interested in being featured in a story like this, we’d love to hear from you. Reach out to us at ai_adopters@mongodb.com.
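As a rough illustration of the aggregation-based linking described earlier in this post (retrieving the chunks associated with a specific triple), the sketch below builds a pipeline that starts from a triple, joins its chunks, and then joins each chunk's source document. The collection names (`chunk`, `document`) and field names (`graph`, `chunks`, `evidence.document`) are assumptions for illustration and are not taken from WhyHow.AI's actual schema. Only the `$match`, `$lookup`, `$unwind`, and `$project` stages are standard MongoDB.

```python
from __future__ import annotations

from typing import Any


def build_triple_evidence_pipeline(triple_id: Any, graph_id: Any) -> list[dict[str, Any]]:
    """Build a pipeline that links one triple to its chunks and source documents."""
    return [
        # Select the triple of interest within a given graph (hypothetical fields).
        {"$match": {"_id": triple_id, "graph": graph_id}},
        # Join the chunks referenced by the triple's `chunks` array of ids.
        {
            "$lookup": {
                "from": "chunk",
                "localField": "chunks",
                "foreignField": "_id",
                "as": "evidence",
            }
        },
        # Emit one result per supporting chunk.
        {"$unwind": "$evidence"},
        # Join the document each chunk was extracted from, for provenance.
        {
            "$lookup": {
                "from": "document",
                "localField": "evidence.document",
                "foreignField": "_id",
                "as": "source_document",
            }
        },
        {
            "$project": {
                "evidence.content": 1,
                "source_document.metadata": 1,
            }
        },
    ]
```

Running the pipeline against the triples collection would return one result per supporting chunk, each carrying the chunk text and its originating document.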

February 13, 2025

Artificial Intelligence

Building Gen AI with MongoDB & AI Partners | January 2025

Even for those of us who work in technology, it can be hard to keep track of the awards companies give and receive throughout the year. For example, in the past few months MongoDB has announced both our own awards (such as the William Zola Award for Community Excellence) and awards the company has received, like the AWS Technology Partner of the Year NAMER and two awards from RepVue. And that’s just us! It can be a lot! But as hard as they can be to follow, industry awards (and the recognition, thanks, and collaboration they represent) are important. They highlight the power and importance of working together and show how companies like MongoDB and partners are committed to building best-in-class solutions for customers. So without further ado, I’m pleased to announce that MongoDB has been named Technology Partner of the Year in Confluent’s 2025 Global Partner Awards! As a member of the MongoDB AI Applications Program (MAAP) ecosystem, Confluent enables businesses to build a trusted, real-time data foundation for generative AI applications through seamless integration with MongoDB and Atlas Vector Search. Above all, this award is a testament to MongoDB and Confluent’s shared vision: to help enterprises unlock the full potential of real-time data and AI. Here’s to what’s next!

Welcoming new AI and tech partners

It's been an action-packed start to the year: in January 2025, we welcomed six new AI and tech partners that offer product integrations with MongoDB. Read on to learn more about each great new partner!

Base64

Base64 is an all-in-one solution to bring AI into document-based workflows, enabling complex document processing, workflow automation, AI agents, and data intelligence. “MongoDB provides a fantastic platform for storing and querying all kinds of data, but getting unstructured information like documents into a structured format can be a real challenge. That's where Base64 comes in.
We're the perfect onramp, using AI to quickly and accurately extract the key data from documents and feed it right into MongoDB,” said Chris Huff, CEO of Base64. “This partnership makes it easier than ever for businesses to unlock the value hidden in their documents and leverage the full power of MongoDB.”

Dataloop

Dataloop is a platform that allows developers to build and orchestrate unstructured data pipelines and develop AI solutions faster. “We’re thrilled to join forces with MongoDB to empower companies in building multimodal AI agents,” said Nir Buschi, CBO and co-founder of Dataloop. “Our collaboration enables AI developers to combine Dataloop’s data-centric AI orchestration with MongoDB’s scalable database. Enterprises can seamlessly manage and process unstructured data, enabling smarter and faster deployment of AI agents. This partnership accelerates time to market and helps companies get real value to customers faster.”

Maxim AI

Maxim AI is an end-to-end AI simulation and evaluation platform, helping teams ship their AI agents reliably and more than 5x faster. “We're excited to collaborate with MongoDB to empower developers in building reliable, scalable AI agents faster than ever,” said Vaibhavi Gangwar, CEO of Maxim AI. “By combining MongoDB’s robust vector database capabilities with Maxim’s comprehensive GenAI simulation, evaluation, and observability suite, this partnership enables teams to create high-performing retrieval-augmented generation (RAG) applications and deliver outstanding value to their customers.”

Mirror Security

Mirror Security offers a comprehensive AI security platform that provides advanced threat detection, security policy management, and continuous monitoring, ensuring compliance and protection for enterprises. “We're excited to partner with MongoDB to redefine security standards for enterprise AI deployment,” said Dr. Aditya Narayana, Chief Research Officer at Mirror Security.
“By combining MongoDB's scalable infrastructure with Mirror Security's end-to-end vector encryption, we're making it simple for organizations to launch secure RAG pipelines and trusted AI agents. Our collaboration eliminates security-performance trade-offs, empowering enterprises in regulated industries to confidently accelerate their AI initiatives while maintaining the highest security standards.”

Squid AI

Squid AI is a full-featured platform for creating private AI agents in a fast, secure, and automated way. “As an AI agent platform that securely connects to MongoDB in minutes, we're looking forward to helping MongoDB customers reveal insights, take action on their data, and build enterprise AI agents,” said Leslie Lee, Head of Product at Squid AI. “By pairing Squid's semantic RAG and AI functions with MongoDB's exceptional performance, developers can build powerful AI agents that respond to new inputs in real time.”

TrojAI

TrojAI is an AI security platform that protects AI models and applications from new and evolving threats before they impact businesses. “TrojAI is thrilled to join forces with MongoDB to help companies secure their RAG-based AI apps built on MongoDB,” said Lee Weiner, CEO of TrojAI. “We know how important MongoDB is to helping enterprises adopt and harness AI. Our collaboration enables enterprises to add a layer of security to their database initialization and RAG workflows to help protect against the evolving GenAI threat landscape.”

But wait, there’s more!

In February, we’ve got two webinars coming up with MAAP partners that you don’t want to miss:

Build a JavaScript AI Agent With MongoDB and LangGraph.js: Join MongoDB Staff Developer Advocate Jesse Hall and LangChain Founding Software Engineer Jacob Lee for an exclusive webinar that highlights the integration of LangGraph.js, LangChain’s cutting-edge JavaScript library, and MongoDB. Live on Feb 25.
Architecting the Future: RAG and AI Agents for Enterprise Transformation: Join MongoDB, LlamaIndex, and Together AI to explore how to strategically build a tech stack that supports the development of enterprise-grade RAG and AI agentic systems, explore technical foundations and practical applications, and learn how the MongoDB AI Applications Program (MAAP) will enable you to rapidly innovate with AI. Content available on demand.

To learn more about building AI-powered apps with MongoDB, check out our AI Learning Hub and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving AI partner ecosystem.

February 11, 2025

Artificial Intelligence

Automate Network Management Using Gen AI Ops with MongoDB

Imagine that it’s a typical Tuesday afternoon and that you’re the operations manager for a major North American telecommunications company. Suddenly, your Network Operations Center (NOC) receives an alert that web traffic in Toronto has surged by hundreds of percentage points over the last hour—far above its usual baseline. At nearly the same moment, a major Toronto-based client complains that their video streams have been buffering nonstop. Just a few years ago, a scenario like this would trigger a frantic scramble: teams digging into logs, manually writing queries, and attempting to correlate thousands of lines of data in different formats to find a single root cause. Today, there’s a more streamlined, AI-driven approach. By combining MongoDB’s modern database with large language models (LLMs) and a retrieval-augmented generation (RAG) architecture, you can move from reactive “firefighting” to proactive, data-informed diagnostics. Instead of juggling multiple monitoring dashboards or writing complicated queries by hand, you can simply ask for insights—and the system retrieves and analyzes the necessary data automatically. Facing the unexpected traffic spike Now let’s imagine the same situation, but this time with AI-assisted network management. Shortly after you spot a traffic surge in Toronto, your NOC chatbot pings you with a situation report: requests from one neighborhood are skyrocketing, and an unusually high percentage involve video streaming paths or caching servers. Under the hood, MongoDB automatically ingests every log entry and telemetry event in real time—capturing IP addresses, geographic data, request paths, timestamps, router logs, and sensor data. Meanwhile, textual content (such as error messages, user complaints, and chat transcripts) is vectorized and stored in MongoDB for semantic search. 
This setup enables near-instant access to relevant information whenever a keyword like “buffering,” “video streams,” or “streaming lag” is mentioned, ensuring a fast, end-to-end diagnosis. Refer to this article to learn more about semantic search. Zeroing in on the root cause Instead of rummaging through separate logging tools, you pose a simple natural-language question to the system: “What might be causing the client’s video stream buffering problem in Toronto?” The LLM responds by generating a custom MongoDB Aggregation Pipeline —written in Python code—tailored to your query. It might look something like this: a $match stage to filter for the last twenty-four hours of data in Toronto, a $group stage to roll up metrics by streaming services, and a $sort stage to find the largest error counts. The code is automatically served back to you, and with a quick confirmation, you execute it on your MongoDB cluster. A moment later, the chatbot returns with a summarized explanation that points to an overloaded local CDN node, along with higher-than-expected requests from older routers known to misbehave under peak load. Next, you ask the system to explain the core issue in simpler terms so you can share it with a business stakeholder. The LLM takes the numeric results from the Aggregation Pipeline, merges them with textual logs that mention “firmware out-of-date,” and then outputs a cohesive explanation. It even suggests that many of these older routers are still running last year’s firmware release—a known contributor to buffering issues on video streams during traffic spikes. How retrieval-augmented generation (RAG) helps The power behind this effortless insight is a RAG architecture, which marries semantic search with generative text responses. First, the LLM uses vector search in MongoDB to retrieve only those log entries, complaint records, and knowledge base articles that directly relate to streaming. 
Once it has these key data chunks, the LLM can generate—and continually refine—its analysis. Figure 1. Network chatbot architecture with MongoDB. When the system references historical data to confirm that “similar spikes occurred during the playoffs last year” or that “users with older firmware frequently complain about buffering,” it’s not blindly guessing. Instead, it’s accessing domain-specific logs, user feedback, and diagnostic documents stored in MongoDB, and then weaving them together into a coherent explanation. This eliminates guesswork and slashes the time your team would otherwise spend on low-level data cleanup, correlation, and interpretation. Executing automated remediation Armed with these insights, your team can roll out a targeted fix, possibly involving an auto-update to the affected routers or load-balancing traffic to alternative CDN endpoints. MongoDB’s Change Streams can monitor for future anomalies. If a traffic spike starts to look suspiciously similar to the scenario you just solved, the system can raise a proactive alert or even initiate the fix automatically. Refer to the official documentation to learn more about the change streams. Meanwhile, the cost savings add up. You no longer need engineers manually piecing data together, nor do you endure prolonged user dissatisfaction while you try to figure out what’s happening. Everything from anomaly detection to root-cause analysis and recommended mitigation steps is fed through a single pipeline—visible and explainable in plain language. A future of AI-driven operations This scenario highlights how (gen) AI Ops and MongoDB complement each other to transform network management: Schema flexibility: MongoDB’s document-based model effortlessly stores logs, performance metrics, and user feedback in a single, consistent environment. Real-time performance: With horizontal scaling, you can ingest the massive volumes of data generated by network logs and user requests at any hour of the day. 
Vector search integration: By embedding textual data (such as logs, user complaints, or FAQs) and storing those vectors in MongoDB, you enable instant retrieval of semantically relevant content—making it easy for an LLM to find exactly what it needs. Aggregation + LLM: An LLM can auto-generate MongoDB Aggregation Pipelines to sift through numeric data with ease, while a second pass to the LLM composes a final summary that merges both numeric and textual analysis. Once you see how much time and effort this end-to-end workflow saves, you can extend it across the entire organization. Whether it’s analyzing sudden traffic spikes in specific geographies, diagnosing a security event, or handling peak online shopping loads during a holiday sale, the concept remains the same: empower people to ask natural-language questions about complex data, rely on AI to craft the specialized queries behind the scenes, and store it all in a platform that can handle unbounded complexity. Ready to embrace gen AI ops with MongoDB? Network disruptions will never fully disappear, but how quickly and intelligently you respond can be a game-changer. By uniting MongoDB with LLM-based AI and a retrieval-augmented generation (RAG) strategy, you transform your network operations from a tangle of logs and dashboards into a conversational, automated, and deeply informed system. Sign up for MongoDB Atlas to start building your own RAG-based workflows. With intelligent vector search, automated pipeline generation, and natural-language insight, you’ll be ready to tackle everything from video streams buffering complaints to the next unexpected traffic surge—before users realize there’s a problem. If you would like to learn more about how to build gen AI applications with MongoDB, visit the following resources: Learn more about MongoDB capabilities for artificial intelligence on our product page. Get started with MongoDB Vector Search by visiting our product page. 
Blog: Leveraging an Operational Data Layer for Telco Success

Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO Rick Houlihan.
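For readers who want to picture the LLM-generated pipeline from the root-cause section of this post (a $match stage for the last twenty-four hours of Toronto data, a $group stage by streaming service, and a $sort stage on error counts), here is one way it might look in Python. This is a sketch under assumptions, not output from a real system: the field names `city`, `timestamp`, `service`, and `status`, and the rule that a status of 500 or higher counts as an error, are all hypothetical.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from typing import Any


def build_buffering_pipeline(
    city: str = "Toronto",
    now: datetime | None = None,
) -> list[dict[str, Any]]:
    """Build a $match / $group / $sort pipeline over the last 24 hours of logs."""
    now = now or datetime.now(timezone.utc)
    since = now - timedelta(hours=24)
    return [
        # Keep only log entries for the city within the last 24 hours.
        {"$match": {"city": city, "timestamp": {"$gte": since}}},
        # Roll up request and error counts per streaming service.
        {
            "$group": {
                "_id": "$service",
                "requests": {"$sum": 1},
                # Assumption: status codes of 500 or above count as errors.
                "errorCount": {
                    "$sum": {"$cond": [{"$gte": ["$status", 500]}, 1, 0]}
                },
            }
        },
        # Surface the services with the largest error counts first.
        {"$sort": {"errorCount": -1}},
    ]
```

In the workflow described above, a human would review a pipeline like this before running it on the cluster, which is a sensible safeguard for any machine-generated query.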

February 5, 2025

Artificial Intelligence
