MongoDB Blog

Announcements, updates, news, and more

Modernize On-Prem MongoDB With Google Cloud Migration Center

Shifting your business infrastructure to the cloud offers significant advantages, including enhanced system performance, reduced operational costs, and increased speed and agility. However, a successful cloud migration isn’t a simple lift-and-shift. It requires a well-defined strategy, thorough planning, and a deep understanding of your existing environment to align with your company’s unique objectives. Google Cloud’s Migration Center is designed to simplify this complex process, acting as a central hub for your migration journey. It streamlines the transition from your on-premises servers to the Google Cloud environment, offering tools for discovery, assessment, and planning.

MongoDB is excited to announce a significant enhancement to Google Cloud Migration Center: integrated MongoDB cluster assessment in the Migration Center Use Case Navigator. Google Cloud and MongoDB have collaborated to help you gain in-depth visibility into your MongoDB deployments, both MongoDB Community Edition and MongoDB Enterprise Edition, and simplify your move to the cloud. To understand the benefits of using Migration Center, let’s compare it with the process of migrating without it.

Image 1. Image of the Migration Center Use Case Navigator menu, showing migration destinations for MongoDB deployments.

Migrating without Migration Center

Manual discovery: Without automation, asset inventories were laborious, leading to frequent errors and omissions.
Complex planning: Planning involved cumbersome spreadsheets and manual dependency mapping, making accurate cost estimation and risk assessment difficult.
Increased risk: Lack of automated assessment resulted in higher migration failure rates and potential data loss due to undiscovered compatibility issues.
Fragmented tooling: Disparate tools for each migration phase created inefficiencies and complexity, hindering a unified migration strategy.
Higher costs and timelines: Manual processes and increased risks significantly lengthened project timelines and inflated migration costs.
Specialized skill requirement: Migrating required teams to have deep, specialized knowledge of all parts of the infrastructure being moved.

Migrating with Migration Center

When you move to the cloud, you want to make your systems better, reduce costs, and improve performance. A well-planned migration helps you do that. With Migration Center’s new MongoDB assessment, you can:

Discover and inventory your MongoDB clusters: Easily identify all your MongoDB Community Server and MongoDB Enterprise Server clusters running in your on-premises environment.
Gain deep insights: Understand the configuration, performance, and resource utilization of your MongoDB clusters. This data is essential for planning a successful and cost-effective migration.
Simplify your migration journey: By providing a clear understanding of your current environment, Migration Center helps you make informed decisions and streamline the migration process, minimizing risk and maximizing efficiency.
Use a unified platform: Migration Center is designed to be a one-stop shop for your cloud migration needs. It integrates asset discovery, cloud spend estimation, and various migration tools, simplifying your end-to-end journey.
Accelerate using MongoDB Atlas: Migrate your MongoDB workloads to MongoDB Atlas running on Google Cloud with confidence. Migration Center provides the data you need to ensure a smooth transition, enabling you to fully use the scalability and flexibility of MongoDB Atlas.
By providing MongoDB workload identification and guidance, the Migration Center Use Case Navigator enables you to gain valuable insights into the potential transformation journeys for your MongoDB workloads. With the ability to generate comprehensive reports on your MongoDB workload footprint, you can better understand your MongoDB databases. This ultimately enables you to modernize your systems and gain the performance benefits of using MongoDB Atlas on Google Cloud, all while saving money.

Learn more about Google Cloud Migration Center from the documentation. Visit our product page to learn more about MongoDB Atlas. Get started with MongoDB Atlas on Google Cloud today.

April 8, 2025
Updates

Firebase & MongoDB Atlas: A Powerful Combo for Rapid App Development

Firebase and MongoDB Atlas are powerful tools developers can use together to build robust and scalable applications. Firebase offers build and runtime solutions for AI-powered experiences, while MongoDB Atlas provides a fully managed cloud database service optimized for generative AI applications. We’re pleased to announce the release of the MongoDB Atlas Firebase extension, a direct MongoDB connector for Firebase that further streamlines the development process by enabling seamless integration between the two platforms. This extension enables developers to directly interact with MongoDB collections and documents from within their Firebase projects, simplifying data operations and reducing development time. A direct MongoDB connector, built as a Firebase extension, facilitates real-time data synchronization between Firebase and MongoDB Atlas. This enables data consistency across both platforms, empowering developers to build efficient, data-driven applications using the strengths of Firebase and MongoDB.

MongoDB as a backend database for Firebase applications

Firebase offers a streamlined backend for rapid application development, providing offerings like authentication, hosting, and real-time databases. However, applications requiring complex data modeling, high data volumes, or sophisticated querying often work well with MongoDB’s document store. Integrating MongoDB as the primary data store alongside Firebase addresses these challenges. MongoDB provides a robust document database with a rich query language (MongoDB Query Language), powerful indexing (including compound, geospatial, and text indexes), and horizontal scalability for handling massive datasets. This architecture enables developers to use Firebase’s convenient backend services while benefiting from MongoDB’s powerful data management capabilities. Developers commonly use Firebase Authentication for user management, then store core application data, including complex relationships and large volumes of information, in MongoDB. This hybrid approach combines Firebase’s ease of use with MongoDB’s data-handling prowess. Furthermore, the integration of MongoDB Atlas Vector Search significantly expands the capabilities of this hybrid architecture. Modern applications increasingly rely on semantic search and AI-driven features, which require efficient handling of vector embeddings. MongoDB Atlas Vector Search enables developers to perform similarity searches on vector data, unlocking powerful use cases.

Quick-start guide for Firebase’s MongoDB Atlas extension

With the initial release of the MongoDB Atlas extension in Firebase, we are targeting the extension to perform operations such as findOne, insertOne, and vectorSearch on MongoDB. This blog will not cover how to create a Firebase application but will walk you through creating a MongoDB backend for connecting to MongoDB using our Firebase extension. To learn more about how to integrate the deployed backend into a Firebase application, see the official Firebase documentation.

Install the MongoDB Atlas extension in Firebase:

1. Open the Firebase Extensions Hub.
2. Find and select the MongoDB Atlas extension, or use the search bar to find “MongoDB Atlas.”
3. Click on the extension card.
4. Click the “Install” button. You will be redirected to the Firebase console.
5. On the Firebase console, choose the Firebase project where you want to install the extension.

Image 1. Image of the MongoDB Atlas extension’s installation page.
On the installation page:

1. Review “Billing and Usage.”
2. Review “API Endpoints.”
3. Review the permissions granted to the function that will be created.

Configure the extension by providing the following configuration details:

MongoDB URI: The connection string for your MongoDB Atlas cluster
Database Name: The name of the database you want to use
Collection Name: The name of the collection you want to use
Vertex AI Embedding to use: The type of embedding model from Vertex AI
Vertex AI LLM model name: The name of the large language model (LLM) from Vertex AI
MongoDB Index Name: The name of the index in MongoDB
MongoDB Index Field: The field that the index is created upon
MongoDB Embedding Field: The field that contains the embedding vectors
LLM Prompt: The prompt that will be sent to the LLM

Click on “Install Extension.”

Image 2. Image of the MongoDB Atlas extension created from the Firebase extension hub.

Once the extension is created, you can interact with it through the associated Cloud Function.

Image 3. The Cloud Run function created by the Firebase extension.

In conclusion, the synergy between Firebase extensions and MongoDB Atlas opens up exciting possibilities for developers seeking to build efficient, scalable, AI-powered applications. By using Firebase’s streamlined backend services alongside MongoDB’s robust data management and vector search capabilities, developers can create applications that handle complex data and sophisticated AI functionalities with ease. The newly introduced Firebase extension for MongoDB Atlas, specifically targeting operations like findOne, insertOne, and vectorSearch, marks a significant step toward simplifying this integration. While this initial release provides a solid foundation, the potential for further enhancements, such as direct connectors and real-time synchronization, promises to further empower developers. As demonstrated through the quick-start guide, setting up this powerful combination is straightforward, enabling developers to quickly harness the combined strength of these platforms. Ultimately, this integration fosters a more flexible and powerful development environment, enabling the creation of innovative, data-driven applications that meet the demands of modern users.

Build your application with a pre-packaged solution using Firebase. Visit our product page to learn more about MongoDB Atlas.
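For reference, the three operations the extension currently targets correspond to standard MongoDB calls. The following is a minimal sketch of the equivalent operations run directly against an Atlas cluster with PyMongo; the connection string, database, collection, index, and field names are placeholders, and in the extension the query vector would come from the configured Vertex AI embedding model.

```python
# Hedged sketch: the MongoDB operations the extension wraps (findOne, insertOne,
# vectorSearch), run directly with PyMongo. All names below are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")
collection = client["my_database"]["my_collection"]

# insertOne: add a document (the embedding would normally come from Vertex AI).
collection.insert_one({"title": "Sample doc", "embedding": [0.12, -0.08, 0.33]})

# findOne: fetch a single document by a filter.
doc = collection.find_one({"title": "Sample doc"})

# vectorSearch: semantic lookup via the $vectorSearch aggregation stage,
# assuming an Atlas Vector Search index named "vector_index" on "embedding".
results = collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",
            "path": "embedding",
            "queryVector": [0.10, -0.05, 0.30],  # placeholder query embedding
            "numCandidates": 100,
            "limit": 5,
        }
    },
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
])
print(doc, list(results))
```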

April 7, 2025
Updates

Next-Generation Mobility Solutions with Agentic AI and MongoDB Atlas

Driven by advancements in vehicle connectivity, autonomous systems, and electrification, the automotive and mobility industry is currently undergoing a significant transformation. Vehicles today are sophisticated machines, computers on wheels, that generate massive amounts of data, and demand for connected and electric vehicles keeps growing. Automotive players are embracing artificial intelligence (AI), battery electric vehicles (BEVs), and software-defined vehicles (SDVs) to maintain their competitive advantage.

However, managing fleets of connected vehicles can be a challenge. As cars get more sophisticated and are increasingly integrated with internal and external systems, the volume of data they produce and receive greatly increases. This data needs to be stored, transferred, and consumed by various downstream applications to unlock new business opportunities. This will only grow: the global fleet management market is projected to reach $65.7 billion by 2030, growing at a rate of almost 10.8% annually. A 2024 study conducted by Webfleet showed that 32% of fleet managers believe AI and machine learning will significantly impact fleet operations in the coming years; optimizing route planning and improving driver safety are the two most commonly cited use cases.

As fleet management software providers continue to invest in AI, the integration of agentic AI can significantly help with things like route optimization and driver safety enhancement. For example, AI agents can process real-time traffic updates and weather conditions to dynamically adjust routes, ensuring timely deliveries while advising drivers on their car’s condition. This proactive approach contrasts with traditional reactive methods, improving vehicle utilization and reducing operational and maintenance costs.

But what are agents? In short, they are operational applications that attempt to achieve goals by observing the world and acting upon it using the data and tools the application has at its disposal. The term "agentic" denotes having agency, as AI agents can proactively take steps to achieve objectives without constant human oversight. For example, rather than just reporting an anomaly based on telemetry data analysis, an agent for a connected fleet could autonomously cross-check that anomaly against known issues, decide whether it's critical or not, and schedule a maintenance appointment all on its own.

Why MongoDB for agentic AI

Agentic AI applications are dynamic by nature, as they require the ability to create a chain of thought, use external tools, and maintain context across their entire workflow. These applications generate and consume diverse data types, including structured and unstructured data. MongoDB’s flexible document model is uniquely suited to handle both structured and unstructured data as vectors. It allows all of an agent’s context, chain-of-thought, tools metadata, and short-term and long-term memory to be stored in a single database. This means that developers can spend more time on innovation and rapidly iterate on agent designs without being constrained by the rigid schemas of a legacy relational database.

Figure 1. Major components of an AI agent.

Figure 1 shows the major components of an AI agent. The agent will first receive a task from a human or via an automated trigger, and will then use a large language model (LLM) to generate a chain of thought or follow a predetermined workflow.
The agent will use various tools and models during its run and store and retrieve data from a memory provider like MongoDB Atlas.

Tools: The agent utilizes tools to interact with the environment. These can include API methods, database queries, vector search, or a RAG application: anything that supports the model.
Models: A model can be a large language model (LLM), a vision language model (VLM), or a simple supervised machine learning model. Models can be general purpose or specialized, and agents may use more than one.
Data: An agent requires different types of data to function. MongoDB’s document model allows you to easily model all of this data in one single database.

An agentic AI application spans a wide range of functional tools and context. The underlying data structures evolve throughout the agentic workflow and as an agent uses different tools to complete a task. It also builds up memory over time. Let’s list the typical data types you will find in an agentic AI application:

Agent profile: This contains the identity of the agent. It includes instructions, goals, and constraints.
Short-term memory: This holds temporary, contextual information—recent data inputs or ongoing interactions—that the agent uses in real time. For example, short-term memory could store sensor data from the last few hours of vehicle activity. In certain agentic AI frameworks like LangGraph, short-term memory is implemented through a checkpointer. The checkpointer stores intermediate states of the agent’s actions and/or reasoning. This memory allows the agent to seamlessly pause and resume operations.
Long-term memory: This is where the agent stores accumulated knowledge over time. This may include patterns, trends, logs, and historical recommendations and decisions.

By storing each of these data types in rich, nested documents in MongoDB, AI developers can create a single-view representation of an agent’s state and behavior. This enables fast retrieval and simplifies development.

In addition to the document model advantage, building agentic AI solutions for mobility requires a robust data infrastructure. MongoDB Atlas offers several key advantages that make it an ideal foundation for these AI-driven architectures. These include:

Scalability and flexibility: Connected car platforms like fleet management systems need to handle extreme data volumes and variety. MongoDB Atlas is proven to scale horizontally across cloud clusters, letting you ingest millions of telemetry events per minute and store terabytes of telemetry data with ease. For example, the German company ZF uses MongoDB to process 90,000 vehicle messages per minute (over 50 GB of data per day) from hundreds of thousands of connected cars. The flexibility of the document model accelerates development and ensures your data model stays aligned with the real-world entities it represents.
Built-in vector search: AI agents require a robust set of tools to work with. One of the most widely used tools is vector search, which allows agents to perform semantic searches on unstructured data like driver logs, error code descriptions, and repair manuals. MongoDB Atlas Vector Search allows you to store and index high-dimensional vectors alongside your documents and to perform semantic search over unstructured data.
In practice, this means your AI embeddings live right next to the relevant vehicle telemetry and operational data in the database, simplifying architectures for use cases like the connected car incident advisor, in which a new issue can be matched against past issues before passing contextual information to the LLM. For more, check out this example of how an automotive OEM leverages vector search for audio-based diagnostics with MongoDB Atlas Vector Search.
Time series collections and real-time data processing: MongoDB Atlas is designed for real-time applications. It provides time series collections for connected car telemetry data storage, change streams, and triggers that can react to new data instantly. This is crucial for agentic AI feedback loops, where ongoing data ingestion and learning happen continuously.
Best-in-class embedding models with Voyage AI: In early 2025, MongoDB acquired Voyage AI, a leader in embedding and reranking models. Voyage AI embedding models are currently being integrated into MongoDB Atlas, which means developers will no longer need to manage external embedding APIs, standalone vector stores, or complex search pipelines. AI retrieval will be built into the database itself, making semantic search, vector retrieval, and ranking as seamless as traditional queries. This will reduce the time required to develop agentic AI applications.

Agentic AI in action: Connected fleet incident advisor

Figure 2 shows a list of use cases in the mobility sector, sorted by the various capabilities that an agent might demonstrate. AI agents excel at managing multi-step tasks via context management across tasks, they automate repetitive tasks better than robotic process automation (RPA), and they demonstrate human-like reasoning by revisiting and revising past decisions. These capabilities enable a wide range of applications, both during the manufacturing of a vehicle and while it's on the road, connected and sending telemetry. We will review a use case in detail below and see how it can be implemented using MongoDB Atlas, LangGraph, OpenAI, and Voyage AI.

Figure 2. Major use cases of agentic AI in the mobility and manufacturing sectors.

First, the AI agent connects to traditional fleet management software and supports the fleet manager in diagnosing issues and advising drivers. This is an example of a multi-step diagnostic workflow that gets triggered when a driver submits a complaint about the vehicle's performance (for example, increased fuel consumption). Figure 3 shows the sequence diagram of the agent.

Upon receiving the driver complaint, the agent creates a chain of thought that follows a multi-step diagnostic workflow: the system ingests vehicle data such as engine codes and sensor readings, generates embeddings using the Voyage AI voyage-3-large embedding model, and performs a vector search using MongoDB Atlas to find similar past incidents. Once relevant cases are identified, those, along with selected telemetry data, are passed to the OpenAI GPT-4o LLM to generate a final recommendation for the driver (for example, to pull off immediately, or to keep driving and schedule regular maintenance). All data, including telemetry, past issues, session logs, agent profiles, and recommendations, is stored in MongoDB Atlas, ensuring traceability and the ability to refine diagnostics over time. Additionally, MongoDB Atlas is used as a checkpointer by LangGraph, which defines the agent's workflow.

Figure 3. Sequence diagram for a connected fleet advisor agentic workflow.
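The following is a minimal, hedged sketch of the retrieval step in this workflow: embedding a new driver complaint with Voyage AI's voyage-3-large model and matching it against past incidents with Atlas Vector Search before handing the results to the LLM. The database, collection, index, and field names are illustrative and not taken from the actual demo.

```python
# Hedged sketch of the incident-matching step described above.
# Assumes a vector search index named "incident_index" on the "embedding" field
# of an "incidents" collection; all names and keys are placeholders.
import voyageai
from pymongo import MongoClient

vo = voyageai.Client(api_key="<VOYAGE_API_KEY>")
client = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")
incidents = client["fleet"]["incidents"]

complaint = "Fuel consumption increased sharply over the last week, engine light on."

# 1. Embed the new complaint with the voyage-3-large model.
query_vector = vo.embed([complaint], model="voyage-3-large").embeddings[0]

# 2. Find similar past incidents with Atlas Vector Search.
similar = list(incidents.aggregate([
    {
        "$vectorSearch": {
            "index": "incident_index",
            "path": "embedding",
            "queryVector": query_vector,
            "numCandidates": 200,
            "limit": 5,
        }
    },
    {"$project": {"summary": 1, "resolution": 1,
                  "score": {"$meta": "vectorSearchScore"}}},
]))

# 3. These matches, plus selected telemetry, would then be passed to the LLM
#    (e.g., GPT-4o) to generate the recommendation for the driver.
for doc in similar:
    print(doc["summary"], doc.get("resolution"), doc["score"])
```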
Figure 4 shows the agent in action, from receiving an issue to generating a recommendation. By leveraging MongoDB’s flexible data model and powerful Vector Search capabilities, agentic AI can transform fleet management through predictive maintenance and proactive decision-making.

Figure 4. The connected fleet advisor AI agent in action.

To set up the use case shown in this article, please visit our GitHub repository. And to learn more about MongoDB’s role in the automotive industry, please visit our manufacturing and automotive webpage. Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO Rick Houlihan.

April 4, 2025
Artificial Intelligence

Why MongoDB is the Perfect Fit for a Unified Namespace

Smart manufacturing is transforming the industrial world by combining IoT, AI, and cloud technologies to create connected, data-driven production environments. Manufacturers embracing this shift are seeing real, measurable benefits: Deloitte reports that smart factory initiatives can boost manufacturing productivity by up to 12% and improve overall equipment effectiveness by up to 20%. But achieving these gains isn’t always straightforward. Many manufacturers still face the challenge of siloed data and legacy systems, making it difficult to get a real-time, holistic view of operations. Shop floor data, enterprise resource planning (ERP) systems, manufacturing execution system (MES) platforms, and other sources often operate in isolation, limiting the potential for optimization.

The concept of a Unified Namespace model, which provides a single source of truth for all operational data, is a game-changing approach that helps unify these siloed systems into a cohesive ecosystem. MongoDB, with its powerful document-based model, is perfectly suited to be the backbone of this Unified Namespace model, acting as a flexible, scalable, highly available, and real-time repository that can seamlessly integrate and manage complex manufacturing data. In this blog post, we’ll explore how MongoDB’s architecture and capabilities align perfectly with the needs of a UNS, and how our "Leafy Factory" demo serves as a strong proof point of this alignment.

Understanding the Unified Namespace and its importance in manufacturing

A Unified Namespace (UNS) is an architecture in which production data across an organization is consolidated into one central data repository. In a manufacturing setup, a UNS enables the integration of diverse sources like ERP for business operations, MES for production monitoring, and real-time shop floor data. This centralized model provides a single, consistent view of data, allowing teams across the organization to access reliable information for decision-making. By unifying data from various systems, a UNS makes it significantly easier to connect disparate systems and ensures that data can be shared seamlessly across platforms, reducing complexity and integration overhead.

Unlike the traditional automation pyramid, in which information flows hierarchically from sensors up through control systems, MES, and finally to ERP, the UNS breaks down these layers. It creates a flat, real-time data model that allows every system to access and contribute to a shared source of truth—eliminating delays, redundancies, and disconnects between layers.

One of the most impactful advantages of a UNS is real-time data visibility. By centralizing live data streams from the production floor, it provides stakeholders—from operators to executives—with up-to-the-second insights. This immediacy empowers teams to make informed decisions quickly, respond to issues as they arise, and continuously optimize operations. And because the UNS consolidates all data into one namespace, it also unlocks cross-functional insights. Teams can correlate metrics across departmental boundaries—for instance, comparing machine uptime with production targets and financial performance. This integrated perspective enables more strategic planning, better alignment across departments, and continuous improvement initiatives grounded in data.

The importance of flexible data to UNS success

A key prerequisite for a successful UNS implementation is high adaptability.
The model must be capable of easily incorporating new data sources, machines, or production lines without requiring a complete overhaul of the data architecture. This flexibility ensures that as operations evolve and scale, the UNS can grow with them—maintaining a unified and responsive data environment. While the UNS itself does not perform functions like predictive maintenance or cost optimization, it serves as the foundational data layer that enables such advanced applications. By centralizing and contextualizing historical and real-time data on machinery, materials, and production workflows, a UNS provides the essential infrastructure for building IoT-based solutions. With this data in place, manufacturers can develop predictive maintenance strategies, detect anomalies, and optimize costs—leading to reduced downtime, better resource utilization, and smarter decision-making. Together, these capabilities make the Unified Namespace a foundational element for smart manufacturing—bridging systems, enhancing visibility, and enabling data-driven transformation at scale.

Figure 1. The automation pyramid versus a Unified Namespace.

MongoDB as the ideal central repository for a UNS model

The requirements of a UNS model map directly to MongoDB's strengths, making it an ideal choice for manufacturing environments seeking to unify their data. Manufacturing environments routinely deal with highly variable and constantly evolving data structures, ranging from raw machine sensor data to structured ERP records. This diversity presents a challenge for traditional relational databases, which rely on rigid schemas that are difficult to adapt. MongoDB, with its document-oriented design, offers a more flexible solution. By storing data in JSON-like structures, it allows manufacturers to easily accommodate changes—such as adding new sensors or modifying machine attributes—without the need to redesign a fixed schema.

Another key requirement in smart manufacturing is the ability to process data in real time. With streaming data from multiple sources flowing into a UNS, manufacturers can maintain up-to-date information that supports timely interventions and data-driven decision-making. MongoDB supports real-time data ingestion through technologies like Kafka, change streams, and MQTT. This makes it simple to capture live data directly from shop floor machines into a time series collection and synchronize it with information from ERP and MES.

Live shop floor data, ERP, and MES information in one database—combined with MongoDB’s powerful querying, aggregation, and analytics capabilities—allows teams to analyze and correlate diverse data streams in one platform. For instance, production teams can cross-reference MES quality metrics with sensor data to uncover patterns that lead to improved quality control. Finance teams can blend ERP cost data with MES output to gain a more comprehensive view of operational efficiency and cost drivers.

What’s more, MongoDB’s distributed architecture supports horizontal scaling, which is crucial for large manufacturing operations where data volumes grow quickly. As more machines and production lines are brought online, MongoDB clusters can be expanded seamlessly, ensuring the UNS remains performant and responsive under increasing load. And by serving as a central repository for historical machine sensor data, a UNS allows manufacturers to analyze long-term patterns, detect anomalies, and anticipate maintenance needs.
This approach helps reduce unplanned downtime, optimize maintenance schedules, and ultimately lower operational costs. However, with a UNS acting as a centralized data hub, high availability becomes critical—since any failure could disrupt the entire data ecosystem. MongoDB addresses this with replica sets, which provide ultra-high availability and allow updates without any downtime, eliminating the risk of a single point of failure.

Proof point: Building a UNS on MongoDB in the "Leafy Factory"

As shown below, MongoDB’s "Leafy Factory" demo offers a hands-on example of how MongoDB serves as an ideal central repository within a UNS for manufacturing. The demo simulates a realistic industrial environment, combining data from SQL-based ERP and MES systems with real-time MQTT streams from shop floor machines. This setup showcases MongoDB’s ability to consolidate and manage diverse data types into a single, accessible, and continuously updated source of truth.

Figure 2. Leafy Factory UNS architecture.

In the demo, SQL data from a simulated MES is ingested into MongoDB. This includes key production planning, monitoring, and quality metrics—all seamlessly captured using MongoDB’s flexible, document-based JSON format. This structure allows the MES data to remain both organized and accessible for real-time analysis and reporting. Similarly, SQL-based ERP data (like work orders, material tracking, and cost breakdowns) is integrated using a combination of Kafka change streams and the MongoDB Sink connector. The SQL data is captured into Kafka topics using the Debezium connector, with SQL acting as a Kafka producer. The data is then consumed, transformed, and inserted into MongoDB via the MongoDB Sink connector, creating a seamless connection between SQL, Kafka, and MongoDB. This approach keeps ERP data continuously synchronized in MongoDB, demonstrating its reliability as a live source of business-critical information.

At the same time, simulated MQTT data streams feed real-time shop floor data into the database, including machine status, quality outputs, and sensor readings like temperature and vibration. MongoDB’s support for real-time ingestion ensures that this data is immediately available, enabling up-to-date machine monitoring and faster response times. Change streams play a central role by enabling real-time data updates across systems. For instance, when a work order is updated in the ERP system, the change is automatically reflected downstream in MES and shop floor views—illustrating MongoDB’s capability for bi-directional data flows and live synchronization within a unified data model.

Another critical capability shown in the demo is data contextualization and enrichment. As data enters the UNS, MongoDB enriches it with metadata such as machine ID, operator name, and location according to the ISA95 structure. This enriched model allows for fine-grained analysis and filtering, which is crucial for generating actionable, cross-functional insights across manufacturing, operations, and business teams. Together, the Leafy Factory demo not only validates MongoDB’s technical strengths—like real-time processing, flexible data modeling, and scalable architecture—but also demonstrates how these capabilities come together to support a robust, dynamic, and future-ready Unified Namespace for smart manufacturing.
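To make the shop floor ingestion path concrete, here is a minimal sketch of subscribing to an MQTT topic and writing sensor readings into a MongoDB time series collection. The broker address, topic, payload fields, and collection names are illustrative placeholders rather than the demo's actual configuration.

```python
# Hedged sketch: MQTT shop floor readings into a MongoDB time series collection.
# Broker, topic, and database/collection names are placeholders.
import json
import datetime

import paho.mqtt.client as mqtt
from pymongo import MongoClient

mongo = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")
db = mongo["leafy_factory"]

# Create the time series collection once (skipped if it already exists).
if "machine_telemetry" not in db.list_collection_names():
    db.create_collection(
        "machine_telemetry",
        timeseries={"timeField": "ts", "metaField": "machine", "granularity": "seconds"},
    )
telemetry = db["machine_telemetry"]

def on_message(client, userdata, msg):
    # Expect JSON payloads such as {"machine_id": "press-01", "temp_c": 71.2, "vibration": 0.4}
    reading = json.loads(msg.payload)
    telemetry.insert_one({
        "ts": datetime.datetime.utcnow(),
        "machine": {"id": reading["machine_id"]},
        "temp_c": reading.get("temp_c"),
        "vibration": reading.get("vibration"),
    })

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.example.local", 1883)
client.subscribe("factory/+/telemetry")
client.loop_forever()
```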
Conclusion

A Unified Namespace is essential for modern manufacturing, offering a single, consistent view of data that drives operational efficiency, cross-functional insights, and cost savings. MongoDB, with its flexible schema, real-time data processing, and scalability, is uniquely suited to serve as the central repository in a UNS. The Leafy Factory demo showcases MongoDB’s potential in consolidating ERP, MES, and shop floor data, illustrating how MongoDB can transform manufacturing data management, enabling real-time insights and data-driven decision-making. In choosing MongoDB as the backbone of a UNS, manufacturers gain a powerful data infrastructure that not only meets current operational needs but also scales with future growth, creating an agile, responsive, and insight-driven manufacturing environment.

Set up the use case shown in this article using our repository. And, to learn more about MongoDB’s role in the automotive industry, please visit our manufacturing and automotive webpage.

April 3, 2025
Applied

MongoDB 8.0: Improving Performance, Avoiding Regressions

MongoDB 8.0 is the most secure, durable, available, and performant version of MongoDB yet: it’s 36% faster in read workloads and 32% faster in mixed read and write workloads than MongoDB 7.0. In addition to benefiting customers, MongoDB 8.0’s performance has also brought significant benefits to our own internal applications, as my colleague Jonathan Brill recently noted in his own blog post.

To achieve these improvements, we created an 8.0 multi-disciplinary tiger team focused on performance, and eventually expanded this team into a broad “performance army.” The work by these engineers led to new ideas on how to process simple queries and how writes are replicated. Combined with a new way of measuring performance, we also added a way to catch the gradual performance loss that accumulates over time from many minuscule regressions.

Figure 1. MongoDB 8.0 benchmark results.

Benchmarking at MongoDB

The MongoDB Engineering team runs a set of benchmarks internally to measure MongoDB’s performance. Industry-standard benchmarks like YCSB, Linkbench, TPCC, and TPCH are run periodically on a variety of configurations and architectures, and these benchmarks are augmented by custom benchmarks based on customer workloads. By running these benchmarks in our continuous integration system, we ensure that developers do not make commits that are detrimental to performance. For instance, if any commit would regress a benchmark by more than 5% for our most important workloads, we would revert the commit. However, this threshold does not detect regressions of 0.1%, and there are thousands of commits per release (e.g., more than 9,000 in MongoDB 8.0).

During the release of MongoDB 7.0, we started to take seriously this gradual, release-over-release accumulation of performance loss from tiny regressions, so we changed the rules of the game. We decided we could not ship MongoDB 7.0 unless it at least matched MongoDB 6.0’s performance on the most important benchmarks. We began investigating regressions and made changes to get performance back. Typically, we use tools like Intel VTune and Linux perf to find regressions across releases. With the release of MongoDB 7.0 approaching, engineers limited the scope of these fixes to reduce their risk to the release. Some proposed fixes were considered too risky. Other fixes didn’t deliver statistically significant performance improvements (Z-score > 1). Unfortunately, MongoDB lost performance with many tiny cuts at a time, and our team realized that it would take many tiny steps to improve it. We got performance back to MongoDB 6.0’s levels, but we weren't quite satisfied. We knew that what we started with MongoDB 7.0 would need to continue into MongoDB 8.0 as a first-tier concern from the start.

The MongoDB 8.0 performance push

For the release of MongoDB 8.0, we increased the priority of performance over other work and set the goal of matching MongoDB 4.4’s performance at the start. This release was chosen because it switched the default to Write Concern Majority for replica sets. This change in write concern improved MongoDB’s default durability guarantees but came with a loss in performance, since the primary needs to wait for a second write on a second machine to be performed. Before the release of MongoDB 4.4, the default write concern was w:1; when a client inserted a document, the response was returned to the client as soon as the write was journaled to the local disk.
With write concern majority, the MongoDB server waits for a majority of the nodes to write the document to disk before returning a response. On the primary, the MongoDB server inserts the document into the collection, journals this change to disk, and sends the change to the secondary, where it also journals the document to disk and then inserts the document into its collection. Applying the change immediately to the collection on the secondary minimizes the latency for secondary reads.

Figure 2. MongoDB replication writes in MongoDB 7.0.

To start our journey to improving the performance of MongoDB 8.0, we created a multi-disciplinary tiger team of 10 people in August 2023, with myself as the leader. The team comprised two performance engineers, three staff engineers, two senior staff engineers, a senior lead, and one technical program manager. Our team of ten worked together to generate ideas, proofs of concept, and experiments. The team’s process was different from our normal process, as we focused on idea experimentation, versus making ideas production-ready. I gave the team free rein to make any changes they thought could help, and I encouraged experimentation—the MongoDB 8.0 performance tiger team was a safe space. This spirit of experimentation was both important and successful, as it led to new ideas that delivered several big improvements (which are highlighted below). We were able to try quick hacks and measure their performance without having to worry about making our work production quality.

The big improvements

Two of the big improvements we made to MongoDB 8.0 came out of this team: simple finds and replication latency.

MongoDB supports a rich query language, but a lot of queries are simple ones that look up a document by a single _id field; the _id field always has a unique index. MongoDB optimized this with a special query plan stage called IDHACK—a query stage optimized to retrieve a single document with a minimal code path. When the tiger team looked at this code, we realized that it was spending a lot of time going through the general-purpose query planning code paths before choosing the IDHACK plan. So, a tiger team member did an experiment to bypass the entire query planner and hard-code reading from the storage engine. When this delivered significant improvements to the YCSB 100% read workload, we knew we had a winner. While we knew it could not be committed as-is, it did serve as motivation to improve the IDHACK code path in the server in a new code path called ExpressPlan. The query team took this idea and ran with it by expanding it further for updates, deletes, and other unique index lookups. Here are traces for MongoDB from LLVM XRay and Perfetto. The highlighted red areas show the difference between 7.0 and 8.0 for query planning for a db.coll.find({_id:1}).

Figure 3. Comparing MongoDB 7.0 and MongoDB 8.0.

The second big change was how we viewed replicating changes in a cloud database. As explained above, on secondaries, MongoDB journals the writes and then applies them to the collection before acknowledging them back to the primary. During a team brainstorming session, a tiger team member asked, “What if we acknowledge the write as soon as it is journaled, but before we apply it to the collection in memory?” This reduces the latency of the primary and speeds up writes in a replica set while still maintaining our durability guarantees. A second engineer ran with this idea, prototyped it quickly, and within a week proved that it provided a significant performance boost.
Now that the idea was proven to be beneficial, we handed it to the replication team to ship this work. Shipping this change took three months because we had to prove it was correct in the TLA+ models for our replication system, and in all corner cases, before we could ship it.

Catching very small regressions

To detect small regressions, it is important to have benchmarks with no or low noise. But if the threshold is too small, this creates a very noisy, flaky test that developers will learn to ignore. Given the noisiness of various metrics such as latency and throughput, a tiger team member came up with the idea of simply counting instructions via the Linux perf_event_open syscall. In this test, the code exercises the request processing code to do a simple MongoDB ping command. We run the ping command in a loop on a CI machine a few times and report the average instruction count. This test has a 0.2% tolerance and uses a hard-coded baseline number. Developers can adjust the threshold up or down as needed, but this test has been a huge success, as it allows us to detect regressions without spurious noise. Check out the benchmark on GitHub.

From tiger team to (tiger) army

A small tiger team can only do so much, and we didn’t want to create a situation in which one team ships features only for another team to clean up their work later. For example, the MongoDB 8.0 performance tiger team focused on a subset of benchmarks, but MongoDB’s performance is measured with dozens of benchmarks. From November 2023 to January 2024, we started turning the ideas the tiger team had prototyped into production-ready changes, but more work remained to improve performance. This was when we built a performance “army”—we enlisted 75 people from across the 11 MongoDB server teams to work on performance. In this phase of the project, engineers were charged with idea generation and with fixing performance issues, which allowed us to accomplish even more than the tiger team; the larger team finished eight performance projects and 140 additional tickets as part of this work.

By bringing in additional team members, we were able to draw on ideas from a larger pool of database experts. This led to improvements in a wide variety of areas—like parsing of large $in queries, improvements to query yielding, making config.transactions a clustered collection, reworking locking in countless places, micro-optimizations in authorization checks, and a change to a new TCMalloc memory allocator with lower fragmentation. Engineers also looked at improving common code such as namespace string handling, our custom code generation (we found tries helped speed up generated parsers), reducing memory usage, and choosing better data structures in some cases.

To give people the time and space they needed to succeed, we gave them dedicated weeks of time to focus on this work in lieu of adding new features. We encouraged experimentation, and we encouraged people to go with their gut feelings on small improvements, even ones that didn’t appear to move the needle on performance. Because not every experiment succeeded, it was important to encourage each other to keep experimenting and trying in the face of failure. For example, in one failed experiment, two engineers tried to use restartable sequences on Linux, but the change failed to deliver the improvements we wanted given its cost and complexity. On the other hand, custom containers and reader-writer mutexes did deliver.
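As a rough illustration of the instruction-counting idea described above: the real benchmark lives in the server's own test harness (linked on GitHub), but an external analogue can be sketched with the Linux perf CLI instead of calling perf_event_open directly. The baseline and tolerance values below are illustrative, not MongoDB's actual numbers.

```python
# Hypothetical sketch: count CPU instructions for a loop of MongoDB ping commands
# with `perf stat`. Assumes Linux `perf` and `mongosh` are installed and a mongod
# is listening locally; the baseline is an invented placeholder.
import subprocess

BASELINE_INSTRUCTIONS = 1_000_000_000  # illustrative hard-coded baseline
TOLERANCE = 0.002                      # 0.2%, as described in the post

cmd = [
    "perf", "stat", "-e", "instructions", "-x", ",",
    "mongosh", "--quiet", "--eval",
    "for (let i = 0; i < 1000; i++) { db.runCommand({ping: 1}); }",
]
result = subprocess.run(cmd, capture_output=True, text=True, check=True)

# With -x ",", perf prints CSV lines like "123456789,,instructions,..." to stderr.
instructions = None
for line in result.stderr.splitlines():
    if ",instructions" in line:
        instructions = int(line.split(",")[0])
        break

drift = abs(instructions - BASELINE_INSTRUCTIONS) / BASELINE_INSTRUCTIONS
print(f"instructions={instructions} drift={drift:.4%}")
if drift > TOLERANCE:
    raise SystemExit("possible regression: instruction count outside tolerance")
```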
For my part, the most impactful thing I did during this phase was to be a cheerleader and to support the team’s efforts in our performance push. Being positive and optimistic helped people push forward in their performance work even when ideas didn’t work out.

Performance improvements take a village

Overall, MongoDB 8.0 was our most successful release ever in terms of performance. Concerted, innovative work by a passionate team—and later an army—of engineers led to new ideas for performance and new ways of thinking. Performance work is neither easy nor straightforward. But by building a sense of community around our performance push, we supported each other and encouraged each other to deliver great performance improvements for MongoDB 8.0.

To read more about how MongoDB raised the bar with the release of MongoDB 8.0, check out our Chief Technology Officer Jim Scharf’s blog post. And please visit the MongoDB 8.0 page to learn more about all of its features and upgrades.
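As a client-side footnote to the write concern discussion at the start of this post, here is a minimal sketch contrasting the pre-4.4 default (w:1) with the current default (majority) using PyMongo; the connection string and collection names are placeholders.

```python
# Hedged sketch: contrasting the pre-4.4 default write concern (w=1) with the
# current default (majority) from a client's perspective. Names are placeholders.
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["demo"]

# w=1: acknowledged once the primary alone has made the write durable locally.
events_w1 = db.get_collection("events", write_concern=WriteConcern(w=1, j=True))
events_w1.insert_one({"type": "ping", "note": "acknowledged by the primary only"})

# w="majority": acknowledged only after a majority of replica set members have
# made the write durable, the default since MongoDB 4.4.
events_majority = db.get_collection("events", write_concern=WriteConcern(w="majority"))
events_majority.insert_one({"type": "ping", "note": "acknowledged by a majority"})
```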

April 2, 2025
Engineering Blog

Introducing MongoDB Atlas Service Accounts via OAuth 2.0

Authentication is a crucial aspect of interacting with the MongoDB Atlas Administration API, as it ensures that only authorized users or applications can access and manage resources within a MongoDB Atlas project. While MongoDB Atlas users currently have programmatic API keys (PAKs) as their primary authentication method, we recognize that development teams have varying authentication workflow requirements. To help developer teams meet these requirements, we’re excited to announce that Service Accounts via OAuth 2.0 for MongoDB Atlas is now generally available! MongoDB Atlas Service Accounts offer a more streamlined way of authenticating API requests for applications, enabling your developers to use their preferred authentication workflow.

Addressing the challenges of using programmatic access keys

At some point in your MongoDB Atlas journey, you have likely created PAKs. These API keys enable MongoDB Atlas project owners to authenticate access for their users. API keys include a public key and a private key. These two parts serve the same function as a username and a password when you make API requests to MongoDB Atlas. Each API key belongs to only one organization, but you can grant API keys access to any number of projects in that organization.

PAKs use a method of authentication known as HTTP Digest, which is a challenge-response authentication mechanism that uses a hash function to securely transmit credentials without sending plaintext passwords over the network. MongoDB Atlas hashes the public key and the private key using a unique value called a nonce. The HTTP Digest authentication specification says that the nonce is only valid for a short amount of time. This is to prevent replay attacks, so that you can’t cache a nonce and use it forever. It’s also why your API keys are a mix of random symbols, letters, and numbers, and why you can only view a private key once.

As a result, many teams must manage and rotate PAKs to maintain application access security. However, doing this across multiple applications can be cumbersome, especially for teams operating in complex environments. That’s why we’ve introduced support for an alternate authentication method through Service Accounts via OAuth 2.0, which enables users to take advantage of a more automated authentication method for application development.

Using Service Accounts with an OAuth 2.0 client credentials flow

OAuth 2.0 is a standard for interapplication authentication that relies on in-flight TLS encryption to secure its communication channels. This prevents unauthorized parties from intercepting or tampering with the data. The MongoDB Atlas Administration API supports in-flight TLS encryption and uses it to enable Service Accounts as an alternative method for authenticating users. MongoDB Atlas Service Accounts provide a form of OAuth 2.0 authentication that enables machine-to-machine communication. This enables applications, rather than users, to authenticate and access MongoDB Atlas resources. Authentication through Service Accounts follows the same access control model as PAKs, with full authentication lifecycle management. Service Accounts use the OAuth 2.0 client credentials flow, with MongoDB Atlas acting as both the identity provider and the authorization server. Like PAKs, Service Accounts are not tied to individual MongoDB Atlas users but are still integrated with MongoDB Atlas.

Figure 1. How it works: MongoDB Atlas Service Accounts.
Experiencing benefits through Service Accounts

Using Service Accounts to manage programmatic access offers a number of advantages:

Automation: Service Accounts offer an automated way to manage access. Users don’t need to manually manage authentication mechanisms, like recreating a Service Account to rotate the “client secrets.” Instead, they only need to regenerate the client secrets while keeping the other configuration of the existing Service Account intact. Furthermore, Service Accounts are broadly supported across many platforms, enabling easier integration between different services and tools and facilitating easier connections across applications and infrastructure components, regardless of the underlying technology.

Seamless integration with MongoDB Atlas: Service Accounts enable developers to manage authentication in the workflow of their choice. Users can manage the Service Account lifecycle at the organization and project levels via the MongoDB Atlas Administration API, the provided client library (currently, the Atlas Go SDK), and the Atlas UI. They integrate with MongoDB Atlas via the OAuth 2.0 client credentials flow, enabling seamless authentication using cloud-native identity systems.

Granular access control and role management: Service Accounts also have robust security features, providing a standardized and consistent way to manage access. Each organization or project can have its own Service Account, simplifying credential management and access control. Additionally, you can define granular roles for a Service Account to limit its access to only the necessary resources. This reduces the risk of over-permissioning and unauthorized access.

Ready to uplevel your user authentication?

Learn how to create your first Service Account by visiting our documentation. Not a MongoDB Atlas user yet? Sign up for free today.
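To illustrate the client credentials flow described above, here is a minimal, hedged sketch: it exchanges a Service Account's client ID and secret for an access token and then calls the Atlas Administration API with it. The token endpoint, resource path, and versioned Accept header are assumptions based on typical usage; check the Atlas documentation for the exact values.

```python
# Hedged sketch of the OAuth 2.0 client credentials flow with an Atlas Service Account.
# Client ID/secret are placeholders; verify endpoint paths and the versioned Accept
# header against the current Atlas Administration API documentation.
import requests

CLIENT_ID = "<service-account-client-id>"
CLIENT_SECRET = "<service-account-client-secret>"

# 1. Exchange the Service Account credentials for a short-lived bearer token.
token_resp = requests.post(
    "https://cloud.mongodb.com/api/oauth/token",            # assumed token endpoint
    auth=(CLIENT_ID, CLIENT_SECRET),                         # HTTP Basic auth
    data={"grant_type": "client_credentials"},
    timeout=30,
)
token_resp.raise_for_status()
access_token = token_resp.json()["access_token"]

# 2. Call the Atlas Administration API with the bearer token.
projects = requests.get(
    "https://cloud.mongodb.com/api/atlas/v2/groups",         # list projects
    headers={
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/vnd.atlas.2023-11-15+json",   # illustrative API version
    },
    timeout=30,
)
projects.raise_for_status()
for project in projects.json().get("results", []):
    print(project["id"], project["name"])
```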

April 2, 2025
Updates

MongoDB: Gateway to Open Finance and Financial Data Access

This is the second in a two-part series about open finance and the importance of a flexible data store to open finance innovation. Check out part one here!

Open finance is reshaping the financial services industry, pushing traditional institutions to modernize with a data-driven approach. Consumers increasingly expect personalized experiences, making innovation key to customer retention and satisfaction. According to a number of studies [1], financial services are undergoing rapid, dynamic transformation, driven primarily by the impact of Banking-as-a-Service (BaaS), embedded banking services, and AI. All of these initiatives are mainly powered by API services intended for data sharing, and have become must-have technical capabilities for financial institutions.

Open finance can also unlock massive opportunities for continuous innovation. As a result, financial institutions must provision themselves with the right tools and expertise to be fully aware of the potential risks and challenges of embarking on such a “data-driven” journey. Now, let’s dive deeper into an application of open finance with MongoDB.

MongoDB as the open finance data store

Integrating diverse financial data while ensuring its security, compliance, and scalability represents a series of considerable challenges for financial institutions. Bringing together data from a variety of backend systems entails a set of complex hurdles for financial ecosystem participants—banks, fintechs, and third-party providers (TPPs). First, they need to be able to handle structured, semi-structured, and increasingly unstructured data types. Then, cybersecurity and regulatory compliance concerns must be addressed. What’s more, an increase in data-sharing scenarios can open up potential vulnerabilities, which lead to the risk of breach exposure and cyber-attacks (and, therefore, possible legal penalties and/or eventual reputational damage).

Figure 1. The power of open finance.

To implement open finance strategies, organizations must first determine the role they will play: whether they act as data holders, in charge of sharing the data with TPPs, or whether they will be data users, the ones able to provide enhanced financial capabilities to end-users. Then, they must choose the most suitable technology for the data management strategy—and this is where MongoDB comes in, functioning as the operational data store. Let’s explore how MongoDB can play a crucial role for both actors—data holders and data users—through an open finance functional prototype.

Open finance in action: Aggregated financial view for banking users

Figure 2 below shows a digital application from a fictional bank—Leafy Bank—that allows customers to aggregate all their bank accounts into a single platform.

Figure 2. Architecture of MongoDB as the open finance data store.

Four actors are involved in this scenario:

a. Customer - User
b. Data Users - Leafy Bank
c. Data Holders - External Institution
d. Open Finance Data Store - MongoDB Atlas

Now let’s go through the steps from the customer’s perspective.

Step 1. Log in to the banking application

Once logged in, the Leafy Bank digital banking application allows users to aggregate their external bank accounts. This is done behind the scenes, through a RESTful API request that will usually interchange data in JSON format.
For the Leafy Bank prototype, we are using MongoDB and FastAPI together, exposing and consuming RESTful APIs and therefore taking advantage of MongoDB Atlas’s high performance, scalability, and flexibility.

Figure 3. Logging in to the banking application.

Step 2. User authentication and authorization

A crucial step to ensure security and compliance is user consent. End-users are responsible for granting access to their financial information (authorization). In our case, Leafy Bank emulates OAuth 2.0 authentication. It generates the corresponding tokens for securing the service communication between participants. To achieve efficient interoperability without security issues, data holders must enable a secured technological “fence” for sharing data while preventing the operational risk of exposing core systems.

Figure 4. User authorization.

Step 3. Data exposure

After the authorization has been granted, Leafy Bank will fetch the corresponding account data from the data custodians—external banks (in our fictional scenario, Green Bank or MongoDB Bank)—via APIs. Usually, participants expose customers’ financial data (accounts, transactions, and balances) through their exposed services in JSON format to ensure compatibility and seamless data exchange. Because MongoDB stores data in BSON, a superset of JSON, it provides a significant advantage by allowing seamless storage and retrieval of JSON-like data—making it an ideal backend for open finance.

Figure 5. Data exposure.

Step 4. Data fetching

The retrieved financial data is then pushed into the open finance data store—in our case, MongoDB Atlas—where it is centrally stored. Unlike rigid relational databases, MongoDB uses a flexible schema model, making it easy for financial institutions to aggregate diverse data structures from different sources, making it ideal for dynamic ecosystems and easy to adapt without costly migrations or downtime.

Figure 6. Data fetching from the data holder into the MongoDB Atlas data store.

Step 5. Data retrieval

Now that the data has been aggregated in the operational data store (powered by MongoDB Atlas), Leafy Bank can leverage MongoDB Aggregation Pipelines for real-time data analysis and enrichment. To become “open finance” compliant, our Leafy Bank provides a holistic financial view and a global position accessible in a single application, thus improving individuals' experience with their finances. Furthermore, this set of features also benefits financial institutions. They can unveil useful insights for building unique services meant to enhance customers' financial well-being.

Figure 7. Data retrieval from the MongoDB Atlas data store.

Step 6. Bank connected!

In the end, customers can view all their finances in one place, while enabling banks to offer competitive, data-driven, tailored services.

Figure 8. Displaying the bank connection in Leafy Bank.

Demo in action

Now, let’s combine these steps into a real-world demo application:

Figure 9. Leafy Bank - MongoDB as the open finance data store.

Advantages of MongoDB for open finance

Open finance presents opportunities for all the ecosystem participants. On the one hand, bank customers can benefit from tailored experiences. For personal financial management, it can provide end-users with central visibility of their bank accounts. And open finance can enable extended payment initiation services, financial product comparison, enhanced insurance premium assessments, more accurate loan and credit scoring, and more.
From a technical standpoint, MongoDB can empower data holders, data users, and TPPs to achieve open finance solutions. By offering a flexible schema, banks can adapt to open finance’s evolving requirements and regulatory changes while avoiding the complexity of rigid schemas, yet allowing secure and manageable schema validation if required. Furthermore, a scalable (vertical and horizontal) and cloud-native (multi-cloud) platform like MongoDB can simplify data sharing in JSON format, which has been widely adopted as the de facto data interchange format, making it ideal for open finance applications. Internally, MongoDB uses BSON, the binary representation of JSON, for efficient storage and data traversal.

MongoDB’s rich extensions and connectors support a variety of frameworks for RESTful API development. Besides FastAPI, there are libraries for Express.js (Node.js), Django (Python), Spring Boot (Java), and Flask (Python). The goal is to empower developers with an intuitive and easy-to-use data platform that boosts productivity and performance. Additionally, MongoDB offers key features like its aggregation pipeline, which is designed to process data more efficiently by simplifying complex transformations, real-time analytics, and detailed queries. Sophisticated aggregation capabilities from MongoDB allow financial institutions to improve their agility while maintaining their competitive edge, all by having data as their strategic advantage.

Lastly, MongoDB provides financial institutions with critical built-in security controls, including encryption, role-based access controls (RBAC), and auditing. It seamlessly integrates with existing security protocols and compliance standards while enforcing privileged access controls and continuous monitoring to safeguard sensitive data, as detailed in the MongoDB Trust Center.

Check out these additional resources to get started on your open finance journey with MongoDB:

Read part one of our series to discover why a flexible data store is vital for open finance innovation.
Explore our GitHub repository for an in-depth guide on implementing this solution.
Visit our solutions page to learn more about how MongoDB can support financial services.
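As a concrete illustration of the pattern used in the Leafy Bank prototype (a FastAPI endpoint backed by MongoDB and an aggregation pipeline for the global position, as in steps 1 and 5 above), here is a minimal, hedged sketch; the database, collection, and field names are illustrative and not taken from the actual demo.

```python
# Hedged sketch: a FastAPI endpoint returning a customer's aggregated "global position"
# from accounts stored in MongoDB. All names are illustrative placeholders.
from fastapi import FastAPI
from pymongo import MongoClient

app = FastAPI()
client = MongoClient("mongodb+srv://<user>:<password>@<cluster>.mongodb.net")
accounts = client["leafy_bank"]["accounts"]

@app.get("/customers/{customer_id}/global-position")
def global_position(customer_id: str):
    # Aggregation pipeline: sum balances per institution for one customer.
    pipeline = [
        {"$match": {"customer_id": customer_id}},
        {"$group": {
            "_id": "$institution",
            "total_balance": {"$sum": "$balance"},
            "accounts": {"$push": {"number": "$account_number", "balance": "$balance"}},
        }},
        {"$sort": {"total_balance": -1}},
    ]
    by_institution = list(accounts.aggregate(pipeline))
    total = sum(item["total_balance"] for item in by_institution)
    return {"customer_id": customer_id, "total_balance": total,
            "by_institution": by_institution}
```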

April 1, 2025
Applied

MongoDB Powers M-DAQ’s Anti-Money Laundering Compliance Platform

Founded and headquartered in Singapore, M-DAQ Global is a fintech powerhouse providing seamless cross-border transactions for businesses worldwide. M-DAQ’s comprehensive suite of foreign exchange, collections, and payments solutions helps organizations of all sizes navigate the complexities of global trade, offering FX clarity, certainty, and payment mobility. M-DAQ also offers AI-powered services like Know Your Business (KYB), onboarding, and advanced risk management tools. Amid ever-evolving requirements, these tools enable businesses to transact across borders with ease while staying compliant. One of M-DAQ's most innovative solutions, CheckGPT, is an AI-powered platform designed to streamline Anti-Money Laundering (AML) compliance. It was built on MongoDB Atlas, which provides a strong foundation for designing multitenant data storage. This approach ensures that each client has a dedicated database, effectively preventing any data co-mingling. Traditional AML processes often involve tedious, time-consuming tasks, from document review to background checks to customer onboarding. With CheckGPT, M-DAQ aimed to change this paradigm by leveraging AI to automate and accelerate these manual processes. Today, CheckGPT allows businesses to process onboarding 30 times faster than traditional human processing. The platform also leverages MongoDB Atlas’s native Vector Search capabilities to power intelligent semantic searches across unstructured data. The challenge: Managing unstructured, sensitive data, and performing complex searches One of CheckGPT’s priorities was to improve processes around collecting, summarizing, and analyzing data, while flagging potential risks to customers quickly and accurately. Considering the vast number and complexity of data sets its AI platform had to handle, and the strict regulatory landscape the company operates in, it was crucial for M-DAQ to choose a robust database. CheckGPT needed a database that could efficiently and accurately handle unstructured data, and adapt rapidly as the data evolved. The database also had to be highly secure: the AI tool would handle highly sensitive data and would be used by companies operating in heavily regulated industries. Finally, CheckGPT needed the ability to perform complex, high-dimensional searches to power a wide range of queries and real-time information analysis. MongoDB Atlas: A complete platform with unique features According to M-DAQ, there are many benefits to using MongoDB Atlas’s document model: Flexibility: MongoDB Atlas’s document model accommodates the evolving nature of compliance data, providing the flexibility needed to manage CheckGPT's dynamic data structures, such as onboarding documents and compliance workflows. Security and performance: The MongoDB Atlas platform also ensures that data remains secure throughout its lifecycle. M-DAQ was able to implement a multi-tenancy architecture that securely isolates data across its diverse client base. This ensures that the platform can handle varying compliance demands while maintaining exceptional performance, giving M-DAQ’s customers the confidence that the AML processes handled by CheckGPT are compliant with stringent regulatory standards. Vector search capabilities: MongoDB Atlas provides a unified development experience. In particular, MongoDB Atlas Vector Search enables real-time searches across vast amounts of high-dimensional data.
This makes it easier to verify documents, conduct background checks, and continuously monitor customer activity, ensuring fast and accurate results during AML processes. “AI, together with the flexibility of MongoDB, has greatly impacted CheckGPT, enabling us to scale operations and automate complex AML compliance processes,” said Andrew Marchen, General Manager, Payments and Co-founder, Wallex at M-DAQ Global. “This integration significantly reduces onboarding time, which typically took between 4-8 hours to three days depending on the document’s complexity, to less than 10 minutes. With MongoDB, M-DAQ is able to deliver faster and more accurate results while meeting customer needs in a secure and adaptable environment." The future of CheckGPT, powered by MongoDB M-DAQ believes that AI and data-driven technologies and tools will continue to play a central role in automating complex processes. By employing AI, M-DAQ aims to improve operational efficiency, enhance customer experiences, and scale rapidly—while maintaining high service standards. MongoDB’s flexibility and multi-cloud support will be key as M-DAQ plans to use single/multi-cluster and multi-region capabilities in the future. M-DAQ aims to explore additional features that could enhance CheckGPT's scalability and performance. The company, for example, plans to expand its use of MongoDB for future projects involving automating complex processes like compliance, onboarding, and risk management in 2025. Learn more about CheckGPT on their site . Visit our product page to learn more about MongoDB Atlas. Get started with MongoDB Atlas Vector Search today with our Atlas Vector Search Quick Start guide .
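The article describes CheckGPT's multitenant layout only at a high level, so the snippet below is a hedged sketch of how a database-per-client design is commonly expressed with the MongoDB Go Driver, not M-DAQ's actual implementation; the database naming scheme, collection names, and fields are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// tenantDB returns a handle to a dedicated database per client, so one
// tenant's compliance documents never share a namespace with another's.
// The "aml_" prefix is a hypothetical naming convention, not M-DAQ's.
func tenantDB(client *mongo.Client, tenantID string) *mongo.Database {
	return client.Database("aml_" + tenantID)
}

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb+srv://<user>:<pass>@cluster0.example.mongodb.net"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	// All reads and writes for a given client are scoped to that client's database.
	docs := tenantDB(client, "acme-pte-ltd").Collection("onboarding_documents")

	if _, err := docs.InsertOne(ctx, bson.D{
		{Key: "applicant", Value: "Example Trading Co"},
		{Key: "status", Value: "pending_review"},
		{Key: "riskFlags", Value: bson.A{}},
	}); err != nil {
		log.Fatal(err)
	}

	// A reviewer dashboard for this tenant only ever queries this database.
	cursor, err := docs.Find(ctx, bson.D{{Key: "status", Value: "pending_review"}})
	if err != nil {
		log.Fatal(err)
	}
	var pending []bson.M
	if err := cursor.All(ctx, &pending); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d documents awaiting review\n", len(pending))
}
```

Because each tenant's documents live in their own database, access rules, backups, and retention policies can also be applied per client without any risk of data co-mingling.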

April 1, 2025
Artificial Intelligence

LangChainGo and MongoDB: Powering RAG Applications in Go

MongoDB is excited to announce our integration with LangChainGo, making it easier to build Go applications powered by large language models (LLMs). This integration streamlines LLM-based application development by combining LangChainGo’s abstractions for LLM orchestration, MongoDB’s vector database capabilities, and Go’s strengths as a performant, scalable, and easy-to-use production-ready language. With robust support for retrieval-augmented generation (RAG) and AI agents, MongoDB enables efficient knowledge retrieval, contextual understanding, and real-time AI-driven workflows. Read on to learn more about this integration and the advantages of using MongoDB as a vector database for AI/ML applications in Go. LangChainGo: Bringing LangChain to the Go ecosystem LangChain is an open-source framework that simplifies building LLM-powered applications. It offers tools and abstractions to integrate LLMs with diverse data sources, APIs, and workflows, supporting use cases like chatbots, document processing, and autonomous agents. While LangChain currently supports only Python and JavaScript, the need for a similar solution in the Go ecosystem led to the development of LangChainGo. LangChainGo is a community-driven, third-party port of the LangChain framework for the Go programming language. It allows developers to integrate LLMs directly into their Go applications, bringing the capabilities of the original LangChain framework into the Go ecosystem. LangChainGo enables users to embed data using various services, including OpenAI, Ollama, Mistral, and others. It also supports integration with a variety of vector stores, such as MongoDB. MongoDB’s role as an operational and vector database MongoDB excels as a unified data layer for AI applications with native vector search capabilities due to its simplicity, scalability, security, and rich set of features. With Atlas Vector Search built into the core database, there's no need to sync operational and vector data separately—everything stays in one place, saving time and reducing complexity when you develop AI-powered applications. You can easily combine semantic searches with metadata filters, graph lookups, aggregation pipelines, and even geospatial or lexical search, enabling powerful hybrid queries all within a single platform. MongoDB’s distributed architecture allows vector search to scale independently of the core database, ensuring optimized vector query performance and workload isolation for superior scalability. Plus, with enterprise-grade security and high availability, MongoDB provides the reliability and peace of mind you need to power your AI-driven applications at scale. MongoDB, Go, and AI/ML As the Go AI/ML landscape grows, MongoDB continues to drive innovation with its powerful vector search capabilities and LangChainGo integration, empowering developers to build RAG implementations and AI agents. This integration is powered by the MongoDB Go Driver , which supports vector search and allows developers to interact with MongoDB directly from their Go applications, streamlining development and reducing friction. Figure 1. RAG architecture with MongoDB and LangChainGo. While Python and JavaScript dominate the AI/ML ecosystem, Go’s AI/ML ecosystem is still emerging—yet its potential is undeniable. Go’s simplicity, scalability, runtime safety, concurrency, and single-binary deployment make it an ideal production-ready language for AI.
With MongoDB’s powerful database and helpful learning resources, developers can seamlessly build next-generation AI solutions in Go. Ready to dive in? Explore the tutorials below to get started! Getting Started with MongoDB and LangChainGo MongoDB was added as a vector store in LangChainGo’s v0.1.13 release. It is packaged as mongovector , a component that enables developers to use MongoDB as a powerful vector store in LangChainGo. Usage guidance is provided through the mongovector-vectorstore-example , along with the in-depth tutorials linked below. Dive into this integration to unlock the full potential of Go AI applications with MongoDB. We’re excited for you to work with LangChainGo. Here are some tutorials to help you get started: Get Started with the LangChainGo Integration Retrieval-Augmented Generation (RAG) with Atlas Vector Search Build a Local RAG Implementation with Atlas Vector Search Get started with Atlas Vector Search (select Go from the dropdown menu)
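To give a flavor of what retrieval looks like on the Go side, here is a minimal sketch of the $vectorSearch aggregation that a LangChainGo vector store ultimately runs against Atlas, written directly with the MongoDB Go Driver. The index name, field names, and the tiny placeholder query vector are assumptions for illustration; the tutorials above cover the higher-level mongovector API itself.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb+srv://<user>:<pass>@cluster0.example.mongodb.net"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	coll := client.Database("rag_demo").Collection("documents")

	// In a real RAG flow the query vector comes from an embedding model
	// (for example via LangChainGo's embeddings package); a short
	// placeholder vector is used here so the sketch stays self-contained.
	queryVector := []float64{0.01, -0.02, 0.03}

	// $vectorSearch requires an Atlas Vector Search index (here assumed to
	// be named "vector_index") on the "embedding" field of this collection.
	pipeline := mongo.Pipeline{
		{{Key: "$vectorSearch", Value: bson.D{
			{Key: "index", Value: "vector_index"},
			{Key: "path", Value: "embedding"},
			{Key: "queryVector", Value: queryVector},
			{Key: "numCandidates", Value: 100},
			{Key: "limit", Value: 3},
		}}},
		{{Key: "$project", Value: bson.D{
			{Key: "text", Value: 1},
			{Key: "score", Value: bson.D{{Key: "$meta", Value: "vectorSearchScore"}}},
		}}},
	}

	cursor, err := coll.Aggregate(ctx, pipeline)
	if err != nil {
		log.Fatal(err)
	}
	var hits []bson.M
	if err := cursor.All(ctx, &hits); err != nil {
		log.Fatal(err)
	}
	for _, h := range hits {
		fmt.Println(h["score"], h["text"])
	}
}
```

In a full RAG pipeline, the returned text fields would then be passed to an LLM as context for generating the final answer.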

March 31, 2025
Updates

Announcing the 2025 MongoDB PhD Fellowship Recipients

At MongoDB, we’re committed to fostering collaboration between academia and industry to support emerging research leaders. Now in its second year, the MongoDB PhD Fellowship Program aims to advance cutting-edge research in computer science. Fellows receive financial support, mentorship, and opportunities to engage with MongoDB’s researchers and engineers throughout the year-long fellowship. They are also invited to present their research at MongoDB events. It’s hardly groundbreaking—but nonetheless true—to say that the world runs on software. As a result, investing in the future of software development is of paramount importance. So MongoDB is excited and honored to help these students push the frontiers of knowledge in their fields, and to contribute to innovations that will redefine the future of technology. Celebrating the 2025 MongoDB PhD Fellows This year, the selection process was extremely competitive, and the quality of the applications was excellent. The review panel of MongoDB researchers and engineers was impressed with the applicants' accomplishments to date, as well as with their ambitious goals for future research. Without further ado, I’m delighted to announce the recipients of the 2025 MongoDB PhD Fellowship. Congratulations to Xingjian Bai, William Zhang, and Renfei Zhou! These three exceptional scholars stood out for their innovative research and potential to drive significant advancements in their fields. Xingjian Bai, PhD candidate at MIT Xingjian Bai is a first-year PhD student in Electrical Engineering and Computer Science at MIT, supervised by Associate Professor Kaiming He. He obtained his master's and bachelor's degrees in Mathematics and Computer Science from the University of Oxford. His research lies at the intersection of classic algorithms and deep learning, with a focus on physics-inspired generative models and learning-augmented algorithms. More broadly, he is driven by research directions that are scientifically impactful or intellectually stimulating. In his spare time, he enjoys playing tennis and jogging. “I sincerely appreciate MongoDB’s support for Xingjian and contributions to fundamental research on artificial intelligence, deep learning, and machine learning.” - Kaiming He, Associate Professor of the Department of Electrical Engineering and Computer Science (EECS) at MIT William Zhang, PhD candidate at Carnegie Mellon University William Zhang is a third-year PhD student in the Computer Science Department, School of Computer Science, at Carnegie Mellon University. His research focuses on "self-driving" database management systems (DBMSs), specifically machine-learning-based techniques for optimizing their performance. He is advised by Associate Professor Andy Pavlo and is a member of the Database Group (CMU-DB) and Parallel Data Lab. "Will Zhang's PhD research at Carnegie Mellon University seeks to solve the problem all developers have struggled with since the 1970s: how to automate tuning and optimizing a database. Will is using an AI-based approach to develop database optimization algorithms that automatically learn how to exploit similarities between tuning options to reduce the complexity of database optimization. If successful, his research will make it easier for anyone to deploy a database and maintain it as it grows over its lifetime. Removing the human burden of maintaining a database is especially important in the modern era of data-intensive AI applications. 
The Carnegie Mellon Database Group is grateful for MongoDB's support for Will's research through their PhD Fellowship program. Working with his mentor at MongoDB as part of the program provides Will with invaluable guidance and insight into the challenges developers face with databases, especially in a cloud setting like MongoDB Atlas." - Andy Pavlo, Associate Professor of Computer Science at CMU Renfei Zhou , PhD candidate at Carnegie Mellon University Renfei Zhou is a first-year PhD student studying theoretical computer science at CMU, co-advised by Assistant Professor William Kuszmaul and U.A. and Helen Whitaker Professor Guy Blelloch. He completed his bachelor’s degree in the Yao Class at Tsinghua University. He mainly works on classical data structures, especially hash tables and succinct data structures. He is also known for his work on fast matrix multiplication. "Renfei's research focuses on answering basic questions about how space- and time-efficient data structures can be. This is a research area that has a lot of potential for impact—both on how we, as theoreticians, think about data structures, but also on how data structures are implemented in the real world. Renfei isn't just a great researcher, he's also a great collaborator, and his research will almost certainly benefit from the mentorship that he will receive from researchers and engineers at MongoDB." - William Kuszmaul, Assistant Professor of Computer Science at CMU Seny Kamara, Head of Research at MongoDB, shared his thoughts on the program’s second year: “The applications we received for the fellowship were outstanding, but Renfei's, Will's and Xingjian’s research stood out for their depth and ambition. Their work tackles important problems in computer science and has the potential to impact both the wider industry as well as MongoDB’s efforts. We are very excited to collaborate with these exceptional students and to support their research.” We proudly congratulate this year’s winners and thank everyone who took the time to apply! The nomination window for the 2026 MongoDB PhD Fellowship Program will open on September 2, and we invite all PhD students with innovative ideas to apply. For more information about the MongoDB PhD Fellowship Program, the application process, and deadlines for next year's fellowships, please visit our PhD Fellowship Program page . Join a global community of educators and students, and access a wealth of resources, including free curriculum, specialized training, and certification pathways designed to enhance your teaching and student outcomes.

March 27, 2025
News

Secure and Scale Data with MongoDB Atlas on Azure and Google Cloud

MongoDB is committed to simplifying the development of robust, data-driven applications—regardless of where the data resides. Today, we’re announcing two major updates that enhance the security, scalability, and flexibility of MongoDB Atlas across cloud providers. Private, secure connectivity with Azure Private Link for MongoDB Atlas Data Federation, Atlas Online Archive, and Atlas SQL Developers building on Microsoft Azure can now establish private, secure connections to MongoDB Atlas Data Federation, MongoDB Atlas Online Archive, and MongoDB Atlas SQL using Azure Private Link, enabling: End-to-end security: Reduce exposure to security risks by keeping sensitive data off the public internet. Low-latency performance: Ensure faster and more reliable access through direct, private connectivity. Scalability: Build applications that scale while maintaining secure, seamless data access. Imagine a financial services company that needs to run complex risk analysis across multiple data sources, including live transactional databases and archived records. With MongoDB Atlas Data Federation and Azure Private Link, the company can securely query and aggregate this data without exposing it to the public internet, helping it achieve compliance with strict regulatory standards. Similarly, an e-commerce company managing high volumes of customer orders and inventory updates can use MongoDB Atlas Online Archive to seamlessly move older transaction records to cost-effective storage—all while ensuring real-time analytics dashboards still have instant access to historical trends. With Azure Private Link, these applications benefit from secure, low-latency connections, enabling developers to focus on innovation instead of managing complex networking and security policies. General availability of MongoDB Atlas Data Federation and Atlas Online Archive on Google Cloud Developers working with Google Cloud can now use MongoDB Atlas Data Federation and Atlas Online Archive, which are now generally available on Google Cloud. This empowers developers to: Query data across sources: Run a single query across live databases, cloud storage, and data lakes without complex extract, transform, and load (ETL) pipelines. Optimize storage costs: Automatically move infrequently accessed data to lower-cost storage while keeping it queryable with MongoDB Atlas Online Archive. Achieve multi-cloud flexibility: Run applications across Amazon Web Services (AWS), Azure, and Google Cloud without being locked in. For example, a media streaming service might store frequently accessed content metadata in a high-performance database while archiving older user activity logs in Google Cloud Storage. With MongoDB Atlas Data Federation, the streaming service can analyze both live and archived data in a single query, making it easier to surface personalized recommendations without complex ETL processes. For a healthcare analytics platform, keeping years’ worth of patient records in a primary database can be expensive. By using MongoDB Atlas Online Archive, the platform can automatically move older records to lower-cost storage—while still enabling fast access to historical patient data for research and reporting. These updates give developers more control over building and scaling in the cloud. Whether they need secure access on Azure or seamless querying and archiving on Google Cloud, MongoDB Atlas simplifies security, performance, and cost efficiency. These updates are now live! 
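As a rough illustration of the "single query across live and archived data" idea, the sketch below connects to the unified endpoint Atlas exposes when Online Archive (or a federated database instance) is configured and runs one aggregation over both hot and archived orders. The URI, namespace, and field names are placeholders, not output from a real deployment.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()

	// Placeholder URI: with Online Archive or Data Federation, Atlas
	// provides a unified connection string that serves live and archived
	// documents through the same query surface.
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://<federated-or-archive-endpoint>.mongodb.net"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	orders := client.Database("shop").Collection("orders")

	// One query that spans recent orders in the cluster and older orders
	// archived to low-cost storage: total revenue per month for three years.
	since := time.Now().AddDate(-3, 0, 0)
	pipeline := mongo.Pipeline{
		{{Key: "$match", Value: bson.D{{Key: "createdAt", Value: bson.D{{Key: "$gte", Value: since}}}}}},
		{{Key: "$group", Value: bson.D{
			{Key: "_id", Value: bson.D{{Key: "$dateToString", Value: bson.D{
				{Key: "format", Value: "%Y-%m"},
				{Key: "date", Value: "$createdAt"},
			}}}},
			{Key: "revenue", Value: bson.D{{Key: "$sum", Value: "$total"}}},
		}}},
		{{Key: "$sort", Value: bson.D{{Key: "_id", Value: 1}}}},
	}

	cursor, err := orders.Aggregate(ctx, pipeline)
	if err != nil {
		log.Fatal(err)
	}
	var monthly []bson.M
	if err := cursor.All(ctx, &monthly); err != nil {
		log.Fatal(err)
	}
	fmt.Println(monthly)
}
```

The application code does not need to know which documents are still in the cluster and which have been moved to object storage; the query stays the same either way.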
Log in to your MongoDB Atlas account to start exploring the possibilities today.

March 27, 2025
Updates

How Cognistx’s SQUARY AI is Redefining Information Access

In a world where information is abundant but often buried, finding precise answers can be tedious and time-consuming. People spend hours a week simply searching for the information they need. Cognistx, an applied AI startup and a member of the MongoDB for Startups program, is on a mission to eliminate this inefficiency. Through its flagship product, SQUARY AI, the company is building tools to make information retrieval faster, more reliable, and radically simpler. As Cognistx seeks to unlock the future of intuitive search with speed, accuracy, and innovation, MongoDB Atlas serves as a reliable backbone for the company’s data operations. A company journey: From bespoke AI projects to a market-ready solution Cognistx started its journey with a focus on developing custom AI solutions for clients. Over time, the company identified a common pain point across industries: the need for efficient, high-quality tools to extract actionable insights from large volumes of data. This realization led it to pivot toward a product-based approach, culminating in the development of SQUARY AI—a next-generation intelligent search platform. SQUARY AI’s first iteration was born out of a bespoke project. The goal was to build a smart search engine capable of extracting answers to open-ended questions across multiple predefined categories. Early on, the team incorporated features like source tracking to improve trustworthiness and support human-assisted reviews, ensuring that the AI’s answers could be verified and trusted. Seeing the broader potential of its technology, Cognistx began using advancements in natural language processing and machine learning, transforming its early work into a stand-alone product designed for diverse industries. The evolution of SQUARY AI: Using state-of-the-art large language models Cognistx initially deployed traditional machine learning approaches to power SQUARY AI’s search capabilities, such as conversation contextualization and multihop reasoning (the ability to combine information from multiple sources to form a more complete answer). Before the rise of large language models (LLMs), this was no small feat. Today, SQUARY AI incorporates state-of-the-art LLMs to elevate both speed and precision. The platform uses a combination of retrieval-augmented generation (RAG), custom text-cleaning methods, and advanced vector search techniques. MongoDB Atlas integrates seamlessly into this ecosystem. MongoDB Atlas Vector Search powers SQUARY AI’s advanced search capabilities and lays the groundwork for even faster and more accurate information retrieval. With MongoDB Atlas, the company can store vectorized data alongside the rest of its operational data. There’s no need to add a separate, stand-alone database to handle vector search. MongoDB Atlas serves as both the operational data store and vector data store. Cognistx offers multiple branches of SQUARY AI, including: SQUARY Chat: Designed for public-facing or intranet deployment, these website chatbots provide instant, 24/7 access to website content, eliminating the need for human agents. It also empowers website owners with searchable, preprocessed AI insights from user queries. These analytics enable organizations to directly address customer needs, refine marketing strategies, and ensure that their sites contain the most relevant and valuable information for their audiences. SQUARY Enterprise: Built with businesses in mind, this enterprise platform helps companies retrieve precise answers from vast and unorganized knowledge bases. 
Whether it’s assisting employees or streamlining review processes, this tool helps organizations save time, improve team efficiency, and deliver actionable insights. One of the standout features of SQUARY AI is its AI-driven metrics, which assess system performance and provide insights into user interests and requirements. This is particularly valuable for public-facing website chatbots. A powerful database: How MongoDB powers SQUARY AI Cognistx attributes much of its technical success to MongoDB. The company’s history with MongoDB spans years, and its trust in MongoDB’s performance and reliability made the database the obvious choice for powering SQUARY AI. “MongoDB has been pivotal in our journey,” said Cognistx Data Scientist Ihor Markevych. “The scalable, easy-to-use database has allowed us to focus on innovating and refining SQUARY AI without worrying about infrastructure constraints. With MongoDB’s support, we’ve been able to confidently scale as our product grows, ensuring both performance and reliability.” The team’s focus when selecting a database was on cost, convenience, and development effort. MongoDB checked all those boxes, said Markevych. The company’s expertise with MongoDB, coupled with years of consistent satisfaction with its performance, made it the natural choice. With no additional ramp-up effort necessary, the team was able to deploy very quickly. In addition to MongoDB Atlas Vector Search, the other critical feature of MongoDB is its scalability, which Markevych described as seamless. “Its intuitive structure enables us to monitor usage patterns closely and scale up or down as needed. This flexibility ensures we’re always operating efficiently without overcommitting resources,” Markevych said. The MongoDB for Startups program has also been instrumental in the company’s success. The program provides early-stage startups with free MongoDB Atlas credits, technical guidance, co-marketing opportunities, and access to a network of partners. With help from MongoDB technical advisors, the Cognistx team is now confidently migrating data from OpenSearch to MongoDB Atlas to achieve better performance at a reduced cost. The free MongoDB Atlas credits enabled the team to experiment with various configurations to optimize the product further. It also gained access to a large network of like-minded innovators. “The MongoDB for Startups community has provided invaluable networking opportunities, enhancing our visibility and connections within the industry,” Markevych said. The future: Scaling for more projects Looking ahead, Cognistx is focusing on making SQUARY AI even more accessible and customizable. Key projects include automating the onboarding process, which will enable users to define and fine-tune system behavior from the start. The company also aims to expand SQUARY AI’s availability across various marketplaces. With a successful launch on AWS Marketplace, the company next hopes to offer its product on WordPress, making it simple for businesses to integrate SQUARY Chat into their websites. Cognistx is continuing to refine SQUARY AI’s balance between speed, accuracy, and usability. By blending cutting-edge technologies with a user-centric approach, the company is shaping the future of how people access and interact with information. See it in action Cognistx isn’t just building a tool; it’s building a movement toward intuitive, efficient, and conversational search. Experience the possibilities for yourself—schedule a demo of SQUARY AI today. 
To get started with vector search in MongoDB, visit our MongoDB Atlas Vector Search Quick Start guide .
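Cognistx has not published its schema, but the pattern described above, with operational fields and embeddings living side by side in the same document, looks roughly like the following sketch using the MongoDB Go Driver; the collection name, fields, index name, and short placeholder vectors are all hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
	ctx := context.Background()
	client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb+srv://<user>:<pass>@cluster0.example.mongodb.net"))
	if err != nil {
		log.Fatal(err)
	}
	defer client.Disconnect(ctx)

	chunks := client.Database("squary").Collection("content_chunks")

	// Operational fields and the embedding live in the same document, so
	// there is no second vector database to keep in sync.
	_, err = chunks.InsertOne(ctx, bson.D{
		{Key: "siteId", Value: "customer-site-42"},
		{Key: "url", Value: "https://example.com/pricing"},
		{Key: "text", Value: "Our standard plan includes 24/7 support."},
		{Key: "embedding", Value: []float64{0.12, -0.07, 0.33}}, // placeholder vector
	})
	if err != nil {
		log.Fatal(err)
	}

	// Semantic retrieval, pre-filtered to a single customer's site. The
	// filter field must be declared in the Atlas Vector Search index.
	pipeline := mongo.Pipeline{
		{{Key: "$vectorSearch", Value: bson.D{
			{Key: "index", Value: "chunk_vector_index"},
			{Key: "path", Value: "embedding"},
			{Key: "queryVector", Value: []float64{0.11, -0.05, 0.30}}, // embedded user question (placeholder)
			{Key: "numCandidates", Value: 150},
			{Key: "limit", Value: 5},
			{Key: "filter", Value: bson.D{{Key: "siteId", Value: "customer-site-42"}}},
		}}},
	}
	cursor, err := chunks.Aggregate(ctx, pipeline)
	if err != nil {
		log.Fatal(err)
	}
	var results []bson.M
	if err := cursor.All(ctx, &results); err != nil {
		log.Fatal(err)
	}
	fmt.Println(len(results), "matching chunks")
}
```

Pre-filtering on siteId keeps each customer's chatbot scoped to its own content while still using a single collection and a single vector index.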

March 26, 2025
Applied

Ready to get Started with MongoDB Atlas?

Start Free