
Digital Receipts: Mining for Customer & Business Insight with MongoDB

Imagine walking out of your favorite store and moments later receiving a personalized recommendation for a matching item, based not only on what you just bought, but on your entire purchase history. This level of tailored experience has long been difficult to achieve in brick-and-mortar retail, but that’s changing thanks to digital receipts. Digital receipts are gaining traction, with Realtimes UK reporting that a quarter of UK retailers now offer them exclusively. In physical stores, traditional paper receipts represent missed opportunities: static, one-time records that serve little purpose beyond proof of purchase. In contrast, digital receipts unlock a dynamic stream of customer insights, which are a gateway to AI-powered personalization, enabling retailers to transform sales data into timely, relevant recommendations. Retailers are also seeing greater adoption of their customer loyalty apps by embedding features like digital receipts and personalized offers, giving shoppers more reasons to engage after leaving the store.

Retailers are increasingly investing in digital receipts, and MongoDB enables them to digitize in-store transactions, understand shopper behavior, and deliver personalized product suggestions immediately after checkout. With MongoDB’s flexible document model, retailers can efficiently store and analyze rich transactional data, powering real-time personalization and adaptive customer experiences. It’s a smarter, data-driven approach to customer engagement, built for the physical retail world.

The challenge in capturing the in-store customer journey

Personalized shopping experiences are a proven driver of customer loyalty and revenue, but to deliver them effectively, retailers need a complete view of each customer’s journey. For retailers who have a brick-and-mortar presence, that’s where the gap lies. Today, many retailers are making personalization decisions based on incomplete data. While loyalty programs and customer profiles may capture some purchase history, in-store transactions often go unrecorded or take too long to turn into actionable insights. Paper receipts dominate the checkout process, and without a digital trail, these interactions are lost to the retailer’s systems. This means that even a highly engaged, in-store shopper may appear invisible when it comes to targeting and recommendations.

The impact of this is twofold. First, it limits the retailer’s ability to offer relevant product suggestions, personalized promotions, or timely follow-ups, missing key opportunities to increase basket size and repeat visits. Second, it affects the customer experience, particularly in the retailer’s mobile app. Shoppers who frequent physical stores often find that their app doesn’t reflect their recent purchases or preferences, making it feel disconnected and less useful.

By digitizing receipts, retailers can close this gap. Every in-store purchase becomes a rich source of insight, directly tied to the customer profile. This enables more accurate, real-time personalization, both right after checkout and in future interactions. It also adds meaningful value to the retailer’s mobile app: customers see their full purchase history, receive smarter recommendations, and access personalized offers that feel relevant. The business impact is significant: better personalization drives more revenue, while a more engaging app experience leads to higher adoption, increased usage, and stronger loyalty.
Getting the most out of day-to-day data: Building a digital receipt solution

Retailers aiming to enhance personalization must first digitize in-store transactional data, particularly the information generated at checkout from point-of-sale (POS) systems. However, the majority of existing POS systems have fixed, non-changeable data formats, designed primarily for payment processing. These systems often vary across store locations, lack integration with customer profiles, and don't support rapid data access.

To address these challenges, retailers should centralize transaction data from all stores into a consistent and accessible format. Ensuring each purchase is reliably linked to a customer identity, through loyalty sign-ins or digital prompts, and storing that information in a manner that supports immediate, personalized engagement is crucial. Integration with POS systems is essential, allowing retailers to capture transaction data instantly and store it.

A flexible document model (like MongoDB’s) stores structured, unstructured, and AI-ready data in one format, making it ideal for managing complex customer profiles and purchase histories. It captures detailed transaction data, including items, prices, context, and nested info like product attributes, preferences, and loyalty activity, all within a single document.

Figure 1. MongoDB’s document model contains the data used to render the digital receipts. This image shows how MongoDB's document model supports digital receipts by instantly ingesting all receipt details. It features a MongoDB document (left) containing both purchased product information and personalized recommendations, and the digital receipt on PDF (right).

It also makes the data instantly usable for personalization engines and AI models, without the need for heavy transformation or complex joins across multiple systems. If a retailer has several different brands or types of POS systems producing data in different formats, the flexible document model allows that data to be combined more easily, including fast onboarding if new types are introduced. Seamless integration allows connectivity with existing POS systems and third-party analytics tools, reducing friction in adoption. MongoDB enables this through features like real-time data ingestion with change streams, flexible data connectors for systems like Kafka, and an API-driven approach that supports REST. Combined with MongoDB Atlas’s multi-cloud deployment support, retailers can connect and scale across diverse infrastructures without needing to re-architect their existing systems.

Retailers can surface digital receipts directly in the customer-facing app, enhancing the post-purchase experience. Shoppers gain instant access to their full purchase history, enabling features like receipt lookups, easy reorders, warranty tracking, and personalized product suggestions. This drives more app adoption and keeps customers engaged beyond the store visit. To support this experience at scale, retailers need an architecture that can handle high volumes of receipt data from numerous store locations. MongoDB Atlas supports this through horizontal scalability and workload isolation, ensuring operational workloads like customer app interactions remain fast and reliable as data grows. Some retailers optimize storage by keeping receipt metadata in MongoDB while storing the full receipt in an object store like Azure Blob Storage or Google Cloud Storage, enabling a cost-effective approach.
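To make the document model described above concrete, here is a minimal sketch of what a digital receipt might look like when written with PyMongo. The connection string, collection name, field names, and values are illustrative assumptions, not a prescribed schema.

from datetime import datetime, timezone
from pymongo import MongoClient

# Placeholder connection string; use your own Atlas cluster credentials.
client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")
receipts = client["retail"]["digital_receipts"]

# One receipt per document: line items, loyalty context, and recommendations
# live together, so no joins are needed to render the receipt in the app or
# to personalize the post-checkout offer.
receipt = {
    "receiptId": "R-2025-0001",            # illustrative identifier
    "storeId": "store-042",
    "customerId": "cust-98765",            # linked via loyalty sign-in
    "purchasedAt": datetime.now(timezone.utc),
    "items": [
        {"sku": "SKU-1001", "name": "Running shoes", "qty": 1,
         "price": 89.99, "attributes": {"size": "42", "color": "blue"}},
        {"sku": "SKU-2002", "name": "Sports socks", "qty": 2, "price": 7.50},
    ],
    "total": 104.99,
    "loyalty": {"pointsEarned": 105, "tier": "gold"},
    "recommendations": [                    # filled in after checkout
        {"sku": "SKU-3003", "name": "Insoles", "reason": "bought running shoes"},
    ],
}

receipts.insert_one(receipt)

Because recommendations and loyalty data sit in the same document as the line items, the app can render the receipt and its personalized offers with a single read.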
Figure 2. Architecture diagram showing the Digital Receipts components.

MongoDB’s ability to serve real-time queries with low latency ensures that every tap or search in the app feels instant, helping reinforce customer trust and satisfaction. This makes the app not just a digital companion but a key driver of loyalty and repeat visits. By making digital receipts easily accessible in the app, alongside personalized recommendations and seamless post-purchase interactions, retailers create a more engaging and convenient experience that keeps customers coming back. Increased app adoption leads to more touchpoints, better data collection, and more opportunities to upsell or cross-sell, ultimately boosting revenue and retention.

A notable example of a retailer leveraging MongoDB for digital receipts is Albert Heijn, the largest supermarket chain in the Netherlands. By utilizing MongoDB Atlas, Albert Heijn developed a digital receipts feature within their customer-facing app, providing shoppers with real-time and historical insights into their in-store purchases. This adoption of MongoDB Atlas led to annual savings of 25%, improved developer productivity, and a more efficient customer experience.

Retailers use digital receipt data to improve personalized recommendations by combining purchase history, preferences, and behavior. Digitized receipts enable tracking of items, frequency, and context, allowing real-time linking of in-store purchases to customer profiles for more accurate, timely offers.

Figure 3. Diagram showing the Digital Receipts process flow. The image illustrates the digital receipts process: 1. A customer makes a purchase in-store, 2. receives a digital receipt via email or SMS, 3. verifies it through an app, 4. accesses purchase history and personalized recommendations, and 5. can repurchase items through the app.

Using MongoDB’s aggregation pipelines and change streams, retailers can process data efficiently and enable AI-driven personalization immediately after checkout. This streamlined handling of structured and unstructured receipt data supports rapid analysis of customer preferences and purchasing patterns. MongoDB's workload isolation ensures that analytical processes do not impact the performance of customer-facing applications, maintaining a seamless user experience. Retailers can enhance customer engagement by leveraging this data to offer personalized promotions, loyalty rewards, and cross-selling opportunities.
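As an illustration of that flow, the sketch below uses PyMongo change streams to react to each newly inserted receipt and an aggregation pipeline to summarize the customer's recent purchases, the kind of signal a recommendation step could consume. Collection and field names match the hypothetical receipt document shown earlier; the recommendation logic itself is left as a stub.

from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")
receipts = client["retail"]["digital_receipts"]

def top_skus_for(customer_id, limit=3):
    """Aggregate the customer's most frequently bought SKUs as a simple signal."""
    pipeline = [
        {"$match": {"customerId": customer_id}},
        {"$unwind": "$items"},
        {"$group": {"_id": "$items.sku", "timesBought": {"$sum": "$items.qty"}}},
        {"$sort": {"timesBought": -1}},
        {"$limit": limit},
    ]
    return list(receipts.aggregate(pipeline))

# Change streams require a replica set; Atlas clusters qualify.
# Watch for new receipts and trigger a (stubbed) recommendation step.
with receipts.watch([{"$match": {"operationType": "insert"}}]) as stream:
    for change in stream:
        receipt = change["fullDocument"]
        signals = top_skus_for(receipt["customerId"])
        # In a real system this would call the personalization service.
        print(f"New receipt for {receipt['customerId']}, signals: {signals}")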
Ready to embrace digital receipts?

Digital receipts are reshaping how brick-and-mortar retailers unlock customer insights and deliver AI-driven personalization. With MongoDB Atlas, retailers can instantly analyze transactional data, customer preferences, and purchase history within a flexible document model, powering real-time, tailored recommendations that increase basket size, drive repeat purchases, and boost conversions. Beyond personalization, digital receipts reduce printing costs and support sustainability by eliminating paper waste, while offering customers a convenient, app-based way to access and search past purchases.

The real value lies in the data: by capturing rich, real-time insights from every in-store transaction, retailers can unify physical and digital touchpoints, improving customer engagement and business agility. MongoDB’s scalable architecture and real-time processing empower retailers to adapt quickly to changing behavior and deliver seamless, data-driven experiences. Now is the time to modernize your customer engagement strategy.

Digital receipts aren’t just a convenience; they’re a competitive advantage! Discover how MongoDB Atlas can help you deliver seamless customer experiences across all channels through our solutions page.

June 12, 2025

PointHealth AI: Scaling Precision Medicine for Millions

For years, the healthcare industry has grappled with a persistent, frustrating challenge: the absence of a unified, precise approach to patient treatment. Patients often endure "trial-and-error prescribing," leading to delayed recovery and a system bogged down by inefficiency. The core problem lies in scaling precision medicine—making advanced, individualized care accessible to millions of people. This was the big obstacle that Rachel Gollub, CTO and co-founder of the VC-backed startup PointHealth AI, set out to overcome. With a vision to integrate precision medicine into mainstream healthcare, Gollub and her team are transforming how care is delivered, a mission significantly bolstered by their pivotal partnership with MongoDB.

Uncovering the gaps in healthcare treatment decisions

Over a decade working within the insurance industry, Gollub and her co-founder, Joe Waggoner, observed a frustrating reality: persistent gaps in how treatment decisions were made. This wasn't just about inefficiency; it directly impacted patients, who often experienced "trial-and-error prescribing" that delayed their recovery. As Gollub states, they witnessed "the frustrating gaps in treatment decision-making." It motivated them to seek a better solution. The fundamental challenge they faced was scaling precision medicine. How could something so powerful be made accessible to millions rather than just a select few hundred? The biggest obstacle wasn't solely about the technology itself; it was about seamlessly integrating that technology into existing healthcare workflows.

How PointHealth AI eliminates treatment guesswork

PointHealth AI's approach involves a proprietary AI reinforcement learning model. This system analyzes a range of data, including similar patient cases, detailed medical histories, drug interactions, and pharmacogenomic insights. When a physician enters a diagnosis into their health record system, PointHealth AI generates a comprehensive patient report. This report offers tailored treatments, actionable insights, and clinical considerations, all designed to guide decision-making.

Gollub explains the company’s mission: "to integrate precision medicine into mainstream healthcare, ensuring every diagnosis leads to the right treatment from the start." Its focus is on "eliminating guesswork and optimizing care from the very first prescription." The objective is "to deliver personalized, data-driven treatment recommendations." Its strategy for implementation involves direct partnerships with insurance companies and employers. By embedding its technology directly into these healthcare workflows, PointHealth AI aims to ensure widespread accessibility across the entire system. It’s also collaborating with health systems, electronic health record (EHR) companies, and other insurers.

The natural choice: Why PointHealth AI chose MongoDB Atlas

A significant enabler of this progress has been PointHealth AI's partnership with MongoDB. Gollub's prior experience with both self-hosted and managed MongoDB provided confidence in its performance and reliability. MongoDB Atlas was a "natural choice" when selecting a data platform for PointHealth AI. It offered the features the team was looking for, including vector search, text search, and managed scalability. The provision of Atlas credits also swayed the decision. PointHealth AI had specific requirements for its data platform. It needed "high security, HIPAA compliance, auto-scaling, fast throughput, and powerful search capabilities."
The fact that MongoDB Atlas provided these features within a single, managed solution was huge. MongoDB Atlas ensures seamless backups and uptime through its managed database infrastructure. Its vector and text search capabilities are critical for effectively training AI models. The scaling experience has been "seamless," according to Gollub. The MongoDB team has offered "invaluable guidance in architecting a scalable system." This support has enabled PointHealth AI to optimize for performance while remaining on budget. Gollub emphasizes that "HIPAA compliance, scalability, expert support, and advisory sessions have all played critical roles in shaping our infrastructure."

The MongoDB for Startups program has proven impactful. The "free technical advisor sessions provided a clear roadmap for our database architecture." The Atlas credits offered flexibility, allowing the team to "fine-tune our approach without financial strain." Furthermore, the "invaluable expert recommendations and troubleshooting support from the MongoDB advisor team" have been a vital resource. Gollub extends a "huge thank you to the MongoDB Atlas team for their support in building and scaling our system, and handling such an unusual use case."

From pilots to Series A: PointHealth AI's next steps

Looking forward, PointHealth AI has an ambitious roadmap for the current year. Its focus includes launching pilot installations and expanding partnerships with insurance and EHR companies. It’s also dedicated to refining its AI model to support a wider range of health conditions beyond depression. The overarching goal is to bring "precision-driven treatment recommendations to physicians and patients." The aim, Gollub said, is to "launch successful pilots, acquire new customers, and complete our Series A round."

As Gollub states, "Precision medicine isn’t the future—it’s now." The team possesses the technology to deliver targeted treatment options, aiming to ensure patients receive the correct care from the outset. Their vision is to shape a healthcare system where personalized treatments are the standard.

Visit PointHealth AI to learn more about how this innovative startup is making advanced, individualized care accessible to millions. Join the MongoDB for Startups program to start building faster and scaling further with MongoDB!

June 11, 2025

Scaling Vector Search with MongoDB Atlas Quantization & Voyage AI Embeddings

Key Takeaways

- Vector quantization fundamentals: A technique that compresses high-dimensional embeddings from 32-bit floats to lower-precision formats (scalar/int8 or binary/1-bit), enabling significant performance gains while maintaining semantic search capabilities.
- Performance vs. precision trade-offs: Binary quantization provides maximum speed (80% faster queries) with minimal resources; scalar quantization offers balanced performance and accuracy; float32 maintains the highest fidelity at significant resource cost.
- Resource optimization: Vector quantization can reduce RAM usage by up to 24x (binary) or 3.75x (scalar); storage footprint decreases by 38% using the BSON binary format.
- Scaling benefits: Performance advantages multiply at scale; most significant for vector databases exceeding 1M embeddings.
- Semantic preservation: Quantization-aware models like Voyage AI's retain high representation capacity even after compression.
- Search quality control: Binary quantization may require rescoring for maximum accuracy; scalar quantization typically maintains 90%+ retention of float32 results.
- Implementation ease: MongoDB's automatic quantization requires minimal code changes to leverage quantization techniques.

As vector databases scale into the millions of embeddings, the computational and memory requirements of high-dimensional vector operations become critical bottlenecks in production AI systems. Without effective scaling strategies, organizations face:

- Infrastructure costs that grow exponentially with data volume
- Unacceptable query latency that degrades user experience and limits real-time applications
- Limited and restricted deployment options, particularly on edge devices or resource-constrained environments
- Diminished competitive advantage as AI capabilities become limited by technical constraints and bottlenecks rather than use case innovation

This technical guide demonstrates advanced techniques for optimizing vector search operations through precision-controlled quantization—transforming resource-intensive 32-bit float embeddings into performance-optimized representations while preserving semantic fidelity. By leveraging MongoDB Atlas Vector Search’s automatic quantization capabilities with Voyage AI's quantization-aware embedding models, we'll implement systematic optimization strategies that dramatically reduce both computational overhead and memory footprint.

This guide provides an empirical analysis of the critical performance metrics:

- Retrieval latency benchmarking: Quantitative comparison of search performance across binary, scalar, and float32 precision levels, with controlled evaluation of HNSW (hierarchical navigable small world) graph exploration parameters and k-retrieval variations.
- Representational capacity retention: Precise measurement of semantic information preservation through direct comparison of quantized vector search results against full-fidelity retrieval, with particular attention to retention curves across varying retrieval depths.

We'll present implementation strategies and evaluation methodologies for vector quantization that simultaneously optimize for both computational efficiency and semantic fidelity—enabling you to make evidence-based architectural decisions for production-scale AI retrieval systems handling millions of embeddings.
The techniques demonstrated here are directly applicable to enterprise-grade RAG architectures, recommendation engines, and semantic search applications where millisecond-level latency improvements and dramatic RAM reduction translate to significant infrastructure cost savings. The full end-to-end implementation for automatic vector quantization and other operations involved in RAG/agent pipelines can be found on our GitHub repository.

Auto-quantization of Voyage AI embeddings with MongoDB

Our approach addresses the complete optimization cycle for vector search operations, covering:

- Generating embeddings with quantization-aware models
- Implementing automatic vector quantization in MongoDB Atlas
- Creating and configuring specialized vector search indices
- Measuring and comparing latency across different quantization strategies
- Quantifying representational capacity retention
- Analyzing performance trade-offs between binary, scalar, and float32 implementations
- Making evidence-based architectural decisions for production AI retrieval systems

Figure 1. Vector quantization architecture with MongoDB Atlas and Voyage AI.

Using text data as an example, we convert documents into numerical vector embeddings that capture semantic relationships. MongoDB then indexes and stores these embeddings for efficient similarity searches. By comparing queries run against float32, int8, and binary embeddings, you can gauge the trade-offs between precision and performance and better understand which quantization strategy best suits large-scale, high-throughput workloads.

One key takeaway from this article is that representational capacity retention is highly dependent on the embedding model used. With quantization-aware models like Voyage AI’s voyage-3-large at appropriate dimensionality (1024 dimensions), our tests demonstrate that we can achieve 95%+ recall retention at reasonable numCandidates values. This means organizations can significantly reduce memory and computational requirements while preserving semantic search quality, provided they select embedding models specifically designed to maintain their representation capacity after quantization. For more information on why vector quantization is crucial for AI workloads, refer to this blog post.

Dataset information

Our quantization evaluation framework leverages two complementary datasets designed specifically to benchmark semantic search performance across different precision levels. The primary dataset (Wikipedia-22-12-en-voyage-embed) contains approximately 300,000 Wikipedia article fragments with pre-generated 1024-dimensional embeddings from Voyage AI’s voyage-3-large model. This dataset serves as a diverse vector corpus for testing vector quantization effects in semantic search. Throughout this tutorial, we'll use the primary dataset to demonstrate the technical implementation of quantization.

Embedding generation with Voyage AI

For generating new embeddings for AI search applications, we use Voyage AI's voyage-3-large model, which is specifically designed to be quantization-aware. The voyage-3-large model generates 1024-dimensional vectors and has been specifically trained to maintain semantic properties even after quantization, making it ideal for our AI retrieval optimization strategy. For more information on how MongoDB and Voyage AI work together for optimal retrieval, see our previous article, Rethinking Information Retrieval with MongoDB and Voyage AI.
import voyageai

# Initialize the Voyage AI client
client = voyageai.Client()

def get_embedding(text, task_prefix="document"):
    """
    Generate embeddings using the voyage-3-large model for AI Retrieval.

    Parameters:
        text (str): The input text to be embedded.
        task_prefix (str): A prefix describing the task; this is prepended to the text.

    Returns:
        list: The embedding vector (1024 dimensions).
    """
    if not text.strip():
        print("Attempted to get embedding for empty text.")
        return []

    # Call the Voyage API to generate the embedding
    result = client.embed([text], model="voyage-3-large", input_type=task_prefix)

    # Return the first embedding from the result
    return result.embeddings[0]

Converting embeddings to BSON BinData format

A critical optimization step is converting embeddings to MongoDB's BSON BinData format, which significantly reduces storage and memory requirements. The BinData vector format provides significant advantages:

- Reduces disk space by approximately 3x compared to arrays
- Enables more efficient indexing with alternate types (int8, binary)
- Reduces RAM usage by 3.75x for scalar and 24x for binary quantization

from bson.binary import Binary, BinaryVectorDtype

def generate_bson_vector(array, data_type):
    return Binary.from_vector(array, BinaryVectorDtype(data_type))

# Convert embeddings to BSON BinData vector format
wikipedia_data_df["embedding"] = wikipedia_data_df["embedding"].apply(
    lambda x: generate_bson_vector(x, BinaryVectorDtype.FLOAT32)
)

Vector index creation with different quantization strategies

The cornerstone of our performance optimization framework lies in creating specialized vector indices with different quantization strategies. This process leverages MongoDB for general-purpose database functionality and, more specifically, for its high-performance vector database capability of efficiently handling million-scale embedding collections. This implementation step shows how to set up MongoDB's vector search capabilities with automatic quantization, focusing on two primary quantization strategies: scalar (int8) and binary. Three indices are created to measure and evaluate the retrieval latency and recall performance of various precision data types, including the full-fidelity vector representation.

MongoDB uses the HNSW vector index, a graph-based indexing algorithm that organizes vectors in a hierarchical structure of layers. In this structure, vector data points within a layer are contextually similar, while higher layers are sparse compared to lower layers, which are denser and contain more vector data points.

The code snippet below implements the two quantization strategies in parallel; this enables a systematic evaluation of the latency, memory usage, and representational capacity trade-offs across the precision spectrum, supporting data-driven decisions about the optimal approach for specific application requirements. MongoDB Atlas automatic quantization is activated entirely through the vector index definition. By including the "quantization" attribute and setting its value to either "scalar" or "binary", you enable automatic compression of your embeddings at index creation time. This declarative approach means no separate preprocessing of vectors is required—MongoDB handles the dimensional reduction transparently while maintaining the original embeddings for potential rescoring operations.
from pymongo.operations import SearchIndexModel

def setup_vector_search_index(collection, index_definition, index_name="vector_index"):
    """Setup a vector search index with the specified configuration"""
    ...

# 1. Scalar Quantized Index (int8)
vector_index_definition_scalar_quantized = {
    "fields": [{
        "type": "vector",
        "path": "embedding",
        "quantization": "scalar",  # Uses int8 quantization
        "numDimensions": 1024,
        "similarity": "cosine",
    }]
}

# 2. Binary Quantized Index (1-bit)
vector_index_definition_binary_quantized = {
    "fields": [{
        "type": "vector",
        "path": "embedding",
        "quantization": "binary",  # Uses binary (1-bit) quantization
        "numDimensions": 1024,
        "similarity": "cosine",
    }]
}

# 3. Float32 ANN Index (no quantization)
vector_index_definition_float32_ann = {
    "fields": [{
        "type": "vector",
        "path": "embedding",
        "numDimensions": 1024,
        "similarity": "cosine",
    }]
}

# Create the indices
setup_vector_search_index(
    wiki_data_collection,
    vector_index_definition_scalar_quantized,
    "vector_index_scalar_quantized"
)
setup_vector_search_index(
    wiki_data_collection,
    vector_index_definition_binary_quantized,
    "vector_index_binary_quantized"
)
setup_vector_search_index(
    wiki_data_collection,
    vector_index_definition_float32_ann,
    "vector_index_float32_ann"
)

Implementing vector search functionality

Vector search serves as the computational foundation of modern generative AI systems. While LLMs provide reasoning and generation capabilities, vector search delivers the contextual knowledge necessary for grounding these capabilities in relevant information. This semantic retrieval operation forms the backbone of RAG architectures that power enterprise-grade AI applications, such as knowledge-intensive chatbots and domain-specific assistants. In more advanced implementations, vector search enables agentic RAG systems where autonomous agents dynamically determine what information to retrieve, when to retrieve it, and how to incorporate it into complex reasoning chains.

The implementation below transforms raw embedding vectors into intelligent search components that move beyond lexical matching to true semantic understanding. It supports both approximate nearest neighbor (ANN) search and exact nearest neighbor (ENN) search through the use_full_precision parameter:

Approximate nearest neighbor (ANN) search: When use_full_precision = False, the system performs an approximate search using:

- The specified quantized index (binary or scalar)
- The HNSW graph navigation algorithm
- A controlled exploration breadth via numCandidates

This approach sacrifices perfect accuracy for dramatic performance gains, particularly at scale. The HNSW algorithm enables sub-linear time complexity by intelligently sampling the vector space, making it possible to search billions of vectors in milliseconds instead of seconds. When combined with quantization, ANN delivers order-of-magnitude improvements in both speed and memory efficiency.

Exact nearest neighbor (ENN) search: When use_full_precision = True, the system performs an exact search using:

- The original float32 embeddings (regardless of the index specified)
- An exhaustive comparison approach
- The exact = True directive to bypass approximation techniques

ENN guarantees finding the mathematically optimal nearest neighbors by computing distances between the query vector and every single vector in the database.
This brute-force approach provides perfect recall but scales linearly with collection size, becoming prohibitively expensive as vector counts increase beyond millions. We include both search modes for several critical reasons:

- Establishing ground truth: ENN provides the "perfect" baseline against which we measure the quality degradation of approximation techniques. The representational retention metrics discussed later directly compare ANN results against this ENN ground truth.
- Varying application requirements: Not all AI applications prioritize the same metrics. Time-sensitive applications (real-time customer service) might favor ANN's speed, while high-stakes applications (legal document analysis) might require ENN's accuracy.

def custom_vector_search(
    user_query,
    collection,
    embedding_path,
    vector_search_index_name="vector_index",
    top_k=5,
    num_candidates=25,
    use_full_precision=False,
):
    """
    Perform vector search with configurable precision and parameters
    for AI Search applications.
    """
    # Generate embedding for the query
    query_embedding = get_embedding(user_query, task_prefix="query")

    # Define the vector search stage
    vector_search_stage = {
        "$vectorSearch": {
            "index": vector_search_index_name,
            "queryVector": query_embedding,
            "path": embedding_path,
            "limit": top_k,
        }
    }

    # Configure search precision approach
    if not use_full_precision:
        # For approximate nearest neighbor (ANN) search
        vector_search_stage["$vectorSearch"]["numCandidates"] = num_candidates
    else:
        # For exact nearest neighbor (ENN) search
        vector_search_stage["$vectorSearch"]["exact"] = True

    # Project only needed fields
    project_stage = {
        "$project": {
            "_id": 0,
            "title": 1,
            "text": 1,
            "wiki_id": 1,
            "url": 1,
            "score": {"$meta": "vectorSearchScore"}
        }
    }

    # Build and execute the pipeline
    pipeline = [vector_search_stage, project_stage]
    ...

    # Execute the query
    results = list(collection.aggregate(pipeline))
    return {"results": results, "execution_time_ms": execution_time_ms}

Measuring the retrieval latency of various quantized vectors

In production AI retrieval systems, query latency directly impacts user experience, operational costs, and system throughput capacity. Vector search operations typically constitute the primary performance bottleneck in RAG architectures, making latency optimization a critical engineering priority. Sub-100ms response times are often necessary for interactive and mission-critical applications, while batch processing systems may tolerate higher latencies but require consistent predictability for resource planning.

Our latency measurement methodology employs a systematic, parameterized approach that models real-world query patterns while isolating the performance characteristics of different quantization strategies. This parameterized benchmarking enables us to:

- Construct detailed latency profiles across varying retrieval depths
- Identify performance inflection points where quantization benefits become significant
- Map the scaling curves of different precision levels as the data volume increases
- Determine optimal configuration parameters for specific throughput targets

def measure_latency_with_varying_topk(
    user_query,
    collection,
    vector_search_index_name,
    use_full_precision=False,
    top_k_values=[5, 10, 50, 100],
    num_candidates_values=[25, 50, 100, 200, 500, 1000, 2000],
):
    """
    Measure search latency across different configurations.
    """
    results_data = []
    for top_k in top_k_values:
        for num_candidates in num_candidates_values:
            # Skip invalid configurations
            if num_candidates < top_k:
                continue

            # Get precision type from index name
            precision_name = vector_search_index_name.split("vector_index")[1]
            precision_name = precision_name.replace("quantized", "").capitalize()
            if use_full_precision:
                precision_name = "_float32_ENN"

            # Perform search and measure latency
            vector_search_results = custom_vector_search(
                user_query=user_query,
                collection=collection,
                embedding_path="embedding",
                vector_search_index_name=vector_search_index_name,
                top_k=top_k,
                num_candidates=num_candidates,
                use_full_precision=use_full_precision,
            )
            latency_ms = vector_search_results["execution_time_ms"]

            # Store results
            results_data.append({
                "precision": precision_name,
                "top_k": top_k,
                "num_candidates": num_candidates,
                "latency_ms": latency_ms,
            })
            print(f"Top-K: {top_k}, NumCandidates: {num_candidates}, "
                  f"Latency: {latency_ms} ms, Precision: {precision_name}")

    return results_data

Latency results analysis

Our systematic benchmarking reveals dramatic performance differences between quantization strategies across different retrieval scenarios. The visualizations below capture these differences for top-k=10 and top-k=100 configurations.

Figure 2. Search latency vs the number of candidates for top-k=10.
Figure 3. Search latency vs the number of candidates for top-k=100.

Several critical patterns emerge from these latency profiles:

- Quantization delivers order-of-magnitude performance gains: The float32_ENN approach (purple line) demonstrates latency measurements an order of magnitude higher than any quantized approach. At top-k=10, ENN latency starts at ~1600ms and never drops below 500ms, while quantized approaches maintain sub-100ms performance until extremely high candidate counts. This performance gap widens further as data volume scales.
- Scalar quantization offers the best performance profile: Somewhat surprisingly, scalar quantization (orange line) consistently outperforms both binary quantization and float32 ANN across most configurations. This is particularly evident at higher num_candidates values, where scalar quantization maintains near-flat latency scaling. This suggests scalar quantization achieves an optimal balance in the memory-computation trade-off for HNSW traversal.
- Binary quantization shows linear latency scaling: While binary quantization (red line) starts with excellent performance, its latency increases more steeply as num_candidates grows, eventually exceeding scalar quantization at very high exploration depths. This suggests that while binary vectors require less memory, their distance computation savings are partially offset by the need for more complex traversal patterns in the HNSW graph and rescoring.
- All quantization methods maintain interactive-grade performance: Even with 10,000 candidate explorations and top-k=100, all quantized approaches maintain sub-200ms latency, well within interactive application requirements. This demonstrates that quantization enables order-of-magnitude increases in exploration depth without sacrificing user experience, allowing for dramatic recall improvements while maintaining acceptable latency.

These empirical results validate our theoretical understanding of quantization benefits and provide concrete guidance for production deployment: scalar quantization offers the best general-purpose performance profile, while binary quantization excels in memory-constrained environments with moderate exploration requirements.
In the images below, we employ logarithmic scaling for both axes in our latency analysis because search performance data typically spans multiple orders of magnitude. When comparing different precision types (scalar, binary, float32_ann) across varying numbers of candidates, the latency values can range from milliseconds to seconds, while candidate counts may vary from hundreds to millions. Linear plots would compress smaller values and make it difficult to observe performance trends across the full range (as we see above). Logarithmic scaling transforms exponential relationships into linear ones, making it easier to identify proportional changes, compare relative performance improvements, and detect patterns that would otherwise be obscured. This visualization approach is particularly valuable for understanding how each precision type scales with increasing workload and for identifying the optimal operating ranges where certain methods outperform others (as shown below).

Figure 4. Search latency vs the number of candidates (log scale) for top-k=10.
Figure 5. Search latency vs the number of candidates (log scale) for top-k=100.

The performance characteristics observed in the logarithmic plots above directly reflect the architectural differences inherent in binary quantization's two-stage retrieval process. Binary quantization employs a coarse-to-fine search strategy: an initial fast retrieval phase using low-precision binary representations, followed by a refinement phase that rescores the top-k candidates using full-precision vectors to restore accuracy. This dual-phase approach creates a fundamental performance trade-off that manifests differently across varying candidate pool sizes. For smaller candidate sets, the computational savings from binary operations during the initial retrieval phase can offset the rescoring overhead, making binary quantization competitive with other methods. However, as the candidate pool expands, the rescoring phase—which must compute full-precision similarity scores for an increasing number of retrieved candidates—begins to dominate the total latency profile.

Measuring representational capacity retention

While latency optimization is critical for operational efficiency, the primary concern for AI applications remains semantic accuracy. Vector quantization introduces a fundamental trade-off: computational efficiency versus representational capacity. Even the most performant quantization approach is useless if it fails to maintain the semantic relationships encoded in the original embeddings. To quantify this critical quality dimension, we developed a systematic methodology for measuring representational capacity retention—the degree to which quantized vectors preserve the same nearest-neighbor relationships as their full-precision counterparts. This approach provides an objective, reproducible framework for evaluating semantic fidelity across different quantization strategies.

def measure_representational_capacity_retention_against_float_enn(
    ground_truth_collection,
    collection,
    quantized_index_name,
    top_k_values,
    num_candidates_values,
    num_queries_to_test=1,
):
    """
    Compare quantized search results against full-precision baseline.

    For each test query:
    1. Perform baseline search with float32 exact search
    2. Perform same search with quantized vectors
    3. Calculate retention as % of baseline results found in quantized results
    """
    retention_results = {"per_query_retention": {}}
    overall_retention = {}

    # Initialize tracking structures
    for top_k in top_k_values:
        overall_retention[top_k] = {}
        for num_candidates in num_candidates_values:
            if num_candidates < top_k:
                continue
            overall_retention[top_k][num_candidates] = []

    # Get precision type
    precision_name = quantized_index_name.split("vector_index")[1]
    precision_name = precision_name.replace("quantized", "").capitalize()

    # Load test queries from ground truth annotations
    ground_truth_annotations = list(
        ground_truth_collection.find().limit(num_queries_to_test)
    )

    # For each annotation, test all its questions
    for annotation in ground_truth_annotations:
        ground_truth_wiki_id = annotation["wiki_id"]
        ...

    # Calculate average retention for each configuration
    avg_overall_retention = {}
    for top_k, cand_dict in overall_retention.items():
        avg_overall_retention[top_k] = {}
        for num_candidates, retentions in cand_dict.items():
            if retentions:
                avg = sum(retentions) / len(retentions)
            else:
                avg = 0
            avg_overall_retention[top_k][num_candidates] = avg

    retention_results["average_retention"] = avg_overall_retention
    return retention_results

Our methodology takes a rigorous approach to retention measurement:

- Establishing ground truth: We use float32 exact nearest neighbor (ENN) search as the baseline "perfect" result set, acknowledging that these are the mathematically optimal neighbors.
- Controlled comparison: For each query in our annotation dataset, we perform parallel searches using different quantization strategies, carefully controlling for top-k and num_candidates parameters.
- Retention calculation: We compute retention as the ratio of overlapping results between the quantized search and the ENN baseline: |quantized_results ∩ baseline_results| / |baseline_results|.
- Statistical aggregation: We average retention scores across multiple queries to account for query-specific variations and produce robust, generalizable metrics.

This approach provides a direct, quantitative measure of how much semantic fidelity is preserved after quantization. A retention score of 1.0 indicates that the quantized search returns exactly the same results as the full-precision search, while lower scores indicate divergence.
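As a concrete illustration of that ratio (using made-up document IDs, not data from the benchmark), the retention of one quantized result set against the ENN baseline can be computed like this:

def retention(baseline_ids, quantized_ids):
    """Fraction of baseline results that also appear in the quantized results."""
    baseline = set(baseline_ids)
    return len(baseline & set(quantized_ids)) / len(baseline)

# Hypothetical top-5 result IDs for a single query
baseline_top5 = ["doc_1", "doc_2", "doc_3", "doc_4", "doc_5"]   # float32 ENN
quantized_top5 = ["doc_1", "doc_2", "doc_4", "doc_5", "doc_9"]  # quantized index

print(retention(baseline_top5, quantized_top5))  # 0.8 -> 4 of 5 baseline hits retained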
Representational capacity results analysis

The findings from the representational capacity retention evaluation provide empirical validation that properly implemented quantization—particularly scalar quantization—can maintain semantic fidelity while dramatically reducing computational and memory requirements. Note that in the chart below, the scalar curve (yellow) exactly matches the float32_ann performance (blue)—so much so that the blue line is completely hidden beneath the yellow. The near-perfect retention of scalar quantization should alleviate concerns about quality degradation, while binary quantization's retention profile suggests it's suitable for applications with higher performance demands that can tolerate slight quality trade-offs or compensate with increased exploration depth.

Figure 6. Retention score vs the number of candidates for top-k=10.
Figure 7. Retention score vs the number of candidates for top-k=50.
Figure 8. Retention score vs the number of candidates for top-k=100.

- Scalar quantization achieves near-perfect retention: The scalar quantization approach (orange line) demonstrates extraordinary representational capacity preservation, achieving 98-100% retention across nearly all configurations. At top-k=10, it reaches perfect 1.0 retention with just 100 candidates, effectively matching full-precision ENN results while using 4x less memory. This remarkable performance validates the effectiveness of int8 quantization when implemented with MongoDB's automatic quantization.
- Binary quantization shows a retention-exploration trade-off: Binary quantization (red line) exhibits a clear correlation between exploration depth and retention quality. At top-k=10, it starts at ~91% retention with minimal candidates but improves to 98% at 500 candidates. The effect is more pronounced at higher top-k values (50 and 100), where initial retention drops to ~74% but recovers substantially with increased exploration. This suggests that binary quantization's information loss can be effectively mitigated by exploring more of the vector space.
- Retention dynamics change with retrieval depth: As top-k increases from 10 to 100, the retention patterns become more differentiated between quantization strategies. This reflects the increasing challenge of maintaining accurate rankings as more results are requested. While scalar quantization remains relatively stable across different top-k values, binary quantization shows more sensitivity, indicating it's better suited for targeted retrieval scenarios (low top-k) than for broad exploration.
- Exploration depth compensates for precision loss: A fascinating pattern emerges across all quantization methods: increased num_candidates consistently improves retention. This demonstrates that reduced precision can be effectively counterbalanced by broader exploration of the vector space. For example, binary quantization at 500 candidates achieves better retention than scalar quantization at 25 candidates, despite using 32x less memory per vector.
- Float32 ANN vs. scalar quantization: The float32 ANN approach (blue line) shows virtually identical retention to scalar quantization at higher top-k values, while consuming 4x more memory. This suggests scalar quantization represents an optimal balance point, offering full-precision quality with significantly reduced resource requirements.

Conclusion

This guide has demonstrated the powerful impact of vector quantization in optimizing vector search operations through MongoDB Atlas Vector Search's automatic quantization feature, using Voyage AI embeddings. These findings provide empirical validation that properly implemented quantization—particularly scalar quantization—can maintain semantic fidelity while dramatically reducing computational and memory requirements. In summary:

- Binary quantization achieves optimal latency and resource efficiency, particularly valuable for high-scale deployments where speed is critical.
- Scalar quantization provides an effective balance between performance and precision, suitable for most production applications.
- Float32 maintains maximum accuracy but incurs significant performance and memory costs.

Figure 9. Performance and memory usage metrics for binary quantization, scalar quantization, and float32 implementation.
Based on the image above, our implementation demonstrated substantial efficiency gains:

- The Binary Quantized Index has a disk footprint of 407.66MB, approximately 4KB per document. This compression comes from representing high-dimensional vectors as binary bits, dramatically reducing storage requirements while maintaining retrieval capability.
- The Float32 ANN Index requires 394.73MB of disk space, slightly less than binary due to optimized index structures, but demands the full storage footprint be loaded into memory for optimal performance.
- The Scalar Quantized Index shows the largest storage requirement at 492.83MB (approximately 5KB per document), suggesting this method maintains higher precision than binary while still applying compression techniques, resulting in a middle-ground approach between full precision and extreme quantization.

The most striking difference lies in memory requirements. Binary quantization demonstrates a 23:1 memory efficiency ratio, requiring only 16.99MB in RAM versus the 394.73MB needed by float32_ann. Scalar quantization provides a 3:1 memory optimization, requiring 131.42MB compared to float32_ann's full memory footprint.

For production AI retrieval implementations, general guidance is as follows:

- Use scalar quantization for general use cases requiring a good balance of speed and accuracy.
- Use binary quantization for large-scale applications (1M+ vectors) where speed is critical.
- Use float32 only for applications requiring maximum precision, where accuracy is paramount.

Vector quantization becomes particularly valuable for databases exceeding 1M vectors, where it enables significant scalability improvements without compromising retrieval accuracy. When combined with MongoDB Atlas Search Nodes, this approach effectively addresses both cost and performance constraints in advanced vector search applications.

Boost your MongoDB skills today through our Atlas Learning Hub. Head over to our quick start guide to get started with Atlas Vector Search.

June 10, 2025

Enhancing AI Observability with MongoDB and Langtrace

Building high-performance AI applications isn’t just about choosing the right models—it’s also about understanding how they behave in real-world scenarios. Langtrace offers the tools necessary to gain deep insights into AI performance, ensuring efficiency, accuracy, and scalability.

San Francisco-based Langtrace AI was founded in 2024 with a mission of providing cutting-edge observability solutions for AI-driven applications. While still in its early stages, Langtrace AI has rapidly gained traction in the developer community, positioning itself as a key player in AI monitoring and optimization. Its open-source approach fosters collaboration, enabling organizations of all sizes to benefit from advanced tracing and evaluation capabilities.

The company’s flagship product, Langtrace AI, is an open-source observability tool designed for building applications and AI agents that leverage large language models (LLMs). Langtrace AI enables developers to collect and analyze traces and metrics, optimizing performance and accuracy. Built on OpenTelemetry standards, Langtrace AI offers real-time tracing, evaluations, and metrics for popular LLMs, frameworks, and vector databases, with integration support for both TypeScript and Python.

Beyond its core observability tools, Langtrace AI is continuously evolving to address the challenges of AI scalability and efficiency. By leveraging OpenTelemetry, the company ensures seamless interoperability with various observability vendors. Its strategic partnership with MongoDB enables enhanced database performance tracking and optimization, ensuring that AI applications remain efficient even under high computational loads.

Langtrace AI's technology stack

Langtrace AI is built on a streamlined—yet powerful—technology stack, designed for efficiency and scalability. Its SDK integrates OpenTelemetry libraries, ensuring tracing without disruptions. On the backend, MongoDB works with the rest of the tech stack to manage metadata and trace storage effectively. For the client side, Next.js powers the interface, utilizing cloud-deployed API functions to deliver robust performance and scalability.

Figure 1. How Langtrace AI uses MongoDB Atlas to power AI traceability and feedback loops.

“We have been a MongoDB customer for the last three years and have primarily used MongoDB as our metadata store. Given our longstanding confidence in MongoDB's capabilities, we were thrilled to see the launch of MongoDB Atlas Vector Search and quickly integrated it into our feedback system, which is a RAG (retrieval-augmented generation) architecture that powers real-time feedback and insights from our users. Eventually, we added native support to trace MongoDB Atlas Vector Search to not only trace our feedback system but also to make it natively available to all MongoDB Atlas Vector Search customers by partnering officially with MongoDB.” Karthik Kalyanaraman, Co-founder and CTO, Langtrace AI.

Use cases and impact

The integration of Langtrace AI with MongoDB has proven transformative for developers using MongoDB Atlas Vector Search. As highlighted in Langtrace AI's MongoDB partnership announcement, our collaboration equips users with the tools needed to monitor and optimize AI applications, enhancing performance by tracking query efficiency, identifying bottlenecks, and improving model accuracy. The partnership enhances observability within the MongoDB ecosystem, facilitating faster, more reliable application development.

Integrating MongoDB Atlas with advanced observability tools like Langtrace AI offers a powerful approach to monitoring and optimizing AI-driven applications. By tracing every stage of the vector search process—from embedding generation to query execution—MongoDB Atlas provides deep insights that allow developers to fine-tune performance and ensure smooth, efficient system operations. To explore how Langtrace AI integrates with MongoDB Atlas for real-time tracing and optimization of vector search operations, check out this insightful blog by Langtrace AI, where they walk through the process in detail.
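As a rough illustration of that tracing pattern (using the generic OpenTelemetry API rather than Langtrace's own SDK surface), the sketch below wraps an Atlas Vector Search aggregation in a span so that query latency and result counts flow to whichever OpenTelemetry-compatible backend you configure. The collection, index name, and span attribute names are illustrative assumptions.

from opentelemetry import trace
from pymongo import MongoClient

# Spans are no-ops unless a TracerProvider and exporter are configured
# (for example, by an SDK such as Langtrace's or the plain OpenTelemetry SDK).
tracer = trace.get_tracer("vector-search-demo")

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")
collection = client["search"]["documents"]  # hypothetical collection

def traced_vector_search(query_vector, top_k=5):
    # One span per search; the exporter decides where the trace data ends up.
    with tracer.start_as_current_span("atlas.vector_search") as span:
        span.set_attribute("db.collection", "documents")
        span.set_attribute("search.top_k", top_k)
        pipeline = [
            {
                "$vectorSearch": {
                    "index": "vector_index",      # assumed index name
                    "path": "embedding",
                    "queryVector": query_vector,
                    "numCandidates": top_k * 20,
                    "limit": top_k,
                }
            },
            {"$project": {"_id": 0, "title": 1, "score": {"$meta": "vectorSearchScore"}}},
        ]
        results = list(collection.aggregate(pipeline))
        span.set_attribute("search.results", len(results))
        return results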
Opportunities for growth and the evolving AI ecosystem

Looking ahead, Langtrace AI is excited about the prospects of expanding its collaboration with MongoDB. As developers craft sophisticated AI agents using MongoDB Atlas, the partnership aims to equip them with the advanced tools necessary to fully leverage these powerful database solutions. Together, both companies support developers in navigating increasingly complex AI workflows efficiently.

As the AI landscape shifts towards non-deterministic systems with real-time decision-making, the demand for advanced observability and developer tools intensifies. MongoDB is pivotal in this transformation, providing solutions that optimize AI-driven applications and ensuring seamless development as the ecosystem evolves.

Explore further

Interested in learning more about the Langtrace AI and MongoDB partnership?

- Discover the enriching capabilities Langtrace AI brings to developers within the MongoDB ecosystem.
- Learn about tracing MongoDB Atlas Vector Search with Langtrace AI to improve AI model performance.
- Access comprehensive documentation for integrating Langtrace AI with MongoDB Atlas.

Start enhancing your AI applications today and experience the power of optimized observability. To learn more about building AI-powered apps with MongoDB, check out our AI Learning Hub and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving AI partner ecosystem.

June 9, 2025

What I Wish I’d Known Before Becoming a Solutions Architect

My journey to becoming a solutions architect (SA) has been anything but straightforward. After working as an engineer in telecom, receiving my PhD in computer science, and spending time in the energy efficiency and finance industries, I joined MongoDB to work at the intersection of AI and data solutions, guiding enterprises to success with MongoDB’s flexible, scalable database platform. It’s a role that requires having both deep technical knowledge and business acumen, and while the nature of the SA role has evolved over time, one thing has remained constant: the need to understand people, their problems, and how the technology we use can solve them. As I reflect on my career journey, here are some key lessons I’ve learned about being an SA—and things I wish I’d known when I first started.

1. Influence comes from understanding

In my earlier roles, I thought that presenting clients with a perfect technical solution was the key to success. However, I quickly learned that being a successful solutions architect requires much more than technical excellence. Instead, the solutions that you offer need to be aligned with customers’ business needs. You also need to understand the underlying challenges driving the conversation.

In my role, I frequently work with clients facing complex data challenges, whether in real-time analytics, scaling operations, or AI applications. The first step is always understanding their business goals and technical pain points, which is more important than simply proposing the “best” solution. By stepping back and listening, you can not only better design a solution that addresses their needs but also gain their trust. I’ve found that the more I understand the context, the better I can guide clients through the complexities of data architecture—whether they're building on MongoDB Atlas, optimizing for performance, or leveraging our data products to drive innovation.

What I wish I’d known: Influence doesn’t come from showing how much you know—it comes from showing how much you understand. Listening is your most powerful design tool.

2. Building champions drives success

You can build the most scalable, secure, and elegant system in the world—but if it doesn’t align with stakeholder priorities, it will stall. In reality, architecture is rarely a purely technical exercise. Success depends on alignment with a diverse set of stakeholders, each with their own priorities. Whether you're collaborating with engineering teams, product managers, security specialists, or leadership, the key to success is to engage everyone early and often.

Stakeholders are not just passive recipients of your solution; they are active participants who co-own the outcome. In many cases, your design will be shaped by their feedback, and finding a champion within the organization can make all the difference. This champion—whether from the technical side or the business side—will help advocate for your solution internally, align the team, and overcome any resistance. This is particularly important for MongoDB SAs because we’re often addressing diverse needs, from data privacy concerns to performance scalability. Building a strong internal advocate ensures that your design gains the necessary momentum and credibility within the client’s organization.

What I wish I’d known: Success doesn’t come from being right—it comes from being aligned. Influence is earned through empathy, clarity, and trust.
As a solutions architect, your greatest value is not just in solving technical problems—it’s in helping diverse teams pull in the same direction. And nothing accelerates that more than having a strong, trusted internal champion on your side. 3. Winning deals requires teamwork At MongoDB, we’re not just selling a product—we’re selling a solution. Winning deals involves close collaboration with Sales, Engineering, and Client Services. The most successful deals come when the entire team is aligned, from understanding the customer’s unique needs to crafting a solution that fits their long-term goals. You want to win? Here’s what that actually looks like: You prep with sales like it’s a final exam. Know the account history, know the politics, know what was promised six months ago that never landed. Be the person who connects past pain to future value. You do dry runs and anticipate the tough questions. Then you hand those questions to someone else on your team who can knock them out of the park. That’s trust. You turn strategy decks into conversations . A flashy diagram is great, but asking “Does this actually solve the headache you told us about last week?” — that’s where momentum starts. You loop in Professional Services early to pressure-test feasibility. You loop in CSMs to ask, “If we win this, what does success look like a year from now?” You help sales write the follow-up  — not just with a thank-you, but with a crisp summary of what we heard, what we proposed, and what comes next. You make the path forward obvious. One of the most valuable lessons I’ve learned is that winning a deal doesn’t rely solely on delivering a flawless demo. It’s the little things that matter—anticipating questions, making quick adjustments based on client feedback, and being agile in your communication. Being part of a unified team that works seamlessly together is the key to winning deals and ensuring client success. What I wish I’d known: Winning a deal is a series of micro-decisions made together, not a solo act. Great architecture doesn’t close a deal—great alignment does. Your best asset isn’t the system you design—it’s the trust you build with your team and the confidence you project to your client that we’ve got this. Together. 4. You don’t have to know everything When I first transitioned into this role, I felt the pressure to master every piece of the tech stack—especially at MongoDB, where our solutions touch on everything from cloud data platforms to AI, real-time data processing, and beyond. It was overwhelming to think that I needed to be an expert in all of it. But here’s the truth: As a solutions architect, your real value lies not in knowing every detail, but in understanding how the pieces fit together. You don’t need to be the deepest expert in each technology—what’s important is knowing how MongoDB’s platform integrates with client needs and when to bring in the right specialists. The role is about connecting the dots, asking the right questions, and collaborating across teams. The more you embrace curiosity and rely on your colleagues, the better your solutions will be. What I wish I’d known: Mastery isn’t about knowing all the answers. It’s about knowing which questions to ask, and who to ask them to. Focus on principles, patterns, and clarity. Let go of the pressure to be the smartest person at the table—you’re there to make the table work better together. Curiosity is your compass, and collaboration is your fuel. 5. 
Architecture lives beyond the diagram When most people think of a solutions architect, they picture designing systems, building technical architectures, and drawing elegant diagrams. While that’s part of the job, the true value lies in how well those designs are communicated, understood, and adopted by the client. Specifically, your architecture needs to work in real-world scenarios. You’re not just drawing idealized diagrams on a whiteboard—you’re helping clients translate those ideas into actionable steps. That means clear communication, whether through shared documentation, interactive walkthroughs, or concise explanations. Understanding your client’s needs and constraints is just as important as the technical design itself. And when it comes to sizing and scaling, MongoDB’s flexibility makes it easy to adapt and grow as the business evolves. What I wish I knew: Architecture doesn’t end at the diagram—it begins there. The real value is realized in how well the design is communicated, contextualized, sized, and adopted. Use whatever format helps people get it. And before you document the system, understand the system of people and infrastructure you’re building it for. 6. It’s not just about data Data may be the foundation of my work as a solutions architect, but the real magic happens when you connect with people. Being a great architect means being a great communicator, listener, and facilitator. You’ll frequently find yourself between business leaders seeking faster insights and developers looking for the right data model. Translating these needs and building consensus is a big part of the role. The solutions we design are only valuable if they meet the diverse needs of the client’s teams. Whether it’s simplifying data operations, optimizing query performance, or enabling AI-driven insights, your ability to connect with stakeholders and address their unique challenges is key. Emotional intelligence, empathy, and collaboration are essential. What I wish I’d known: Being a great architect means being a great communicator, listener, and facilitator. Emotional intelligence is your secret weapon. The more time you invest in understanding your stakeholders’ pain points, motivations, and language, the more successful your architecture will be—because people will actually use it. 7. The job is constantly evolving and so are you The field of data architecture is rapidly evolving, and MongoDB is at the forefront of this change. From cloud migrations to AI-driven data products, the technology landscape is always shifting. As a solutions architect, you have to be adaptable and prepared for the next big change. At MongoDB, we work with cutting-edge technologies and constantly adapt to new trends, whether it’s AI, machine learning, or serverless computing. The key is to embrace change and continuously learn. The more you stay curious and open to new ideas, the more you’ll grow in your role and your ability to drive client success. As MongoDB continues to innovate, the learning curve is steep, but that’s what keeps the job exciting. What I wish I knew: You don’t “arrive” as a solutions architect—you evolve. And that evolution doesn’t stop. But everything you learn builds on itself. No effort is wasted. Every challenge adds depth. Every mistake adds clarity. The technologies may change, but the thinking compounds—and that’s what makes you valuable over the long run. 
It’s not just a role–it’s a journey Reflecting on my path to becoming a solutions architect at MongoDB, I realize that the journey is far from linear. From network protocols to financial systems and AI-driven data solutions, each role added a new layer to my experience. Becoming a solutions architect didn’t mean leaving behind my past—it meant integrating it into a broader perspective. At MongoDB, every day brings new challenges and opportunities. Whether you’re designing a solution for a global enterprise or helping a startup scale their data operations, the core of the job remains the same: solving problems, connecting people, and helping others succeed. And as you grow in the role, you’ll find that the most powerful thing you bring to the table isn’t just your expertise—it’s your ability to keep learning, to show up with intention, and to simplify complexity for everyone around you. To anyone stepping into this role at MongoDB: welcome. The journey is just beginning! Join our talent community for the latest MongoDB culture and careers content.

June 5, 2025

Navigating the AI Revolution: The Importance of Adaptation

In 1999, Steve Ballmer gave a famous speech in which he said that the “key to industry transformation, the key to success is developers developers developers developers developers developers developers, developers developers developers developers developers developers developers! Yes!” A similar mantra applies when discussing how to succeed with AI: adaptation, adaptation, adaptation! Artificial intelligence has already begun to transform how we work and live, and the changes AI is bringing to the world will only accelerate. Businesses rely ever more heavily on software to run and execute their strategies. So, to keep up with competitors, their processes and products must deliver what end-users increasingly expect: speed, ease of use, personalization—and, of course, AI features. Delivering all of these things (and doing so well) requires having the right tech stack and software foundation in place and then successfully executing. To better understand the challenges organizations adopting AI face, MongoDB and Capgemini recently worked with the research organization TDWI to assess the state of AI readiness across industries. The road ahead Based on a survey “representing a diverse mix of industries and company sizes,” TDWI’s “The State of Data and Operational Readiness for AI ” contains lots of super interesting findings. One I found particularly compelling is the percentage of companies with AI apps in production: businesses largely recognize the potential AI holds, but only 11% of survey respondents indicated that they had AI applications in production. Still only 11%! We’re well past the days of exploring whether AI is relevant. Now, every organization sees the value. The question is no longer ‘if’ but ‘how fast and how effectively’ they can scale it. Mark Oost, VP, AI and Generative AI Group Offer Leader, Capgemini There’s clearly work to be done; data readiness challenges highlighted in the report include managing diverse data types, ensuring accessibility, and providing sufficient compute power. Less than half (39%) of companies surveyed manage newer data formats, and only 41% feel they have enough compute. The report also shows how much AI has changed the very definition of software, and how software is developed and managed. Specifically, AI applications continuously adapt, and they learn and respond to end-user behavior in real-time; they can also autonomously make decisions and execute tasks. All of which depends on having a solid, flexible software foundation. Because the agility and adaptability of software are intrinsically linked to the data infrastructure upon which it's built, rigid legacy systems cannot keep pace with the demands of AI-driven change. So modern database solutions (like, ahem, MongoDB)—built with change in mind—are an essential part of a successful AI technology stack. Keeping up with change The tech stack can be said to comprise three layers: at the “top,” the interface or user experience layer; then the business logic layer; and a data foundation at the bottom. With AI, the same layers are there, but they’ve evolved: Unlike traditional software applications, AI applications are dynamic . Because AI-enriched software can reason and learn, the demands placed on the stack have changed. For example, AI-powered experiences include natural language interfaces, augmented reality, and those that anticipate user needs by learning from other interactions (and from data). 
In contrast, traditional software is largely static: it requires inputs or events to execute tasks, and its logic is limited by pre-defined rules. A database underpinning AI software must, therefore, be flexible and adaptable, and able to handle all types of data; it must enable high-quality data retrieval; it must respond instantly to new information; and it has to deliver the core requirements of all data solutions: security, resilience, scalability, and performance. So, to take action and generate trustworthy, reliable responses, AI-powered software needs access to up-to-date, context-rich data. Without the right data foundation in place, even the most robust AI strategy will fail. Figure 1. The frequency of change across eras of technology. Keeping up with AI can be head-spinning, both because of the many players in the space (the number of AI startups has jumped sharply since 2022, when ChatGPT was first released 1 ), and because of the accelerating pace of AI capabilities. Organizations that want to stay ahead must evolve faster than ever. As the figure above dramatically illustrates, this sort of adaptability is essential for survival. Execution, execution, execution But AI success requires more than just the right technology: expert execution is critical. Put another way, the difference between success and failure when adapting to any paradigm shift isn’t just having the right tools; it’s knowing how to wield those tools. So, while others experiment, MongoDB has been delivering real-world successes, helping organizations modernize their architectures for the AI era, and building AI applications with speed and confidence. For example, MongoDB teamed up with the Swiss bank Lombard Odier to modernize its banking tech systems. We worked with the bank to create customizable generative AI tooling, including scripts and prompts tailored for the bank’s unique tech stack, which accelerated its modernization by automating integration testing and code generation for seamless deployment. And, after Victoria’s Secret transformed its database architecture with MongoDB Atlas , the company used MongoDB Atlas Vector Search to power an AI-powered visual search system that makes targeted recommendations and helps customers find products. Another way MongoDB helps organizations succeed with AI is by offering access to both technology partners and professional services expertise. For example, MongoDB has integrations with companies across the AI landscape—including leading tech companies (AWS, Google Cloud, Microsoft), system integrators (Capgemini), and innovators like Anthropic, LangChain, and Together AI. Adapt (or else) In the AI era, what organizations need to do is abundantly clear: modernize and adapt, or risk being left behind. Just look at the history of smartphones, which have had an outsized impact on business and communication. For example, in its Q4 2007 report (which came out a few months after the first iPhone’s release), Apple reported earnings of $6.22 billion, of which iPhone sales comprised less than 2% 2 ; in Q1 2025, the company reported earnings of $124.3 billion, of which 56% was iPhone sales. 3 The mobile application market is now estimated to be in the hundreds of billions of dollars, and there are more smartphones than there are people in the world. 4 The rise of smartphones has also led to a huge increase in the number of people globally who use the internet. 5 However, saying “you need to adapt!” is much easier said than done. 
TDWI's research, therefore, is both important and useful—it offers companies a roadmap for the future, and helps them answer their most pressing questions as they confront the rise of AI. Click here to read the full TDWI report. To learn more about how MongoDB can help you create transformative, AI-powered experiences, check out MongoDB for Artificial Intelligence. P.S. ICYMI, here's Steve Ballmer's famous "developers!" speech. 1 https://ourworldindata.org/grapher/newly-funded-artificial-intelligence-companies 2 https://www.apple.com/newsroom/2007/10/22Apple-Reports-Fourth-Quarter-Results/ 3 https://www.apple.com/newsroom/pdfs/fy2025-q1/FY25_Q1_Consolidated_Financial_Statements.pdf 4 https://www.weforum.org/stories/2023/04/charted-there-are-more-phones-than-people-in-the-world/ 5 https://ourworldindata.org/grapher/number-of-internet-users

June 4, 2025

Luna AI and MongoDB Throw Lifeline to Product Teams

Product and engineering leaders face a constant battle: making crucial real-time decisions amidst a sea of fragmented, reactive, and disconnected progress data. The old ways—chasing updates, endlessly pinging teams on Slack, digging through Jira, and enduring endless status meetings—simply aren't cutting it. This struggle leaves product and engineering leads wasting precious hours on manual updates, while critical risks silently slip through the cracks. This crucial challenge is precisely what Luna AI , powered by its robust partnership with MongoDB , is designed to overcome. Introducing Luna AI: Your intelligent program manager Luna AI was founded to tackle this exact problem, empowering product and engineering leaders with the visibility and context they need, without burying their PMs in busy work. Imagine having an AI program manager dedicated to giving you clear insights into goals, roadmap ROI, initiative progress, and potential risks throughout the entire product lifecycle. Luna AI makes this a reality by intelligently summarizing data from your existing tools like Jira and Slack. It can even automatically generate launch and objective and key result (OKR) status updates, create your roadmap, and analyze your Jira sprints, drastically reducing the need for manual busywork. From concept to command center: The evolution of Luna AI Luna AI’s Co-founder, Paul Debahy, a seasoned product leader with experience at Google, personally felt the pain of fragmented data during his time as a CPO. Inspired by Google's internal LaunchCal, which provided visibility into upcoming launches, Luna AI initially began as a launch management tool. However, a key realization quickly emerged: Customers primarily needed help "managing up." This insight led to a pivotal shift, focusing Luna AI on vertical management—communicating status, linking execution to strategy, and empowering leaders, especially product leaders, to drive decisions. Today, Luna AI has evolved into a sophisticated AI-driven insights platform. Deep Jira integration and advanced LLM modules have transformed it from a simple tracker into a strategic visibility layer. Luna AI now provides essential capabilities like OKR tracking, risk detection, resource and cost analysis, and smart status summaries. Luna AI believes product leadership is increasingly strategic, aiming to be the system of record for outcomes, not just tasks. Its mission: to be everyone’s AI program manager, delivering critical strategy and execution insights for smarter decision-making. The power under the hood: Building with MongoDB Atlas Luna AI’s robust technology stack includes Node.js, Angular, and the latest AI/LLM models. Its infrastructure leverages Google Cloud and, crucially, MongoDB Atlas as its primary database. When selecting a data platform, Luna AI prioritized flexibility, rapid iteration, scalability, and security. Given the dynamic, semi-structured data ingested from diverse sources like Jira, Slack, and even meeting notes, a platform that could handle this complexity was essential. Key requirements included seamless tenant separation, robust encryption, and minimal operational overhead. MongoDB proved to be the perfect fit for several reasons. The developer-friendly experience was a major factor, as was the flexible schema of its document database, which naturally accommodated Luna AI’s complex and evolving data model. 
This flexibility was vital for tracking diverse information such as Jira issues, OKRs, AI summaries, and Slack insights, enabling quick adaptation and iteration. MongoDB also offered effortless support for the startup’s multi-tenant architecture. Scaling with MongoDB Atlas has been smooth and fast, according to Luna AI. Atlas effortlessly scaled as the company added features and onboarded workspaces ranging from startups to enterprises. The monitoring dashboard has been invaluable, offering insights that helped identify performance bottlenecks early. In fact, index suggestions from the dashboard directly led to significant improvements to speed. Debahy even remarked, "Atlas’s built-in insights make it feel like we have a DB ops engineer on the team." Luna AI relies heavily on Atlas's global clusters and automated scaling . The monitoring and alerting features provide crucial peace of mind, especially during launches or data-intensive tasks like Jira AI epic and sprint summarization. The monitoring dashboard was instrumental in resolving high-latency collections by recommending the right indexes. Furthermore, in-house backups are simple, fast, and reliable, with painless restores offering peace of mind. Migrating from serverless to dedicated instances was seamless and downtime-free. Dedicated multi-tenant support allows for unlimited, isolated databases per customer. Auto-scaling is plug-and-play, with Atlas handling scaling across all environments. Security features like data-at-rest encryption and easy access restriction management per environment are also vital benefits. The support team has consistently been quick, responsive, and proactive. A game-changer for startups: The MongoDB for Startups program Operating on a tight budget as a bootstrapped and angel-funded startup, Luna AI found the MongoDB for Startups program to be a true game changer. It stands out as one of the most founder-friendly programs the company has encountered. The Atlas credits completely covered the database costs, empowering the team to test, experiment, and even make mistakes without financial pressure. This freedom allowed them to scale without worrying about database expenses or meticulously tracking every compute and resource expenditure. Access to technical advisors and support was equally crucial, helping Luna AI swiftly resolve issues ranging from load management to architectural decisions and aiding in designing a robust data model from the outset. The program also opened doors to a valuable startup community, fostering connections and feedback. Luna AI’s vision: The future of product leadership Looking ahead, Luna AI is focused on two key areas: Building a smarter, more contextual insights layer for strategy and execution. Creating a stakeholder visibility layer that requires no busy work from product managers. Upcoming improvements include predictive risk alerts spanning Jira, Slack, and meeting notes. They are also developing ROI-based roadmap planning and prioritization, smart AI executive status updates, deeper OKR traceability, and ROI-driven tradeoff analysis. Luna AI firmly believes that the role of product leadership is becoming increasingly strategic. With the support of programs like MongoDB for Startups, they are excited to build a future where Luna AI is the definitive system of record for outcomes. Ready to empower your product team? Discover how Luna AI helps product teams thrive. Join the MongoDB for Startups program to start building faster and scaling further with MongoDB!

June 3, 2025

Conformance Checking at MongoDB: Testing That Our Code Matches Our TLA+ Specs

Some features mentioned below have been sunset since this paper was originally written. Visit our docs to learn more.

At MongoDB, we design a lot of distributed algorithms—algorithms with lots of concurrency and complexity, and dire consequences for mistakes. We formally specify some of the scariest algorithms in TLA+, to check that they behave correctly in every scenario. But how do we know that our implementations conform to our specs? And how do we keep them in sync as the implementation evolves? This problem is called conformance checking. In 2020, my colleagues and I experimented with two MongoDB products, to see if we could test their fidelity to our TLA+ specs. Here's a video of my presentation on this topic at the VLDB conference. (It'll be obvious to you that I recorded it from my New York apartment in deep Covid lockdown.) Below, I write about our experience with conformance checking from 2025's perspective. I'll tell you what worked for us in 2020 and what didn't, and what developments there have been in the field in the five years since our paper.

Agile modelling

Our conformance-checking project was born when I read a paper from 2011—"Concurrent Development of Model and Implementation"—which described a software methodology called eXtreme Modelling. The authors argued that there's a better way to use languages like TLA+, and I was convinced. They advocated a combination of agile development and rigorous formal specification:

Multiple specifications model aspects of the system.
Specifications are written just prior to the implementation.
Specifications evolve with the implementation.
Tests are generated from the model, and/or trace-checking verifies that test traces are legal in the specification.

I was excited about this vision. Too often, an engineer tries to write one huge TLA+ spec for the whole system. It's too complex and detailed, so it's not much easier to understand than the implementation code, and state-space explosion dooms model checking. The author abandons the spec and concludes that TLA+ is impractical. In the eXtreme Modelling style, a big system is modeled by a collection of small specs, each focusing on an aspect of the whole. This was the direction MongoDB was already going, and it seemed right to me. In eXtreme Modelling, the conformance of the spec and implementation is continuously tested. The authors propose two conformance checking techniques.

To understand these, let's consider what a TLA+ spec is: it's a description of an algorithm as a state machine. The state machine has a set of variables, and each state is an assignment of specific values to those variables. The state machine also has a set of allowed actions, which are transitions from one state to the next state. You can make a state graph by drawing states as nodes and allowed actions as edges. A behavior is any path through the graph. This diagram shows the whole state graph for some very simple imaginary spec. One of the spec's behaviors is highlighted in green.

Figure 1. A formal spec's state graph, with one behavior highlighted.

The spec has a set of behaviors B_spec, and the implementation has a set of behaviors B_impl. An implementation refines a spec if B_impl ⊂ B_spec. If the converse is also true, if B_spec ⊂ B_impl, then this is called bisimulation, and it's a nice property to have, though not always necessary for a correctly implemented system.
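To make the refinement idea concrete, here is a toy sketch in plain JavaScript (nothing to do with TLA+ tooling or MongoDB code, and it assumes a finite, acyclic state graph): it enumerates the behaviors of a tiny "spec" graph and a tiny "implementation" graph, and checks that every implementation behavior is also a spec behavior.

// Toy state graph: nodes are states, edges are allowed actions.
// A behavior is a path from the initial state to a terminal state.
const specGraph = {
  init: ["a", "b"],
  a: ["done"],
  b: ["done"],
  done: [],
};

// Enumerate every behavior (path) in a finite, acyclic state graph.
function behaviors(graph, state = "init", path = [state]) {
  const next = graph[state];
  if (next.length === 0) return [path.join(" -> ")];
  return next.flatMap((s) => behaviors(graph, s, [...path, s]));
}

// An "implementation" graph that only exercises the a branch.
const implGraph = { init: ["a"], a: ["done"], done: [] };

const bSpec = new Set(behaviors(specGraph));
const bImpl = new Set(behaviors(implGraph));

// Refinement: every implementation behavior is also a spec behavior.
const refines = [...bImpl].every((b) => bSpec.has(b));
console.log(refines); // true: this toy implementation refines the toy spec

Real specs and real programs are neither this small nor this deterministic, which is exactly why the two techniques described next exist: they check the same subset relation, in each direction, against a production codebase instead of a toy graph.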
You can test each direction:

Test-case generation: For every behavior in B_spec, generate a test case that forces the implementation to follow the same sequence of transitions. If there's a spec behavior the implementation can't follow, then B_spec ⊄ B_impl, and the test fails.
Trace-checking: For every behavior in B_impl, generate a trace: a log file that records the implementation's state transitions, including all implementation variables that match spec variables. If the behavior recorded in the trace isn't allowed by the spec, then B_impl ⊄ B_spec and the test fails.

Figure 2. Two ways to test that the spec's behaviors are the same as the implementation's. Non-conforming behaviors are highlighted in red.

Both techniques can be hard, of course. For test-case generation, you must somehow control every decision the implementation makes, squash all nondeterminism, and force it to follow a specific behavior. If the spec's state space is huge, you have to generate a huge number of tests, or choose an incomplete sample. Trace-checking, on the other hand, requires you to somehow map the implementation's state back to the spec's, and log a snapshot of the system state each time it changes—this is really hard with multithreaded programs and distributed systems. And you need to make the implementation explore a variety of behaviors, via fault-injection and stress-testing, and so on. Completeness is usually impossible. We found academic papers that demonstrated both techniques on little example applications, but we hadn't seen them tried on production-scale systems like ours. I wanted to see how well they work, and what it would take to make them practical. I recruited my colleagues Judah Schvimer and Max Hirschhorn to try it with me. Judah and I tried trace-checking the MongoDB server (in the next section), and Max tried test-case generation with the MongoDB Mobile SDK (the remainder of this article).

Figure 3. We tried two conformance checking techniques on two MongoDB products.

Trace-checking the MongoDB server

For the trace-checking experiment, the first step Judah and I took was to choose a TLA+ spec. MongoDB engineers had already written and model-checked a handful of specs that model different aspects of the MongoDB server (see this presentation and this one). We chose RaftMongo.tla, which focuses on how servers learn the commit point, which I'll explain now. MongoDB is typically deployed as a replica set of cooperating servers, usually three of them. They achieve consensus with a Raft-like protocol. First, they elect one server as the leader. Clients send all writes to the leader, which appends them to its log along with a monotonically increasing logical timestamp. Followers replicate the leader's log asynchronously, and they tell the leader how up-to-date they are. The leader keeps track of the commit point—the logical timestamp of the newest majority-replicated write. All writes up to and including the commit point are committed; all the writes after it are not. The commit point must be correctly tracked even when leaders and followers crash, messages are lost, a new leader is elected, uncommitted writes are rolled back, and so on. RaftMongo.tla models this protocol, and it checks two invariants: a safety property, which says that no committed write is ever lost, and a liveness property, which says that all servers eventually learn the newest commit point.

Figure 4. MongoDB replica set servers and their logs.
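The commit point rule itself is simple enough to sketch in a few lines. The following is a toy illustration only, in plain JavaScript rather than MongoDB's replication code: given the newest logical timestamp each node has replicated, the commit point is the highest timestamp that a majority of nodes have reached.

// Toy illustration of the commit point rule described above; not MongoDB's
// actual replication code. Each entry is the newest logical timestamp a node
// has replicated.
function commitPoint(replicatedTimestamps) {
  const sorted = [...replicatedTimestamps].sort((a, b) => b - a); // descending
  const majority = Math.floor(replicatedTimestamps.length / 2) + 1;
  // The majority-th highest timestamp is the newest one a majority has reached.
  return sorted[majority - 1];
}

// Leader at timestamp 7, one follower at 5, one lagging follower at 3:
// two of three nodes have reached 5, so writes through 5 are committed.
console.log(commitPoint([7, 5, 3])); // 5

The hard part, which the spec and the trace-checking below are concerned with, is keeping that value correct across crashes, elections, lost messages, and rollbacks.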
Judah and I wanted to test that MongoDB's C++ implementation matched our TLA+ spec, using trace-checking. Here are the steps: Run randomized tests of the implementation. Collect execution traces. Translate the execution traces into TLA+. Check the trace is permitted by the spec. Figure 5. The trace-checking workflow. The MongoDB server team has hundreds of integration tests handwritten in JavaScript, from which we chose about 300 for this experiment. We also have randomized tests; we chose one called the "rollback fuzzer" which does random CRUD operations while randomly creating and healing network partitions, causing uncommitted writes to be logged and rolled back. We added tracing code to the MongoDB server and ran each test with a three-node replica set. Since all server processes ran on one machine and communicated over localhost, we didn't worry about clock synchronization: we just merged the three logs, sorting by timestamp. We wrote a Python script to read the combined log and convert it into a giant TLA+ spec named Trace.tla with a sequence of states for the whole three-server system. Trace.tla asserted only one property: "This behavior conforms to RaftMongo.tla." Here's some more detail about the Python script. At each moment during the test, the system has some state V, which is the values of the state variables for each node. The script tries to reconstruct all the changes to V and record them in Trace.tla. It begins by setting V to a hardcoded initial state V0, and outputs it as the first state of the sequence: \* Each TLA+ tuple is \* <<action, committedEntries, currentTerm, log, role, commitPoint, \* serverLogLocation>> \* We know the first state: all nodes are followers with empty logs. Trace == << <<"Init", \* action name <<"Follower","Follower","Follower">>, \* role per node <<1, 1, 1>>, \* commitPoint per node <<<<...>>,<<...>>,<<...>>>>, \* log per node "">>, \* trace log location (empty) \* ... more states will follow ... The script reads events from the combined log and updates V. Here's an example where Node 1 was the leader in state Vi, then Node 2 logs that it became leader. The script combines these to produce Vi+1 where Node 2 is the leader and Node 1 is now a follower. Note, this is a lie. Node 1 didn't actually become a follower in the same instant Node 2 became leader. Foreshadowing! This will be a problem for Judah and me. Figure 6. Constructing the next state from a trace event. Anyway, the Python script appends a state to the sequence in Trace.tla: Trace == << \* ... thousands of events ... <<"BecomePrimary", \* action name for debugging <<"Follower","Leader","Follower">>, \* role per node <<1, 1, 1>>, \* commitPoint per node <<<<...>>,<<...>>,<<...>>>>, \* log per node \* trace log location, for debugging: "/home/emptysquare/RollbackFuzzer/node2.log:12345">>, \* ... thousands more events ... >> We used the Python script to generate a Trace.tla file for each of the hundreds of tests we'd selected: handwritten JavaScript tests and the randomized "rollback fuzzer" test. Now we wanted to use the model-checker to check that this state sequence was permitted by our TLA+ spec, so we know our C++ code behaved in a way that conforms to the spec. Following a technique published by Ron Pressler , we added these lines to each Trace.tla: VARIABLES log, role, commitPoint \* Instantiate our hand-written spec, RaftMongo.tla. Model == INSTANCE RaftMongo VARIABLE i \* the trace index \* Load one trace event. 
Read == /\ log = Trace[i][4] /\ role = Trace[i][5] /\ commitPoint = Trace[i][6] ReadNext == /\ log' = Trace[i'][4] /\ role' = Trace[i'][5] /\ commitPoint' = Trace[i'][6] Init == i = 1 /\ Read Next == \/ i < Len(Trace) /\ i' = i + 1 /\ ReadNext \/ UNCHANGED <<i, vars>> \* So that we don’t get a deadlock error in TLC TraceBehavior == Init /\ [][Next]_<<vars, i>> \* To verify, we check the spec TraceBehavior in TLC, with Model!SpecBehavior \* as a temporal property. We run the standard TLA+ model-checker ("TLC"), which tells us if this trace is an allowed behavior in RaftMongo.tla. But this whole experiment failed. Our traces never matched our specification. We didn't reach our goal, but we learned three lessons that could help future engineers. What disappointment taught us Lesson one: It's hard to snapshot a multithreaded program's state. Each time a MongoDB node executes a state transition, it has to snapshot its state variables in order to log them. MongoDB is highly concurrent with fairly complex locking within each process—it was built to avoid global locking. It took us a month to figure out how to instrument MongoDB to get a consistent snapshot of all these values at one moment. We burned most of our budget for the experiment, and we worried we'd changed MongoDB too much (on a branch) to test it realistically. The 2024 paper "Validating Traces of Distributed Programs Against TLA+ Specifications" describes how to do trace-checking when you can only log some of the values (see my summary at the bottom of this page). We were aware of this option back in 2020, and we worried it would make trace-checking too permissive; it wouldn't catch every bug. Lesson two: The implementation must actually conform to the spec. This is obvious to me now. After all, conformance checking was the point of the project. In our real-life implementation, when an old leader votes for a new one, first the old leader steps down, then the new leader steps up. The spec we chose for trace-checking wasn't focused on the election protocol, though, so for simplicity, the spec assumed these two actions happened at once. (Remember I said a few paragraphs ago, "This is a lie"?) Judah and I knew about this discrepancy—we'd deliberately made this simplification in the spec. We tried to paper over the difference with some post-processing in our Python script, but it never worked. By the end of the project, we decided we should have backtracked, making our spec much more complex and realistic, but we'd run out of time. The eXtreme Modelling methodology says we should write the spec just before the implementation. But our spec was written long after most of the implementation, and it was highly abstract. I can imagine another world where we knew about eXtreme Modelling and TLA+ at the start, when we began coding MongoDB. In that world, we wrote our spec before the implementation, with trace-checking in mind. The spec and implementation would've been structured similarly, and this would all have been much easier. Lesson three: Trace-checking should extend easily to multiple specs. Judah and I put in 10 weeks of effort without successfully trace-checking one spec, and most of the work was specific to that spec, RaftMongo.tla. Sure, we learned general lessons (you're reading some of them) and wrote some general code, but even if we'd gotten trace-checking to work for one spec we'd be practically starting over with the next spec. 
Our original vision was to gather execution traces from all our tests, and trace-check them against all of our specifications, on every git commit. We estimated that the marginal cost of implementing trace-checking for more specs wasn't worth the marginal value, so we stopped the project. Practical trace-checking If we started again, we'd do it differently. We'd ensure the spec and implementation conform at the start, and we'd fix discrepancies by fixing the spec or the implementation right away. We'd model easily observed events like network messages, to avoid snapshotting the internal state of a multithreaded process. I still think trace-checking is worthwhile. I know it's worked for other projects. In fact MongoDB is sponsoring a grad student Finn Hackett , whom I'm mentoring, to continue trace-checking research. Let's move on to the second half of our project. Test-case generation for MongoDB Mobile SDK The MongoDB Mobile SDK is a database for mobile devices that syncs with a central server (since we wrote the paper, MongoDB has sunsetted the product ). Mobile clients can make changes locally. These changes are periodically uploaded to the server and downloaded by other clients. The clients and the server all use the same algorithm to resolve write conflicts: Operational Transformation , or OT. Max wanted to test that the clients and server implement OT correctly, meaning they resolve conflicts the same way, eventually resulting in identical data everywhere. Originally, the clients and server shared one C++ implementation of OT, so we knew they implemented the same algorithm. But in 2020, we'd recently rewritten the server in Go, so testing their conformance became urgent. Figure 7. MongoDB mobile SDK. My colleague Max Hirschhorn used test-case generation to check conformance. This technique goes in the opposite direction from trace-checking: trace-checking starts with an implementation and checks that its behaviors are allowed by the spec, but test-case generation starts with a spec and checks that its behaviors are in the implementation. But first, we needed a TLA+ spec. Before this project, the mobile team had written out the OT algorithm in English and implemented it in C++. Max manually translated the algorithm from C++ to TLA+. In the mobile SDK, clients can do 19 kinds of operations on data; six of these can be performed on arrays, resulting in 21 array merge rules, which are implemented in about 1000 lines of C++. Those 21 rules are the most complex, and Max focused his specification there. He used the model-checker to verify that his TLA+ spec ensured all participants eventually had the same data. This translation was a gruelling job, but the model-checker caught Max's mistakes quickly, and he finished in two weeks. There was one kind of write conflict that crashed the model-checker: if one participant swapped two array elements, and another moved an element, then the model-checker crashed with a Java StackOverflowError. Surprisingly, this was an actual infinite-recursion bug in the algorithm. Max verified that the bug was in the C++ code. It had hidden there until he faithfully transcribed it into TLA+ and discovered it with the model-checker. He disabled the element-swap operation in his TLA+ spec, and the mobile team deprecated it in their implementation. To test conformance, Max used the model-checker to output the entire state graph for the spec. 
He constrained the algorithm to three participants, all editing a three-element array, each executing one (possibly conflicting) write operation. With these constraints, the state space is a DAG, with a finite number of behaviors (paths from an initial state to a final state). There are 30,184 states and 4913 behaviors. Max wrote a Go program to parse the model-checker's output and write out a C++ unit test for each behavior. Here’s an example unit test. (It's edited down from three participants to two.) At the start, there's an array containing {1, 2, 3}. One client sets the third element of an array to 4 and the second client removes the second element from the array. The test asserts that both clients agree the final array is {1, 4}. TEST(Transform_Array) { size_t num_clients = 2; TransformArrayFixture fixture{test_context, num_clients, {1, 2, 3}}; fixture.transaction(0, [](TableRef array) { array->set_int(0, 2, 4); }); fixture.transaction(1, [](TableRef array) { array->remove(1); }); fixture.sync_all_clients(); fixture.check_array({1, 4}); fixture.check_ops(0, {ArrayErase{1}}); fixture.check_ops(1, {ArraySet{1, 4}}); } These 4913 tests immediately achieved 100% branch coverage of the implementation, which we hadn't accomplished with our handwritten tests (21%) or millions of executions with the AFL fuzzer (92%). Retrospective Max's test-case generation worked quite well. He discovered a bug in the algorithm, and he thoroughly checked that the mobile SDK's Operational Transformation code conforms to the spec. Judah's and my trace-checking experiment didn't work: our spec and code were too far apart, and adding tracing to MongoDB took too long. Both techniques can work, given the right circumstances and strategy. Both techniques can fail, too! We published our results and lessons as a paper in VLDB 2020, titled " eXtreme Modelling in Practice ." In the subsequent five years, I've seen some progress in conformance checking techniques. Test-case generation: Model Checking Guided Testing for Distributed Systems . The "Mocket" system generates tests from a TLA+ spec, and instruments Java code (with a fair amount of human labor) to force it to deterministically follow each test, and check that its variables have the same values as the spec after each action. The authors tested the conformance of three Java distributed systems and found some new bugs. Their technique is Java-specific but could be adapted for other languages. Multi-Grained Specifications for Distributed System Model Checking and Verification . The authors wrote several new TLA+ specs of Zookeeper, at higher and lower levels of abstraction. They checked conformance between the most concrete specs and the implementation, with a technique similar to Mocket: a human programmer instruments some Java code to map Java variables to spec variables, and to make all interleavings deterministic. The model-checker randomly explores spec behaviors, while the test framework checks that the Java code can follow the same behaviors. SandTable: Scalable Distributed System Model Checking with Specification-Level State Exploration . This system is not language-specific: it overrides system calls to control nondeterminism and force the implementation to follow each behavior of the spec. It samples the spec's state space to maximize branch coverage and event diversity while minimizing the length of each behavior. 
As in the "Multi-Grained" paper, the SandTable authors wisely developed new TLA+ specs that closely matched the implementations they were testing, rather than trying to use existing, overly abstract specs like Judah and I did. Plus, my colleagues Will Schultz and Murat Demirbas are publishing a paper in VLDB 2025 that uses test-case generation with a new TLA+ spec of MongoDB's WiredTiger storage layer; the paper is titled "Design and Modular Verification of Distributed Transactions in MongoDB."

Trace-checking: Protocol Conformance with Choreographic PlusCal. The authors write new specs in an extremely high-level language that compiles to TLA+. From their specs they generate Go functions for trace-logging, which they manually add to existing Go programs. They check that the resulting traces are valid spec behaviors and find some bugs. Validating Traces of Distributed Programs Against TLA+ Specifications. Some veteran TLA+ experts demonstrate in detail how to trace-log from a Java program and validate the traces with TLC, the TLA+ model-checker. They've written small libraries and added TLC features for convenience. This paper focuses on validating incomplete traces: if you can only log some of the variables, TLC will infer the rest. Smart Casual Verification of the Confidential Consortium Framework. The authors started with an existing implementation of a secure consensus protocol. Their situation was like mine in 2020 (new specs of a big old C++ program) and so was their goal: to continuously check conformance and keep the spec and implementation in sync. Using the new TLC features announced in the "Validating Traces" paper above, they toiled for months, brought their specs and code into line, found some bugs, and realized the eXtreme Modelling vision. Finn Hackett, a PhD student I'm mentoring, has developed a TLA+-to-Go compiler. He's now prototyping a trace-checker to verify that the Go code he produces really conforms to its source spec. We're doing a summer project together with Antithesis to thoroughly conformance-check the implementation's state space.

I'm excited to see growing interest in conformance checking, because I think it's a serious problem that needs to be solved before TLA+ goes mainstream. The "Validating Traces" paper announced some new trace-checking features in TLC, and TLC's developers are discussing a better way to export a state graph for test-case generation. I hope these research prototypes lead to standard tools, so engineers can keep their code and specs in sync. Join our MongoDB Community to learn about upcoming events, hear stories from MongoDB users, and connect with community members from around the world.

June 2, 2025

Mongoose Now Natively Supports QE and CSFLE

Mongoose 8.15.0 has been released, which adds support for the industry-leading encryption solutions available from MongoDB. With this update, it’s simpler than ever to create documents leveraging MongoDB Queryable Encryption (QE) and Client-Side Field Level Encryption (CSFLE), keeping your data secure when it is in use. Read on to learn more about approaches to encrypting your data when building with MongoDB and Mongoose.

What is Mongoose?

Mongoose is a library that enables elegant object modeling for Node.js applications working with MongoDB. Similar to an Object-Relational Mapper (ORM), the Mongoose Object Document Mapper (ODM) simplifies programmatic data interaction through schemas and models. It allows developers to define data structures with validation and provides a rich API for CRUD operations, abstracting away many of the complexities of the underlying MongoDB driver. This integration enhances productivity by enabling developers to work with JavaScript objects instead of raw database queries, making it easier to manage data relationships and enforce data integrity.

What are QE and CSFLE?

Securing sensitive data is paramount. It must be protected at every stage—whether in transit, at rest, or in use. However, implementing in-use encryption can be complex. MongoDB offers two approaches to make it easier: Queryable Encryption (QE) and Client-Side Field Level Encryption (CSFLE). QE allows customers to encrypt sensitive application data, store it securely in an encrypted state in the MongoDB database, and perform equality and range queries directly on the encrypted data. An industry-first innovation, QE eliminates the need for costly custom encryption solutions, complex third-party tools, or specialized cryptography knowledge. It employs a unique structured encryption schema, developed by the MongoDB Cryptography Research Group, that simplifies the encryption of sensitive data while enabling equality and range queries to be performed directly on data without having to decrypt it. The data remains encrypted at all stages, with decryption occurring only on the client side. This architecture enforces strict access controls, where MongoDB and even an organization’s own database administrators (DBAs) don’t have access to sensitive data. This design enhances security by keeping the server unaware of the data it processes, further mitigating the risk of exposure and minimizing the potential for unauthorized access.

Adding QE/CSFLE auto-encryption support for Mongoose

The primary goal of the Mongoose integration with QE and CSFLE is to provide idiomatic support for automatic encryption, simplifying the process of creating encrypted models. With native support for QE and CSFLE, Mongoose allows developers to define encryption options directly within their schemas without the need for separate configurations. This first-class API enables developers to work within Mongoose without dropping down to the driver level, minimizing the need for significant code changes when adopting QE and CSFLE. Mongoose streamlines configuration by automatically generating the encrypted field map. This ensures that encrypted fields align perfectly with the schema and simplifies the three-step process typically associated with encryption setup, shown below. Mongoose also keeps the schema and encrypted fields in sync, reducing the risk of mismatches.
Developers can easily declare fields with the encrypt property and configure encryption settings, using all field types and encryption schemes supported by QE and CSFLE. Additionally, users can manage their own encryption keys, enhancing control over their encryption processes. This comprehensive approach empowers developers to implement robust encryption effortlessly while maintaining operational efficiency.

Pre-integration experience

// Assumes Schema, createConnection (from mongoose), and EJSON have already been imported.
const kmsProviders = { local: { key: Buffer.alloc(96) } };
const keyVaultNamespace = 'data.keys';
const extraOptions = {};
const encryptedDatabaseName = 'encrypted';
const uri = '<mongodb URI>';
const encryptedFieldsMap = {
  'encrypted.patent': {
    encryptedFields: EJSON.parse('<EJSON string containing encrypted fields, either output from manual creation or createEncryptedCollection>', { relaxed: false }),
  }
};
const autoEncryptionOptions = {
  keyVaultNamespace,
  kmsProviders,
  extraOptions,
  encryptedFieldsMap
};
const schema = new Schema({
  patientName: String,
  patientId: Number,
  field: String,
  patientRecord: {
    ssn: String,
    billing: String
  }
}, { collection: 'patent' });
const connection = await createConnection(uri, {
  dbName: encryptedDatabaseName,
  autoEncryption: autoEncryptionOptions,
  autoCreate: false, // If using createEncryptedCollection, this is false. If manually creating the keyIds for each field, this is true.
}).asPromise();
const PatentModel = connection.model('Patent', schema);
const result = await PatentModel.find({}).exec();
console.log(result);

This example demonstrates the manual configuration required to set up a Mongoose model for QE and CSFLE, requiring three different steps to:

Define an encryptedFieldsMap to specify which fields to encrypt
Configure autoEncryptionOptions with key management settings
Create a Mongoose connection that incorporates these options

This process can be cumbersome, as it requires explicit setup for encryption.

New experience with Mongoose 8.15.0

const schema = new Schema({
  patientName: String,
  patientId: Number,
  field: String,
  patientRecord: {
    ssn: { type: String, encrypt: { keyId: '<uuid string of key id>', queries: 'equality' } },
    billing: { type: String, encrypt: { keyId: '<uuid string of key id>', queries: 'equality' } },
  }
}, { encryptionType: 'queryableEncryption', collection: 'patent' });
const connection = mongoose.createConnection();
const PatentModel = connection.model('Patent', schema);
const keyVaultNamespace = 'client.encryption';
const kmsProviders = { local: { key: Buffer.alloc(96) } };
const uri = '<mongodb URI>';
const autoEncryptionOptions = {
  keyVaultNamespace,
  kmsProviders,
  extraOptions: {}
};
await connection.openUri(uri, { autoEncryption: autoEncryptionOptions });
const result = await PatentModel.find({}).exec();
console.log(result);

This "after experience" example showcases how the integration of QE and CSFLE into Mongoose simplifies the encryption setup process. Instead of the previous three-step approach, developers can now define encryption directly within the schema. In this implementation, fields like ssn and billing are marked with an encrypt property, allowing for straightforward configuration of encryption settings, including the keyId and query types. The connection to the database is established with a single call that includes the necessary auto-encryption options, eliminating the need for a separate encrypted fields map and complex configurations.
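Once the connection is open, reads and writes on encrypted fields look like ordinary Mongoose operations. The following is a hedged sketch rather than an official snippet: it assumes the keyId placeholders above have been replaced with a real data encryption key in the key vault, and the SSN value is just an illustrative placeholder.

// Equality query on an encrypted field. Because ssn was declared with
// queries: 'equality', the driver encrypts the query value client-side,
// matches it against the encrypted data on the server, and decrypts the
// result before it reaches application code.
const patient = await PatentModel.findOne({ 'patientRecord.ssn': '123-45-6789' }).exec();
console.log(patient?.patientName);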
This streamlined approach enables developers to work natively within Mongoose, enhancing usability and reducing setup complexity while maintaining robust encryption capabilities.

Learn more about QE/CSFLE for Mongoose

We’re excited for you to build secure applications with QE/CSFLE for Mongoose. Here are some resources to get started with:

Learn how to set up and use Mongoose with MongoDB through our tutorial.
Check out our documentation to learn when to choose QE vs. CSFLE.
Read the Mongoose CSFLE documentation.

June 2, 2025

MongoDB Atlas Stream Processing Now Supports Session Windows!

We're excited to announce that MongoDB Atlas Stream Processing now supports Session Windows ! This powerful feature lets you build streaming pipelines that analyze and process related events that occur together over time, grouping them into meaningful sessions based on periods of activity. For instance, you can now track all of a customer’s interactions during a shopping journey, treating it as a single session that ends when they’re inactive for a specified period of time. Whether you're analyzing user behavior, monitoring IoT device activities, or tracking system operations, Atlas Stream Processing’s Session Windows make it easy to transform your continuous data streams into actionable insight, and make the data available wherever you need to use it. What are Session Windows? Session Windows are a powerful way to analyze naturally occurring activity patterns in your data by grouping related events that happen close together in time. Think of how users interact with websites or apps—they tend to be active for a period, then take breaks, then return for another burst of activity. Session Windows automatically detect these patterns by identifying gaps in activity, allowing you to perform aggregations and transformations on these meaningful periods of activity. As an example, imagine you're an e-commerce company looking to better understand what your customers do during each browsing session to help improve conversions. With Atlas Stream Processing, you can build a pipeline that: Collects all the product pages a user visits during their browsing session Records the name, category, and price of each item viewed, plus whether items were added to a cart Automatically considers a session complete after 15 minutes of user inactivity Sends the session data to cloud storage to improve recommendation engines With this pipeline, you provide your recommendation engine with ready-to-use data about your user sessions to improve your recommendations in real time. Unlike fixed time-based windows ( tumbling or hopping ), Session Windows adapt dynamically to each user’s behavior patterns. How does it work? Session Windows work similarly to the hopping and tumbling windows Atlas Stream Processing already supports, but with a critical difference: while those windows open and close on fixed time intervals, Session Windows dynamically adjust based on activity patterns. To implement a Session Window, you specify three required components: partitionBy : This is the field or fields that group your records into separate sessions. For instance, if tracking user sessions, use unique user IDs to ensure each user’s activity is processed separately. gap : This is the period of inactivity that signals the end of a session. For instance, in the above example, we consider a user's session complete when they go 15 minutes without clicking on a link in the website or app. pipeline : These are the operations you want to perform on each session's data. This may include counting the number of pages a user visited, recording the page they spent the most time on, or noting which pages were visited multiple times. You then add this Session Window stage to your streaming aggregation pipeline, and Atlas Stream Processing continuously processes your incoming data, groups events into sessions based on your configuration, and applies your specified transformations. The results flow to your designated output destinations in real-time, ready for analysis or to trigger automated actions. 
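Put together, those three components map directly onto a single $sessionWindow stage. Here is a minimal sketch (the field names are illustrative, not tied to the fuller example below): it groups each user's events into sessions that close after 15 minutes of inactivity and counts the events in each session.

// Minimal $sessionWindow stage: one session per userId, closed after
// 15 minutes of inactivity, counting the events that fall inside it.
let countEventsPerSession = {
  $sessionWindow: {
    partitionBy: "$userId",
    gap: { unit: "minute", size: 15 },
    pipeline: [
      { $group: { _id: "$userId", eventCount: { $sum: 1 } } }
    ]
  }
};

The next section builds out a fuller version of this idea, end to end.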
A quick example

Let’s say you want to build the pipeline we mentioned above: track user sessions, notify users who still have items in their cart but haven’t checked out, and move the session data downstream for analytics. You might do something like this:

1. Configure your source and sink stages

This is where you define the connections to any MongoDB or external location you intend to receive data from (source) or send data to (sink).

```javascript
// Set your source to be change streams from the pageViews, cartItems, and orderedItems collections.
let sourceCollections = {
  $source: {
    connectionName: "ecommerce",
    db: "customerActivity",
    coll: ["pageViews", "cartItems", "orderedItems"]
  }
};

// Set your destination (sink) to be the userSessions topic your recommendation engine consumes data from.
let emitToRecommendationEngine = {
  $emit: {
    connectionName: "recommendationEngine",
    topic: "userSessions"
  }
};

// Create a connection to your sendCheckoutReminder Lambda function, which reminds users to
// check out if they still have items in their cart when the session ends.
let sendReminderIfNeeded = {
  $externalFunction: {
    connectionName: "operations",
    as: "sendCheckoutReminder",
    functionName: "arn:aws:lambda:us-east-1:123412341234:function:sendCheckoutReminder"
  }
};
```

2. Define your Session Window logic

This is where you specify how data is transformed in your stream processing pipeline.

```javascript
// Step 1. Pull only the fields you care about from the change events.
// Every document has a userId and itemId because all three collections share those fields;
// fields not present in a given document will be null.
let extractRelevantFields = {
  $project: {
    userId: "$fullDocument.userId",
    itemId: "$fullDocument.itemId",
    category: "$fullDocument.category",
    cost: "$fullDocument.cost",
    viewedAt: "$fullDocument.viewedAt",
    addedToCartAt: "$fullDocument.addedToCartAt",
    purchasedAt: "$fullDocument.purchasedAt"
  }
};

// Step 2. Setting _id to $userId groups all documents by userId.
// Fields not present in any record will be null.
let groupSessionData = {
  $group: {
    _id: "$userId",
    itemIds: { $addToSet: "$itemId" },
    categories: { $addToSet: "$category" },
    costs: { $addToSet: "$cost" },
    viewedAt: { $addToSet: "$viewedAt" },
    addedToCartAt: { $addToSet: "$addedToCartAt" },
    purchasedAt: { $addToSet: "$purchasedAt" }
  }
};

// Step 3. Create a session window that closes after 15 minutes of inactivity.
// Partition by userId so each user's events are grouped into their own session, then
// run the pipeline operations on the documents collected within the window.
let createSession = {
  $sessionWindow: {
    partitionBy: "$userId",
    gap: { unit: "minute", size: 15 },
    pipeline: [groupSessionData]
  }
};
```

3. Create and start your stream processor

The last step is simple: create and start your stream processor.

```javascript
// Build the pipeline array. Session data is sent to the external function defined in
// sendReminderIfNeeded, then emitted to the recommendation engine's Kafka topic.
let finalPipeline = [
  sourceCollections,
  extractRelevantFields,
  createSession,
  sendReminderIfNeeded,
  emitToRecommendationEngine
];

// Create your stream processor.
sp.createStreamProcessor("userSessions", finalPipeline);

// Start your stream processor.
sp.userSessions.start();
```

And that's it! Your stream processor now runs continuously in the background with no additional management required.
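If you want to check on the processor once it’s running, the Atlas Stream Processing shell exposes a few helpers on the processor object. A quick sketch, assuming the userSessions processor created above:

```javascript
// Sample a few documents flowing out of the running processor to spot-check the session output.
sp.userSessions.sample()

// View runtime statistics such as input/output message counts and state size.
sp.userSessions.stats()

// Stop the processor before changing its pipeline, then start it again when you're done.
sp.userSessions.stop()
sp.userSessions.start()
```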
As users navigate your e-commerce website, add items to their carts, and make purchases, Atlas Stream Processing automatically:

- Tracks each user's activity in real time
- Groups events into meaningful sessions based on natural usage patterns
- Closes sessions after your specified period of inactivity (15 minutes)
- Triggers reminders for users with abandoned carts
- Delivers comprehensive session data to your analytics systems

All of this happens automatically at scale, without ongoing maintenance or manual intervention. Session Windows provide powerful, activity-based data processing that adapts to users' behavioral patterns rather than forcing their actions into arbitrary time buckets.

Ready to get started? Log in or sign up for Atlas today to create stream processors. You can learn more about Session Windows or get started using our tutorial.

May 29, 2025

New Data Management Experience in the Atlas UI

For the modern developer, each day is a balancing act. Even a small workflow hiccup can throw off momentum, making seamless data management not just a convenience but a necessity for staying productive. At MongoDB, our mission is to empower developers to innovate without friction, providing the tools they need right when they need them. That's why we've enhanced Data Explorer, a data interaction tool in the MongoDB Atlas UI that helps developers stay in the zone, innovate faster, and further streamline their workflows.

Data Explorer: Improved data exploration and management in the MongoDB Atlas UI

MongoDB provides a powerful graphical user interface (GUI) called MongoDB Compass, trusted by over a million users a month throughout the software development lifecycle. They rely on Compass to build queries and aggregations during development, to refine their schemas during design, to manage data for local testing environments during testing, and to discover patterns and abnormalities in data to inform maintenance and optimization. For users who aren’t comfortable with shell syntax or who prefer working with a GUI, Compass makes it easy to visually interact with data stored in MongoDB. However, many developers prefer to work in the Atlas UI, so we're bringing the Compass experience to them.

The new Data Explorer experience brings the familiarity and power of MongoDB Compass to the MongoDB Atlas UI, eliminating the need for developers to toggle between desktop and web interfaces to explore and interact with data. Our goal is to provide seamless data exploration that meets developers where they are in their workflows and caters to all experience levels with MongoDB and Atlas. This new Data Explorer enables developers to view and understand their data, as well as test and optimize queries directly within the browser, streamlining application development and enriching data management processes. It’s intuitive and super easy to find, too.

Navigating Data Explorer in the MongoDB Atlas UI

The existing Data Explorer experience sits within the 'Collections' tab of the Atlas UI. For easier accessibility, the updated interface has its own tab called 'Data Explorer,' located under the 'Data' navigational header in Atlas' revamped side navigation. Upon opening the 'Data Explorer' tab, users are met with the same interface as MongoDB Compass. This brings the intuitive layout and powerful capabilities of Compass into the web UI, providing a guided experience that enhances data exploration and optimization tasks while creating a familiar environment for developers who already know and love Compass. To get started, users can create a new cluster or connect to an existing one by clicking the 'Connect' box next to their chosen cluster.

Figure 1. Getting started with Data Explorer

With the updated interface, developers can interact with data across all Atlas clusters in their projects within a single view, instead of only being able to work with one cluster at a time. This consolidated view allows developers to focus their tasks directly in the browser, encouraging a streamlined workflow and higher productivity during development.

Take advantage of a richer feature set with Data Explorer

With the updated Data Explorer experience, you can now leverage the following features:

- Query with natural language: Create both queries and aggregations using natural language to accelerate your productivity.
  The intelligent query bar in Data Explorer allows you to ask plain-text questions about your data and teaches you the proper syntax for complex queries and aggregations, creating an initial query or aggregation pipeline that you can modify to fit your requirements.

  Figure 2. Using the natural language query bar

- Use advanced document viewing capabilities: Access data across all clusters in your Atlas project in the same browser window. View more documents per page and expand all nested fields across many documents to maximize the amount of data you can view at once. Choose between the list, table, or JSON views to mirror how you work best.

  Figure 3. Viewing documents through the advanced document viewing capabilities

- Understand query performance: Visualize output from the Explain command for your queries and aggregations to gain deeper insights into performance. Use these insights to optimize your schema design and improve application performance.

  Figure 4. Visualizing outputs through the Explain Plan command

- Perform bulk operations: Easily run bulk updates and deletes to migrate or clean your data. Preview how updates will affect documents to ensure accuracy before execution, and get an accurate count of how many documents a bulk operation will touch.

  Figure 5. Running bulk updates and deletes

- Analyze your schemas and define schema validation rules: Use the built-in schema analysis tools to understand the current shape of your data. The new Schema tab simplifies identifying anomalies and optimizing your data model, and the new Validation tab helps ensure data integrity by generating and enforcing JSON Schema validation rules (see the example sketch at the end of this post).

  Figure 6. Analyzing schema and schema validation rules

As the GIFs above show, the updated Data Explorer in MongoDB Atlas brings powerful and intuitive data exploration tools directly into your browser, streamlining workflows and boosting productivity. With these enhancements, developers can focus on what they do best, building innovative applications, while we handle the complexity of data management.

We’re excited for you to start working with Data Explorer in the Atlas UI. Here’s how to get started:

- Turn on the new experience in Atlas Project Settings or from the previous Data Explorer interface. Try it out now.
- Check out our documentation to read more about the new features available in Data Explorer.
- Hear more about the changes in this short video.
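To make the Validation tab's output more concrete, here is a minimal sketch of the kind of JSON Schema rule it can generate and enforce. The orders collection and its fields are hypothetical, chosen only to illustrate the syntax, not output produced by Data Explorer.

```javascript
// Hypothetical example: require a basic document shape for an "orders" collection.
db.createCollection("orders", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["customerId", "items", "total"],
      properties: {
        customerId: { bsonType: "objectId" },
        items: {
          bsonType: "array",
          minItems: 1,
          items: { bsonType: "object", required: ["sku", "quantity"] }
        },
        total: {
          bsonType: ["double", "decimal"],
          minimum: 0,
          description: "order total; must be a non-negative number"
        }
      }
    }
  },
  // "moderate" applies the rules to inserts and to updates of documents that already pass validation.
  validationLevel: "moderate"
});
```

Rules of this shape are what the Validation tab generates and enforces, as described in the feature list above.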

May 28, 2025

Strengthening Security: Bug Bounty and GitHub Secret Scanning

Today, MongoDB is announcing two important updates that further strengthen its security posture:

- The free tier of MongoDB Atlas is now included in the company’s public bug bounty program.
- MongoDB has joined the GitHub secret scanning program.

These updates empower MongoDB to identify and remediate security risks earlier in the development lifecycle. MongoDB has long been committed to proactively tackling security challenges, so the decision to open MongoDB Atlas to responsible testing by the security researcher community was an easy one. Its collaboration with GitHub further strengthens this approach by enabling the detection and validation of exposed MongoDB-specific credentials. Together, these efforts help protect customer data and support secure application development at scale.

Expanding MongoDB’s bug bounty program to include MongoDB Atlas

The free tier of MongoDB Atlas is now part of the company’s public bug bounty program. This fully managed, multi-cloud database powers mission-critical workloads for thousands of customers, ranging from large enterprises to small startups and individual developers. MongoDB’s bug bounty program has already paid out over $140,000 in bounties to security researchers and has resolved over 130 bug reports. Integrating Atlas into the program is the next step in hardening the database’s security posture, enabling earlier discovery and remediation of potential risks.

The cyberthreat landscape is evolving faster than ever. Where organizations once faced a narrower set of risks, today’s threats are more diverse and sophisticated. They include emerging risks like generative AI misuse and supply chain compromises, alongside persistent threats such as phishing, software vulnerabilities, and insider attacks. One proven way to stay ahead of these threats is to work with the security research community through bug bounty programs: researchers help identify and report vulnerabilities early, so organizations can fix issues before attackers exploit them. Security researchers are also expanding their expertise to address new attack vectors; according to HackerOne, 56% now specialize in API vulnerabilities and 10% focus on AI and large language models.[1]

With MongoDB Atlas now included in the company’s bug bounty program, customers can expect:

- Continuous, real-world testing by a diverse security research community.
- Faster detection of vulnerabilities than traditional penetration testing alone.
- Stronger confidence in MongoDB’s ability to safeguard sensitive data.

By bringing MongoDB Atlas into its bug bounty program, MongoDB is doubling down on transparency, collaboration, and proactive defense. This is a critical step in reinforcing customer trust and ensuring MongoDB Atlas remains secure as threats evolve.

Partnering with GitHub to detect credential leaks faster

Building on its commitment to proactive threat detection, MongoDB has also joined GitHub’s secret scanning partner program to better protect customers from credential exposure. The program enables service providers like MongoDB to include their custom secret token formats in GitHub’s secret scanning functionality, which actively scans repositories to detect accidental commits of secrets such as API keys, credentials, and other sensitive data. Through this partnership, when GitHub detects a match of MongoDB Atlas-specific secrets, it notifies MongoDB, which can then securely determine whether the credential is active.
As a result, MongoDB can rapidly identify potential security risks and notify customers.

Stolen credentials remain one of the most common and damaging threats in cybersecurity: they have been involved in 31% of data breaches over the past decade, according to Verizon, and credential stuffing, where bad actors use stolen credentials to access unrelated services, is the most common attack type for web applications.[2] These breaches are particularly harmful, taking an average of 292 days to detect and contain.[3]

By participating in GitHub’s secret scanning program, MongoDB helps ensure that MongoDB Atlas customers benefit from:

- Faster detection and remediation of exposed credentials.
- Reduced risk of unauthorized access or data leaks.
- More secure, developer-friendly workflows by default.

Staying ahead of evolving security threats

MongoDB is continuously evolving to help developers and enterprises stay ahead of security risks. By expanding its public bug bounty program to include MongoDB Atlas and by partnering with GitHub to detect exposed credentials in real time, MongoDB is deepening its investment in proactive, community-driven security. These updates reflect a broader commitment to helping developers and organizations build secure applications, detect risks early, and respond quickly to new and emerging threats.

Learn more about these programs:

- MongoDB’s bug bounty program on HackerOne
- GitHub’s secret scanning partner program

[1] Hacker-Powered Security Report, 8th Edition, HackerOne
[2] Verizon Data Breach Investigations Report, 2024
[3] IBM Cost of a Data Breach Report, 2024

May 27, 2025