Vector Search


Workload Isolation for More Scalability and Availability: Search Nodes Now on Google Cloud

Today we’re excited to take the next step in bringing scalable, dedicated architecture to your search experiences with the introduction of Atlas Search Nodes, now in general availability for Google Cloud. Since our initial announcement of Search Nodes in June of 2023, we’ve been rapidly accelerating access to the most scalable dedicated architecture, starting with general availability on AWS and now expanding to general availability on Google Cloud. We'd like to give you a bit more context on what Search Nodes are and why they're important to any search experience running at scale. Search Nodes provide dedicated infrastructure for Atlas Search and Vector Search workloads to enable even greater control over search workloads. They also allow you to isolate and optimize compute resources to scale search and database needs independently, delivering better performance at scale and higher availability. One of the last things developers want to deal with when building and scaling apps is having to worry about infrastructure problems. Any downtime or poor user experience can result in lost users or revenue, especially when it comes to your database and search experience. This is one of the reasons developers turn to MongoDB, given the ease of use of having one unified system for your database and search solution. With the introduction of Atlas Search Nodes, we’ve taken the next step in providing our builders with ultimate control, giving them the ability to remain flexible by scaling search workloads without the need to over-provision the database. By isolating your search and database workloads while at the same time automatically keeping your search cluster data synchronized with operational data, Atlas Search and Atlas Vector Search eliminate the need to run a separate ETL tool, which takes time and effort to set up and is yet another point of failure for your scaling app. 
This provides superior performance and higher availability while reducing architectural complexity and the engineering time wasted recovering from sync failures. In fact, we’ve seen a 40% to 60% decrease in query time for many complex queries, while eliminating the chances of any resource contention or downtime.

With just a quick button click, Search Nodes on Google Cloud offer our existing Atlas Search and Vector Search users the following benefits:

- Higher availability
- Increased scalability
- Workload isolation
- Better performance at scale
- Improved query performance

We offer both compute-heavy search-specific nodes for relevance-based text search, as well as a memory-optimized option that is optimal for semantic and retrieval augmented generation (RAG) production use cases with Atlas Vector Search. This makes resource contention or availability issues a thing of the past.

Search Nodes are easy to opt into and set up. To start, jump into the MongoDB UI and do the following:

1. Navigate to your “Database Deployments” section in the MongoDB UI.
2. Click the green “+Create” button.
3. On the “Create New Cluster” page, with Google Cloud selected, change the radio button for “Multi-cloud, multi-region & workload isolation” to enable.
4. Toggle the radio button for “Search Nodes for workload isolation” to enable.
5. Select the number of nodes in the text box.
6. Check the agreement box.
7. Click “Create cluster”.

For existing Atlas Search users, click “Edit Configuration” in the MongoDB Atlas Search UI and enable the toggle for workload isolation. Then the steps are the same as noted above. Jump straight into our docs to learn more!

March 28, 2024

Using Generative AI and MongoDB to Tackle Cybersecurity’s Biggest Challenges

In the ever-evolving landscape of cybersecurity, organizations face a multitude of challenges that demand innovative solutions harnessing cutting-edge technologies. One of the most pressing issues is the increasing sophistication of cyber threats, including malware, ransomware, and phishing attacks, which are becoming more difficult to detect and mitigate. Additionally, the rapid expansion of digital infrastructures has widened the attack surface, making it harder for security teams to monitor and protect every entry and egress point. Another significant challenge is the shortage of skilled cybersecurity professionals (estimated by independent surveys at around 4 million staff worldwide [1]), which leaves many organizations vulnerable to attack. These challenges underscore the need for advanced technologies that can augment human efforts to secure digital assets and data.

How can generative AI help?

Generative AI (gen AI) has emerged as a powerful tool in addressing these cybersecurity challenges. By leveraging large language models (LLMs) to generate new data or patterns based on existing datasets, generative AI can provide innovative solutions in several key areas:

Enhanced threat detection and response

Generative AI can be used to create simulations of cyber threats, including sophisticated malware and phishing attacks. These simulations can help in training machine learning models to detect new and evolving threats more accurately. Furthermore, gen AI can aid in the development of automated response systems that react to threats in real time. While this will never eliminate the need for human oversight, it will reduce the need for manual intervention and toil, allowing for quicker mitigation of attacks. For example, with the appropriate oversight it can automatically apply patches to vulnerable systems or adjust firewall rules to block attack vectors. 
This automated rapid response capability is particularly valuable in mitigating zero-day vulnerabilities, where the window between the discovery of a vulnerability and its exploitation by attackers can be very short. Actionable learnings from security event postmortems In the aftermath of a cybersecurity incident, conducting a thorough postmortem analysis is crucial for understanding what happened, why it happened, and how similar events can be prevented in the future. Generative AI can play a pivotal role in this process by synthesizing and summarizing complex data from a multitude of sources, including logs, network traffic, and security alerts. By analyzing this data, gen AI can identify patterns and anomalies that may have contributed to the security breach, offering insights that might be overlooked by human analysts due to the sheer volume and complexity of the information. Furthermore, it can generate comprehensive reports that highlight key findings, causative factors, and potential vulnerabilities, streamlining the postmortem process. This capability not only accelerates the recovery and learning process but also enables organizations to implement more effective remediation strategies, ultimately strengthening their cybersecurity posture. Generating synthetic data for deep model training The shortage of real-world data for training cybersecurity systems is a significant hurdle. Gen AI can create realistic, synthetic data sets that mirror genuine network traffic and user behavior without exposing sensitive information. This synthetic data can be used to train detection systems, improving their accuracy and effectiveness without compromising privacy or security. Automating phishing detection Phishing remains one of the most common attack vectors. Gen AI can analyze patterns in phishing emails and websites, generating models that predict and detect phishing attempts with high accuracy. 
By integrating these models into email systems and web browsers, organizations can automatically filter out phishing content, protecting users from potential threats.

Putting it all together: The opportunities and the risks

Generative AI holds the promise of transforming cybersecurity practices by automating complex processes, enhancing threat detection and response, and providing a deeper understanding of cyber threats. As the industry continues to integrate gen AI into cybersecurity strategies, it's crucial to remain vigilant about the ethical use of this technology and the potential for misuse. Nevertheless, the benefits it offers in strengthening digital defenses are undeniable, making it an invaluable asset in the ongoing battle against cyber threats.

How does MongoDB help?

With MongoDB, your development teams can build and deploy robust, correct, and differentiated real-time cyber defenses faster, and at any scale. To understand how MongoDB does this, consider that the AI technology stack comprises three layers:

1. The underlying compute (GPUs) and LLMs
2. The tooling to fine-tune models, along with the tooling for in-context learning and inference against the trained models
3. The AI applications and related end-user experiences

MongoDB operates at the second layer of the stack. It enables customers to bring their own proprietary data to any LLM running on any computing infrastructure to build gen AI-powered cybersecurity applications. MongoDB does this by addressing the hardest problems when adopting gen AI for cybersecurity. MongoDB Atlas securely unifies operational data, unstructured data, and vector data in a single, fully managed multi-cloud platform, avoiding the need to copy and sync data between different systems. MongoDB’s document-based architecture also allows development teams to easily model relationships between application data and vector embeddings. This allows deeper and faster analytics and insights against security-related data.
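As a minimal sketch of what keeping operational data and vector embeddings in one document can look like, the Python below builds a hypothetical security-alert document that carries its embedding next to its operational fields, then assembles an Atlas Vector Search `$vectorSearch` aggregation stage over it. The field names (`summary`, `severity`, `embedding`), the index name `alert_vector_index`, and the tiny 4-dimensional vectors are illustrative assumptions rather than anything taken from a real deployment. Filtering on `severity` also assumes that field has been declared as a filter field in the vector index.

```python
from typing import Any


def make_alert(alert_id: str, summary: str, severity: str,
               embedding: list[float]) -> dict[str, Any]:
    """Build one document holding operational fields and the embedding together."""
    return {
        "_id": alert_id,
        "summary": summary,        # operational / unstructured data
        "severity": severity,      # operational metadata usable as a pre-filter
        "embedding": embedding,    # vector data, produced by an embedding model
    }


def build_vector_search_pipeline(query_vector: list[float], severity: str,
                                 limit: int = 5,
                                 index_name: str = "alert_vector_index") -> list[dict[str, Any]]:
    """Assemble an aggregation pipeline that starts with a $vectorSearch stage."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,           # hypothetical index name
                "path": "embedding",           # field that stores the vectors
                "queryVector": query_vector,
                "numCandidates": limit * 20,   # candidates considered before ranking
                "limit": limit,
                "filter": {"severity": {"$eq": severity}},
            }
        },
        {
            "$project": {
                "summary": 1,
                "severity": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


alert = make_alert("alert-1", "Suspicious login burst from new ASN", "high",
                   [0.12, 0.80, 0.05, 0.33])
pipeline = build_vector_search_pipeline([0.10, 0.79, 0.07, 0.30], severity="high")
```

With PyMongo, a pipeline like this would be passed to `collection.aggregate(pipeline)` against an Atlas cluster that has a matching vector index. The sketch stops short of connecting so that it runs anywhere.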
Figure 1: MongoDB Atlas brings together all of the data services needed to build modern cyber security applications in a unified API and developer data platform. MongoDB’s open architecture is integrated with a rich ecosystem of AI developer frameworks, LLMs, and embedding providers. This, combined with our industry-leading multi-cloud capabilities, allows your development teams the flexibility to move quickly and avoid lock-in to any particular cloud provider or AI technology in this rapidly evolving space. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Applying gen AI and MongoDB to real world cybersecurity applications Threat intelligence ExTrac utilizes AI-powered analytics and MongoDB Atlas to predict public safety risks by analyzing data from thousands of sources. The platform initially helped Western governments foresee conflicts but is expanding to enterprises for reputational management and more. MongoDB's document data model allows ExTrac to manage complex data efficiently, enhancing real-time threat identification. Atlas Vector Search aids in augmenting language models and managing vector embeddings for texts, images, and videos, speeding up feature development. This approach enables ExTrac to efficiently model trends, track evolving narratives, and predict risk for its customers, leveraging the flexibility and power of MongoDB to handle data of any shape and structure. Learn more in our ExTrac case study . Cybersec assessments VISO TRUST leverages AI to streamline the assessment of third-party cyber risks, making complex vendor security information quickly accessible for informed decision-making. Utilizing Amazon Bedrock and MongoDB Atlas, VISO TRUST's platform automates the due diligence of vendor security, significantly reducing the workload for security teams. 
Its AI-powered approach involves artifact intelligence that classifies security documents, detects organizations, and predicts security control locations within artifacts. MongoDB Atlas hosts text embeddings for a dense retrieval system that enhances the accuracy of LLMs through retrieval-augmented generation (RAG), providing instant, actionable security insights. This innovative use of technology enables VISO TRUST to offer rapid, scalable cyber risk assessments, boasting significant reductions in work and time for enterprises like InstaCart and Upwork. MongoDB's flexible document database and Atlas Vector Search play critical roles in managing and querying the vast amounts of data, supporting VISO TRUST's mission to deliver comprehensive cyber risk intelligence. Learn more in our Viso Trust case study . Steps to get started Generative AI powered by LLMs augmented with your own operational data encoded as vector embeddings is opening up many new possibilities in cyber security. If you want to learn more about the technology and its possibilities, take a look at our Atlas Vector Search learning byte . In just 10 minutes you’ll get an overview of different use cases and how to get started. 1 Hill, M. (2023, April 10). Cybersecurity workforce shortage reaches 4 million despite significant recruitment drive . CSO.

March 13, 2024

How MongoDB Enables Digital Twins in the Industrial Metaverse

The integration of MongoDB into the metaverse marks a pivotal moment for the manufacturing industry, unlocking innovative use cases across design and prototyping, training and simulation, and maintenance and repair. MongoDB's powerful capabilities, combined with Augmented Reality (AR) or Virtual Reality (VR) technologies, are reshaping how manufacturers approach these critical aspects of their operations, while also enabling the realization of innovative product features.

But first: What is the metaverse, and why is it so important to manufacturers? We often use the term "digital twin" to refer to a virtual replication of the physical world. It is commonly used for simulations and documentation. The metaverse goes one step further: Not only is it a virtual representation of a physical device or a complete factory, but the metaverse also reacts and changes in real time to reflect a physical object’s condition. The advent of the industrial metaverse over the past decade has given manufacturers an opportunity to embrace a new era of innovation, one that can enhance collaboration, visualization, and training. The industrial metaverse is also a virtual environment that allows geographically dispersed teams to work together in real time. Overall, the metaverse transforms the way individuals and organizations interact to produce, purchase, sell, consume, educate, and work together. This paradigm shift is expected to accelerate innovation and affect everything from design to production across the manufacturing industry. Here are some of the ways the metaverse, powered by MongoDB, is having an impact on manufacturing.

Design and prototyping

Design and prototyping processes are at the core of manufacturing innovation. Within the metaverse, engineers and designers can collaborate seamlessly using VR, exploring virtual spaces to refine and iterate on product designs. 
MongoDB's flexible document-oriented structure ensures that complex design data, including 3D models and simulations, is efficiently stored and retrieved. This enables real-time collaboration, accelerating the design phase while maintaining the precision required for manufacturing excellence. Training and simulation Taking a digital twin and connecting it to physical assets enables training beyond traditional methods and provides immersive simulations in the metaverse that enhance skill development for manufacturing professionals. VR training, powered by MongoDB's capacity to manage diverse data types — such as time-series, key-values and events — enables realistic simulations of manufacturing environments. This approach allows workers to gain hands-on experience in a safe virtual space, preparing them for real-world challenges without affecting production cycles. Gamification is also one of the most effective ways to learn new things. MongoDB's scalability ensures that training data, including performance metrics and user feedback, is efficiently handled to continuously enlarge the training modules and the necessary resources for the ever-increasing amount of data. Maintenance and repair Maintenance and repair operations are streamlined through AR applications within the metaverse. The incorporation of AR and VR technologies into manufacturing processes amplifies the user experience, making interactions more intuitive and immersive. Technicians equipped with AR devices can access real-time information overlaid onto physical equipment, providing step-by-step guidance for maintenance and repairs. MongoDB's support for large volumes of diverse data types, including multimedia and spatial information, ensures a seamless integration of AR and VR content. 
This not only enhances the visual representation of data from the digital twin and the physical asset but also provides a comprehensive platform for managing the vast datasets generated during AR and VR interactions within the metaverse. Additionally, MongoDB's geospatial capabilities come into play, allowing manufacturers to manage and analyze location-based data for efficient maintenance scheduling and resource allocation. The result is reduced downtime through more efficient maintenance and improved overall operational efficiency. From the digital twin to metaverse with MongoDB The advantages of a metaverse for manufacturers are enormous, and according to Deloitte many executives are confident the industrial metaverse “ will transform research and development, design, and innovation, and enable new product strategies .” However, the realization is not easy for most companies. Challenges include managing system overload, handling vast amounts of data from physical assets, and creating accurate visualizations. The metaverse must also be easily adaptable to changes in the physical world, and new data from various sources must be continuously implemented seamlessly. Given these challenges, having a data platform that can contextualize all the data generated by various systems and then feed that to the metaverse is crucial. That is where MongoDB Atlas , the leading developer data platform, comes in, providing synchronization capabilities between physical and virtual worlds, enabling flexible data modeling, and providing access to the data via a unified query interface as seen in Figure 1. Figure 1: MongoDB connecting to a physical & virtual factory Generative AI with Atlas Vector Search With MongoDB Atlas, customers can combine three systems — database, search engine, and sync mechanisms — into one, delivering application search experiences for metaverse users 30% to 50% faster . 
Atlas powers use cases such as similarity search, recommendation engines, Q&A systems, dynamic personalization, and long-term memory for large language models (LLMs). Vector data is integrated with application data and seamlessly indexed for semantic queries, enabling customers to build easier and faster. MongoDB Atlas enables developers to store and access operational data and vector embeddings within a single unified platform. With Atlas Vector Search , users can generate information for maintenance, training, and all the other use cases from all possible information that is accessible. This information can come from text files such as Word, from PDFs, and even from pictures or sound streams from which an LLM then generates an accurate semantic answer. It’s no longer necessary to keep dozens of engineers busy, just creating useful manuals that are outdated at the moment a production line goes through first commissioning. Figure 2: Atlas Vector Search Transforming the manufacturing industry with MongoDB In the digital twin and metaverse-driven future of manufacturing, MongoDB emerges as a linchpin, enabling cost-effective virtual prototyping, enhancing simulation capabilities, and revolutionizing training processes. The marriage of MongoDB with AR and VR technologies creates a symbiotic relationship, fostering innovation and efficiency across design, training, and simulation. As the manufacturing industry continues its journey into the metaverse, the partnership between MongoDB and virtual technologies stands as a testament to the transformative power of digital integration in shaping the future of production. Learn more about how MongoDB is helping organizations innovate with the industrial metaverse by reading how we Build a Virtual Factory with MongoDB Atlas in 5 Simple Steps , how IIoT data can be integrated in 4 steps into MongoDB, or how MongoDB drives Innovations End-To-End in the whole Manufacturing Chain .

March 12, 2024

Building AI with MongoDB: Putting Jina AI’s Breakthrough Open Source Embedding Model To Work

Founded in 2020 and based in Berlin, Germany, Jina AI has swiftly risen as a leader in multimodal AI, focusing on prompt engineering and embedding models. With its commitment to open-source and open research, Jina AI is bridging the gap between advanced AI theory and the real world AI-powered applications being built by developers and data scientists. Over 400,000 users are registered to use the Jina AI platform. Dr. Han Xiao, Founder and CEO at Jina AI, describes the company’s mission: “We envision paving the way towards the future of AI as a multimodal reality. We recognize that the existing machine learning and software ecosystems face challenges in handling multimodal AI. As a response, we're committed to developing pioneering tools and platforms that assist businesses and developers in navigating these complexities. Our vision is to play a crucial role in helping the world harness the vast potential of multimodal AI and truly revolutionize the way we interpret and interact with information." Jina AI’s work in embedding models has caught significant industry interest. As many developers now know, embeddings are essential to generative AI (gen AI). Embedding models are sophisticated algorithms that transform and embed data of any structure into multi-dimensional numerical encodings called vectors. These vectors give data semantic meaning by capturing its patterns and relationships. This means we can analyze and search for unstructured data in the same way we’ve always been able to with structured business data. Considering that over 80% of the data we create every day is unstructured, we start to appreciate how transformational embeddings — when combined with a powerful solution such as MongoDB Atlas Vector Search — are for gen AI. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Jina AI's jina-embeddings-v2 is the first open-source 8K text embedding model. 
Its 8K token length provides deeper context comprehension, significantly enhancing accuracy and relevance for tasks like retrieval-augmented generation (RAG) and semantic search . Jina AI’s embeddings offer enhanced data indexing and search capabilities, along with bilingual support. The embedding models are focused on singular languages and language pairs, ensuring state-of-the-art performance on language-specific benchmarks. Currently, Jina Embeddings v2 includes bilingual German-English and Chinese-English models, with other bilingual models in the works. Jina AI’s embedding models excel in classification, reranking, retrieval, and summarization, making them suitable for diverse applications, especially those that are cross-lingual. Recent examples from multinational enterprise customers include the automation of sales sequences, skills matching in HR applications, and payment reconciliation with fraud detection. Figure 1:   Jina AI’s world-class embedding models improve search and RAG systems. In our recently published Jina Embeddings v2 and MongoDB Atlas article we show developers how to get started in bringing vector embeddings into their apps. The article covers: Creating a MongoDB Atlas instance and loading it with your data. (The article uses a sample Airbnb reviews data set.) Creating embeddings for the data set using the Jina Embeddings API. Storing and indexing the embeddings with Atlas Vector Search. Implementing semantic search using the embeddings. Dr. Xiao says, “Our Embedding API is natively integrated with key technologies within the gen AI developer stack including MongoDB Atlas, LangChain, LlamaIndex, Dify, and Haystack. MongoDB Atlas unifies application data and vector embeddings in a single platform, keeping both fully synced. Atlas Triggers keeps embeddings fresh by calling our Embeddings API whenever data is inserted or updated in the database. 
This integrated approach makes developers more productive as they build new, cutting-edge AI-powered apps for the business.” To get started with MongoDB and Jina AI, register for MongoDB Atlas and read the tutorial . If your team is building its AI apps, sign up for the AI Innovators Program . Successful companies get access to free Atlas credits and technical enablement, as well as connections into the broader AI ecosystem.
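The semantic search step described above comes down to comparing a query embedding with stored document embeddings. As a rough illustration only, the Python below ranks a few documents by cosine similarity using a brute-force scan. The 3-dimensional vectors and the review texts are made-up placeholders (real embeddings come from an embedding model and have far more dimensions), and a vector search service would normally answer such queries through an index rather than a full scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_search(query_vec: list[float], docs: list[dict], k: int = 2) -> list[dict]:
    """Return the k documents whose embeddings are most similar to the query."""
    scored = [
        {"_id": d["_id"], "text": d["text"],
         "score": cosine_similarity(query_vec, d["embedding"])}
        for d in docs
    ]
    return sorted(scored, key=lambda d: d["score"], reverse=True)[:k]


# Placeholder documents: the embeddings are invented for illustration.
reviews = [
    {"_id": "a", "text": "Quiet flat close to the beach", "embedding": [0.9, 0.1, 0.0]},
    {"_id": "b", "text": "Noisy street, small kitchen", "embedding": [0.0, 1.0, 0.0]},
    {"_id": "c", "text": "Near the sea but a bit loud", "embedding": [0.7, 0.7, 0.0]},
]

top = semantic_search([1.0, 0.0, 0.0], reviews, k=2)
```

Here `top` holds the two closest reviews in descending order of similarity, which is the same shape of result a semantic search query returns.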

February 14, 2024

Building AI with MongoDB: Navigating the Path From Predictive to Generative AI

It should come as no surprise that the organizations unlocking the largest benefits from generative AI (gen AI) today have already been using predictive AI (a.k.a. classic, traditional, or analytical AI). McKinsey made this same observation back in June 2023 with its “Economic Potential of Generative AI” research [1]. There would seem to be several reasons for this:

- An internal culture that is willing to experiment and explore what AI can do
- Access to skills, though we must emphasize that gen AI is way more reliant on developers than on the data scientists driving predictive AI
- Availability of clean and curated data from across the organization that is ready to be fed into gen AI models

This isn’t to say that only those teams with prior experience in predictive AI stand to benefit from gen AI. If you take a look at examples from our Building AI case study series , you’ll see many organizations with different AI maturity levels tapping MongoDB for gen AI innovation today. In this latest edition of the Building AI series, we feature two companies that, having built predictive AI apps, are now navigating the path to generative AI:

- MyGamePlan helps professional football players and coaches improve team performance.
- Ferret helps businesses and consumers build trust by running background checks using public domain data.

In both cases, predictive AI is central to data-driven decision-making. And now both are exploring gen AI to extend their services with new products that further deepen user engagement. The common factor for both? Their use of MongoDB Atlas and its flexibility for any AI use case. Let's dig in.

MyGamePlan: Elevating the performance of professional football players with AI-driven insights

The use of data and analytics to improve the performance of professional athletes isn’t new. Typically, solutions are highly complex, relying on the integration of multiple data providers, resulting in high costs and slow time-to-insight. 
MyGamePlan is working to change that for professional football clubs and their players. (For the benefit of my U.S. colleagues, where you see “football” read “soccer.”) MyGamePlan is used by staff and players at successful teams across Europe, including Bayer Leverkusen (current number one in the German Bundesliga), AFC Sunderland in the English Championship, CD Castellón (current number one in the third division of Spain), and Slask Wroclaw (the current number one in the Polish Ekstraklasa). I met with Dries Deprest, CTO and co-founder at MyGamePlan, who explains, “We redefine football analysis with cutting-edge analytics, AI, and a user-friendly platform that seamlessly integrates data from match events, player tracking, and video sources. Our platform automates workflows, allowing coaches and players to formulate tactics for each game, empower player development, and drive strategic excellence for the team's success.”

At the core of the MyGamePlan platform are custom, Python-based predictive AI models hosted in Amazon SageMaker. The models analyze passages of gameplay to score the performance of individual players and their impact on the game. Performance and contribution can be tracked over time and used to compare with players on opposing teams to help formulate matchday tactics. Data is key to making the models and predictions accurate. The company uses MongoDB Atlas as its database, storing:

- Metadata for each game, including matches, teams, and players.
- Event data from each game such as passes, tackles, fouls, and shots.
- Tracking telemetry that captures the position of each player on the field every 100ms.

This data is pulled from MongoDB into Python DataFrames where it is used alongside third-party data streams to train the company’s ML models. Inferences generated from specific sequences of gameplay are stored back in MongoDB Atlas for downstream analysis by coaches and players.
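To make the data flow above more concrete, here is a small, hypothetical sketch in Python. It models one game as a nested document with embedded events, flattens those events into row-shaped dictionaries of the kind a DataFrame expects, and works out how many tracking samples a 100ms interval yields per player. The document layout and field names (`home_team`, `events`, `minute`, and so on) are assumptions made for illustration; the article does not describe MyGamePlan's actual schema.

```python
from typing import Any, Iterator

# Hypothetical game document: match metadata with events nested inside it.
game: dict[str, Any] = {
    "_id": "game-001",
    "home_team": "Team A",
    "away_team": "Team B",
    "events": [
        {"type": "pass", "player": "p7", "minute": 12},
        {"type": "tackle", "player": "p4", "minute": 13},
        {"type": "shot", "player": "p9", "minute": 15},
    ],
}


def event_rows(doc: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Flatten nested events into one flat row per event, tagged with the game id."""
    for event in doc.get("events", []):
        yield {"game_id": doc["_id"], **event}


def samples_per_player(minutes: int, interval_ms: int) -> int:
    """Number of tracking samples for one player at a fixed sampling interval."""
    return (minutes * 60 * 1000) // interval_ms


rows = list(event_rows(game))
per_player = samples_per_player(minutes=90, interval_ms=100)  # 54,000 samples
```

A list of flat dictionaries like `rows` can be handed directly to `pandas.DataFrame(rows)`. The arithmetic also shows why tracking telemetry is the high-volume part of the data set: one player in one 90-minute match produces 54,000 position samples.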
Figure 1: With MyGamePlan’s web and mobile apps, coaching staff and players can instantly assess gameplay and shape tactics.

On selecting MongoDB, Deprest says, “We are continuously enriching data with AI models and using it for insights and analytics. MongoDB is a great fit for this use case.” He continues, “We chose MongoDB when we started our development two years ago. Our data has complex multi-way relationships, mapping games to players to events and tracking. The best way to represent this data is with nested elements in rich document data structures. It's way more efficient for my developers to work with and for the app to process. Trying to model these relationships with foreign keys and then joining normalized tables in relational databases would be slow and inefficient.”

In terms of development, Deprest says, “We use the PyMongo driver to integrate MongoDB with our Python ML data pipelines in SageMaker and the MongoDB Node.js driver for our React-based, client-facing web and mobile apps.”

Deprest goes on to say, "There are two key factors that differentiate MongoDB from the NoSQL databases we also considered. The first is the incredible level of developer adoption it has, meaning my team was immediately familiar and productive with it. The second is that we can build in-app analytics directly on top of our live data, without the time and expense of having to move it out into some data warehouse or data lake. With MongoDB’s aggregation pipelines , we can process and analyze data with powerful roll-ups, transformations, and window functions to slice and dice data any way our users need it."

Moving beyond predictive AI, the MyGamePlan team is now evaluating how gen AI can further improve user experience. Deprest says, "We have so much rich data and analytics in our platform, and we want to make it even easier for players and coaches to extract insights from it. We are experimenting with natural language processing via chat and question-answering interfaces on top of the data. 
Gen AI makes it easy for users to visualize and summarize the data. We are currently evaluating OpenAI’s ChatGPT LLM coupled with sophisticated approaches to prompt engineering, orchestration via LangChain, and retrieval augmented generation (RAG) using LlamaIndex and MongoDB Atlas Vector Search ."

"As our source data is in the MongoDB Atlas database already, unifying it with vector storage and search is a very productive and elegant solution for my developers." (Dries Deprest, CTO and Co-founder, MyGamePlan)

By building on MongoDB Atlas, MyGamePlan’s team can use the breadth of functionality provided by a developer data platform to support almost any application and AI needs in the future. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Building trust with relationship intelligence powered by AI and MongoDB Atlas while cutting costs by 30%

Across the physical and digital world, we are all constantly building relationships with others. Those relationships can be established through peer-to-peer transactions across online marketplaces, between tradespeople and professionals with their prospective clients, between investors and founders, or in creating new personal connections. All of those relationships rely on trust to work, but building it is hard. Ferret was founded to remove the guesswork from building that trust. Ferret is an AI platform architected from the ground up to empower companies and individuals with real-time, unbiased intelligence to identify risks and embrace opportunities. Leveraging cutting-edge predictive and generative AI, hundreds of thousands of global data sources, and billions of public documents, Ferret provides curated relationship intelligence and monitoring, once only available to the financial industry, making transparency the new norm. Al Basseri, CTO at Ferret, tells us how it works: "We ingest information about individuals from public sources. 
This includes social networks, trading records, court documents, news archives, corporate ownership, and registered business interests. This data is streamed through Kafka pipelines into our Anyscale/Ray MLops platform where we apply natural language processing through our spaCy extraction and machine learning models. All metadata from our data sources (that's close to three billion documents) along with inferences from our models are stored in MongoDB Atlas. The data in Atlas is consumed by our web and mobile customer apps and by our corporate customers through our upcoming APIs."

Figure 2: Artificial intelligence + real-time data = Relationship Intelligence from Ferret

Moving beyond predictive AI, the company’s developers are now exploring opportunities to use gen AI in the Ferret platform. "We have a close relationship with the data science team at Nvidia,” says Basseri. “We see the opportunity to summarize the data sources and analysis we provide to help our clients better understand and engage with their contacts. Through our experimentation, the Mistral model with its mixture-of-experts ensemble seems to give us better results with less resource overhead than some of the larger and more generic large language models."

As well as managing the data from Ferret’s predictive and gen AI models, customer data and contact lists are also stored in MongoDB Atlas. Through Ferret’s continuous monitoring and scoring of public record sources, any change in an individual's status is immediately detected. As Basseri explains, "MongoDB Atlas Triggers watch for updates to a score and instantly send an alert to consuming apps so our customers get real-time visibility into their relationship networks. It's all fully event-driven and reactive, so my developers just set it and forget it."

Basseri also described the other advantages MongoDB provides his developers. Through Atlas, it’s available as a fully managed service with best practices baked in, which frees his developers and data scientists from the responsibilities of running a database so they can focus their efforts on app and AI innovation. MongoDB Atlas is mature: Basseri has seen it scale in many other high-growth companies. And the availability of engineers who know MongoDB is important as the team rapidly expands.

Beyond the database, Ferret is extending its use of the MongoDB Atlas platform into text search. As the company moves into Google Cloud, it is migrating from its existing Amazon OpenSearch service to Atlas Search. Discussing the drivers for the migration, Basseri says, "Unifying both databases and search behind a single API reduces cognitive load for my developers, so they are more productive and build features faster. We eliminate all of the hassle of syncing data between database and search. Again, this frees up engineering cycles. It also means our users get a better experience because previous latency bottlenecks are gone, so as they search across contacts and content on our platform, they get the freshest results, not stale and outdated data."

"By migrating from OpenSearch to Atlas Search, we also save money and get more freedom. We will reduce our total cloud costs by 30% per month just by eliminating unnecessary data duplication between the database and the search engine. And with Atlas being multi-cloud, we get the optionality to move across cloud providers as and when we need to."
Al Basseri, CTO at Ferret

Once the migration is complete, Basseri and the team will begin development with Atlas Vector Search as they continue to build out the gen AI side of the Ferret platform.

What's next?

No matter where you are in your AI journey, MongoDB can help. You can get started with your AI-powered apps by registering for MongoDB Atlas and exploring the tutorials available in our AI resources center. Our teams are always ready to come and explore the art of the possible with you.
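The score-alerting pattern Basseri describes can be approximated from application code with a change stream, the mechanism Atlas database Triggers are built on. The sketch below is not Ferret's implementation: the `score` field name, the `watch_scores` helper, and the callback shape are all assumptions made for illustration. It relies only on the PyMongo `Collection.watch()` call and the standard change event fields `documentKey` and `updateDescription.updatedFields`.

```python
from typing import Any, Callable, Dict, List


def score_update_pipeline(score_field: str = "score") -> List[Dict[str, Any]]:
    """Build a change stream filter that passes only update events
    which modified `score_field` (a hypothetical field name).
    Full-document replacements are not matched by this filter."""
    return [
        {
            "$match": {
                "operationType": "update",
                f"updateDescription.updatedFields.{score_field}": {"$exists": True},
            }
        }
    ]


def watch_scores(
    collection: Any,
    on_change: Callable[[Any, Any], None],
    score_field: str = "score",
) -> None:
    """Iterate the change stream and pass (document id, new score)
    to `on_change`. `collection` is expected to behave like a
    pymongo Collection; the loop runs until the stream is closed."""
    with collection.watch(score_update_pipeline(score_field)) as stream:
        for event in stream:
            doc_id = event["documentKey"]["_id"]
            new_score = event["updateDescription"]["updatedFields"][score_field]
            on_change(doc_id, new_score)
```

With a real deployment, `collection` would be something like `MongoClient(uri)["db"]["contacts"]` (names assumed), and `on_change` would push the alert to the consuming app. An Atlas Trigger removes the need to host this loop yourself.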

February 13, 2024

Building AI with MongoDB: How Flagler Health's AI-Powered Journey is Revolutionizing Patient Care

Flagler Health is dedicated to supporting patients with chronic diseases by matching them with the right physician for the right care. Typically, patients grappling with severe pain conditions face limited options, often relying on prolonged opioid use or exploring costly and invasive surgical interventions. Unfortunately, the latter approach is not only expensive but also has a long recovery period. Flagler finds these patients and triages them to the appropriate specialist for an advanced and comprehensive evaluation.

Figure: Current state without Flagler

Flagler Health employs sophisticated AI techniques to rapidly process, synthesize, and analyze patient health records to aid physicians in treating patients with advanced pain conditions. This enables medical teams to make well-informed decisions, resulting in improved patient outcomes with an accuracy rate exceeding 90% in identifying and diagnosing patients. As the company built out its offerings, it identified the need to perform similarity searches across patient records to match conditions. Flagler’s engineers identified the need for a vector database but found standalone systems to be inefficient. They decided to use MongoDB Atlas Vector Search. This integrated platform allows the organization to store all data in a single location with a unified interface, facilitating quick access and efficient data querying.

Figure: What Flagler can offer

Will Hu, CTO and Co-founder of Flagler Health, emphasizes the importance of a flexible database that can evolve with the company's growth. A relational model was deemed too rigid, leading the company to choose MongoDB's document model. This flexibility allows for easy customization of client configuration files, streamlining data editing and evolution. The managed services provided on MongoDB's developer data platform save time and offer reliability at scale throughout the development cycle.
Flagler Health collaborates with many clinics, first processing millions of electronic health record (EHR) files in Databricks and transforming PDFs into raw text. Using the MongoDB Spark Connector and Atlas Data Federation , the company seamlessly streams data from AWS S3 to MongoDB. Combined with the transformed data from Databricks, Flagler’s real-time application data in MongoDB is used to generate accurate and personalized treatment plans for its users. MongoDB Atlas Search facilitates efficient data search across Flagler Health's extensive patient records. Beyond AI applications, MongoDB serves critical functions in Flagler Health's business, including its web application and patient engagement suite, fostering seamless communication between patients and clinics. This comprehensive application architecture, consolidated on MongoDB's developer data platform, simplifies Flagler Health's operations, enabling efficient development and increased productivity. By preventing administrative loops, the platform ensures timely access to potentially life-saving care for patients. Looking ahead, Flagler Health aims to enhance patient experiences by developing new features, such as a digital portal offering virtual therapy and mental health services, treatment and recovery tracking, and a repository of physical therapy videos. Leveraging MongoDB’s AI Innovators program for technical support and free Atlas credits, Flagler Health is rapidly integrating new AI-backed functionalities on the MongoDB Atlas developer data platform to further aid patients in need.

February 7, 2024

DocsGPT: Migrating One of the Industry’s Most Popular Open Source AI Assistants to Atlas Vector Search

Since its founding in 2019, Arc53 has focused on building predictive AI/ML solutions for its clients, with use cases ranging from recommendation engines to fraud detection. But it was with OpenAI’s launch of ChatGPT in November 2022 that the company saw AI rapidly take a new direction. As Arc53 co-founder Alex Tushynski explains, “It was no surprise to see generative AI suddenly capture market attention. Suddenly developers and data teams were being challenged to bring their companies’ own proprietary data to gen AI models, in what we now call retrieval-augmented generation (RAG) . But this involved them building new skills and disciplines. It wasn’t easy as they had to stitch together all of their different databases, data lakes, file systems, and search engines, and then figure out how to feed data from those systems into their shiny new vector stores. Then they had to orchestrate all of these components to build a complete solution. We identified an opportunity to abstract this complexity away from them. So DocsGPT was born.” DocsGPT is an open-source documentation assistant that makes it easy for developers to build conversational user experiences with natural language processing (NLP) directly on top of their data. That can be a chatbot on a company website for customer support or as an interface into internal data repositories to help boost employee productivity. Developers simply connect their data sources to DocsGPT to experiment with different embedding and large language models to optimize for their specific use case. LLM options currently include ChatGPT 3.5 and 4, along with DocsGPT-7B, based on Mistral. In addition to the choice of models, developers can choose where they deploy DocsGPT. They can download the open source code to run in their own environment or consume DocsGPT as a managed service from Arc53. Figure 1:   DocsGPT tech stack The freedom developers enjoy with DocsGPT is reflected in its levels of adoption. 
Since its release last year, the project has accumulated close to 14,000 GitHub stars and built a vibrant community with over 100 independent contributors. Tushynski says, “DocsGPT counts the UK government’s Department for Work and Pensions, pharmaceutical industry solution provider NoDeviation, and nearly 20,000 other users.”

Tushynski and team selected MongoDB Atlas as the database for the DocsGPT managed service. “We’ve used MongoDB in many of our prior predictive AI projects. Its flexibility to store data of any structure, scale to huge data sets, and ease of use for both developers and data scientists means we can deliver richer AI-driven solutions faster. Using it to underpin DocsGPT was an obvious choice. As developers connect their documentation to DocsGPT, MongoDB stores all of the metadata, along with chat history and user account information.”

Migrating from Elasticsearch to MongoDB Atlas Vector Search

With the release of Atlas Vector Search, the DocsGPT team is now migrating its vector database from Elasticsearch into MongoDB Atlas. Tushynski says, “MongoDB is a proven OLTP database handling high read and write throughput with transactional guarantees. Bringing these capabilities to vector search and real-time gen AI apps is massively valuable. Atlas is able to handle highly dynamic workloads with rapidly changing embeddings in ways Elasticsearch cannot. The latency Elasticsearch exhibits as it merges updates into existing indexes means the app is often retrieving stale data, impacting the quality and reliability of model outputs.”

Tushynski goes on to say, “We’ve experimented with a number of standalone vector databases. There are some good technologies there, but again, they don’t meet our needs when working with highly dynamic genAI apps. We often see users wanting to change embedding models as their apps evolve, a process that means re-encoding the data and updating the vector search index.
For example, we’ve migrated our own default embedding models from OpenAI to multiple open-source models hosted on Hugging Face and now to BGE. MongoDB’s OLTP foundations make this a fast, simple, and hassle-free process.”

“The unification and synchronization of source data, metadata, and vector embeddings in a single platform, accessed by a single API, makes building gen AI apps faster, with lower cost and complexity.”
Alex Tushynski, co-founder, Arc53

Tushynski discusses the importance of embedding models in his blog post, Amplify DocsGPT with optimal embeddings. The post includes an example of how one customer was able to improve measured user experience by 50% simply by updating their embedding model.

Figure 2: Demonstrating the impact of vector embedding choices

“One of the standout features of MongoDB Atlas in this context is its adeptness in handling multiple embeddings. The ability to link various embeddings directly with one or more LLMs without the necessity for separate collections or tables is a powerful feature,” Tushynski says. “This approach not only streamlines the data architecture but also eliminates the need for data duplication, a common challenge in traditional database setups. By facilitating the storage and management of multiple embeddings, it allows for a more seamless and flexible interaction between different LLMs and their respective embeddings.”

As part of the AI Innovators program, the DocsGPT engineering team gets free Atlas credits as well as access to technical expertise to help support their migration. The AI Innovators program is open to any startup that is building AI with MongoDB. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
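To make the "multiple embeddings without separate collections" point concrete, here is a minimal sketch of how one document could carry vectors from two embedding models, with a single Atlas Vector Search index definition covering both fields. The field names (`embedding_openai`, `embedding_bge`), the dimensions, and both helper functions are illustrative assumptions, not DocsGPT's actual schema. Only the index definition shape (`fields` entries with `type`, `path`, `numDimensions`, `similarity`) follows the Atlas Vector Search index format.

```python
from typing import Any, Dict, List

# Hypothetical embedding fields and their vector sizes. The numbers are
# examples only; use the dimensions of the models you actually run.
EMBEDDING_FIELDS: Dict[str, int] = {
    "embedding_openai": 1536,
    "embedding_bge": 768,
}


def vector_index_definition(
    fields: Dict[str, int], similarity: str = "cosine"
) -> Dict[str, Any]:
    """Return one index definition with a vector field per embedding model."""
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dims,
                "similarity": similarity,
            }
            for path, dims in fields.items()
        ]
    }


def make_chunk_document(
    text: str, source: str, embeddings: Dict[str, List[float]]
) -> Dict[str, Any]:
    """Store the chunk text, its metadata, and every embedding side by side."""
    doc: Dict[str, Any] = {"text": text, "source": source}
    for path, vector in embeddings.items():
        expected = EMBEDDING_FIELDS[path]
        if len(vector) != expected:
            raise ValueError(f"{path}: expected {expected} dims, got {len(vector)}")
        doc[path] = vector
    return doc
```

Switching the default model then becomes a matter of writing a new field on existing documents and pointing queries at a different `path`, rather than copying data into a second store.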

February 6, 2024

Building AI with MongoDB: How Patronus Automates LLM Evaluation to Boost Confidence in GenAI

It is hardly headline news that large language models can be unreliable. For some use cases, this can be inconvenient. For others — especially in regulated industries — the consequences are way more severe. Enter Patronus AI , the industry-first automated evaluation platform for LLMs. Founded by machine learning experts from Meta AI and Meta Reality Labs, Patronus AI is on a mission to boost enterprise confidence in gen AI-powered apps, leading the way in shaping a trustworthy AI landscape. Rebecca Qian, Patronus co-founder and CTO explains, “Our platform enables engineers to score and benchmark LLM performance on real-world scenarios, generate adversarial test cases, monitor hallucinations, and detect PII and other unexpected and unsafe behavior. Customers use Patronus AI to detect LLM mistakes at scale and deploy AI products safely and confidently.” In recently published and widely cited research based on the FinanceBench question answering (QA) evaluation suite , Patronus made a startling discovery. Researchers found that a range of widely used state-of-the-art LLMs frequently hallucinated, incorrectly answering or refusing to answer up to 81% of financial analysts’ questions! This error rate occurred despite the models’ context windows being augmented with context retrieved from an external vector store. While retrieval augmented generation (RAG) is a common way of feeding models with up-to-date, domain-specific context, a key question faced by app owners is how to test the reliability of model outputs in a scalable way. This is where Patronus comes in. The company has partnered with the leading technologies in the gen AI ecosystem — from model providers and frameworks to vector store and RAG solutions — to provide managed evaluation services, test suites, and adversarial data sets. “As we assessed the landscape to prioritize which partners to work with, we saw massive demand from our customers for MongoDB Atlas ," said Qian. 
“Through our Patronus RAG evaluation API, we help customers verify that their RAG systems built on top of MongoDB Atlas consistently deliver top-tier, dependable information." In its new 10-minute guide, Patronus takes developers through a workflow showcasing how to evaluate a MongoDB Atlas-based retrieval system. The guide focuses on evaluating hallucination and answer relevance against an SEC 10-K filing, simulating a financial analyst querying the document for analysis and insights. The workflow is built using the LlamaIndex data framework to ingest and chunk the source PDF document; Atlas Vector Search to store, index, and query the chunks’ metadata and embeddings; and Patronus to score the model responses. The workflow is shown in the figure below.

Equipped with the results of an analysis, there are a number of steps developers can take to improve the performance of a RAG system. These include exploring different indexes, modifying document chunking sizes, re-engineering prompts, and for the most domain-specific apps, fine-tuning the embedding model itself. Review the 10-minute guide for a more detailed explanation of each of these steps. As Qian goes on to say, “Regardless of which approach you take to debug and fix hallucinations, it’s always important to continuously test your RAG system to make sure performance improvements are maintained over time. Of course, you can use the Patronus API iteratively to confirm.”

To learn more about LLM evaluation, reach out to the Patronus team. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

February 2, 2024

Building AI With MongoDB: How Gradient Accelerator Blocks Take You From Zero To AI in Seconds

Founded by the former leaders of AI teams at Google, Netflix, and Splunk, Gradient enables businesses to create high-performing, cost-effective custom AI applications. Gradient provides a platform for businesses to build, customize, and deploy bespoke AI solutions — starting with the fastest way to develop AI through the use of its Accelerator Blocks. Gradient’s Accelerator Blocks are comprehensive, fully managed building blocks designed for AI use cases — reducing developer workload and helping businesses achieve their goals in a fraction of the time. Each block can be used as is (e.g. entity extraction, document summarization, etc.) or combined to create more robust and intricate solutions (e.g. investment co-pilots, customer chatbots, etc.) that are low-code, use best-of-breed technologies, and provide state-of-the-art performance. Gradient’s newest Accelerator Block focuses on enhancing the performance and accuracy of a model through retrieval augmented generation (RAG). The Accelerator Block uses Gradient’s state-of-the-art LLMs and embeddings, MongoDB Atlas Vector Search for storing, indexing, and retrieving high-dimensional vector data, and LlamaIndex for data integration. Together, Atlas Vector Search and LlamaIndex feed foundation models with up-to-date, proprietary enterprise data in real-time. Gradient designed the Accelerator Block for RAG to improve development velocity up to 10x by removing the need for infrastructure, setup, or in-depth knowledge around retrieval architectures. It also incorporates best practices in document chunking, re-rankers, and advanced retrieval strategies. As Tiffany Peng, VP of Engineering from Gradient explains, “Users who are looking to build custom AI applications can leverage Gradient’s Accelerator Block for RAG to set up RAG in seconds. Users just have to upload their data into our UI and Gradient will take care of the rest. 
That way users can leverage all of the benefits of RAG, without having to write any code or worry about the setup.” Peng goes on to say: “With MongoDB, developers can store data of any structure and then expose that data to OLTP, text search, and vector search processing using a single query API and driver. With this unification, developers have all of the core data services they need to build AI-powered apps that rely on working with live, operational data. For example, querying across keyword and vector search applications can filter on metadata and fuse result sets to quickly identify and return the exact context the model needs to generate grounded, accurate outputs. It is really hard to do this with other systems. That is because developers have to deal with the complexity of bolting on a standalone vector database to a separate OLTP database and search engine, and then keep all those separate systems in sync.” Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Providing further customization and an industry edge With Gradient’s platform, businesses can further build, customize, and deploy AI as they see fit — in addition to the benefits that stem from the use of Gradient’s Accelerator Blocks. Gradient partners with key vendors and communities in the AI ecosystem to provide developers and businesses with best-of-breed technologies. This includes Llama-2 and Mistral LLMs — with additional options coming — alongside the BGE embedding model and the Langchain, LlamaIndex, and Haystack frameworks. MongoDB Atlas is included as a core part of the stack available in the Gradient platform. While any business can leverage its platform, Gradient’s domain-specific models in financial services and healthcare provide a unique advantage for businesses within those industries. 
For example in financial services, typical use cases for Gradient’s models include risk management, KYC, anti-money laundering (AML), and robo-advisers, along with forecasting and analysis. In healthcare, Gradient use cases include pre-screening and post-visit summaries, clinical research, billing, and benefits, along with claims auditing. What is common to both finance and healthcare is that these two industries are subject to comprehensive regulations where user privacy is key. By building on Gradient and its state-of-the-art open-source large language models (LLMs) and embedding models, enterprises maintain full ownership of their data and AI systems. Developers can train, tune, and deploy their models in private environments running on Gradient’s AI cloud, which the company claims delivers up to 7x higher performance than base foundation models at 10x lower cost than the hyperscale cloud providers. To keep up with the latest announcements from Gradient, follow the company on Twitter/X or LinkedIn . You can learn more about MongoDB Atlas Vector Search from our 10-minute learning byte .

February 1, 2024

Building AI with MongoDB: How Devnagri Brings the Internet to 1.3 Billion People with Machine Translations

It was while on a trip to Japan that Himanshu Sharma — later to become CEO of Devnagri — made an observation that drew parallels with his native India. Despite the majority of Japan’s population not speaking English, they were still well served by an internet that was largely based on the English language. Key to doing this was translation, and specifically the early days of automated machine translation. And so the idea to found Devnagri , India’s first AI-powered translation platform, was born. “In India, 90% of the population are not fluent in English. That is close to 1.3 billion people . We wanted to bridge this gap to make it easy for non-English speakers to access the internet in their native languages. There are more than 22 Indian languages in use, but they represent just 0.1% of data on the internet,” says Sharma. “We want to give people the same access to knowledge and education in their native languages so that they can be part of the digital ecosystem. We wanted to help businesses and the government reach real people who were not online because of the language barrier.” Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Figure 1: Devnagri’s real time translation engine helps over 100 Indian brands connect with their customers over digital channels for the first time Building India’s first machine translation platform Sharma and his team at Devnagri have developed an AI-powered translation platform that can accept multiple file formats from different industry domains. Conceptually it is similar to Google Translate. Rather than a general consumer tool, it focuses on the four key industries that together make the largest impact on the everyday lives of Indian citizens: e-learning, banking, e-commerce, and media publishing. Devnagri provides API access to its platform and a plug-and-play solution for dynamically translating applications and websites. 
As Sharma explains, “Our platform is built on our own custom transformer model based on the MarianNMT neural machine translation framework. We train on corpuses of content in documents, chunking them into sentences and storing them in MongoDB Atlas. We use in-context learning for training, which is further augmented with reinforcement learning from human feedback (RLHF) to further tune for precise accuracy.” Sharma goes on to say, “We run on Google Vertex AI, which handles our MLops pipeline across both model training as well as inferencing. We use Google Tensor Processing Units (TPUs) to host our models so we can translate content (such as web pages, PDFs, documentation, web and mobile apps, images, and more) for users on the fly in real-time.”

While the custom transformer-based models have served the company well, recent advancements in off-the-shelf models are leading Devnagri’s engineers to switch. They are evaluating a move to OpenAI GPT-4 and the Llama-2-7b foundation models, fine-tuned with the past four years of machine translation data captured by Devnagri.

Why MongoDB? Flexibility and performance

MongoDB is used as the database platform for Devnagri’s machine translation models. For each sentence chunk, MongoDB stores the source English language version, the machine translation, and if applicable, the human-verified sentence translation. As Sharma explains, “We use the sentences stored in MongoDB to train our models and support real-time inference. The flexibility of its document data model made MongoDB an ideal fit to store the diversity of structured and unstructured content and features our ML models translate.”

“We also exploit MongoDB’s scalable distributed architecture. This allows our models to parallelize read and write requests across multiple nodes in the cloud, dramatically improving training and inference throughput. We get faster time to market with higher quality results by using MongoDB.”
Himanshu Sharma, Devnagri co-founder and CEO

What's next?

Today Devnagri serves over 100 brands and several government agencies in India. The company has also joined MongoDB’s AI Innovators Program. The program provides its data science team with access to free Atlas credits to support further machine translation experiments and development, along with access to technical guidance and best practices. If you are building AI-powered apps, the best way to get started is to sign up for an account on MongoDB Atlas. From there, you can create a free MongoDB instance with the Atlas database and Atlas Vector Search, load your own data or our sample data sets, and explore what’s possible within the platform.
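The per-sentence record described under "Why MongoDB?" (source English text, machine translation, and an optional human-verified translation) maps naturally onto one document per sentence chunk. The sketch below is only a guess at such a shape: every field name, the `target_lang` attribute, and the `SentenceRecord` class are hypothetical and not taken from Devnagri's schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class SentenceRecord:
    """One sentence chunk with its translations (hypothetical field names)."""

    source_en: str
    target_lang: str
    machine_translation: str
    human_verified: Optional[str] = None  # present only after human review

    def best_translation(self) -> str:
        """Prefer the human-verified text when it exists."""
        return self.human_verified or self.machine_translation

    def to_document(self) -> Dict[str, Any]:
        """Build the dict to insert; omit the optional field when unset,
        which the document model allows without a schema change."""
        doc: Dict[str, Any] = {
            "source_en": self.source_en,
            "target_lang": self.target_lang,
            "machine_translation": self.machine_translation,
        }
        if self.human_verified is not None:
            doc["human_verified"] = self.human_verified
        return doc
```

Records that carry a `human_verified` field could then be selected as higher-quality training pairs, while the rest still serve real-time lookups.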

January 23, 2024

A Discussion with VISO TRUST: Expanding Atlas Vector Search to Provide Better-Informed Risk Decisions

We recently caught up with the team at VISO TRUST to check in and learn more about their use of MongoDB and their evolving search needs (if you missed our first story, read more about VISO TRUST’s AI use cases with MongoDB on our first blog ). VISO TRUST is an AI-powered third-party cyber risk and trust platform that enables any company to access actionable vendor security information in minutes. VISO TRUST delivers the fast and accurate intelligence needed to make informed cybersecurity risk decisions at scale for companies at any maturity level. Since our last discussion back in September 2023, VISO TRUST has adopted our new dedicated Search Nodes architecture, as well as scaled up both dense and sparse embeddings and retrieval to improve the user experience for their customers. We sat down for a deeper dive with Pierce Lamb, Senior Software Engineer on the Data and Machine Learning team at VISO TRUST to hear more about the latest exciting updates. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. How have things been evolving at VISO TRUST? What are some of the new things you're excited about since we spoke last? There have definitely been some exciting developments since we last spoke. Since then, we’ve implemented a new technique for extracting information out of PDF and image files that is much more accurate and breaks extractions into clear semantic units: sentences, paragraphs, and table rows. This might sound simple, but correctly extracting semantic units out of these PDF files is not an easy task by any means. We tested the entire Python ecosystem of PDF extraction libraries, cloud-based OCR services, and more, and settled on what we believe is currently state-of-the-art. For a retrieval augmented generation (RAG) system , which includes vector search, the accuracy of data extraction is the foundation on which everything else rests. 
Improving this process is a big win and will continue to be a mainstay of our focus. Last time we spoke, I mentioned that we were using MongoDB Atlas Vector Search to power a dense retrieval system and that we had plans to build a re-ranking architecture. Since then I’m happy to confirm we have achieved this goal. In our intelligent question-answering service, every time a question is asked, our re-ranking architecture provides four levels of ranking and scoring to a set of possible contexts in a matter of seconds to be used by large language models (LLMs) to answer the question. One additional exciting announcement is we’re now using MongoDB Atlas Search Nodes, which allow workload isolation when scaling search independently from our database. Previously, we were upgrading our entire database instance solely because our search needs were changing so rapidly (but our database needs were not). Now we are able to closely tune our search workloads to specific nodes and allow our database needs to change at a much different pace. As an example, retraining is much easier to track and tune with search nodes that can fit the entire Atlas Search Index in memory (which has significant latency implications). As many have echoed recently, our usage of LLMs has not reduced or eliminated our use of discriminative model inference but rather increased it. As the database that powers our ML tools, MongoDB has become the place we store and retrieve training data, which is a big performance improvement over AWS S3. We continue to use more and more model inference to perform tasks like classification that the in-context learning of LLMs cannot beat. We let LLMs stick to the use cases they are really good at like dealing with imperfect human language and providing labeled training data for discriminative models.

Figure: VISO TRUST's AI Q&A feature being asked a security question

You mentioned the recent adoption of Search Nodes. What impacts have you seen so far, especially given your existing usage of Atlas Vector Search?

We were excited when we heard the announcement of Search Nodes in General Availability, as the offering solves an acute pain point we’d been experiencing. MongoDB started as the place where our machine learning and data team backed up and stored training data generated by our Document Intelligence Pipeline. When the requirements to build a generative AI product became clear, we were thrilled to see that MongoDB had a vector search offering because all of our document metadata already existed in Atlas. We were able to experiment with, deploy, and grow our generative AI product right on top of MongoDB. Our deployment, however, was now serving multiple use cases: backing up and storing data created by our pipeline and also servicing our vector search needs. The latter forced us to scale the entire deployment multiple times when our original MongoDB use case didn’t require it. Atlas Search Nodes enable us to decouple these two use cases and scale them independently. It was incredibly easy to deploy our search data to Atlas Search Nodes, requiring only a few button clicks. Furthermore, the memory requirements of vector search can now match our Atlas Search Node deployment exactly; we do not need to consider any extra memory for our storage and backup use case. This is a crucial consideration for keeping vector search fast and streamlined.

Can you go into a bit more detail on how your use cases have evolved with Vector Search, especially as it relates to dense and sparse embeddings and retrieval?

We provide a Q&A system that allows clients to ask questions of the security documents they or their vendors upload. For example, if a client wanted to know what one of their vendor’s password policies is, they could ask the system that question and get an answer with cited evidence without needing to look through the documents themselves.
The same system can be used to automatically answer third-party security questionnaires our clients receive by parsing the questions out of them and answering those questions using data from our client’s documents. This saves a lot of time because answering security questions can often take weeks and involve multiple departments.
The above system relies on three main collections separated via the semantic units mentioned above: paragraphs, sentences, and table rows. These are extracted from various security compliance documents uploaded to the VISO TRUST platform (things like SOC2s, ISOs, and security policies, among others). Each sentence has a field with an ObjectId that links to the corresponding paragraph or table row for easy look-up. To give a sense of size, the sentences collection is on the order of tens of millions of documents and growing every day.
When a question request enters the re-ranking system, sparse retrieval (keyword search for similarity) is performed, followed by dense retrieval, using a list of IDs passed in the request to filter down to the set of documents the context can come from. The document filtering generally takes the scope from tens of millions to tens or hundreds of thousands. Sparse and dense retrieval independently score and rank those thousands or millions of sentences, and each returns its top one hundred in a matter of milliseconds to seconds. These two sets of results are merged into a final set of one hundred, favoring dense results unless a sparse result meets certain thresholds. At this point, we have a set of one hundred sentences, scored and ranked by similarity to the question, using two different methods powered by Atlas Search, in milliseconds to seconds. In parallel, we pass those hundred sentences to a multi-representational model and a cross-encoder model to provide their scoring and ranking of each sentence.
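The filtered dense retrieval step described above can be sketched as an Atlas Vector Search aggregation pipeline. This is a minimal illustration, not VISO TRUST's actual query: the index name `sentence_vector_index` and the field names `embedding`, `documentId`, `text`, and `parentId` are assumptions made for the example, and the candidate multiplier is an arbitrary choice. The `$vectorSearch` stage and its `filter`, `numCandidates`, and `limit` options are part of Atlas Vector Search; the filter field would need to be declared as a filter field in the vector search index.

```python
from typing import Any, Dict, List, Sequence


def build_dense_retrieval_pipeline(
    query_vector: Sequence[float],
    allowed_document_ids: Sequence[str],
    limit: int = 100,
    candidates_per_result: int = 10,
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline for ID-filtered dense retrieval.

    All index and field names below are hypothetical placeholders.
    """
    if limit <= 0 or candidates_per_result <= 0:
        raise ValueError("limit and candidates_per_result must be positive")

    vector_search_stage = {
        "$vectorSearch": {
            "index": "sentence_vector_index",  # hypothetical index name
            "path": "embedding",               # hypothetical vector field
            "queryVector": list(query_vector),
            # numCandidates must be at least as large as limit.
            "numCandidates": limit * candidates_per_result,
            "limit": limit,
            # Pre-filter: only sentences from the documents named in the request.
            "filter": {"documentId": {"$in": list(allowed_document_ids)}},
        }
    }
    project_stage = {
        "$project": {
            "text": 1,
            "parentId": 1,  # hypothetical link to the paragraph or table row
            "score": {"$meta": "vectorSearchScore"},
        }
    }
    return [vector_search_stage, project_stage]
```

With a driver such as PyMongo, the returned list would be passed to `collection.aggregate(...)` on the sentences collection; the question embedding itself would come from whatever encoder the application uses.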
Once complete, we now have four independent levels of scoring and ranking for each sentence (sparse, dense, multi-representational, and cross-encoder). This data is passed to the Weighted Reciprocal Rank Fusion algorithm, which uses the four independent rankings to create a final ranking and sorting, returning the number of results requested by the caller.
How are you measuring the impact or relative success of your retrieval efforts?
The monolithic collections I spoke about above grow substantially daily, as we’ve almost tripled our sentence volume since first bringing data into MongoDB, while still maintaining the same low latency our users depend on. We needed a vector database partner that allowed us to easily scale as our datasets grow and continue to deliver millisecond-to-second performance on similarity searches. Our system can often have many in-flight question requests occurring in parallel, and Atlas has allowed us to scale with the click of a button when we start to hit performance limits. One piece of advice I would give to readers creating a RAG system using MongoDB’s Vector Search is to use ReadPreferences to ensure that retrieval queries and other reads occur primarily on secondary nodes. We use ReadPreference.secondariesPreferred almost everywhere and this has helped substantially with the load on the system.
Lastly, can you describe how MongoDB helps you execute on your goal of helping to make better-informed risk assessments?
As most people involved in compliance, auditing, and risk assessment efforts will report, these essential tasks tend to significantly slow down business transactions. This is in part because the need for perfect accuracy is extremely high and also because they tend to be human-reliant and slow to adopt new technology. At VISO TRUST, we are committed to delivering that same level of accuracy, but much faster.
Since 2017, we have been executing on that vision and our generative AI products represent a leap forward in enabling our clients to assess and mitigate risk at a faster pace with increased levels of accuracy. MongoDB has been a key partner in the success of our generative AI products by becoming the reliable place we can store and query the data for our AI-based results.
Getting started
Thanks so much to Pierce Lamb for sharing details on VISO TRUST’s AI-powered applications and experiences with MongoDB. To learn more about MongoDB Atlas Search, check out our learning byte, or if you’re ready to get started, head over to the product page to explore tutorials, documentation, and whitepapers. You’ll just be a few clicks away from spinning up your own vector search engine where you can experiment with the power of vector embeddings, RAG, and more!
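The Weighted Reciprocal Rank Fusion step mentioned in the interview above can be sketched as follows. This is a generic illustration of the technique rather than VISO TRUST's implementation: each ranker contributes `weight / (k + rank)` for every item it ranked, and the contributions are summed. The smoothing constant `k = 60` is a commonly used default for reciprocal rank fusion, and the weights in the usage note are purely hypothetical; the interview does not state either.

```python
from typing import Dict, List, Mapping, Sequence, Tuple


def weighted_reciprocal_rank_fusion(
    rankings: Mapping[str, Sequence[str]],
    weights: Mapping[str, float],
    k: int = 60,
    top_n: int = 10,
) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of item IDs into one ranking.

    rankings maps a ranker name to item IDs ordered best-first.
    weights maps a ranker name to its weight (1.0 if not given).
    Items missing from a ranker's list get no contribution from it.
    """
    if k <= 0:
        raise ValueError("k must be positive")

    scores: Dict[str, float] = {}
    for ranker_name, ranked_ids in rankings.items():
        weight = weights.get(ranker_name, 1.0)
        for rank, item_id in enumerate(ranked_ids, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + weight / (k + rank)

    # Highest fused score first; ties broken by item ID for determinism.
    ordered = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return ordered[:top_n]
```

A caller would pass one ranked list per level, for example keys such as `"sparse"`, `"dense"`, `"multi_representational"`, and `"cross_encoder"`, along with a weight per level and the number of results requested. Because the fusion uses only rank positions, the four levels do not need to produce scores on a comparable scale.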

January 17, 2024

Powering Vector Search Maturity in Retail with Pureinsights

In a competitive retail market, with customer demands higher than ever, retailers are on a constant journey toward search maturity. With the recent announcement of MongoDB’s Vector Search offering, retailers are implementing smarter search solutions to provide customers and staff with delightful experiences. Here we’ll explore how partners like Pureinsights are helping retailers to understand what true search maturity entails, and how to start their vector search journey on MongoDB Atlas. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
How MongoDB Partners Like Pureinsights Can Help
Search and AI application specialists like Pureinsights can shorten the planning and development cycle, bring applications to production faster, and accelerate time to value for the customer.
The Architecture of Vector Search Applications
Virtually every Vector Search application will follow the basic logical flow illustrated below. A Client creates a complex query, which is then submitted to an encoder. The encoder turns the query into a Vector and submits it to the Vector Search Engine. The Vector Search engine searches the Vector Database and returns results, which are then formulated and returned to the Client for presentation. A complete Vector Search application includes all of the elements in this diagram, but not all of them are currently provided in the MongoDB Atlas platform. Everything to the left of the Vector Search Engine has to be developed by someone. MongoDB provides the vector store and a means to search it, but someone has to build the client and logic for the complete application.
Why Involve Pureinsights to build your Vector Search applications?
Pureinsights is a MongoDB BSI partner and has extensive knowledge and expertise in helping customers accelerate time-to-production of premier search applications.
Pureinsights specializes in search applications and provides services to build end-to-end vector search solutions, including solutions to create and populate MongoDB Vector Search and a UI/client to search MongoDB Atlas using Atlas Search and Atlas Vector Search. Customers can focus on their core business while we do the development.
Pureinsights Search Maturity Matrix – A Roadmap for Better Search, including Vector Search
All of the use cases we discussed (e-commerce search, AI-powered search for support, and product information/reviews) are advanced search features for Retail. But it’s always best to walk before you run, so before implementing Vector Search, a good strategy is to make sure your current applications have been optimized. Pureinsights’ methodology for search applications includes analyzing the state of current applications using a Search Maturity Matrix.
Pureinsights - Design, Build, and Manage
After customers map out their journey to build out advanced search capabilities for their retail applications, Pureinsights can help them build the applications on the MongoDB Atlas Platform from design, to build, to operations.
Application Design and Architecture: A well-defined plan is the key to efficient application development. Pureinsights, with their immense experience, can help with complex design decisions, such as choosing the right AI models and creating the best architecture for performance and security.
Application Build: With over 20 years of experience in search, Pureinsights can help you build and deploy your Atlas Search application quickly and efficiently. Pureinsights has developed methodologies and frameworks like the Pureinsights Discovery Platform, which work with AI technologies (e.g., ChatGPT) and integrate with the Atlas platform to reduce development time and accelerate time to production.
Managed services: Pureinsights can even run your search application for you with their SearchOps and maintain it for optimum performance with their fully managed service so you can focus on your core business.
Conclusion
Pureinsights can help customers overcome the challenges of building vector search applications and accelerate the time to production. With their expertise in application design, build, and managed services, Pureinsights can help customers build and deploy next-generation vector search applications that deliver real business value.
Is your e-commerce store ready for AI? And are your products as easy to find as your competitors’? Modern consumers expect flawless search experiences in mobile and online e-commerce search. Join MongoDB and Pureinsights on Tuesday, January 23, at 1pm ET for an insightful new webinar hosted by Digital Commerce 360 to learn:
What is the Search Maturity Matrix, and which capabilities is your organization missing to achieve better results
How retailers are building smarter search applications with AI
What's possible with MongoDB's new Vector Search offering
Related resources: Modernize E-commerce Customer Experiences with MongoDB | MongoDB Atlas Vector Search | MongoDB MongoDB Atlas for Retail: Driving Innovation from Supply Chain to Checkout | MongoDB MongoDB Atlas Search for Retail: Go Beyond the E-commerce Store | MongoDB

December 14, 2023