Building AI with MongoDB: Putting Jina AI’s Breakthrough Open Source Embedding Model To Work
Founded in 2020 and based in Berlin, Germany, Jina AI has swiftly risen as a leader in multimodal AI, focusing on prompt engineering and embedding models. With its commitment to open source and open research, Jina AI is bridging the gap between advanced AI theory and the real-world AI-powered applications being built by developers and data scientists. Over 400,000 users are registered to use the Jina AI platform.

Dr. Han Xiao, founder and CEO at Jina AI, describes the company’s mission: “We envision paving the way towards the future of AI as a multimodal reality. We recognize that the existing machine learning and software ecosystems face challenges in handling multimodal AI. As a response, we're committed to developing pioneering tools and platforms that assist businesses and developers in navigating these complexities. Our vision is to play a crucial role in helping the world harness the vast potential of multimodal AI and truly revolutionize the way we interpret and interact with information."

Jina AI’s work in embedding models has attracted significant industry interest. As many developers now know, embeddings are essential to generative AI (gen AI). Embedding models are sophisticated algorithms that transform data of any structure into multi-dimensional numerical encodings called vectors. These vectors give data semantic meaning by capturing its patterns and relationships. This means we can analyze and search unstructured data in the same way we’ve always been able to with structured business data. Considering that over 80% of the data we create every day is unstructured, we start to appreciate how transformational embeddings — when combined with a powerful solution such as MongoDB Atlas Vector Search — are for gen AI. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Jina AI's jina-embeddings-v2 is the first open-source 8K text embedding model.
Its 8K token length provides deeper context comprehension, significantly enhancing accuracy and relevance for tasks like retrieval-augmented generation (RAG) and semantic search. Jina AI’s embeddings offer enhanced data indexing and search capabilities, along with bilingual support. The embedding models focus on individual languages and language pairs, ensuring state-of-the-art performance on language-specific benchmarks. Currently, Jina Embeddings v2 includes bilingual German-English and Chinese-English models, with other bilingual models in the works. Jina AI’s embedding models excel in classification, reranking, retrieval, and summarization, making them suitable for diverse applications, especially those that are cross-lingual. Recent examples from multinational enterprise customers include the automation of sales sequences, skills matching in HR applications, and payment reconciliation with fraud detection.

Figure 1: Jina AI’s world-class embedding models improve search and RAG systems.

In our recently published Jina Embeddings v2 and MongoDB Atlas article, we show developers how to get started bringing vector embeddings into their apps. The article covers:

- Creating a MongoDB Atlas instance and loading it with your data. (The article uses a sample Airbnb reviews data set.)
- Creating embeddings for the data set using the Jina Embeddings API.
- Storing and indexing the embeddings with Atlas Vector Search.
- Implementing semantic search using the embeddings.

Dr. Xiao says, “Our Embedding API is natively integrated with key technologies within the gen AI developer stack, including MongoDB Atlas, LangChain, LlamaIndex, Dify, and Haystack. MongoDB Atlas unifies application data and vector embeddings in a single platform, keeping both fully synced. Atlas Triggers keep embeddings fresh by calling our Embeddings API whenever data is inserted or updated in the database.
This integrated approach makes developers more productive as they build new, cutting-edge AI-powered apps for the business.”

To get started with MongoDB and Jina AI, register for MongoDB Atlas and read the tutorial. If your team is building its AI apps, sign up for the AI Innovators Program. Successful companies get access to free Atlas credits and technical enablement, as well as connections into the broader AI ecosystem.
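The semantic search step from the tutorial can be sketched with MongoDB's `$vectorSearch` aggregation stage. This is a minimal illustration, not the tutorial's exact code: the index name, embedding field, and projected fields are assumptions that must match the Atlas Vector Search index you actually create, and the query vector would come from the Jina Embeddings API.

```python
def vector_search_pipeline(query_vector, index="vector_index", path="embedding", limit=5):
    """Build an Atlas Vector Search aggregation pipeline for semantic search.

    The index and field names here are illustrative; they must match the
    Atlas Vector Search index defined on your collection.
    """
    return [
        {
            "$vectorSearch": {
                "index": index,                # name of the Atlas Vector Search index
                "path": path,                  # document field holding the embedding
                "queryVector": query_vector,   # embedding of the user's query
                "numCandidates": limit * 20,   # candidates considered before ranking
                "limit": limit,                # number of results returned
            }
        },
        # Keep the review text and surface each match's similarity score.
        {"$project": {"_id": 0, "text": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]
```

With PyMongo, this would run as `collection.aggregate(vector_search_pipeline(query_embedding))`, where `query_embedding` is the vector the Jina Embeddings API returns for the search text.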
Building AI with MongoDB: Navigating the Path From Predictive to Generative AI
It should come as no surprise that the organizations unlocking the largest benefits from generative AI (gen AI) today have already been using predictive AI (a.k.a. classic, traditional, or analytical AI). McKinsey made this same observation back in June 2023 with its “Economic potential of generative AI”¹ research. There would seem to be several reasons for this:

- An internal culture that is willing to experiment and explore what AI can do
- Access to skills — though we must emphasize that gen AI is far more reliant on developers than the data scientists driving predictive AI
- Availability of clean and curated data from across the organization that is ready to be fed into gen AI models

This isn’t to say that only those teams with prior experience in predictive AI stand to benefit from gen AI. If you take a look at examples from our Building AI case study series, you’ll see many organizations with different AI maturity levels tapping MongoDB for gen AI innovation today. In this latest edition of the Building AI series, we feature two companies that, having built predictive AI apps, are now navigating the path to generative AI:

- MyGamePlan helps professional football players and coaches improve team performance.
- Ferret.ai helps businesses and consumers build trust by running background checks using public domain data.

In both cases, predictive AI is central to data-driven decision-making. And now both are exploring gen AI to extend their services with new products that further deepen user engagement. The common factor for both? Their use of MongoDB Atlas and its flexibility for any AI use case. Let's dig in.

MyGamePlan: Elevating the performance of professional football players with AI-driven insights

The use of data and analytics to improve the performance of professional athletes isn’t new. Typically, solutions are highly complex, relying on the integration of multiple data providers, resulting in high costs and slow time-to-insight.
MyGamePlan is working to change that for professional football clubs and their players. (For the benefit of my U.S. colleagues, where you see “football,” read “soccer.”) MyGamePlan is used by staff and players at successful teams across Europe, including Bayer Leverkusen (current number one in the German Bundesliga), Sunderland AFC in the English Championship, CD Castellón (current number one in the third division of Spain), and Śląsk Wrocław (current number one in the Polish Ekstraklasa). I met with Dries Deprest, CTO and co-founder at MyGamePlan, who explains, “We redefine football analysis with cutting-edge analytics, AI, and a user-friendly platform that seamlessly integrates data from match events, player tracking, and video sources. Our platform automates workflows, allowing coaches and players to formulate tactics for each game, empower player development, and drive strategic excellence for the team's success.”

At the core of the MyGamePlan platform are custom, Python-based predictive AI models hosted in Amazon SageMaker. The models analyze passages of gameplay to score the performance of individual players and their impact on the game. Performance and contribution can be tracked over time and used to compare with players on opposing teams to help formulate matchday tactics. Data is key to making the models and predictions accurate. The company uses MongoDB Atlas as its database, storing:

- Metadata for each game, including matches, teams, and players.
- Event data from each game, such as passes, tackles, fouls, and shots.
- Tracking telemetry that captures the position of each player on the field every 100ms.

This data is pulled from MongoDB into Python DataFrames, where it is used alongside third-party data streams to train the company’s ML models. Inferences generated from specific sequences of gameplay are stored back in MongoDB Atlas for downstream analysis by coaches and players.
Figure 1: With MyGamePlan’s web and mobile apps, coaching staff and players can instantly assess gameplay and shape tactics.

On selecting MongoDB, Deprest says:

“We are continuously enriching data with AI models and using it for insights and analytics. MongoDB is a great fit for this use case.”

“We chose MongoDB when we started our development two years ago. Our data has complex multi-way relationships, mapping games to players to events and tracking. The best way to represent this data is with nested elements in rich document data structures. It's way more efficient for my developers to work with and for the app to process. Trying to model these relationships with foreign keys and then joining normalized tables in relational databases would be slow and inefficient.”

In terms of development, Deprest says, “We use the PyMongo driver to integrate MongoDB with our Python ML data pipelines in SageMaker and the MongoDB Node.js driver for our React-based, client-facing web and mobile apps.”

Deprest goes on to say, "There are two key factors that differentiate MongoDB from the other NoSQL databases we considered: the incredible level of developer adoption it has, meaning my team was immediately familiar and productive with it, and the fact that we can build in-app analytics directly on top of our live data, without the time and expense of having to move it out into a data warehouse or data lake. With MongoDB’s aggregation pipelines, we can process and analyze data with powerful roll-ups, transformations, and window functions to slice and dice data any way our users need it."

Moving beyond predictive AI, the MyGamePlan team is now evaluating how gen AI can further improve the user experience. Deprest says, "We have so much rich data and analytics in our platform, and we want to make it even easier for players and coaches to extract insights from it. We are experimenting with natural language processing via chat and question-answering interfaces on top of the data.
Gen AI makes it easy for users to visualize and summarize the data. We are currently evaluating OpenAI’s ChatGPT LLM coupled with sophisticated approaches to prompt engineering, orchestration via LangChain, and retrieval-augmented generation (RAG) using LlamaIndex and MongoDB Atlas Vector Search."

“As our source data is in the MongoDB Atlas database already, unifying it with vector storage and search is a very productive and elegant solution for my developers.”
Dries Deprest, CTO and co-founder, MyGamePlan

By building on MongoDB Atlas, MyGamePlan’s team can use the breadth of functionality provided by a developer data platform to support almost any application and AI need in the future. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Ferret.ai: Building trust with relationship intelligence powered by AI and MongoDB Atlas while cutting costs by 30%

Across the physical and digital world, we are all constantly building relationships with others. Those relationships can be established through peer-to-peer transactions across online marketplaces, between tradespeople and professionals and their prospective clients, between investors and founders, or in creating new personal connections. All of those relationships rely on trust to work, but building that trust is hard. Ferret.ai was founded to remove the guesswork from building it. Ferret is an AI platform architected from the ground up to empower companies and individuals with real-time, unbiased intelligence to identify risks and embrace opportunities. Leveraging cutting-edge predictive and generative AI, hundreds of thousands of global data sources, and billions of public documents, Ferret.ai provides curated relationship intelligence and monitoring — once only available to the financial industry — making transparency the new norm.

Al Basseri, CTO at Ferret, tells us how it works: "We ingest information about individuals from public sources.
This includes social networks, trading records, court documents, news archives, corporate ownership, and registered business interests. This data is streamed through Kafka pipelines into our Anyscale/Ray MLOps platform, where we apply natural language processing through our spaCy extraction and machine learning models. All metadata from our data sources — that's close to three billion documents — along with inferences from our models are stored in MongoDB Atlas. The data in Atlas is consumed by our web and mobile customer apps and by our corporate customers through our upcoming APIs."

Figure 2: Artificial intelligence + real-time data = Relationship Intelligence from Ferret.ai.

Moving beyond predictive AI, the company’s developers are now exploring opportunities to use gen AI in the Ferret platform. "We have a close relationship with the data science team at NVIDIA,” says Basseri. “We see the opportunity to summarize the data sources and analysis we provide to help our clients better understand and engage with their contacts. Through our experimentation, the Mistral model with its mixture-of-experts ensemble seems to give us better results with less resource overhead than some of the larger and more generic large language models."

As well as managing the data from Ferret’s predictive and gen AI models, customer data and contact lists are also stored in MongoDB Atlas. Through Ferret’s continuous monitoring and scoring of public record sources, any change in an individual's status is immediately detected. As Basseri explains, "MongoDB Atlas Triggers watch for updates to a score and instantly send an alert to consuming apps, so our customers get real-time visibility into their relationship networks. It's all fully event-driven and reactive, so my developers just set it and forget it."

Basseri also described the other advantages MongoDB provides his developers: through Atlas, it’s available as a fully managed service with best practices baked in.
That frees his developers and data scientists from the responsibilities of running a database so they can focus their efforts on app and AI innovation. MongoDB Atlas is also mature, having proven it can scale in many other high-growth companies. And the availability of engineers who know MongoDB is important as the team rapidly expands.

Beyond the database, Ferret is extending its use of the MongoDB Atlas platform into text search. As the company moves into Google Cloud, it is migrating from its existing Amazon OpenSearch service to Atlas Search. Discussing the drivers for the migration, Basseri says, "Unifying both database and search behind a single API reduces cognitive load for my developers, so they are more productive and build features faster. We eliminate all of the hassle of syncing data between the database and search engine. Again, this frees up engineering cycles. It also means our users get a better experience because previous latency bottlenecks are gone — so as they search across contacts and content on our platform, they get the freshest results, not stale and outdated data."

“By migrating from OpenSearch to Atlas Search, we also save money and get more freedom. We will reduce our total cloud costs by 30% per month just by eliminating unnecessary data duplication between the database and the search engine. And with Atlas being multi-cloud, we get the optionality to move across cloud providers as and when we need to.”
Al Basseri, CTO at Ferret.ai

Once the migration is complete, Basseri and the team will begin development with Atlas Vector Search as they continue to build out the gen AI side of the Ferret platform.

What's next?

No matter where you are in your AI journey, MongoDB can help. You can get started with your AI-powered apps by registering for MongoDB Atlas and exploring the tutorials available in our AI resources center. Our teams are always ready to come and explore the art of the possible with you.
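The event-driven alerting Basseri describes can be sketched with MongoDB's change-stream filters, which are what Atlas Triggers use under the hood. This is an illustrative sketch only: the `risk_score` field name is hypothetical (Ferret's schema is not public), and a real Atlas Trigger would be configured in Atlas with a serverless function rather than this helper.

```python
def score_change_filter(field="risk_score", min_score=None):
    """Build a change-stream $match pipeline that fires only when a score
    field is modified by an update.

    field: the document field to watch ("risk_score" is a hypothetical name)
    min_score: if set, only fire when the new value is at least this high
    """
    key = f"updateDescription.updatedFields.{field}"
    criteria = {
        "operationType": "update",          # ignore inserts, deletes, etc.
        key: {"$exists": True},             # only when this field changed
    }
    if min_score is not None:
        criteria[key] = {"$gte": min_score}  # threshold on the new value
    return [{"$match": criteria}]
```

With PyMongo, `collection.watch(score_change_filter())` would then yield one event per matching update, which an app can forward as an alert.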
1 https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier
Building AI with MongoDB: How Flagler Health's AI-Powered Journey is Revolutionizing Patient Care
Flagler Health is dedicated to supporting patients with chronic diseases by matching them with the right physician for the right care. Typically, patients grappling with severe pain conditions face limited options, often relying on prolonged opioid use or exploring costly and invasive surgical interventions. Unfortunately, the latter approach is not only expensive but also has a long recovery period. Flagler finds these patients and triages them to the appropriate specialist for an advanced and comprehensive evaluation.

The current state without Flagler

Flagler Health employs sophisticated AI techniques to rapidly process, synthesize, and analyze patient health records to aid physicians in treating patients with advanced pain conditions. This enables medical teams to make well-informed decisions, resulting in improved patient outcomes, with an accuracy rate exceeding 90% in identifying and diagnosing patients. As the company built out its offerings, it identified the need to perform similarity searches across patient records to match conditions. Flagler’s engineers recognized the need for a vector database but found standalone systems to be inefficient. They decided to use MongoDB Atlas Vector Search. This integrated platform allows the organization to store all data in a single location with a unified interface, facilitating quick access and efficient data querying.

What Flagler can offer

Will Hu, CTO and co-founder of Flagler Health, emphasizes the importance of a flexible database that can evolve with the company's growth. A relational model was deemed too rigid, leading the company to choose MongoDB's document model. This flexibility allows for easy customization of client configuration files, streamlining data editing and evolution. The managed services provided on MongoDB's developer data platform save time and offer reliability at scale throughout the development cycle.
Flagler Health collaborates with many clinics, first processing millions of electronic health record (EHR) files in Databricks and transforming PDFs into raw text. Using the MongoDB Spark Connector and Atlas Data Federation, the company seamlessly streams data from AWS S3 to MongoDB. Combined with the transformed data from Databricks, Flagler’s real-time application data in MongoDB is used to generate accurate and personalized treatment plans for its users. MongoDB Atlas Search facilitates efficient data search across Flagler Health's extensive patient records. Beyond AI applications, MongoDB serves critical functions in Flagler Health's business, including its web application and patient engagement suite, fostering seamless communication between patients and clinics. This comprehensive application architecture, consolidated on MongoDB's developer data platform, simplifies Flagler Health's operations, enabling efficient development and increased productivity. By preventing administrative loops, the platform ensures timely access to potentially life-saving care for patients. Looking ahead, Flagler Health aims to enhance patient experiences by developing new features, such as a digital portal offering virtual therapy and mental health services, treatment and recovery tracking, and a repository of physical therapy videos. Leveraging MongoDB’s AI Innovators program for technical support and free Atlas credits, Flagler Health is rapidly integrating new AI-backed functionalities on the MongoDB Atlas developer data platform to further aid patients in need.
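The full-text search over patient records described above can be sketched with MongoDB's `$search` aggregation stage (Atlas Search). This is a minimal, hypothetical sketch: the index name, searched fields, and result shape are assumptions for illustration, not Flagler Health's actual schema.

```python
def record_search_pipeline(query, paths=("notes", "diagnosis"), index="default", limit=10):
    """Build an Atlas Search pipeline for full-text search over records.

    Field and index names are illustrative. The fuzzy option tolerates
    one character edit, which helps with typos in clinical free text.
    """
    return [
        {
            "$search": {
                "index": index,
                "text": {
                    "query": query,
                    "path": list(paths),          # fields covered by the search index
                    "fuzzy": {"maxEdits": 1},     # tolerate small misspellings
                },
            }
        },
        {"$limit": limit},
        # Surface the relevance score alongside each matching document.
        {"$project": {"notes": 1, "score": {"$meta": "searchScore"}}},
    ]
```

As with any aggregation pipeline, this would run via `collection.aggregate(...)` against a collection that has an Atlas Search index defined.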
DocsGPT: Migrating One of the Industry’s Most Popular Open Source AI Assistants to Atlas Vector Search
Since its founding in 2019, Arc53 has focused on building predictive AI/ML solutions for its clients, with use cases ranging from recommendation engines to fraud detection. But it was with OpenAI’s launch of ChatGPT in November 2022 that the company saw AI rapidly take a new direction. As Arc53 co-founder Alex Tushynski explains, “It was no surprise to see generative AI suddenly capture market attention. Suddenly developers and data teams were being challenged to bring their companies’ own proprietary data to gen AI models, in what we now call retrieval-augmented generation (RAG). But this involved building new skills and disciplines. It wasn’t easy, as they had to stitch together all of their different databases, data lakes, file systems, and search engines, and then figure out how to feed data from those systems into their shiny new vector stores. Then they had to orchestrate all of these components to build a complete solution. We identified an opportunity to abstract this complexity away from them. So DocsGPT was born.”

DocsGPT is an open-source documentation assistant that makes it easy for developers to build conversational user experiences with natural language processing (NLP) directly on top of their data. That can be a chatbot on a company website for customer support or an interface into internal data repositories to help boost employee productivity. Developers simply connect their data sources to DocsGPT and experiment with different embedding and large language models to optimize for their specific use case. LLM options currently include GPT-3.5 and GPT-4, along with DocsGPT-7B, based on Mistral. In addition to the choice of models, developers can choose where they deploy DocsGPT: they can download the open-source code to run in their own environment or consume DocsGPT as a managed service from Arc53.

Figure 1: DocsGPT tech stack.

The freedom developers enjoy with DocsGPT is reflected in its levels of adoption.
Since its release last year, the project has accumulated close to 14,000 GitHub stars and built a vibrant community with over 100 independent contributors. Tushynski says, “DocsGPT counts the UK government’s Department for Work and Pensions, pharmaceutical industry solution provider NoDeviation, and nearly 20,000 other users among its adopters.”

Tushynski and team selected MongoDB Atlas as the database for the DocsGPT managed service. “We’ve used MongoDB in many of our prior predictive AI projects. Its flexibility to store data of any structure, scale to huge data sets, and ease of use for both developers and data scientists means we can deliver richer AI-driven solutions faster. Using it to underpin DocsGPT was an obvious choice. As developers connect their documentation to DocsGPT, MongoDB stores all of the metadata, along with chat history and user account information.”

Migrating from Elasticsearch to MongoDB Atlas Vector Search

With the release of Atlas Vector Search, the DocsGPT team is now migrating its vector database from Elasticsearch into MongoDB Atlas. Tushynski says, “MongoDB is a proven OLTP database handling high read and write throughput with transactional guarantees. Bringing these capabilities to vector search and real-time gen AI apps is massively valuable. Atlas is able to handle highly dynamic workloads with rapidly changing embeddings in ways Elasticsearch cannot. The latency Elasticsearch exhibits as it merges updates into existing indexes means the app is often retrieving stale data, impacting the quality and reliability of model outputs.”

Tushynski goes on to say, “We’ve experimented with a number of standalone vector databases. There are some good technologies there, but again, they don’t meet our needs when working with highly dynamic gen AI apps. We often see users wanting to change embedding models as their apps evolve — a process that means re-encoding the data and updating the vector search index.
For example, we’ve migrated our own default embedding models from OpenAI to multiple open-source models hosted on Hugging Face and now to BGE. MongoDB’s OLTP foundations make this a fast, simple, and hassle-free process.”

“The unification and synchronization of source data, metadata, and vector embeddings in a single platform, accessed by a single API, makes building gen AI apps faster, with lower cost and complexity.”
Alex Tushynski, co-founder, Arc53

Tushynski discusses the importance of embedding models in his blog post, Amplify DocsGPT with optimal embeddings. The post includes an example of how one customer was able to improve measured user experience by 50% simply by updating their embedding model.

Figure 2: Demonstrating the impact of vector embedding choices.

“One of the standout features of MongoDB Atlas in this context is its adeptness in handling multiple embeddings. The ability to link various embeddings directly with one or more LLMs without the necessity for separate collections or tables is a powerful feature," Tushynski says. "This approach not only streamlines the data architecture but also eliminates the need for data duplication, a common challenge in traditional database setups. By facilitating the storage and management of multiple embeddings, it allows for a more seamless and flexible interaction between different LLMs and their respective embeddings.”

As part of the AI Innovators program, the DocsGPT engineering team gets free Atlas credits as well as access to technical expertise to help support its migration. The AI Innovators program is open to any startup that is building AI with MongoDB. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
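Swapping embedding models, as Tushynski describes, amounts to re-encoding every stored document and rebuilding the vector index. A minimal sketch of that re-encoding pass, under stated assumptions: the document shape (a `text` field plus an `embedding` field) is illustrative, and `embed` stands in for whichever new model (e.g. a BGE model) is adopted.

```python
def reembed_documents(docs, embed, vector_field="embedding"):
    """Re-encode each document's text with a new embedding model.

    docs: iterable of dicts with a "text" key (illustrative shape)
    embed: callable mapping text -> list[float], standing in for the model
    Returns new documents with the vector field replaced. In practice each
    result would be written back with an update, and the Atlas Vector
    Search index redefined if the new model's dimensions differ.
    """
    return [{**doc, vector_field: embed(doc["text"])} for doc in docs]
```

Because the source text, metadata, and vectors live in the same collection, the pass is a single scan-and-update rather than a cross-system export, which is the "fast, simple" property the quote refers to.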
Building AI with MongoDB: How Patronus Automates LLM Evaluation to Boost Confidence in GenAI
It is hardly headline news that large language models can be unreliable. For some use cases, this can be inconvenient. For others — especially in regulated industries — the consequences are far more severe. Enter Patronus AI, the industry’s first automated evaluation platform for LLMs. Founded by machine learning experts from Meta AI and Meta Reality Labs, Patronus AI is on a mission to boost enterprise confidence in gen AI-powered apps, leading the way in shaping a trustworthy AI landscape. Rebecca Qian, Patronus co-founder and CTO, explains, “Our platform enables engineers to score and benchmark LLM performance on real-world scenarios, generate adversarial test cases, monitor hallucinations, and detect PII and other unexpected and unsafe behavior. Customers use Patronus AI to detect LLM mistakes at scale and deploy AI products safely and confidently.”

In recently published and widely cited research based on the FinanceBench question answering (QA) evaluation suite, Patronus made a startling discovery. Researchers found that a range of widely used state-of-the-art LLMs frequently hallucinated, incorrectly answering or refusing to answer up to 81% of financial analysts’ questions! This error rate occurred despite the models’ context windows being augmented with context retrieved from an external vector store. While retrieval-augmented generation (RAG) is a common way of feeding models with up-to-date, domain-specific context, a key question faced by app owners is how to test the reliability of model outputs in a scalable way. This is where Patronus comes in. The company has partnered with the leading technologies in the gen AI ecosystem — from model providers and frameworks to vector store and RAG solutions — to provide managed evaluation services, test suites, and adversarial data sets. “As we assessed the landscape to prioritize which partners to work with, we saw massive demand from our customers for MongoDB Atlas," said Qian.
“Through our Patronus RAG evaluation API, we help customers verify that their RAG systems built on top of MongoDB Atlas consistently deliver top-tier, dependable information."

In its new 10-minute guide, Patronus takes developers through a workflow showcasing how to evaluate a MongoDB Atlas-based retrieval system. The guide focuses on evaluating hallucination and answer relevance against an SEC 10-K filing, simulating a financial analyst querying the document for analysis and insights. The workflow is built using:

- The LlamaIndex data framework to ingest and chunk the source PDF document
- Atlas Vector Search to store, index, and query the chunks’ metadata and embeddings
- Patronus to score the model responses

The workflow is shown in the figure below. Equipped with the results of an analysis, there are a number of steps developers can take to improve the performance of a RAG system. These include exploring different indexes, modifying document chunking sizes, re-engineering prompts, and, for the most domain-specific apps, fine-tuning the embedding model itself. Review the 10-minute guide for a more detailed explanation of each of these steps. As Qian goes on to say, “Regardless of which approach you take to debug and fix hallucinations, it’s always important to continuously test your RAG system to make sure performance improvements are maintained over time. Of course, you can use the Patronus API iteratively to confirm.” To learn more about LLM evaluation, reach out at firstname.lastname@example.org. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
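The chunking step in the workflow above can be sketched as a simple sliding window. The guide itself uses LlamaIndex's chunkers, and production systems typically split by tokens or sentences; this character-based version and its sizes are illustrative only, but it shows why chunk size and overlap are the tuning knobs the guide suggests adjusting.

```python
def chunk_text(text, chunk_size=512, overlap=64):
    """Split text into overlapping character windows for embedding.

    Overlap preserves context across chunk boundaries, which helps
    retrieval quality at the cost of some storage duplication.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks, start, step = [], 0, chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Each chunk would then be embedded and stored with its metadata in an Atlas collection indexed by Atlas Vector Search, and the Patronus API scores the answers a model produces from the retrieved chunks.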
Building AI With MongoDB: How Gradient Accelerator Blocks Take You From Zero To AI in Seconds
Founded by the former leaders of AI teams at Google, Netflix, and Splunk, Gradient enables businesses to create high-performing, cost-effective custom AI applications. Gradient provides a platform for businesses to build, customize, and deploy bespoke AI solutions — starting with the fastest way to develop AI through the use of its Accelerator Blocks. Gradient’s Accelerator Blocks are comprehensive, fully managed building blocks designed for AI use cases — reducing developer workload and helping businesses achieve their goals in a fraction of the time. Each block can be used as is (e.g., entity extraction, document summarization) or combined to create more robust and intricate solutions (e.g., investment co-pilots, customer chatbots) that are low-code, use best-of-breed technologies, and provide state-of-the-art performance.

Gradient’s newest Accelerator Block focuses on enhancing the performance and accuracy of a model through retrieval-augmented generation (RAG). The Accelerator Block uses Gradient’s state-of-the-art LLMs and embeddings, MongoDB Atlas Vector Search for storing, indexing, and retrieving high-dimensional vector data, and LlamaIndex for data integration. Together, Atlas Vector Search and LlamaIndex feed foundation models with up-to-date, proprietary enterprise data in real time. Gradient designed the Accelerator Block for RAG to improve development velocity up to 10x by removing the need for infrastructure, setup, or in-depth knowledge of retrieval architectures. It also incorporates best practices in document chunking, re-rankers, and advanced retrieval strategies. As Tiffany Peng, VP of engineering at Gradient, explains, “Users who are looking to build custom AI applications can leverage Gradient’s Accelerator Block for RAG to set up RAG in seconds. Users just have to upload their data into our UI and Gradient will take care of the rest.
That way, users can leverage all of the benefits of RAG without having to write any code or worry about the setup.”

Peng goes on to say: “With MongoDB, developers can store data of any structure and then expose that data to OLTP, text search, and vector search processing using a single query API and driver. With this unification, developers have all of the core data services they need to build AI-powered apps that rely on working with live, operational data. For example, querying across keyword and vector search applications can filter on metadata and fuse result sets to quickly identify and return the exact context the model needs to generate grounded, accurate outputs. It is really hard to do this with other systems. That is because developers have to deal with the complexity of bolting a standalone vector database onto a separate OLTP database and search engine, and then keeping all those separate systems in sync.”

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Providing further customization and an industry edge

With Gradient’s platform, businesses can further build, customize, and deploy AI as they see fit, in addition to the benefits that stem from the use of Gradient’s Accelerator Blocks. Gradient partners with key vendors and communities in the AI ecosystem to provide developers and businesses with best-of-breed technologies. This includes Llama 2 and Mistral LLMs — with additional options coming — alongside the BGE embedding model and the LangChain, LlamaIndex, and Haystack frameworks. MongoDB Atlas is included as a core part of the stack available in the Gradient platform. While any business can leverage its platform, Gradient’s domain-specific models in financial services and healthcare provide a unique advantage for businesses within those industries.
For example, in financial services, typical use cases for Gradient’s models include risk management, KYC, anti-money laundering (AML), and robo-advisers, along with forecasting and analysis. In healthcare, Gradient use cases include pre-screening and post-visit summaries, clinical research, billing and benefits, along with claims auditing. What is common to both finance and healthcare is that these two industries are subject to comprehensive regulations where user privacy is key. By building on Gradient and its state-of-the-art open-source large language models (LLMs) and embedding models, enterprises maintain full ownership of their data and AI systems. Developers can train, tune, and deploy their models in private environments running on Gradient’s AI cloud, which the company claims delivers up to 7x higher performance than base foundation models at 10x lower cost than the hyperscale cloud providers. To keep up with the latest announcements from Gradient, follow the company on Twitter/X or LinkedIn. You can learn more about MongoDB Atlas Vector Search from our 10-minute learning byte.
Building AI with MongoDB: How Devnagri Brings the Internet to 1.3 Billion People with Machine Translations
It was while on a trip to Japan that Himanshu Sharma — later to become CEO of Devnagri — made an observation that drew parallels with his native India. Despite the majority of Japan’s population not speaking English, they were still well served by an internet that was largely based on the English language. Key to doing this was translation, and specifically the early days of automated machine translation. And so the idea to found Devnagri, India’s first AI-powered translation platform, was born. “In India, 90% of the population are not fluent in English. That is close to 1.3 billion people. We wanted to bridge this gap to make it easy for non-English speakers to access the internet in their native languages. There are more than 22 Indian languages in use, but they represent just 0.1% of data on the internet,” says Sharma. “We want to give people the same access to knowledge and education in their native languages so that they can be part of the digital ecosystem. We wanted to help businesses and the government reach real people who were not online because of the language barrier.”
Figure 1: Devnagri’s real-time translation engine helps over 100 Indian brands connect with their customers over digital channels for the first time
Building India’s first machine translation platform
Sharma and his team at Devnagri have developed an AI-powered translation platform that can accept multiple file formats from different industry domains. Conceptually it is similar to Google Translate. Rather than a general consumer tool, it focuses on the four key industries that together make the largest impact on the everyday lives of Indian citizens: e-learning, banking, e-commerce, and media publishing. Devnagri provides API access to its platform and a plug-and-play solution for dynamically translating applications and websites.
As Sharma explains, “Our platform is built on our own custom transformer model based on the MarianNMT neural machine translation framework. We train on corpuses of content in documents, chunking them into sentences and storing them in MongoDB Atlas. We use in-context learning for training, which is further augmented with reinforcement learning from human feedback (RLHF) to further tune for precise accuracy.” Sharma goes on to say, “We run on Google Vertex AI, which handles our MLOps pipeline across both model training and inferencing. We use Google Tensor Processing Units (TPUs) to host our models so we can translate content — such as web pages, PDFs, documentation, web and mobile apps, images, and more — for users on the fly in real time.” While the custom transformer-based models have served the company well, recent advancements in off-the-shelf models are leading Devnagri’s engineers to switch. They are evaluating a move to the OpenAI GPT-4 and Llama-2-7b foundation models, fine-tuned with the past four years of machine translation data captured by Devnagri.
Why MongoDB? Flexibility and performance
MongoDB is used as the database platform for Devnagri’s machine translation models. For each sentence chunk, MongoDB stores the source English-language version, the machine translation, and, if applicable, the human-verified sentence translation. As Sharma explains, “We use the sentences stored in MongoDB to train our models and support real-time inference. The flexibility of its document data model made MongoDB an ideal fit to store the diversity of structured and unstructured content and features our ML models translate. We also exploit MongoDB’s scalable distributed architecture. This allows our models to parallelize read and write requests across multiple nodes in the cloud, dramatically improving training and inference throughput. We get faster time to market with higher quality results by using MongoDB.”
Himanshu Sharma, Devnagri co-founder and CEO
What's next?
Today Devnagri serves over 100 brands and several government agencies in India. The company has also joined MongoDB’s AI Innovators Program. The program provides its data science team with access to free Atlas credits to support further machine translation experiments and development, along with access to technical guidance and best practices. If you are building AI-powered apps, the best way to get started is to sign up for an account on MongoDB Atlas. From there, you can create a free MongoDB instance with the Atlas database and Atlas Vector Search, load your own data or our sample data sets, and explore what’s possible within the platform.
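The chunk-and-store pattern Sharma describes (splitting documents into sentence chunks, each stored with its source and translations) can be sketched in a few lines. This is an illustrative sketch only; the naive splitter and the field names are assumptions, not Devnagri's actual pipeline or schema.

```python
import re

# Illustrative sketch of the chunk-and-store pattern described above:
# split a document into sentence chunks and shape each as a MongoDB
# document holding the English source plus translation fields.
# The splitter and field names are hypothetical stand-ins.

def chunk_into_sentences(text):
    """Naive sentence splitter on ., !, ? boundaries."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def to_documents(text, doc_id):
    return [
        {
            "doc_id": doc_id,
            "seq": i,
            "source_en": sentence,
            "machine_translation": None,  # filled in by the translation model
            "human_translation": None,    # filled in after human review
        }
        for i, sentence in enumerate(chunk_into_sentences(text))
    ]

docs = to_documents("Hello world. How are you?", "doc-1")
```

Each resulting dict maps directly onto one MongoDB document, so a real pipeline would simply pass the list to `collection.insert_many(docs)`.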
A Discussion with VISO TRUST: Expanding Atlas Vector Search to Provide Better-Informed Risk Decisions
We recently caught up with the team at VISO TRUST to check in and learn more about their use of MongoDB and their evolving search needs (if you missed our first story, read more about VISO TRUST’s AI use cases with MongoDB on our first blog). VISO TRUST is an AI-powered third-party cyber risk and trust platform that enables any company to access actionable vendor security information in minutes. VISO TRUST delivers the fast and accurate intelligence needed to make informed cybersecurity risk decisions at scale for companies at any maturity level. Since our last discussion back in September 2023, VISO TRUST has adopted our new dedicated Search Nodes architecture, as well as scaled up both dense and sparse embeddings and retrieval to improve the user experience for their customers. We sat down for a deeper dive with Pierce Lamb, Senior Software Engineer on the Data and Machine Learning team at VISO TRUST, to hear more about the latest exciting updates.
How have things been evolving at VISO TRUST? What are some of the new things you're excited about since we spoke last?
There have definitely been some exciting developments since we last spoke. Since then, we’ve implemented a new technique for extracting information out of PDF and image files that is much more accurate and breaks extractions into clear semantic units: sentences, paragraphs, and table rows. This might sound simple, but correctly extracting semantic units out of these PDF files is not an easy task by any means. We tested the entire Python ecosystem of PDF extraction libraries, cloud-based OCR services, and more, and settled on what we believe is currently state-of-the-art. For a retrieval augmented generation (RAG) system, which includes vector search, the accuracy of data extraction is the foundation on which everything else rests.
Improving this process is a big win and will continue to be a mainstay of our focus. Last time we spoke, I mentioned that we were using MongoDB Atlas Vector Search to power a dense retrieval system and that we had plans to build a re-ranking architecture. Since then, I’m happy to confirm we have achieved this goal. In our intelligent question-answering service, every time a question is asked, our re-ranking architecture provides four levels of ranking and scoring to a set of possible contexts in a matter of seconds to be used by large language models (LLMs) to answer the question. One additional exciting announcement is that we’re now using MongoDB Atlas Search Nodes, which allow workload isolation when scaling search independently from our database. Previously, we were upgrading our entire database instance solely because our search needs were changing so rapidly (but our database needs were not). Now we are able to closely tune our search workloads to specific nodes and allow our database needs to change at a much different pace. As an example, retraining is much easier to track and tune with search nodes that can fit the entire Atlas Search index in memory (which has significant latency implications). As many have echoed recently, our usage of LLMs has not reduced or eliminated our use of discriminative model inference but rather increased it. As the database that powers our ML tools, MongoDB has become the place we store and retrieve training data, which is a big performance improvement over AWS S3. We continue to use more and more model inference to perform tasks like classification that the in-context learning of LLMs cannot beat. We let LLMs stick to the use cases they are really good at, like dealing with imperfect human language and providing labeled training data for discriminative models.
VISO TRUST's AI Q&A feature being asked a security question
You mentioned the recent adoption of Search Nodes.
What impacts have you seen so far, especially given your existing usage of Atlas Vector Search?
We were excited when we heard the announcement of Search Nodes in General Availability, as the offering solves an acute pain point we’d been experiencing. MongoDB started as the place where our machine learning and data team backed up and stored training data generated by our Document Intelligence Pipeline. When the requirements to build a generative AI product became clear, we were thrilled to see that MongoDB had a vector search offering because all of our document metadata already existed in Atlas. We were able to experiment with, deploy, and grow our generative AI product right on top of MongoDB. Our deployment, however, was now serving multiple use cases: backing up and storing data created by our pipeline and also servicing our vector search needs. The latter forced us to scale the entire deployment multiple times when our original MongoDB use case didn’t require it. Atlas Search Nodes enable us to decouple these two use cases and scale them independently. It was incredibly easy to deploy our search data to Atlas Search Nodes, requiring only a few button clicks. Furthermore, the memory requirements of vector search can now match our Atlas Search Node deployment exactly; we do not need to consider any extra memory for our storage and backup use case. This is a crucial consideration for keeping vector search fast and streamlined.
Can you go into a bit more detail on how your use cases have evolved with Vector Search, especially as it relates to dense and sparse embeddings and retrieval?
We provide a Q&A system that allows clients to ask questions of the security documents they or their vendors upload. For example, if a client wanted to know what one of their vendor’s password policies is, they could ask the system that question and get an answer with cited evidence without needing to look through the documents themselves.
The same system can be used to automatically answer third-party security questionnaires our clients receive by parsing the questions out of them and answering those questions using data from our client’s documents. This saves a lot of time because answering security questions can often take weeks and involve multiple departments. The above system relies on three main collections separated via the semantic units mentioned above: paragraphs, sentences, and table rows. These are extracted from various security compliance documents uploaded to the VISO TRUST platform (things like SOC2s, ISOs, and security policies, among others). Each sentence has a field with an ObjectId that links to the corresponding paragraph or table row for easy look-up. To give a sense of size, the sentences collection is in the order of tens of millions of documents and growing every day. When a question request enters the re-ranking system, sparse retrieval (keyword similarity search) is performed, followed by dense retrieval, using a list of IDs passed with the request to filter down to the set of possible documents the context can come from. The document filtering generally takes the scope from tens of millions to tens or hundreds of thousands. Sparse and dense retrieval independently score and rank those thousands or millions of sentences, and return the top one hundred in a matter of milliseconds to seconds. The outputs of these two result sets are merged into a final set of one hundred, favoring dense results unless a sparse result meets certain thresholds. At this point, we have a set of one hundred sentences, scored and ranked by similarity to the question, using two different methods powered by Atlas Search, in milliseconds to seconds. In parallel, we pass those hundred to a multi-representational model and a cross-encoder model to provide their scoring and ranking of each sentence.
Once complete, we now have four independent levels of scoring and ranking for each sentence (sparse, dense, multi-representational, and cross-encoder). This data is passed to the Weighted Reciprocal Rank Fusion algorithm, which uses the four independent rankings to create a final ranking and sorting, returning the number of results requested by the caller.
How are you measuring the impact or relative success of your retrieval efforts?
The monolithic collections I spoke about above grow substantially daily, as we’ve almost tripled our sentence volume since first bringing data into MongoDB, while still maintaining the same low latency our users depend on. We needed a vector database partner that allowed us to easily scale as our datasets grow and continue to deliver millisecond-to-second performance on similarity searches. Our system can often have many in-flight question requests occurring in parallel, and Atlas has allowed us to scale with the click of a button when we start to hit performance limits. One piece of advice I would give to readers creating a RAG system using MongoDB’s Vector Search is to use read preferences to ensure that retrieval queries and other reads occur primarily on secondary nodes. We use ReadPreference.secondaryPreferred almost everywhere, and this has helped substantially with the load on the system.
Lastly, can you describe how MongoDB helps you execute on your goal of helping customers make better-informed risk decisions?
As most people involved in compliance, auditing, and risk assessment efforts will report, these essential tasks tend to significantly slow down business transactions. This is in part because the need for perfect accuracy is extremely high and also because they tend to be human-reliant and slow to adopt new technology. At VISO TRUST, we are committed to delivering that same level of accuracy, but much faster.
Since 2017, we have been executing on that vision, and our generative AI products represent a leap forward in enabling our clients to assess and mitigate risk at a faster pace with increased levels of accuracy. MongoDB has been a key partner in the success of our generative AI products by becoming the reliable place we can store and query the data for our AI-based results.
Getting started
Thanks so much to Pierce Lamb for sharing details on VISO TRUST’s AI-powered applications and experiences with MongoDB. To learn more about MongoDB Atlas Search, check out our learning byte, or if you’re ready to get started, head over to the product page to explore tutorials, documentation, and whitepapers. You’ll just be a few clicks away from spinning up your own vector search engine where you can experiment with the power of vector embeddings, RAG, and more!
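The fusion step Lamb describes, where four independent rankings are merged into one, can be sketched as a generic weighted reciprocal rank fusion function. This is an illustrative sketch, not VISO TRUST's implementation; the weights and the conventional k=60 constant are assumptions.

```python
# Generic sketch of weighted reciprocal rank fusion (RRF): several
# independent rankings of the same candidates are merged into one final
# ranking. Weights and k are illustrative values, not VISO TRUST's.

def weighted_rrf(rankings, weights, k=60):
    """rankings: list of ordered id lists (best first); weights: one per
    ranking. Returns ids sorted by fused score, best first."""
    scores = {}
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Four rankers (e.g. sparse, dense, multi-representational, cross-encoder),
# with the second ranker weighted more heavily:
fused = weighted_rrf(
    [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"], ["a", "c", "b"]],
    weights=[1.0, 2.0, 1.0, 1.0],
)
```

The `1 / (k + rank)` term rewards documents that rank highly in any list while damping the influence of any single ranker, which is why RRF works without normalizing the rankers' raw scores.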
Powering Vector Search Maturity in Retail with Pureinsights
In a competitive retail market, with customer demands higher than ever, retailers are on a constant journey toward search maturity. With the recent announcement of MongoDB’s Vector Search offering, retailers are implementing smarter search solutions to provide customers and staff with delightful experiences. Here we’ll explore how partners like Pureinsights are helping retailers to understand what true search maturity entails, and how to start their vector search journey on MongoDB Atlas.
How MongoDB Partners Like Pureinsights Can Help
Search and AI application specialists like Pureinsights can shorten the planning and development cycle, bring applications to production faster, and accelerate time to value for the customer.
The Architecture of Vector Search Applications
Virtually every vector search application will follow the basic logical flow illustrated below. A client creates a complex query, which is then submitted to an encoder. The encoder turns the query into a vector and submits it to the vector search engine. The vector search engine searches the vector database and returns results, which are then formulated and returned to the client for presentation. A complete vector search application includes all of the elements in this diagram, but not all of them are currently provided in the MongoDB Atlas platform. Everything to the left of the vector search engine has to be developed by someone. MongoDB provides the vector store and a means to search it, but someone has to build the client and logic for the complete application.
Why Involve Pureinsights to Build Your Vector Search Applications?
Pureinsights is a MongoDB BSI partner and has extensive knowledge and expertise in helping customers accelerate time-to-production of premier search applications.
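The logical flow above (client query, encoder, vector, search engine, ranked results) can be sketched end to end in a few functions. This is purely illustrative: the hash-based encoder is a toy stand-in for a real embedding model, and the in-memory list stands in for a vector store such as Atlas Vector Search.

```python
import hashlib
import math

# Toy end-to-end sketch of the flow described above: encode the query
# into a vector, search the store by cosine similarity, return ranked
# results. The trigram-hashing encoder is a stand-in for a real model.

def encode(text, dims=8):
    """Deterministic toy encoder: character-trigram hashing into `dims`."""
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def search(query, store, top_k=3):
    """Rank stored documents by similarity to the encoded query."""
    q = encode(query)
    return sorted(store, key=lambda doc: cosine(q, doc["vector"]), reverse=True)[:top_k]

store = [{"text": t, "vector": encode(t)} for t in
         ["red running shoes", "blue denim jacket", "running shoes for trail"]]
results = search("trail running shoes", store)
```

Swapping `encode` for a real embedding model and `search` for an Atlas `$vectorSearch` pipeline gives the production shape of the same diagram.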
Pureinsights specializes in search applications and provides services to build end-to-end vector search solutions, including solutions to create and populate MongoDB Vector Search and a UI/client to search MongoDB Atlas using Atlas Search and Atlas Vector Search. Customers can focus on their core business while we do the development.
Pureinsights Search Maturity Matrix – A Roadmap for Better Search, Including Vector Search
All of the use cases we discussed – e-commerce search, AI-powered search for support, and product information/reviews – are advanced search features for retail. But it’s always best to walk before you run, so before implementing vector search, a good strategy is to make sure your current applications have been optimized. Pureinsights’ methodology for search applications includes analyzing the state of current applications using a Search Maturity Matrix.
Pureinsights – Design, Build, and Manage
After mapping out their journey to build out advanced search capabilities for their retail applications, Pureinsights can help customers build the applications on the MongoDB Atlas platform from design, to build, to operations.
- Application design and architecture: A well-defined plan is the key to efficient application development. Pureinsights, with their extensive experience, can help with complex design decisions, such as choosing the right AI models and creating the best architecture for performance and security.
- Application build: With over 20 years of experience in search, Pureinsights can help you build and deploy your Atlas Search application quickly and efficiently. Pureinsights has developed methodologies and frameworks like the Pureinsights Discovery Platform, which work with AI technologies (e.g., ChatGPT) and integrate with the Atlas platform to reduce development time and accelerate time to production.
- Managed services: Pureinsights can even run your search application for you with its SearchOps offering, maintaining it for optimum performance with a fully managed service so you can focus on your core business.
Conclusion
Pureinsights can help customers overcome the challenges of building vector search applications and accelerate the time to production. With their expertise in application design, build, and managed services, Pureinsights can help customers build and deploy next-generation vector search applications that deliver real business value. Is your e-commerce store ready for AI? And are your products as easy to find as your competitors’? Modern consumers expect flawless search experiences in mobile and online e-commerce search. Join MongoDB and Pureinsights on Tuesday, January 23, at 1pm ET for an insightful new webinar hosted by Digital Commerce 360 to learn:
- What the Search Maturity Matrix is, and which capabilities your organization is missing to achieve better results
- How retailers are building smarter search applications with AI
- What's possible with MongoDB's new Vector Search offering
Related resources:
- Modernize E-commerce Customer Experiences with MongoDB
- MongoDB Atlas Vector Search
- MongoDB Atlas for Retail: Driving Innovation from Supply Chain to Checkout
- MongoDB Atlas Search for Retail: Go Beyond the E-commerce Store
Vector Search and Dedicated Search Nodes: Now in General Availability
This post is also available in: Deutsch, Français, Español, Português, Italiano, 한국어, 繁體中文. Today we’re excited to take the next step in adding even more value to the Atlas platform with the general availability (GA) release of both Atlas Vector Search and Search Nodes. Since announcing Atlas Vector Search and dedicated infrastructure with Search Nodes in public preview, we’ve seen continued excitement and demand for additional workloads using vector-optimized search nodes. This new level of scalability and performance ensures workload isolation and the ability to better optimize resources for vector search use cases. Atlas Vector Search allows developers to build intelligent applications powered by semantic search and generative AI over any data type. Atlas Vector Search solves the challenge of providing relevant results even when users don’t know what they’re looking for, using machine learning models to find similar results across almost any type of data. Within just five months of being announced in public preview, Atlas Vector Search has already received the highest developer net promoter score (NPS) — a measure of how likely someone is to recommend a solution to someone else — and is the second most widely used vector database, according to Retool’s State of AI report. There are two key use cases for Atlas Vector Search to build next-gen applications:
- Semantic search: searching and finding relevant results from unstructured data, based on semantic similarity
- Retrieval-augmented generation (RAG): augmenting the reasoning capabilities of LLMs with feeds of your own real-time data to create GenAI apps uniquely tailored to the demands of your business
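The RAG use case in the second bullet can be sketched as a retrieve-then-generate loop. This is a generic, minimal sketch: `embed` and `similarity` below are toy word-overlap stand-ins for a real embedding model and vector index, and the returned prompt would be sent to an LLM.

```python
# Generic retrieve-then-generate (RAG) sketch: retrieve the most
# relevant context for a question, then place it in the prompt sent to
# an LLM. The embedding and similarity functions are toy stand-ins.

def embed(text):
    """Toy embedding: lowercase word set (stand-in for a real model)."""
    return set(text.lower().split())

def similarity(a, b):
    """Jaccard overlap between two toy embeddings."""
    return len(a & b) / len(a | b) if a | b else 0.0

def retrieve(question, corpus, top_k=2):
    q = embed(question)
    return sorted(corpus, key=lambda d: similarity(q, embed(d)), reverse=True)[:top_k]

def build_prompt(question, corpus):
    context = "\n".join(retrieve(question, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

corpus = [
    "Search Nodes provide dedicated infrastructure for search workloads.",
    "Our refund policy allows returns within 30 days.",
    "Vector search finds semantically similar results.",
]
prompt = build_prompt("What do Search Nodes provide?", corpus)
```

The key property of the pattern survives even in this toy form: only the retrieved context reaches the model, so answers stay grounded in the supplied data rather than the model's parametric memory.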
Atlas Vector Search unlocks the full potential of your data, whether it’s structured or unstructured, taking advantage of the rise in popularity and usage of AI and LLMs to solve critical business challenges. This is possible because Vector Search is part of the MongoDB Atlas developer data platform, which starts with our flexible document data model and unified API providing one consistent experience. To ensure you unlock the most value possible from Atlas Vector Search, we have cultivated a robust ecosystem of AI integrations, allowing developers to build with their favorite LLMs or frameworks.
Our ecosystem of AI integrations for Atlas Vector Search
To learn more about Atlas Vector Search, watch our short video or jump right into the tutorial. Atlas Vector Search also takes advantage of our new Search Nodes dedicated architecture, enabling better optimization for the right level of resourcing for specific workload needs. Search Nodes provide dedicated infrastructure for Atlas Search and Vector Search workloads, allowing you to optimize compute resources and fully scale search needs independent of the database. Search Nodes provide better performance at scale, delivering workload isolation, higher availability, and the ability to better optimize resource usage. In some cases, we’ve seen 60% faster query times for users’ workloads, leveraging concurrent querying in Search Nodes. In addition to the compute-heavy search nodes we provided in the public preview, this GA release includes a memory-optimized, low-CPU option that is optimal for Vector Search in production. This makes resource contention or the possibility of a resulting service interruption (due to your database and search previously sharing the same infrastructure) a thing of the past.
Coupled architecture (left) compared with the decoupled Search Node architecture (right)
We see this as the next evolution of our architecture for both Atlas Search and Vector Search, furthering the value provided by the MongoDB developer data platform. Search Nodes are currently available on AWS single-region clusters, with Google Cloud and Azure coming soon; in the meantime, customers can continue using shared infrastructure on Google Cloud and Microsoft Azure. Read our initial announcement blog post to view the steps for turning on Search Nodes today, or jump right into the tutorial. Both of these features are available today for production usage. We can’t wait to see what you build, and please reach out to us with any questions.
Building AI with MongoDB: Retrieval-Augmented Generation (RAG) Puts Power in Developers’ Hands
As recently as 12 months ago, any mention of retrieval-augmented generation (RAG) would have left most of us confused. However, with the explosion of generative AI, the RAG architectural pattern has now firmly established itself in the enterprise landscape. RAG presents developers with a potent combination. They can take the reasoning capabilities of pre-trained, general-purpose LLMs and feed them with real-time, company-specific data. As a result, developers can build AI-powered apps that generate outputs grounded in enterprise data and knowledge that is accurate, up-to-date, and relevant. They can do this without having to turn to specialized data science teams to either retrain or fine-tune models — a complex, time-consuming, and expensive process. Over this series of Building AI with MongoDB blog posts, we’ve featured developers using tools like MongoDB Atlas Vector Search for RAG in a whole range of applications. Take a look at our AI case studies page and you’ll find examples spanning conversational AI with chatbots and voice bots, co-pilots, threat intelligence and cybersecurity, contract management, question-answering, healthcare compliance and treatment assistants, content discovery and monetization, and more. Further reflecting its growing adoption, Retool’s recent State of AI survey shows Atlas Vector Search earning the highest net promoter score (NPS) among developers. In this blog post, I’ll highlight three more interesting and novel use cases:
- Unlocking geological data for better decision-making and accelerating the path to net zero at Eni
- Video and audio personalization at Potion
- Unlocking insights from enterprise knowledge bases at Kovai
Eni makes terabytes of subsurface unstructured data actionable with MongoDB Atlas
Based in Italy, Eni is a leading integrated energy company with more than 30,000 employees across 69 countries.
In 2020, the company launched a strategy to reach net zero emissions by 2050 and develop more environmentally and financially sustainable products. Sabato Severino, Senior AI Solution Architect for Geoscience at Eni, explains the role of his team: “We’re responsible for finding the best solutions in the market for our cloud infrastructure and adapting them to meet specific business needs.” Projects include using AI for drilling and exploration, leveraging cloud APIs to accelerate innovation, and building a smart platform to promote knowledge sharing across the company. Eni’s document management platform for geosciences offers an ecosystem of services and applications for creating and sharing content. It leverages embedded AI models to extract information from documents and stores unstructured data in MongoDB. The challenges for Severino’s team were to maintain the platform as it ingested a growing volume of data — hundreds of thousands of documents and terabytes of data — and to enable different user groups to extract relevant insights from comprehensive records quickly and easily. With MongoDB Atlas, Eni users can quickly find data spanning multiple years and geographies to identify trends and analyze models that support decision-making within their fields. The platform uses MongoDB Atlas Search to filter out irrelevant documents while also integrating AI and machine learning capabilities, such as vector search, to make it even easier to identify patterns. “The generative AI we’ve introduced currently creates vector embeddings from documents, so when a user asks a question, it retrieves the most relevant document and uses LLMs to build the answer,” explains Severino. “We’re looking at migrating vector embeddings into MongoDB Atlas to create a fully integrated, functional system.
We’ll then be able to use Atlas Vector Search to build AI-powered experiences without leaving the Atlas platform — a much better experience for developers.” Read the full case study to learn more about Eni and how it is making unstructured data actionable.
Video personalization at scale with Potion and MongoDB
Potion enables salespeople to personalize prospecting videos at scale. Already over 7,500 sales professionals at companies including SAP, AppsFlyer, CaptivateIQ, and Opensense are using SendPotion to increase response rates, book more meetings, and build customer trust. All a sales representative needs to do is record a video template, select which words need to be personalized, and let Potion’s audio and vision AI models do the rest. Kanad Bahalkar, co-founder and CEO at Potion, explains: “The sales rep tells us what elements need to be personalized in the video — that is typically provided as a list of contacts with their name, company, desired call-to-action, and so on. Our vision and audio models then inspect each frame and reanimate the video and audio with personalized messages lip-synced into the stream. Reanimation is done in bulk in minutes. For example, one video template can be transformed into over 1,000 unique video messages, personalized to each contact.” Potion’s custom generative AI models are built with PyTorch and TensorFlow, and run on Amazon SageMaker. Describing their models, Kanad says, “Our vision model is trained on thousands of different faces, so we can synthesize the video without individualized AI training. The audio models are tuned on-demand for each voice.” And where does the data for the AI lifecycle live? “This is where we use MongoDB Atlas,” says Kanad. “We use the MongoDB database to store metadata for all the videos, including the source content for personalization, such as the contact list and calls to action.
For every new contact entry created in MongoDB, a video is generated for it using our AI models, and a link to that video is stored back in the database. MongoDB also powers all of our application analytics and intelligence. With the insights we generate from MongoDB, we can see how users interact with the service, capturing feedback loops, response rates, video watch times, and more. This data is used to continuously train and tune our models in SageMaker.” On selecting MongoDB, Kanad says, “I had prior experience of MongoDB and knew how easy and fast it was to get started for both modeling and querying the data. Atlas provides the best-managed database experience out there, meaning we can safely offload running the database to MongoDB. This ease of use, speed, and efficiency are all critical as we build and scale the business.” To further enrich the SendPotion service, Kanad is planning to use more of the developer features within MongoDB Atlas. This includes Atlas Vector Search to power AI-driven semantic search and RAG for users who are exploring recommendations across video libraries. The engineering team is also planning on using Atlas Triggers to enable event-driven processing of new video content. Potion is a member of the MongoDB AI Innovators program. Asked about the value of the program, Kanad responds, “Access to free credits helped support rapid build and experimentation on top of MongoDB, coupled with access to technical guidance and support.”
Bringing the power of Vector Search to enterprise knowledge bases
Founded in 2011, Kovai is an enterprise software company that offers multiple products in both the enterprise and B2B SaaS arena. Since its founding, the company has grown to nearly 300 employees serving over 2,500 customers. One of Kovai’s key products is Document360, a knowledge base platform for SaaS companies looking for a self-service software documentation solution.
Seeing the rise of gen AI, Kovai began developing its AI assistant, “Eddy.” The assistant answers customers’ questions using LLMs augmented by information retrieved from a Document360 knowledge base. During the development phase, Kovai’s engineering and data science teams explored multiple vector databases to power the RAG portion of the application. They found that the need to sync data between its system-of-record MongoDB database and a separate vector database introduced inaccuracies into the assistant’s answers. The release of MongoDB Atlas Vector Search provided a solution with three key advantages for the engineers:

- Architectural simplicity: MongoDB Atlas Vector Search’s architectural simplicity helps Kovai optimize the technical architecture needed to implement Eddy.
- Operational efficiency: Atlas Vector Search allows Kovai to store both knowledge base articles and their embeddings together in MongoDB collections, eliminating the “data syncing” issues that come with other vendors.
- Performance: Kovai gets faster query responses from Atlas Vector Search at scale, ensuring a positive user experience.

“Atlas Vector Search is robust, cost-effective, and blazingly fast!” said Saravana Kumar, CEO of Kovai, speaking about his team’s experience. Specifically, the team has seen the average time taken to return three, five, and 10 chunks fall between two and four milliseconds, and if the question is closed-loop, the average drops to under two milliseconds.

You can learn more about Kovai’s journey into the world of RAG in the full case study.

Getting started

As the case studies in our Building AI with MongoDB series demonstrate, retrieval-augmented generation is a key design pattern developers can use as they build AI-powered applications for the business. Take a look at our Embedding Generative AI whitepaper to explore RAG in more detail.
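The integrated pattern Kovai describes, with knowledge base articles and their embeddings stored together in one collection, means RAG retrieval is a single aggregation pipeline against the system of record. Below is a minimal Python sketch built around Atlas Vector Search’s `$vectorSearch` stage; the index name, field names, and parameter choices are illustrative assumptions, not Kovai’s actual implementation.

```python
def build_rag_retrieval_pipeline(query_vector, k=5):
    """Return an Atlas Vector Search aggregation pipeline that fetches the
    k most semantically similar knowledge-base chunks for a user question."""
    return [
        {
            "$vectorSearch": {
                "index": "article_index",   # assumed search-index name
                "path": "embedding",        # field holding each chunk's vector
                "queryVector": query_vector,
                "numCandidates": 20 * k,    # oversample candidates, keep best k
                "limit": k,
            }
        },
        # Project only what the LLM prompt needs, plus the similarity score.
        {
            "$project": {
                "_id": 0,
                "title": 1,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

In use (assuming an Atlas cluster, a vector index on `embedding`, and some embedding model), the retrieved chunks would be passed to the LLM as context: `chunks = db.articles.aggregate(build_rag_retrieval_pipeline(embed(question)))`. Because the article text travels in the same document as its vector, there is no second database to keep in sync.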
Building AI with MongoDB: Giving Your Apps a Voice
In previous posts in this series, we covered how generative AI and MongoDB are being used to unlock value from data of any modality and to supercharge communications. Put those topics together, and we can start to harness the most powerful communications medium (arguably!) of them all: voice. Voice brings context, depth, and emotion in ways that text, images, and video alone simply cannot. Or, as the ancient Chinese proverb tells us, “The tongue can paint what the eyes can’t see.”

The rise of voice technology has been a transformative journey spanning over a century, from the earliest days of radio and telephone communication to the cutting-edge realm of generative AI. It began with the invention of the telephone in the late 19th century, enabling voice conversations across distances. The evolution continued with the advent of radio broadcasting, allowing mass communication through the spoken word and music. As technology advanced, mobile communications emerged, making voice calls accessible anytime, anywhere.

Today, generative AI, powered by sophisticated machine learning (ML) models, has taken voice technology to unprecedented levels. The generation of human-like voices and text-to-speech capabilities is one example. Another is the ability to detect sentiment and create summaries from voice communications. These advances are revolutionizing how we interact with technology and information in the age of intelligent software.

In this post, we feature three companies that are harnessing the power of voice with generative AI to build completely new classes of user experience:

- XOLTAR uses voice along with vision to improve engagement and outcomes for patients through clinical treatment and recovery.
- Cognigy puts voice at the heart of its conversational AI platform, integrating with back-office CRM, ERP, and ticketing systems for some of the world’s largest manufacturing, travel, utility, and ecommerce companies.
- Artificial Nerds enables any company to enrich its customer service with voice bots and autonomous agents.

Let’s learn more about the role voice plays in each of these very different applications. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

GenAI companion for patient engagement and better clinical outcomes

XOLTAR is the first conversational AI platform designed for long-lasting patient engagement. XOLTAR’s hyper-personalized digital therapeutic app is led by Heather, XOLTAR’s live AI agent. Heather is able to conduct omnichannel interactions, including live video chats. The platform uses its multimodal architecture to better understand patients, gather more data, increase engagement, create long-lasting relationships, and ultimately achieve real behavioral change.

Figure 1: About 50% of patients fail to stick to prescribed treatments.

Through its app and platform, XOLTAR is working to change this, improving outcomes for both patients and practitioners. It provides physical and emotional well-being support through a course of treatment, adherence to medication regimes, monitoring of post-treatment recovery, and collection of patient data from wearables for remote analysis and timely interventions.

Powering XOLTAR is a sophisticated array of state-of-the-art machine learning models working across multiple modalities — voice and text, as well as vision for perceiving micro-expressions and non-verbal communication. Fine-tuned LLMs, coupled with custom multilingual models for real-time automatic speech recognition and various transformers, are trained and deployed to create truthful, grounded, and aligned free-guided conversation. XOLTAR’s models personalize each patient’s experience by retrieving data stored in MongoDB Atlas.
Taking advantage of the flexible document model, XOLTAR’s developers store structured data, such as patient details and sensor measurements from wearables, alongside unstructured data, such as video transcripts. This data provides both long-term memory for each patient and input for ongoing model training and tuning.

MongoDB also powers XOLTAR’s event-driven data pipelines. Follow-on actions generated from patient interactions are persisted in MongoDB, with Atlas Triggers notifying downstream consuming applications so they can react in real time to new treatment recommendations and regimes.

Through its participation in the MongoDB AI Innovators program, XOLTAR’s development team receives free Atlas credits and expert technical support, helping it de-risk new feature development.

How Cognigy built a leading conversational AI solution

Cognigy delivers AI solutions that empower businesses to provide exceptional customer service: instant, personalized, in any language, and on any channel. Its main product, Cognigy.AI, allows companies to create AI agents, improving experiences through smart automation and natural language processing. This powerful solution is at the core of Cognigy’s offerings, making it easy for businesses to develop and deploy intelligent voice bots and chatbots.

Developing a conversational AI system poses challenges for any company. These solutions must interact effectively with diverse systems such as CRMs, ERPs, and ticketing systems. This is where Cognigy’s centralized platform comes in: it allows you to construct and deploy agents through an intuitive low-code user interface. Cognigy took a deliberate approach when constructing the platform, employing a composable architecture model, as depicted in Figure 2 below. To achieve this, it designed over 30 specialized microservices, orchestrated through Kubernetes.
These microservices are fortified with MongoDB replica sets spanning three availability zones. In addition, sophisticated indexing and caching strategies enhance query performance and speed up response times.

Figure 2: Cognigy's composable architecture model platform.

MongoDB has been a driving force behind Cognigy’s flexibility and scalability, and has been instrumental in bringing groundbreaking products like Cognigy.AI to life. Check out the Cognigy case study to learn more about its architecture and how it uses MongoDB.

The power of custom voice bots without the complexity of fine-tuning

Founded in 2017, Artificial Nerds assembled a group of creative, passionate, and “nerdy” technologists focused on unlocking the benefits of AI for all businesses. Its aim was to liberate teams from repetitive work, freeing them up to spend more time building closer relationships with their clients. The result is a suite of AI-powered products that improve customer sales and service. These include multimodal bots for conversational AI via voice and chat, along with intelligent hand-offs to human operators for live chat. These are all backed by no-code functions that integrate customer service actions with backend business processes and campaigns.

Originally, the company’s ML engineers fine-tuned GPT and BERT language models to customize its products for each of its clients. This was a time-consuming and complex process. The maturation of vector search and of tooling for retrieval-augmented generation (RAG) has radically simplified the workflow, allowing Artificial Nerds to grow its business faster.

Artificial Nerds started using MongoDB in 2019, taking advantage of its flexible schema to provide long-term memory and storage for richly structured conversation history, messages, and user data. When dealing with customers, it was important for users to be able to quickly browse and search this history.
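A relevance-based query over a conversation-history collection like the one just described can be sketched with the Atlas Search `$search` aggregation stage. This is a hypothetical Python sketch: the index name and field paths are assumptions, and it presumes a search index has already been defined on the collection in Atlas.

```python
def build_history_search_pipeline(query, limit=10):
    """Full-text search over stored conversation messages, ranked by relevance."""
    return [
        {
            "$search": {
                "index": "history_index",            # assumed index name
                "text": {
                    "query": query,
                    "path": ["message", "summary"],  # fields covered by the index
                    "fuzzy": {"maxEdits": 1},        # tolerate small typos
                },
            }
        },
        {"$limit": limit},
        # Surface the relevance score alongside the matching message.
        {"$project": {"message": 1, "score": {"$meta": "searchScore"}}},
    ]

# Usage against a live cluster might look like:
#   db.conversations.aggregate(build_history_search_pipeline("refund status"))
```

Because the index sits directly on the database collection, there is no separate search engine to deploy or ETL job to maintain.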
Adopting Atlas Search helped the company meet this need. With Atlas Search, developers were able to spin up a powerful full-text index right on top of their database collections, providing relevance-based search across their entire corpus of data. The integrated approach offered by MongoDB Atlas avoided the overhead of bolting on a separate search engine and creating an ETL mechanism to sync it with the database, eliminating the cognitive overhead of developing against, and operating, separate systems.

The release of Atlas Vector Search unlocks those same benefits for vector embeddings. The company has replaced its previously separate, standalone vector database with the integrated MongoDB Atlas solution. Not only has this improved the productivity of its developers, it has also improved the customer experience by reducing latency 4x.

Artificial Nerds is growing fast, with revenues expanding 8% every month. The company continues to push the boundaries of customer service by experimenting with new models, including the Llama 2 LLM and multilingual sentence transformers hosted on Hugging Face. Being part of the MongoDB AI Innovators program helps Artificial Nerds stay abreast of the latest MongoDB product enhancements and provides the company with free Atlas credits to build new features.

Getting started

Check out our MongoDB for AI page to get access to all of the latest resources to help you build. We see developers increasingly adopting state-of-the-art multimodal models and MongoDB Atlas Vector Search to work with data formats that were previously accessible only to organizations with the very deepest data science resources.
Check out some examples from our previous Building AI with MongoDB blog post series here:

- Building AI with MongoDB: first qualifiers includes AI at the network edge for computer vision and augmented reality, risk modeling for public safety, and predictive maintenance paired with question-answering generation for maritime operators.
- Building AI with MongoDB: compliance to copilots features AI in healthcare, along with intelligent assistants that help product managers specify better products and sales teams compose emails that convert 2x higher.
- Building AI with MongoDB: unlocking value from multimodal data showcases open-source libraries that transform unstructured data into a usable JSON format, entity extraction for contracts management, and making sense of “dark data” to build customer service apps.
- Building AI with MongoDB: Cultivating Trust with Data covers three key customer use cases: improving model explainability, securing generative AI outputs, and transforming cyber intelligence with the power of MongoDB.
- Building AI with MongoDB: Supercharging Three Communication Paradigms features developer tools that bring AI to existing enterprise data, conversational AI, and monetization of video streams and the metaverse.

There is no better time to release your own inner voice and get building!
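As a closing sketch, the event-driven pattern that both Potion (generating a video for each new contact) and XOLTAR (reacting to new patient interactions) describe can be approximated outside of Atlas Triggers with MongoDB change streams. Everything below is hypothetical: the collection names, the video URL scheme, and the inline rendering step. It assumes a replica set or Atlas cluster (change streams require one) and the PyMongo driver.

```python
def video_update_for(contact):
    """Pure helper: given a newly inserted contact document, build the update
    that stores the link of the video rendered for it (rendering itself is
    assumed to happen elsewhere)."""
    video_url = f"https://videos.example.com/{contact['_id']}.mp4"  # hypothetical scheme
    return {"$set": {"video_url": video_url, "status": "rendered"}}

def watch_new_contacts(uri="mongodb://localhost:27017"):
    """Block on a change stream and process each insert as it arrives."""
    from pymongo import MongoClient  # requires the pymongo driver

    contacts = MongoClient(uri).sales.contacts
    # Only react to inserts; updates we write back are ignored by the filter.
    pipeline = [{"$match": {"operationType": "insert"}}]
    with contacts.watch(pipeline) as stream:
        for change in stream:
            contact = change["fullDocument"]
            contacts.update_one({"_id": contact["_id"]}, video_update_for(contact))

if __name__ == "__main__":
    watch_new_contacts()
```

Atlas Triggers package this same loop as a managed, serverless function, which is why both teams reach for it rather than running a watcher process themselves.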