Building AI with MongoDB: Improving Productivity with WINN.AI’s Virtual Sales Assistant
Better serving customers is a primary driver for the huge wave of AI innovations we see across enterprises. WINN.AI is a great example. Founded in November 2021 by sales tech entrepreneur Eldad Postan Koren and cybersecurity expert Bar Haleva, the company enables sales teams to improve productivity by increasing the time they focus on customers. WINN.AI orchestrates a multimodal suite of state-of-the-art models for speech recognition, entity extraction, and meeting summarization, relying on MongoDB Atlas as the underlying data layer.

I had the opportunity to sit down with Orr Mendelson, Ph.D., Head of R&D at WINN.AI, to learn more.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Tell us a little bit about what WINN.AI is working to accomplish

Today’s salespeople spend over 25% of their time on administrative busywork, costing organizations time, money, and opportunity. We are working to change that so that sales teams can spend more time solving their customers’ problems and less on administrative tasks.

At the heart of WINN.AI is an AI-powered real-time sales assistant that joins your virtual meetings. It detects and interprets customer questions, and immediately surfaces relevant information for the salesperson. Think about retrieving relevant customer references or competitive information. It can provide prompts from a sales playbook, and also make sure meetings stay on track and on time. After the meeting concludes, WINN.AI extracts relevant information from it and updates the CRM system.

WINN.AI integrates with the leading tools used by sales teams, including Zoom, HubSpot, Salesforce, and more.

Can you describe what role AI plays in your application?

Our technology allows the system not only to understand what people are saying on a sales call, but also to comprehend the context of the sales conversation, which improves meeting summaries and follow-on actions.
This includes identifying the most important talking points discussed in the meeting, knowing how to break down the captured data into different sales methodology fields (MEDDICC, BANT, etc.), and automatically pushing updates to the CRM.

What specific AI/ML techniques, algorithms, or models are utilized in the application?

We started out building and training our own custom Natural Language Processing (NLP) algorithms and later switched to GPT-3.5 and GPT-4 for entity extraction and summarization. Our selection of models is based on the specific requirements of each application feature, balancing things like latency with context length and data modality.

We orchestrate all of the models with extensive automation, reporting, and monitoring mechanisms. These are developed by our engineering teams and assure high-quality AI products across our services and users. We have a dedicated team of AI Engineers and Prompt Engineers who develop and monitor each prompt and response, so we are continuously tuning and optimizing app capabilities.

How do you use MongoDB in your application stack?

MongoDB stores everything in the WINN.AI platform: organizations and users, sessions, their history, and more. The primary driver for selecting MongoDB was its flexibility in being able to store, index, and query data of any shape or structure. The database fluidly adapts to our application schema, which gives us a more agile approach than traditional relational databases. My developers love the ecosystem that has built up around MongoDB. MongoDB Atlas provides the managed services we need to run, scale, secure, and back up our data.

How do you see the broader benefits of MongoDB in your business?

In the ever-changing AI tech market, MongoDB is our stable anchor. MongoDB provides the freedom to work with structured and unstructured data while using any of our preferred tools, and we leave database management to the Atlas service.
This means my developers are free to create with AI while being able to sleep at night! MongoDB is familiar to our developers, so we don’t need any DBA or external experts to maintain and run it safely. We can invest those savings back into building great AI-powered products.

What are your future plans for new applications and how does MongoDB fit into them?

We’re always looking for opportunities to offer new functionality to our users. Capabilities like Atlas Search for faceted full-text navigation over data, coupled with MongoDB’s application-driven intelligence for more real-time analytics and insights, are all incredibly valuable.

Streaming is one area that I’m really excited about. Our application is composed of multiple microservices that are soon to be connected with Kafka for an event-driven architecture. Building on Kafka-based messaging, Atlas Stream Processing is another direction we will explore. It will give our services a way of continuously querying, analyzing, and reacting to streaming data without having to first land it in the database. This will give our customers even lower latency AI outputs. Everybody WINNs!

Wrapping up

Orr, thank you for sharing WINN.AI’s story with the community!

WINN.AI is part of the MongoDB AI Innovators program, benefiting from access to free Atlas credits and technical expertise. If you are getting started with AI, sign up for the program and build with MongoDB.
Atlas Vector Search Commands Highest Developer NPS in Retool State of AI 2023 Survey
Retool has just published its first-ever State of AI report and it's well worth a read. Modeled on its massively popular State of Internal Tools report, the State of AI survey took the pulse of over 1,500 tech folks spanning software engineering, leadership, product management, design, and more, drawn from a variety of industries. The survey’s purpose is to understand how these tech folks use and build with artificial intelligence (AI). As a part of the survey, Retool dug into which tools were popular, including the vector databases used most frequently with AI.

The survey found MongoDB Atlas Vector Search commanded the highest Net Promoter Score (NPS) and was the second most widely used vector database, all within just five months of its release. This places it ahead of competing solutions that have been around for years.

In this blog post, we’ll examine the phenomenal rise of vector databases and how developers are using solutions like Atlas Vector Search to build AI-powered applications. We’ll also cover other key highlights from the Retool report.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Vector database adoption: Off the charts (well almost...)

From mathematical curiosity to the superpower behind generative AI and LLMs, vector embeddings and the databases that manage them have come a long way in a very short time. Check out DB-Engines trends in database models over the past 12 months and you'll see that vector databases are head and shoulders above all others in popularity change. Just look at the pink line’s "up and to the right" trajectory in the chart below.

Screenshot courtesy of DB-Engines, November 8, 2023

But why have vector databases become so popular?
They are a key component in a new architectural pattern called retrieval-augmented generation, otherwise known as RAG, a potent mix that combines the reasoning capabilities of pre-trained, general-purpose LLMs with real-time, company-specific data. The results are AI-powered apps that uniquely serve the business, whether that’s creating new products, reimagining customer experiences, or driving internal productivity and efficiency to unprecedented heights.

Vector embeddings are one of the fundamental components required to unlock the power of RAG. Vector embedding models encode enterprise data, no matter whether it is text, code, video, images, audio streams, or tables, as vectors. Those vectors are then stored, indexed, and queried in a vector database or vector search engine, providing the relevant input data as context to the chosen LLM. The results are AI apps grounded in enterprise data and knowledge that is relevant to the business, accurate, trustworthy, and up-to-date.

As the Retool survey shows, the vector database landscape is still largely greenfield. Fewer than 20% of respondents are using vector databases today, but with the growing trend towards customizing models and AI infrastructure, adoption looks set to grow.

Why are developers adopting Atlas Vector Search?

Retool's State of AI survey features some great vector databases that have blazed a trail over the past couple of years, especially in applications requiring context-aware semantic search. Think product catalogs or content discovery.

However, the challenge developers face in using those vector databases is that they have to integrate them alongside other databases in their application’s tech stack. Every additional database layer in the application tech stack adds yet another source of complexity, latency, and operational overhead.
This means they have another database to procure, learn, integrate (for development, testing, and production), secure and certify, scale, monitor, and back up. And this is all while keeping data in sync across these multiple systems.

MongoDB takes a different approach that avoids these challenges entirely:

Developers store and search native vector embeddings in the same system they use as their operational database. Using MongoDB’s distributed architecture, they can isolate these different workloads while keeping the data fully synchronized. Search Nodes provide dedicated compute and workload isolation that is vital for memory-intensive vector search workloads, thereby enabling improved performance and higher availability.
With MongoDB’s flexible and dynamic document schema, developers can model and evolve relationships between vectors, metadata, and application data in ways other databases cannot. They can process and filter vector and operational data in any way the application needs with an expressive query API and drivers that support all of the most popular programming languages.
Using the fully managed MongoDB Atlas developer data platform empowers developers to achieve the scale, security, and performance that their application users expect.

What does this unified approach mean for developers? Faster development cycles and higher-performing apps that provide lower latency with fresher data, coupled with lower operational overhead and cost. These outcomes are reflected in MongoDB’s best-in-class NPS.

"Atlas Vector Search is robust, cost-effective, and blazingly fast!"
Saravana Kumar, CEO, Kovai, discussing the development of his company’s AI assistant

Check out our Building AI with MongoDB blog series (head to the Getting Started section to see the back issues).
Here you'll see Atlas Vector Search used for GenAI-powered applications spanning conversational AI with chatbots and voicebots, co-pilots, threat intelligence and cybersecurity, contract management, question-answering, healthcare compliance and treatment assistants, content discovery and monetization, and more.

"MongoDB was already storing metadata about artifacts in our system. With the introduction of Atlas Vector Search, we now have a comprehensive vector-metadata database that’s been battle-tested over a decade and that solves our dense retrieval needs. No need to deploy a new database we'd have to manage and learn. Our vectors and artifact metadata can be stored right next to each other."
Pierce Lamb, Senior Software Engineer on the Data and Machine Learning team at VISO TRUST

What can you learn about the state of AI from the Retool report?

Beyond uncovering the most popular vector databases, the survey covers AI from a range of perspectives. It starts by exploring respondents' perceptions of AI. (Unsurprisingly, the C-suite is more bullish than individual contributors.) It then explores investment priorities, AI’s impact on future job prospects, and how it will likely affect developers and the skills they need in the future.

The survey then explores the level of AI adoption and maturity. Over 75% of survey respondents say their companies are making efforts to get started with AI, with around half saying these are still early projects, mainly geared towards internal applications. The survey goes on to examine what those applications are, and how useful the respondents think they are to the business. It finds that almost everyone’s using AI at work, whether they are allowed to or not, and then identifies the top pain points. It's no surprise that model accuracy, security, and hallucinations top that list.

The survey concludes by exploring the top models in use.
Again, it is no surprise that OpenAI’s offerings are leading the way, but the survey also indicates growing intent to use open source models along with AI infrastructure and tools for customization in the future.

You can dig into all of the survey details by reading the report.

Getting started with Atlas Vector Search

Eager to take a look at our Vector Search offering? Head over to our Atlas Vector Search product page. There you will find links to tutorials, documentation, and key AI ecosystem integrations so you can dive straight into building your own genAI-powered apps. If you want to learn more about the high-level possibilities of Vector Search, then download our Embedding Generative AI whitepaper.
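
To make the unified approach described earlier a little more concrete, here is a minimal Python sketch of what querying vectors and operational data together might look like. It only builds an aggregation pipeline around the `$vectorSearch` stage; the index name `vector_index`, the fields `embedding`, `tenant_id`, and `title`, and the candidate multiplier are all assumptions made up for illustration, and the filter field would need to be declared as a filter field in the vector index definition.

```python
from typing import Any


def build_vector_search_pipeline(
    query_vector: list[float],
    tenant_id: str,
    limit: int = 5,
) -> list[dict[str, Any]]:
    """Build a pipeline that combines vector search with a metadata filter."""
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",      # hypothetical index name
                "path": "embedding",          # hypothetical field holding the vector
                "queryVector": query_vector,  # embedding of the user's query
                "numCandidates": limit * 20,  # nearest neighbors considered before ranking
                "limit": limit,               # documents returned
                # Pre-filter on operational data stored in the same document.
                "filter": {"tenant_id": {"$eq": tenant_id}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "title": 1,  # hypothetical application field
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], tenant_id="acme")
# Against a live Atlas cluster this would be executed with a driver call such as
# collection.aggregate(pipeline).
```

Because the vector, the tenant identifier, and the title live in one document, a single query can rank by similarity and filter by application data without syncing a second system.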
Building AI with MongoDB: Giving Your Apps a Voice
In previous posts in this series, we covered how generative AI and MongoDB are being used to unlock value from data of any modality and in supercharging communications . Put those topics together, and we can start to harness the most powerful communications medium (arguably!) of them all: Voice . Voice brings context, depth, and emotion in ways that text, images, and video alone simply cannot. Or as the ancient Chinese Proverb tells us, “The tongue can paint what the eyes can’t see.” The rise of voice technology has been a transformative journey that spans over a century, from the earliest days of radio and telephone communication to the cutting-edge realm of generative AI. It began with the invention of the telephone in the late 19th century, enabling voice conversations across distances. The evolution continued with the advent of radio broadcasting, allowing mass communication through spoken word and music. As technology advanced, mobile communications emerged, making voice calls accessible anytime, anywhere. Today, generative AI, powered by sophisticated machine learning (ML) models, has taken voice technology to unprecedented levels. The generation of human-like voices and text-to-speech capabilities are one example. Another is the ability to detect sentiment and create summaries from voice communications. These advances are revolutionizing how we interact with technology and information in the age of intelligent software. In this post, we feature three companies that are harnessing the power of voice with generative AI to build completely new classes of user experiences: Xoltar uses voice along with vision to improve engagement and outcomes for patients through clinical treatment and recovery. Cognigy puts voice at the heart of its conversational AI platform, integrating with back-office CRM, ERP, and ticketing systems for some of the world’s largest manufacturing, travel, utility, and ecommerce companies. 
Artificial Nerds enables any company to enrich its customer service with voice bots and autonomous agents. Let's learn more about the role voice plays in each of these very different applications. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. GenAI companion for patient engagement and better clinical outcomes XOLTAR is the first conversational AI platform designed for long-lasting patient engagement. XOLTAR’s hyper-personalized digital therapeutic app is led by Heather, XOLTAR’s live AI agent. Heather is able to conduct omni-channel interactions, including live video chats. The platform is able to use its multimodal architecture to better understand patients, get more data, increase engagement, create long-lasting relationships, and ultimately achieve real behavioral changes. Figure 1: About 50% of patients fail to stick to prescribed treatments. Through its app and platform, XOLTAR is working to change this, improving outcomes for both patients and practitioners. It provides physical and emotional well-being support through a course of treatment, adherence to medication regimes, monitoring post-treatment recovery, and collection of patient data from wearables for remote analysis and timely interventions. Powering XOLTAR is a sophisticated array of state-of-the-art machine learning models working across multiple modalities — voice and text, as well as vision for visual perception of micro-expressions and non-verbal communication. Fine-tuned LLMs coupled with custom multilingual models for real-time automatic speech recognition and various transformers are trained and deployed to create a truthful, grounded, and aligned free-guided conversation. XOLTAR’s models personalize each patient’s experience by retrieving data stored in MongoDB Atlas . 
Taking advantage of the flexible document model, XOLTAR developers store structured data, such as patient details and sensor measurements from wearables, alongside unstructured data, such as video transcripts. This data provides both long-term memory for each patient and input for ongoing model training and tuning.

MongoDB also powers XOLTAR’s event-driven data pipelines. Follow-on actions generated from patient interactions are persisted in MongoDB, with Atlas Triggers notifying downstream consuming applications so they can react in real time to new treatment recommendations and regimes.

Through its participation in the MongoDB AI Innovators program, XOLTAR’s development team receives access to free Atlas credits and expert technical support, helping them de-risk new feature development.

How Cognigy built a leading conversational AI solution

Cognigy delivers AI solutions that empower businesses to provide exceptional customer service that is instant, personalized, in any language, and on any channel. Its main product, Cognigy.AI, allows companies to create AI Agents, improving experiences through smart automation and natural language processing. This powerful solution is at the core of Cognigy's offerings, making it easy for businesses to develop and deploy intelligent voice and chatbots.

Developing a conversational AI system poses challenges for any company. These solutions must effectively interact with diverse systems like CRMs, ERPs, and ticketing systems. This is where Cognigy introduces the concept of a centralized platform. This platform allows you to construct and deploy agents through an intuitive low-code user interface.

Cognigy took a deliberate approach when constructing the platform, employing a composable architecture model, as depicted in Figure 2 below. To achieve this, it designed over 30 specialized microservices, adeptly orchestrated through Kubernetes.
These microservices were strategically fortified with MongoDB's replica sets, spanning three availability zones. In addition, sophisticated indexing and caching strategies were integrated to enhance query performance and expedite response times.

Figure 2: Cognigy's composable architecture model platform

MongoDB has been a driving force behind Cognigy's unprecedented flexibility and scalability and has been instrumental in bringing groundbreaking products like Cognigy.AI to life.

Check out the Cognigy case study to learn more about their architecture and how they use MongoDB.

The power of custom voice bots without the complexity of fine-tuning

Founded in 2017, Artificial Nerds assembled a group of creative, passionate, and "nerdy" technologists focused on unlocking the benefits of AI for all businesses. Its aim was to liberate teams from repetitive work, freeing them up to spend more time building closer relationships with their clients.

The result is a suite of AI-powered products that improve customer sales and service. These include multimodal bots for conversational AI via voice and chat along with intelligent hand-offs to human operators for live chat. These are all backed by no-code functions to integrate customer service actions with backend business processes and campaigns.

Originally the company’s ML engineers fine-tuned GPT and BERT language models to customize its products for each one of its clients. This was a time-consuming and complex process. The maturation of vector search and tooling to enable Retrieval-Augmented Generation (RAG) has radically simplified the workflow, allowing Artificial Nerds to grow its business faster.

Artificial Nerds started using MongoDB in 2019, taking advantage of its flexible schema to provide long-term memory and storage for richly structured conversation history, messages, and user data. When dealing with customers, it was important for users to be able to quickly browse and search this history.
Adopting Atlas Search helped the company meet this need. With Atlas Search, developers were able to spin up a powerful full-text index right on top of their database collections to provide relevance-based search across their entire corpus of data. The integrated approach offered by MongoDB Atlas avoided the overhead of bolting on a separate search engine and creating an ETL mechanism to sync with the database. This eliminated the cognitive overhead of developing against, and operating, separate systems. The release of Atlas Vector Search unlocks those same benefits for vector embeddings. The company has replaced its previously separate standalone vector database with the integrated MongoDB Atlas solution. Not only has this improved the productivity of its developers, but it has also improved the customer experience by reducing latency 4x . Artificial Nerds is growing fast, with revenues expanding 8% every month. The company continues to push the boundaries of customer service by experimenting with new models including the Llama 2 LLM and multilingual sentence transformers hosted in Hugging Face. Being part of the MongoDB AI Innovators program helps Artificial Nerds stay abreast of all of the latest MongoDB product enhancements and provides the company with free Atlas credits to build new features. Getting started Check out our MongoDB for AI page to get access to all of the latest resources to help you build. We see developers increasingly adopting state-of-the-art multimodal models and MongoDB Atlas Vector Search to work with data formats that have previously been accessible only to those organizations with access to the very deepest data science resources. 
Check out some examples from our previous Building AI with MongoDB blog post series here:

Building AI with MongoDB: first qualifiers includes AI at the network edge for computer vision and augmented reality, risk modeling for public safety, and predictive maintenance paired with Question-Answering generation for maritime operators.
Building AI with MongoDB: compliance to copilots features AI in healthcare along with intelligent assistants that help product managers specify better products and sales teams compose emails that convert 2x higher.
Building AI with MongoDB: unlocking value from multimodal data showcases open source libraries that transform unstructured data into a usable JSON format, entity extraction for contracts management, and making sense of “dark data” to build customer service apps.
Building AI with MongoDB: Cultivating Trust with Data covers three key customer use cases of improving model explainability, securing generative AI outputs, and transforming cyber intelligence with the power of MongoDB.
Building AI with MongoDB: Supercharging Three Communication Paradigms features developer tools that bring AI to existing enterprise data, conversational AI, and monetization of video streams and the metaverse.

There is no better time to release your own inner voice and get building!
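
For readers curious about what the relevance-based search over conversation history described in the Artificial Nerds story might look like in practice, the following Python sketch builds an aggregation pipeline around the Atlas Search `$search` stage. This is an illustration rather than the company's implementation: the field name `message` and the helper function are hypothetical, and the pipeline assumes an Atlas Search index named `default` already exists on the collection.

```python
from typing import Any


def build_history_search_pipeline(query: str, limit: int = 10) -> list[dict[str, Any]]:
    """Build a pipeline for relevance-ranked full-text search over messages."""
    return [
        {
            "$search": {
                "index": "default",  # assumed Atlas Search index name
                "text": {
                    "query": query,     # the words the user typed
                    "path": "message",  # hypothetical field holding message text
                },
            }
        },
        {"$limit": limit},  # keep only the top-ranked matches
        {
            "$project": {
                "_id": 0,
                "message": 1,
                "score": {"$meta": "searchScore"},  # relevance score from Atlas Search
            }
        },
    ]


pipeline = build_history_search_pipeline("refund for delayed order", limit=5)
# With a live Atlas cluster: results = collection.aggregate(pipeline)
```

The search stage runs against the same collection that stores the conversation history, which is the point made above about avoiding a separate search engine and an ETL job to keep it in sync.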
Supercharging Edge-to-Cloud Strategy
The emergence of Big Data and the proliferation of AI/ML are, today more than ever, pushing enterprises' digital strategies toward more sophisticated systems that help them become data-driven organizations. That said, the constant dependency on legacy systems makes it difficult for many enterprises to even access their edge data and make use of it in time to make operational and business decisions.

From healthcare to retail, manufacturing to telecom, companies that can successfully adopt IoT into their operations have proven to grow significantly faster than laggards within their respective industries. Modern IoT solutions enable businesses to capture and visualize edge data in real time, resulting in rich insights into their operations. By bringing computing to the edge, they can also deploy a wide variety of applications that help them take action on critical data right then and there, making operations significantly more efficient.

The emergence of generative AI tools is the disruptive force that has revolutionized business operations at both a strategic and an operational level. It has supercharged corporate data strategies by taking on the heavy lifting of processing and analysis and by automating business activities through triggered reactions. This streamlining of operations has not only increased productivity but has also enabled faster and more efficient decision-making. By relieving technical and analytics teams of arduous tasks, these tools free up resources to focus on important creative aspects of business, unlocking meaningful business value with relatively low effort.

Data from the edge plays a key role in improving a company's AI/ML strategies, as it helps enrich their corporate models and improve associated outcomes. For this reason, it is imperative that enterprises modernize their edge-to-cloud stack with solutions that can be easily implemented and adapted to their growing data needs.
Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Modernizing applications with MongoDB

Successful modernization requires the right service provider, with the expertise and the right tools to adapt to an organization's unique needs and business goals. MongoDB Atlas, App Services, and Device Sync provide the infrastructure needed for enterprises to implement these modern solutions and start reaping their benefits. Below is a reference architecture for IoT solutions developed by WeKan that can be implemented across industries.

Highlights of the architecture

The solution is cloud-agnostic and compatible with services from any of the cloud providers (AWS, GCP, Azure).
Atlas Device SDK’s Data Ingest provides performant behavior for heavy client-side insert-only workloads of structured and unstructured data, which is then streamed to Atlas with automatic clean-up.
Out-of-the-box synchronization allows seamless and secure transport of data from the device to the cloud using Atlas Device Sync.
Built-in conflict resolution with document and field-level permissions offers reliable bi-directional sync capabilities and ensures data consistency at all times.
Atlas Device SDK offers computing at the edge, allowing businesses to take action on field and telemetry data without the need for connectivity to the cloud.
MongoDB’s native time series collections, with hands-free schema optimization, support high-efficiency storage and low-latency querying.
Change streams allow applications to access real-time data changes in the database without any complexity or risk, so IoT applications can subscribe to all data changes in real time and act on them as needed.
MongoDB’s schema flexibility delivers agility to the business, as engineers can seamlessly make changes and additions to the schema without downtime.
MongoDB Atlas, combined with industry-leading data warehousing solutions, gives businesses first-in-class, real-time business intelligence capabilities.

MongoDB and WeKan

Together, MongoDB and WeKan offer a powerhouse solution that combines technical capabilities with the right expertise. Their solution streamlines the adoption process, making it easier and safer for customers to modernize their edge-to-cloud stack.

Providing the right expertise and support with the Jumpstart Program: From architecture design to migration strategy definition and implementation support, a mixed team of specialists from MongoDB and WeKan works hand in hand with customers to ensure quick and correct implementation of the MongoDB technology.
Offering the right tooling to accelerate time to market: WeKan's Migration Acceleration suite offers full database and application code analysis to best prepare for the migration, and MongoDB's Relational Migrator helps accelerate the transformation and transport of data from RDBMS to MongoDB. Together, these tools help reduce overall migration efforts and costs by 70%.
A highly specialized service at the best price-point: WeKan’s Global Delivery Model offers architecture design and implementation support at a competitive price-point, making it easier for enterprises to access the expertise needed to migrate away from legacy systems and safely implement modern solutions around MongoDB.

Here are a couple of examples of how these IoT solutions can be applied across multiple industries:

Automotive industry: Predictive maintenance with MongoDB & GCP

Leveraging Atlas Device SDK and MongoDB Atlas helps automakers deploy applications that use real-time data to proactively detect failures and efficiently schedule maintenance events.
Vehicle telemetry data is stored in Atlas Device SDK (onboard) and synced using Atlas Device Sync to MongoDB Atlas.
Smart factory telemetry data about production lines is synced via MQTT to GCP Cloud IoT Core and, through Pub/Sub, back to MongoDB Atlas.
Data is transferred from MongoDB Atlas to a data warehouse or data lakehouse, such as BigQuery or Databricks, for analysis.
Predictive maintenance ML models are executed on the data from the data warehouse to infer the assets that require maintenance in the near term.
These are then processed and stored back in MongoDB Atlas as tickets for further action.
These tickets are then assigned to users and synced to their mobile application using Atlas Device Sync.

Manufacturing: Industry 4.0

Leveraging Atlas Device SDK at the edge and MongoDB Atlas, manufacturers can seamlessly transport their factory data to the cloud, gain business insights, and act on it as needed.

Sensors on the production lines in the manufacturing plant transmit telemetry data about the customer orders being built over MQTT to the local Atlas Device SDK Gateway.
The Atlas Device SDK Gateway sends the data in real time via Atlas Device Sync back to MongoDB Atlas.
With centralized information, customers, factory managers, and warehouse operators can all see real-time data about orders, inventory, and manufacturing timelines.

Conclusion

As enterprises grapple with the complexities of data management, real-time synchronization, scalability, and edge computing, the MongoDB and WeKan partnership offers powerful solutions to tackle these challenges head-on. Together, they help customers move away from legacy systems and implement complete edge-to-cloud solutions that harness the full potential of IoT for better data access, improved insights, and, ultimately, enhanced business outcomes.
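
Two of the building blocks named in the architecture highlights, time series collections and change streams, can be sketched in a few lines of Python. This is a rough illustration and not WeKan's implementation: the collection names, the fields `ts`, `vehicle_id`, `asset_id`, and `assignee`, and both helper functions are hypothetical. The `db` argument is assumed to be a PyMongo `Database`. The change stream is shown against a regular `tickets` collection because, as far as we are aware, time series collections themselves cannot be watched.

```python
from typing import Any, Optional

# Options for a time series collection holding vehicle telemetry.
# Field names here are assumptions chosen for the example.
TIMESERIES_OPTIONS = {
    "timeField": "ts",          # timestamp of each measurement
    "metaField": "vehicle_id",  # identifies the source of the series
    "granularity": "seconds",   # expected spacing between measurements
}


def create_telemetry_collection(db: Any, name: str = "telemetry") -> Any:
    """Create a time series collection on a PyMongo Database object."""
    return db.create_collection(name, timeseries=TIMESERIES_OPTIONS)


def notification_from_change(change: dict) -> Optional[dict]:
    """Turn a change stream insert event on `tickets` into a notification."""
    if change.get("operationType") != "insert":
        return None  # ignore updates, deletes, and other event types
    ticket = change["fullDocument"]
    return {
        "assignee": ticket.get("assignee"),
        "message": f"New maintenance ticket for {ticket['asset_id']}",
    }


# With a live deployment (a replica set or Atlas cluster), events would come from:
#   with db["tickets"].watch([{"$match": {"operationType": "insert"}}]) as stream:
#       for change in stream:
#           notification_from_change(change)
```

Keeping the event handler as a plain function over the change document makes it easy to unit test without a running database.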
Retrieval Augmented Generation (RAG): The Open-Book Test for GenAI
The release of ChatGPT in November 2022 marked a groundbreaking moment for AI, introducing the world to an entirely new realm of possibilities created by the fusion of generative AI and machine learning foundation models, or large language models (LLMs).

In order to truly unlock the power of LLMs, organizations need to not only access the innovative commercial and open-source models but also feed them vast amounts of quality internal and up-to-date data. By combining a mix of proprietary and public data in the models, organizations can expect more accurate and relevant LLM responses that better mirror what's happening at the moment.

The ideal way to do this today is by leveraging retrieval-augmented generation (RAG), a powerful approach in natural language processing (NLP) that combines information retrieval and text generation. Most people by now are familiar with the concept of prompt engineering, which is essentially augmenting prompts to direct the LLM to answer in a certain way. With RAG, you're augmenting prompts with proprietary data so that the LLM answers based on that contextual data. The retrieved information serves as a basis for generating coherent and contextually relevant text. This combination allows AI models to provide more accurate, informative, and context-aware responses to queries or prompts.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Applying retrieval-augmented generation (RAG) in the real world

Let's use a stock quote as an example to illustrate the usefulness of retrieval-augmented generation in a real-world scenario. Since LLMs aren't trained on recent data like stock prices, an LLM asked for a current quote will either hallucinate and make up an answer or deflect from answering the question entirely.
Using retrieval-augmented generation, you would first fetch the latest news snippets from a database that contains the latest stock news (often using vector embeddings in a vector database or MongoDB Atlas Vector Search). Then, you insert or "augment" these snippets into the LLM prompt. Finally, you instruct the LLM to reference the up-to-date stock news in answering the question. Because RAG requires no retraining of the LLM and retrieval is very fast (sub-100 ms latency), it is well suited for real-time applications. Another common application of retrieval-augmented generation is in chatbots or question-answering systems. When a user asks a question, the system can use the retrieval mechanism to gather relevant information from a vast dataset, and then it generates a natural language response that incorporates the retrieved facts.
RAG vs. fine-tuning
Users will immediately bump up against the limits of GenAI anytime there's a question that requires information that sits outside the LLM's training corpus, resulting in hallucinations, inaccuracies, or deflection. RAG fills in the gaps in knowledge that the LLM wasn't trained on, essentially turning the question-answering task into an “open-book quiz,” which is easier and less complex than an open and unbounded question-answering task. Fine-tuning is another way to augment LLMs with custom data, but unlike RAG, it is like giving the model entirely new memories or a lobotomy. It's also time- and resource-intensive, generally not viable for grounding LLMs in a specific context, and especially unsuitable for highly volatile, time-sensitive information and personal data.
Conclusion
Retrieval-augmented generation can improve the quality of generated text by ensuring it's grounded in relevant, contextual, real-world knowledge.
It can also help in scenarios where the AI model needs to access information that it wasn't trained on, making it particularly useful for tasks that require factual accuracy, such as research, customer support, or content generation. By leveraging RAG with your own proprietary data, you can better serve your current customers and give yourself a significant competitive edge with reliable, relevant, and accurate AI-generated output. To learn more about how Atlas helps organizations integrate and operationalize GenAI and LLM data, download our white paper, Embedding Generative AI and Advanced Search into your Apps with MongoDB . If you're interested in leveraging generative AI at your organization, reach out to us today and find out how we can help your digital transformation.
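The three steps from the stock-news example (retrieve snippets, augment the prompt, instruct the model) can be sketched in a few lines of Python. This is a minimal illustration, not a production pattern: the `retrieve` function ranks snippets by simple word overlap as a stand-in for vector search, the news snippets and the ACME ticker are invented, and the final call to an LLM is left out because it depends on the model provider.

```python
def retrieve(question, corpus, top_k=2):
    """Stand-in retrieval step: rank snippets by word overlap with the question.

    A real system would typically compare vector embeddings instead.
    """
    question_words = set(question.lower().split())
    ranked = sorted(
        corpus,
        key=lambda doc: len(question_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_augmented_prompt(question, snippets):
    """Augmentation step: place the retrieved snippets in front of the question."""
    context = "\n".join(f"- {snippet}" for snippet in snippets)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Invented example data, for illustration only.
news = [
    "ACME stock closed at 42 dollars today after strong earnings",
    "Local bakery wins award for best sourdough",
    "Analysts raise ACME stock price target",
]
question = "What is the latest ACME stock price"

prompt = build_augmented_prompt(question, retrieve(question, news))
print(prompt)  # this prompt would then be sent to the LLM of your choice
```

The model never needs retraining here: updating the `news` collection is enough to change what the prompt contains.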
4 Key Considerations for Unlocking the Power of GenAI
Vector Search and LLM Essentials - What, When and Why
Vector search and, more broadly, Artificial Intelligence (AI) are more popular now than ever. These terms are appearing everywhere. Technology companies around the globe are scrambling to release vector search and AI features in an effort to be part of this growing trend. As a result, it's unusual to come across a homepage for a data-driven business and not see a reference to vector search or large language models (LLMs). In this blog, we'll cover what these terms mean while examining the events that led to their current popularity. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
What is vector search?
Vectors are encoded representations of unstructured data like text, images, and audio in the form of arrays of numbers.
Figure 1: Data is turned into vectors by embedding models
These vectors are produced by machine learning (ML) techniques called "embedding models". These models are trained on large corpora of data. Embedding models effectively capture meaningful relationships and similarities between data. This enables users to query data based on its meaning rather than on its literal content. This fact unlocks more efficient data analysis tasks like recommendation systems, language understanding, and image recognition. Every search starts with a query and, in vector search, the query is represented by a vector. The job of vector search is finding, from the vectors stored in a database, those that are most similar to the vector of the query. This is the basic premise. It is all about similarity. This is why vector search is often called similarity search. Note: similarity also applies to ranking algorithms that work with non-vector data. To understand the concept of vector similarity, let’s picture a three-dimensional space. In this space, the location of a data point is fully determined by three coordinates.
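To make the three-dimensional picture concrete, here is a small Python sketch that stores a few points with three coordinates each and finds the one closest to a query point. The coordinates are invented for illustration (real embedding models produce hundreds or thousands of dimensions), and the exhaustive comparison shown here is only meant to convey the idea of "closest vector wins".

```python
import math

# Toy three-dimensional "embeddings"; the coordinates are made up.
stored_vectors = {
    "car": (0.9, 0.1, 0.0),
    "airplane": (0.1, 0.9, 0.2),
    "banana": (0.0, 0.1, 0.9),
}


def euclidean_distance(a, b):
    """Straight-line distance between two points with the same number of coordinates."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, vectors):
    """Return the name of the stored vector that lies closest to the query vector."""
    return min(vectors, key=lambda name: euclidean_distance(query, vectors[name]))


query_vector = (0.8, 0.2, 0.1)  # a hypothetical query that lands near "car"
print(most_similar(query_vector, stored_vectors))  # car
```

The same functions work unchanged for vectors of any length, which is why the three-dimensional intuition carries over to spaces with many more dimensions.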
Figure 2: Location of a point P in a three-dimensional space In the same way, if a space has 1024 dimensions, it takes 1024 coordinates to locate a data point. Figure 3: Point P in a sphere that represents a multi-dimensional space Vectors also provide the location of data points in multidimensional spaces. In fact, we can treat the values in a vector as an array of coordinates. Once we have the location of the data points — the vectors — their similarity to each other is calculated by measuring the distance between them in the vector space. Points that are closer to each other in the vector space represent concepts that are more similar in meaning. For example, "tire" has a greater similarity to "car" and a lesser one to "airplane." However, "wing" would only have a similarity to "airplane." Therefore, the distance between the vectors for “tire” and “car” would be smaller than the distance between the vectors for “tire” and “airplane.” Yet, the distance between “wing” and “car” would be enormous. In other words, “tire” is relevant when we talk about a “car,” and to a lesser extent, an “airplane.” However, a “wing” is only relevant when we talk about an “airplane” and not relevant at all when we talk about a “car” (at least until flying cars are a viable mode of transport). The contextualization of data — regardless of the type — allows vector search to retrieve the most relevant results to a given query. A simple example of similarity Table 1: Example of similarity between different terms What are Large Language Models? LLMs are what bring AI to the vector search equation. LLMs and human minds both understand and associate concepts in order to perform certain natural language tasks, such as following a conversation or understanding an article. LLMs, like humans, need training in order to understand different concepts. For example, do you know what the term “corium” pertains to? Unless you're a nuclear engineer, probably not. 
The same happens with LLMs: if they are not trained in a specific domain, they are not able to understand concepts and therefore perform poorly. Let’s look at an example. LLMs understand pieces of text thanks to their embedding layer. This is where words or sentences are converted into vectors. In order to visualize vectors, we are going to use word clouds. Word clouds are closely related to vectors in the sense that they are representations of concepts and their context. First, let’s see the word cloud that an embedding model would generate for the term “corium” if it was trained with nuclear engineering data: Figure 4: Sample word cloud from a model trained with nuclear data As shown in the picture above, the word cloud indicates that corium is a radioactive material that has something to do with safety and containment structures. But, corium is a special term that can also be applied to another domain. Let’s see the word cloud resulting from an embedding model that has been trained in biology and anatomy: Figure 5: Sample word cloud from a model trained with biology data In this case, the word cloud indicates that corium is a concept related to skin and its layers. What happened here? Is one of the embedding models wrong? No. They have both been trained with different data sets. That is why finding the most appropriate model for a specific use case is crucial. One common practice in the industry is to adopt a pre-trained embedding model with strong background knowledge. One takes this model and then fine-tunes it with the domain-specific knowledge needed to perform particular tasks. The quantity and quality of the data used to train a model is relevant as well. We can agree that a person who has read just one article on aerodynamics will be less informed on the subject than a person who studied physics and aerospace engineering. 
Similarly, models that are trained with large sets of high-quality data will be better at understanding concepts and generating vectors that more accurately represent them. This creates the foundation for a successful vector search system. It is worth noting that although LLMs use text embedding models, vector search goes beyond that. It can deal with audio, images, and more. It is important to remember that the embedding models used for these cases share the same approach. They also need to be trained with data (images, sounds, and so on) in order to be able to understand the meaning behind it and create the appropriate similarity vectors.
When was vector search created?
MongoDB Atlas Vector Search currently provides three approaches to calculate vector similarity. These are also referred to as distance metrics, and consist of:
euclidean distance
cosine similarity
dot product
While each metric is different, for the purpose of this blog, we will focus on the fact that they all measure distance. Atlas Vector Search feeds these distance metrics into an approximate nearest neighbor (ANN) algorithm, an approximate variant of the k-nearest neighbors (KNN) algorithm, to find the stored vectors that are most similar to the vector of the query. In order to speed this process up, vectors are indexed using an algorithm called hierarchical navigable small world (HNSW). HNSW guides the search through a network of interconnected data points so that only the most relevant data points are considered. Using one of the three distance metrics in conjunction with the HNSW and KNN algorithms constitutes the foundation for performing vector search on MongoDB Atlas. But how old are these technologies? One might think they are recent inventions by a bleeding-edge quantum computing lab, but the truth is far from that.
Figure 6: Timeline of vector search technologies
Euclidean distance was formulated around 300 BC, the cosine and the dot product in 1881, the KNN algorithm in 1951, and the HNSW algorithm in 2016.
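As a brief aside, the three metrics listed above (Euclidean distance, cosine, and dot product) are simple enough to write out directly. The sketch below uses plain Python and two invented vectors; it illustrates the formulas only and says nothing about how Atlas Vector Search computes them internally.

```python
import math


def dot_product(a, b):
    """Sum of the pairwise products; larger values mean more similar."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance; smaller values mean more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors; 1.0 means same direction."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


# Two invented vectors that point in the same direction but differ in length.
a = (1.0, 2.0, 2.0)
b = (2.0, 4.0, 4.0)

print(dot_product(a, b))         # 18.0
print(euclidean_distance(a, b))  # 3.0
print(cosine_similarity(a, b))   # 1.0
```

Note how the metrics disagree about these two vectors: cosine treats them as identical because it ignores length, while Euclidean distance still reports a gap between them. This is one reason the choice of metric usually follows the embedding model in use.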
What this means is that the foundations for state-of-the-art vector search were fully available back in 2016. So, although vector search is today’s hot topic, it has been possible to implement it for several years.
When were LLMs created?
In 2017, there was a breakthrough: the transformer architecture. Presented in the famous paper Attention Is All You Need, this architecture introduced a neural network model for natural language processing (NLP) tasks. This enabled ML algorithms to process language data at a scale an order of magnitude greater than was previously possible. As a result, the amount of information that could be used to train the models increased dramatically. This paved the way for the first LLM to appear in 2018: GPT-1 by OpenAI. LLMs use embedding models to understand pieces of text and perform certain natural language tasks like question answering or machine translation. LLMs are essentially NLP models that were re-branded due to the large amount of data they are trained with, hence the word large in LLM. The graph below shows the number of parameters in ML models over the years. A dramatic increase can be observed in 2017 after the transformer architecture was published.
Figure 7: Parameter count of ML systems through time. Source: towardsdatascience.com
Why are vector search and LLMs so popular?
As stated above, the technology for vector search was fully available back in 2016. However, it did not become particularly popular until the end of 2022. Why? Although the ML industry has been very active since 2018, LLMs were not widely available or easy to use until OpenAI’s release of ChatGPT in November 2022. The fact that OpenAI allowed everyone to interact with an LLM with a simple chat is the key to its success. ChatGPT revolutionized the industry by enabling the average person to interact with NLP algorithms in a way that would have otherwise been reserved for researchers and scientists.
As can be seen in the figure below, OpenAI’s breakthrough led to the popularity of LLMs skyrocketing. Concurrently, ChatGPT became a mainstream tool used by the general public. The influence of OpenAI on the popularity of LLMs is also evidenced by the fact that both OpenAI and LLMs had their first popularity peak simultaneously. (See figure 8.)
Figure 8: Popularity of the terms LLM and OpenAI over time. Source: Google Trends
Here is why. LLMs are so popular because OpenAI made them famous with the release of ChatGPT. Because LLMs work with embeddings, storing and searching large amounts of vectors became a challenge, and the adoption of vector search increased in tandem. This is the largest contributing factor to the industry shift. This shift resulted in many data companies introducing support for vector search and other functionalities related to LLMs and the AI behind them.
Conclusion
Vector search is a modern disruptor. The increasing value of both vector embeddings and advanced mathematical search processes has catalyzed vector search adoption to transform the field of information retrieval. Vector generation and vector search might be independent processes, but when they work together their potential is limitless. To learn more, visit our Atlas Vector Search product page. To get started using Vector Search, sign up for Atlas or log in to your account.
Building AI with MongoDB: Supercharging Three Communication Paradigms
Communication mediums are core to who we are as humans, from understanding each other to creating bonds and a shared purpose. The methods of communication have evolved over thousands of years, from cave drawings and scriptures to now being able to connect with anyone at any time via internet-enabled devices. The latest paradigm shift to supercharge communication is through the use and application of natural language processing and artificial intelligence. In our latest roundup of AI innovators building with MongoDB, we’re going to focus on three companies building the future across three mediums of communication: data, language, and video. Our blog begins by featuring SuperDuperDB . The company provides tools for developers to apply AI and machine learning on top of their existing data stores for generative AI applications such as chatbots, Question-Answering (Q-A), and summarization. We then cover Algomo , who uses generative AI to help companies offer their best and most personalized service to customers and employees across more than 100 languages. Finally, Source Digital is a monetization platform delivering a new era of customer engagement through video and the metaverse. Let’s dive in to learn more about each company and use case. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Bringing AI to your database SuperDuperDB is an open-source Python package providing tools for developers to apply AI and machine learning on top of their existing data stores. Developers and data scientists continue to use their preferred tools, avoiding both data migration and duplication to specialized data stores. They also have the freedom to run SuperDuperDB anywhere, avoiding lock-in to any one AI ecosystem. With SuperDuperDB developers can: Deploy their chosen AI models to automatically compute outputs (inference) in their database in a single environment with simple Python commands. 
Train models on their data simply by querying without additional ingestion and pre-processing.
Integrate AI APIs (such as OpenAI) to work together with other models on their data effortlessly.
Search data with vector search, including model management and serving.
Today SuperDuperDB supports MongoDB alongside select relational databases, cloud data warehouses, data lakehouses, and object stores. SuperDuperDB provides an array of sample use cases and notebooks that developers can use to get started, including vector search with MongoDB, multimodal search, retrieval augmented generation (RAG), transfer learning, and many more. The team has also built an AI chatbot app that allows users to ask questions about technical documentation. The app is built on top of MongoDB and OpenAI with FastAPI and React (FARM stack) + SuperDuperDB. It showcases how easily developers can build next-generation AI applications on top of their existing data stores with SuperDuperDB. You can try the app and read more about how it is built at SuperDuperDB's documentation. “We integrate MongoDB as one of the key backend databases for our platform, the PyMongo driver for the app connectivity and Atlas Vector Search for storing and querying vector embeddings,” said Duncan Blythe, co-founder of SuperDuperDB. “It therefore made sense for us to partner more closely with the company through MongoDB Ventures. We get direct access to the MongoDB engineering team to help optimize our product, along with visibility within MongoDB’s vast ecosystem of developers.”
Here are some useful links to learn more:
SuperDuperDB GitHub
SuperDuperDB Docs Intro
SuperDuperDB Use Cases Page
SuperDuperDB Blog
Conversational support, powered by generative AI
Algomo uses generative AI to help companies offer their best service to both their customers and employees across more than 100 languages. The company’s name is a portmanteau of the words Algorithm (originating from Arabic) and Homo (human in Latin).
It reflects the two core design principles underlying Algomo’s products: Human-centered AI that amplifies and augments rather than displaces human abilities. Inclusive AI that is accessible to all, and that is non-discriminatory and unbiased in its outputs. With Algomo, customers can get a ChatGPT-powered bot up on their site in less than 3 minutes. More than just a bot, Algomo also provides a complete conversational platform. This includes Question-Answering text generators and autonomous agents that triage and orchestrate support processes, escalating to human support staff for live chat as needed. It works across any communication channel from web and Google Chat to Intercom, Slack, WhatsApp, and more. Customers can instantly turn their support articles, past conversations, slack channels, Notion pages, Google Docs, and content on their public website into personalized answers. Algomo vectorizes customer content, using that alongside OpenAI’s ChatGPT. The company uses RAG (Retrieval Augmented Generation) prompting to inject relevant context to LLM prompts and Chain-Of-Thought prompting to increase answer accuracy. A fine-tuned implementation of BERT is also used to classify user intent and retrieve custom FAQs. Taking advantage of its flexible document data model, Algomo uses MongoDB Atlas to store customer data alongside conversation history and messages, providing long-term memory for context and continuity in support interactions. As a fully managed cloud service, Algomo’s team can leave all of the operational heavy lifting to MongoDB, freeing its team up to focus on building great conversational experiences. The team considers using MongoDB as a “no-brainer,” allowing them to iterate quickly while removing the support burden via the simplicity and reliability of the Atlas platform. The company’s engineers are now evaluating Atlas Vector Search as a replacement for its current standalone vector database, further reducing costs and simplifying their codebase. 
Being able to store source data, chunks, and metadata alongside vector embeddings eliminates the overhead and duplication of synchronizing data across two separate systems. The team is also looking forward to using Atlas Vector Search for their upcoming Agent Assist feature that will provide suggested answers, alongside relevant documentation snippets, to customer service agents who are responding to live customer queries. Being part of the AI Innovators program provides Algomo with direct access to MongoDB technical expertise and best practices to accelerate its evaluation of Atlas Vector Search. Free Atlas credits, in addition to those provided by the AWS and Azure start-up programs, help Algomo reduce its development costs.
Creating a new media currency with video detection and monetization
Source Digital, Inc. is a monetization platform that delivers a new era of customer engagement through video and the metaverse. The company provides tools for content creators and advertisers to display real-time advertisements and content recommendations directly to users on websites or in video streams hosted on platforms like Netflix, YouTube, Meta, and Vimeo. Source Digital engineers built their own in-house machine learning and vector embedding models using Google Vision AI and TensorFlow. These models provide computer vision across video streams, detecting elements that automatically trigger the display of relevant ads and recommendations. An SDK is also provided to customers so that they can integrate the video detection models onto their own websites. The company started out using PostgreSQL to store video metadata and model features, alongside the pgvector extension for video vector embeddings. This initial setup worked well at a small scale, but as Source Digital grew, PostgreSQL began to creak, with costs rapidly escalating.
PostgreSQL can only be scaled vertically, and so the company encountered step changes in costs as they moved to progressively larger cloud instance sizes. Scaling limitations were compounded by the need for queries to execute resource-intensive JOIN operations. These were needed to bring together data in all of the different database tables hosting video metadata, model features, and vector embeddings. With prior MongoDB experience from an earlier audio streaming project, the company’s engineers were confident they could tame their cost challenges. Horizontal scale-out allows MongoDB to grow at much more granular levels, aligning costs with application usage. Expensive JOIN operations are eliminated because of the flexibility of MongoDB’s document data model. Now developers store the metadata, model features, and vector embeddings together in a single record. The company estimates that the migration from PostgreSQL to MongoDB Atlas and Vector Search will reduce monthly costs by 7x . These are savings that can be reinvested into accelerating delivery against the feature backlog. Being part of the MongoDB AI Innovators Program provides Source Digital with access to expert technical advice on scaling its platform, along with co-marketing opportunities to further fuel its growth. What's next? If you are getting started with building AI-enabled apps on MongoDB, sign up for our AI Innovators Program . Successful applicants get access to expert technical advice, free MongoDB Atlas credits, co-marketing opportunities, and – for eligible startups, introductions to potential venture investors. 
We’ve seen a whole host of interesting use cases and different companies building the future with AI, so you can refer back to some of our earlier blog posts below: Building AI with MongoDB: first qualifiers include AI at the network edge for computer vision and augmented reality; risk modeling for public safety; and predictive maintenance paired with Question-Answering generation for maritime operators. Building AI with MongoDB: compliance to copilots features AI in healthcare along with intelligent assistants that help product managers specify better products and help sales teams compose emails that convert 2x higher. Building AI with MongoDB: unlocking value from multimodal data showcases open source libraries that transform unstructured data into a usable JSON format; entity extraction for contracts management; and making sense of “dark data” to build customer service apps. Building AI with MongoDB: Cultivating Trust with Data covers three key customer use cases improving model explainability, securing generative AI outputs, and transforming cyber intelligence with the power of MongoDB. And please take a look at the MongoDB for Artificial Intelligence resources page for the latest best practices that get you started in turning your idea into an AI-driven reality. Consider joining our AI Innovators Program to build the next big thing in AI with us!
How to Avoid GenAI Sprawl and Complexity
There's no doubt that generative AI and large language models (LLMs) are disruptive forces that will continue to transform our industry and economy in profound ways. But there's also something very familiar about the path organizations are taking to tap into GenAI capabilities. It's the same journey that happens anytime there's a need for data that serves a very specific and narrow purpose. We've seen it with search where bolt-on full-text search engines have proliferated, resulting in search-specific domains and expertise required to deploy and maintain them. We've also seen it with time-series data where the need to deliver real-time experiences while solving for intermittent connectivity has resulted in a proliferation of edge-specific solutions for handling time-stamped data. And now we're seeing it with GenAI and LLMs, where niche solutions are emerging for handling the volume and velocity of all the new data that organizations are creating. The challenge for IT decision-makers is finding a way to capitalize on innovative new ways of using and working with data while minimizing the extra expertise, storage, and computing resources required for deploying and maintaining purpose-built solutions. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Purpose-built cost and complexity The process of onboarding search databases illustrates the downstream effects that adding a purpose-built database has on developers. In order to leverage advanced search features like fuzzy search and synonyms, organizations will typically onboard a search-specific solution such as Solr, Elasticsearch, Algolia, and OpenSearch. A dedicated search database is yet another system that requires already scarce IT resources to deploy, manage, and maintain. Niche or purpose-built solutions like these often require technology veterans who can expertly deploy and optimize them. 
More often than not, it's the responsibility of one person or a small team to figure out how to stand up, configure, and optimize the new search environment as they go along. Time-series data at the edge is another example. The effort it takes to write sync code that resolves conflicts between the mobile device and the back end consumes a significant amount of developer time. On top of that, the work is non-differentiating since users expect to see up-to-date information and not lose data as a result of poorly written conflict-resolution code. So developers are spending precious time on work that is not of strategic importance to the business, nor does it differentiate their product or service from the competition. The arrival and proliferation of GenAI and LLMs is likely to accelerate new IT investments in order to capitalize on this powerful, game-changing technology. Many of these investments will take the form of dedicated technology resources and developer talent to operationalize. But the last thing tech buyers and developers need is another niche solution that pulls resources away from other strategically important initiatives.
Documents to the rescue
Leveraging GenAI and LLMs to gain new insights, create new user experiences, and drive new sources of revenue does not have to entail additional architectural sprawl and complexity. Drawing on the powerful document data model and an intuitive API, the MongoDB Atlas developer data platform allows developers to move swiftly and take advantage of fast-paced breakthroughs in GenAI without having to learn new tools or proprietary services. Documents are the perfect vehicle for GenAI feature development because they provide an intuitive and easy-to-understand mapping of data into code objects. Plus, the flexibility they provide enables developers to adapt to ever-changing application requirements, whether it's the addition of new types of data or the implementation of new features.
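As a rough illustration of that mapping between code objects and documents, the Python sketch below turns a small dataclass into a dictionary shaped like a document and then adds a field that was not part of the original class. The `SupportArticle` class, its field names, and the three-number embedding are all hypothetical and chosen only for the example; no database connection is involved.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class SupportArticle:
    """Hypothetical application object; the fields are illustrative only."""
    title: str
    body: str
    tags: list = field(default_factory=list)
    embedding: list = field(default_factory=list)  # shortened stand-in for a real embedding


article = SupportArticle(
    title="Resetting your password",
    body="Open Settings, choose Security, then select Reset password.",
    tags=["account", "security"],
    embedding=[0.12, -0.03, 0.88],
)

# The object maps one-to-one onto a document: nested lists stay nested lists.
document = asdict(article)

# A later requirement (here, a language field) is added without reshaping existing data.
document["language"] = "en"

print(document)
```

A dictionary like `document` is the shape that a MongoDB driver accepts for inserts, so the application data and its embedding can travel together in one record.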
The huge diversity of your typical application data and even vector embeddings of thousands of dimensions can all be handled with documents. The MongoDB Query API makes developers' lives easier, allowing them to use one unified and consistent system to perform CRUD operations while also taking advantage of more sophisticated features such as keyword and vector search , analytics, and stream processing — all without having to switch between different query languages and drivers, helping to keep your tech stack agile and streamlined. Making the most out of GenAI AI-driven innovation is pushing the envelope of what is possible in terms of the user experience — but to find real transformative business value, it must be seamlessly integrated as part of a comprehensive, feature-rich application that moves the needle for companies in meaningful ways. MongoDB Atlas takes the complexity out of AI-driven projects. Our intuitive developer data platform streamlines the process of bringing new experiences to market quickly and cost-effectively. With Atlas, you can reduce the risk and complexity associated with operational and security models, data wrangling, integration work, and data duplication. To find out more about how Atlas helps organizations integrate and operationalize GenAI and LLM data, download our white paper, Embedding Generative AI and Advanced Search into your Apps with MongoDB . If you're interested in leveraging generative AI at your organization, reach out to us today and find out how we can help your digital transformation.
Boost the Accuracy of E-commerce Search Results with Atlas Vector Search
Artificial Intelligence’s (AI) growth has led to transformative advancements in the retail industry, including natural language processing, image recognition, and data analysis. These capabilities are pivotal to enhancing the efficiency and accuracy of e-commerce search results. E-commerce, characterized by its vast product catalogs and diverse customer base, generates enormous amounts of data every day. From user preferences and search histories to product reviews and purchase patterns, plus the images, video, and audio associated with product campaigns and user search, the data is both a goldmine and a challenge. Traditional search mechanisms, which rely on exact keyword matches, are inadequate at handling such nuanced and voluminous data. This is where vector search comes into play as the perfect data mining tool. As a sophisticated search mechanism, it leverages AI-driven algorithms to understand the intrinsic relationships between data points. This enables it to discern complex patterns, similarities, and contexts that conventional keyword-based searches might overlook. Let’s dig deeper into the differences between traditional keyword matching search and vector search, and answer questions like: What type of queries does vector search improve in the retail search landscape? What are the challenges associated with it? And how can your business tap into the competitive advantage it represents? Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
Traditional Keyword Matching vs. Vector Search
Traditional search functionalities for e-commerce platforms (keyword matching, typo tolerance, autocomplete, highlighting, facets, and scoring) are often built in-house or implemented on top of typical search engines like Apache Lucene, Atlas Search, or Elasticsearch, relying heavily on metadata textual descriptions.
While this has served the industry well for years, it often falls short of understanding the nuanced needs of modern consumers. For instance, a customer might be looking for a "blue floral summer dress," but if the product description lacks these terms, it might not appear in the search results, even if it perfectly matches the visual description.
Figure 1: As embeddings numerically encode the meaning of documents, semantically close documents will be geometrically close as well.
Vector search is a method that finds similar items in a dataset based on their vector representations, and offers a more efficient and accurate way to sift through large datasets. Instead of relying on exact matches, it uses mathematical techniques to measure the similarity between vectors, allowing it to retrieve items that are semantically similar to the user's query, even if the query and the item descriptions don't contain exact keyword matches.
Figure 2: Data flow diagram showcasing how applications, vector embedding algorithms, and search engines work together at a high level.
One great thing about vector search is that you can encode any type of data (text, images, or sound) and run queries on top of it, creating a much more comprehensive way of improving the relevance of your search results. Let’s explore examples of queries that involve context, intent, and similarity.
Visual similarity queries
Query: "Find lipsticks in shades similar to this coral lipstick."
Vector Search Benefit: Vector search can recognize the color tone and undertones of the specified lipstick and suggest similar shades from the same or different brands.
Data type: image or text
Contextual queries
Query: "Affordable running shoes for beginners."
Vector Search Benefit: Vector search can consider both the price range and the context of "beginners," leading to relevant shoe suggestions tailored to the user's experience level and budget.
Data type: text, audio (voice)
Natural language queries
Query: "Show me wireless noise-canceling headphones under $100."
Vector Search Benefit: Capture intent. Vector search can parse the query's intent to filter headphones with specific features (wireless, noise-canceling) and a price constraint, offering products that precisely match the request.
Data type: text, audio (voice)
Complementary product queries
Query: "Match this dress with elegant heels and a clutch."
Vector Search Benefit: Vector search can comprehend the user's request to create a coordinated outfit by suggesting shoes and accessories that complement the selected dress.
Data type: text, audio (voice), image
Challenging landscape, flexible stack
Now that we've explored different queries and the data types that could be embedded as vectors for search, we can see how much more information can be used to deliver more accurate results and fuel growth. Let’s consider some of the challenges associated with a vector search data workflow and how MongoDB Atlas Vector Search helps bridge the gap between challenges and opportunities.
Data overload
The sheer volume of products and user-generated data can be overwhelming, making it challenging to offer relevant search results. By embedding different types of data inputs like images, audio (voice), and text queries for later use with vector search, we can simplify this workload. Storing your vector encodings in the same shared operational data layer your applications are built on, and generating search indexes based on those vectors, makes it simple to add context to your application's search functionality.
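To make the idea of keeping vectors next to operational data more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not how Atlas Vector Search works internally: the product documents, the three-dimensional embeddings, and the helper names `cosine_similarity` and `nearest` are all hypothetical, and the search is a brute-force scan rather than an indexed lookup.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical product documents: the embedding is stored in the same
# document as the operational fields. The 3-dimensional vectors are made up
# for illustration; real embedding models emit far more dimensions.
products = [
    {"name": "Coral lipstick", "price": 12.0, "embedding": [0.9, 0.1, 0.0]},
    {"name": "Peach lipstick", "price": 14.0, "embedding": [0.8, 0.2, 0.1]},
    {"name": "Running shoes", "price": 60.0, "embedding": [0.0, 0.1, 0.9]},
]


def nearest(query_vector, docs, k=2):
    """Return the names of the k documents most similar to the query vector.

    This scans every document; an indexed vector search avoids the full scan.
    """
    scored = [
        (cosine_similarity(query_vector, doc["embedding"]), doc["name"])
        for doc in docs
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# A query vector pointing in the same direction as the coral lipstick.
print(nearest([0.9, 0.1, 0.0], products))
```

Because the two lipsticks point in nearly the same direction, they rank ahead of the running shoes even though no keywords are compared. This is the "geometrically close means semantically close" property that Figure 1 describes.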
Using Atlas Vector Search combined with MongoDB App Services, you can reduce operational overhead by creating a trigger that detects when a new document is created in your collections, automatically sends it to the embedding API of your choice, and stores the returned embedding in that same document.
Figure 3: Storing vectors with the data simplifies the overall architecture of your application.
As the number of documents or vectors grows, efficient indexing structures ensure that search performance remains reasonable. By simply creating an index on the embedded data field, you can optimize data retrieval, reduce the computational load, and accelerate performance, especially for nearest neighbor search tasks, where the goal is to find the items most similar to a given query. Altogether, the combination of Atlas Vector Search with App Services and indexing provides a robust and scalable solution to achieve real-time responsiveness. An indexed vector search database can provide rapid query results, making it suitable for applications like recommendation engines or live search interfaces.
Changing consumer behavior
Developing an effective vector search solution involves understanding the nuances of the retail domain. Retailers must consider factors like seasonality, trends, and user behavior to improve the accuracy of search results. To overcome this challenge, retailers will need to be able to adjust their business model by categorizing their product catalogs and user data according to different criteria. All of this information can be embedded to build more comprehensive criteria for relevance, but first it needs to be properly captured and organized. This is where the value of the flexible document model comes into play.
The document model allows you to define different fields and attributes for each category of data. This can be used to capture the various categorization criteria. Retailers could also utilize embedded subdocuments to associate relevant information with products or customers. For instance, you can embed a subdocument containing marketing campaign data, engagement channels, and geographic location within products to track their performance. As categorization criteria evolve, dynamic schema evolution allows you to add or modify fields without disrupting existing data. This flexibility easily accommodates changing business needs. Retailers may also use embedded arrays to record purchase history for customers. Each array element can represent a transaction, including product details and purchase date, facilitating segmentation based on recency and frequency. By embedding all these different data types, and leveraging the flexible capabilities of the document model, retailers can create a comprehensive and dynamic system that effectively categorizes data according to diverse criteria in a fast and resilient way. This enables personalized search experiences and enhanced customer engagement in the e-commerce space.
Sitting on a goldmine
Every retailer worldwide now realizes that with their customer data, they are sitting on a goldmine. Using the proper enabling technologies would allow them to build better experiences for their customers while infusing their applications with automated, data-driven decision-making. Retailers offering more intuitive and contextual search results can ensure their customers find what they're looking for by personalizing the relevance of their search results, enhancing satisfaction, and increasing the likelihood of successful transactions.
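The document-model ideas described above (embedded subdocuments, an embedded purchase-history array, and segmentation by recency and frequency) can be sketched in a few lines of Python. This is only an illustration: the field names, the sample customer, and the `recency_frequency` helper are assumptions made for the example, not a schema prescribed by MongoDB.

```python
from datetime import date

# Hypothetical customer document; field names are illustrative only.
customer = {
    "name": "Ada",
    # Embedded subdocument holding related attributes.
    "location": {"country": "ES", "city": "Madrid"},
    # Embedded array: one element per transaction.
    "purchases": [
        {"product": "Summer dress", "date": date(2023, 6, 10), "amount": 45.0},
        {"product": "Clutch", "date": date(2023, 8, 2), "amount": 30.0},
        {"product": "Heels", "date": date(2023, 9, 20), "amount": 80.0},
    ],
}


def recency_frequency(doc, today):
    """Summarize a customer's purchase history for segmentation."""
    purchases = doc.get("purchases", [])
    if not purchases:
        return {"days_since_last": None, "frequency": 0}
    last_purchase = max(p["date"] for p in purchases)
    return {
        "days_since_last": (today - last_purchase).days,
        "frequency": len(purchases),
    }


summary = recency_frequency(customer, today=date(2023, 10, 1))
print(summary)

# Schema evolution: a new categorization field can be added to one document
# without touching the fields that already exist.
customer["loyalty_tier"] = "gold"
```

Keeping the transactions inside the customer document means the recency and frequency figures can be derived from a single document read, which is the convenience the embedded-array approach is meant to provide.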
The future of e-commerce search lies in harnessing the power of technologies like Atlas Vector Search, which is not just another vector search database but an extension of the developer data platform, providing developers with an integrated set of data and application services. For retailers, the message is clear: to offer unparalleled shopping experiences, embracing and integrating vector search functionality with a performant and reliable platform that simplifies your data organization and storage is not just beneficial, it's essential. Learn more in our developer guide, How to Implement Databricks Workflows and Atlas Vector Search for Enhanced E-commerce Search Accuracy, and check out our GitHub repository, which contains the full code for deploying an AI-enhanced e-commerce search solution.
Building AI with MongoDB: How Metaphor Data Uses Atlas Vector Search to Change the World Through Data
Since announcing MongoDB Atlas Vector Search in preview back in June, we’ve already seen rapid adoption from developers building a wide range of AI-enabled apps. Today we're highlighting another customer who has increased efficiency while removing architectural complexity by adopting Atlas Vector Search. Metaphor is a search and discovery tool built for data scientists, data engineers, and AI practitioners. The company’s mission is to empower individuals and companies of all types to change the world through data. Metaphor is the next evolution of the Data Catalog with fully automated support for Data Governance, Data Literacy, and Data Enablement using an intuitive user interface. We recently caught up with Mars Lan, Co-founder and CTO, to learn more about the company’s journey with MongoDB and their adoption of Atlas Vector Search.
Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
Tell us a little bit about your company and what you and the team are building
We’re an early-stage startup with a mission to empower individuals and organizations to change the world through data. We refer to ourselves as the social platform for data and have a range of products that support both data teams and data consumers. Our main product is a SaaS Data Catalog that enables governance and enablement of data across the organization. We’re a small team of around 15 or so with a keen focus on product and engineering. The company was founded about 2.5 years ago.
What role does search play at the company and where did your search story begin?
Well, I will start by saying that we almost ended up with a very different story to tell you than what actually transpired! We started off our journey using DocumentDB and Elasticsearch on AWS for our database and search needs. After some time we ran into some scalability issues that caused us to evaluate (and eventually move to) MongoDB Atlas for our database needs.
When we saw MongoDB offered Atlas Search, which was based on the same underlying Lucene technology, we got very excited and began the process of migrating our search efforts over to Atlas, and this eventually laid the groundwork for adopting Atlas Vector Search later on.
So starting with those initial search needs, what got you excited about Atlas Search with MongoDB? What were your use cases?
We started to face a significant amount of maintenance and upkeep in keeping our database and Elasticsearch in sync. We previously had to build data pipelines, so if something changed in the database, it would also change in search. Once we eventually migrated everything to MongoDB Atlas Search, we no longer had to manage those pipelines. This resulted in lower latency and less likelihood of bugs, which excited our team. The other component to this was the scalability disconnect of having two different systems. We realized if we ever needed to spin up more storage or compute, we could just spin up a larger MongoDB cluster and get that extra scalability right away with the Atlas platform. Of course, one less thing to worry about is also a huge benefit. Elasticsearch is not the easiest thing to manage, so having it all in MongoDB was another big plus for us.
How did you initially learn about Atlas Vector Search and what piqued your interest?
We started experimenting with Pinecone a while back, as the AI stuff really started to explode, just to try out the tool after one of our interns had started playing around with it. It turned out not to be cost-effective to spin up a Pinecone instance for each customer, and quite difficult to scale up due to API throttling. After some time, we started looking around for other vendors for vector search. However, once we learned that MongoDB had Vector Search we got excited at the prospect of being able to use our existing tech stack for this additional functionality.
It quickly became a no-brainer to us. Since we knew we were going to move everything to Atlas, it became obvious we should just consolidate everything there, so we ended up migrating to Atlas Vector Search for all of our semantic search needs. This means one query API, one set of dependencies, and built-in sync, all in a single platform.
What were the key factors that made you pull the trigger and adopt Atlas Vector Search? What were the problems you were trying to solve?
So one key unlock for us was the semantic search side of things, where someone can ask a natural language question and get a natural language answer. This is a far preferable user experience for us compared to your Google-style keyword searches. From day one we always wanted to best serve our core customer, the engineer, but another huge constituency for us is the business or non-technical audience. These folks prefer a tool that is more intuitive to use. To best serve them we have a first-class integration into Slack and Microsoft Teams, so they can ask a question and don't have to go to another place or switch tools to get that answer. We didn’t always have the capability to do the natural language question and response, but with Atlas Vector Search this now becomes possible. Using Vector Search we now have the ability to ask the Slack bot questions like “where can I find this type of data” or “where is this one table on revenue from last quarter and who is using it” and get a natural language response back. One of the key considerations for us when looking at vendors was cost, but not just cost in terms of what shows up on an invoice. I would rather scale one system and get benefits on both (search and vector search). We saw that having to scale two systems independently was just not going to be very efficient in the long run.
Can you talk about some of the initial benefits you’ve seen so far both on the Atlas Search side as well as with Vector Search specifically?
How do you think about and quantify these benefits?
Well, one obvious thing that stands out on the search side is increased speed and being able to move quickly. MongoDB in general has a great developer experience. Our data model tends to be highly complex documents, and all the metadata tends to be highly structured and complex, so the MongoDB model fits us very well. In terms of productivity, it’s never an exact science. I will say that with the adoption of Atlas we were able to keep our engineering team size relatively constant while serving many more customers and scaling our development efforts faster, so we probably saw a 2X - 3X increase in productivity. One last item of note. We adopt the most rigorous security practices because we deal with so much customer data, so we want to ensure the highest security possible. We chose to have dedicated MongoDB clusters per customer, so every customer’s data is totally isolated from each other. When we were on Pinecone, this meant spinning up a new Pinecone pod for each customer, which would be both really hard to do and not at all financially viable. Because we are centralizing this all under MongoDB, it becomes so much easier: you can dynamically scale your cluster sizes up and down depending on the needs or requirements of small vs. large customers. There’s not the sort of waste you’d get with multiple discrete systems.
Getting started
A big thank you to Mars and the entire Metaphor Data team for sharing more about their story and use of Atlas Vector Search. Want to learn more? Head over to our Vector Search page for a variety of resources, or jump straight into our tutorial. And if you’re a startup building with AI please check out our MongoDB AI Innovators program for Atlas credits, one-on-one technical advice, access to our partner network, and more!
How to Stand Out From the Crowd When Everyone Uses Generative AI
The arrival of Generative AI powered by Large Language Models (LLMs) in 2022 has captivated business leaders and everyday consumers due to its revolutionary potential. As the dawn of another new era in technology begins, the gold rush is on to leverage Generative AI and drive disruption in markets or risk becoming a victim of said disruption. Now, a vast array of vendors are bringing Generative AI enablers and products to market. This proliferation of fast-followers leaves executives and software developers feeling overwhelmed. These promising tools must also be able to move from a demo or prototype to full-scale production use.
Check out our AI resource page to learn more about building AI-powered apps with MongoDB.
Success doesn't necessarily equate to differentiation, especially when everyone has access to the same tools. In this environment, the key to market differentiation is layering your own unique proprietary data on top of Generative AI powered by LLMs. Documents, the underlying data model for MongoDB Atlas, allow you to combine your proprietary data with LLM-powered insights in ways that previous tabular data models couldn't, providing the potential for a dynamic, superior level of market differentiation. The way to do this is by transforming your proprietary data, both structured and unstructured, into vector embeddings. Vector embeddings capture the semantic meaning and contextual information of data, making them suitable for various tasks like text classification, machine translation, sentiment analysis, and more. They provide numerical encodings that capture the structure and patterns of your data. This semantically rich representation makes it straightforward to calculate relationships and similarities between objects, allowing you to create powerful applications that weren’t possible before.
MongoDB's ability to ingest and quickly process customer data from various sources allows organizations to build a unified, real-time view of their customers, which is valuable when powering Generative AI solutions like chatbot and question-answer (Q-A) customer service experiences. We recently announced the release of MongoDB Atlas Vector Search, a fast and easy way to build semantic search and AI-powered applications by integrating the operational database and vector store in a single, unified, and fully managed platform, along with support for integrations with large language models (LLMs). Rather than create a tangled web of cut-and-paste technologies for your new AI-driven experiences, our developer data platform built on MongoDB Atlas provides the streamlined approach you need to bring those experiences to market quickly and efficiently, reducing the risk and complexity associated with operational and security models, data wrangling, integration work, and data duplication, while still keeping costs low. With MongoDB Atlas at the core of your AI-powered applications, you can benefit from a unified platform that combines the best of operational, analytical, and generative AI data services for building intelligent, reliable systems designed to stay in sync with the latest developments, scale with user demands, and keep data secure and performant. To find out more about how Atlas Vector Search enables you to create vector embeddings tailored to your needs (using the machine learning model of your choice including OpenAI, Hugging Face, and more) and store them securely in Atlas, download our white paper, Embedding Generative AI and Advanced Search into your Apps with MongoDB. If you're interested in leveraging generative AI at your organization, reach out to us today and find out how we can help.