Building AI with MongoDB: Retrieval-Augmented Generation (RAG) Puts Power in Developers’ Hands

As recently as 12 months ago, any mention of retrieval-augmented generation (RAG) would have left most of us confused. With the explosion of generative AI, however, the RAG architectural pattern has firmly established itself in the enterprise landscape. RAG presents developers with a potent combination: they can take the reasoning capabilities of pre-trained, general-purpose LLMs and feed them real-time, company-specific data. As a result, developers can build AI-powered apps whose outputs are grounded in enterprise data and knowledge that is accurate, up to date, and relevant. They can do this without turning to specialized data science teams to retrain or fine-tune models, which is a complex, time-consuming, and expensive process.

Over this series of Building AI with MongoDB blog posts, we’ve featured developers using tools like MongoDB Atlas Vector Search for RAG in a whole range of applications. Take a look at our AI case studies page and you’ll find examples spanning conversational AI with chatbots and voice bots, co-pilots, threat intelligence and cybersecurity, contract management, question-answering, healthcare compliance and treatment assistants, content discovery and monetization, and more. Further reflecting its growing adoption, Retool’s State of AI survey from a couple of weeks ago shows Atlas Vector Search earning the highest net promoter score (NPS) among developers.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

In this blog post, I’ll highlight three more interesting and novel use cases: unlocking geological data for better decision-making and accelerating the path to net zero at Eni; video and audio personalization at Potion; and unlocking insights from enterprise knowledge bases at Kovai.

Eni makes terabytes of subsurface unstructured data actionable with MongoDB Atlas

Based in Italy, Eni is a leading integrated energy company with more than 30,000 employees across 69 countries.
In 2020, the company launched a strategy to reach net zero emissions by 2050 and develop more environmentally and financially sustainable products. Sabato Severino, Senior AI Solution Architect for Geoscience at Eni, explains the role of his team: “We’re responsible for finding the best solutions in the market for our cloud infrastructure and adapting them to meet specific business needs.” Projects include using AI for drilling and exploration, leveraging cloud APIs to accelerate innovation, and building a smart platform to promote knowledge sharing across the company.

Eni’s document management platform for geosciences offers an ecosystem of services and applications for creating and sharing content. It leverages embedded AI models to extract information from documents and stores unstructured data in MongoDB. The challenges for Severino’s team were to maintain the platform as it ingested a growing volume of data (hundreds of thousands of documents and terabytes of data) and to enable different user groups to extract relevant insights from comprehensive records quickly and easily.

With MongoDB Atlas, Eni users can quickly find data spanning multiple years and geographies to identify trends and analyze models that support decision-making within their fields. The platform uses MongoDB Atlas Search to filter out irrelevant documents while also integrating AI and machine learning capabilities, such as vector search, to make it even easier to identify patterns.

“The generative AI we’ve introduced currently creates vector embeddings from documents, so when a user asks a question, it retrieves the most relevant document and uses LLMs to build the answer,” explains Severino. “We’re looking at migrating vector embeddings into MongoDB Atlas to create a fully integrated, functional system.
We’ll then be able to use Atlas Vector Search to build AI-powered experiences without leaving the Atlas platform, a much better experience for developers.”

Read the full case study to learn more about Eni and how it is making unstructured data actionable.

Video personalization at scale with Potion and MongoDB

Potion enables salespeople to personalize prospecting videos at scale. More than 7,500 sales professionals at companies including SAP, AppsFlyer, CaptivateIQ, and Opensense already use SendPotion to increase response rates, book more meetings, and build customer trust. All a sales representative needs to do is record a video template, select which words need to be personalized, and let Potion’s audio and vision AI models do the rest.

Kanad Bahalkar, co-founder and CEO at Potion, explains: “The sales rep tells us what elements need to be personalized in the video. That is typically provided as a list of contacts with their name, company, desired call-to-action, and so on. Our vision and audio models then inspect each frame and reanimate the video and audio with personalized messages lip-synced into the stream. Reanimation is done in bulk in minutes. For example, one video template can be transformed into over 1,000 unique video messages, personalized to each contact.”

Potion’s custom generative AI models are built with PyTorch and TensorFlow, and run on Amazon SageMaker. Describing their models, Kanad says, “Our vision model is trained on thousands of different faces, so we can synthesize the video without individualized AI training. The audio models are tuned on demand for each voice.”

And where does the data for the AI lifecycle live? “This is where we use MongoDB Atlas,” says Kanad. “We use the MongoDB database to store metadata for all the videos, including the source content for personalization, such as the contact list and calls to action.
For every new contact entry created in MongoDB, a video is generated for it using our AI models, and a link to that video is stored back in the database. MongoDB also powers all of our application analytics and intelligence. With the insights we generate from MongoDB, we can see how users interact with the service, capturing feedback loops, response rates, video watch times, and more. This data is used to continuously train and tune our models in SageMaker.”

On selecting MongoDB, Kanad says, “I had prior experience of MongoDB and knew how easy and fast it was to get started for both modeling and querying the data. Atlas provides the best managed database experience out there, meaning we can safely offload running the database to MongoDB. This ease of use, speed, and efficiency are all critical as we build and scale the business.”

To further enrich the SendPotion service, Kanad is planning to use more of the developer features within MongoDB Atlas. This includes Atlas Vector Search to power AI-driven semantic search and RAG for users who are exploring recommendations across video libraries. The engineering team is also planning to use Atlas Triggers to enable event-driven processing of new video content.

Potion is a member of the MongoDB AI Innovators program. Asked about the value of the program, Kanad responds, “Access to free credits helped support rapid build and experimentation on top of MongoDB, coupled with access to technical guidance and support.”

Bringing the power of Vector Search to enterprise knowledge bases

Founded in 2011, Kovai is an enterprise software company that offers multiple products in both the enterprise and B2B SaaS arena. Since its founding, the company has grown to nearly 300 employees serving over 2,500 customers. One of Kovai’s key products is Document360, a knowledge base platform for SaaS companies looking for a self-service software documentation solution.
Seeing the rise of GenAI, Kovai began developing its AI assistant, “Eddy.” The assistant answers customers’ questions using LLMs augmented with information retrieved from a Document360 knowledge base. During the development phase, Kovai’s engineering and data science teams explored multiple vector databases to power the RAG portion of the application. They found that the need to sync data between the system-of-record MongoDB database and a separate vector database introduced inaccuracies in the assistant’s answers. The release of MongoDB Atlas Vector Search provided a solution with three key advantages for the engineers:

Architectural simplicity: Atlas Vector Search’s architectural simplicity helps Kovai optimize the technical architecture needed to implement Eddy.
Operational efficiency: Atlas Vector Search allows Kovai to store both knowledge base articles and their embeddings together in MongoDB collections, eliminating the “data syncing” issues that come with other vendors.
Performance: Kovai gets faster query responses from Atlas Vector Search at scale, which helps ensure a positive user experience.

“Atlas Vector Search is robust, cost-effective, and blazingly fast!” said Saravana Kumar, CEO, Kovai, when speaking about his team’s experience.

Specifically, the team has seen average times of between two and four milliseconds to return three, five, and 10 chunks, and if the question is a closed loop, the average time drops to less than two milliseconds. You can learn more about Kovai’s journey into the world of RAG in the full case study.

Getting started

As the case studies in our Building AI with MongoDB series demonstrate, retrieval-augmented generation is a key design pattern developers can use as they build AI-powered applications for the business. Take a look at our Embedding Generative AI whitepaper to explore RAG in more detail.
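
To make the RAG pattern in these case studies more concrete, here is a minimal sketch of the retrieval step: articles and their embeddings live in the same collection, the few most similar chunks are fetched, and they are placed into an LLM prompt. It only builds the aggregation pipeline and the prompt text, so it runs without a database. The index name, field names (`embedding`, `title`, `text`), and prompt wording are assumptions made for illustration; they are not taken from Kovai’s, Eni’s, or Potion’s implementation.

```python
from typing import Any


def build_vector_search_pipeline(
    query_vector: list[float],
    index_name: str = "article_vector_index",  # hypothetical index name
    embedding_field: str = "embedding",        # hypothetical field holding the vectors
    limit: int = 3,
    num_candidates: int = 100,
) -> list[dict[str, Any]]:
    """Build an aggregation pipeline that returns the `limit` most similar chunks."""
    return [
        {
            # Atlas Vector Search stage: approximate nearest neighbor lookup.
            "$vectorSearch": {
                "index": index_name,
                "path": embedding_field,
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Keep only what the prompt needs, plus the similarity score.
            "$project": {
                "_id": 0,
                "title": 1,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


def build_prompt(question: str, chunks: list[dict[str, Any]]) -> str:
    """Combine retrieved chunks and the user's question into one LLM prompt."""
    context = "\n\n".join(f"[{chunk['title']}] {chunk['text']}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up values; a real query vector would come from an embedding model.
pipeline = build_vector_search_pipeline(query_vector=[0.12, -0.03, 0.88], limit=3)
example_chunks = [{"title": "Reset password", "text": "Open Settings and choose Reset."}]
prompt = build_prompt("How do I reset my password?", example_chunks)
```

With a driver such as PyMongo, a pipeline like this would typically be passed to `collection.aggregate(pipeline)` on a collection that has an Atlas Vector Search index defined on the embedding field, and the returned documents would be handed to `build_prompt`.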

November 28, 2023

Building AI with MongoDB: Improving Productivity with WINN.AI’s Virtual Sales Assistant

Better serving customers is a primary driver for the huge wave of AI innovations we see across enterprises. WINN.AI is a great example. Founded in November 2021 by sales tech entrepreneur Eldad Postan Koren and cybersecurity expert Bar Haleva, the company is enabling sales teams to improve productivity by increasing the time they focus on customers. WINN.AI orchestrates a multimodal suite of state-of-the-art models for speech recognition, entity extraction, and meeting summarization, relying on MongoDB Atlas as the underlying data layer.

I had the opportunity to sit down with Orr Mendelson, Ph.D., Head of R&D at WINN.AI, to learn more.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Tell us a little bit about what WINN.AI is working to accomplish

Today’s salespeople spend over 25% of their time on administrative busywork, costing organizations time, money, and opportunity. We are working to change that so that sales teams can spend more time solving their customers’ problems and less on administrative tasks.

At the heart of WINN.AI is an AI-powered real-time sales assistant that joins your virtual meetings. It detects and interprets customer questions and immediately surfaces relevant information for the salesperson, such as customer references or competitive information. It can provide prompts from a sales playbook, and also make sure meetings stay on track and on time. After the meeting concludes, WINN.AI extracts relevant information from it and updates the CRM system. WINN.AI integrates with the leading tools used by sales teams, including Zoom, HubSpot, Salesforce, and more.

Can you describe what role AI plays in your application?

Our technology allows the system to understand not only what people are saying on a sales call, but also the specific context of a sales conversation, thus optimizing meeting summaries and follow-on actions.
This includes identifying the most important talking points discussed in the meeting, knowing how to break down the captured data into different sales methodology fields (MEDDICC, BANT, etc.), and automatically pushing updates to the CRM.

What specific AI/ML techniques, algorithms, or models are utilized in the application?

We started out building and training our own custom natural language processing (NLP) algorithms and later switched to GPT-3.5 and GPT-4 for entity extraction and summarization. Our selection of models is based on the specific requirements of each application feature, balancing things like latency with context length and data modality.

We orchestrate all of the models with massive automation, reporting, and monitoring mechanisms. These are developed by our engineering teams and assure high-quality AI products across our services and users. We have a dedicated team of AI engineers and prompt engineers who develop and monitor each prompt and response, so we are continuously tuning and optimizing app capabilities.

How do you use MongoDB in your application stack?

MongoDB stores everything in the WINN.AI platform: organizations and users, sessions, their history, and more. The primary driver for selecting MongoDB was its flexibility in being able to store, index, and query data of any shape or structure. The database fluidly adapts to our application schema, which gives us a more agile approach than traditional relational databases. My developers love the ecosystem that has built up around MongoDB. MongoDB Atlas provides the managed services we need to run, scale, secure, and back up our data.

How do you see the broader benefits of MongoDB in your business?

In the ever-changing AI tech market, MongoDB is our stable anchor. MongoDB provides the freedom to work with structured and unstructured data while using any of our preferred tools, and we leave database management to the Atlas service.
This means my developers are free to create with AI while being able to sleep at night! MongoDB is familiar to our developers, so we don’t need any DBA or external experts to maintain and run it safely. We can invest those savings back into building great AI-powered products.

What are your future plans for new applications and how does MongoDB fit into them?

We’re always looking for opportunities to offer new functionality to our users. Capabilities like Atlas Search for faceted full-text navigation over data, coupled with MongoDB’s application-driven intelligence for more real-time analytics and insights, are all incredibly valuable.

Streaming is one area that I’m really excited about. Our application is composed of multiple microservices that are soon to be connected with Kafka for an event-driven architecture. Building on Kafka-based messaging, Atlas Stream Processing is another direction we will explore. It will give our services a way of continuously querying, analyzing, and reacting to streaming data without having to first land it in the database. This will give our customers even lower-latency AI outputs.

Everybody WINNs!

Wrapping up

Orr, thank you for sharing WINN.AI’s story with the community! WINN.AI is part of the MongoDB AI Innovators program, benefiting from access to free Atlas credits and technical expertise. If you are getting started with AI, sign up for the program and build with MongoDB.

November 20, 2023

Vector Search and LLM Essentials: What, When, and Why

Vector search and, more broadly, artificial intelligence (AI) are more popular now than ever. These terms are showing up everywhere. Technology companies around the world are scrambling to release vector search and AI features in order to be part of this growing trend. As a result, it is unusual to come across the homepage of a data-driven company and not see a reference to vector search or large language models (LLMs). In this blog, we cover what these terms mean while examining the events that led to their current popularity.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is vector search?

Vectors are encoded representations of unstructured data such as text, images, and audio in the form of an array of numbers.

Figure 1: Embedding models transform data into vectors

These vectors are produced by machine learning (ML) techniques called “embedding models.” These models are trained on large amounts of data. Embedding models effectively capture meaningful relationships and similarities between data. This lets users query data based on its meaning rather than on the data itself, which enables more efficient data analysis tasks such as recommendation systems, language understanding, and image recognition.

Every search starts with a query, and in vector search the query is represented by a vector. The job of vector search is to find, among the vectors stored in a database, those that are most similar to the query’s vector. This is the basic premise. It is all about similarity, which is why vector search is often called similarity search.

Note: Similarity also applies to ranking algorithms that work with non-vector data.
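
The basic premise described above (find the stored vectors most similar to the query vector) can be sketched in a few lines. This is an exhaustive comparison over a tiny in-memory dictionary, meant only to illustrate the idea; it is not how a production vector database performs the search. The three-dimensional vectors and their labels are made up for the example, since real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(
    query: list[float], stored: dict[str, list[float]], k: int = 2
) -> list[tuple[str, float]]:
    """Score every stored vector against the query and return the top k."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical, hand-written "embeddings" used only for illustration.
stored_vectors = {
    "car": [0.9, 0.1, 0.0],
    "airplane": [0.1, 0.9, 0.2],
    "banana": [0.0, 0.1, 0.9],
}
query_vector = [0.8, 0.2, 0.0]  # points mostly in the same direction as "car"

results = most_similar(query_vector, stored_vectors, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Here "car" ranks first and "airplane" second because their vectors point in directions closest to the query. Comparing the query against every stored vector becomes slow as the collection grows, which is why vector databases rely on indexes instead.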
To understand the concept of vector similarity, imagine a three-dimensional space. In this space, the location of a data point is fully determined by three coordinates.

Figure 2: Location of a point P in a three-dimensional space

In the same way, if a space has 1,024 dimensions, 1,024 coordinates are required to locate a data point.

Figure 3: Point P in a sphere that represents a multi-dimensional space

Vectors also provide the location of data points in multi-dimensional spaces. In fact, we can treat the values in a vector as an array of coordinates. Once we have the location of the data points (the vectors), their similarity to each other is calculated by measuring the distance between them in the vector space. Points that are closer to each other in the vector space represent concepts that are more similar in meaning.

For example, “tire” has a greater similarity with “car” and a lesser one with “airplane.” However, “wing” would only have a similarity with “airplane.” Therefore, the distance between the vectors for “tire” and “car” would be smaller than the distance between the vectors for “tire” and “airplane,” while the distance between “wing” and “car” would be enormous. In other words, “tire” is relevant when we talk about a “car” and, to a lesser extent, an “airplane.” A “wing,” however, is only relevant when we talk about an “airplane” and not relevant at all when we talk about a “car” (at least until flying cars are a viable means of transport). Contextualizing data, regardless of its type, allows vector search to retrieve the most relevant results for a given query.

A simple example of similarity

Table 1: Example of similarity between different terms

What are large language models?

LLMs are what bring AI into the vector search equation.
LLMs and human minds both understand and associate concepts in order to perform certain natural language tasks, such as following a conversation or understanding an article. LLMs, just like humans, need training in order to understand different concepts. For example, do you know what the term “corium” refers to? Unless you are a nuclear engineer, probably not. The same happens with LLMs: if they are not trained in a specific domain, they are not able to understand its concepts and therefore perform poorly. Let’s look at an example.

LLMs understand pieces of text thanks to their embedding layer. This is where words or sentences are converted into vectors. To visualize vectors, we will use word clouds. Word clouds are closely related to vectors in that they represent concepts and their context. First, let’s look at the word cloud that an embedding model would produce for the term “corium” if it were trained with nuclear engineering data:

Figure 4: Word cloud from a model trained with nuclear data

As shown in Figure 4, the word cloud indicates that corium is a radioactive material that has something to do with safety and containment structures. But corium is a special term that can also apply to another domain. Let’s look at the word cloud resulting from an embedding model trained in biology and anatomy:

Figure 5: Word cloud from a model trained with biology data

In this case, the word cloud indicates that corium is a term related to the skin and its layers. What happened here? Is one of the embedding models wrong? No. They were simply trained with different datasets. That is why finding the most appropriate model for a specific use case is crucial.
A common practice in the industry is to adopt a pre-trained embedding model with strong background knowledge. One takes this model and then fine-tunes it with the domain-specific knowledge needed to perform particular tasks.

The quantity and quality of the data used to train a model are also relevant. We can agree that a person who has read only one article on aerodynamics will be less informed on the subject than a person who has studied physics and aerospace engineering. Similarly, models trained with large amounts of high-quality data are better at understanding concepts and at generating vectors that represent them more accurately. This creates the foundation for a successful vector search system.

It is worth noting that although LLMs use text embedding models, vector search goes beyond that. It can handle audio, images, and more. The embedding models used for these cases follow the same approach: they also need to be trained with data (images, sounds, and so on) in order to understand the meaning behind it and create the appropriate similarity vectors.

When was vector search created?

MongoDB Atlas Vector Search currently provides three approaches to calculating vector similarity. These are also referred to as distance metrics, and they consist of:

Euclidean distance
Cosine similarity
Dot product

Although each metric is different, for the purposes of this blog we will focus on the fact that they all measure distance. Atlas Vector Search feeds these distance metrics into an approximate nearest neighbor (ANN) algorithm to find the stored vectors that are most similar to the query’s vector. To speed up this process, vectors are indexed using an algorithm called Hierarchical Navigable Small World (HNSW).
HNSW guides the search through a network of interconnected data points so that only the most relevant data points are considered. Using one of the three distance metrics in conjunction with the HNSW and KNN algorithms forms the foundation for performing vector search on MongoDB Atlas.

But how old are these technologies? We might think they are recent inventions from a cutting-edge quantum computing lab, but the truth is far from that.

Figure 6: Timeline of vector search technologies

Euclidean distance was formulated in the year 300 BC, the cosine and the dot product in 1881, the KNN algorithm in 1951, and the HNSW algorithm in 2016. This means that the foundations for modern vector search were fully available by 2016. So, although vector search is today’s hot topic, it has been possible to implement it for several years.

When were LLMs created?

In 2017, there was a breakthrough: the transformer architecture. Presented in the famous paper Attention Is All You Need, this architecture introduced a neural network model for natural language processing (NLP) tasks. This enabled ML algorithms to process language data on a scale an order of magnitude greater than was previously possible. As a result, the amount of information that could be used to train the models increased exponentially. This paved the way for the first LLM to appear in 2018: GPT-1 by OpenAI.

LLMs use embedding models to understand pieces of text and perform certain natural language tasks such as question answering or machine translation. LLMs are essentially NLP models that were rebranded because of the large amount of data they are trained with, hence the word “large” in LLM.
The graph below shows how the parameter count of ML models has grown over the years. A dramatic increase can be observed in 2017, after the transformer architecture was published.

Figure 7: Parameter count of ML systems over time.

Why are vector search and LLMs so popular?

As stated above, the technology for vector search was fully available by 2016. However, it did not become particularly popular until the end of 2022. Why?

Although the ML industry has been very active since 2018, LLMs were neither widely available nor easy to use until OpenAI released ChatGPT in November 2022. The fact that OpenAI let everyone interact with an LLM through a simple chat is the key to its success. ChatGPT revolutionized the industry by enabling the average person to interact with NLP algorithms in a way that would otherwise have been reserved for researchers and scientists. As Figure 8 shows, OpenAI’s breakthrough led to a steep rise in the popularity of LLMs. At the same time, ChatGPT became a mainstream tool used by the general public. OpenAI’s influence on the popularity of LLMs is also evidenced by the fact that OpenAI and LLMs reached their first popularity peak simultaneously.

Figure 8: Popularity of the terms LLM and OpenAI over time. Source: Google Trends

Here is why. LLMs are so popular because OpenAI made them famous with the release of ChatGPT. Because LLMs work with embeddings, searching and storing large numbers of vectors became a challenge, so the adoption of vector search increased in tandem. This is the largest contributing factor to the industry shift.
This shift resulted in many data companies introducing support for vector search and other functionality related to LLMs and the AI behind them.

Conclusion

Vector search is a modern disruptor. The increasing value of both vector embeddings and advanced mathematical search processes has catalyzed the adoption of vector search and transformed the field of information retrieval. Vector generation and vector search might be independent processes, but when they work together their potential is limitless.

To learn more, visit our Atlas Vector Search product page. To get started with Vector Search, sign up for Atlas or log in to your account.

November 16, 2023

Vector Search and LLM Essentials: What, When, and Why

Vector search and, more broadly, artificial intelligence (AI) are more popular now than ever. These terms are showing up everywhere. Technology companies around the world are scrambling to release vector search and AI features in an effort to be part of this growing trend. As a result, it is unusual to come across the homepage of a data-driven company and not see a reference to vector search or large language models (LLMs). In this blog, we cover what these terms mean while examining the events that led to their current popularity.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is vector search?

Vectors are encoded representations of unstructured data such as text, images, and audio in the form of arrays of numbers.

Figure 1: Embedding models transform data into vectors

These vectors are produced by machine learning (ML) techniques called “embedding models.” These models are trained on large datasets. Embedding models effectively capture meaningful relationships and similarities between data. This lets users query data based on its meaning rather than on the data itself, which unlocks more efficient data analysis tasks such as recommendation systems, language understanding, and image recognition.

Every search starts with a query, and in vector search the query is represented by a vector. The job of vector search is to find, among the vectors stored in a database, those that are most similar to the query’s vector. This is the basic premise. It is all about similarity, which is why vector search is often called similarity search.

Note: Similarity also applies to ranking algorithms that work with non-vector data.
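
"Similarity" between two vectors can be scored in more than one way. As a small illustration, the sketch below computes three common measures (Euclidean distance, dot product, and cosine similarity) using only the standard library. The two-dimensional vectors are made up, and the function names are illustrative rather than part of any MongoDB API. For vectors of length 1, the three measures are tied together: cosine similarity equals the dot product, and the squared Euclidean distance equals 2 minus twice the dot product, so all three produce the same ranking.

```python
import math


def dot_product(a: list[float], b: list[float]) -> float:
    """Sum of element-wise products; larger means more aligned."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance between two points; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product divided by both lengths, so only direction matters."""
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )


# Made-up unit-length vectors (each has length 1).
a = [1.0, 0.0]
b = [0.6, 0.8]
c = [0.0, 1.0]

for name, other in (("b", b), ("c", c)):
    print(
        name,
        f"euclidean={euclidean_distance(a, other):.3f}",
        f"dot={dot_product(a, other):.3f}",
        f"cosine={cosine_similarity(a, other):.3f}",
    )

# Relationship that holds for unit-length vectors only.
assert math.isclose(euclidean_distance(a, b) ** 2, 2 - 2 * dot_product(a, b))
```

By every measure, `b` is closer to `a` than `c` is. When vectors are not normalized, the measures can disagree, because the dot product also rewards vector length while cosine similarity ignores it.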
To understand the concept of vector similarity, imagine a three-dimensional space. In this space, the location of a data point is fully determined by three coordinates.

Figure 2: Location of a point P in a three-dimensional space

In the same way, if a space has 1,024 dimensions, 1,024 coordinates are required to locate a data point.

Figure 3: Point P in a sphere that represents a multi-dimensional space

Vectors also provide the location of data points in multi-dimensional spaces. In fact, we can treat the values in a vector as an array of coordinates. Once we have the location of the data points (the vectors), their similarity to each other is calculated by measuring the distance between them in the vector space. Points that are closer to each other in the vector space represent concepts that are more similar in meaning.

For example, “tire” has a greater similarity with “car” and a lesser one with “airplane.” However, “wing” would only have a similarity with “airplane.” Therefore, the distance between the vectors for “tire” and “car” would be smaller than the distance between the vectors for “tire” and “airplane,” while the distance between “wing” and “car” would be enormous. In other words, “tire” is relevant when we talk about a “car” and, to a lesser extent, an “airplane.” A “wing,” however, is only relevant when we talk about an “airplane” and not relevant at all when we talk about a “car” (at least until flying cars are a viable means of transport). Contextualizing data, regardless of its type, allows vector search to retrieve the most relevant results for a given query.

A simple example of similarity

Table 1: Example of similarity between different terms

What are large language models?

LLMs are what bring AI into the vector search equation. LLMs and human minds both understand and associate concepts in order to perform certain natural language tasks, such as following a conversation or understanding an article.
LLMs, just like humans, need training in order to understand different concepts. For example, do you know what the term “corium” refers to? Unless you are a nuclear engineer, probably not. The same happens with LLMs: if they are not trained in a specific domain, they are not able to understand its concepts and therefore perform poorly. Let’s look at an example.

LLMs understand pieces of text thanks to their embedding layer. This is where words or sentences are converted into vectors. To visualize vectors, we will use word clouds. Word clouds are closely related to vectors in that they are representations of concepts and their context. First, let’s look at the word cloud that an embedding model would generate for the term “corium” if it were trained with nuclear engineering data:

Figure 4: Sample word cloud from a model trained with nuclear data

As shown in Figure 4, the word cloud indicates that corium is a radioactive material that has something to do with safety and containment structures. But corium is a special term that can also apply to another domain. Let’s look at the word cloud resulting from an embedding model trained in biology and anatomy:

Figure 5: Sample word cloud from a model trained with biology data

In this case, the word cloud indicates that corium is a concept related to the skin and its layers. What happened here? Is one of the embedding models wrong? No. They were simply trained with different datasets. That is why finding the most appropriate model for a specific use case is crucial.

A common practice in the industry is to adopt a pre-trained embedding model with strong background knowledge. We take this model and then fine-tune it with the domain-specific knowledge needed to perform particular tasks.

The quantity and quality of the data used to train a model are also relevant.
We can agree that a person who has read only one article on aerodynamics will be less informed on the subject than a person who studied physics and aerospace engineering. Similarly, models that are trained with large sets of high-quality data will be better at understanding concepts and generating vectors that represent them more accurately. This creates the foundation for a successful vector search system. It is worth noting that although LLMs use text embedding models, vector search goes beyond that. It can deal with audio, images, and more. It is important to remember that the embedding models used for these cases share the same approach. They also need to be trained with data (images, sounds, etc.) in order to understand the meaning behind it and create the appropriate similarity vectors.

When was vector search created?

MongoDB Atlas Vector Search currently provides three approaches to calculating vector similarity. These are also referred to as distance metrics, and they consist of Euclidean distance, cosine similarity, and dot product. While each metric is different, for the purpose of this blog we will focus on the fact that they all measure distance. Atlas Vector Search feeds these distance metrics into an approximate nearest neighbor (ANN) algorithm to find the stored vectors that are most similar to the vector of the query. To speed up this process, vectors are indexed using an algorithm called hierarchical navigable small world (HNSW). HNSW guides the search through a network of interconnected data points so that only the most relevant data points are considered. Using one of the three distance metrics in conjunction with the HNSW and KNN algorithms constitutes the foundation for performing vector search on MongoDB Atlas. But how old are these technologies?
We might think they are recent inventions from a bleeding-edge quantum computing lab, but the truth is far from that.

Figure 6: Timeline of vector search technologies

Euclidean distance was formulated in the year 300 BC, the cosine and the dot product in 1881, the KNN algorithm in 1951, and the HNSW algorithm in 2016. What this means is that the foundations for state-of-the-art vector search were fully available back in 2016. So, although vector search is today's hot topic, it has been possible to implement it for several years.

When were LLMs created?

In 2017, there was a breakthrough: the transformer architecture. Presented in the famous paper Attention Is All You Need, this architecture introduced a neural network model for natural language processing (NLP) tasks. This enabled ML algorithms to process language data on an order of magnitude greater than was previously possible. As a result, the amount of information that could be used to train the models increased exponentially. This paved the way for the first LLM to appear in 2018: GPT-1 by OpenAI. LLMs use embedding models to understand pieces of text and perform certain natural language tasks such as question answering or machine translation. LLMs are essentially NLP models that were rebranded due to the large amount of data they are trained with, hence the word large in LLM. The graph below shows the size of ML models, measured in parameters, over the years. A dramatic increase can be observed in 2017, after the transformer architecture was published.

Figure 7: Parameter count of ML systems through time. Source:

Why are vector search and LLMs so popular?

As stated above, the technology for vector search was fully available back in 2016. However, it did not become particularly popular until the end of 2022.
Why? Although the ML industry has been very active since 2018, LLMs were not widely available or easy to use until OpenAI's release of ChatGPT in November 2022. The fact that OpenAI allowed everyone to interact with an LLM through a simple chat is the key to its success. ChatGPT revolutionized the industry by enabling the average person to interact with NLP algorithms in a way that would otherwise have been reserved for researchers and scientists. As can be seen in the figure below, OpenAI's breakthrough led to the popularity of LLMs skyrocketing. At the same time, ChatGPT became a mainstream tool used by the general public. The influence of OpenAI on the popularity of LLMs is also evidenced by the fact that both OpenAI and LLMs had their first popularity peak simultaneously. (See Figure 8.)

Figure 8: Popularity of the terms LLM and OpenAI over time. Source: Google Trends

Here is why. LLMs are so popular because OpenAI made them famous with the release of ChatGPT. Searching and storing large amounts of vectors became a challenge, because LLMs work with embeddings. Thus, the adoption of vector search increased in tandem. This is the largest contributing factor to the industry shift. This shift resulted in many data companies introducing support for vector search and other functionalities related to LLMs and the AI behind them.

Conclusion

Vector search is a modern disruptor. The increasing value of both vector embeddings and advanced mathematical search processes has catalyzed the adoption of vector search to transform the field of information retrieval. Vector generation and vector search might be independent processes, but when they work together their potential is limitless. To learn more, visit our Atlas Vector Search product page.
To get started with Vector Search, sign up for Atlas or log in to your account.

November 16, 2023

Vector Search and LLM Essentials: What, When, and Why

Vector search and, more broadly, artificial intelligence (AI) are more popular now than ever. These terms are cropping up everywhere. Technology companies around the globe are scrambling to release vector search and AI features in an effort to be part of this growing trend. As a result, it's unusual to come across a homepage for a data-driven business and not see a reference to vector search or large language models (LLMs). In this blog, we'll cover what these terms mean while examining the events that led to their current popularity. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is vector search?

Vectors are encoded representations of unstructured data like text, images, and audio in the form of arrays of numbers.

Figure 1: Data is turned into vectors by embedding models

These vectors are produced by machine learning (ML) techniques called “embedding models.” These models are trained on large volumes of data. Embedding models effectively capture meaningful relationships and similarities between data. This enables users to query data based on meaning rather than on the data itself. This unlocks more efficient data analysis tasks like recommendation systems, language understanding, and image recognition. Every search starts with a query and, in vector search, the query is represented by a vector. The job of vector search is to find, from the vectors stored in a database, those that are most similar to the vector of the query. This is the basic premise. It is all about similarity. This is why vector search is often called similarity search.
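The basic premise described above, finding the stored vectors most similar to a query vector, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the document IDs and three-dimensional vectors are made up, the `most_similar` helper is hypothetical, and the code compares the query against every stored vector (an exact search), whereas a production system would typically rely on an index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, stored, k=2):
    """Return the k (id, score) pairs whose vectors are most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest similarity first
    return scored[:k]

# Hypothetical 3-dimensional embeddings; real models emit hundreds or thousands of dimensions.
stored_vectors = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.0, 1.0, 0.2],
    "doc_c": [0.8, 0.3, 0.1],
}
query_vector = [1.0, 0.2, 0.0]

top = most_similar(query_vector, stored_vectors, k=2)
for doc_id, score in top:
    print(f"{doc_id}: {score:.3f}")
```

In this toy data set, the two documents whose vectors point in roughly the same direction as the query rank first, while the one pointing elsewhere is left out.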
Note: similarity also applies to ranking algorithms that work with non-vector data. To understand the concept of vector similarity, let's picture a three-dimensional space. In this space, the location of a data point is fully determined by three coordinates.

Figure 2: Location of a point P in a three-dimensional space

In the same way, if a space has 1,024 dimensions, it takes 1,024 coordinates to locate a data point.

Figure 3: Point P in a sphere that represents a multidimensional space

Vectors also provide the location of data points in multidimensional spaces. In fact, we can treat the values in a vector as an array of coordinates. Once we have the location of the data points (the vectors), their similarity to each other is calculated by measuring the distance between them in the vector space. Points that are closer to each other in the vector space represent concepts that are more similar in meaning. For example, “tire” has a greater similarity with “car” and a lesser one with “airplane.” However, “wing” would only have a similarity with “airplane.” Therefore, the distance between the vectors for “tire” and “car” would be smaller than the distance between the vectors for “tire” and “airplane.” Yet the distance between “wing” and “car” would be enormous. In other words, “tire” is relevant when we talk about a “car” and, to a lesser extent, an “airplane.” However, a “wing” is only relevant when we talk about an “airplane” and not relevant at all when we talk about a “car” (at least until flying cars are a viable mode of transport). The contextualization of data, regardless of the type, allows vector search to retrieve the most relevant results for a given query.
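To make the tire, car, airplane, and wing example concrete, here is a small sketch that measures distances between hand-picked two-dimensional vectors. The coordinates are purely illustrative assumptions (one axis loosely standing for "road vehicles" and the other for "aircraft"); they do not come from a real embedding model.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical coordinates: axis 0 ~ "road vehicles", axis 1 ~ "aircraft".
vectors = {
    "tire": [0.9, 0.3],      # mostly about cars, slightly about airplanes
    "car": [1.0, 0.0],
    "airplane": [0.0, 1.0],
    "wing": [0.0, 0.9],      # only about airplanes
}

tire_car = euclidean_distance(vectors["tire"], vectors["car"])
tire_airplane = euclidean_distance(vectors["tire"], vectors["airplane"])
wing_car = euclidean_distance(vectors["wing"], vectors["car"])
wing_airplane = euclidean_distance(vectors["wing"], vectors["airplane"])

print(f"tire-car:      {tire_car:.3f}")
print(f"tire-airplane: {tire_airplane:.3f}")
print(f"wing-car:      {wing_car:.3f}")
print(f"wing-airplane: {wing_airplane:.3f}")
```

With these made-up values, "tire" sits closer to "car" than to "airplane," and "wing" sits far from "car," which mirrors the relationships described in the text.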
A simple example of similarity

Table 1: Example of similarity between different terms

What are large language models?

LLMs are what bring AI to the vector search equation. LLMs and human minds both understand and associate concepts in order to perform certain natural language tasks, such as following a conversation or understanding an article. LLMs, like humans, need training in order to understand different concepts. For example, do you know what the term “corium” refers to? Unless you're a nuclear engineer, probably not. The same happens with LLMs: if they are not trained in a specific domain, they are not able to understand its concepts and therefore perform poorly. Let's look at an example. LLMs understand pieces of text thanks to their embedding layer. This is where words or sentences are converted into vectors. To visualize vectors, we are going to use word clouds. Word clouds are closely related to vectors in the sense that they are representations of concepts and their context. First, let's see the word cloud that an embedding model would generate for the term “corium” if it were trained with nuclear engineering data:

Figure 4: Sample word cloud from a model trained with nuclear data

As shown in the picture above, the word cloud indicates that corium is a radioactive material that has something to do with safety and containment structures. But corium is a special term that can also be applied to another domain. Let's see the word cloud resulting from an embedding model that has been trained in biology and anatomy:

Figure 5: Sample word cloud from a model trained with biology data

In this case, the word cloud indicates that corium is a concept related to skin and its layers. What happened here? Is one of the embedding models wrong? No.
They have simply been trained with different data sets. That is why finding the most appropriate model for a specific use case is crucial. One common practice in the industry is to adopt a pre-trained embedding model with strong background knowledge. One takes this model and then fine-tunes it with the domain-specific knowledge needed to perform particular tasks. The quantity and quality of the data used to train a model are relevant as well. We can agree that a person who has read only one article on aerodynamics will be less informed on the subject than a person who studied physics and aerospace engineering. Similarly, models that are trained with large sets of high-quality data will be better at understanding concepts and generating vectors that represent them more accurately. This creates the foundation for a successful vector search system. It is worth noting that although LLMs use text embedding models, vector search goes beyond that. It can deal with audio, images, and more. It is important to remember that the embedding models used for these use cases follow the same approach. They also need to be trained with data (images, sounds, etc.) in order to understand the meaning behind it and create the appropriate similarity vectors.

When was vector search created?

MongoDB Atlas Vector Search currently provides three approaches to calculating vector similarity. These are also referred to as distance metrics, and they consist of Euclidean distance, cosine similarity, and dot product. While each metric is different, for the purpose of this blog we will focus on the fact that they all measure distance.
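As a rough illustration of how the three metrics named above differ, the sketch below computes each one for the same pair of vectors using only the Python standard library. The vectors are arbitrary example values, and the functions are textbook definitions written for this post, not the Atlas implementation.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance: 0 means identical, larger means farther apart."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    """Sum of element-wise products: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product of the normalized vectors: sensitive to direction only."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

# Arbitrary example vectors: b points in the same direction as a but is twice as long.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]

euclidean = euclidean_distance(a, b)
dot = dot_product(a, b)
cosine = cosine_similarity(a, b)

print(f"euclidean distance: {euclidean:.4f}")
print(f"dot product:        {dot:.4f}")
print(f"cosine similarity:  {cosine:.4f}")
```

Because the two vectors point in the same direction, cosine similarity treats them as a perfect match, while Euclidean distance still reports a gap caused by their different lengths. Which behavior is preferable depends on whether vector magnitude carries meaning for the embedding model in use.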
Atlas Vector Search feeds these distance metrics into an approximate nearest neighbor (ANN) algorithm to find the stored vectors that are most similar to the vector of the query. To speed up this process, vectors are indexed using an algorithm called hierarchical navigable small world (HNSW). HNSW guides the search through a network of interconnected data points so that only the most relevant data points are considered. Using one of the three distance metrics in conjunction with the HNSW and KNN algorithms constitutes the foundation for performing vector search on MongoDB Atlas. But how old are these technologies? We might think they are recent inventions from a bleeding-edge quantum computing lab, but the truth is far from that.

Figure 6: Timeline of vector search technologies

Euclidean distance was formulated in the year 300 BC, the cosine and the dot product in 1881, the KNN algorithm in 1951, and the HNSW algorithm in 2016. What this means is that the foundations for state-of-the-art vector search were fully available back in 2016. So, although vector search is today's hot topic, it has been possible to implement it for several years.

When were LLMs created?

In 2017, there was a breakthrough: the transformer architecture. Presented in the famous paper Attention Is All You Need, this architecture introduced a neural network model for natural language processing (NLP) tasks. This enabled ML algorithms to process language data on an order of magnitude greater than was previously possible. As a result, the amount of information that could be used to train the models increased exponentially. This paved the way for the first LLM to appear in 2018: GPT-1 by OpenAI.
LLMs use embedding models to understand pieces of text and perform certain natural language tasks such as question answering or machine translation. LLMs are essentially NLP models that were rebranded due to the large amount of data they are trained with, hence the word large in LLM. The graph below shows the size of ML models, measured in parameters, over the years. A dramatic increase can be observed in 2017, after the transformer architecture was published.

Figure 7: Parameter count of ML systems through time. Source:

Why are vector search and LLMs so popular?

As stated above, the technology for vector search was fully available back in 2016. However, it did not become particularly popular until the end of 2022. Why? Although the ML industry has been very active since 2018, LLMs were not widely available or easy to use until OpenAI's release of ChatGPT in November 2022. The fact that OpenAI allowed everyone to interact with an LLM through a simple chat is the key to its success. ChatGPT revolutionized the industry by enabling the average person to interact with NLP algorithms in a way that would otherwise have been reserved for researchers and scientists. As can be seen in the figure below, OpenAI's breakthrough led to the popularity of LLMs skyrocketing. At the same time, ChatGPT became a mainstream tool used by the general public. The influence of OpenAI on the popularity of LLMs is also evidenced by the fact that both OpenAI and LLMs had their first popularity peak simultaneously. (See Figure 8.)

Figure 8: Popularity of the terms LLM and OpenAI over time. Source: Google Trends

Here is why.
LLMs are so popular because OpenAI made them famous with the release of ChatGPT. Searching and storing large amounts of vectors became a challenge, because LLMs work with embeddings. Thus, the adoption of vector search increased in tandem. This is the largest contributing factor to the industry shift. This shift resulted in many data companies introducing support for vector search and other functionalities related to LLMs and the AI behind them.

Conclusion

Vector search is a modern disruptor. The increasing value of both vector embeddings and advanced mathematical search processes has catalyzed the adoption of vector search to transform the field of information retrieval. Vector generation and vector search might be independent processes, but when they work together their potential is limitless. To learn more, visit our Atlas Vector Search product page. To get started with Vector Search, sign up for Atlas or log in to your account.

November 16, 2023

Atlas Vector Search Commands Highest Developer NPS in Retool State of AI 2023 Survey

Retool has just published its first-ever State of AI report and it's well worth a read. Modeled on its massively popular State of Internal Tools report, the State of AI survey took the pulse of over 1,500 tech folks spanning software engineering, leadership, product managers, designers, and more drawn from a variety of industries. The survey’s purpose is to understand how these tech folk use and build with artificial intelligence (AI). As a part of the survey, Retool dug into which tools were popular, including the vector databases used most frequently with AI. The survey found MongoDB Atlas Vector Search commanded the highest Net Promoter Score (NPS) and was the second most widely used vector database, within just five months of its release. This places it ahead of competing solutions that have been around for years. In this blog post, we’ll examine the phenomenal rise of vector databases and how developers are using solutions like Atlas Vector Search to build AI-powered applications. We’ll also cover other key highlights from the Retool report. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Vector database adoption: Off the charts (well almost...)

From mathematical curiosity to the superpower behind generative AI and LLMs, vector embeddings and the databases that manage them have come a long way in a very short time. Check out DB-Engines trends in database models over the past 12 months and you'll see that vector databases are head and shoulders above all others in popularity change. Just look at the pink line’s "up and to the right" trajectory in the chart below.

Screenshot courtesy of DB-Engines, November 8, 2023

But why have vector databases become so popular?
They are a key component in a new architectural pattern called retrieval-augmented generation, otherwise known as RAG: a potent mix that takes the reasoning capabilities of pre-trained, general-purpose LLMs and feeds them real-time, company-specific data. The results are AI-powered apps that uniquely serve the business, whether that’s creating new products, reimagining customer experiences, or driving internal productivity and efficiency to unprecedented heights. Vector embeddings are one of the fundamental components required to unlock the power of RAG. Vector embedding models encode enterprise data, no matter whether it is text, code, video, images, audio streams, or tables, as vectors. Those vectors are then stored, indexed, and queried in a vector database or vector search engine, providing the relevant input data as context to the chosen LLM. The result is AI apps grounded in enterprise data and knowledge that is relevant to the business, accurate, trustworthy, and up-to-date. As the Retool survey shows, the vector database landscape is still largely greenfield. Fewer than 20% of respondents are using vector databases today, but with the growing trend towards customizing models and AI infrastructure, adoption is guaranteed to grow.

Why are developers adopting Atlas Vector Search?

Retool's State of AI survey features some great vector databases that have blazed a trail over the past couple of years, especially in applications requiring context-aware semantic search. Think product catalogs or content discovery. However, the challenge developers face in using those vector databases is that they have to integrate them alongside other databases in their application’s tech stack. Every additional database layer in the application tech stack adds yet another source of complexity, latency, and operational overhead.
This means they have another database to procure, learn, integrate (for development, testing, and production), secure and certify, scale, monitor, and back up. And this is all while keeping data in sync across these multiple systems. MongoDB takes a different approach that avoids these challenges entirely:

Developers store and search native vector embeddings in the same system they use as their operational database.

Using MongoDB’s distributed architecture, they can isolate these different workloads while keeping the data fully synchronized. Search Nodes provide dedicated compute and workload isolation that is vital for memory-intensive vector search workloads, thereby enabling improved performance and higher availability.

With MongoDB’s flexible and dynamic document schema, developers can model and evolve relationships between vectors, metadata, and application data in ways other databases cannot.

They can process and filter vector and operational data in any way the application needs with an expressive query API and drivers that support all of the most popular programming languages.

Using the fully managed MongoDB Atlas developer data platform empowers developers to achieve the scale, security, and performance that their application users expect.

What does this unified approach mean for developers? Faster development cycles and higher-performing apps that provide lower latency with fresher data, coupled with lower operational overhead and cost. These outcomes are reflected in MongoDB’s best-in-class NPS score.

“Atlas Vector Search is robust, cost-effective, and blazingly fast!”
Saravana Kumar, CEO, Kovai, discussing the development of his company’s AI assistant

Check out our Building AI with MongoDB blog series (head to the Getting Started section to see the back issues).
Here you'll see Atlas Vector Search used for GenAI-powered applications spanning conversational AI with chatbots and voicebots, co-pilots, threat intelligence and cybersecurity, contract management, question-answering, healthcare compliance and treatment assistants, content discovery and monetization, and more.

“MongoDB was already storing metadata about artifacts in our system. With the introduction of Atlas Vector Search, we now have a comprehensive vector-metadata database that’s been battle-tested over a decade and that solves our dense retrieval needs. No need to deploy a new database we'd have to manage and learn. Our vectors and artifact metadata can be stored right next to each other.”
Pierce Lamb, Senior Software Engineer on the Data and Machine Learning team at VISO TRUST

What can you learn about the state of AI from the Retool report?

Beyond uncovering the most popular vector databases, the survey covers AI from a range of perspectives. It starts by exploring respondents' perceptions of AI. (Unsurprisingly, the C-suite is more bullish than individual contributors.) It then explores investment priorities, AI’s impact on future job prospects, and how it will likely affect developers and the skills they need in the future. The survey then explores the level of AI adoption and maturity. Over 75% of survey respondents say their companies are making efforts to get started with AI, with around half saying these were still early projects, and mainly geared towards internal applications. The survey goes on to examine what those applications are, and how useful the respondents think they are to the business. It finds that almost everyone’s using AI at work, whether they are allowed to or not, and then identifies the top pain points. It's no surprise that model accuracy, security, and hallucinations top that list. The survey concludes by exploring the top models in use.
Again, it's no surprise that OpenAI’s offerings are leading the way, but the survey also indicates growing intent to use open source models along with AI infrastructure and tools for customization in the future. You can dig into all of the survey details by reading the report.

Getting started with Atlas Vector Search

Eager to take a look at our Vector Search offering? Head over to our Atlas Vector Search product page. There you will find links to tutorials, documentation, and key AI ecosystem integrations so you can dive straight into building your own genAI-powered apps. If you want to learn more about the high-level possibilities of Vector Search, then download our Embedding Generative AI whitepaper.
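For a sense of what a query against Atlas Vector Search can look like, here is a minimal sketch that builds an aggregation pipeline around the `$vectorSearch` stage. The index name, field names, and query vector are hypothetical placeholders chosen for illustration; the sketch only constructs the pipeline as plain Python data and does not connect to a cluster.

```python
def build_vector_search_pipeline(query_vector, limit=5, num_candidates=100):
    """Build an aggregation pipeline that starts with a $vectorSearch stage.

    All names below ("vector_index", "embedding", "title") are assumptions
    for this example and must match your own index and documents.
    """
    return [
        {
            "$vectorSearch": {
                "index": "vector_index",          # hypothetical vector search index name
                "path": "embedding",              # hypothetical field holding the stored vectors
                "queryVector": query_vector,      # embedding of the user's query
                "numCandidates": num_candidates,  # nearest neighbors considered during the search
                "limit": limit,                   # number of documents to return
            }
        },
        {
            "$project": {
                "_id": 0,
                "title": 1,                                # hypothetical metadata field
                "score": {"$meta": "vectorSearchScore"},   # similarity score of each match
            }
        },
    ]

# Placeholder query vector; in practice it comes from the same embedding model
# that produced the stored vectors and has the same number of dimensions.
pipeline = build_vector_search_pipeline([0.12, -0.03, 0.88])

# With PyMongo and a live Atlas cluster, the pipeline could then be run with:
#   results = collection.aggregate(pipeline)
print(pipeline[0]["$vectorSearch"]["index"])
```

Because the vectors live in the same documents as the rest of the application data, the later pipeline stages can project or further process ordinary fields alongside the similarity score.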

November 13, 2023

Building AI with MongoDB: Giving Your Apps a Voice

In previous posts in this series, we covered how generative AI and MongoDB are being used to unlock value from data of any modality and in supercharging communications. Put those topics together, and we can start to harness the most powerful communications medium (arguably!) of them all: Voice. Voice brings context, depth, and emotion in ways that text, images, and video alone simply cannot. Or as the ancient Chinese proverb tells us, “The tongue can paint what the eyes can’t see.” The rise of voice technology has been a transformative journey that spans over a century, from the earliest days of radio and telephone communication to the cutting-edge realm of generative AI. It began with the invention of the telephone in the late 19th century, enabling voice conversations across distances. The evolution continued with the advent of radio broadcasting, allowing mass communication through spoken word and music. As technology advanced, mobile communications emerged, making voice calls accessible anytime, anywhere. Today, generative AI, powered by sophisticated machine learning (ML) models, has taken voice technology to unprecedented levels. The generation of human-like voices and text-to-speech capabilities are one example. Another is the ability to detect sentiment and create summaries from voice communications. These advances are revolutionizing how we interact with technology and information in the age of intelligent software. In this post, we feature three companies that are harnessing the power of voice with generative AI to build completely new classes of user experiences:

Xoltar uses voice along with vision to improve engagement and outcomes for patients through clinical treatment and recovery.

Cognigy puts voice at the heart of its conversational AI platform, integrating with back-office CRM, ERP, and ticketing systems for some of the world’s largest manufacturing, travel, utility, and ecommerce companies.
Artificial Nerds enables any company to enrich its customer service with voice bots and autonomous agents.

Let's learn more about the role voice plays in each of these very different applications. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

GenAI companion for patient engagement and better clinical outcomes

XOLTAR is the first conversational AI platform designed for long-lasting patient engagement. XOLTAR’s hyper-personalized digital therapeutic app is led by Heather, XOLTAR’s live AI agent. Heather is able to conduct omni-channel interactions, including live video chats. The platform is able to use its multimodal architecture to better understand patients, get more data, increase engagement, create long-lasting relationships, and ultimately achieve real behavioral changes.

Figure 1: About 50% of patients fail to stick to prescribed treatments. Through its app and platform, XOLTAR is working to change this, improving outcomes for both patients and practitioners.

It provides physical and emotional well-being support through a course of treatment, adherence to medication regimes, monitoring post-treatment recovery, and collection of patient data from wearables for remote analysis and timely interventions. Powering XOLTAR is a sophisticated array of state-of-the-art machine learning models working across multiple modalities: voice and text, as well as vision for visual perception of micro-expressions and non-verbal communication. Fine-tuned LLMs coupled with custom multilingual models for real-time automatic speech recognition and various transformers are trained and deployed to create a truthful, grounded, and aligned free-guided conversation. XOLTAR’s models personalize each patient’s experience by retrieving data stored in MongoDB Atlas.
Taking advantage of the flexible document model, XOLTAR developers store structured data, such as patient details and sensor measurements from wearables, alongside unstructured data, such as video transcripts. This data provides both long-term memory for each patient as well as input for ongoing model training and tuning. MongoDB also powers XOLTAR’s event-driven data pipelines. Follow-on actions generated from patient interactions are persisted in MongoDB, with Atlas Triggers notifying downstream consuming applications so they can react in real-time to new treatment recommendations and regimes. Through its participation in the MongoDB AI Innovators program, XOLTAR’s development team receives access to free Atlas credits and expert technical support, helping them de-risk new feature development.

How Cognigy built a leading conversational AI solution

Cognigy delivers AI solutions that empower businesses to provide exceptional customer service that is instant, personalized, in any language, and on any channel. Its main product, Cognigy.AI, allows companies to create AI Agents, improving experiences through smart automation and natural language processing. This powerful solution is at the core of Cognigy's offerings, making it easy for businesses to develop and deploy intelligent voice and chatbots. Developing a conversational AI system poses challenges for any company. These solutions must effectively interact with diverse systems like CRMs, ERPs, and ticketing systems. This is where Cognigy introduces the concept of a centralized platform. This platform allows you to construct and deploy agents through an intuitive low-code user interface. Cognigy took a deliberate approach when constructing the platform, employing a composable architecture model, as depicted in Figure 2 below. To achieve this, it designed over 30 specialized microservices, adeptly orchestrated through Kubernetes.
These microservices were strategically fortified with MongoDB's replica sets, spanning three availability zones. In addition, sophisticated indexing and caching strategies were integrated to enhance query performance and expedite response times.

Figure 2: Cognigy's composable architecture model platform

MongoDB has been a driving force behind Cognigy's unprecedented flexibility and scalability and has been instrumental in bringing groundbreaking products like Cognigy.AI to life. Check out the Cognigy case study to learn more about their architecture and how they use MongoDB.

The power of custom voice bots without the complexity of fine-tuning

Founded in 2017, Artificial Nerds assembled a group of creative, passionate, and "nerdy" technologists focused on unlocking the benefits of AI for all businesses. Its aim was to liberate teams from repetitive work, freeing them up to spend more time building closer relationships with their clients. The result is a suite of AI-powered products that improve customer sales and service. These include multimodal bots for conversational AI via voice and chat along with intelligent hand-offs to human operators for live chat. These are all backed by no-code functions to integrate customer service actions with backend business processes and campaigns. Originally the company’s ML engineers fine-tuned GPT and BERT language models to customize its products for each one of its clients. This was a time-consuming and complex process. The maturation of vector search and tooling to enable Retrieval-Augmented Generation (RAG) has radically simplified the workflow, allowing Artificial Nerds to grow its business faster. Artificial Nerds started using MongoDB in 2019, taking advantage of its flexible schema to provide long-term memory and storage for richly structured conversation history, messages, and user data. When dealing with customers, it was important for users to be able to quickly browse and search this history.
Adopting Atlas Search helped the company meet this need. With Atlas Search, developers were able to spin up a powerful full-text index right on top of their database collections to provide relevance-based search across their entire corpus of data. The integrated approach offered by MongoDB Atlas avoided the overhead of bolting on a separate search engine and creating an ETL mechanism to sync with the database. This eliminated the cognitive overhead of developing against, and operating, separate systems. The release of Atlas Vector Search unlocks those same benefits for vector embeddings. The company has replaced its previously separate standalone vector database with the integrated MongoDB Atlas solution. Not only has this improved the productivity of its developers, but it has also improved the customer experience by reducing latency 4x . Artificial Nerds is growing fast, with revenues expanding 8% every month. The company continues to push the boundaries of customer service by experimenting with new models including the Llama 2 LLM and multilingual sentence transformers hosted in Hugging Face. Being part of the MongoDB AI Innovators program helps Artificial Nerds stay abreast of all of the latest MongoDB product enhancements and provides the company with free Atlas credits to build new features. Getting started Check out our MongoDB for AI page to get access to all of the latest resources to help you build. We see developers increasingly adopting state-of-the-art multimodal models and MongoDB Atlas Vector Search to work with data formats that have previously been accessible only to those organizations with access to the very deepest data science resources. 
Check out some examples from our previous Building AI with MongoDB blog post series here: Building AI with MongoDB: first qualifiers includes AI at the network edge for computer vision and augmented reality, risk modeling for public safety, and predictive maintenance paired with Question-Answering generation for maritime operators. Building AI with MongoDB: compliance to copilots features AI in healthcare along with intelligent assistants that help product managers specify better products and sales teams compose emails that convert 2x higher. Building AI with MongoDB: unlocking value from multimodal data showcases open source libraries that transform unstructured data into a usable JSON format, entity extraction for contracts management, and making sense of “dark data” to build customer service apps. Building AI with MongoDB: Cultivating Trust with Data covers three key customer use cases of improving model explainability, securing generative AI outputs, and transforming cyber intelligence with the power of MongoDB. Building AI with MongoDB: Supercharging Three Communication Paradigms features developer tools that bring AI to existing enterprise data, conversational AI, and monetization of video streams and the metaverse. There is no better time to release your own inner voice and get building!

November 8, 2023

Announcing LangChain Templates for MongoDB Atlas

Since announcing the public preview of MongoDB Atlas Vector Search back in June, we’ve seen tremendous adoption by developers working to build AI-powered applications. The ability to store, index, and query vector embeddings right alongside their operational data in a single, unified platform dramatically boosts engineering velocity while keeping their technology footprint streamlined and efficient. Developers use Atlas Vector Search as a key part of the Retrieval-Augmented Generation (RAG) pattern. RAG feeds LLMs the additional data they need to ground their responses, providing outputs that are reliable, relevant, and accurate for the business. One of the key enabling technologies being used to bring external data into LLMs is LangChain. Just one example is healthcare innovator Inovaare, which is building AI with MongoDB and LangChain for document classification, information extraction and enrichment, and chatbots over medical data. Making it even easier for developers to build AI-powered apps, we are excited to announce our partnership with LangChain in the launch of LangChain Templates! We have worked with LangChain to create a RAG template using MongoDB Atlas Vector Search and OpenAI. This easy-to-use template can help developers build and deploy a chatbot application over their own proprietary data. LangChain Templates offer a reference architecture that’s easily deployable as a REST API using LangServe. We have also been working with LangChain to bring the latest features of Atlas Vector Search, like the recently announced dedicated vector search aggregation stage $vectorSearch, to both the MongoDB LangChain Python integration and the MongoDB LangChain JavaScript integration. We will continue working with LangChain to create more templates that allow developers to bring their ideas to production faster. If you’re building AI-powered apps on MongoDB, we’d love to hear from you.
Sign up to our AI Innovators program where successful applicants receive no-cost MongoDB Atlas credits to develop apps, access to technical resources, and the opportunity to showcase your work to the broader AI community.

November 2, 2023

Retrieval Augmented Generation (RAG): The Open-Book Test for GenAI

The release of ChatGPT in November 2022 marked a groundbreaking moment for AI, introducing the world to an entirely new realm of possibilities created by the fusion of generative AI and machine learning foundation models, or large language models (LLMs). In order to truly unlock the power of LLMs, organizations need to not only access the innovative commercial and open-source models but also feed them vast amounts of quality internal and up-to-date data. By combining a mix of proprietary and public data in the models, organizations can expect more accurate and relevant LLM responses that better mirror what's happening at the moment. The ideal way to do this today is by leveraging retrieval-augmented generation (RAG), a powerful approach in natural language processing (NLP) that combines information retrieval and text generation. Most people by now are familiar with the concept of prompt engineering, which is essentially augmenting prompts to direct the LLM to answer in a certain way. With RAG, you're augmenting prompts with proprietary data to direct the LLM to answer in a certain way based on contextual data. The retrieved information serves as a basis for generating coherent and contextually relevant text. This combination allows AI models to provide more accurate, informative, and context-aware responses to queries or prompts. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Applying retrieval-augmented generation (RAG) in the real world Let's use a stock quote as an example to illustrate the usefulness of retrieval-augmented generation in a real-world scenario. Since LLMs aren't trained on recent data like stock prices, the LLM will hallucinate and make up an answer or deflect from answering the question entirely. 
Using retrieval-augmented generation, you would first fetch the latest news snippets from a database (often using vector embeddings in a vector database or MongoDB Atlas Vector Search ) that contains the latest stock news. Then, you insert or "augment" these snippets into the LLM prompt. Finally, you instruct the LLM to reference the up-to-date stock news in answering the question. With RAG, because there is no retraining of the LLM required, the retrieval is very fast (sub 100 ms latency) and well-suited for real-time applications. Another common application of retrieval-augmented generation is in chatbots or question-answering systems. When a user asks a question, the system can use the retrieval mechanism to gather relevant information from a vast dataset, and then it generates a natural language response that incorporates the retrieved facts. RAG vs. fine-tuning Users will immediately bump up against the limits of GenAI anytime there's a question that requires information that sits outside the LLM's training corpus, resulting in hallucinations, inaccuracies, or deflection. RAG fills in the gaps in knowledge that the LLM wasn't trained on, essentially turning the question-answering task into an “open-book quiz,” which is easier and less complex than an open and unbounded question-answering task. Fine-tuning is another way to augment LLMs with custom data, but unlike RAG it's like giving it entirely new memories or a lobotomy. It's also time- and resource-intensive, generally not viable for grounding LLMs in a specific context, and especially unsuitable for highly volatile, time-sensitive information and personal data. Conclusion Retrieval-augmented generation can improve the quality of generated text by ensuring it's grounded in relevant, contextual, real-world knowledge. 
It can also help in scenarios where the AI model needs to access information that it wasn't trained on, making it particularly useful for tasks that require factual accuracy, such as research, customer support, or content generation. By leveraging RAG with your own proprietary data, you can better serve your current customers and give yourself a significant competitive edge with reliable, relevant, and accurate AI-generated output. To learn more about how Atlas helps organizations integrate and operationalize GenAI and LLM data, download our white paper, Embedding Generative AI and Advanced Search into your Apps with MongoDB . If you're interested in leveraging generative AI at your organization, reach out to us today and find out how we can help your digital transformation.
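The fetch, augment, and instruct steps from the stock-quote example can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the `retrieve` function below is a hypothetical stand-in that ranks snippets by word overlap, where a real system would run a vector similarity query (for example against Atlas Vector Search), and the sample corpus, company names, and prompt wording are all invented for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str
    score: int


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, corpus, k=2):
    """Toy retriever (an assumption for illustration): rank by word overlap.

    A real RAG system would embed the question and run a vector search.
    """
    q_tokens = tokenize(question)
    scored = [Snippet(t, len(q_tokens & tokenize(t))) for t in corpus]
    scored.sort(key=lambda s: s.score, reverse=True)
    # Keep only snippets that share at least one token with the question.
    return [s for s in scored[:k] if s.score > 0]


def build_prompt(question, snippets):
    """Augment the prompt with retrieved context and instruct the LLM to use it."""
    context = "\n".join(f"- {s.text}" for s in snippets)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical news snippets standing in for a database of recent stock news.
CORPUS = [
    "ACME stock closed up 3% after strong earnings.",
    "Globex announced a new product line.",
    "Analysts raised their ACME price target.",
]
```

The string returned by `build_prompt` is what would be sent to the LLM. Because the model itself is never retrained, refreshing the answer only requires refreshing the stored snippets.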

October 26, 2023

4 Key Considerations for Unlocking the Power of GenAI

Artificial intelligence is evolving at an unprecedented pace, and generative AI (GenAI) is at the forefront of the revolution. GenAI capabilities are vast, ranging from text generation to music and art creation. But what makes GenAI truly unique is its ability to deeply understand context, producing outputs that closely resemble those of humans. It's not just about conversing with intelligent chatbots. GenAI has the potential to transform industries, providing richer user experiences and unlocking new possibilities. In the coming months and years, we'll witness the emergence of applications that leverage GenAI's power behind the scenes, offering capabilities never before seen. Unlike with now-popular chatbots like ChatGPT, users won't necessarily realize that GenAI is working in the background. But behind the scenes, these new applications are combining information retrieval and text generation to deliver truly personalized and contextual user experiences in real time. This process is called retrieval-augmented generation, or RAG for short. So, how does retrieval-augmented generation (RAG) work, and what role do databases play in this process? Let's delve deeper into the world of GenAI and its database requirements. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

The challenge of training AI foundation models

One of the primary challenges with GenAI is the lack of access to private or proprietary data. AI foundation models, of which large language models (LLMs) are a subset, are typically trained on publicly available data but do not have access to confidential or proprietary information. Even if the data were in the public domain, it might be outdated and irrelevant. LLMs also have limitations in recognizing very recent events or knowledge. Furthermore, without proper guidance, LLMs may produce inaccurate information, which is unacceptable in most situations. Databases play a crucial role in addressing these challenges.
Instead of sending prompts directly to LLMs, applications can use databases to retrieve relevant data and include it in the prompt as context. For example, a banking application could query the user's transaction data from a legacy database, add it to the prompt, and then send this engineered prompt to the LLM. This approach ensures that the LLM generates accurate and up-to-date responses, eliminating the issues of missing data, stale data, and inaccuracies. Top 4 database considerations for GenAI applications It won't be easy for businesses to achieve real competitive advantage leveraging GenAI when everyone has access to the same tools and knowledge base. Rather, the key to differentiation will come from layering your own unique proprietary data on top of Generative AI powered by foundation models and LLMs. There are four key considerations organizations should focus on when choosing a database to leverage the full potential of GenAI-powered applications: Queryability: The database needs to be able to support rich, expressive queries and secondary indexes to enable real-time, context-aware user experiences. This capability ensures data can be retrieved in milliseconds, regardless of the complexity of the query or the size of data stored in the database. Flexible data model: GenAI applications often require different types and formats of data, referred to as multi-modal data. To accommodate these changing data sets, databases should have a flexible data model that allows for easy onboarding of new data without schema changes, code modifications, or version releases. Multi-modal data can be challenging for relational databases because they're designed to handle structured data, where information is organized into tables with rows and columns, with strict schema rules. Integrated vector search: GenAI applications may need to perform semantic or similarity queries on different types of data, such as free-form text, audio, or images. 
Vector embeddings in a vector database enable semantic or similarity queries. Vector embeddings capture the semantic meaning and contextual information of data making them suitable for various tasks like text classification, machine translation, and sentiment analysis. Databases should provide integrated vector search indexing to eliminate the complexity of keeping two separate systems synchronized and ensuring a unified query language for developers. Scalability: As GenAI applications grow in terms of user base and data size, databases must be able to scale out dynamically to support increasing data volumes and request rates. Native support for scale-out sharding ensures that database limitations aren't blockers to business growth. The ideal database solution: MongoDB Atlas MongoDB Atlas is a powerful and versatile platform for handling the unique demands of GenAI. MongoDB uses a powerful query API that makes it easy to work with multi-modal data, enabling developers to deliver more with less code. MongoDB is the most popular document database as rated by developers. Working with documents is easy and intuitive for developers because documents map to objects in object-oriented programming, which are more familiar than the endless rows and tables in relational databases. Flexible schema design allows for the data model to evolve to meet the needs of GenAI use cases, which are inherently multi-modal. By using sharding, Atlas scales out to support large increases in the volume of data and requests that come with GenAI-powered applications. MongoDB Atlas Vector Search embeds vector search indexing natively so there's no need to maintain two different systems. Atlas keeps Vector Search indexes up to date with the source data constantly. Developers can use a single endpoint and query language to construct queries that combine regular database query filters and vector search filters. 
This removes friction and provides an environment for developers to prototype and deliver GenAI solutions rapidly. Conclusion GenAI is poised to reshape industries and provide innovative solutions across sectors. With the right database solution, GenAI applications can thrive, delivering accurate, context-aware, and dynamic data-driven user experiences that meet the growing demands of today's fast-paced digital landscape. With MongoDB Atlas, organizations can unlock agility, productivity, and growth, providing a competitive edge in the rapidly evolving world of generative AI. To learn more about how Atlas helps organizations integrate and operationalize GenAI and LLM data, download our white paper, Embedding Generative AI and Advanced Search into your Apps with MongoDB . If you're interested in leveraging generative AI at your organization, reach out to us today and find out how we can help your digital transformation.
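As a rough sketch of what "a single endpoint and query language" can look like in practice, the helper below builds an aggregation pipeline that pairs a `$vectorSearch` stage with an ordinary query filter. The index name, the field names (`embedding`, `customer_id`, `text`), and the helper itself are hypothetical, and the stage options shown reflect the `$vectorSearch` stage as commonly documented; check them against the current Atlas documentation before relying on them.

```python
def build_vector_search_pipeline(
    query_vector,
    *,
    index,
    path,
    limit=5,
    num_candidates=100,
    pre_filter=None,
):
    """Build an aggregation pipeline combining vector search with a filter.

    All names passed in (index, path, filter fields) are placeholders
    chosen for illustration, not values from any real deployment.
    """
    if limit > num_candidates:
        raise ValueError("limit must not exceed num_candidates")

    stage = {
        "index": index,                    # name of the vector search index
        "path": path,                      # document field holding the embedding
        "queryVector": list(query_vector),
        "numCandidates": num_candidates,   # candidates examined by the ANN search
        "limit": limit,                    # results returned
    }
    if pre_filter:
        # Regular query filter applied together with the vector search.
        stage["filter"] = pre_filter

    return [
        {"$vectorSearch": stage},
        {
            "$project": {
                "_id": 0,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, the resulting list would be passed to `collection.aggregate(pipeline)`. In the banking example above, a filter such as `{"customer_id": {"$eq": 42}}` would restrict the semantic search to one user's records, assuming that field is declared as a filter field in the index.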

October 26, 2023

Vector Search and LLM Essentials - What, When and Why

Vector search and, more broadly, Artificial Intelligence (AI) are more popular now than ever. These terms are appearing everywhere. Technology companies around the globe are scrambling to release vector search and AI features in an effort to be part of this growing trend. As a result, it's unusual to come across a homepage for a data-driven business and not see a reference to vector search or large language models (LLMs). In this blog, we'll cover what these terms mean while examining the events that led to their current popularity. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is vector search?

Vectors are encoded representations of unstructured data like text, images, and audio in the form of arrays of numbers.

Figure 1: Data is turned into vectors by embedding models

These vectors are produced by machine learning (ML) techniques called "embedding models". These models are trained on large corpora of data. Embedding models effectively capture meaningful relationships and similarities between data. This enables users to query data based on its meaning rather than on the data itself. This unlocks more efficient data analysis tasks like recommendation systems, language understanding, and image recognition. Every search starts with a query and, in vector search, the query is represented by a vector. The job of vector search is finding, among the vectors stored in a database, those that are most similar to the vector of the query. This is the basic premise. It is all about similarity. This is why vector search is often called similarity search. Note: similarity also applies to ranking algorithms that work with non-vector data. To understand the concept of vector similarity, let’s picture a three-dimensional space. In this space, the location of a data point is fully determined by three coordinates.
Figure 2: Location of a point P in a three-dimensional space In the same way, if a space has 1024 dimensions, it takes 1024 coordinates to locate a data point. Figure 3: Point P in a sphere that represents a multi-dimensional space Vectors also provide the location of data points in multidimensional spaces. In fact, we can treat the values in a vector as an array of coordinates. Once we have the location of the data points — the vectors — their similarity to each other is calculated by measuring the distance between them in the vector space. Points that are closer to each other in the vector space represent concepts that are more similar in meaning. For example, "tire" has a greater similarity to "car" and a lesser one to "airplane." However, "wing" would only have a similarity to "airplane." Therefore, the distance between the vectors for “tire” and “car” would be smaller than the distance between the vectors for “tire” and “airplane.” Yet, the distance between “wing” and “car” would be enormous. In other words, “tire” is relevant when we talk about a “car,” and to a lesser extent, an “airplane.” However, a “wing” is only relevant when we talk about an “airplane” and not relevant at all when we talk about a “car” (at least until flying cars are a viable mode of transport). The contextualization of data — regardless of the type — allows vector search to retrieve the most relevant results to a given query. A simple example of similarity Table 1: Example of similarity between different terms What are Large Language Models? LLMs are what bring AI to the vector search equation. LLMs and human minds both understand and associate concepts in order to perform certain natural language tasks, such as following a conversation or understanding an article. LLMs, like humans, need training in order to understand different concepts. For example, do you know what the term “corium” pertains to? Unless you're a nuclear engineer, probably not. 
The same happens with LLMs: if they are not trained in a specific domain, they are not able to understand concepts and therefore perform poorly. Let’s look at an example. LLMs understand pieces of text thanks to their embedding layer. This is where words or sentences are converted into vectors. In order to visualize vectors, we are going to use word clouds. Word clouds are closely related to vectors in the sense that they are representations of concepts and their context. First, let’s see the word cloud that an embedding model would generate for the term “corium” if it was trained with nuclear engineering data: Figure 4: Sample word cloud from a model trained with nuclear data As shown in the picture above, the word cloud indicates that corium is a radioactive material that has something to do with safety and containment structures. But, corium is a special term that can also be applied to another domain. Let’s see the word cloud resulting from an embedding model that has been trained in biology and anatomy: Figure 5: Sample word cloud from a model trained with biology data In this case, the word cloud indicates that corium is a concept related to skin and its layers. What happened here? Is one of the embedding models wrong? No. They have both been trained with different data sets. That is why finding the most appropriate model for a specific use case is crucial. One common practice in the industry is to adopt a pre-trained embedding model with strong background knowledge. One takes this model and then fine-tunes it with the domain-specific knowledge needed to perform particular tasks. The quantity and quality of the data used to train a model is relevant as well. We can agree that a person who has read just one article on aerodynamics will be less informed on the subject than a person who studied physics and aerospace engineering. 
Similarly, models that are trained with large sets of high-quality data will be better at understanding concepts and will generate vectors that more accurately represent them. This creates the foundation for a successful vector search system. It is worth noting that although LLMs use text embedding models, vector search goes beyond that. It can deal with audio, images, and more. It is important to remember that the embedding models used for these cases share the same approach. They also need to be trained with data (images, sounds, etc.) in order to be able to understand the meaning behind it and create the appropriate similarity vectors.

When was vector search created?

MongoDB Atlas Vector Search currently provides three approaches to calculate vector similarity. These are also referred to as distance metrics, and consist of Euclidean distance, cosine similarity, and dot product. While each metric is different, for the purpose of this blog, we will focus on the fact that they all quantify how close two vectors are. Atlas Vector Search feeds these distance metrics into an approximate nearest neighbor (ANN) algorithm to find the stored vectors that are most similar to the vector of the query. In order to speed this process up, vectors are indexed using an algorithm called hierarchical navigable small world (HNSW). HNSW guides the search through a network of interconnected data points so that only the most relevant data points are considered. Using one of the three distance metrics in conjunction with HNSW-based ANN search, which approximates the k-nearest neighbors (KNN) algorithm, constitutes the foundation for performing vector search on MongoDB Atlas. But how old are these technologies? We would think they are recent inventions by a bleeding-edge quantum computing lab, but the truth is far from that.

Figure 6: Timeline of vector search technologies

Euclidean distance was formulated in the year 300 BC, the cosine and the dot product in 1881, the KNN algorithm in 1951, and the HNSW algorithm in 2016.
What this means is that the foundations for state-of-the-art vector search were fully available back in 2016. So, although vector search is today’s hot topic, it has been possible to implement it for several years.

When were LLMs created?

In 2017, there was a breakthrough: the transformer architecture. Presented in the famous paper Attention is all you need, this architecture introduced a neural network model for natural language processing (NLP) tasks. This enabled ML algorithms to process language data at a scale an order of magnitude greater than was previously possible. As a result, the amount of information that could be used to train the models increased exponentially. This paved the way for the first LLM to appear in 2018: GPT-1 by OpenAI. LLMs use embedding models to understand pieces of text and perform certain natural language tasks like question answering or machine translation. LLMs are essentially NLP models that were re-branded due to the large amount of data they are trained with (hence the word large in LLM). The graph below shows the size of ML models, measured in number of parameters, over the years. A dramatic increase can be observed in 2017 after the transformer architecture was published.

Figure 7: Parameter count of ML systems through time. Source:

Why are vector search and LLMs so popular?

As stated above, the technology for vector search was fully available back in 2016. However, it did not become particularly popular until the end of 2022. Why? Although the ML industry has been very active since 2018, LLMs were not widely available or easy to use until OpenAI’s release of ChatGPT in November 2022. The fact that OpenAI allowed everyone to interact with an LLM through a simple chat is the key to its success. ChatGPT revolutionized the industry by enabling the average person to interact with NLP algorithms in a way that would have otherwise been reserved for researchers and scientists.
As can be seen in the figure below, OpenAI’s breakthrough led to the popularity of LLMs skyrocketing. Concurrently, ChatGPT became a mainstream tool used by the general public. The influence of OpenAI on the popularity of LLMs is also evidenced by the fact that both OpenAI and LLMs had their first popularity peak simultaneously. (See figure 8.) Figure 8: Popularity of the terms LLM and OpenAI over time. Source: Google Trends Here is why. LLMs are so popular because OpenAI made them famous with the release of ChatGPT. Searching and storing large amounts of vectors became a challenge. This is because LLMs work with embeddings. Thus the adoption of vector search increased in tandem. This is the largest contributing factor to the industry shift. This shift resulted in many data companies introducing support for vector search and other functionalities related to LLMs and the AI behind them. Conclusion Vector search is a modern disruptor. The increasing value of both vector embeddings and advanced mathematical search processes has catalyzed vector search adoption to transform the field of information retrieval. Vector generation and vector search might be independent processes, but when they work together their potential is limitless. To learn more visit our Atlas Vector Search product page. To get started using Vector Search, sign up for Atlas or log in to your account.
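To make the distance metrics discussed in this post concrete, here is a small pure-Python sketch of Euclidean distance, cosine similarity, and dot product, applied to the "tire", "car", "airplane", and "wing" example. The three-dimensional vectors are invented by hand for illustration and do not come from any embedding model, and the `nearest` helper performs an exact brute-force KNN scan rather than the HNSW-based approximate search that Atlas Vector Search uses.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two points in the vector space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def nearest(query, candidates, k=1):
    """Exact brute-force KNN: names of the k candidates closest to the query."""
    ranked = sorted(
        candidates,
        key=lambda name: euclidean_distance(query, candidates[name]),
    )
    return ranked[:k]


# Hand-made toy vectors (hypothetical values, not real embeddings).
VECTORS = {
    "car": [0.9, 0.1, 0.0],
    "tire": [0.8, 0.3, 0.1],
    "airplane": [0.1, 0.9, 0.2],
    "wing": [0.0, 0.8, 0.4],
}
```

With these made-up coordinates, "tire" sits closer to "car" than to "airplane", and "wing" sits close to "airplane" but far from "car", mirroring the relationships described earlier in the post. A brute-force scan like `nearest` compares the query against every stored vector, which is the cost that an index such as HNSW is designed to avoid on large collections.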

October 16, 2023

Building AI with MongoDB: Supercharging Three Communication Paradigms

Communication mediums are core to who we are as humans, from understanding each other to creating bonds and a shared purpose. The methods of communication have evolved over thousands of years, from cave drawings and scriptures to now being able to connect with anyone at any time via internet-enabled devices. The latest paradigm shift to supercharge communication is through the use and application of natural language processing and artificial intelligence. In our latest roundup of AI innovators building with MongoDB, we’re going to focus on three companies building the future across three mediums of communication: data, language, and video. Our blog begins by featuring SuperDuperDB . The company provides tools for developers to apply AI and machine learning on top of their existing data stores for generative AI applications such as chatbots, Question-Answering (Q-A), and summarization. We then cover Algomo , who uses generative AI to help companies offer their best and most personalized service to customers and employees across more than 100 languages. Finally, Source Digital is a monetization platform delivering a new era of customer engagement through video and the metaverse. Let’s dive in to learn more about each company and use case. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Bringing AI to your database SuperDuperDB is an open-source Python package providing tools for developers to apply AI and machine learning on top of their existing data stores. Developers and data scientists continue to use their preferred tools, avoiding both data migration and duplication to specialized data stores. They also have the freedom to run SuperDuperDB anywhere, avoiding lock-in to any one AI ecosystem. With SuperDuperDB developers can: Deploy their chosen AI models to automatically compute outputs (inference) in their database in a single environment with simple Python commands. 
Train models on their data simply by querying without additional ingestion and pre-processing. Integrate AI APIs (such as OpenAI) to work together with other models on their data effortlessly. Search data with vector search, including model management and serving. Today SuperDuperDB supports MongoDB alongside select relational databases, cloud data warehouses, data lake houses, and object stores. SuperDuperDB provides an array of sample use cases and notebooks that developers can use to get started including vector search with MongoDB, multimodal search, retrieval augmented generation (RAG), transfer learning, and many more. The team has also built an AI chatbot app that allows users to ask questions about technical documentation. The app is built on top of MongoDB and OpenAI with FastAPI and React (FARM stack) + SuperDuperDB. It showcases how easily developers can build next-generation AI applications on top of their existing data stores with SuperDuperDB. You can try the app and read more about how it is built at SuperDuperDB's documentation. “We integrate MongoDB as one of the key backend databases for our platform, the PyMongo driver for the app connectivity and Atlas Vector Search for storing and querying vector embeddings,” said Duncan Blythe, co-founder of SuperDuperDB. “It therefore made sense for us to partner more closely with the company through MongoDB Ventures. We get direct access to the MongoDB engineering team to help optimize our product, along with visibility within MongoDB’s vast ecosystem of developers.” Here are some useful links to learn more: SuperDuperDB GitHub, SuperDuperDB Docs Intro, SuperDuperDB Use Cases Page, and SuperDuperDB Blog.

Conversational support, powered by generative AI

Algomo uses generative AI to help companies offer their best service to both their customers and employees across more than 100 languages. The company’s name is a portmanteau of the words Algorithm (originating from Arabic) and Homo (human in Latin).
It reflects the two core design principles underlying Algomo’s products: Human-centered AI that amplifies and augments rather than displaces human abilities. Inclusive AI that is accessible to all, and that is non-discriminatory and unbiased in its outputs. With Algomo, customers can get a ChatGPT-powered bot up on their site in less than 3 minutes. More than just a bot, Algomo also provides a complete conversational platform. This includes Question-Answering text generators and autonomous agents that triage and orchestrate support processes, escalating to human support staff for live chat as needed. It works across any communication channel from web and Google Chat to Intercom, Slack, WhatsApp, and more. Customers can instantly turn their support articles, past conversations, Slack channels, Notion pages, Google Docs, and content on their public website into personalized answers. Algomo vectorizes customer content and uses it alongside OpenAI’s ChatGPT. The company uses RAG (Retrieval-Augmented Generation) prompting to inject relevant context into LLM prompts and chain-of-thought prompting to increase answer accuracy. A fine-tuned implementation of BERT is also used to classify user intent and retrieve custom FAQs. Taking advantage of its flexible document data model, Algomo uses MongoDB Atlas to store customer data alongside conversation history and messages, providing long-term memory for context and continuity in support interactions. Because Atlas is a fully managed cloud service, Algomo’s team can leave all of the operational heavy lifting to MongoDB, freeing the team up to focus on building great conversational experiences. The team considers using MongoDB a “no-brainer,” allowing them to iterate quickly while removing the support burden via the simplicity and reliability of the Atlas platform. The company’s engineers are now evaluating Atlas Vector Search as a replacement for its current standalone vector database, a move that would further reduce costs and simplify their codebase.
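The RAG and chain-of-thought prompting pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Algomo's implementation: the function names, the toy three-dimensional embeddings, and the sample support snippets are all hypothetical. In a real system the embeddings would come from an embedding model and retrieval would be handled by a vector store.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vector, chunks, k=2):
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, chunk["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(question, retrieved):
    """Inject retrieved context into the prompt and ask for step-by-step reasoning."""
    context = "\n".join(f"- {chunk['text']}" for chunk in retrieved)
    return (
        "Answer the question using only the context below.\n"
        "Reason step by step before giving a final answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Hypothetical content chunks with hand-written toy embeddings.
chunks = [
    {"text": "Refunds are issued within 14 days of purchase.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Passwords can be reset from the account page.", "embedding": [0.1, 0.9, 0.0]},
    {"text": "The mobile app supports dark mode.", "embedding": [0.0, 0.1, 0.9]},
]

# Stand-in for the embedding of the user's question (an assumption for this toy).
query_vector = [0.8, 0.2, 0.0]

top_chunks = retrieve(query_vector, chunks, k=1)
prompt = build_rag_prompt("How long do refunds take?", top_chunks)
print(prompt)
```

The resulting prompt string is what would be sent to the LLM. Only the most relevant chunk is included, which keeps the prompt short and grounds the answer in the customer's own content.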
Being able to store source data, chunks, and metadata alongside vector embeddings eliminates the overhead and duplication of synchronizing data across two separate systems. The team is also looking forward to using Atlas Vector Search for their upcoming Agent Assist feature that will provide suggested answers, alongside relevant documentation snippets, to customer service agents who are responding to live customer queries. Being part of the AI Innovators program provides Algomo with direct access to MongoDB technical expertise and best practices to accelerate its evaluation of Atlas Vector Search. Free Atlas credits, in addition to those provided by the AWS and Azure start-up programs, help Algomo reduce its development costs.

Creating a new media currency with video detection and monetization

Source Digital, Inc. is a monetization platform that delivers a new era of customer engagement through video and the metaverse. The company provides tools for content creators and advertisers to display real-time advertisements and content recommendations directly to users on websites or in video streams hosted on platforms like Netflix, YouTube, Meta, and Vimeo. Source Digital’s engineers built their own in-house machine learning and vector embedding models using Google Vision AI and TensorFlow. These models provide computer vision across video streams, detecting elements that automatically trigger the display of relevant ads and recommendations. An SDK is also provided to customers so that they can integrate the video detection models into their own websites. The company started out using PostgreSQL to store video metadata and model features, alongside the pgvector extension for video vector embeddings. This initial setup worked well at a small scale, but as Source Digital grew, PostgreSQL began to creak, with costs rapidly escalating.
PostgreSQL can only be scaled vertically, and so the company encountered step changes in costs as it moved to progressively larger cloud instance sizes. Scaling limitations were compounded by the need for queries to execute resource-intensive JOIN operations. These were needed to bring together data in all of the different database tables hosting video metadata, model features, and vector embeddings. With prior MongoDB experience from an earlier audio streaming project, the company’s engineers were confident they could tame their cost challenges. Horizontal scale-out allows MongoDB to grow in much more granular increments, aligning costs with application usage. Expensive JOIN operations are eliminated because of the flexibility of MongoDB’s document data model. Now developers store the metadata, model features, and vector embeddings together in a single record. The company estimates that the migration from PostgreSQL to MongoDB Atlas and Vector Search will reduce monthly costs by 7x. These are savings that can be reinvested into accelerating delivery against the feature backlog. Being part of the MongoDB AI Innovators Program provides Source Digital with access to expert technical advice on scaling its platform, along with co-marketing opportunities to further fuel its growth.

What's next?

If you are getting started with building AI-enabled apps on MongoDB, sign up for our AI Innovators Program. Successful applicants get access to expert technical advice, free MongoDB Atlas credits, co-marketing opportunities, and, for eligible startups, introductions to potential venture investors.
We’ve seen a whole host of interesting use cases and different companies building the future with AI, so you can refer back to some of our earlier blog posts below: Building AI with MongoDB: first qualifiers include AI at the network edge for computer vision and augmented reality; risk modeling for public safety; and predictive maintenance paired with Question-Answering generation for maritime operators. Building AI with MongoDB: compliance to copilots features AI in healthcare along with intelligent assistants that help product managers specify better products and help sales teams compose emails that convert 2x higher. Building AI with MongoDB: unlocking value from multimodal data showcases open source libraries that transform unstructured data into a usable JSON format; entity extraction for contracts management; and making sense of “dark data” to build customer service apps. Building AI with MongoDB: Cultivating Trust with Data covers three key customer use cases improving model explainability, securing generative AI outputs, and transforming cyber intelligence with the power of MongoDB. And please take a look at the MongoDB for Artificial Intelligence resources page for the latest best practices that get you started in turning your idea into an AI-driven reality. Consider joining our AI Innovators Program to build the next big thing in AI with us!

October 16, 2023