Retrieval-Augmented Generation (RAG) with MongoDB

Retrieval-augmented generation (RAG) is an architecture used to augment large language models (LLMs) with additional data so that they can generate more accurate responses. You can implement RAG in your generative AI applications by combining an LLM with a retrieval system powered by Atlas Vector Search.

To quickly try RAG with Atlas Vector Search, use the Chatbot Demo Builder in the Atlas Search Playground. To learn more, see Chatbot Demo Builder in Atlas Search Playground.

To implement your own RAG system with Atlas Vector Search, see the tutorial on this page.

When working with LLMs, you might encounter the following limitations:

  • Stale data: LLMs are trained on a static dataset up to a certain point in time. This means that they have a limited knowledge base and might use outdated data.

  • No access to local data: LLMs don't have access to local or personalized data. Therefore, they can lack knowledge about specific domains.

  • Hallucinations: When training data is incomplete or outdated, LLMs can generate inaccurate information.

You can address these limitations by taking the following steps to implement RAG:

  1. Ingestion: Store your custom data as vector embeddings in a vector database, such as MongoDB. This allows you to create a knowledge base of up-to-date and personalized data.

  2. Retrieval: Retrieve semantically similar documents from the database based on the user's question by using a search solution, such as Atlas Vector Search. These documents augment the LLM with additional, relevant data.

  3. Generation: Prompt the LLM. The LLM uses the retrieved documents as context to generate a more accurate and relevant response, reducing hallucinations.
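
The following minimal, self-contained C# sketch shows how these three steps compose at runtime. The Embed, Retrieve, and Generate methods here are stubs for illustration only; in a real application they would call an embedding model, run an Atlas Vector Search query, and call an LLM, as the tutorial on this page does in full.

using System;
using System.Collections.Generic;

// Minimal sketch of a RAG pipeline with stubbed components.
class RagSketch
{
    // Stub: a real implementation calls an embedding model API.
    static float[] Embed(string text) => new float[] { 0.1f, 0.2f, 0.3f };

    // Stub: a real implementation runs a $vectorSearch aggregation.
    static List<string> Retrieve(float[] queryEmbedding) =>
        new List<string> { "Document chunk 1", "Document chunk 2" };

    // Stub: a real implementation sends the prompt to an LLM.
    static string Generate(string prompt) => "LLM response based on: " + prompt;

    static void Main()
    {
        string question = "What did the report announce?";
        // Retrieval: embed the question and find semantically similar documents
        var documents = Retrieve(Embed(question));
        // Generation: augment the prompt with the retrieved context
        string prompt = $"Context: {string.Join("\n", documents)}\nQuestion: {question}";
        Console.WriteLine(Generate(prompt));
    }
}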

Because RAG enables tasks such as question answering and text generation, it's an effective architecture for building AI chatbots that provide personalized, domain-specific responses. To create production-ready chatbots, you must configure a server to route requests and build a user interface on top of your RAG implementation.

To implement RAG with Atlas Vector Search, you ingest data into MongoDB, retrieve documents with Atlas Vector Search, and generate responses using an LLM. This section describes the components of a basic, or naive, RAG implementation with Atlas Vector Search. For step-by-step instructions, see Tutorial.

[Figure: RAG flowchart with MongoDB Vector Search]

Watch a video that demonstrates how to implement RAG with Atlas Vector Search (duration: 5 minutes).

Data ingestion for RAG involves processing your custom data and storing it in a vector database to prepare it for retrieval. To create a basic ingestion pipeline with MongoDB as the vector database, do the following:

  1. Prepare your data.

    Load, process, and chunk your data to prepare it for your RAG application. Chunking involves splitting your data into smaller parts for optimal retrieval.

  2. Convert the data to vector embeddings.

    Convert your data into vector embeddings by using an embedding model. To learn more, see How to Create Vector Embeddings.

  3. Store the data and embeddings in MongoDB.

    Store the embeddings in your cluster as a field alongside other data in your collection.
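
For example, a document stored by this ingestion pipeline pairs a chunk of text with its embedding. A stored document might resemble the following, where the embedding values are illustrative and truncated:

{
  "_id": ObjectId("..."),
  "text": "MongoDB continues to expand its AI ecosystem ...",
  "embedding": [0.0047, -0.0183, 0.0296, ...]
}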

Building a retrieval system involves searching for and returning the most relevant documents from your vector database to augment the LLM. To retrieve relevant documents with Atlas Vector Search, you convert the user's question into vector embeddings and run a vector search query against the data in your MongoDB collection to find documents with the most similar embeddings.

To perform basic retrieval with Atlas Vector Search, do the following:

  1. Define an Atlas Vector Search index on the collection that contains your vector embeddings.

  2. Choose one of the following methods to retrieve documents based on the user's question:

    • Use an Atlas Vector Search integration with a popular framework or service. These integrations include built-in libraries and tools that enable you to easily build retrieval systems with Atlas Vector Search.

    • Build your own retrieval system. You can define your own functions and pipelines to run Atlas Vector Search queries specific to your use case.

      To learn how to build a basic retrieval system with Atlas Vector Search, see Tutorial.
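
Regardless of the method, the retrieval step ultimately runs a $vectorSearch aggregation stage. As a simplified sketch, the stage that the tutorial on this page builds looks similar to the following, where the queryVector placeholder stands for the embedding of the user's question:

{
  "$vectorSearch": {
    "index": "vector_index",
    "path": "embedding",
    "queryVector": [<embedding-of-question>],
    "exact": true,
    "limit": 5
  }
}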

To generate responses, combine your retrieval system with an LLM. After you perform a vector search to retrieve relevant documents, you provide the user's question along with the relevant documents as context to the LLM so that it can generate a more accurate response.

Choose one of the following methods to connect to an LLM:

  • Use an Atlas Vector Search integration with a popular framework or service. These integrations include built-in libraries and tools to help you connect to LLMs with minimal set-up.

  • Call the LLM's API. Most AI providers offer APIs to their generative models that you can use to generate responses.

  • Load an open-source LLM. If you don't have API keys or credits, you can use an open-source LLM by loading it locally from your application. For an example implementation, see the Build a Local RAG Implementation with Atlas Vector Search tutorial.
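
In either case, the augmented prompt combines the retrieved documents with the user's question. For example, the tutorial on this page builds a prompt of the following shape, where the placeholders stand for the retrieved text chunks and the user's input:

Answer the following question based on the given context.
Context: <retrieved documents>
Question: <user's question>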

The following example demonstrates how to implement RAG with a retrieval system powered by Atlas Vector Search.


The examples on this page are available in C#, Go, and Java; the sections below walk through each language in turn.

Work with a runnable version of this tutorial as a Python notebook.

To complete this example, you must have the following, depending on the language you use:

  • For the C# example, the .NET SDK installed to create and run the console application.

  • For the Go example, the Go toolchain installed.

  • For the Java example, Java Development Kit (JDK) version 8 or later, and an environment to set up and run a Java application. We recommend that you use an integrated development environment (IDE) such as IntelliJ IDEA or Eclipse IDE to configure Maven or Gradle to build and run your project.

Step 1: Set up the environment.
  1. Initialize your .NET project.

    Run the following commands in your terminal to create a new directory named MyCompany.RAG and initialize your project:

    dotnet new console -o MyCompany.RAG
    cd MyCompany.RAG
  2. Install and import dependencies.

    Run the following commands:

    dotnet add package MongoDB.Driver --version 3.1.0
    dotnet add package PdfPig
    dotnet add package OpenAI
  3. Set your environment variables.

    Export the following environment variables, set them in PowerShell, or use your IDE's environment variable manager to make these variables available to your project.

    export VOYAGE_API_KEY="<voyage-api-key>"
    export OPENAI_API_KEY="<openai-api-key>"
    export MONGODB_URI="<connection-string>"

Replace the placeholder values with your Voyage AI and OpenAI API keys.

Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.

If you use an Atlas cluster, your connection string should use the following format:

mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

To learn more, see Connect to a Cluster via Drivers.

If you use a local Atlas deployment, your connection string should use the following format:

mongodb://localhost:<port-number>/?directConnection=true

To learn more, see Connection Strings.

Step 2: Define a function to generate vector embeddings.

Create a new class named AIService in a file of the same name by pasting the following code. This code defines an async Task named GetEmbeddingsAsync() that generates an array of embeddings for an array of given string inputs. The method uses Voyage AI's voyage-3-large model to generate an embedding for each input.

AIService.cs
namespace MyCompany.RAG;

using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Text.Json.Serialization;
using System.Threading.Tasks;

public class AIService
{
    private static readonly string? VoyageApiKey = Environment.GetEnvironmentVariable("VOYAGE_API_KEY");
    private static readonly string EmbeddingModelName = "voyage-3-large";
    private static readonly string ApiEndpoint = "https://api.voyageai.com/v1/embeddings";

    public async Task<Dictionary<string, float[]>> GetEmbeddingsAsync(string[] texts)
    {
        Dictionary<string, float[]> documentData = new Dictionary<string, float[]>();
        try
        {
            using HttpClient client = new HttpClient();
            client.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Bearer", VoyageApiKey);
            var requestBody = new
            {
                input = texts,
                model = EmbeddingModelName,
                truncation = true
            };
            var content = new StringContent(
                JsonSerializer.Serialize(requestBody),
                Encoding.UTF8,
                "application/json");
            HttpResponseMessage response = await client.PostAsync(ApiEndpoint, content);
            if (response.IsSuccessStatusCode)
            {
                string responseBody = await response.Content.ReadAsStringAsync();
                var embeddingResponse = JsonSerializer.Deserialize<EmbeddingResponse>(responseBody);
                if (embeddingResponse != null && embeddingResponse.Data != null)
                {
                    foreach (var embeddingResult in embeddingResponse.Data)
                    {
                        if (embeddingResult.Index < texts.Length)
                        {
                            documentData[texts[embeddingResult.Index]] =
                                embeddingResult.Embedding.Select(e => (float)e).ToArray();
                        }
                    }
                }
            }
            else
            {
                throw new ApplicationException($"Error calling Voyage API: {response.ReasonPhrase}");
            }
        }
        catch (Exception e)
        {
            throw new ApplicationException(e.Message);
        }
        return documentData;
    }

    private class EmbeddingResponse
    {
        [JsonPropertyName("object")]
        public string Object { get; set; } = string.Empty;
        [JsonPropertyName("data")]
        public List<EmbeddingResult>? Data { get; set; }
        [JsonPropertyName("model")]
        public string Model { get; set; } = string.Empty;
        [JsonPropertyName("usage")]
        public Usage? Usage { get; set; }
    }

    private class EmbeddingResult
    {
        [JsonPropertyName("object")]
        public string Object { get; set; } = string.Empty;
        [JsonPropertyName("embedding")]
        public List<double> Embedding { get; set; } = new();
        [JsonPropertyName("index")]
        public int Index { get; set; }
    }

    private class Usage
    {
        [JsonPropertyName("total_tokens")]
        public int TotalTokens { get; set; }
    }
}
Step 3: Ingest data into MongoDB.

In this section, you ingest sample data into MongoDB that LLMs don't have access to.

  1. Load and split the data.

    Create a new class named PdfIngester in a file of the same name by pasting the following code. This code contains a few functions to do the following:

    • Load a PDF that contains a MongoDB earnings report.

    • Use PdfPig to parse the PDF into text.

    • Split the text into chunks, specifying the chunk size (number of characters) and chunk overlap (number of overlapping characters between consecutive chunks). For example, with a chunk size of 400 and an overlap of 20, each chunk starts 380 characters after the previous one, so consecutive chunks share 20 characters.

    PdfIngester.cs
    namespace MyCompany.RAG;

    using System;
    using System.Net.Http;
    using System.IO;
    using System.Threading.Tasks;
    using System.Collections.Generic;
    using System.Text;
    using UglyToad.PdfPig;
    using UglyToad.PdfPig.Content;

    public class PdfIngester
    {
        public async Task<String> DownloadPdf(string url, string path, string fileName)
        {
            using (HttpClient client = new HttpClient())
            {
                try
                {
                    byte[] pdfBytes = await client.GetByteArrayAsync(url);
                    await File.WriteAllBytesAsync(path + fileName, pdfBytes);
                    return "PDF downloaded and saved to " + path + fileName;
                }
                catch (HttpRequestException e)
                {
                    throw new ApplicationException("Error downloading the PDF: " + e.Message);
                }
                catch (IOException e)
                {
                    throw new ApplicationException("Error writing the file to disk: " + e.Message);
                }
            }
        }

        public List<string> ConvertPdfToChunkedText(string filePath)
        {
            List<string> textChunks;
            using (var document = PdfDocument.Open(filePath))
            {
                StringBuilder fullText = new StringBuilder();
                foreach (Page page in document.GetPages())
                {
                    fullText.Append(page.Text + "\n");
                }
                textChunks = ChunkText(fullText.ToString(), 400, 20);
            }
            var chunkCount = textChunks.Count;
            if (chunkCount == 0)
            {
                throw new ApplicationException("Unable to chunk PDF contents into text.");
            }
            Console.WriteLine($"Successfully chunked the PDF text into {chunkCount} chunks.");
            return textChunks;
        }

        static List<string> ChunkText(string text, int chunkSize, int overlap)
        {
            List<string> chunks = new List<string>();
            int start = 0;
            int textLength = text.Length;
            while (start < textLength)
            {
                int end = start + chunkSize;
                if (end > textLength)
                {
                    end = textLength;
                }
                string chunk = text.Substring(start, end - start);
                chunks.Add(chunk);
                // Increment starting point, considering the overlap
                start += chunkSize - overlap;
                if (start >= textLength) break;
            }
            return chunks;
        }
    }
  2. Prepare to store the data and embeddings in MongoDB.

    Create a new class named MongoDBDataService in a file of the same name by pasting the following code. This code defines an async Task named AddDocumentsAsync to add documents to MongoDB. This function uses the Collection.InsertManyAsync() C# Driver method to insert a list of the BsonDocument type. This code stores the embeddings alongside the chunked data in the rag_db.test collection.

    MongoDBDataService.cs
    namespace MyCompany.RAG;

    using MongoDB.Driver;
    using MongoDB.Bson;

    public class MongoDBDataService
    {
        private static readonly string? ConnectionString = Environment.GetEnvironmentVariable("MONGODB_URI");
        private static readonly MongoClient Client = new MongoClient(ConnectionString);
        private static readonly IMongoDatabase Database = Client.GetDatabase("rag_db");
        private static readonly IMongoCollection<BsonDocument> Collection = Database.GetCollection<BsonDocument>("test");

        public async Task<string> AddDocumentsAsync(Dictionary<string, float[]> embeddings)
        {
            var documents = new List<BsonDocument>();
            foreach (KeyValuePair<string, float[]> pair in embeddings)
            {
                var document = new BsonDocument
                {
                    { "text", pair.Key },
                    { "embedding", new BsonArray(pair.Value) }
                };
                documents.Add(document);
            }
            await Collection.InsertManyAsync(documents);
            return $"Successfully inserted {embeddings.Count} documents.";
        }
    }
  3. Convert the data to vector embeddings.

    Create a new class named EmbeddingGenerator in a file of the same name by pasting the following code. This code prepares the chunked documents for ingestion by creating a list of documents with their corresponding vector embeddings. You generate these embeddings using the GetEmbeddingsAsync() function that you defined earlier.

    EmbeddingGenerator.cs
    namespace MyCompany.RAG;

    public class EmbeddingGenerator
    {
        private readonly MongoDBDataService _dataService = new();
        private readonly AIService _AiService = new();

        public async Task<string> GenerateEmbeddings(List<string> textChunks)
        {
            Console.WriteLine("Generating embeddings.");
            Dictionary<string, float[]> docs = new Dictionary<string, float[]>();
            try
            {
                // Pass the text chunks to AI to generate vector embeddings
                var embeddings = await _AiService.GetEmbeddingsAsync(textChunks.ToArray());
                // Pair each embedding with the text chunk used to generate it
                int index = 0;
                foreach (var embedding in embeddings)
                {
                    docs[textChunks[index]] = embedding.Value;
                    index++;
                }
            }
            catch (Exception e)
            {
                throw new ApplicationException("Error creating embeddings for text chunks: " + e.Message);
            }
            // Add a new document to the MongoDB collection for each text and vector embedding pair
            var result = await _dataService.AddDocumentsAsync(docs);
            return result;
        }
    }
  4. Update the Program.cs file.

    Paste this code in your Program.cs:

    Program.cs
    using MyCompany.RAG;

    const string pdfUrl = "https://investors.mongodb.com/node/12236/pdf";
    const string savePath = "<path-name>";
    const string fileName = "investor-report.pdf";

    var pdfIngester = new PdfIngester();
    var pdfDownloadResult = await pdfIngester.DownloadPdf(pdfUrl, savePath, fileName);
    Console.WriteLine(pdfDownloadResult);
    var textChunks = pdfIngester.ConvertPdfToChunkedText(savePath + fileName);
    if (textChunks.Any())
    {
        var embeddingGenerator = new EmbeddingGenerator();
        var embeddingGenerationResult = await embeddingGenerator.GenerateEmbeddings(textChunks);
        Console.WriteLine(embeddingGenerationResult);
    }

    This code:

    • Uses the PdfIngester to load and chunk the PDF into text segments

    • Uses the EmbeddingGenerator to generate embeddings for each text chunk from the PDF, and write the text chunks and embeddings to the rag_db.test collection

    Replace the <path-name> placeholder with the path where you want to download the report. On a macOS system, the path should resemble /Users/<username>/MyCompany.RAG/. The path should end with a trailing slash.

  5. Compile and run your project to generate embeddings.

    dotnet run MyCompany.RAG.csproj
    PDF downloaded and saved to <PATH>
    Successfully chunked the PDF text into 73 chunks.
    Generating embeddings.
    Successfully inserted 73 documents.
Step 4: Use Atlas Vector Search to retrieve documents.

In this section, you set up Atlas Vector Search to retrieve documents from your vector database. To create an Atlas Vector Search index for a collection using the MongoDB C# driver v3.1.0 or later, perform the following steps:

  1. Define the Atlas Vector Search index.

    Add a new CreateVectorIndex() method in the file named MongoDBDataService.cs to define the search index. This code connects to your MongoDB deployment and creates an index of the vectorSearch type on the rag_db.test collection.

    MongoDBDataService.cs
    namespace MyCompany.RAG;

    using MongoDB.Driver;
    using MongoDB.Bson;

    public class MongoDBDataService
    {
        private static readonly string? ConnectionString = Environment.GetEnvironmentVariable("MONGODB_URI");
        private static readonly MongoClient Client = new MongoClient(ConnectionString);
        private static readonly IMongoDatabase Database = Client.GetDatabase("rag_db");
        private static readonly IMongoCollection<BsonDocument> Collection = Database.GetCollection<BsonDocument>("test");

        public async Task<string> AddDocumentsAsync(Dictionary<string, float[]> embeddings)
        {
            // Method details...
        }

        public string CreateVectorIndex()
        {
            var searchIndexView = Collection.SearchIndexes;
            var name = "vector_index";
            var type = SearchIndexType.VectorSearch;
            var definition = new BsonDocument
            {
                { "fields", new BsonArray
                    {
                        new BsonDocument
                        {
                            { "type", "vector" },
                            { "path", "embedding" },
                            { "numDimensions", 1024 },
                            { "similarity", "cosine" }
                        }
                    }
                }
            };
            var model = new CreateSearchIndexModel(name, type, definition);
            try
            {
                searchIndexView.CreateOne(model);
                Console.WriteLine($"New search index named {name} is building.");
                // Polling for index status
                Console.WriteLine("Polling to check if the index is ready. This may take up to a minute.");
                bool queryable = false;
                while (!queryable)
                {
                    var indexes = searchIndexView.List();
                    foreach (var index in indexes.ToEnumerable())
                    {
                        if (index["name"] == name)
                        {
                            queryable = index["queryable"].AsBoolean;
                        }
                    }
                    if (!queryable)
                    {
                        Thread.Sleep(5000);
                    }
                }
            }
            catch (Exception e)
            {
                throw new ApplicationException("Error creating the vector index: " + e.Message);
            }
            return $"{name} is ready for querying.";
        }
    }
  2. Update the Program.cs file.

    Replace the code in Program.cs with the following code to create the index:

    Program.cs
    using MyCompany.RAG;
    var dataService = new MongoDBDataService();
    var result = dataService.CreateVectorIndex();
    Console.WriteLine(result);
  3. Compile and run your project to create the index.

    dotnet run MyCompany.RAG.csproj
  4. Define a function to retrieve relevant data.

    Add a new PerformVectorQuery method in the file named MongoDBDataService.cs to retrieve relevant documents. To learn more, refer to Run Vector Search Queries.

    MongoDBDataService.cs
    namespace MyCompany.RAG;

    using MongoDB.Driver;
    using MongoDB.Bson;

    public class MongoDBDataService
    {
        private static readonly string? ConnectionString = Environment.GetEnvironmentVariable("MONGODB_URI");
        private static readonly MongoClient Client = new MongoClient(ConnectionString);
        private static readonly IMongoDatabase Database = Client.GetDatabase("rag_db");
        private static readonly IMongoCollection<BsonDocument> Collection = Database.GetCollection<BsonDocument>("test");

        public async Task<string> AddDocumentsAsync(Dictionary<string, float[]> embeddings)
        {
            // Method details...
        }

        public string CreateVectorIndex()
        {
            // Method details...
        }

        public List<BsonDocument>? PerformVectorQuery(float[] vector)
        {
            var vectorSearchStage = new BsonDocument
            {
                {
                    "$vectorSearch",
                    new BsonDocument
                    {
                        { "index", "vector_index" },
                        { "path", "embedding" },
                        { "queryVector", new BsonArray(vector) },
                        { "exact", true },
                        { "limit", 5 }
                    }
                }
            };
            var projectStage = new BsonDocument
            {
                {
                    "$project",
                    new BsonDocument
                    {
                        { "_id", 0 },
                        { "text", 1 },
                        { "score", new BsonDocument { { "$meta", "vectorSearchScore" } } }
                    }
                }
            };
            var pipeline = new[] { vectorSearchStage, projectStage };
            return Collection.Aggregate<BsonDocument>(pipeline).ToList();
        }
    }
  5. Test retrieving the data.

    1. Create a new class named PerformTestQuery in a file of the same name by pasting the following code. This code transforms a text input string into vector embeddings and queries the database for matching results. It uses the GetEmbeddingsAsync() function to create embeddings from the search query. Then, it runs the query to return semantically similar documents.

      PerformTestQuery.cs
      namespace MyCompany.RAG;

      public class PerformTestQuery
      {
          private readonly MongoDBDataService _dataService = new();
          private readonly AIService _AiService = new();

          public async Task<string> GetQueryResults(string question)
          {
              // Get the vector embedding for the query
              var query = question;
              var queryEmbeddings = await _AiService.GetEmbeddingsAsync([query]);
              // Query the vector database for applicable query results
              var matchingDocuments = _dataService.PerformVectorQuery(queryEmbeddings[query]);
              // Construct a string from the query results for performing QA with the LLM
              var sb = new System.Text.StringBuilder();
              if (matchingDocuments != null)
              {
                  foreach (var doc in matchingDocuments)
                  {
                      sb.AppendLine($"Text: {doc.GetValue("text").ToString()}");
                      sb.AppendLine($"Score: {doc.GetValue("score").ToString()}");
                  }
              }
              else
              {
                  return "No matching documents found.";
              }
              return sb.ToString();
          }
      }
    2. Update the Program.cs file.

      Replace the code in Program.cs with the following code to perform a test query:

      Program.cs
      using MyCompany.RAG;
      var query = "AI Technology";
      var queryCoordinator = new PerformTestQuery();
      var result = await queryCoordinator.GetQueryResults(query);
      Console.WriteLine(result);
    3. Compile and run your project to check the query results.

      dotnet run MyCompany.RAG.csproj
      Text: time series queries—and the general availability of Atlas Stream Processing to build sophisticated,event-driven applications with real-time data.MongoDB continues to expand its AI ecosystem with the announcement of the MongoDB AI Applications Program (MAAP),
      which provides customers with reference architectures, pre-built partner integrations, and professional services to helpthem quickly build AI
      Score: 0.72528624534606934
      Text: hem quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects,and is the first global systems integrator to join MAAP.Bendigo and Adelaide Bank partnered with MongoDB to modernize their core banking technology. With the help ofMongoDB Relational Migrator and generative AI-powered modernization tools, Bendigo and Adelaide Bank decomposed anou
      Score: 0.71915638446807861
      Text: and regulatory issues relating to the use of new and evolving technologies, such asartificial intelligence, in our offerings or partnerships; the growth and expansion of the market for database products and our ability to penetrate thatmarket; our ability to integrate acquired businesses and technologies successfully or achieve the expected benefits of such acquisitions; our ability tomaintain the
      Score: 0.70376789569854736
      Text: architecture is particularly well-suited for the variety and scale of data required by AI-powered applications. We are confident MongoDB will be a substantial beneficiary of this next wave of application development."First Quarter Fiscal 2025 Financial HighlightsRevenue: Total revenue was $450.6 million for the first quarter of fiscal 2025, an increase of 22% year-over-year.Subscription revenue wa
      Score: 0.67905724048614502
      Text: tures, services orenhancements; our ability to effectively expand our sales and marketing organization; our ability to continue to build and maintain credibility with thedeveloper community; our ability to add new customers or increase sales to our existing customers; our ability to maintain, protect, enforce andenhance our intellectual property; the effects of social, ethical and regulatory issue
      Score: 0.64435118436813354
Step 5: Generate responses with the LLM.

In this section, you generate responses by prompting an LLM to use the retrieved documents as context. This example uses the function you just defined to retrieve matching documents from the database, and additionally:

  • Accesses the gpt-4o-mini model from OpenAI.

  • Includes the user's question and the retrieved documents in the prompt for the LLM.

  • Prompts the LLM about MongoDB's latest AI announcements.

  1. Add the imports, the new ChatClient configuration, and a new method named GenerateAnswer to the file named AIService.cs.

    AIService.cs
    namespace MyCompany.RAG;

    using OpenAI.Chat;
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Net.Http;
    using System.Net.Http.Headers;
    using System.Text;
    using System.Text.Json;
    using System.Text.Json.Serialization;
    using System.Threading.Tasks;

    public class AIService
    {
        private static readonly string? VoyageApiKey = Environment.GetEnvironmentVariable("VOYAGE_API_KEY");
        private static readonly string EmbeddingModelName = "voyage-3-large";
        private static readonly string ApiEndpoint = "https://api.voyageai.com/v1/embeddings";
        private static readonly string? OpenAIApiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
        private static readonly string ChatModelName = "gpt-4o-mini";
        private static readonly ChatClient ChatClient = new(model: ChatModelName, apiKey: OpenAIApiKey);

        public async Task<Dictionary<string, float[]>> GetEmbeddingsAsync(string[] texts)
        {
            // Method details...
        }

        public async Task<string> GenerateAnswer(string question, string context)
        {
            string prompt = $"""
                Answer the following question based on the given context.
                Context: {context}
                Question: {question}
                """;
            IEnumerable<ChatMessage> messages = new List<ChatMessage>([prompt]);
            ChatCompletion responses = await ChatClient.CompleteChatAsync(messages, new ChatCompletionOptions { MaxOutputTokenCount = 400 });
            var summaryResponse = responses.Content[0].Text;
            if (summaryResponse is null)
            {
                throw new ApplicationException("No response from the chat client.");
            }
            return summaryResponse;
        }

        // Rest of code...
    }
  2. Create a RAGPipeline class.

    Create a new class named RAGPipeline in a file of the same name by pasting the following code. This code coordinates the following components:

    • GetEmbeddingsAsync() function: transforms the query string into vector embeddings.

    • PerformVectorQuery function: retrieves semantically similar results from the database.

    • GenerateAnswer function: passes the documents retrieved from the database to the LLM to generate the response.

    RAGPipeline.cs
    namespace MyCompany.RAG;

    public class RAGPipeline
    {
        private readonly MongoDBDataService _dataService = new();
        private readonly AIService _AiService = new();

        public async Task<string> GenerateResults(string question)
        {
            // Get the vector embedding for the query
            var query = question;
            var queryEmbedding = await _AiService.GetEmbeddingsAsync([query]);
            // Query the vector database for applicable query results
            var matchingDocuments = _dataService.PerformVectorQuery(queryEmbedding[query]);
            // Construct a string from the query results for performing QA with the LLM
            var sb = new System.Text.StringBuilder();
            if (matchingDocuments != null)
            {
                foreach (var doc in matchingDocuments)
                {
                    sb.AppendLine($"Text: {doc.GetValue("text").ToString()}");
                }
            }
            else
            {
                return "No matching documents found.";
            }
            return await _AiService.GenerateAnswer(question, sb.ToString());
        }
    }
  3. Update the Program.cs file.

    Replace the code in Program.cs with the following code to call your RAG pipeline:

    Program.cs
    using MyCompany.RAG;
    var question = "In a few sentences, what are MongoDB's latest AI announcements?";
    var ragPipeline = new RAGPipeline();
    var result = await ragPipeline.GenerateResults(question);
    Console.WriteLine(result);
  4. Compile and run your project to perform RAG. The generated response might vary.

    dotnet run MyCompany.RAG.csproj
    MongoDB has recently announced the MongoDB AI Applications Program (MAAP),
    which aims to support customers in building AI-powered applications through
    reference architectures, pre-built partner integrations, and professional
    services. Additionally, the program includes a partnership with Accenture,
    which will establish a center of excellence focused on MongoDB projects. These
    initiatives demonstrate MongoDB's commitment to expanding its AI ecosystem and
    its strategy to adapt its document-based architecture for the demands of
    AI-driven application development.
Step 1: Set up the environment.
  1. Initialize your Go project.

    Run the following commands in your terminal to create a new directory named rag-mongodb and initialize your project:

    mkdir rag-mongodb
    cd rag-mongodb
    go mod init rag-mongodb
  2. Install and import dependencies.

    Run the following commands:

    go get github.com/joho/godotenv
    go get go.mongodb.org/mongo-driver/v2/mongo
    go get github.com/tmc/langchaingo/llms
    go get github.com/tmc/langchaingo/documentloaders
    go get github.com/tmc/langchaingo/embeddings/huggingface
    go get github.com/tmc/langchaingo/embeddings/voyageai
    go get github.com/tmc/langchaingo/llms/openai
    go get github.com/tmc/langchaingo/prompts
    go get github.com/tmc/langchaingo/vectorstores/mongovector
  3. Create a .env file.

    In your project, create a .env file to store your MongoDB connection string and any API keys that you need to access the models.

    .env
    MONGODB_URI = "<connection-string>"
    VOYAGEAI_API_KEY = "<voyage-api-key>" # If using Voyage AI embedding model
    HUGGINGFACEHUB_API_TOKEN = "<hf-token>" # If using Hugging Face embedding model
    OPENAI_API_KEY = "<openai-api-key>"

    Replace the placeholder values with your credentials.

    Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.

    If you use an Atlas cluster, your connection string should use the following format:

    mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

    To learn more, see Connect to a Cluster via Drivers.

    If you use a local Atlas deployment, your connection string should use the following format:

    mongodb://localhost:<port-number>/?directConnection=true

    To learn more, see Connection Strings.

Step 2: Load and process the data.

In this section, you download and process sample data that LLMs don't have access to. The following code uses the Go library for LangChain to perform the following tasks:

  • Download an HTML file that contains a MongoDB earnings report.

  • Split the data into chunks, specifying the chunk size (number of characters) and chunk overlap (number of overlapping characters between consecutive chunks).

  1. Run the following command to create a directory that stores common functions.

    mkdir common && cd common
  2. Create a file called process-file.go in the common directory, and paste the following code into it:

    process-file.go
    package common

    import (
        "context"
        "io"
        "log"
        "net/http"
        "os"

        "github.com/tmc/langchaingo/documentloaders"
        "github.com/tmc/langchaingo/schema"
        "github.com/tmc/langchaingo/textsplitter"
    )

    func DownloadReport(filename string) {
        _, err := os.Stat(filename)
        if err == nil {
            return
        }
        const url = "https://investors.mongodb.com/node/12236"
        log.Println("Downloading ", url, " to ", filename)
        resp, err := http.Get(url)
        if err != nil {
            log.Fatalf("failed to connect to download the report: %v", err)
        }
        defer func() {
            if err := resp.Body.Close(); err != nil {
                log.Fatalf("failed to close the resource: %v", err)
            }
        }()
        f, err := os.Create(filename)
        if err != nil {
            return
        }
        defer func() {
            if err := f.Close(); err != nil {
                log.Fatalf("failed to close file: %v", err)
            }
        }()
        _, err = io.Copy(f, resp.Body)
        if err != nil {
            log.Fatalf("failed to copy the report: %v", err)
        }
    }

    func ProcessFile(filename string) []schema.Document {
        ctx := context.Background()
        f, err := os.Open(filename)
        if err != nil {
            log.Fatalf("failed to open file: %v", err)
        }
        defer func() {
            if err := f.Close(); err != nil {
                log.Fatalf("failed to close file: %v", err)
            }
        }()
        html := documentloaders.NewHTML(f)
        split := textsplitter.NewRecursiveCharacter()
        split.ChunkSize = 400
        split.ChunkOverlap = 20
        docs, err := html.LoadAndSplit(ctx, split)
        if err != nil {
            log.Fatalf("failed to chunk the HTML into documents: %v", err)
        }
        log.Printf("Successfully chunked the HTML into %v documents.\n", len(docs))
        return docs
    }
Step 3: Ingest data into MongoDB.

In this section, you ingest sample data into MongoDB that LLMs don't have access to. The following code uses the Go library for LangChain and the Go driver to perform the following tasks:

  • Load the embedding model.

  • Create an instance of mongovector from your Go driver client and the embedding model to implement the vector store.

  • Create and store vector embeddings from the chunked data by using the mongovector.AddDocuments() method. The code stores the chunked data and corresponding embeddings in the rag_db.test collection.

  1. Navigate to the root of the rag-mongodb project directory.

  2. Create a file called ingest-data.go in your project, and paste the following code into it:

    This code uses the voyage-3-large embedding model from Voyage AI to generate vector embeddings.

    ingest-data.go
    package main

    import (
        "context"
        "fmt"
        "log"
        "os"
        "rag-mongodb/common"

        "github.com/joho/godotenv"
        "github.com/tmc/langchaingo/embeddings/voyageai"
        "github.com/tmc/langchaingo/vectorstores/mongovector"
        "go.mongodb.org/mongo-driver/v2/mongo"
        "go.mongodb.org/mongo-driver/v2/mongo/options"
    )

    func main() {
        filename := "investor-report.html"
        common.DownloadReport(filename)
        docs := common.ProcessFile(filename)
        if err := godotenv.Load(); err != nil {
            log.Fatal("no .env file found")
        }
        // Connect to your MongoDB cluster
        uri := os.Getenv("MONGODB_URI")
        if uri == "" {
            log.Fatal("set your 'MONGODB_URI' environment variable.")
        }
        client, err := mongo.Connect(options.Client().ApplyURI(uri))
        if err != nil {
            log.Fatalf("failed to connect to server: %v", err)
        }
        defer func() {
            if err := client.Disconnect(context.Background()); err != nil {
                log.Fatalf("error disconnecting the client: %v", err)
            }
        }()
        coll := client.Database("rag_db").Collection("test")
        embedder, err := voyageai.NewVoyageAI(
            voyageai.WithModel("voyage-3-large"),
        )
        if err != nil {
            log.Fatalf("failed to create an embedder: %v", err)
        }
        store := mongovector.New(coll, embedder, mongovector.WithPath("embedding"))
        // Add documents to the MongoDB collection.
        log.Println("Generating embeddings.")
        result, err := store.AddDocuments(context.Background(), docs)
        if err != nil {
            log.Fatalf("failed to insert documents: %v", err)
        }
        fmt.Printf("Successfully inserted %v documents\n", len(result))
    }

    Alternatively, this version of the code uses the mxbai-embed-large-v1 embedding model from Hugging Face to generate vector embeddings.

    ingest-data.go
    package main

    import (
        "context"
        "fmt"
        "log"
        "os"
        "rag-mongodb/common"

        "github.com/joho/godotenv"
        "github.com/tmc/langchaingo/embeddings/huggingface"
        "github.com/tmc/langchaingo/vectorstores/mongovector"
        "go.mongodb.org/mongo-driver/v2/mongo"
        "go.mongodb.org/mongo-driver/v2/mongo/options"
    )

    func main() {
        filename := "investor-report.html"
        common.DownloadReport(filename)
        docs := common.ProcessFile(filename)
        if err := godotenv.Load(); err != nil {
            log.Fatal("no .env file found")
        }
        // Connect to your MongoDB cluster
        uri := os.Getenv("MONGODB_URI")
        if uri == "" {
            log.Fatal("set your 'MONGODB_URI' environment variable.")
        }
        client, err := mongo.Connect(options.Client().ApplyURI(uri))
        if err != nil {
            log.Fatalf("failed to connect to server: %v", err)
        }
        defer func() {
            if err := client.Disconnect(context.Background()); err != nil {
                log.Fatalf("error disconnecting the client: %v", err)
            }
        }()
        coll := client.Database("rag_db").Collection("test")
        embedder, err := huggingface.NewHuggingface(
            huggingface.WithModel("mixedbread-ai/mxbai-embed-large-v1"),
            huggingface.WithTask("feature-extraction"))
        if err != nil {
            log.Fatalf("failed to create an embedder: %v", err)
        }
        store := mongovector.New(coll, embedder, mongovector.WithPath("embedding"))
        // Add documents to the MongoDB collection.
        log.Println("Generating embeddings.")
        result, err := store.AddDocuments(context.Background(), docs)
        if err != nil {
            log.Fatalf("failed to insert documents: %v", err)
        }
        fmt.Printf("Successfully inserted %v documents\n", len(result))
    }
  3. Run the following command to execute the code:

    go run ingest-data.go
    Successfully chunked the HTML into 163 documents.
    Generating embeddings.
    Successfully inserted 163 documents
Step 4: Use Atlas Vector Search to retrieve documents.

In this section, you set up Atlas Vector Search to retrieve documents from your vector database. Complete the following steps:

  1. Create an Atlas Vector Search index on your vector embeddings.

    Create a new file named rag-vector-index.go and paste the following code. This code connects to your MongoDB deployment and creates an index of the vectorSearch type on the rag_db.test collection.

    rag-vector-index.go
    package main

    import (
        "context"
        "log"
        "os"
        "time"

        "github.com/joho/godotenv"
        "go.mongodb.org/mongo-driver/v2/bson"
        "go.mongodb.org/mongo-driver/v2/mongo"
        "go.mongodb.org/mongo-driver/v2/mongo/options"
    )

    func main() {
        ctx := context.Background()
        if err := godotenv.Load(); err != nil {
            log.Fatal("no .env file found")
        }
        // Connect to your MongoDB cluster
        uri := os.Getenv("MONGODB_URI")
        if uri == "" {
            log.Fatal("set your 'MONGODB_URI' environment variable.")
        }
        clientOptions := options.Client().ApplyURI(uri)
        client, err := mongo.Connect(clientOptions)
        if err != nil {
            log.Fatalf("failed to connect to the server: %v", err)
        }
        defer func() { _ = client.Disconnect(ctx) }()
        // Specify the database and collection
        coll := client.Database("rag_db").Collection("test")
        indexName := "vector_index"
        opts := options.SearchIndexes().SetName(indexName).SetType("vectorSearch")
        type vectorDefinitionField struct {
            Type          string `bson:"type"`
            Path          string `bson:"path"`
            NumDimensions int    `bson:"numDimensions"`
            Similarity    string `bson:"similarity"`
        }
        type vectorDefinition struct {
            Fields []vectorDefinitionField `bson:"fields"`
        }
        indexModel := mongo.SearchIndexModel{
            Definition: vectorDefinition{
                Fields: []vectorDefinitionField{{
                    Type:          "vector",
                    Path:          "embedding",
                    NumDimensions: 1024,
                    Similarity:    "cosine"}},
            },
            Options: opts,
        }
        log.Println("Creating the index.")
        searchIndexName, err := coll.SearchIndexes().CreateOne(ctx, indexModel)
        if err != nil {
            log.Fatalf("failed to create the search index: %v", err)
        }
        // Await the creation of the index.
        log.Println("Polling to confirm successful index creation.")
        log.Println("NOTE: This may take up to a minute.")
        searchIndexes := coll.SearchIndexes()
        var doc bson.Raw
        for doc == nil {
            cursor, err := searchIndexes.List(ctx, options.SearchIndexes().SetName(searchIndexName))
            if err != nil {
                log.Printf("failed to list search indexes: %v", err)
            }
            if !cursor.Next(ctx) {
                break
            }
            name := cursor.Current.Lookup("name").StringValue()
            queryable := cursor.Current.Lookup("queryable").Boolean()
            if name == searchIndexName && queryable {
                doc = cursor.Current
            } else {
                time.Sleep(5 * time.Second)
            }
        }
        log.Println("Name of Index Created: " + searchIndexName)
    }
  2. Run the following command to create the index:

    go run rag-vector-index.go
  3. Define a function to retrieve relevant data.

    In this step, you create a retrieval function called GetQueryResults() that runs a query to retrieve relevant documents. It uses the mongovector.SimilaritySearch() method, which automatically generates a vector representation of your query string and returns relevant results.

    To learn more, refer to Run Vector Search Queries.

    In the common directory, create a new file called get-query-results.go, and paste the following code into it. This version of the code uses the voyage-3-large embedding model from Voyage AI to generate vector embeddings:

    get-query-results.go
    package common

    import (
        "context"
        "log"
        "os"

        "github.com/joho/godotenv"
        "github.com/tmc/langchaingo/embeddings/voyageai"
        "github.com/tmc/langchaingo/schema"
        "github.com/tmc/langchaingo/vectorstores/mongovector"
        "go.mongodb.org/mongo-driver/v2/mongo"
        "go.mongodb.org/mongo-driver/v2/mongo/options"
    )

    func GetQueryResults(query string) []schema.Document {
        ctx := context.Background()
        if err := godotenv.Load(); err != nil {
            log.Fatal("no .env file found")
        }
        // Connect to your MongoDB cluster
        uri := os.Getenv("MONGODB_URI")
        if uri == "" {
            log.Fatal("set your 'MONGODB_URI' environment variable.")
        }
        clientOptions := options.Client().ApplyURI(uri)
        client, err := mongo.Connect(clientOptions)
        if err != nil {
            log.Fatalf("failed to connect to the server: %v", err)
        }
        defer func() { _ = client.Disconnect(ctx) }()
        // Specify the database and collection
        coll := client.Database("rag_db").Collection("test")
        embedder, err := voyageai.NewVoyageAI(
            voyageai.WithModel("voyage-3-large"),
        )
        if err != nil {
            log.Fatalf("failed to create an embedder: %v", err)
        }
        store := mongovector.New(coll, embedder, mongovector.WithPath("embedding"))
        // Search for similar documents.
        docs, err := store.SimilaritySearch(context.Background(), query, 5)
        if err != nil {
            log.Fatalf("error performing similarity search: %v", err)
        }
        return docs
    }

    Alternatively, this version of the code uses the mxbai-embed-large-v1 embedding model from Hugging Face to generate vector embeddings.

    get-query-results.go
    package common

    import (
        "context"
        "log"
        "os"

        "github.com/joho/godotenv"
        "github.com/tmc/langchaingo/embeddings/huggingface"
        "github.com/tmc/langchaingo/schema"
        "github.com/tmc/langchaingo/vectorstores/mongovector"
        "go.mongodb.org/mongo-driver/v2/mongo"
        "go.mongodb.org/mongo-driver/v2/mongo/options"
    )

    func GetQueryResults(query string) []schema.Document {
        ctx := context.Background()
        if err := godotenv.Load(); err != nil {
            log.Fatal("no .env file found")
        }
        // Connect to your MongoDB cluster
        uri := os.Getenv("MONGODB_URI")
        if uri == "" {
            log.Fatal("set your 'MONGODB_URI' environment variable.")
        }
        clientOptions := options.Client().ApplyURI(uri)
        client, err := mongo.Connect(clientOptions)
        if err != nil {
            log.Fatalf("failed to connect to the server: %v", err)
        }
        defer func() { _ = client.Disconnect(ctx) }()
        // Specify the database and collection
        coll := client.Database("rag_db").Collection("test")
        embedder, err := huggingface.NewHuggingface(
            huggingface.WithModel("mixedbread-ai/mxbai-embed-large-v1"),
            huggingface.WithTask("feature-extraction"))
        if err != nil {
            log.Fatalf("failed to create an embedder: %v", err)
        }
        store := mongovector.New(coll, embedder, mongovector.WithPath("embedding"))
        // Search for similar documents.
        docs, err := store.SimilaritySearch(context.Background(), query, 5)
        if err != nil {
            log.Fatalf("error performing similarity search: %v", err)
        }
        return docs
    }
  4. Test retrieving the data.

    1. In the rag-mongodb project directory, create a new file called retrieve-documents-test.go. In this step, you check that the function you just defined returns relevant results.

    2. Paste this code into your file:

      retrieve-documents-test.go
      package main

      import (
          "fmt"
          "rag-mongodb/common" // Package that contains the GetQueryResults function
      )

      func main() {
          query := "AI Technology"
          documents := common.GetQueryResults(query)
          for _, doc := range documents {
              fmt.Printf("Text: %s \nScore: %v \n\n", doc.PageContent, doc.Score)
          }
      }
    3. Run the following command to execute the code:

      go run retrieve-documents-test.go
      Text: for the variety and scale of data required by AI-powered applications. We are confident MongoDB will be a substantial beneficiary of this next wave of application development.&#34;
      Score: 0.83503306
      Text: &#34;As we look ahead, we continue to be incredibly excited by our large market opportunity, the potential to increase share, and become a standard within more of our customers. We also see a tremendous opportunity to win more legacy workloads, as AI has now become a catalyst to modernize these applications. MongoDB&#39;s document-based architecture is particularly well-suited for the variety and
      Score: 0.82807535
      Text: to the use of new and evolving technologies, such as artificial intelligence, in our offerings or partnerships; the growth and expansion of the market for database products and our ability to penetrate that market; our ability to integrate acquired businesses and technologies successfully or achieve the expected benefits of such acquisitions; our ability to maintain the security of our software
      Score: 0.8165897
      Text: MongoDB continues to expand its AI ecosystem with the announcement of the MongoDB AI Applications Program (MAAP), which provides customers with reference architectures, pre-built partner integrations, and professional services to help them quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects, and is the first global systems
      Score: 0.8023907
      Text: assumptions, our ability to capitalize on our market opportunity and deliver strong growth for the foreseeable future as well as the criticality of MongoDB to artificial intelligence application development. These forward-looking statements include, but are not limited to, plans, objectives, expectations and intentions and other statements contained in this press release that are not historical
      Score: 0.7829329
Step 5: Generate responses with the LLM.

In this section, you generate responses by prompting an LLM from OpenAI to use the retrieved documents as context. This example uses the function you just defined to retrieve matching documents from the database, and additionally:

  • Includes the user's question and the retrieved documents in the prompt for the LLM.

  • Prompts the LLM about MongoDB's latest AI announcements.

  1. Create a new file called generate-responses.go, and paste the following code into it:

    generate-responses.go
    package main

    import (
        "context"
        "fmt"
        "log"
        "os"
        "rag-mongodb/common" // Package that contains the GetQueryResults function
        "strings"

        "github.com/tmc/langchaingo/llms"
        "github.com/tmc/langchaingo/llms/openai"
        "github.com/tmc/langchaingo/prompts"
    )

    func main() {
        ctx := context.Background()
        question := "In a few sentences, what are MongoDB's latest AI announcements?"
        documents := common.GetQueryResults(question)
        var textDocuments strings.Builder
        for _, doc := range documents {
            textDocuments.WriteString(doc.PageContent)
        }
        template := prompts.NewPromptTemplate(
            `Answer the following question based on the given context.
            Question: {{.question}}
            Context: {{.context}}`,
            []string{"question", "context"},
        )
        prompt, err := template.Format(map[string]any{
            "question": question,
            "context":  textDocuments.String(),
        })
        if err != nil {
            log.Fatalf("failed to format the prompt template: %v", err)
        }
        // Loads OpenAI API key from environment
        openaiApiKey := os.Getenv("OPENAI_API_KEY")
        if openaiApiKey == "" {
            log.Fatal("Set your OPENAI_API_KEY environment variable in the .env file")
        }
        // Creates an OpenAI LLM client
        llm, err := openai.New(
            openai.WithToken(openaiApiKey),
            openai.WithModel("gpt-4o"),
        )
        if err != nil {
            log.Fatalf("Failed to create an LLM client: %v", err)
        }
        completion, err := llms.GenerateFromSinglePrompt(ctx, llm, prompt)
        if err != nil {
            log.Fatalf("failed to generate a response from the prompt: %v", err)
        }
        fmt.Println(completion)
    }
  2. Run this command to execute the code. The generated response might vary.

    go run generate-responses.go
    MongoDB recently announced several developments in its AI ecosystem.
    These include the MongoDB AI Applications Program (MAAP), which offers
    reference architectures, pre-built partner integrations, and professional
    services to help customers efficiently build AI-powered applications.
    Accenture is the first global systems integrator to join MAAP and will
    establish a center of excellence for MongoDB projects. Additionally,
    MongoDB introduced significant updates, including faster performance
    in version 8.0 and the general availability of Atlas Stream Processing
    to enable real-time, event-driven applications. These advancements
    highlight MongoDB's focus on supporting AI-powered applications and
    modernizing legacy workloads.
Step 1: Set up the environment.
  1. From your IDE, create a Java project using Maven or Gradle.

  2. Add the following dependencies, depending on your package manager:

    If you are using Maven, add the following dependencies to the dependencies section and the Bill of Materials (BOM) to the dependencyManagement section of your project's pom.xml file:

    pom.xml
    <dependencies>
        <!-- MongoDB Java Sync Driver -->
        <dependency>
            <groupId>org.mongodb</groupId>
            <artifactId>mongodb-driver-sync</artifactId>
            <version>5.2.0</version>
        </dependency>
        <!-- Core LangChain4j library (provides Document interface, etc.) -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j</artifactId>
        </dependency>
        <!-- Voyage AI integration -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-voyage-ai</artifactId>
        </dependency>
        <!-- Hugging Face integration -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-hugging-face</artifactId>
        </dependency>
        <!-- Open AI integration -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-open-ai</artifactId>
        </dependency>
        <!-- Apache PDFBox Document Parser -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-document-parser-apache-pdfbox</artifactId>
        </dependency>
    </dependencies>

    <dependencyManagement>
        <dependencies>
            <!-- Bill of Materials (BOM) to manage Java library versions -->
            <dependency>
                <groupId>dev.langchain4j</groupId>
                <artifactId>langchain4j-bom</artifactId>
                <version>1.1.0</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    If you are using Gradle, add the following Bill of Materials (BOM) and dependencies to the dependencies block in your project's build.gradle file:

    build.gradle
    dependencies {
        // Bill of Materials (BOM) to manage Java library versions
        implementation platform('dev.langchain4j:langchain4j-bom:1.1.0')
        // MongoDB Java Sync Driver v5.2.0 or later
        implementation 'org.mongodb:mongodb-driver-sync:5.2.0'
        // Java library for Voyage AI models
        implementation 'dev.langchain4j:langchain4j-voyage-ai'
        // Java library for Hugging Face models
        implementation 'dev.langchain4j:langchain4j-hugging-face'
        // Java library for Open AI models
        implementation 'dev.langchain4j:langchain4j-open-ai'
        // Java library for URL Document Loader
        implementation 'dev.langchain4j:langchain4j'
        // Java library for Apache PDFBox Document Parser
        implementation 'dev.langchain4j:langchain4j-document-parser-apache-pdfbox'
    }
  3. Run your package manager to install the dependencies to your project.

Step 2: Set your environment variables.

Note

This example sets the variables for the project in the IDE. Production applications might manage environment variables through a deployment configuration, CI/CD pipeline, or secrets manager, but you can adapt the provided code to fit your use case.

Set only the environment variables that you need for your project.

In your IDE, create a new configuration template and add the following variables to your project:

  • If you are using IntelliJ IDEA, create a new Application run configuration template, then add your variables as semicolon-separated values in the Environment variables field (for example, FOO=123;BAR=456). Apply the changes and click OK.

    To learn more, see the Create a run/debug configuration from a template section of the IntelliJ IDEA documentation.

  • If you are using Eclipse, create a new Java Application launch configuration, then add each variable as a new key-value pair in the Environment tab. Apply the changes and click OK.

    To learn more, see the Creating a Java application launch configuration section of the Eclipse IDE documentation.

Environment variables
VOYAGE_AI_KEY=<voyage-api-key> # If using Voyage AI embedding models
HUGGING_FACE_ACCESS_TOKEN=<access-token> # If using Hugging Face embedding models
OPENAI_API_KEY=<openai-api-key>
MONGODB_URI=<connection-string>

Update the placeholders with the following values:

  • Replace the <voyage-api-key> placeholder value with your Voyage AI API key, if you're using Voyage AI.

  • Replace the <access-token> placeholder value with your Hugging Face access token, if you're using Hugging Face.

  • Replace the <openai-api-key> placeholder value with your OpenAI API key.

  • Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.

    If you use an Atlas cluster, your connection string should use the following format:

    mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

    To learn more, see Connect to a Cluster via Drivers.

    If you use a local Atlas deployment, your connection string should use the following format:

    mongodb://localhost:<port-number>/?directConnection=true

    To learn more, see Connection Strings.

Step 3: Load and split the data.

Create a file named PDFProcessor.java and paste the following code.

This code defines the following methods:

  • The parsePDFDocument method uses the Apache PDFBox library and LangChain4j URL Document Loader to load and parse a PDF file at a given URL. The method returns the parsed PDF as a langchain4j Document.

  • The splitDocument method splits a given langchain4j Document into chunks according to the specified chunk size (number of characters) and chunk overlap (number of overlapping characters between consecutive chunks). The method returns a list of text segments.

PDFProcessor.java
import dev.langchain4j.data.document.Document;
import dev.langchain4j.data.document.DocumentParser;
import dev.langchain4j.data.document.DocumentSplitter;
import dev.langchain4j.data.document.loader.UrlDocumentLoader;
import dev.langchain4j.data.document.parser.apache.pdfbox.ApachePdfBoxDocumentParser;
import dev.langchain4j.data.document.splitter.DocumentByCharacterSplitter;
import dev.langchain4j.data.segment.TextSegment;
import java.util.List;
public class PDFProcessor {
/** Parses a PDF document from the specified URL, and returns a
* langchain4j Document object.
*/
public static Document parsePDFDocument(String url) {
DocumentParser parser = new ApachePdfBoxDocumentParser();
return UrlDocumentLoader.load(url, parser);
}
/** Splits a parsed langchain4j Document based on the specified chunking
* parameters, and returns an array of text segments.
*/
public static List<TextSegment> splitDocument(Document document) {
int maxChunkSize = 400; // number of characters
int maxChunkOverlap = 20; // number of overlapping characters between consecutive chunks
DocumentSplitter splitter = new DocumentByCharacterSplitter(maxChunkSize, maxChunkOverlap);
return splitter.split(document);
}
}
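The 400-character chunk size and 20-character overlap used here are tutorial defaults: larger chunks preserve more context per segment, while the overlap helps keep a sentence's meaning from being split across segment boundaries. You can tune both values for your own data.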
4

Create a file named EmbeddingProvider.java and paste the following code.

This code defines two methods to generate embeddings for a given input using the voyage-3-large embedding model from Voyage AI:

  • Multiple Inputs: The getEmbeddings() method accepts an array of text inputs (List<String>), allowing you to create multiple embeddings in a single API call. The method converts the API-provided arrays of floats to BSON arrays of doubles for storing in MongoDB.

  • Single Input: The getEmbedding() method accepts a single String, which represents a query you want to make against your vector data. The method converts the API-provided array of floats to a BSON array of doubles to use when querying your collection.

EmbeddingProvider.java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.model.voyageai.VoyageAiEmbeddingModel;
import dev.langchain4j.model.output.Response;
import org.bson.BsonArray;
import org.bson.BsonDouble;
import java.util.List;
public class EmbeddingProvider {
private static EmbeddingModel embeddingModel;
private static EmbeddingModel getEmbeddingModel() {
if (embeddingModel == null) {
String apiKey = System.getenv("VOYAGE_AI_KEY");
if (apiKey == null || apiKey.isEmpty()) {
throw new IllegalStateException("VOYAGE_AI_KEY env variable is not set or is empty.");
}
// assign to the static field so the model is built once and reused
embeddingModel = VoyageAiEmbeddingModel.builder()
.apiKey(apiKey)
.modelName("voyage-3-large")
.build();
}
return embeddingModel;
}
/**
* Takes an array of strings and returns a BSON array of embeddings to
* store in the database.
*/
public List<BsonArray> getEmbeddings(List<String> texts) {
List<TextSegment> textSegments = texts.stream()
.map(TextSegment::from)
.toList();
Response<List<Embedding>> response = getEmbeddingModel().embedAll(textSegments);
return response.content().stream()
.map(e -> new BsonArray(
e.vectorAsList().stream()
.map(BsonDouble::new)
.toList()))
.toList();
}
/**
* Takes a single string and returns a BSON array embedding to
* use in a vector query.
*/
public BsonArray getEmbedding(String text) {
Response<Embedding> response = getEmbeddingModel().embed(text);
return new BsonArray(
response.content().vectorAsList().stream()
.map(BsonDouble::new)
.toList());
}
}
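Note that the model is built lazily and cached in a static field, so repeated calls to getEmbeddings() and getEmbedding() reuse a single client rather than rebuilding it on every request.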

Alternatively, to use the mxbai-embed-large-v1 open-source embedding model from Hugging Face, use the following code instead. It defines two methods to generate embeddings for a given input:

  • Multiple Inputs: The getEmbeddings() method accepts an array of text inputs (List<String>), allowing you to create multiple embeddings in a single API call. The method converts the API-provided arrays of floats to BSON arrays of doubles for storing in MongoDB.

  • Single Input: The getEmbedding() method accepts a single String, which represents a query you want to make against your vector data. The method converts the API-provided array of floats to a BSON array of doubles to use when querying your collection.

EmbeddingProvider.java
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.huggingface.HuggingFaceEmbeddingModel;
import dev.langchain4j.model.output.Response;
import org.bson.BsonArray;
import org.bson.BsonDouble;
import java.util.List;
import static java.time.Duration.ofSeconds;
public class EmbeddingProvider {
private static HuggingFaceEmbeddingModel embeddingModel;
private static HuggingFaceEmbeddingModel getEmbeddingModel() {
if (embeddingModel == null) {
String accessToken = System.getenv("HUGGING_FACE_ACCESS_TOKEN");
if (accessToken == null || accessToken.isEmpty()) {
throw new RuntimeException("HUGGING_FACE_ACCESS_TOKEN env variable is not set or is empty.");
}
embeddingModel = HuggingFaceEmbeddingModel.builder()
.accessToken(accessToken)
.modelId("mixedbread-ai/mxbai-embed-large-v1")
.waitForModel(true)
.timeout(ofSeconds(60))
.build();
}
return embeddingModel;
}
/**
* Takes an array of strings and returns a BSON array of embeddings to
* store in the database.
*/
public List<BsonArray> getEmbeddings(List<String> texts) {
List<TextSegment> textSegments = texts.stream()
.map(TextSegment::from)
.toList();
Response<List<Embedding>> response = getEmbeddingModel().embedAll(textSegments);
return response.content().stream()
.map(e -> new BsonArray(
e.vectorAsList().stream()
.map(BsonDouble::new)
.toList()))
.toList();
}
/**
* Takes a single string and returns a BSON array embedding to
* use in a vector query.
*/
public BsonArray getEmbedding(String text) {
Response<Embedding> response = getEmbeddingModel().embed(text);
return new BsonArray(
response.content().vectorAsList().stream()
.map(BsonDouble::new)
.toList());
}
}
5

Create a file named DataIngest.java and paste the following code.

This code uses the LangChain4j library and the MongoDB Java Sync Driver to ingest sample data into MongoDB that LLMs don't have access to.

Specifically, this code does the following:

  1. Connects to your MongoDB deployment.

  2. Loads and parses the MongoDB earnings report PDF file from the URL using the parsePDFDocument method that you previously defined.

  3. Splits the data into chunks using the splitDocument method that you previously defined.

  4. Creates vector embeddings from the chunked data using the getEmbeddings() method that you previously defined.

  5. Stores the embeddings alongside the chunked data in the rag_db.test collection.

    DataIngest.java
    import com.mongodb.MongoException;
    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.MongoDatabase;
    import com.mongodb.client.result.InsertManyResult;
    import dev.langchain4j.data.segment.TextSegment;
    import org.bson.BsonArray;
    import org.bson.Document;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.stream.Collectors;
    public class DataIngest {
    public static void main(String[] args) {
    String uri = System.getenv("MONGODB_URI");
    if (uri == null || uri.isEmpty()) {
    throw new RuntimeException("MONGODB_URI env variable is not set or is empty.");
    }
    // establish connection and set namespace
    try (MongoClient mongoClient = MongoClients.create(uri)) {
    MongoDatabase database = mongoClient.getDatabase("rag_db");
    MongoCollection<Document> collection = database.getCollection("test");
    // parse the PDF file at the specified URL
    String url = "https://investors.mongodb.com/node/12236/pdf";
    String fileName = "mongodb_annual_report.pdf";
    System.out.println("Parsing the [" + fileName + "] file from url: " + url);
    dev.langchain4j.data.document.Document parsedDoc = PDFProcessor.parsePDFDocument(url);
    // split (or "chunk") the parsed document into text segments
    List<TextSegment> segments = PDFProcessor.splitDocument(parsedDoc);
    System.out.println(segments.size() + " text segments created successfully.");
    // create vector embeddings from the chunked data (i.e. text segments)
    System.out.println("Creating vector embeddings from the parsed data segments. This may take a few moments.");
    List<Document> documents = embedText(segments);
    // insert the embeddings into the MongoDB collection
    try {
    System.out.println("Ingesting data into the " + collection.getNamespace() + " collection.");
    insertDocuments(documents, collection);
    }
    catch (MongoException me) {
    throw new RuntimeException("Failed to insert documents", me);
    }
    } catch (MongoException me) {
    throw new RuntimeException("Failed to connect to MongoDB", me);
    } catch (Exception e) {
    throw new RuntimeException("Operation failed: ", e);
    }
    }
    /**
    * Embeds text segments into vector embeddings using the EmbeddingProvider
    * class and returns a list of BSON documents containing the text and
    * generated embeddings.
    */
    private static List<Document> embedText(List<TextSegment> segments) {
    EmbeddingProvider embeddingProvider = new EmbeddingProvider();
    List<String> texts = segments.stream()
    .map(TextSegment::text)
    .collect(Collectors.toList());
    List<BsonArray> embeddings = embeddingProvider.getEmbeddings(texts);
    List<Document> documents = new ArrayList<>();
    int i = 0;
    for (TextSegment segment : segments) {
    Document doc = new Document("text", segment.text()).append("embedding", embeddings.get(i));
    documents.add(doc);
    i++;
    }
    return documents;
    }
    /**
    * Inserts a list of BSON documents into the specified MongoDB collection.
    */
    private static void insertDocuments(List<Document> documents, MongoCollection<Document> collection) {
    List<String> insertedIds = new ArrayList<>();
    InsertManyResult result = collection.insertMany(documents);
    result.getInsertedIds().values()
    .forEach(doc -> insertedIds.add(doc.toString()));
    System.out.println(insertedIds.size() + " documents inserted into the " + collection.getNamespace() + " collection successfully.");
    }
    }
6

Note

503 when calling Hugging Face models

You might occasionally get 503 errors when calling models hosted on the Hugging Face Hub. To resolve this issue, retry after a short delay.
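For example, you might wrap the embedAll() or embed() call in a loop that retries up to a few times, sleeping a couple of seconds between attempts and rethrowing the exception only after the final failure.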

Save and run the DataIngest.java file. The output resembles:

Parsing the [mongodb_annual_report.pdf] file from url: https://investors.mongodb.com/node/12236/pdf
72 text segments created successfully.
Creating vector embeddings from the parsed data segments. This may take a few moments...
Ingesting data into the rag_db.test collection.
72 documents inserted into the rag_db.test collection successfully.
7

In this section, you set up Atlas Vector Search to retrieve documents from your vector database.

  1. Create a file named VectorIndex.java and paste the following code.

    This code creates an Atlas Vector Search index on your collection using the following index definition:

    • Indexes the embedding field as the vector type for the rag_db.test collection. This field contains the embeddings created with your embedding model.

    • Enforces 1024 vector dimensions and measures similarity between vectors using cosine similarity.
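    Cosine similarity scores two vectors by the angle between them (their dot product divided by the product of their magnitudes), so chunks whose embeddings point in nearly the same direction as the query embedding rank highest.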

    VectorIndex.java
    import com.mongodb.MongoException;
    import com.mongodb.client.ListSearchIndexesIterable;
    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.MongoCursor;
    import com.mongodb.client.MongoDatabase;
    import com.mongodb.client.model.SearchIndexModel;
    import com.mongodb.client.model.SearchIndexType;
    import org.bson.Document;
    import org.bson.conversions.Bson;
    import java.util.Collections;
    import java.util.List;
    public class VectorIndex {
    public static void main(String[] args) {
    String uri = System.getenv("MONGODB_URI");
    if (uri == null || uri.isEmpty()) {
    throw new IllegalStateException("MONGODB_URI env variable is not set or is empty.");
    }
    // establish connection and set namespace
    try (MongoClient mongoClient = MongoClients.create(uri)) {
    MongoDatabase database = mongoClient.getDatabase("rag_db");
    MongoCollection<Document> collection = database.getCollection("test");
    // define the index details for the index model
    String indexName = "vector_index";
    Bson definition = new Document(
    "fields",
    Collections.singletonList(
    new Document("type", "vector")
    .append("path", "embedding")
    .append("numDimensions", 1024)
    .append("similarity", "cosine")));
    SearchIndexModel indexModel = new SearchIndexModel(
    indexName,
    definition,
    SearchIndexType.vectorSearch());
    // create the index using the defined model
    try {
    List<String> result = collection.createSearchIndexes(Collections.singletonList(indexModel));
    System.out.println("Successfully created vector index named: " + result);
    System.out.println("It may take up to a minute for the index to build before you can query using it.");
    } catch (Exception e) {
    throw new RuntimeException(e);
    }
    // wait for index to build and become queryable
    System.out.println("Polling to confirm the index has completed building.");
    waitForIndexReady(collection, indexName);
    } catch (MongoException me) {
    throw new RuntimeException("Failed to connect to MongoDB", me);
    } catch (Exception e) {
    throw new RuntimeException("Operation failed: ", e);
    }
    }
    /**
    * Polls the collection to check whether the specified index is ready to query.
    */
    public static void waitForIndexReady(MongoCollection<Document> collection, String indexName) throws InterruptedException {
    ListSearchIndexesIterable<Document> searchIndexes = collection.listSearchIndexes();
    while (true) {
    try (MongoCursor<Document> cursor = searchIndexes.iterator()) {
    if (!cursor.hasNext()) {
    break;
    }
    Document current = cursor.next();
    String name = current.getString("name");
    boolean queryable = current.getBoolean("queryable");
    if (name.equals(indexName) && queryable) {
    System.out.println(indexName + " index is ready to query");
    return;
    } else {
    Thread.sleep(500);
    }
    }
    }
    }
    }
  2. Create the Atlas Vector Search index.

    Save and run the file. The output resembles:

    Successfully created vector index named: [vector_index]
    It may take up to a minute for the index to build before you can query using it.
    Polling to confirm the index has completed building.
    vector_index index is ready to query
8

In this section, you generate responses by prompting an LLM to use the retrieved documents as context.

Create a new file called LLMPrompt.java, and paste the following code into it.

This code does the following:

  1. Queries the rag_db.test collection for any matching documents using a retrieveDocuments method.

    This method uses the getEmbedding() method that you created earlier to generate an embedding from the search query, then runs the query to return semantically-similar documents.

    To learn more, refer to Run Vector Search Queries.

  2. Accesses an LLM from OpenAI, and creates a templated prompt using a createPrompt method.

    The method interpolates the user's question and the retrieved documents into a prompt template, which instructs the LLM to answer based on that context.

  3. Prompts the LLM about MongoDB's latest AI announcements, then returns a generated response.

    LLMPrompt.java
    import com.mongodb.MongoException;
    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.MongoDatabase;
    import com.mongodb.client.model.search.FieldSearchPath;
    import dev.langchain4j.data.message.AiMessage;
    import dev.langchain4j.model.chat.ChatModel;
    import dev.langchain4j.model.chat.request.ChatRequest;
    import dev.langchain4j.model.chat.response.ChatResponse;
    import dev.langchain4j.model.input.Prompt;
    import dev.langchain4j.model.input.PromptTemplate;
    import dev.langchain4j.model.openai.OpenAiChatModel;
    import org.bson.BsonArray;
    import org.bson.BsonValue;
    import org.bson.Document;
    import org.bson.conversions.Bson;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import static com.mongodb.client.model.Aggregates.project;
    import static com.mongodb.client.model.Aggregates.vectorSearch;
    import static com.mongodb.client.model.Projections.exclude;
    import static com.mongodb.client.model.Projections.fields;
    import static com.mongodb.client.model.Projections.include;
    import static com.mongodb.client.model.Projections.metaVectorSearchScore;
    import static com.mongodb.client.model.search.SearchPath.fieldPath;
    import static com.mongodb.client.model.search.VectorSearchOptions.exactVectorSearchOptions;
    import static java.util.Arrays.asList;
    public class LLMPrompt {
    // User input: the question to answer
    static String question = "In a few sentences, what are MongoDB's latest AI announcements?";
    public static void main(String[] args) {
    String uri = System.getenv("MONGODB_URI");
    if (uri == null || uri.isEmpty()) {
    throw new IllegalStateException("MONGODB_URI env variable is not set or is empty.");
    }
    // establish connection and set namespace
    try (MongoClient mongoClient = MongoClients.create(uri)) {
    MongoDatabase database = mongoClient.getDatabase("rag_db");
    MongoCollection<Document> collection = database.getCollection("test");
    // generate a response to the user question
    try {
    createPrompt(question, collection);
    } catch (Exception e) {
    throw new RuntimeException("An error occurred while generating the response: ", e);
    }
    } catch (MongoException me) {
    throw new RuntimeException("Failed to connect to MongoDB ", me);
    } catch (Exception e) {
    throw new RuntimeException("Operation failed: ", e);
    }
    }
    /**
    * Returns a list of documents from the specified MongoDB collection that
    * match the user's question.
    * NOTE: Update or omit the projection stage to change the desired fields in the response
    */
    public static List<Document> retrieveDocuments(String question, MongoCollection<Document> collection) {
    try {
    // generate the query embedding to use in the vector search
    EmbeddingProvider embeddingProvider = new EmbeddingProvider();
    BsonArray queryEmbeddingBsonArray = embeddingProvider.getEmbedding(question);
    List<Double> queryEmbedding = new ArrayList<>();
    for (BsonValue value : queryEmbeddingBsonArray.stream().toList()) {
    queryEmbedding.add(value.asDouble().getValue());
    }
    // define the pipeline stages for the vector search index
    String indexName = "vector_index";
    FieldSearchPath fieldSearchPath = fieldPath("embedding");
    int limit = 5;
    List<Bson> pipeline = asList(
    vectorSearch(
    fieldSearchPath,
    queryEmbedding,
    indexName,
    limit,
    exactVectorSearchOptions()),
    project(
    fields(
    exclude("_id"),
    include("text"),
    metaVectorSearchScore("score"))));
    // run the query and return the matching documents
    List<Document> matchingDocuments = new ArrayList<>();
    collection.aggregate(pipeline).forEach(matchingDocuments::add);
    return matchingDocuments;
    } catch (Exception e) {
    System.err.println("Error occurred while retrieving documents: " + e.getMessage());
    return new ArrayList<>();
    }
    }
    /**
    * Creates a templated prompt from a submitted question string and any retrieved documents,
    * then generates a response using the OpenAI chat model.
    */
    public static void createPrompt(String question, MongoCollection<Document> collection) {
    // retrieve documents matching the user's question
    List<Document> retrievedDocuments = retrieveDocuments(question, collection);
    if (retrievedDocuments.isEmpty()) {
    System.out.println("No relevant documents found. Unable to generate a response.");
    return;
    } else
    System.out.println("Generating a response from the retrieved documents. This may take a few moments.");
    // define a prompt template
    PromptTemplate promptBuilder = PromptTemplate.from("""
    Answer the following question based on the given context:
    Question: {{question}}
    Context: {{information}}
    -------
    """);
    // build the information string from the retrieved documents
    StringBuilder informationBuilder = new StringBuilder();
    for (Document doc : retrievedDocuments) {
    String text = doc.getString("text");
    informationBuilder.append(text).append("\n");
    }
    Map<String, Object> variables = new HashMap<>();
    variables.put("question", question);
    variables.put("information", informationBuilder);
    // generate and output the response from the chat model
    Prompt prompt = promptBuilder.apply(variables);
    ChatRequest chatRequest = ChatRequest.builder()
    .messages(Collections.singletonList(prompt.toUserMessage()))
    .build();
    String openAIApiKey = System.getenv("OPENAI_API_KEY");
    if (openAIApiKey == null || openAIApiKey.isEmpty()) {
    throw new IllegalStateException("OPENAI_API_KEY env variable is not set or is empty.");
    }
    ChatModel chatModel = OpenAiChatModel.builder()
    .apiKey(openAIApiKey)
    .modelName("gpt-4o")
    .build();
    ChatResponse chatResponse = chatModel.chat(chatRequest);
    AiMessage aiMessage = chatResponse.aiMessage();
    // extract the generated text to output a formatted response
    String responseText = aiMessage.text();
    String marker = "-------";
    int markerIndex = responseText.indexOf(marker);
    String generatedResponse;
    if (markerIndex != -1) {
    generatedResponse = responseText.substring(markerIndex + marker.length()).trim();
    } else {
    generatedResponse = responseText; // else fallback to the full response
    }
    // output the question and formatted response
    System.out.println("Question:\n " + question);
    System.out.println("Response:\n " + generatedResponse);
    // output the filled-in prompt and context information for demonstration purposes
    System.out.println("\n" + "---- Prompt Sent to LLM ----");
    System.out.println(prompt.text() + "\n");
    }
    }
9

Save and run the file. The output resembles the following, but note that the generated response might vary.

Question:
In a few sentences, what are MongoDB's latest AI announcements?
Response:
MongoDB recently made significant AI-related announcements, including the launch of the MongoDB AI Applications Program (MAAP). This initiative provides customers with tools such as reference architectures, pre-built partner integrations, and professional services to accelerate the development of AI-powered applications. Accenture has joined as the first global systems integrator for MAAP and will establish a center of excellence focused on MongoDB projects. Additionally, MongoDB unveiled version 8.0 with major performance improvements, including faster reads, updates, and bulk inserts, as well as enhanced time series queries. The company also announced the general availability of Atlas Stream Processing for real-time, event-driven applications. These advancements position MongoDB to support the growing demands of AI-driven workloads.
---- Prompt Sent to LLM ----
Answer the following question based on the given context:
Question: In a few sentences, what are MongoDB's latest AI announcements?
Context: MongoDB continues to expand its AI ecosystem with the announcement of the MongoDB AI Applications Program (MAAP),
more of our customers. We also see a tremendous opportunity to win more legacy workloads, as AI has now become a catalyst to modernize these
applications. MongoDB's document-based architecture is particularly well-suited for the variety and scale of data required by AI-powered applications.
We are confident MongoDB will be a substantial beneficiary of this next wave of application development."
of MongoDB 8.0—with significant performance improvements such as faster reads and updates, along with significantly
faster bulk inserts and time series queries—and the general availability of Atlas Stream Processing to build sophisticated,
event-driven applications with real-time data.
which provides customers with reference architectures, pre-built partner integrations, and professional services to help
them quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects,
and is the first global systems integrator to join MAAP.
included at the end of this press release. An explanation of these measures is also included below under the heading "Non-GAAP Financial
Measures."
First Quarter Fiscal 2025 and Recent Business Highlights
MongoDB announced a number of new products and capabilities at MongoDB.local NYC. Highlights included the preview
1
  1. Initialize your Node.js project.

    Run the following commands in your terminal to create a new directory named rag-mongodb and initialize your project:

    mkdir rag-mongodb
    cd rag-mongodb
    npm init -y
  2. Install and import dependencies.

    Run the following command:

    npm install mongodb voyageai openai @huggingface/inference @xenova/transformers langchain @langchain/community pdf-parse
  3. Update your package.json file.

    In your project's package.json file, specify the type field as shown in the following example, and then save the file.

    {
    "name": "rag-mongodb",
    "type": "module",
    ...
  4. Create a .env file.

    In your project, create a .env file to store your MongoDB connection string and API keys for the models that you want to use:

    MONGODB_URI = "<connection-string>"
    VOYAGE_API_KEY = "<voyage-api-key>" # If using Voyage AI embedding model
    HUGGING_FACE_ACCESS_TOKEN = "<hf-token>" # If using Hugging Face embedding or generative model
    OPENAI_API_KEY = "<openai-api-key>" # If using OpenAI generative model

    Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.

    If you're connecting to an Atlas cluster, your connection string should use the following format:

    mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

    To learn more, see Connect to a Cluster via Drivers.

    If you're connecting to a local Atlas deployment, your connection string should use the following format:

    mongodb://localhost:<port-number>/?directConnection=true

    To learn more, see Connection Strings.

    Note

    Minimum Node.js Version Requirements

    Node.js v20.x introduced the --env-file option. If you are using an older version of Node.js, add the dotenv package to your project, or use a different method to manage your environment variables.
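    For example, if you add the dotenv package, you might preload it when running each script instead of using --env-file. This is a sketch, not part of the tutorial's commands:

    npm install dotenv
    node -r dotenv/config ingest-data.js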

2

To generate embeddings, use an embedding model. For this tutorial, you can use an open-source model from Hugging Face or a proprietary model from Voyage AI.

In your project, create a file called get-embeddings.js and paste the following code:

import { VoyageAIClient } from 'voyageai';
// Set up Voyage AI configuration
const client = new VoyageAIClient({apiKey: process.env.VOYAGE_API_KEY});
// Function to generate embeddings using the Voyage AI API
export async function getEmbedding(text) {
const results = await client.embed({
input: text,
model: "voyage-3-large"
});
return results.data[0].embedding;
}

The getEmbedding() function generates vector embeddings by using the voyage-3-large embedding model from Voyage AI.
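To spot-check the function, you might run a one-off command that prints the embedding length. This sketch assumes your .env file is set up and that your Node.js version resolves dynamic imports relative to the working directory; expect 1024 dimensions for voyage-3-large:

node --env-file=.env -e "import('./get-embeddings.js').then(async (m) => { const embedding = await m.getEmbedding('test'); console.log(embedding.length); })"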

Alternatively, to use an open-source embedding model that runs locally through Transformers.js instead of calling a hosted API, paste the following code:

import { pipeline } from '@xenova/transformers';
// Function to generate embeddings for a given data source
export async function getEmbedding(data) {
const embedder = await pipeline(
'feature-extraction',
'Xenova/nomic-embed-text-v1');
const results = await embedder(data, { pooling: 'mean', normalize: true });
return Array.from(results.data);
}

The getEmbedding() function generates vector embeddings by using the open-source nomic-embed-text-v1 embedding model, which runs locally through the Transformers.js library.

3

In this section, you ingest sample data into MongoDB that LLMs don't have access to. The following code uses the LangChain integration and Node.js driver to do the following:

  • Load a PDF that contains a MongoDB earnings report.

  • Split the data into chunks, specifying the chunk size (number of characters) and chunk overlap (number of overlapping characters between consecutive chunks).

  • Create vector embeddings from the chunked data by using the getEmbedding() function that you defined.

  • Store these embeddings alongside the chunked data in the rag_db.test collection.

Create a file called ingest-data.js in your project, and paste the following code:

import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { MongoClient } from 'mongodb';
import { getEmbedding } from './get-embeddings.js';
import * as fs from 'fs';
async function run() {
const client = new MongoClient(process.env.MONGODB_URI);
try {
// Save online PDF as a file
const rawData = await fetch("https://investors.mongodb.com/node/12236/pdf");
const pdfBuffer = await rawData.arrayBuffer();
const pdfData = Buffer.from(pdfBuffer);
fs.writeFileSync("investor-report.pdf", pdfData);
const loader = new PDFLoader(`investor-report.pdf`);
const data = await loader.load();
// Chunk the text from the PDF
const textSplitter = new RecursiveCharacterTextSplitter({
chunkSize: 400,
chunkOverlap: 20,
});
const docs = await textSplitter.splitDocuments(data);
console.log(`Successfully chunked the PDF into ${docs.length} documents.`);
// Connect to your MongoDB cluster
await client.connect();
const db = client.db("rag_db");
const collection = db.collection("test");
console.log("Generating embeddings and inserting documents...");
const insertDocuments = [];
await Promise.all(docs.map(async doc => {
// Generate embeddings using the function that you defined
const embedding = await getEmbedding(doc.pageContent);
// Add the document with the embedding to array of documents for bulk insert
insertDocuments.push({
document: doc,
embedding: embedding
});
}))
// Continue processing documents if an error occurs during an operation
const options = { ordered: false };
// Insert documents with embeddings into collection
const result = await collection.insertMany(insertDocuments, options);
console.log("Count of documents inserted: " + result.insertedCount);
} catch (err) {
console.log(err.stack);
}
finally {
await client.close();
}
}
run().catch(console.dir);

Then, run the following command to execute the code:

node --env-file=.env ingest-data.js
Generating embeddings and inserting documents...
Count of documents inserted: 86

Tip

This code takes some time to run. If you're using Atlas, you can verify your vector embeddings by navigating to the rag_db.test namespace in the Atlas UI.
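Alternatively, if you have mongosh installed, a quick count from your terminal might look like the following sketch; substitute your own connection string:

mongosh "<connection-string>" --eval "db.getSiblingDB('rag_db').test.countDocuments()"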

4

In this section, you set up Atlas Vector Search to retrieve documents from your vector database. Complete the following steps:

  1. Create an Atlas Vector Search index on your vector embeddings.

    Create a new file named rag-vector-index.js and paste the following code. This code connects to your MongoDB deployment and creates an index of the vectorSearch type on the rag_db.test collection. Replace the <dimensions> placeholder with one of the following values:

    • 768 if you used nomic-embed-text-v1

    • 1024 if you used voyage-3-large

    import { MongoClient } from 'mongodb';
    // Connect to your MongoDB cluster
    const client = new MongoClient(process.env.MONGODB_URI);
    async function run() {
    try {
    const database = client.db("rag_db");
    const collection = database.collection("test");
    // Define your Vector Search index
    const index = {
    name: "vector_index",
    type: "vectorSearch",
    definition: {
    "fields": [
    {
    "type": "vector",
    "path": "embedding",
    "similarity": "cosine",
    "numDimensions": <dimensions> // Replace with the number of dimensions of your embeddings
    }
    ]
    }
    }
    // Call the method to create the index
    const result = await collection.createSearchIndex(index);
    console.log(result);
    } finally {
    await client.close();
    }
    }
    run().catch(console.dir);

    Then, run the following command to execute the code:

    node --env-file=.env rag-vector-index.js
  2. Define a function to retrieve relevant data.

    Create a new file called retrieve-documents.js.

    In this step, you create a retrieval function called getQueryResults() that runs a query to retrieve relevant documents. It uses the getEmbedding() function to create an embedding from the search query. Then, it runs the query to return semantically-similar documents.

    To learn more, refer to Run Vector Search Queries.

    Paste this code into your file:

    import { MongoClient } from 'mongodb';
    import { getEmbedding } from './get-embeddings.js';
    // Function to get the results of a vector query
    export async function getQueryResults(query) {
    // Connect to your Atlas cluster
    const client = new MongoClient(process.env.MONGODB_URI);
    try {
    // Get embedding for a query
    const queryEmbedding = await getEmbedding(query);
    await client.connect();
    const db = client.db("rag_db");
    const collection = db.collection("test");
    const pipeline = [
    {
    $vectorSearch: {
    index: "vector_index",
    queryVector: queryEmbedding,
    path: "embedding",
    exact: true,
    limit: 5
    }
    },
    {
    $project: {
    _id: 0,
    document: 1,
    }
    }
    ];
    // Retrieve documents using a Vector Search query
    const result = collection.aggregate(pipeline);
    const arrayOfQueryDocs = [];
    for await (const doc of result) {
    arrayOfQueryDocs.push(doc);
    }
    return arrayOfQueryDocs;
    } catch (err) {
    console.log(err.stack);
    }
    finally {
    await client.close();
    }
    }
  3. Test retrieving the data.

    Create a new file called retrieve-documents-test.js. In this step, you check that the function you just defined returns relevant results.

    Paste this code into your file:

    import { getQueryResults } from './retrieve-documents.js';
    async function run() {
    try {
    const query = "AI Technology";
    const documents = await getQueryResults(query);
    documents.forEach( doc => {
    console.log(doc);
    });
    } catch (err) {
    console.log(err.stack);
    }
    }
    run().catch(console.dir);

    Then, run the following command to execute the code. Your results might vary depending on the embedding model you use.

    node --env-file=.env retrieve-documents-test.js
    {
    document: {
    pageContent: 'MongoDB continues to expand its AI ecosystem with the announcement of the MongoDB AI Applications Program (MAAP),',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
    {
    document: {
    pageContent: 'artificial intelligence, in our offerings or partnerships; the growth and expansion of the market for database products and our ability to penetrate that\n' +
    'market; our ability to integrate acquired businesses and technologies successfully or achieve the expected benefits of such acquisitions; our ability to',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
    {
    document: {
    pageContent: 'more of our customers. We also see a tremendous opportunity to win more legacy workloads, as AI has now become a catalyst to modernize these\n' +
    "applications. MongoDB's document-based architecture is particularly well-suited for the variety and scale of data required by AI-powered applications. \n" +
    'We are confident MongoDB will be a substantial beneficiary of this next wave of application development."',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
    {
    document: {
    pageContent: 'which provides customers with reference architectures, pre-built partner integrations, and professional services to help\n' +
    'them quickly build AI-powered applications. Accenture will establish a center of excellence focused on MongoDB projects,\n' +
    'and is the first global systems integrator to join MAAP.',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
    {
    document: {
    pageContent: 'Bendigo and Adelaide Bank partnered with MongoDB to modernize their core banking technology. With the help of\n' +
    'MongoDB Relational Migrator and generative AI-powered modernization tools, Bendigo and Adelaide Bank decomposed an\n' +
    'outdated consumer-servicing application into microservices and migrated off its underlying legacy relational database',
    metadata: { source: 'investor-report.pdf', pdf: [Object], loc: [Object] },
    id: null
    }
    }
5

In this section, you generate responses by prompting an LLM to use the retrieved documents as context. For this tutorial, you can use a model from OpenAI or an open-source model from Hugging Face. This example uses the function you just defined to retrieve matching documents from the database, and additionally:

  • Instructs the LLM to include the user's question and retrieved documents in the prompt.

  • Prompts the LLM about MongoDB's latest AI announcements.

Create a new file called generate-responses.js, and paste the following code into it:

import { getQueryResults } from './retrieve-documents.js';
import OpenAI from 'openai';
async function run() {
try {
// Specify search query and retrieve relevant documents
const question = "In a few sentences, what are MongoDB's latest AI announcements?";
const documents = await getQueryResults(question);
// Build a string representation of the retrieved documents to use in the prompt
let textDocuments = "";
documents.forEach(doc => {
textDocuments += doc.document.pageContent;
});
// Create a prompt consisting of the question and context to pass to the LLM
const prompt = `Answer the following question based on the given context.
Question: {${question}}
Context: {${textDocuments}}
`;
// Initialize OpenAI client
const client = new OpenAI({
apiKey: process.env.OPENAI_API_KEY,
});
// Prompt the LLM to generate a response based on the context
const chatCompletion = await client.chat.completions.create({
model: "gpt-4o",
messages: [
{
role: "user",
content: prompt
},
],
});
// Output the LLM's response as text.
console.log(chatCompletion.choices[0].message.content);
} catch (err) {
console.log(err.stack);
}
}
run().catch(console.dir);
Alternatively, to use an open-source model from Hugging Face, paste the following code into generate-responses.js instead:

import { getQueryResults } from './retrieve-documents.js';
import { InferenceClient } from '@huggingface/inference';
async function run() {
try {
// Specify search query and retrieve relevant documents
const question = "In a few sentences, what are MongoDB's latest AI announcements?";
const documents = await getQueryResults(question);
// Build a string representation of the retrieved documents to use in the prompt
let textDocuments = "";
documents.forEach(doc => {
textDocuments += doc.document.pageContent;
});
// Create a prompt consisting of the question and context to pass to the LLM
const prompt = `Answer the following question based on the given context.
Question: {${question}}
Context: {${textDocuments}}
`;
// Prompt the LLM to generate a response based on the context
const client = new InferenceClient(process.env.HUGGING_FACE_ACCESS_TOKEN);
const chatCompletion = await client.chatCompletion({
provider: "fireworks-ai",
model: "mistralai/Mixtral-8x22B-Instruct-v0.1",
messages: [
{
role: "user",
content: prompt
},
],
});
// Output the LLM's response as text.
console.log(chatCompletion.choices[0].message.content);
} catch (err) {
console.log(err.stack);
}
}
run().catch(console.dir);

Then, run this command to execute the code. The generated response might vary.

node --env-file=.env generate-responses.js
MongoDB's latest AI announcements include the launch of the MongoDB
AI Applications Program (MAAP), which provides customers with
reference architectures, pre-built partner integrations, and
professional services to help them build AI-powered applications
quickly. Accenture has joined MAAP as the first global systems
integrator, establishing a center of excellence focused on MongoDB
projects. Additionally, Bendigo and Adelaide Bank have partnered
with MongoDB to modernize their core banking technology using
MongoDB's Relational Migrator and generative AI-powered
modernization tools.
1

Create an interactive Python notebook by saving a file with the .ipynb extension. This notebook allows you to run Python code snippets individually. In your notebook, run the following code to install the dependencies for this tutorial:

pip install --quiet --upgrade pymongo sentence_transformers voyageai huggingface_hub openai einops langchain langchain_community pypdf

Then, run the following code to set the environment variables for this tutorial, replacing the placeholders with any API keys that you need to access the models.

import os
os.environ["VOYAGE_API_KEY"] = "<voyage-api-key>" # If using Voyage AI embedding model
os.environ["HF_TOKEN"] = "<hf-token>" # If using Hugging Face embedding or generative model
os.environ["OPENAI_API_KEY"] = "<openai-api-key>" # If using OpenAI generative model
2

In this section, you ingest sample data into MongoDB that LLMs don't have access to. Paste and run each of the following code snippets in your notebook:

  1. Define a function to generate vector embeddings.

    To generate embeddings, use an embedding model. For this tutorial, you can use an open-source model from Hugging Face or a proprietary model from Voyage AI.

    Paste and run the following code in your notebook to create a function named get_embedding() that generates vector embeddings by using an embedding model from Voyage AI. The Voyage AI client reads the VOYAGE_API_KEY environment variable that you set earlier.

    The function specifies the following:

    • voyage-3-large as the embedding model to use.

    • input_type parameter to optimize your embeddings for retrieval. To learn more, see Voyage AI Python API.

    Tip

    For all models and parameters, see Voyage AI Text Embeddings.

    import os
    import voyageai
    # Specify the embedding model
    model = "voyage-3-large"
    vo = voyageai.Client()
    # Define a function to generate embeddings
    def get_embedding(data, input_type = "document"):
        embeddings = vo.embed(
            data, model = model, input_type = input_type
        ).embeddings
        return embeddings[0]

    Alternatively, paste and run the following code in your notebook to create a function named get_embedding() that generates vector embeddings by using the open-source nomic-embed-text-v1 embedding model from Sentence Transformers.

    from sentence_transformers import SentenceTransformer

    # Load the embedding model (https://huggingface.co/nomic-ai/nomic-embed-text-v1)
    model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

    # Define a function to generate embeddings
    def get_embedding(data):
        """Generates vector embeddings for the given data."""
        embedding = model.encode(data)
        return embedding.tolist()
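    Whichever model you chose, you can sanity-check the function and confirm the dimensionality that your vector index must use later. A minimal check:

    test_embedding = get_embedding("foo")
    print(len(test_embedding)) # expect 1024 for voyage-3-large or 768 for nomic-embed-text-v1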
  2. Load and split the data.

    Run this code to load and split sample data by using the LangChain integration. Specifically, this code does the following:

    • Loads a PDF that contains a MongoDB earnings report.

    • Splits the data into chunks, specifying the chunk size (number of characters) and chunk overlap (number of overlapping characters between consecutive chunks).

    from langchain_community.document_loaders import PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    # Load the PDF
    loader = PyPDFLoader("https://investors.mongodb.com/node/12236/pdf")
    data = loader.load()
    # Split the data into chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=20)
    documents = text_splitter.split_documents(data)
  3. Convert the data to vector embeddings.

    Run this code to prepare the chunked documents for ingestion by creating a list of documents with their corresponding vector embeddings. You generate these embeddings by using the get_embedding() function that you just defined.

    # Prepare documents for insertion
    docs_to_insert = [{
    "text": doc.page_content,
    "embedding": get_embedding(doc.page_content)
    } for doc in documents]
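    Optionally, inspect one prepared document to confirm its shape before inserting, for example:

    print(docs_to_insert[0]["text"][:100])
    print(len(docs_to_insert[0]["embedding"]))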
  4. Store the data and embeddings in MongoDB.

    Run this code to insert the documents containing the embeddings into the rag_db.test collection. Before running the code, replace <connection-string> with your MongoDB connection string.

    from pymongo import MongoClient
    # Connect to your MongoDB deployment
    client = MongoClient("<connection-string>")
    collection = client["rag_db"]["test"]
    # Insert documents into the collection
    result = collection.insert_many(docs_to_insert)

    Tip

    After you run the code, if you're using Atlas, you can verify your vector embeddings by navigating to the rag_db.test namespace in the Atlas UI.
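    You can also verify the ingestion from your notebook. For example, the following check counts the inserted documents and prints one of them with its embedding truncated to the first three values:

    print(collection.count_documents({}))
    print(collection.find_one({}, {"text": 1, "embedding": {"$slice": 3}}))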

3

In this section, you create a retrieval system using Atlas Vector Search to get relevant documents from your vector database. Paste and run each of the following code snippets in your notebook:

  1. Create an Atlas Vector Search index on your vector embeddings.

    Run the following code to create the index directly from your application with the PyMongo Driver. This code also includes a polling mechanism to check if the index is ready to use.

    To learn more, see How to Index Fields for Vector Search.

    If you used the nomic-embed-text-v1 embedding model, run the following code, which specifies 768 dimensions:

    from pymongo.operations import SearchIndexModel
    import time

    # Create your index model, then create the search index
    index_name = "vector_index"
    search_index_model = SearchIndexModel(
        definition = {
            "fields": [
                {
                    "type": "vector",
                    "numDimensions": 768,
                    "path": "embedding",
                    "similarity": "cosine"
                }
            ]
        },
        name = index_name,
        type = "vectorSearch"
    )
    collection.create_search_index(model=search_index_model)

    # Wait for initial sync to complete
    print("Polling to check if the index is ready. This may take up to a minute.")
    predicate = lambda index: index.get("queryable") is True
    while True:
        indices = list(collection.list_search_indexes(index_name))
        if len(indices) and predicate(indices[0]):
            break
        time.sleep(5)
    print(index_name + " is ready for querying.")
    If you used the voyage-3-large embedding model, run the following code, which specifies 1024 dimensions:

    from pymongo.operations import SearchIndexModel
    import time

    # Create your index model, then create the search index
    index_name = "vector_index"
    search_index_model = SearchIndexModel(
        definition = {
            "fields": [
                {
                    "type": "vector",
                    "numDimensions": 1024,
                    "path": "embedding",
                    "similarity": "cosine"
                }
            ]
        },
        name = index_name,
        type = "vectorSearch"
    )
    collection.create_search_index(model=search_index_model)

    # Wait for initial sync to complete
    print("Polling to check if the index is ready. This may take up to a minute.")
    predicate = lambda index: index.get("queryable") is True
    while True:
        indices = list(collection.list_search_indexes(index_name))
        if len(indices) and predicate(indices[0]):
            break
        time.sleep(5)
    print(index_name + " is ready for querying.")
  2. Define a function to run vector search queries.

    Run this code to create a retrieval function called get_query_results() that runs a basic vector search query. It uses the get_embedding() function to create embeddings from the search query. Then, it runs the query to return semantically similar documents. Your results might vary depending on the embedding model you use.

    To learn more, see Run Vector Search Queries.

    If you used the nomic-embed-text-v1 embedding model, run the following version:

    # Define a function to run vector search queries
    def get_query_results(query):
        """Gets results from a vector search query."""
        query_embedding = get_embedding(query)
        pipeline = [
            {
                "$vectorSearch": {
                    "index": "vector_index",
                    "queryVector": query_embedding,
                    "path": "embedding",
                    "exact": True,
                    "limit": 5
                }
            }, {
                "$project": {
                    "_id": 0,
                    "text": 1
                }
            }
        ]
        results = collection.aggregate(pipeline)
        array_of_results = []
        for doc in results:
            array_of_results.append(doc)
        return array_of_results

    # Test the function with a sample query
    import pprint
    pprint.pprint(get_query_results("AI technology"))
    [{'text': 'more of our customers. We also see a tremendous opportunity to win '
    'more legacy workloads, as AI has now become a catalyst to modernize '
    'these\n'
    "applications. MongoDB's document-based architecture is "
    'particularly well-suited for the variety and scale of data required '
    'by AI-powered applications.'},
    {'text': 'artificial intelligence, in our offerings or partnerships; the '
    'growth and expansion of the market for database products and our '
    'ability to penetrate that\n'
    'market; our ability to integrate acquired businesses and '
    'technologies successfully or achieve the expected benefits of such '
    'acquisitions; our ability to'},
    {'text': 'MongoDB continues to expand its AI ecosystem with the announcement '
    'of the MongoDB AI Applications Program (MAAP),'},
    {'text': 'which provides customers with reference architectures, pre-built '
    'partner integrations, and professional services to help\n'
    'them quickly build AI-powered applications. Accenture will '
    'establish a center of excellence focused on MongoDB projects,\n'
    'and is the first global systems integrator to join MAAP.'},
    {'text': 'Bendigo and Adelaide Bank partnered with MongoDB to modernize '
    'their core banking technology. With the help of\n'
    'MongoDB Relational Migrator and generative AI-powered modernization '
    'tools, Bendigo and Adelaide Bank decomposed an\n'
    'outdated consumer-servicing application into microservices and '
    'migrated off its underlying legacy relational database'}]
    If you used the voyage-3-large embedding model, run the following version instead, which passes input_type="query" to optimize the query embedding for retrieval:

    # Define a function to run vector search queries
    def get_query_results(query):
        """Gets results from a vector search query."""
        query_embedding = get_embedding(query, input_type="query")
        pipeline = [
            {
                "$vectorSearch": {
                    "index": "vector_index",
                    "queryVector": query_embedding,
                    "path": "embedding",
                    "exact": True,
                    "limit": 5
                }
            }, {
                "$project": {
                    "_id": 0,
                    "text": 1
                }
            }
        ]
        results = collection.aggregate(pipeline)
        array_of_results = []
        for doc in results:
            array_of_results.append(doc)
        return array_of_results

    # Test the function with a sample query
    import pprint
    pprint.pprint(get_query_results("AI technology"))

    The output resembles the output of the previous example.
4

In this section, you generate responses by prompting an LLM to use the retrieved documents as context. For this tutorial, you can use a model from OpenAI or an open-source model from Hugging Face. This code does the following:

  • Uses the get_query_results() function you defined to retrieve relevant documents from your collection.

  • Creates a prompt using the user's question and retrieved documents as context.

  • Prompts the LLM about MongoDB's latest AI announcements. The generated response might vary.

from openai import OpenAI

# Specify search query, retrieve relevant documents, and convert to string
query = "What are MongoDB's latest AI announcements?"
context_docs = get_query_results(query)
context_string = " ".join([doc["text"] for doc in context_docs])

# Construct prompt for the LLM using the retrieved documents as the context
prompt = f"""Use the following pieces of context to answer the question at the end.
{context_string}
Question: {query}
"""

openai_client = OpenAI()

# OpenAI model to use
model_name = "gpt-4o"

completion = openai_client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": prompt}]
)
print(completion.choices[0].message.content)
MongoDB recently announced several developments in its AI ecosystem.
These include the MongoDB AI Applications Program (MAAP), which offers
reference architectures, pre-built partner integrations, and professional
services to help customers efficiently build AI-powered applications.
Accenture is the first global systems integrator to join MAAP and will
establish a center of excellence for MongoDB projects. Additionally,
MongoDB introduced significant updates, including faster performance
in version 8.0 and the general availability of Atlas Stream Processing
to enable real-time, event-driven applications. These advancements
highlight MongoDB's focus on supporting AI-powered applications and
modernizing legacy workloads.
Alternatively, to use an open-source model from Hugging Face, run the following code instead:

from huggingface_hub import InferenceClient

# Specify search query, retrieve relevant documents, and convert to string
query = "What are MongoDB's latest AI announcements?"
context_docs = get_query_results(query)
context_string = " ".join([doc["text"] for doc in context_docs])

# Construct prompt for the LLM using the retrieved documents as the context
prompt = f"""Use the following pieces of context to answer the question at the end.
{context_string}
Question: {query}
"""

# Use a model from Hugging Face
llm = InferenceClient(
    "mistralai/Mixtral-8x22B-Instruct-v0.1",
    provider="fireworks-ai",
    token=os.getenv("HF_TOKEN"))

# Prompt the LLM (this code varies depending on the model you use)
output = llm.chat_completion(
    messages=[{"role": "user", "content": prompt}],
    max_tokens=150
)
print(output.choices[0].message.content)
MongoDB's latest AI announcements include the
MongoDB AI Applications Program (MAAP), a program designed
to help customers build AI-powered applications more efficiently.
Additionally, they have announced significant performance
improvements in MongoDB 8.0, featuring faster reads, updates,
bulk inserts, and time series queries. Another announcement is the
general availability of Atlas Stream Processing to build sophisticated,
event-driven applications with real-time data.

For additional RAG tutorials, see the following resources:

To build AI agents and implement agentic RAG, see Build AI Agents with MongoDB.

To optimize your RAG applications, ensure that you're using a powerful embedding model like Voyage AI to generate high-quality vector embeddings.

Additionally, Atlas Vector Search supports advanced retrieval systems. You can seamlessly index vector data along with your other data in your cluster, which allows you to improve your results by pre-filtering on other fields in your collection or by performing hybrid search, which combines semantic search with full-text search results.
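For example, a pre-filtered vector query might look like the following sketch. It assumes a hypothetical category field stored with each chunk and indexed as an additional "filter" field in the vector index definition; neither exists in this tutorial's data:

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",
            "queryVector": get_embedding("AI technology"),
            "path": "embedding",
            "filter": { "category": "press_release" }, # hypothetical pre-filter field
            "numCandidates": 100,
            "limit": 5
        }
    },
    { "$project": { "_id": 0, "text": 1 } }
]
results = collection.aggregate(pipeline)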

