MongoDB provides several features for building AI agents. As both a vector and document database, MongoDB supports various search methods for agentic RAG and lets you store agent interactions in the same database for short-term and long-term agent memory.
What is an AI agent?
In the context of generative AI, an AI agent typically refers to a system that can complete a task autonomously or semi-autonomously by combining AI models such as LLMs with a set of pre-defined tools.
AI agents can use tools to gather context, interact with external systems, and perform actions. They can define their own execution flow (planning) and remember previous interactions to inform their responses (memory). Therefore, AI agents are best suited for complex tasks that require reasoning, planning, and decision-making.
Architecture
An AI agent typically includes a combination of the following components:
Component | Description
---|---
Perception | Your input for the agent. Text inputs are the most common perception mechanism for AI agents, but inputs can also be audio, images, or multimodal data.
Planning | How the agent determines what to do next. This component typically includes LLMs and prompts, using feedback loops and prompt engineering techniques such as chain-of-thought and ReAct to help the LLM reason through complex tasks. An AI agent can use a single LLM as the decision maker, an LLM with multiple prompts, multiple LLMs working together, or any combination of these approaches.
Tools | How the agent gathers context for a task. Tools allow agents to interact with external systems and perform actions such as vector search, web search, or calling APIs from other services.
Memory | A system for storing agent interactions so that the agent can learn from past experiences to inform its responses. Memory can be short-term (for the current session) or long-term (persisted across sessions).
Note
AI agents vary in design pattern, function, and complexity. To learn about other agent architectures, including multi-agent systems, see Agentic Design Patterns.
Build AI Agents with MongoDB
MongoDB supports the following components for building AI agents:
Tools: Leverage MongoDB search features as tools for your agent to retrieve relevant information and implement agentic RAG.
Memory: Store agent interactions in MongoDB collections for both short-term and long-term memory.
Agent Tools
In the context of AI agents, a tool is anything that can be programmatically defined and invoked by the agent. Tools extend the agent's capabilities beyond generating text, allowing it to interact with external systems, retrieve information, and take actions. Tools are typically defined with a specific interface that includes:
A name and description that help the agent understand when to use the tool.
Required parameters and their expected formats.
A function that performs the actual operation when invoked.
The agent uses its reasoning capabilities to determine which tool to use, when to use it, and what parameters to provide, based on the user's input and the task at hand.
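For illustration, a tool definition in the OpenAI function-calling format might resemble the following sketch. The schema is hypothetical, not a MongoDB API; the name, description, and parameters describe the vector search tool used as an example throughout this page:

```python
# Hypothetical tool definition in the OpenAI function-calling format.
# The name, description, and parameter schema are illustrative only.
vector_search_tool_definition = {
    "type": "function",
    "function": {
        "name": "vector_search_tool",
        "description": "Retrieve documents semantically similar to the query.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Natural-language search query.",
                }
            },
            "required": ["query"],
        },
    },
}
```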
In addition to standard MongoDB queries, MongoDB provides several search capabilities that you can implement as tools for your agent.
MongoDB Vector Search: Perform vector search to retrieve relevant context based on semantic meaning and similarity. To learn more, see MongoDB Vector Search Overview.
MongoDB Search: Perform full-text search to retrieve relevant context based on keyword matching and relevance scoring. To learn more, see MongoDB Search Overview.
Hybrid Search: Combine MongoDB Vector Search with MongoDB Search to leverage the strengths of both approaches (a simple fusion sketch follows this list). To learn more, see How to Perform Hybrid Search.
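As a simple illustration of the hybrid approach, the following minimal sketch fuses two ranked result lists client-side with reciprocal rank fusion. It assumes each search tool returns an ordered list of document `_id` values; in practice you can also combine both searches in a single aggregation pipeline, as described in the hybrid search tutorial:

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge ranked lists of document ids into one fused ranking.

    Each document scores 1 / (k + rank) in every list it appears in;
    k=60 is a conventional smoothing constant.
    """
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical ids ranked by vector search and by full-text search
fused = reciprocal_rank_fusion([
    ["doc3", "doc1", "doc2"],
    ["doc1", "doc4", "doc3"],
])
print(fused)  # ids ordered by combined relevance
```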
You can define tools manually or by using frameworks such as LangChain and LangGraph, which provide built-in abstractions for tool creation and calling.
Tools are defined as functions that the agent can call to perform specific tasks. For example, the following syntax illustrates how you might define a tool that runs a vector search query:
```javascript
async function vectorSearchTool(query) {
  const pipeline = [
    {
      $vectorSearch: {
        // Vector search query pipeline...
      }
    }
  ];
  const results = await collection.aggregate(pipeline).toArray();
  return results;
}
```
```python
def vector_search_tool(query: str) -> list:
    pipeline = [
        {
            "$vectorSearch": {
                # Vector search query pipeline...
            }
        }
    ]
    results = collection.aggregate(pipeline)
    return list(results)
```
The agent executes tools through tool calls. You can define how to process tool calls in your agent, or use a framework to handle this for you. Tool calls are typically defined as JSON objects that include the tool name and the arguments to pass to the tool, so the agent can call the tool with the appropriate parameters. For example, the following syntax illustrates how an agent might call the vector search tool:
{ "tool": "vector_search_tool", "args": { "query": "What is MongoDB?" }, "id": "call_H5TttXb423JfoulF1qVfPN3m" }
Agentic RAG
By using MongoDB as a vector database, you can create retrieval tools that implement agentic RAG, which is an advanced form of RAG that allows you to dynamically orchestrate the retrieval and generation process through an AI agent.
This approach enables more complex workflows and user interactions. For example, you can configure your AI agent to determine the optimal retrieval tool based on the task, such as using MongoDB Vector Search for semantic search and MongoDB Search for full-text search. You can also define different retrieval tools for different collections to further customize the agent's retrieval capabilities.
Agent Memory
Memory for agents involves storing information about previous interactions, so that the agent can learn from past experiences and provide more relevant and personalized responses. This is particularly important for tasks that require context, such as conversational agents, where the agent needs to remember previous turns in the conversation to provide coherent and contextually relevant responses. There are two primary types of agent memory:
Short-term Memory: Stores information for the current session, like recent conversation turns and active task context.
Long-term Memory: Persists information across sessions, which can include past conversations and personalized preferences over time.
Since MongoDB is also a document database, you can implement agent memory by storing the agent's interactions in a MongoDB collection. The agent can then query or update this collection as needed. There are several ways to implement agent memory with MongoDB:

For short-term memory, you might include a `session_id` field to identify a specific session when storing interactions, and then query for interactions with the same ID to pass to the agent as context.

For long-term memory, you might process several interactions with an LLM to extract relevant information, such as user preferences or important context, and then store this information in a separate collection that the agent can query when needed.
To build robust memory management systems that enable more efficient and complex retrieval of conversation histories, leverage MongoDB Search or MongoDB Vector Search to store, index, and query important interactions across sessions.
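For example, the following is a minimal sketch of semantic retrieval over long-term memories. It assumes an illustrative `long_term_memory` collection whose documents store a `text` summary and an `embedding` field, plus a vector search index named `memory_index` in which `user_id` is indexed as a filter field; replace the placeholder embedding helper with your embedding model:

```python
from pymongo import MongoClient

# Illustrative connection and namespace; adjust to your deployment
client = MongoClient("mongodb://localhost:27017")
memories = client["ai_agent_db"]["long_term_memory"]

def get_embedding(text: str) -> list:
    # Placeholder: call your embedding model here
    # (for example, Voyage AI, as shown in the tutorial below)
    raise NotImplementedError

def retrieve_relevant_memories(user_id: str, query: str, limit: int = 3) -> list:
    """Return the stored memories most semantically similar to the query."""
    pipeline = [
        {
            "$vectorSearch": {
                "index": "memory_index",         # illustrative index name
                "queryVector": get_embedding(query),
                "path": "embedding",
                "filter": {"user_id": user_id},  # user_id must be indexed as a filter field
                "numCandidates": 50,
                "limit": limit,
            }
        },
        {"$project": {"_id": 0, "text": 1}},
    ]
    return list(memories.aggregate(pipeline))
```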
A document in a collection that stores short-term memory might resemble the following:
{ "session_id": "123", "user_id": "jane_doe", "interactions": [ { "role": "user", "content": "What is MongoDB?", "timestamp": "2025-01-01T12:00:00Z" }, { "role": "assistant", "content": "MongoDB is the world's leading modern database.", "timestamp": "2025-01-01T12:00:05Z" } ] }
A document in a collection that stores long-term memory might resemble the following:
{ "user_id": "jane_doe", "last_updated": "2025-05-22T09:15:00Z", "preferences": { "conversation_tone": "casual", "custom_instructions": [ "I prefer concise answers." ], }, "facts": [ { "interests": ["AI", "MongoDB"], } ] }
The following frameworks also provide direct abstractions for agent memory with MongoDB:
Framework | Features
---|---
LangChain | To learn more, see the tutorial.
LangGraph | To learn more, see LangGraph and LangGraph.js.
Get Started
The following tutorial demonstrates how to build an AI agent using MongoDB for agentic RAG and memory, without an agent framework.
This tutorial is available in both JavaScript and Python. The JavaScript version is shown first, followed by the Python version.
Work with a runnable version of this tutorial as a Python notebook.
Prerequisites
To complete this tutorial, you must have the following:
One of the following MongoDB cluster types:
An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
A local Atlas deployment created using the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
A Voyage AI API key.
An OpenAI API key.
Note
This tutorial uses models from Voyage AI and OpenAI, but you can modify the code to use your models of choice.
Procedure
This AI agent can be used to answer questions about a custom data source and perform calculations. It can also remember previous interactions to inform its responses. It uses the following components:
Perception: Text inputs.
Planning: An LLM and various prompts to reason through the task.
Tools: A vector search tool and calculator tool.
Memory: Stores the interactions in a MongoDB collection.
Set up the environment.
Initialize the project and install dependencies.
Create a new project directory, then install the required dependencies:
```sh
mkdir mongodb-ai-agent
cd mongodb-ai-agent
npm init -y
npm install --quiet dotenv mongodb voyageai openai langchain @langchain/community @langchain/core mathjs pdf-parse node-fetch
```

Note
Your project will use the following structure:
```
mongodb-ai-agent
├── .env
├── config.js
├── ingest-data.js
├── tools.js
├── memory.js
├── planning.js
└── index.js
```

Configure the environment.
Create an environment file named `.env` in your project. This file contains your API keys for the agent, the MongoDB connection string, and the MongoDB database and collection names. Replace the placeholder values with your MongoDB connection string and your Voyage AI and OpenAI API keys.
```
MONGODB_URI="<mongodb-connection-string>"
VOYAGE_API_KEY="<voyage-api-key>"
OPENAI_API_KEY="<openai-api-key>"
```

Note
Replace `<mongodb-connection-string>` with the connection string for your Atlas cluster or local Atlas deployment.

For an Atlas cluster, your connection string should use the following format:
```
mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
```

To learn more, see Connect to a Cluster via Drivers.
For a local Atlas deployment, your connection string should use the following format:
```
mongodb://localhost:<port-number>/?directConnection=true
```

To learn more, see Connection Strings.
Configure the agent.
Create a file named `config.js` in your project. This file reads in your environment variables and connects the application to services like the MongoDB database and OpenAI.
```javascript
import dotenv from 'dotenv';
import { MongoClient } from 'mongodb';
import OpenAI from "openai";

// Load environment variables from .env file
dotenv.config();

// MongoDB cluster configuration
export const MONGODB_URI = process.env.MONGODB_URI;
export const mongoClient = new MongoClient(MONGODB_URI);
export const agentDb = mongoClient.db("ai_agent_db");
export const vectorCollection = agentDb.collection("embeddings");
export const memoryCollection = agentDb.collection("chat_history");

// Model configuration
export const OPENAI_MODEL = "gpt-4o";
export const VOYAGE_MODEL = "voyage-3-large";
export const VOYAGE_API_KEY = process.env.VOYAGE_API_KEY;

// Initialize the OpenAI client
export const openAIClient = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
```
Use MongoDB as a vector database.
Create a file named ingest-data.js
in your project. This script
ingests a sample PDF that contains a recent MongoDB earnings report into a collection
in MongoDB by using the voyage-3-large
embedding model. This
code also includes a function to create a vector search index on your
data if it doesn't already exist.
To learn more, see Ingestion.
```javascript
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { vectorCollection, VOYAGE_API_KEY, VOYAGE_MODEL, MONGODB_URI } from "./config.js";
import { VoyageAIClient } from "voyageai";
import * as fs from 'fs';
import fetch from 'node-fetch';

console.log("Connecting to MongoDB:", MONGODB_URI);

const EMBEDDING_DIMENSIONS = 1024;

// Use the Voyage AI client SDK to get embeddings
export async function getEmbedding(data, input_type) {
  if (!VOYAGE_API_KEY) {
    throw new Error("VOYAGE_API_KEY is not set in environment variables.");
  }
  try {
    const client = new VoyageAIClient({ apiKey: VOYAGE_API_KEY });
    const response = await client.embed({
      input: [data],
      model: VOYAGE_MODEL,
      input_type: input_type // "document" or "query"
    });
    if (response.data && response.data.length > 0) {
      return response.data[0].embedding;
    }
    throw new Error("No embedding data found from Voyage AI response.");
  } catch (error) {
    console.error("Error generating Voyage AI embedding:", error);
    return null;
  }
}

// Ingest data from a PDF, generate embeddings, and store in MongoDB
export async function ingestData() {
  try {
    // Download the PDF
    const rawData = await fetch("https://investors.mongodb.com/node/13176/pdf");
    const pdfBuffer = await rawData.arrayBuffer();
    fs.writeFileSync("investor-report.pdf", Buffer.from(pdfBuffer));

    // Load and split the PDF
    const loader = new PDFLoader("investor-report.pdf");
    const data = await loader.load();
    const textSplitter = new RecursiveCharacterTextSplitter({
      chunkSize: 400,
      chunkOverlap: 20,
    });
    const docs = await textSplitter.splitDocuments(data);
    console.log(`Chunked PDF into ${docs.length} documents.`);

    // Generate embeddings and insert the documents
    const insertDocuments = await Promise.all(docs.map(async doc => ({
      document: doc,
      embedding: await getEmbedding(doc.pageContent, "document"),
    })));
    const result = await vectorCollection.insertMany(insertDocuments, { ordered: false });
    console.log("Inserted documents:", result.insertedCount);
  } catch (err) {
    console.error("Ingestion error:", err);
  }
}

// Create a vector search index
export async function createVectorIndex() {
  try {
    // Check if the index already exists
    const existingIndexes = await vectorCollection.listSearchIndexes().toArray();
    if (existingIndexes.some(index => index.name === "vector_index")) {
      console.log("Vector index already exists. Skipping creation.");
      return;
    }
    // Define your Vector Search index
    const index = {
      name: "vector_index",
      type: "vectorSearch",
      definition: {
        "fields": [
          {
            "type": "vector",
            "path": "embedding",
            "numDimensions": EMBEDDING_DIMENSIONS,
            "similarity": "cosine",
          }
        ]
      }
    }
    // Run the helper method to ensure the index is created
    const result = await vectorCollection.createSearchIndex(index);
    console.log(`New index named ${result} is building.`);

    // Wait for the index to be ready to query
    console.log("Polling to check if the index is ready. This may take up to a minute.")
    let isQueryable = false;
    while (!isQueryable) {
      const cursor = vectorCollection.listSearchIndexes();
      for await (const index of cursor) {
        if (index.name === result) {
          if (index.queryable) {
            console.log(`${result} is ready for querying.`);
            isQueryable = true;
          } else {
            await new Promise(resolve => setTimeout(resolve, 5000));
          }
        }
      }
    }
  } catch (err) {
    console.error("Error creating vector index:", err);
    throw err;
  }
}
```
Define tools for the agent.
Create a file named `tools.js` in your project. This file defines the tools that the agent can use to answer questions. In this example, you define the following tools:

`vectorSearchTool`: Runs a vector search query to retrieve relevant documents from your collection.
`calculatorTool`: Uses the `mathjs` library for basic math operations.
```javascript
import { getEmbedding } from './ingest-data.js';
import { vectorCollection } from './config.js';
import { evaluate } from 'mathjs';

// Vector search tool
export async function vectorSearchTool(userInput) {
  const queryEmbedding = await getEmbedding(userInput, "query");
  const pipeline = [
    {
      $vectorSearch: {
        index: "vector_index",
        queryVector: queryEmbedding,
        path: "embedding",
        exact: true,
        limit: 5
      }
    },
    {
      $project: {
        _id: 0,
        "document.pageContent": 1
      }
    }
  ];
  const cursor = vectorCollection.aggregate(pipeline);
  const results = await cursor.toArray();
  return results;
}

// Simple calculator tool
export function calculatorTool(userInput) {
  try {
    const result = evaluate(userInput);
    return String(result);
  } catch (e) {
    return `Error: ${e.message}`;
  }
}
```
Add memory to the agent.
Create a file named `memory.js` in your project. This file defines the system that the agent uses to store its interactions. In this example, you implement short-term memory by defining the following functions:

`storeChatMessage`: Stores information about an interaction in a MongoDB collection.
`retrieveSessionHistory`: Gets all interactions for a specific session by using the `session_id` field.
```javascript
import { memoryCollection } from './config.js';

/**
 * Store a chat message in the memory collection.
 * @param {string} sessionId - unique identifier for the chat session
 * @param {string} role - role of the sender (user or system)
 * @param {string} content - content of the message
 */
export async function storeChatMessage(sessionId, role, content) {
  const message = {
    session_id: sessionId,
    role,
    content,
    timestamp: new Date(), // use a JS date for the timestamp
  };
  await memoryCollection.insertOne(message);
}

/**
 * Retrieve the chat history for a session.
 * @param {string} sessionId - unique identifier for the chat session
 * @returns {Promise<Array<{role: string, content: string}>>}
 */
export async function retrieveSessionHistory(sessionId) {
  const cursor = memoryCollection
    .find({ session_id: sessionId })
    .sort({ timestamp: 1 });
  const messages = [];
  await cursor.forEach(msg => {
    messages.push({ role: msg.role, content: msg.content });
  });
  return messages;
}
```
Define the agent's planning.
Create a file named `planning.js` in your project. This file includes various prompts and LLM calls to determine the agent's execution flow. In this example, you define the following functions:

`openAIChatCompletion`: Helper function that calls the OpenAI API to generate responses.
`toolSelector`: Determines how the LLM selects the appropriate tool for a task.
`generateResponse`: Orchestrates the agent's execution flow by using tools, calling the LLM, and processing the results.
`getLlmResponse`: Helper function for LLM response generation.
```javascript
import { vectorSearchTool, calculatorTool } from './tools.js';
import { storeChatMessage, retrieveSessionHistory } from './memory.js';
import { openAIClient, OPENAI_MODEL } from './config.js';

// OpenAI chat completion helper
export async function openAIChatCompletion(messages) {
  try {
    const completion = await openAIClient.chat.completions.create({
      model: OPENAI_MODEL,
      messages,
      max_tokens: 1024,
    });
    return completion.choices[0].message.content;
  } catch (error) {
    console.error("Error in openAIChatCompletion:", error);
    throw error;
  }
}

// Tool selector function that determines which tool to use based on user input and session history
export async function toolSelector(userInput, sessionHistory = []) {
  const systemPrompt = `
Select the appropriate tool from the options below. Consider the full context of the conversation before deciding.

Tools available:
- vector_search_tool: Retrieve specific context about recent MongoDB earnings and announcements
- calculator_tool: For mathematical operations
- none: For general questions without additional context

Process for making your decision:
1. Analyze if the current question relates to or follows up on a previous vector search query
2. For follow-up questions, incorporate context from previous exchanges to create a comprehensive search query
3. Only use calculator_tool for explicit mathematical operations
4. Default to none only when certain the other tools won't help

When continuing a conversation:
- Identify the specific topic being discussed
- Include relevant details from previous exchanges
- Formulate a query that stands alone but preserves conversation context

Return a JSON object only: {"tool": "selected_tool", "input": "your_query"}
`.trim();

  const messages = [
    { role: "system", content: systemPrompt },
    ...sessionHistory,
    { role: "user", content: userInput }
  ];
  try {
    const response = await openAIChatCompletion(messages);
    let toolCall;
    try {
      toolCall = JSON.parse(response);
    } catch {
      try {
        toolCall = eval(`(${response})`);
      } catch {
        return { tool: "none", input: userInput };
      }
    }
    return {
      tool: toolCall.tool || "none",
      input: toolCall.input || userInput
    };
  } catch (err) {
    console.error("Error in toolSelector:", err);
    return { tool: "none", input: userInput };
  }
}

// Function to get the LLM response based on messages and system message content
async function getLlmResponse(messages, systemMessageContent) {
  const systemMessage = { role: "system", content: systemMessageContent };
  let fullMessages;
  if (messages.some(msg => msg.role === "system")) {
    fullMessages = [...messages, systemMessage];
  } else {
    fullMessages = [systemMessage, ...messages];
  }
  const response = await openAIChatCompletion(fullMessages);
  return response;
}

// Function to generate a response based on user input
export async function generateResponse(sessionId, userInput) {
  await storeChatMessage(sessionId, "user", userInput);
  const sessionHistory = await retrieveSessionHistory(sessionId);
  const llmInput = [...sessionHistory, { role: "user", content: userInput }];

  const { tool, input: toolInput } = await toolSelector(userInput, sessionHistory);
  console.log("Tool selected:", tool);

  let response;
  if (tool === "vector_search_tool") {
    const contextResults = await vectorSearchTool(toolInput);
    const context = contextResults
      .map(doc => doc.document?.pageContent || JSON.stringify(doc))
      .join('\n---\n');
    const systemMessageContent = `
Answer the user's question based on the retrieved context and conversation history.
1. First, understand what specific information the user is requesting
2. Then, locate the most relevant details in the context provided
3. Finally, provide a clear, accurate response that directly addresses the question

If the current question builds on previous exchanges, maintain continuity in your answer.
Only state facts clearly supported by the provided context. If information is not available, say 'I DON'T KNOW'.

Context:
${context}
`.trim();
    response = await getLlmResponse(llmInput, systemMessageContent);
  } else if (tool === "calculator_tool") {
    response = calculatorTool(toolInput);
  } else {
    const systemMessageContent = "You are a helpful assistant. Respond to the user's prompt as best as you can based on the conversation history.";
    response = await getLlmResponse(llmInput, systemMessageContent);
  }

  await storeChatMessage(sessionId, "system", response);
  return response;
}
```
Test the agent.
Finally, create a file named `index.js` in your project. This file runs the agent and allows you to interact with it.
```javascript
import readline from 'readline';
import { mongoClient } from './config.js';
import { ingestData, createVectorIndex } from './ingest-data.js';
import { generateResponse } from './planning.js';

// Prompt for user input
async function prompt(question) {
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  return new Promise(resolve => rl.question(question, answer => {
    rl.close();
    resolve(answer);
  }));
}

async function main() {
  try {
    await mongoClient.connect();
    const runIngest = await prompt("Ingest sample data? (y/n): ");
    if (runIngest.trim().toLowerCase() === 'y') {
      await ingestData();
      console.log("\nAttempting to create/verify Vector Search Index...");
      await createVectorIndex();
    } else {
      await createVectorIndex(); // ensure the index exists even if not ingesting data
    }
    const sessionId = await prompt("Enter a session ID: ");
    while (true) {
      const userQuery = await prompt("\nEnter your query (or type 'quit' to exit): ");
      if (userQuery.trim().toLowerCase() === 'quit') break;
      if (!userQuery.trim()) {
        console.log("Query cannot be empty. Please try again.");
        continue;
      }
      const answer = await generateResponse(sessionId, userQuery);
      console.log("\nAnswer:");
      console.log(answer);
    }
  } finally {
    await mongoClient.close();
  }
}

main();
```
Save your project, then run the following command. When you run the agent:
If you haven't already, instruct the agent to ingest the sample data.
Enter a session ID to start a new session or continue an existing session.
Ask questions. The agent generates a response based on your tools, the previous interactions, and the prompts defined in the planning phase.
Refer to the example output for a sample interaction:
```sh
node index.js
```
```
Ingest sample data? (y/n): y
Chunked PDF into 100 documents.
Inserted documents: 100

Attempting to create/verify Vector Search Index...
New index named vector_index is building.
Polling to check if the index is ready. This may take up to a minute.
vector_index is ready for querying.
Enter a session ID: 123

Enter your query (or type 'quit' to exit): What was MongoDB's latest acquisition?
Tool selected: vector_search_tool

Answer:
MongoDB recently acquired Voyage AI, a pioneer in embedding and reranking models that power next-generation AI applications.

Enter your query (or type 'quit' to exit): What do they do?
Tool selected: vector_search_tool

Answer:
Voyage AI is a company that specializes in state-of-the-art embedding and reranking models designed to power next-generation AI applications. These technologies help organizations build more advanced and trustworthy AI capabilities.

Enter your query (or type 'quit' to exit): What is 123+456?
Tool selected: calculator_tool

Answer:
579
```
Tip
If you're using Atlas, you can verify your embeddings and interactions by navigating to the `ai_agent_db.embeddings` and `ai_agent_db.chat_history` namespaces in the Atlas UI.
Continue building.
Now that you have a basic AI agent, you can continue developing it by:
Improving the performance of your vector search tools and fine-tuning your RAG pipelines.
Adding more tools to the agent, such as hybrid or full-text search tools.
Refining the planning phase by using more advanced prompts and LLM calls.
Implementing long-term memory and more advanced memory systems by using MongoDB Search and MongoDB Vector Search to store and retrieve important interactions across sessions.
Set up the environment.
Initialize the project and install dependencies.
Create a new project directory, then install the required dependencies:
```sh
mkdir mongodb-ai-agent
cd mongodb-ai-agent
pip install --quiet --upgrade pymongo voyageai openai langchain langchain-mongodb langchain-community pypdf python-dotenv
```

Note
Your project will use the following structure:
```
mongodb-ai-agent
├── .env
├── config.py
├── ingest_data.py
├── tools.py
├── memory.py
├── planning.py
└── main.py
```

Configure the environment.
Create an environment file named `.env` in your project. This file contains your API keys for the agent, the MongoDB connection string, and the MongoDB database and collection names. Replace the placeholder values with your MongoDB connection string and your Voyage AI and OpenAI API keys.
```
MONGODB_URI="<mongodb-connection-string>"
VOYAGE_API_KEY="<voyage-api-key>"
OPENAI_API_KEY="<openai-api-key>"
```

Note
Replace `<mongodb-connection-string>` with the connection string for your Atlas cluster or local Atlas deployment.

For an Atlas cluster, your connection string should use the following format:
```
mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
```

To learn more, see Connect to a Cluster via Drivers.
For a local Atlas deployment, your connection string should use the following format:
```
mongodb://localhost:<port-number>/?directConnection=true
```

To learn more, see Connection Strings.
Configure the agent.
Create a file named `config.py` in your project. This file reads in your environment variables and connects the application to services like the MongoDB database and OpenAI.
```python
from pymongo import MongoClient
from openai import OpenAI
import voyageai
from dotenv import load_dotenv
import os

# Load environment variables from .env file
load_dotenv()

# Environment variables (private)
MONGODB_URI = os.getenv("MONGODB_URI")
VOYAGE_API_KEY = os.getenv("VOYAGE_API_KEY")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# MongoDB cluster configuration
mongo_client = MongoClient(MONGODB_URI)
agent_db = mongo_client["ai_agent_db"]
vector_collection = agent_db["embeddings"]
memory_collection = agent_db["chat_history"]

# Model configuration
voyage_client = voyageai.Client(api_key=VOYAGE_API_KEY)
openai_client = OpenAI(api_key=OPENAI_API_KEY)
VOYAGE_MODEL = "voyage-3-large"
OPENAI_MODEL = "gpt-4o"
```
Use MongoDB as a vector database.
Create a file named `ingest_data.py` in your project. This script ingests a sample PDF that contains a recent MongoDB earnings report into a collection in MongoDB by using the `voyage-3-large` embedding model. This code also includes a function to create a vector search index on your data if it doesn't already exist.
To learn more, see Ingestion.
```python
from config import vector_collection, voyage_client, VOYAGE_MODEL
from pymongo.operations import SearchIndexModel
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
import time

# Define a function to generate embeddings
def get_embedding(data, input_type="document"):
    embeddings = voyage_client.embed(
        data, model=VOYAGE_MODEL, input_type=input_type
    ).embeddings
    return embeddings[0]

# --- Ingest embeddings into MongoDB ---
def ingest_data():
    # Chunk the PDF data
    loader = PyPDFLoader("https://investors.mongodb.com/node/13176/pdf")
    data = loader.load()
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=20)
    documents = text_splitter.split_documents(data)
    print(f"Successfully split PDF into {len(documents)} chunks.")

    # Ingest the chunked documents into the collection
    print("Generating embeddings and ingesting documents...")
    docs_to_insert = []
    for doc in documents:
        embedding = get_embedding(doc.page_content)
        if embedding:
            docs_to_insert.append({
                "text": doc.page_content,
                "embedding": embedding
            })
    if docs_to_insert:
        result = vector_collection.insert_many(docs_to_insert)
        print(f"Inserted {len(result.inserted_ids)} documents into the collection.")
    else:
        print("No documents were inserted. Check embedding generation process.")

    # --- Create the vector search index ---
    index_name = "vector_index"
    search_index_model = SearchIndexModel(
        definition={
            "fields": [
                {
                    "type": "vector",
                    "numDimensions": 1024,
                    "path": "embedding",
                    "similarity": "cosine"
                }
            ]
        },
        name=index_name,
        type="vectorSearch"
    )
    try:
        vector_collection.create_search_index(model=search_index_model)
        print(f"Search index '{index_name}' creation initiated.")
    except Exception as e:
        print(f"Error creating search index: {e}")
        return

    # Wait for the initial sync to complete
    print("Polling to check if the index is ready. This may take up to a minute.")
    predicate = lambda index: index.get("queryable") is True
    while True:
        indices = list(vector_collection.list_search_indexes(index_name))
        if len(indices) and predicate(indices[0]):
            break
        time.sleep(5)
    print(index_name + " is ready for querying.")
```
Define tools for the agent.
Create a file named `tools.py` in your project. This file defines the tools that the agent can use to answer questions. In this example, you define the following tools:

`vector_search_tool`: Runs a vector search query to retrieve relevant documents from your collection.
`calculator_tool`: Uses the `eval()` function for basic math operations.
```python
from config import vector_collection
from ingest_data import get_embedding

# Define a vector search tool
def vector_search_tool(user_input: str) -> list:
    query_embedding = get_embedding(user_input)
    pipeline = [
        {
            "$vectorSearch": {
                "index": "vector_index",
                "queryVector": query_embedding,
                "path": "embedding",
                "exact": True,
                "limit": 5
            }
        },
        {
            "$project": {
                "_id": 0,
                "text": 1
            }
        }
    ]
    results = vector_collection.aggregate(pipeline)
    return list(results)

# Define a simple calculator tool
def calculator_tool(user_input: str) -> str:
    try:
        result = eval(user_input)
        return str(result)
    except Exception as e:
        return f"Error: {str(e)}"
```
Add memory to the agent.
Create a file named `memory.py` in your project. This file defines the system that the agent uses to store its interactions. In this example, you implement short-term memory by defining the following functions:

`store_chat_message`: Stores information about an interaction in a MongoDB collection.
`retrieve_session_history`: Gets all interactions for a specific session by using the `session_id` field.
```python
from config import memory_collection
from datetime import datetime
from typing import List

def store_chat_message(session_id: str, role: str, content: str) -> None:
    message = {
        "session_id": session_id,      # Unique identifier for the chat session
        "role": role,                  # Role of the sender (user or system)
        "content": content,            # Content of the message
        "timestamp": datetime.now(),   # Timestamp of when the message was sent
    }
    memory_collection.insert_one(message)

def retrieve_session_history(session_id: str) -> List:
    # Query the collection for messages with a specific "session_id" in ascending order
    cursor = memory_collection.find({"session_id": session_id}).sort("timestamp", 1)
    # Return a list of objects with the message role and content
    return [{"role": msg["role"], "content": msg["content"]} for msg in cursor]
```
Define the agent's planning.
Create a file named `planning.py` in your project. This file includes various prompts and LLM calls to determine the agent's execution flow. In this example, you define the following functions:

`tool_selector`: Determines how the LLM selects the appropriate tool for a task.
`generate_response`: Orchestrates the agent's execution flow by using tools, calling the LLM, and processing the results.
`get_llm_response`: Helper function for LLM response generation.
```python
from config import openai_client, OPENAI_MODEL
from tools import vector_search_tool, calculator_tool
from memory import store_chat_message, retrieve_session_history

# Define a tool selector function that decides which tool to use based on user input and message history
def tool_selector(user_input, session_history=None):
    messages = [
        {
            "role": "system",
            "content": (
                "Select the appropriate tool from the options below. Consider the full context of the conversation before deciding.\n\n"
                "Tools available:\n"
                "- vector_search_tool: Retrieve specific context about recent MongoDB earnings and announcements\n"
                "- calculator_tool: For mathematical operations\n"
                "- none: For general questions without additional context\n"
                "Process for making your decision:\n"
                "1. Analyze if the current question relates to or follows up on a previous vector search query\n"
                "2. For follow-up questions, incorporate context from previous exchanges to create a comprehensive search query\n"
                "3. Only use calculator_tool for explicit mathematical operations\n"
                "4. Default to none only when certain the other tools won't help\n\n"
                "When continuing a conversation:\n"
                "- Identify the specific topic being discussed\n"
                "- Include relevant details from previous exchanges\n"
                "- Formulate a query that stands alone but preserves conversation context\n\n"
                "Return a JSON object only: {\"tool\": \"selected_tool\", \"input\": \"your_query\"}"
            )
        }
    ]
    if session_history:
        messages.extend(session_history)
    messages.append({"role": "user", "content": user_input})

    response = openai_client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=messages
    ).choices[0].message.content
    try:
        tool_call = eval(response)
        return tool_call.get("tool"), tool_call.get("input")
    except Exception:
        return "none", user_input

# Define the agent workflow
def generate_response(session_id: str, user_input: str) -> str:
    # Store the user input in the chat history collection
    store_chat_message(session_id, "user", user_input)

    # Initialize a list of inputs to pass to the LLM
    llm_input = []

    # Retrieve the session history for the current session and add it to the LLM input
    session_history = retrieve_session_history(session_id)
    llm_input.extend(session_history)

    # Append the user message in the correct format
    user_message = {
        "role": "user",
        "content": user_input
    }
    llm_input.append(user_message)

    # Call the tool_selector function to determine which tool to use
    tool, tool_input = tool_selector(user_input, session_history)
    print("Tool selected: ", tool)

    # Process based on the selected tool
    if tool == "vector_search_tool":
        context = vector_search_tool(tool_input)
        # Construct the system prompt using the retrieved context
        system_message_content = (
            f"Answer the user's question based on the retrieved context and conversation history.\n"
            f"1. First, understand what specific information the user is requesting\n"
            f"2. Then, locate the most relevant details in the context provided\n"
            f"3. Finally, provide a clear, accurate response that directly addresses the question\n\n"
            f"If the current question builds on previous exchanges, maintain continuity in your answer.\n"
            f"Only state facts clearly supported by the provided context. If information is not available, say 'I DON'T KNOW'.\n\n"
            f"Context:\n{context}"
        )
        response = get_llm_response(llm_input, system_message_content)
    elif tool == "calculator_tool":
        # Perform the calculation using the calculator tool
        response = calculator_tool(tool_input)
    else:
        system_message_content = "You are a helpful assistant. Respond to the user's prompt as best as you can based on the conversation history."
        response = get_llm_response(llm_input, system_message_content)

    # Store the system response in the chat history collection
    store_chat_message(session_id, "system", response)
    return response

# Helper function to get the LLM response
def get_llm_response(messages, system_message_content):
    # Add the system message to the messages list
    system_message = {
        "role": "system",
        "content": system_message_content,
    }
    # If the messages already include a system message, append the new one at the
    # end (for context-based queries); otherwise, put it at the beginning
    if any(msg.get("role") == "system" for msg in messages):
        messages.append(system_message)
    else:
        messages = [system_message] + messages

    # Get the response from the LLM
    response = openai_client.chat.completions.create(
        model=OPENAI_MODEL,
        messages=messages
    ).choices[0].message.content
    return response
```
Test the agent.
Finally, create a file named `main.py` in your project. This file runs the agent and allows you to interact with it.
```python
from config import mongo_client
from ingest_data import ingest_data
from planning import generate_response

if __name__ == "__main__":
    try:
        run_ingest = input("Ingest sample data? (y/n): ")
        if run_ingest.lower() == 'y':
            ingest_data()
        session_id = input("Enter a session ID: ")
        while True:
            user_query = input("\nEnter your query (or type 'quit' to exit): ")
            if user_query.lower() == 'quit':
                break
            if not user_query.strip():
                print("Query cannot be empty. Please try again.")
                continue
            answer = generate_response(session_id, user_query)
            print("\nAnswer:")
            print(answer)
    finally:
        mongo_client.close()
```
Save your project, then run the following command. When you run the agent:
If you haven't already, instruct the agent to ingest the sample data.
Enter a session ID to start a new session or continue an existing session.
Ask questions. The agent generates a response based on your tools, the previous interactions, and the prompts defined in the planning phase.
Refer to the example output for a sample interaction:
```sh
python main.py
```
```
Ingest sample data? (y/n): y
Successfully split PDF into 104 chunks.
Generating embeddings and ingesting documents...
Inserted 104 documents into the collection.
Search index 'vector_index' creation initiated.
Polling to check if the index is ready. This may take up to a minute.
vector_index is ready for querying.
Enter a session ID: 123

Enter your query (or type 'quit' to exit): What was MongoDB's latest acquisition?
Tool selected: vector_search_tool

Answer:
MongoDB's latest acquisition was Voyage AI.

Enter your query (or type 'quit' to exit): What do they do?
Tool selected: vector_search_tool

Answer:
Voyage AI is a company that specializes in state-of-the-art embedding and reranking models designed to power next-generation AI applications. These technologies help organizations build more advanced and trustworthy AI capabilities.

Enter your query (or type 'quit' to exit): What is 123+456?
Tool selected: calculator_tool

Answer:
579
```
Tip
If you're using Atlas, you can verify your embeddings and interactions by navigating to the `ai_agent_db.embeddings` and `ai_agent_db.chat_history` namespaces in the Atlas UI.
Continue building.
Now that you have a basic AI agent, you can continue developing it by:
Improving the performance of your vector search tools and fine-tuning your RAG pipelines.
Adding more tools to the agent, such as hybrid or full-text search tools (a full-text search tool is sketched after this list).
Refining the planning phase by using more advanced prompts and LLM calls.
Implementing long-term memory and more advanced memory systems by using MongoDB Search and MongoDB Vector Search to store and retrieve important interactions across sessions.
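For example, a full-text search tool could follow the same pattern as `vector_search_tool` in `tools.py`. The following minimal sketch assumes you've created a MongoDB Search index named `search_index` on the same collection; the index name is illustrative:

```python
from config import vector_collection

# Full-text search tool: retrieve documents whose text matches the query keywords
def full_text_search_tool(user_input: str) -> list:
    pipeline = [
        {
            "$search": {
                "index": "search_index",  # illustrative MongoDB Search index name
                "text": {
                    "query": user_input,
                    "path": "text",
                },
            }
        },
        {"$limit": 5},
        {"$project": {"_id": 0, "text": 1}},
    ]
    return list(vector_collection.aggregate(pipeline))
```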
Tutorials
For more tutorials on building AI agents with MongoDB, see the tutorials covering frameworks, enterprise platforms, and additional resources.