MongoDB Developer Center

Boosting AI: Build Your Chatbot Over Your Data With MongoDB Atlas Vector Search and LangChain Templates Using the RAG Pattern

Arek Borucki6 min read • Published Dec 12, 2023 • Updated Jan 16, 2024
AI • Atlas • Search • Python
In this tutorial, I will show you the simplest way to implement an AI chatbot-style application using MongoDB Atlas Vector Search with LangChain Templates and the retrieval-augmented generation (RAG) pattern for more precise chat responses.

Retrieval-augmented generation (RAG) pattern

The retrieval-augmented generation (RAG) pattern enhances LLMs by supplementing them with additional, relevant data, grounding their responses for business use. Through vector search, RAG identifies and retrieves pertinent documents from a database and sends them as context to the LLM along with the query, improving the quality of the LLM's response. This approach reduces inaccuracies by anchoring responses in factual content and keeps responses relevant by drawing on the most current data. RAG also makes efficient use of an LLM's fixed token limit by passing only the most relevant documents into the response process.
Vector Store Diagram
To elaborate on the implementation: Consider a question-answering system that uses an LLM like OpenAI, operating with the RAG model. It starts with Atlas Vector Search to pinpoint relevant documents or text snippets within a database, providing the necessary context for the question. This context, along with the question, is then processed through OpenAI's API, enabling a more informed and accurate response.
Atlas Vector Search plays a vital role for developers within the retrieval-augmented generation framework. A key technology for funneling external data into LLMs is LangChain. This framework facilitates the development of applications that integrate LLMs, covering a range of uses that align with the capabilities of language models themselves. These uses encompass tasks like document analysis and summarization, the operation of chatbots, and code analysis.
MongoDB has streamlined the process for developers to integrate AI into their applications by teaming up with LangChain for the introduction of LangChain Templates. This collaboration has produced a retrieval-augmented generation template that capitalizes on the strengths of MongoDB Atlas Vector Search along with OpenAI's technologies. The template offers a developer-friendly approach to crafting and deploying chatbot applications tailored to specific data sets. The LangChain templates serve as a deployable reference framework, accessible as a REST API via LangServe.
The alliance has also been instrumental in showcasing the latest Atlas Vector Search advancements, notably the $vectorSearch aggregation stage, now embedded within LangChain's Python and JavaScript offerings. The joint venture is committed to ongoing development, with plans to unveil more templates. These future additions are intended to further accelerate developers' abilities to realize and launch their creative projects.

LangChain Templates

LangChain Templates offer a selection of reference architectures designed for quick deployment, available to any user. They introduce a system for creating, sharing, updating, obtaining, and customizing diverse chains and agents. The templates follow a uniform format for smooth integration with LangServe, enabling the swift deployment of production-ready APIs. Additionally, they provide a free sandbox for experimentation and development.
The rag-mongo template is specifically designed to perform retrieval-augmented generation utilizing MongoDB and OpenAI technologies. We will take a closer look at the rag-mongo template in the following section of this tutorial.

Using LangChain RAG templates

To get started, you only need to install the langchain-cli.
Use the LangChain CLI to bootstrap a LangServe project quickly. The application will be named my-blog-article, and the template to use must also be specified; for this tutorial, that is rag-mongo.
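Concretely, this comes down to two short commands; the app name my-blog-article follows the choice made above:

```shell
# Install the LangChain CLI (a recent Python and pip are assumed).
pip install -U langchain-cli

# Bootstrap a new LangServe project from the rag-mongo template.
langchain app new my-blog-article --package rag-mongo
```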
This will create a new directory called my-blog-article with two folders:
  • app: This is where LangServe code will live.
  • packages: This is where your chains or agents will live.
Now, it is necessary to modify the my-blog-article/app/server.py file by adding the following code:
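A minimal sketch of that server.py addition, following the rag-mongo template's documentation (the import path assumes the package was installed into the project as above):

```python
from fastapi import FastAPI
from langserve import add_routes

# The chain exported by the rag-mongo package created in the previous step.
from rag_mongo import chain as rag_mongo_chain

app = FastAPI()

# Expose the rag-mongo chain as a REST endpoint under /rag-mongo.
add_routes(app, rag_mongo_chain, path="/rag-mongo")

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)
```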
We will need to insert data into MongoDB Atlas. In this exercise, we use a publicly accessible PDF document titled "MongoDB Atlas Best Practices" as the data source for constructing a text-searchable vector space. The data will be ingested into the MongoDB langchain.vectorSearch namespace.
To do this, navigate to the my-blog-article/packages/rag-mongo directory and, in the ingest.py file, change the default names of the MongoDB database and collection. Additionally, modify the URL of the document you wish to use for generating embeddings.
My ingest.py is located on GitHub. Note that if you change the database and collection name in ingest.py, you also need to change it in rag_mongo/chain.py. My chain.py is also located on GitHub. Next, export your OpenAI API Key and MongoDB Atlas URI.
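For example (MONGO_URI is the variable name the rag-mongo template reads; the values below are placeholders):

```shell
export OPENAI_API_KEY="<your OpenAI API key>"
export MONGO_URI="mongodb+srv://<user>:<password>@<cluster>.mongodb.net/"
```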
Creating and inserting embeddings into MongoDB Atlas using LangChain templates is very easy. You just need to run the ingest.py script. It first loads a document from the specified URL using PyPDFLoader, then splits the text into manageable chunks using RecursiveCharacterTextSplitter. Finally, the script uses the OpenAI Embeddings API to generate an embedding for each chunk and inserts the chunks into the MongoDB Atlas langchain.vectorSearch namespace.
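A condensed sketch of what ingest.py does, assuming the rag-mongo defaults (database langchain, collection vectorSearch, index "default") and the LangChain package layout current at the time of writing; the PDF URL is a placeholder for the document you chose above:

```python
import os

from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import MongoDBAtlasVectorSearch
from pymongo import MongoClient

# Placeholder: point this at the PDF you want to make searchable.
PDF_URL = "https://example.com/mongodb-atlas-best-practices.pdf"

# Load the PDF and split it into overlapping chunks.
docs = PyPDFLoader(PDF_URL).load()
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# Target the langchain.vectorSearch namespace in Atlas.
client = MongoClient(os.environ["MONGO_URI"])
collection = client["langchain"]["vectorSearch"]

# Embed each chunk with OpenAI and insert chunk + vector into Atlas.
MongoDBAtlasVectorSearch.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(),
    collection=collection,
    index_name="default",
)
```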
Now, it's time to initialize Atlas Vector Search. We will do this through the Atlas UI. In the Atlas UI, choose Search and then Create Search Index. Then choose the JSON Editor to declare the index parameters as well as the database and collection where Atlas Vector Search will be established (langchain.vectorSearch). Set the index name to default. My index definition is presented below.
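An index definition along these lines works with this setup; the 1536 dimensions assume OpenAI's text-embedding-ada-002 model, and knnVector was the mapping type used by Atlas Vector Search at the time of writing:

```json
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "dimensions": 1536,
        "similarity": "cosine",
        "type": "knnVector"
      }
    }
  }
}
```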
A detailed procedure is available on GitHub.
Let's now take a closer look at the central component of the LangChain rag-mongo template: the chain.py script. This script uses the MongoDBAtlasVectorSearch class to create an object, vectorstore, that interfaces with Atlas Vector Search for semantic similarity searches. A retriever is then configured from vectorstore to perform these searches, with the search type set to "similarity."
This configuration ensures the most contextually relevant document is retrieved from the database. Upon retrieval, the script merges this document with a user's query and leverages the ChatOpenAI class to process the input through OpenAI's GPT models, crafting a coherent answer. To further enhance this process, the ChatOpenAI class is initialized with the gpt-3.5-turbo-16k-0613 model, chosen for its optimal performance. The temperature is set to 0, promoting consistent, deterministic outputs for a streamlined and precise user experience.
This class permits tailoring API requests, offering control over retry attempts, token limits, and response temperature. It adeptly manages multiple response generations, response caching, and callback operations. Additionally, it facilitates asynchronous tasks to streamline response generation and incorporates metadata and tagging for comprehensive API run tracking.
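The retrieval chain in chain.py can be sketched as follows, under the same assumptions as before (database langchain, collection vectorSearch, index "default"); the prompt wording is illustrative rather than a copy of the template's:

```python
import os

from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough
from langchain.vectorstores import MongoDBAtlasVectorSearch
from pymongo import MongoClient

client = MongoClient(os.environ["MONGO_URI"])
collection = client["langchain"]["vectorSearch"]

# Vector store backed by the Atlas Vector Search index created earlier.
vectorstore = MongoDBAtlasVectorSearch(
    collection=collection,
    embedding=OpenAIEmbeddings(),
    index_name="default",
)
retriever = vectorstore.as_retriever(search_type="similarity")

prompt = ChatPromptTemplate.from_template(
    "Answer the question based only on the following context:\n"
    "{context}\n\nQuestion: {question}"
)
model = ChatOpenAI(model="gpt-3.5-turbo-16k-0613", temperature=0)

# Retrieved documents fill {context}; the user's query passes through
# unchanged as {question}; the model's reply is parsed to a plain string.
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
```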

LangServe Playground

After successfully creating and storing embeddings in MongoDB Atlas, you can start utilizing the LangServe Playground by executing the langchain serve command, which grants you access to your chatbot.
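Run the command from the project directory created earlier:

```shell
cd my-blog-article
langchain serve
```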
This will start the FastAPI application, with a server running locally at http://127.0.0.1:8000. All templates can be viewed at http://127.0.0.1:8000/docs, and the playground can be accessed at http://127.0.0.1:8000/rag-mongo/playground/.
The chatbot will answer questions about best practices for using MongoDB Atlas with the help of context provided through vector search. Questions on other topics will not be considered by the chatbot.
Go to http://127.0.0.1:8000/rag-mongo/playground/ and start using your template! You can ask questions related to MongoDB Atlas in the chat.
LangServe Playground
By expanding the Intermediate steps menu, you can trace the entire process of formulating a response to your question. This process encompasses searching for the most pertinent documents related to your query and forwarding them to the OpenAI API to serve as the context for the query. This methodology aligns with the RAG pattern, wherein relevant documents are retrieved to furnish context for generating a well-informed response to a specific inquiry.
We can also use curl to interact with LangServe REST API and contact endpoints, such as /rag-mongo/invoke:
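LangServe's /invoke endpoint accepts a JSON body with an "input" field; the question below is only an example:

```shell
curl -X POST http://127.0.0.1:8000/rag-mongo/invoke \
  -H "Content-Type: application/json" \
  -d '{"input": "How can I secure a MongoDB Atlas cluster?"}'
```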
We can also send batch requests to the API using the /rag-mongo/batch endpoint, for example:
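For batch calls, LangServe expects an "inputs" array; each element is processed as a separate query:

```shell
curl -X POST http://127.0.0.1:8000/rag-mongo/batch \
  -H "Content-Type: application/json" \
  -d '{"inputs": ["What backup options does Atlas offer?", "How does Atlas handle scaling?"]}'
```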
For comprehensive documentation and further details, please visit http://127.0.0.1:8000/docs.

Summary

In this article, we've explored the synergy of MongoDB Atlas Vector Search with LangChain Templates and the RAG pattern to significantly improve chatbot response quality. By implementing these tools, developers can ensure their AI chatbots deliver highly accurate and contextually relevant answers. Step into the future of chatbot technology by applying the insights and instructions provided here. Elevate your AI and engage users like never before. Don't just build chatbots — craft intelligent conversational experiences. Start now with MongoDB Atlas and LangChain!
