Introducing Semantic Caching and a Dedicated MongoDB LangChain Package for Gen AI Apps
We are in an unprecedented time in history where developers can build transformative AI applications quickly, without being AI experts themselves. This ability is enabling new classes of applications that can better serve customers with conversational AI for assistance and automation, advanced reasoning and analysis using AI-powered retrieval, and recommendation systems.
Behind this revolution are large language models (LLMs) that can be prompted to address a wide range of use cases. However, LLMs have limitations, such as knowledge cutoffs and a tendency to hallucinate. To overcome these limitations, they must be integrated with proprietary enterprise data sources to build reliable, relevant, and high-quality generative AI applications. That’s where MongoDB plays a critical role in the modern generative AI stack.
Developers use MongoDB Atlas Vector Search as a vital part of the generative AI technique known as retrieval-augmented generation (RAG). RAG is the process of feeding LLMs the supplementary data necessary to ground their responses, ensuring they're dependable and precise. LangChain has been a critical part of this journey since the public launch of Atlas Vector Search, enabling developers to build better retriever systems powered by vector search and store conversation history in the operational database.
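For a sense of what this looks like in code, here is a minimal sketch of a retriever backed by Atlas Vector Search; the connection string, namespace, and index name are placeholders for your own deployment, and OpenAI embeddings are just one possible embedding model:

```python
# A minimal sketch of a retriever backed by MongoDB Atlas Vector Search.
# The connection string, namespace, and index name are placeholders.
from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_openai import OpenAIEmbeddings

vector_store = MongoDBAtlasVectorSearch.from_connection_string(
    "mongodb+srv://<user>:<password>@<cluster>.mongodb.net",
    namespace="my_db.my_collection",  # "<database>.<collection>"
    embedding=OpenAIEmbeddings(),
    index_name="vector_index",  # the Atlas Vector Search index to query
)

# Retrieve the top-4 most semantically similar documents for a query,
# ready to be passed to an LLM prompt as grounding context.
retriever = vector_store.as_retriever(search_kwargs={"k": 4})
docs = retriever.invoke("How do I rotate my database credentials?")
```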
Today, we are excited to announce support for two enhancements:
Semantic cache powered by Atlas Vector Search, which improves the performance of your apps
A dedicated LangChain-MongoDB package for Python and JS/TS developers, enabling them to build advanced applications even more efficiently
The MongoDB Atlas integration with LangChain can now power all the database requirements for building modern generative AI applications: vector search, semantic caching (currently only available in Python), and conversation history.
Earlier, we announced the launch of MongoDB LangChain Templates, which enable developers to quickly deploy RAG applications. We provided a reference implementation of a basic RAG template using MongoDB Atlas Vector Search and OpenAI, and a more advanced Parent-document Retrieval RAG template using MongoDB Atlas Vector Search. We are excited about our partnership with LangChain and will continue innovating.
Improve LLM application performance with semantic cache
A semantic cache improves the performance of LLM applications by caching responses based on the semantic meaning or context within the queries themselves. This is different from a traditional cache, which works on exact keyword matching. In the era of LLMs, the value of semantic caching is increasing tremendously, enabling sophisticated user experiences that closely mimic human interactions. For example, if two different users enter the prompts “give me suggestions for a comedy movie” and “recommend a comedy movie,” a semantic cache can understand that the intent behind the queries is the same and return a similar response, even though different keywords are used, whereas a traditional cache would fail.
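Enabling the semantic cache is a one-time setup step. Below is a minimal sketch, assuming placeholder connection details, an existing Atlas Vector Search index on the cache collection, and OpenAI embeddings as one possible embedding model:

```python
# A minimal sketch: register MongoDB Atlas as LangChain's global LLM cache.
# The URI, database/collection names, and index name are placeholders.
from langchain_core.globals import set_llm_cache
from langchain_mongodb.cache import MongoDBAtlasSemanticCache
from langchain_openai import OpenAIEmbeddings

set_llm_cache(
    MongoDBAtlasSemanticCache(
        connection_string="mongodb+srv://<user>:<password>@<cluster>.mongodb.net",
        embedding=OpenAIEmbeddings(),
        database_name="langchain_cache",
        collection_name="semantic_cache",
        index_name="default",  # Atlas Vector Search index on this collection
    )
)

# From here on, LLM calls made through LangChain check the cache for a
# semantically similar prompt before invoking the model again.
```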
Figure 1: Semantic cache using MongoDB Atlas Vector Search
Check out this video walkthrough for the semantic cache:
Accelerate development with a dedicated package
With a dedicated LangChain-MongoDB package, MongoDB is even more deeply integrated with LangChain. The Python and JavaScript packages contain the following LangChain integrations: MongoDBAtlasVectorSearch (Vector stores) and MongoDBChatMessageHistory (Chat Messages Memory). In addition, the Python package includes MongoDBAtlasSemanticCache (LLM Caching).
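For example, the chat message history integration persists each user session’s conversation in a MongoDB collection. Here is a minimal sketch, assuming placeholder connection details, session id, and database/collection names:

```python
# A minimal sketch of persisting per-session conversation history.
# The URI, session id, and database/collection names are placeholders.
from langchain_mongodb import MongoDBChatMessageHistory

history = MongoDBChatMessageHistory(
    connection_string="mongodb+srv://<user>:<password>@<cluster>.mongodb.net",
    session_id="user-42",  # one conversation stream per session id
    database_name="langchain",
    collection_name="chat_histories",
)

history.add_user_message("Recommend a comedy movie.")
history.add_ai_message("How about a classic screwball comedy?")
print(history.messages)  # messages are read back from MongoDB
```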
The new package langchain-mongodb contains all the MongoDB-specific implementations and needs to be installed separately from langchain, which includes all the core abstractions. Earlier, everything was in the same package, making it challenging to version the integrations correctly and to communicate which version should be used and whether any breaking changes were made.
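In practice, the split mainly changes installation and import paths. Here is a sketch of the before and after, assuming your existing imports came from the shared langchain_community package:

```python
# Install the dedicated package:
#   pip install -U langchain-mongodb

# Before the split (assumed prior import paths, via langchain_community):
# from langchain_community.vectorstores import MongoDBAtlasVectorSearch
# from langchain_community.chat_message_histories import MongoDBChatMessageHistory

# After the split, MongoDB-specific implementations are versioned on their own:
from langchain_mongodb import MongoDBAtlasVectorSearch, MongoDBChatMessageHistory
```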
Find out more about the langchain-mongodb package:
Python: Source code, LangChain docs, MongoDB docs
JavaScript: Source code, LangChain.js docs, MongoDB docs
Get started today
Check out this accompanying tutorial and notebook on building advanced RAG with MongoDB and LangChain, which contains a walkthrough and use cases for semantic cache, vector search, and chat message history.
Check out the “PDFtoChat” app to see langchain-mongodb JS in action. It allows you to have a conversation with your proprietary PDFs using AI and is built with MongoDB Atlas, LangChain.js, and TogetherAI. It’s an end-to-end SaaS-in-a-box app and includes user authentication, saving PDFs, and saving chats per PDF.
Read the excellent overview of semantic caching using LangChain and MongoDB.
March 20, 2024