
Add Memory and Semantic Caching with LangChain and MongoDB

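This tutorial demonstrates how to enhance your RAG applications by adding conversation memory and semantic caching using the LangChain MongoDB integration.

Memory allows you to maintain conversation context across multiple user interactions.
Semantic caching reduces response latency by returning cached responses for semantically similar queries.

Work with a runnable version of this tutorial as a Python notebook.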

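Prerequisites

Before you begin, ensure that you have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created using the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
A Voyage AI API key. To create an account and API key, see the Voyage AI website.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
An environment to run interactive Python notebooks, such as Colab.

Tip

We recommend that you complete the Get Started tutorial before this one to learn how to create a naive RAG implementation.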

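Use MongoDB as a Vector Store

In this section, you create a vector store instance that uses your MongoDB cluster as a vector database.

Set up the environment.

Set up the environment for this tutorial. Create an interactive Python notebook by saving a file with the .ipynb extension. This notebook allows you to run Python code snippets individually, and you use it to run the code in this tutorial.

To set up your notebook environment:

Run the following command in your notebook: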

pip install --quiet --upgrade langchain langchain-community langchain-core langchain-mongodb langchain-voyageai langchain-openai pypdf

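Set environment variables.
Run the following code to set the environment variables for this tutorial. Provide your Voyage AI API key, your OpenAI API key, and the SRV connection string for your MongoDB cluster.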
```
import os
os.environ["OPENAI_API_KEY"] = "<openai-key>"
os.environ["VOYAGE_API_KEY"] = "<voyage-key>"
MONGODB_URI = "<connection-string>"
```
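Note
Replace <connection-string> with the connection string for your Atlas cluster or local deployment.
For an Atlas cluster, your connection string should use the following format:
mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
To learn more, see Connect to a Cluster via Drivers.
For a local deployment, your connection string should use the following format:
mongodb://localhost:<port-number>/?directConnection=true
To learn more, see Connection Strings.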
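
Before connecting, it can help to confirm which of the two formats a connection string uses. The following is a minimal sketch that relies only on the Python standard library; the helper name connection_string_format is hypothetical and is not part of PyMongo or LangChain.

```python
from urllib.parse import urlparse


def connection_string_format(uri: str) -> str:
    """Classify a MongoDB connection string by its scheme (illustrative helper)."""
    scheme = urlparse(uri).scheme
    if scheme == "mongodb+srv":
        return "srv"  # SRV format, shown above for Atlas clusters
    if scheme == "mongodb":
        return "standard"  # standard format, shown above for local deployments
    # Catches an unreplaced placeholder such as "<connection-string>"
    raise ValueError(f"Not a MongoDB connection string: {uri!r}")
```

For example, calling connection_string_format(MONGODB_URI) while the placeholder is still in place raises a ValueError, which surfaces the mistake before any network call is made.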

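Instantiate the vector store.

Paste and run the following code in your notebook to create a vector store instance named vector_store that uses the langchain_db.rag_with_memory namespace in MongoDB: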

from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_voyageai import VoyageAIEmbeddings
# Use the voyage-3-large embedding model
embedding_model = VoyageAIEmbeddings(model="voyage-3-large")
# Create the vector store
vector_store = MongoDBAtlasVectorSearch.from_connection_string(
   connection_string = MONGODB_URI,
   embedding = embedding_model,
   namespace = "langchain_db.rag_with_memory"
)

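Add data to the vector store.

Paste and run the following code in your notebook to ingest a sample PDF that contains a recent MongoDB earnings report into the vector store.

This code uses a text splitter to chunk the PDF data into smaller parent documents. It specifies the chunk size (number of characters) and chunk overlap (number of overlapping characters between consecutive chunks) for each document.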

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Load the PDF
loader = PyPDFLoader("https://investors.mongodb.com/node/13176/pdf")
data = loader.load()
# Split PDF into documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
docs = text_splitter.split_documents(data)
# Add data to the vector store
vector_store.add_documents(docs)
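
To make the chunk size and chunk overlap settings concrete, here is a simplified sketch of fixed-size character chunking. It is not the algorithm that RecursiveCharacterTextSplitter uses (that splitter also tries to break on separators such as paragraphs and sentences); the function name chunk_text is hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size chunks that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=2)
print(chunks)  # ['abcdefghij', 'ijklmnopqr', 'qrstuvwxyz']
```

Each chunk repeats the last two characters of the previous one, which is how overlap keeps content that straddles a chunk boundary available in both chunks.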

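Tip

After you run this code, if you're using Atlas, you can verify your vector embeddings by navigating to the langchain_db.rag_with_memory namespace in the Atlas UI.

Create the MongoDB Vector Search index.

Run the following code to create the MongoDB Vector Search index for the vector store, which enables vector search on your data: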

# Use LangChain helper method to create the vector search index
vector_store.create_vector_search_index(
   dimensions = 1024 # The dimensions of the vector embeddings to be indexed
)

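Tip

See the create_vector_search_index API reference.

The index takes about one minute to build. While it builds, the index is in an initial sync state. When it finishes building, you can start querying the data in your collection.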
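
Rather than waiting a fixed amount of time, you could poll until the index is ready. The following is a generic sketch using only the standard library; wait_until is a hypothetical helper, and the readiness check shown in the trailing comment assumes a PyMongo collection handle and an index named vector_index, neither of which this tutorial defines explicitly.

```python
import time
from typing import Callable


def wait_until(is_ready: Callable[[], bool], timeout: float = 120.0, interval: float = 5.0) -> bool:
    """Call is_ready repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False  # gave up before the condition became true
        time.sleep(interval)


# Possible readiness check (assumption: `collection` is a PyMongo collection
# for langchain_db.rag_with_memory and the index is named "vector_index"):
#
# def index_is_queryable() -> bool:
#     indexes = list(collection.list_search_indexes("vector_index"))
#     return bool(indexes) and indexes[0].get("queryable") is True
#
# wait_until(index_is_queryable)
```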

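Implement RAG with Memory

This section demonstrates how to implement RAG with conversation memory by using the LangChain MongoDB integration.

Define a function to get chat message history.

To maintain conversation history across multiple interactions, use the MongoDBChatMessageHistory class. It allows you to store chat messages in a MongoDB database and extend them to your RAG chain to handle conversation context.

Paste and run the following code in your notebook to create a function named get_session_history that returns a MongoDBChatMessageHistory instance. This instance retrieves the chat history for a specific session.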

from langchain_mongodb.chat_message_histories import MongoDBChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import MessagesPlaceholder
def get_session_history(session_id: str) -> MongoDBChatMessageHistory:
    return MongoDBChatMessageHistory(
        connection_string=MONGODB_URI,
        session_id=session_id,
        database_name="langchain_db",
        collection_name="rag_with_memory"
    )

Create a RAG chain that handles chat message history.

Paste and run the following code snippets to create the RAG chain:

Specify the LLM to use.
```
from langchain_openai import ChatOpenAI
# Define the model to use for chat completion
llm = ChatOpenAI(model = "gpt-4o")
```

Define a prompt that summarizes the chat history for the retriever.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
# Create a prompt to generate standalone questions from follow-up questions
standalone_system_prompt = """
Given a chat history and a follow-up question, rephrase the follow-up question to be a standalone question.
Do NOT answer the question, just reformulate it if needed, otherwise return it as is.
Only return the final standalone question.
"""
standalone_question_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", standalone_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)
# Parse output as a string
parse_output = StrOutputParser()
question_chain = standalone_question_prompt | llm | parse_output

Build a retriever chain that processes the chat history and retrieves documents.

from langchain_core.runnables import RunnablePassthrough
# Create a retriever
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={ "k": 5 })
# Create a retriever chain that processes the question with history and retrieves documents
retriever_chain = RunnablePassthrough.assign(context=question_chain | retriever | (lambda docs: "\n\n".join([d.page_content for d in docs])))
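
The last step of the retriever chain is a lambda that flattens the retrieved documents into one context string. The sketch below isolates that step so you can see its output; Doc is a hypothetical stand-in for LangChain's Document class that keeps only the page_content attribute, and join_context is an illustrative name.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Minimal stand-in for a retrieved document (illustrative only)."""
    page_content: str


def join_context(docs) -> str:
    """Join document contents with blank lines, as the lambda in the chain does."""
    return "\n\n".join(d.page_content for d in docs)


context = join_context([Doc("MongoDB acquired Voyage AI."), Doc("Revenue grew year over year.")])
print(context)
```

Separating documents with a blank line keeps chunk boundaries visible to the LLM when the string is substituted for {context} in the prompt.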

Define a prompt that generates an answer based on the chat history and the retrieved context.

# Create a prompt template that includes the retrieved context and chat history
rag_system_prompt = """Answer the question based only on the following context:
{context}
"""
rag_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", rag_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

Implement RAG with memory.

Combine the components that you defined into a complete RAG chain:

# Build the RAG chain
rag_chain = (
    retriever_chain
    | rag_prompt
    | llm
    | parse_output
)
# Wrap the chain with message history
rag_with_memory = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
)
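
Conceptually, wrapping a chain with message history means loading the session's earlier messages, passing them to the chain along with the new question, and then saving the new question and answer. The following is a simplified, in-memory sketch of that flow, not the actual implementation of RunnableWithMessageHistory; invoke_with_history, toy_chain, and the dictionary-based store are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


def invoke_with_history(
    chain: Callable[[str, List[Message]], str],
    store: Dict[str, List[Message]],
    session_id: str,
    question: str,
) -> str:
    """Run the chain with the session's prior messages, then record the new turn."""
    history = store.setdefault(session_id, [])
    answer = chain(question, list(history))  # the chain sees only earlier turns
    history.append(("human", question))
    history.append(("ai", answer))
    return answer


def toy_chain(question: str, history: List[Message]) -> str:
    """Stand-in for the RAG chain: reports how much history it received."""
    return f"{len(history)} prior messages"


store: Dict[str, List[Message]] = {}
print(invoke_with_history(toy_chain, store, "user_1", "What was the latest acquisition?"))
print(invoke_with_history(toy_chain, store, "user_1", "Why did they do it?"))
```

Because history is keyed by session ID, a question asked under a different session ID starts with an empty history.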

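Test your RAG implementation.

Invoke the chain to answer questions. The chain maintains the conversation context and returns relevant answers that take previous interactions into account. Your responses might vary.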

# First question
response_1 = rag_with_memory.invoke(
    {"question": "What was MongoDB's latest acquisition?"},
    {"configurable": {"session_id": "user_1"}}
)
print(response_1)

MongoDB's latest acquisition was Voyage AI, a pioneer in state-of-the-art embedding and reranking models for next-generation AI applications.

# Follow-up question that references the previous question
response_2 = rag_with_memory.invoke(
    {"question": "Why did they do it?"},
    {"configurable": {"session_id": "user_1"}}
)
print(response_2)

MongoDB acquired Voyage AI to enable organizations to easily build trustworthy AI applications by integrating advanced embedding and reranking models into their technology. This acquisition aligns with MongoDB's goal of helping businesses innovate at "AI speed" using its flexible document model and seamless scalability.

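Add Semantic Caching

This section adds semantic caching on top of your RAG chain. Semantic caching is a form of caching that retrieves cached prompts based on the semantic similarity between queries.

Note

You can use semantic caching independently of conversation memory, but in this tutorial you use both features together.

For a video tutorial on this feature, see Learn by Watching.

Configure the semantic cache.

Run the following code to configure the semantic cache by using the MongoDBAtlasSemanticCache class: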

from langchain_mongodb.cache import MongoDBAtlasSemanticCache
from langchain_core.globals import set_llm_cache
# Configure the semantic cache
set_llm_cache(MongoDBAtlasSemanticCache(
    connection_string = MONGODB_URI,
    database_name = "langchain_db",
    collection_name = "semantic_cache",
    embedding = embedding_model,
    index_name = "vector_index",
    similarity_threshold = 0.5  # Adjust based on your requirements
))
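
To illustrate what a similarity threshold does, the following toy sketch stores (embedding, response) pairs in memory and returns a cached response only when the best cosine similarity meets the threshold. It is an assumption-laden illustration, not how MongoDBAtlasSemanticCache is implemented: the real class embeds prompts with your embedding model and queries a MongoDB Vector Search index. ToySemanticCache and the hand-written two-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToySemanticCache:
    """In-memory cache that matches prompts by embedding similarity."""

    def __init__(self, similarity_threshold: float):
        self.similarity_threshold = similarity_threshold
        self._entries = []  # list of (embedding, response)

    def update(self, embedding, response: str) -> None:
        self._entries.append((list(embedding), response))

    def lookup(self, embedding):
        """Return the best-matching response, or None on a cache miss."""
        best_score, best_response = -1.0, None
        for cached_embedding, response in self._entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.similarity_threshold else None


cache = ToySemanticCache(similarity_threshold=0.9)
cache.update([1.0, 0.0], "Voyage AI")
print(cache.lookup([0.9, 0.1]))  # similar direction: cache hit
print(cache.lookup([0.0, 1.0]))  # orthogonal vector: cache miss
```

A lower threshold produces more cache hits but raises the risk of returning an answer to a question that only loosely resembles the cached one.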

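Test the semantic cache with your RAG chain.

The semantic cache automatically caches your prompts. Run the following sample queries; you should see a significant reduction in response time for the second query. Your responses and response times might vary.

Tip

You can view your cached prompts in the semantic_cache collection. The semantic cache caches only the input to the LLM. When you use it in retrieval chains, note that the retrieved documents can change between runs, which results in cache misses for semantically similar queries.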

%%time
# First query (not cached)
rag_with_memory.invoke(
  {"question": "What was MongoDB's latest acquisition?"},
  {"configurable": {"session_id": "user_2"}}
)

CPU times: user 54.7 ms, sys: 34.2 ms, total: 88.9 ms
Wall time: 7.42 s
"MongoDB's latest acquisition was Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation AI applications."

%%time
# Second query (cached)
rag_with_memory.invoke(
  {"question": "What company did MongoDB acquire recently?"},
  {"configurable": {"session_id": "user_2"}}
)

CPU times: user 79.7 ms, sys: 24 ms, total: 104 ms
Wall time: 3.87 s
'MongoDB recently acquired Voyage AI.'
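
The %%time magic works only in notebook environments. If you run the same comparison in a plain Python script, a small timing wrapper like the following sketch could stand in for it; the helper name timed is hypothetical, and the commented usage assumes the rag_with_memory chain defined earlier.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Possible usage with the chain from this tutorial:
#
# answer, seconds = timed(
#     rag_with_memory.invoke,
#     {"question": "What company did MongoDB acquire recently?"},
#     {"configurable": {"session_id": "user_2"}},
# )
# print(f"{seconds:.2f} s: {answer}")
```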

Learn by Watching

Watch this video tutorial to learn more about semantic caching with LangChain and MongoDB.

Duration: 30 minutes
