
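Add Memory and Semantic Caching with LangChain and MongoDB

This tutorial shows how to enhance a RAG application by adding conversation memory and semantic caching with the LangChain MongoDB integration.

Memory lets you maintain chat context across multiple user interactions.
Semantic caching reduces response latency by caching semantically similar queries.

Work with a runnable version of this tutorial as a Python notebook.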

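Prerequisites

Before you begin, ensure that you have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created using the Atlas CLI. To learn more, see "Create a Local Atlas Deployment".
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
A Voyage AI API key. To create an account and API key, see the Voyage AI website.
An OpenAI API key. You need an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
An environment to run interactive Python notebooks, such as Colab.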

Tip

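Before you work through this tutorial, we recommend completing the "Get Started" tutorial to learn how to create a naive RAG implementation.

Use MongoDB as a Vector Store

In this section, you create a vector store instance that uses your MongoDB cluster as a vector database.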

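Set up the environment.

Set up the environment for this tutorial. Create an interactive Python notebook by saving a file with the .ipynb extension. A notebook lets you run Python code snippets individually, and you use it to run the code in this tutorial.

To set up your notebook environment, follow these steps.

Run the following command in your notebook: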

pip install --quiet --upgrade langchain langchain-community langchain-core langchain-mongodb langchain-voyageai langchain-openai pypdf

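Set environment variables.
Run the following code to set the environment variables for this tutorial. Provide your Voyage AI API key, your OpenAI API key, and the SRV connection string for your MongoDB cluster.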
```
import os
os.environ["OPENAI_API_KEY"] = "<openai-key>"
os.environ["VOYAGE_API_KEY"] = "<voyage-key>"
MONGODB_URI = "<connection-string>"
```
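Note
Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.
For an Atlas cluster, the connection string should use the following format:
mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
To learn more, see "Connect to a Cluster via Drivers".
For a local Atlas deployment, the connection string should use the following format:
mongodb://localhost:<port-number>/?directConnection=true
To learn more, see "Connection Strings".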
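
Before connecting, it can help to check that the value you pasted matches one of the two formats above. The following is a minimal sketch using only the Python standard library; `describe_connection_string` is a hypothetical helper written for this illustration, not part of PyMongo or LangChain, and it checks only the scheme, the host, and leftover placeholders.

```python
from urllib.parse import urlsplit


def describe_connection_string(uri: str) -> str:
    """Return "srv" or "standard" for a MongoDB URI, or raise ValueError.

    Hypothetical helper for illustration only. It does not contact the
    cluster and does not validate credentials or options.
    """
    # Catch values where a <placeholder> was never replaced.
    if "<" in uri or ">" in uri:
        raise ValueError("connection string still contains a <placeholder>")

    parts = urlsplit(uri)
    if parts.scheme == "mongodb+srv":
        kind = "srv"          # Atlas cluster format
    elif parts.scheme == "mongodb":
        kind = "standard"     # for example, a local deployment
    else:
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")

    if not parts.hostname:
        raise ValueError("connection string has no host")
    return kind


print(describe_connection_string("mongodb://localhost:27017/?directConnection=true"))
```

In the notebook, you could call `describe_connection_string(MONGODB_URI)` right after setting the variable to catch a missing or half-edited value early.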

Instantiate the vector store.

Paste and run the following code in your notebook to create a vector store instance named vector_store that uses the langchain_db.rag_with_memory namespace in MongoDB:

from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_voyageai import VoyageAIEmbeddings
# Use the voyage-3-large embedding model
embedding_model = VoyageAIEmbeddings(model="voyage-3-large")
# Create the vector store
vector_store = MongoDBAtlasVectorSearch.from_connection_string(
   connection_string = MONGODB_URI,
   embedding = embedding_model,
   namespace = "langchain_db.rag_with_memory"
)

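Add data to the vector store.

Paste and run the following code in your notebook to ingest a sample PDF that contains a recent MongoDB earnings report into the vector store.

This code uses a text splitter to chunk the PDF data into smaller parent documents. It specifies the chunk size (number of characters) and the chunk overlap (number of characters shared by consecutive chunks) for each document.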

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Load the PDF
loader = PyPDFLoader("https://investors.mongodb.com/node/13176/pdf")
data = loader.load()
# Split PDF into documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
docs = text_splitter.split_documents(data)
# Add data to the vector store
vector_store.add_documents(docs)
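
To make the two splitter parameters concrete, the following is a simplified sketch of fixed-size chunking with overlap. It is not how RecursiveCharacterTextSplitter works internally (that class also tries to break on separators such as paragraphs and newlines, and treats the chunk size as an upper bound); `chunk_text` is a hypothetical function used only to show how chunk size and overlap relate.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters.

    Simplified illustration: real splitters usually prefer natural
    boundaries (paragraphs, sentences) over hard character offsets.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1))
# ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one, which is the overlap. A larger overlap preserves more context across chunk boundaries at the cost of storing and embedding more text.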

Tip

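If you're using Atlas, you can verify your vector embeddings after running this code by navigating to the langchain_db.rag_with_memory namespace in the Atlas UI.

Create the MongoDB Vector Search index.

Run the following code to create the MongoDB Vector Search index for the vector store and enable vector search on your data: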

# Use LangChain helper method to create the vector search index
vector_store.create_vector_search_index(
   dimensions = 1024 # The dimensions of the vector embeddings to be indexed
)

Tip

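create_vector_search_index API reference

The index takes about one minute to build. While it builds, the index is in an initial sync state. When the build finishes, you can start querying the data in your collection.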
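
Because queries return nothing useful until the index finishes building, a notebook may want to wait rather than guess. The following is a minimal polling sketch; `wait_until` is a hypothetical helper, and the commented usage assumes PyMongo 4.5 or later, an index named "vector_index", and a `collection` attribute on the vector store. Verify those assumptions against your installed versions.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool],
    timeout_s: float = 120.0,
    interval_s: float = 5.0,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Hypothetical usage in this tutorial's notebook (assumptions noted above):
#
#   def index_is_queryable() -> bool:
#       indexes = list(vector_store.collection.list_search_indexes("vector_index"))
#       return bool(indexes) and indexes[0].get("queryable", False)
#
#   if not wait_until(index_is_queryable):
#       print("Index is still building; try again shortly.")
```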

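Implement RAG with Memory

This section shows how to implement RAG with conversation memory by using the LangChain MongoDB integration.

Define a function to get chat message history.

To maintain conversation history across multiple interactions, use the MongoDBChatMessageHistory class. It lets you store chat messages in a MongoDB database and pass them to your RAG chain so that the chain can handle conversation context.

Paste and run the following code in your notebook to create a function named get_session_history that returns a MongoDBChatMessageHistory instance. This instance retrieves the chat history for a specific session.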

from langchain_mongodb.chat_message_histories import MongoDBChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import MessagesPlaceholder
def get_session_history(session_id: str) -> MongoDBChatMessageHistory:
    return MongoDBChatMessageHistory(
        connection_string=MONGODB_URI,
        session_id=session_id,
        database_name="langchain_db",
        collection_name="rag_with_memory"
    )
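
Conceptually, the function above maps a session ID to that session's own list of messages. The following in-memory sketch illustrates that idea; `InMemoryHistoryStore` is a hypothetical stand-in for illustration, not how MongoDBChatMessageHistory is implemented, and its messages are lost when the process exits.

```python
from collections import defaultdict


class InMemoryHistoryStore:
    """Toy session-keyed message store (illustration only)."""

    def __init__(self) -> None:
        # Maps session_id -> ordered list of {"role": ..., "content": ...}
        self._messages: dict[str, list[dict[str, str]]] = defaultdict(list)

    def add_message(self, session_id: str, role: str, content: str) -> None:
        self._messages[session_id].append({"role": role, "content": content})

    def get_messages(self, session_id: str) -> list[dict[str, str]]:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._messages.get(session_id, []))


store = InMemoryHistoryStore()
store.add_message("user_1", "human", "What was MongoDB's latest acquisition?")
store.add_message("user_1", "ai", "Voyage AI.")
store.add_message("user_2", "human", "Hello")

print(len(store.get_messages("user_1")))  # 2
print(len(store.get_messages("user_2")))  # 1
```

Keeping sessions separate is what lets a follow-up such as "Why did they do it?" resolve against the right conversation when several users share one application.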

Create a RAG chain that handles chat message history.

Paste and run the following code snippets to create the RAG chain.

Specify the LLM to use.
```
from langchain_openai import ChatOpenAI
# Define the model to use for chat completion
llm = ChatOpenAI(model = "gpt-4o")
```

Define a prompt that condenses the chat history and a follow-up question into a standalone question for the retriever.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
# Create a prompt to generate standalone questions from follow-up questions
standalone_system_prompt = """
Given a chat history and a follow-up question, rephrase the follow-up question to be a standalone question.
Do NOT answer the question, just reformulate it if needed, otherwise return it as is.
Only return the final standalone question.
"""
standalone_question_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", standalone_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)
# Parse output as a string
parse_output = StrOutputParser()
question_chain = standalone_question_prompt | llm | parse_output

Build a retriever chain that processes the chat history and retrieves documents.

from langchain_core.runnables import RunnablePassthrough
# Create a retriever
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={ "k": 5 })
# Create a retriever chain that processes the question with history and retrieves documents
retriever_chain = RunnablePassthrough.assign(context=question_chain | retriever | (lambda docs: "\n\n".join([d.page_content for d in docs])))

Define a prompt to generate an answer based on the chat history and the retrieved context.

# Create a prompt template that includes the retrieved context and chat history
rag_system_prompt = """Answer the question based only on the following context:
{context}
"""
rag_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", rag_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

Implement RAG with memory.

Combine the components that you defined into a complete RAG chain.

# Build the RAG chain
rag_chain = (
    retriever_chain
    | rag_prompt
    | llm
    | parse_output
)
# Wrap the chain with message history
rag_with_memory = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
)

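Test your RAG implementation.

Invoke the chain to answer questions. The chain maintains the conversation context and returns relevant answers that take previous interactions into account. Your responses might vary.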

# First question
response_1 = rag_with_memory.invoke(
    {"question": "What was MongoDB's latest acquisition?"},
    {"configurable": {"session_id": "user_1"}}
)
print(response_1)

MongoDB's latest acquisition was Voyage AI, a pioneer in state-of-the-art embedding and reranking models for next-generation AI applications.

# Follow-up question that references the previous question
response_2 = rag_with_memory.invoke(
    {"question": "Why did they do it?"},
    {"configurable": {"session_id": "user_1"}}
)
print(response_2)

MongoDB acquired Voyage AI to enable organizations to easily build trustworthy AI applications by integrating advanced embedding and reranking models into their technology. This acquisition aligns with MongoDB's goal of helping businesses innovate at "AI speed" using its flexible document model and seamless scalability.

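Add Semantic Caching

This section adds semantic caching on top of your RAG chain. Semantic caching is a form of caching that retrieves cached prompts based on the semantic similarity between queries.

Note

You can use semantic caching independently of conversation memory, but this tutorial uses both features together.

For a video tutorial of this feature, see Learn by Watching.

Configure the semantic cache.

Run the following code to configure the semantic cache by using the MongoDBAtlasSemanticCache class: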

from langchain_mongodb.cache import MongoDBAtlasSemanticCache
from langchain_core.globals import set_llm_cache
# Configure the semantic cache
set_llm_cache(MongoDBAtlasSemanticCache(
    connection_string = MONGODB_URI,
    database_name = "langchain_db",
    collection_name = "semantic_cache",
    embedding = embedding_model,
    index_name = "vector_index",
    similarity_threshold = 0.5  # Adjust based on your requirements
))
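
The idea behind a similarity threshold can be sketched without a database. The following toy cache stores (embedding, response) pairs and returns a cached response only when the best cosine similarity reaches the threshold. `ToySemanticCache` and the two-dimensional vectors are hypothetical; the real class runs a vector search in MongoDB, and the scores it compares against similarity_threshold may be normalized differently from raw cosine similarity.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToySemanticCache:
    """In-memory illustration of threshold-based semantic lookup."""

    def __init__(self, similarity_threshold: float) -> None:
        self.similarity_threshold = similarity_threshold
        self._entries: list[tuple[Sequence[float], str]] = []

    def update(self, embedding: Sequence[float], response: str) -> None:
        self._entries.append((embedding, response))

    def lookup(self, embedding: Sequence[float]) -> Optional[str]:
        best_score, best_response = -1.0, None
        for cached_embedding, response in self._entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        # Serve from the cache only if the closest entry is similar enough.
        if best_score >= self.similarity_threshold:
            return best_response
        return None


cache = ToySemanticCache(similarity_threshold=0.5)
cache.update([1.0, 0.0], "MongoDB's latest acquisition was Voyage AI.")

print(cache.lookup([0.9, 0.1]))  # similar query: returns the cached answer
print(cache.lookup([0.0, 1.0]))  # unrelated query: None
```

A higher threshold reduces the chance of returning a cached answer to a question that only looks similar, while a lower threshold produces more cache hits.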

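Test the semantic cache with your RAG chain.

The semantic cache automatically caches your prompts. Run the following sample queries; the response time for the second query should drop significantly. Your responses and response times might vary.

Tip

You can view your cached prompts in the semantic_cache collection. The semantic cache caches only the input to the LLM. When you use it in retrieval chains, note that the retrieved documents can change between runs, which results in cache misses for semantically similar queries.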

%%time
# First query (not cached)
rag_with_memory.invoke(
  {"question": "What was MongoDB's latest acquisition?"},
  {"configurable": {"session_id": "user_2"}}
)

CPU times: user 54.7 ms, sys: 34.2 ms, total: 88.9 ms
Wall time: 7.42 s
"MongoDB's latest acquisition was Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation AI applications."

%%time
# Second query (cached)
rag_with_memory.invoke(
  {"question": "What company did MongoDB acquire recently?"},
  {"configurable": {"session_id": "user_2"}}
)

CPU times: user 79.7 ms, sys: 24 ms, total: 104 ms
Wall time: 3.87 s
'MongoDB recently acquired Voyage AI.'
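
The %%time cell magic is available only in IPython and Jupyter. If you run the tutorial as a plain Python script, a small wrapper around time.perf_counter gives a comparable wall-time measurement. `timed` is a hypothetical helper, and the commented call assumes the rag_with_memory chain defined earlier.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Self-contained demonstration with a built-in function:
total, seconds = timed(sum, range(1_000_000))
print(f"sum={total}, took {seconds:.4f} s")

# Hypothetical usage with the chain from this tutorial:
#
#   answer, seconds = timed(
#       rag_with_memory.invoke,
#       {"question": "What company did MongoDB acquire recently?"},
#       {"configurable": {"session_id": "user_2"}},
#   )
```

In the sample output above, wall time drops from 7.42 s for the first query to 3.87 s for the second.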

Learn by Watching

Follow along with this video tutorial to learn more about semantic caching with LangChain and MongoDB.

Duration: 30 minutes
