
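Add Memory and Semantic Caching with LangChain and MongoDB

This tutorial explains how to enhance a RAG application by adding conversation memory and semantic caching with the LangChain MongoDB integration.

Memory lets you maintain conversation context across multiple user interactions.
Semantic caching reduces response latency by caching semantically similar queries.

You can work with a runnable version of this tutorial as a Python notebook.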

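Prerequisites

Before you begin, make sure you have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created with the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
A Voyage AI API key. To create an account and API key, see the Voyage AI website.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
An environment to run interactive Python notebooks, such as Colab.

Tip

Before you complete this tutorial, we recommend that you complete the Get Started tutorial to learn how to create a basic RAG implementation.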

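Use MongoDB as a Vector Store

In this section, you create a vector store instance that uses your MongoDB cluster as a vector database.

Set up the environment.

Set up the environment for this tutorial. Create an interactive Python notebook by saving a file with the .ipynb extension. This notebook lets you run Python code snippets individually, and you use it to run the code in this tutorial.

To set up your notebook environment:

Run the following command in your notebook: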

pip install --quiet --upgrade langchain langchain-community langchain-core langchain-mongodb langchain-voyageai langchain-openai pypdf

Set environment variables.
Run the following code to set the environment variables for this tutorial. Provide your Voyage API key, your OpenAI API key, and the SRV connection string for your MongoDB cluster.
```
import os
os.environ["OPENAI_API_KEY"] = "<openai-key>"
os.environ["VOYAGE_API_KEY"] = "<voyage-key>"
MONGODB_URI = "<connection-string>"
```
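Note
Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.
For an Atlas cluster, the connection string should use the following format:
mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
To learn more, see Connect to a Cluster via Drivers.
For a local Atlas deployment, the connection string should use the following format:
mongodb://localhost:<port-number>/?directConnection=true
To learn more, see Connection Strings.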
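
If your database username or password contains reserved characters such as @, : or /, they generally need to be percent-encoded before they go into the connection string. The following is a minimal sketch that uses only the Python standard library. The helper name build_atlas_uri, the sample credentials, and the sample host are hypothetical and only for illustration.

```python
from urllib.parse import quote_plus

def build_atlas_uri(username: str, password: str, host: str) -> str:
    """Build an SRV connection string with percent-encoded credentials.

    Hypothetical helper for illustration. `host` is the part after '@',
    for example '<clusterName>.<hostname>.mongodb.net'.
    """
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}"

# Placeholder values, not real credentials or a real cluster
example_uri = build_atlas_uri("app_user", "p@ss:word/1", "cluster0.example.mongodb.net")
print(example_uri)
```

You could then assign the result to MONGODB_URI instead of pasting the string by hand.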

Instantiate the vector store.

Paste and run the following code in your notebook to create a vector store instance named vector_store that uses the langchain_db.rag_with_memory namespace in MongoDB.

from langchain_mongodb import MongoDBAtlasVectorSearch
from langchain_voyageai import VoyageAIEmbeddings
# Use the voyage-3-large embedding model
embedding_model = VoyageAIEmbeddings(model="voyage-3-large")
# Create the vector store
vector_store = MongoDBAtlasVectorSearch.from_connection_string(
   connection_string = MONGODB_URI,
   embedding = embedding_model,
   namespace = "langchain_db.rag_with_memory"
)

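Add data to the vector store.

Paste and run the following code in your notebook to ingest a sample PDF that contains a recent MongoDB earnings report into the vector store.

This code uses a text splitter to chunk the PDF data into smaller parent documents. It specifies the chunk size (number of characters) and the chunk overlap (number of characters shared by consecutive chunks) for each document.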

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# Load the PDF
loader = PyPDFLoader("https://investors.mongodb.com/node/13176/pdf")
data = loader.load()
# Split PDF into documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=20)
docs = text_splitter.split_documents(data)
# Add data to the vector store
vector_store.add_documents(docs)
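
To make the meaning of chunk size and chunk overlap concrete, the following simplified sketch splits a string into fixed-size character windows. It is not how RecursiveCharacterTextSplitter works internally (that splitter also tries to break on separators such as paragraphs and sentences). The function name chunk_text and the sample string are hypothetical.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

# 500 characters of sample data, split with the same settings as the tutorial
sample = "".join(str(i % 10) for i in range(500))
chunks = chunk_text(sample, chunk_size=200, chunk_overlap=20)
print([len(c) for c in chunks])
```

With these settings, the last 20 characters of each chunk repeat as the first 20 characters of the next one, which helps keep sentences that straddle a boundary retrievable.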

Tip

After you run this code, if you use Atlas, you can verify your vector embeddings by navigating to the langchain_db.rag_with_memory namespace in the Atlas UI.

Create the MongoDB Vector Search index.

Run the following code to create the MongoDB Vector Search index for the vector store, which enables vector search on your data.

# Use LangChain helper method to create the vector search index
vector_store.create_vector_search_index(
   dimensions = 1024 # The dimensions of the vector embeddings to be indexed
)

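Tip

See the create_vector_search_index API reference.

The index takes about one minute to build. While it builds, the index is in an initial sync state. When it finishes building, you can start querying the data in your collection.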
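
Rather than waiting a fixed amount of time, you might poll until the index is ready. The following is a minimal, generic polling sketch. The helper wait_until and the stand-in check fake_index_ready are hypothetical. In your notebook, the readiness check would need to query the real index status, for example through PyMongo's Collection.list_search_indexes() (an assumption: verify which status fields your deployment returns).

```python
import time
from typing import Callable

def wait_until(is_ready: Callable[[], bool], timeout: float = 120.0, interval: float = 5.0) -> bool:
    """Poll is_ready() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Stand-in readiness check that succeeds on the third poll.
# Replace it with a check against your actual index status.
polls = {"count": 0}

def fake_index_ready() -> bool:
    polls["count"] += 1
    return polls["count"] >= 3

ready = wait_until(fake_index_ready, timeout=5.0, interval=0.01)
print(ready, polls["count"])
```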

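Implement RAG with Memory

This section shows how to implement RAG with conversation memory by using the LangChain MongoDB integration.

Define a function to get chat message history.

To maintain conversation history across multiple interactions, use the MongoDBChatMessageHistory class. It lets you store chat messages in a MongoDB database and pass them to your RAG chain to handle conversation context.

Paste and run the following code in your notebook to create a function named get_session_history that returns a MongoDBChatMessageHistory instance. This instance retrieves the chat history for a specific session.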

from langchain_mongodb.chat_message_histories import MongoDBChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_core.prompts import MessagesPlaceholder
def get_session_history(session_id: str) -> MongoDBChatMessageHistory:
    return MongoDBChatMessageHistory(
        connection_string=MONGODB_URI,
        session_id=session_id,
        database_name="langchain_db",
        collection_name="rag_with_memory"
    )

Create a RAG chain that handles chat message history.

Paste and run the following code snippets to create the RAG chain:

Specify the LLM to use.
```
from langchain_openai import ChatOpenAI
# Define the model to use for chat completion
llm = ChatOpenAI(model = "gpt-4o")
```

Define a prompt that condenses the chat history and a follow-up question into a standalone question for the retriever.

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
# Create a prompt to generate standalone questions from follow-up questions
standalone_system_prompt = """
Given a chat history and a follow-up question, rephrase the follow-up question to be a standalone question.
Do NOT answer the question, just reformulate it if needed, otherwise return it as is.
Only return the final standalone question.
"""
standalone_question_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", standalone_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)
# Parse output as a string
parse_output = StrOutputParser()
question_chain = standalone_question_prompt | llm | parse_output

Build a retriever chain that processes the chat history and retrieves documents.

from langchain_core.runnables import RunnablePassthrough
# Create a retriever
retriever = vector_store.as_retriever(search_type="similarity", search_kwargs={ "k": 5 })
# Create a retriever chain that processes the question with history and retrieves documents
retriever_chain = RunnablePassthrough.assign(context=question_chain | retriever | (lambda docs: "\n\n".join([d.page_content for d in docs])))
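
The lambda at the end of the retriever chain turns the retrieved documents into a single context string. The following sketch shows the same formatting step as a named function, which can be easier to test on its own. The Doc dataclass is a hypothetical stand-in that models only the page_content attribute, and the name format_docs is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Minimal stand-in for a retrieved document; only page_content is modeled."""
    page_content: str

def format_docs(docs: list[Doc]) -> str:
    """Join document texts with blank lines, as the lambda in the chain does."""
    return "\n\n".join(d.page_content for d in docs)

context = format_docs([Doc("First chunk."), Doc("Second chunk.")])
print(repr(context))
```

In the chain, a function like this could take the place of the lambda, for example question_chain | retriever | format_docs.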

Define a prompt to generate an answer based on the chat history and the retrieved context.

# Create a prompt template that includes the retrieved context and chat history
rag_system_prompt = """Answer the question based only on the following context:
{context}
"""
rag_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", rag_system_prompt),
        MessagesPlaceholder(variable_name="history"),
        ("human", "{question}"),
    ]
)

Implement RAG with memory.

Combine the components that you defined into a complete RAG chain.

# Build the RAG chain
rag_chain = (
    retriever_chain
    | rag_prompt
    | llm
    | parse_output
)
# Wrap the chain with message history
rag_with_memory = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="question",
    history_messages_key="history",
)
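
Conceptually, wrapping a chain with message history means: load the stored messages for the session, pass them to the chain together with the new question, then record the new turn. The following toy sketch illustrates that flow with plain Python. It is not the implementation of RunnableWithMessageHistory, and the names invoke_with_history, fake_chain, and histories are hypothetical.

```python
from typing import Callable

History = list[tuple[str, str]]  # (role, content) pairs

def invoke_with_history(chain: Callable[[dict], str], histories: dict[str, History],
                        question: str, session_id: str) -> str:
    """Load the session history, call the chain, then record the new turn."""
    history = histories.setdefault(session_id, [])
    answer = chain({"question": question, "history": list(history)})
    history.append(("human", question))
    history.append(("ai", answer))
    return answer

# Stand-in chain that only reports how many prior messages it received
def fake_chain(inputs: dict) -> str:
    return f"seen {len(inputs['history'])} prior messages"

histories: dict[str, History] = {}
first = invoke_with_history(fake_chain, histories, "What was MongoDB's latest acquisition?", "user_1")
second = invoke_with_history(fake_chain, histories, "Why did they do it?", "user_1")
print(first)
print(second)
```

Because the second call sees the first question and answer, a follow-up such as "Why did they do it?" can be resolved against the earlier turn. A different session_id starts with an empty history.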

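Test your RAG implementation.

Invoke the chain to answer questions. The chain maintains the conversation context and returns relevant answers that take previous interactions into account. Your responses might vary.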

# First question
response_1 = rag_with_memory.invoke(
    {"question": "What was MongoDB's latest acquisition?"},
    {"configurable": {"session_id": "user_1"}}
)
print(response_1)

MongoDB's latest acquisition was Voyage AI, a pioneer in state-of-the-art embedding and reranking models for next-generation AI applications.

# Follow-up question that references the previous question
response_2 = rag_with_memory.invoke(
    {"question": "Why did they do it?"},
    {"configurable": {"session_id": "user_1"}}
)
print(response_2)

MongoDB acquired Voyage AI to enable organizations to easily build trustworthy AI applications by integrating advanced embedding and reranking models into their technology. This acquisition aligns with MongoDB's goal of helping businesses innovate at "AI speed" using its flexible document model and seamless scalability.

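Add Semantic Caching

This section adds semantic caching on top of your RAG chain. Semantic caching is a form of caching that retrieves cached prompts based on the semantic similarity between queries.

Note

You can use semantic caching independently of conversation memory, but this tutorial uses both features together.

For a video tutorial of this feature, see Learn by Watching.

Configure the semantic cache.

Run the following code to configure the semantic cache by using the MongoDBAtlasSemanticCache class: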

from langchain_mongodb.cache import MongoDBAtlasSemanticCache
from langchain_core.globals import set_llm_cache
# Configure the semantic cache
set_llm_cache(MongoDBAtlasSemanticCache(
    connection_string = MONGODB_URI,
    database_name = "langchain_db",
    collection_name = "semantic_cache",
    embedding = embedding_model,
    index_name = "vector_index",
    similarity_threshold = 0.5  # Adjust based on your requirements
))
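
To illustrate the role of similarity_threshold, the following toy sketch implements a threshold-based lookup in memory. It is only an illustration of the idea, not how MongoDBAtlasSemanticCache is implemented: the class ToySemanticCache is hypothetical, the two-dimensional vectors are hand-written stand-ins for real embeddings, and the score scale used by MongoDB Vector Search may differ from the raw cosine similarity used here.

```python
import math
from typing import Optional

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class ToySemanticCache:
    """In-memory illustration of threshold-based semantic lookup."""

    def __init__(self, similarity_threshold: float) -> None:
        self.similarity_threshold = similarity_threshold
        self.entries: list[tuple[list[float], str]] = []

    def update(self, embedding: list[float], response: str) -> None:
        self.entries.append((embedding, response))

    def lookup(self, embedding: list[float]) -> Optional[str]:
        # Return the best-scoring cached response if it clears the threshold
        best_score, best_response = 0.0, None
        for cached_embedding, response in self.entries:
            score = cosine_similarity(embedding, cached_embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.similarity_threshold else None

cache = ToySemanticCache(similarity_threshold=0.5)
cache.update([1.0, 0.0], "Voyage AI")   # pretend embedding of the first prompt
hit = cache.lookup([0.9, 0.1])          # a similar prompt: above the threshold
miss = cache.lookup([0.0, 1.0])         # an unrelated prompt: below the threshold
print(hit, miss)
```

A higher threshold returns cached responses only for near-identical prompts, while a lower threshold produces more cache hits at the risk of returning an answer to a different question.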

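Test the semantic cache with your RAG chain.

The semantic cache automatically caches your prompts. Run the following sample queries. You should see a significant reduction in response time for the second query. Your responses and response times might vary.

Tip

You can view your cached prompts in the semantic_cache collection. The semantic cache caches only the input to the LLM. When you use it in retrieval chains, the retrieved documents can change between runs, which can cause cache misses for semantically similar queries.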

%%time
# First query (not cached)
rag_with_memory.invoke(
  {"question": "What was MongoDB's latest acquisition?"},
  {"configurable": {"session_id": "user_2"}}
)

CPU times: user 54.7 ms, sys: 34.2 ms, total: 88.9 ms
Wall time: 7.42 s
"MongoDB's latest acquisition was Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation AI applications."

%%time
# Second query (cached)
rag_with_memory.invoke(
  {"question": "What company did MongoDB acquire recently?"},
  {"configurable": {"session_id": "user_2"}}
)

CPU times: user 79.7 ms, sys: 24 ms, total: 104 ms
Wall time: 3.87 s
'MongoDB recently acquired Voyage AI.'
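
The %%time magic is available only in notebook environments. If you run the code as a plain Python script, a small helper based on time.perf_counter gives a comparable wall-clock measurement. The following is a minimal sketch. The helper timed and the stand-in function slow_echo are hypothetical.

```python
import time

def timed(fn, *args, **kwargs):
    """Return (result, elapsed_seconds) for a single call to fn."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in for a slow call such as an LLM request
def slow_echo(text: str) -> str:
    time.sleep(0.05)
    return text

result, elapsed = timed(slow_echo, "hello")
print(result, f"{elapsed:.2f}s")
```

In the tutorial, you might call timed(rag_with_memory.invoke, ...) with the same two arguments as the sample queries and compare the elapsed times of the first and second query.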

Learn by Watching

Follow along with this video tutorial to learn more about semantic caching with LangChain and MongoDB.

Duration: 30 minutes
