LlamaIndex 통합 시작하기

MongoDB Vector Search를 LlamaIndex 와 통합하여 LLM 애플리케이션 에서 검색 증대 생성(RAG)을 구현 수 있습니다. 이 튜토리얼에서는 LlamaIndex와 함께 MongoDB Vector Search를 사용하여 데이터에서 시맨틱 검색 수행하고 RAG 구현 빌드 방법을 보여 줍니다. 구체적으로 다음 조치를 수행합니다.

환경을 설정합니다.
사용자 지정 데이터를 MongoDB 에 저장합니다.
데이터에 MongoDB Vector Search 인덱스 생성합니다.
다음 벡터 검색 쿼리를 실행합니다.
- 시맨틱 검색.
- 메타데이터 사전 필터링을 통한 시맨틱 검색.
MongoDB Vector Search를 사용하여 데이터에 대한 질문에 답변 RAG 를 구현합니다.

이 튜토리얼의 실행 가능한 버전을 Python 노트북으로 사용합니다.

배경

LlamaIndex는 사용자 지정 데이터 소스를 LLM에 연결하는 방법을 간소화하도록 설계된 오픈 소스 프레임워크입니다. RAG 애플리케이션에 대한 벡터 임베딩을 로드하고 준비하는 데 도움이 되는 데이터 커넥터, 인덱스 및 쿼리 엔진과 같은 여러 도구를 제공합니다.

MongoDB Vector Search와 LlamaIndex를 통합하면 MongoDB 벡터 데이터베이스로 사용하고 MongoDB Vector Search를 사용하여 데이터에서 의미적으로 유사한 문서를 검색하여 RAG 를 구현. RAG에 대해 자세히 학습 MongoDB 사용한 RAG(검색 보강 생성)를 참조하세요.

절차

전제 조건

이 튜토리얼을 완료하려면 다음 조건을 충족해야 합니다.

다음 MongoDB cluster 유형 중 하나입니다.
- MongoDB 버전 6.0.11을 실행하는 Atlas 클러스터 7.0.2 또는 그 이상. IP 주소가 Atlas 프로젝트의 액세스 목록에 포함되어 있는지 확인하세요.
- Atlas CLI 사용하여 생성된 로컬 Atlas 배포서버 입니다. 자세히 학습 로컬 Atlas 배포 만들기를 참조하세요.
- 검색 및 벡터 검색이 설치된 MongoDB Community 또는 Enterprise 클러스터.
OpenAI API 키입니다. API 요청에 사용할 수 있는 크레딧이 있는 OpenAI 계정이 있어야 합니다. OpenAI 계정 등록에 대해 자세히 학습하려면 OpenAI API 웹사이트를 참조하세요.
Voyage AI API 키입니다. 계정과 API 키를 만들려면 Voyage AI 웹사이트참조하세요.
Colab과같은 대화형 Python 노트북을 실행 수 있는 환경입니다.

환경 설정

이 튜토리얼의 환경을 설정합니다. 확장자가 .ipynb 인 파일 저장하여 대화형 Python 노트북을 만듭니다. 이 노트북을 사용하면 Python 코드 스니펫을 개별적으로 실행 수 있으며, 이 튜토리얼에서는 이를 사용하여 코드를 실행 .

노트북 환경을 설정하다 하려면 다음을 수행합니다.

종속성을 설치하고 가져옵니다.

다음 명령을 실행합니다:

pip install --quiet --upgrade llama-index llama-index-vector-stores-mongodb llama-index-llms-openai llama-index-embeddings-voyageai pymongo

그런 다음, 다음 코드를 실행하여 필요한 패키지를 가져옵니다.

import os, pymongo, pprint
from pymongo.operations import SearchIndexModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.core.settings import Settings
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.vector_stores import MetadataFilter, MetadataFilters, ExactMatchFilter, FilterOperator
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.embeddings.voyageai import VoyageEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

환경 변수를 정의합니다.

다음 코드를 실행하고 자리 표시자를 다음 값으로 대체합니다.

사용자의 OpenAI API 키.
Voyage AI API 키.
MongoDB cluster의 SRV 연결 문자열.

os.environ["OPENAI_API_KEY"] = "<openai-api-key>"
os.environ["VOYAGEAI_API_KEY"] = "<voyageai-api-key>"
MONGODB_URI = "<connection-string>"

참고

<connection-string>을 Atlas 클러스터 또는 로컬 Atlas 배포서버의 연결 문자열로 교체합니다.

연결 문자열은 다음 형식을 사용해야 합니다.

mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

자세한 학습은 드라이버를 통해 클러스터에 연결을 참조하세요.

연결 문자열은 다음 형식을 사용해야 합니다.

mongodb://localhost:<port-number>/?directConnection=true

학습 내용은 연결 문자열을 참조하세요.

LlamaIndex 설정을 구성합니다.

다음 코드를 실행하여 LlamaIndex와 관련된 설정을 구성합니다. 이 설정은 다음을 지정합니다:

voyage-3-large 를 애플리케이션에서 데이터에서 벡터 임베딩을 생성하는 데 사용하는 임베딩 모델로 사용합니다.
OpenAI는 데이터에 대한 질문에 답변 위해 애플리케이션 에서 사용하는 LLM입니다.
청크 크기와 겹침 을 통해 LlamaIndex가 저장 위해 데이터를 분할하는 방법을 사용자 지정할 수 있습니다.

from llama_index.embeddings.voyageai import VoyageEmbedding
embed_model= VoyageEmbedding(
  voyage_api_key = os.environ["VOYAGEAI_API_KEY"],
  model_name = "voyage-3-large",
)
Settings.llm = OpenAI()
       Settings.embed_model = embed_model
Settings.chunk_size = 100
Settings.chunk_overlap = 10

MongoDB 벡터 저장소로 사용

그런 다음 사용자 지정 데이터를 MongoDB 에 로드하고 MongoDB cluster 를 벡터 저장 라고도 하는 벡터 데이터베이스 로 인스턴스화합니다. 다음 코드 스니펫을 복사하여 노트북에 붙여넣습니다.

샘플 데이터를 불러옵니다.

이 튜토리얼에서는 벡터 저장 의 데이터 소스로 최근 MongoDB 수익 보고서 가 포함된 공개적으로 액세스할 수 있는 PDF 문서를 사용합니다. 이 문서 MongoDB의 회계연도 2025 4분기 및 전체 연도에 대한 재무 결과를 설명합니다.

샘플 데이터를 로드하려면 다음 코드 스니펫을 실행합니다. 다음을 수행합니다.

data 이라는 새 디렉토리를 만듭니다.
지정된 URL에서 PDF를 검색하여 디렉토리에 파일로 저장합니다.
SimpleDirectoryReader 데이터 connector 사용하여 파일 에서 원시 텍스트와 메타데이터 추출합니다. 또한 데이터의 형식을 문서로 지정합니다.

# Load the sample data
from urllib.request import urlretrieve
urlretrieve("https://investors.mongodb.com/node/13176/pdf", "mongodb-earnings-report.pdf")
sample_data = SimpleDirectoryReader(input_files=["mongodb-earnings-report.pdf"]).load_data()
# Print the first document
sample_data[0]

Document(id_='62b7cace-30c0-4687-9d87-e178547ae357', embedding=None,
metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf',
'file_path': 'data/mongodb-earnings-report.pdf', 'file_type':
'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28',
'last_modified_date': '2025-05-28'},
excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size',
'creation_date', 'last_modified_date', 'last_accessed_date'],
excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size',
'creation_date', 'last_modified_date', 'last_accessed_date'],
relationships={}, metadata_template='{key}: {value}', metadata_separator='\n',
text_resource=MediaResource(embeddings=None, data=None, text='MongoDB, Inc.
Announces Fourth Quarter and Full Year Fiscal 2025 Financial Results\nMarch 5,
2025\nFourth Quarter Fiscal 2025 Total Revenue of $548.4 million, up 20%
Year-over-Year\nFull Year Fiscal 2025 Total Revenue of $2.01 billion, up 19%
Year-over-Year\nContinued Strong Customer Growth with Over 54,500 Customers as
of January 31, 2025\nMongoDB Atlas Revenue up 24% Year-over-Year; 71% of Total
Q4 Revenue\nNEW YORK , March 5, 2025 /PRNewswire/ -- MongoDB, Inc. (NASDAQ:
MDB) today announced its financial results for the fourth quarter and
fiscal\nyear ended January 31, 2025.\n\xa0\n  \xa0\n"MongoDB  delivered a
strong end to fiscal 2025 with 24% Atlas revenue growth and significant margin
expansion. Atlas consumption in the quarter\nwas better than expected and we
continue to see good performance in new workload wins due to the flexibility,
scalability and performance of the\nMongoDB  platform. In fiscal year 2026 we
expect to see stable consumption growth in Atlas, our main growth driver,"
said Dev Ittycheria, President\nand Chief Executive Officer of MongoDB
.\n"Looking ahead, we remain incredibly excited about our long-term growth
opportunity. MongoDB  removes the constraints of legacy databases,\nenabling
businesses to innovate at AI speed with our flexible document model and
seamless scalability. Following the Voyage AI acquisition, we\ncombine
real-time data, sophisticated embedding and retrieval models and semantic
search directly in the database, simplifying the development of\ntrustworthy
AI-powered apps."\nFourth Quarter Fiscal 2025 Financial Highlights\nRevenue:
Total revenue was $548.4 million for the fourth quarter of fiscal 2025, an
increase of 20% year-over-year.\nSubscription revenue was $531.0 million, an
increase of 19% year-over-year, and services revenue was $17.4 million,
an\nincrease of 34% year-over-year.\nGross Profit: Gross profit was $399.4
million for the fourth quarter of fiscal 2025, representing a 73% gross
margin\ncompared to 75% in the year-ago period. Non-GAAP gross profit was
$411.7 million, representing a 75% non-GAAP gross\nmargin, compared to a
non-GAAP gross margin of 77% in the year-ago period.\nLoss from Operations:
Loss from operations was $18.6 million for the fourth quarter of fiscal 2025,
compared to a loss\nfrom operations of $71.0 million in the year-ago period.
Non-GAAP income from operations was $112.5 million, compared\nto non-GAAP
income from operations of $69.2 million in the year-ago period.\nNet Income
(Loss): Net income was $15.8 million, or $0.20 per share, based on 77.6
million weighted-average shares\noutstanding, for the fourth quarter of fiscal
2025. This compares to a net loss of $55.5 million, or $0.77 per share, in
the\nyear-ago period. Non-GAAP net income was $108.4 million, or $1.28 per
share, based on 84.6 million fully diluted\nweighted-average shares
outstanding. This compares to a non-GAAP net income of $71.1 million, or $0.86
per share, in\nthe year-ago period.\nCash Flow: As of January 31, 2025,
MongoDB  had $2.3 billion in cash, cash equivalents, short-term investments
and\nrestricted cash. During the three months ended January 31, 2025, MongoDB
generated $50.5 million of cash from\noperations, compared to $54.6 million of
cash from operations in the year-ago period. MongoDB  used $26.0 million of
cash\nin capital expenditures and used $1.6 million of cash in principal
payments of finance leases, leading to free cash flow of\n$22.9 million,
compared to free cash flow of $50.5 million in the year-ago period.\nFull Year
Fiscal 2025 Financial Highlights\nRevenue: Total revenue was $2.01 billion for
the full year fiscal 2025, an increase of 19% year-over-year.
Subscription\nrevenue was $1.94 billion, an increase of 19% year-over-year,
and services revenue was $62.6 million, an increase of
12%\nyear-over-year.\nGross Profit: Gross profit was $1.47 billion for the
full year fiscal 2025, representing a 73% gross margin compared to',
path=None, url=None, mimetype=None), image_resource=None, audio_resource=None,
video_resource=None, text_template='{metadata_str}\n\n{content}')

벡터 저장소를 인스턴스화합니다.

다음 코드를 실행하여 다음을 지정하는 MongoDBAtlasVectorSearch 메서드를 사용하여 벡터 저장 만듭니다.

MongoDB cluster 에 대한 연결.
llamaindex_db.test 문서를 저장 데 사용되는 MongoDB database 및 컬렉션 으로 사용됩니다.
vector_index 를 벡터 저장소를 쿼리하는 데 사용할 인덱스로 사용합니다.

그런 다음 저장 위한 데이터를 준비하는 데 사용되는 LlamaIndex 컨테이너 객체 인 저장 컨텍스트 에 벡터 저장 저장합니다.

# Connect to your MongoDB cluster
mongo_client = pymongo.MongoClient(MONGODB_URI)
# Instantiate the vector store
vector_store = MongoDBAtlasVectorSearch(
    mongo_client,
    db_name = "llamaindex_db",
    collection_name = "test",
    vector_index_name = "vector_index"
)
vector_store_context = StorageContext.from_defaults(vector_store=vector_store)

데이터를 벡터 임베딩으로 저장합니다.

데이터를 로드하고 Atlas 벡터 저장 로 인스턴스화한 후에는 데이터에서 벡터 임베딩을 생성하여 Atlas 에 저장 . 이렇게 하려면 벡터 저장 인덱스를 빌드 해야 합니다. 이 유형의 인덱스 는 데이터를 분할, 임베딩한 후 벡터 저장 에 저장하는 LlamaIndex 데이터 구조입니다.

다음 코드에서는 VectorStoreIndex.from_documents 메서드를 사용하여 샘플 데이터에 벡터 저장 인덱스 빌드 . 벡터 저장소의 저장 컨텍스트에 지정된 대로 샘플 데이터를 벡터 임베딩으로 변환하고 이러한 임베딩을 MongoDB cluster 의 llamaindex_db.test 컬렉션 에 문서로 저장합니다.

참고

이 메서드는 환경을 설정하다 때 구성한 임베딩 모델 및 청크 설정을 사용합니다.

vector_store_index = VectorStoreIndex.from_documents(
   sample_data, storage_context=vector_store_context, show_progress=True
)

팁

샘플 코드를 실행 후 Atlas 사용하는 경우 llamaindex_db.test Atlas UI 의 네임스페이스 로이동하여 벡터 임베딩을 확인할 수 있습니다.

MongoDB Vector Search 인덱스 만들기

벡터 저장 에서 벡터 검색 쿼리를 활성화 하려면 llamaindex_db.test 컬렉션 에 MongoDB Vector Search 인덱스 만듭니다.

노트북에서 다음 코드를 실행하여 다음 필드를 인덱싱하는 vectorSearch 유형의 인덱스를 생성합니다.

embedding 필드를 벡터 유형으로 지정합니다. embedding 필드 에는 VoyageAI의 voyage-3-large 임베딩 모델을 사용하여 생성된 임베딩이 포함되어 있습니다. 인덱스 정의는 1024 벡터 차원을 지정하고 cosine를 사용하여 유사성을 측정합니다.
metadata.page_label 필드를 PDF의 페이지 번호를 기준으로 데이터를 사전 필터링하는 필터 유형으로 지정합니다.

# Specify the collection for which to create the index
collection = mongo_client["llamaindex_db"]["test"]
# Create your index model, then create the search index
search_index_model = SearchIndexModel(
  definition={
    "fields": [
      {
        "type": "vector",
        "path": "embedding",
        "numDimensions": 1024,
        "similarity": "cosine"
      },
      {
        "type": "filter",
        "path": "metadata.page_label"
      }
    ]
  },
  name="vector_index",
  type="vectorSearch"
)
collection.create_search_index(model=search_index_model)

인덱스 작성에는 약 1분 정도가 소요됩니다. 인덱스가 작성되는 동안 인덱스는 초기 동기화 상태에 있습니다. 빌드가 완료되면 컬렉션의 데이터 쿼리를 시작할 수 있습니다.

전제 조건

이 튜토리얼을 완료하려면 다음 조건을 충족해야 합니다.

다음 MongoDB cluster 유형 중 하나입니다.
- MongoDB 버전 6.0.11을 실행하는 Atlas 클러스터 7.0.2 또는 그 이상. IP 주소가 Atlas 프로젝트의 액세스 목록에 포함되어 있는지 확인하세요.
- Atlas CLI 사용하여 생성된 로컬 Atlas 배포서버 입니다. 자세히 학습 로컬 Atlas 배포 만들기를 참조하세요.
- 검색 및 벡터 검색이 설치된 MongoDB Community 또는 Enterprise 클러스터.
OpenAI API 키입니다. API 요청에 사용할 수 있는 크레딧이 있는 OpenAI 계정이 있어야 합니다. OpenAI 계정 등록에 대해 자세히 학습하려면 OpenAI API 웹사이트를 참조하세요.
Colab과같은 대화형 Python 노트북을 실행 수 있는 환경입니다.

환경 설정

노트북 환경을 설정하다 하려면 다음을 수행합니다.

종속성을 설치하고 가져옵니다.

다음 명령을 실행합니다:

pip install --quiet --upgrade llama-index llama-index-vector-stores-mongodb llama-index-embeddings-openai pymongo

그런 다음, 다음 코드를 실행하여 필요한 패키지를 가져옵니다.

import os, pymongo, pprint
from pymongo.operations import SearchIndexModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.core.settings import Settings
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.vector_stores import MetadataFilter, MetadataFilters, ExactMatchFilter, FilterOperator
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

환경 변수를 정의합니다.

다음 코드를 실행하고 자리 표시자를 다음 값으로 대체합니다.

사용자의 OpenAI API 키.
MongoDB 클러스터의 연결 문자열.

os.environ["OPENAI_API_KEY"] = "<api-key>"
MONGODB_URI = "<connection-string>"

참고

<connection-string>을 Atlas 클러스터 또는 로컬 Atlas 배포서버의 연결 문자열로 교체합니다.

연결 문자열은 다음 형식을 사용해야 합니다.

mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

자세한 학습은 드라이버를 통해 클러스터에 연결을 참조하세요.

연결 문자열은 다음 형식을 사용해야 합니다.

mongodb://localhost:<port-number>/?directConnection=true

학습 내용은 연결 문자열을 참조하세요.

LlamaIndex 설정을 구성합니다.

다음 코드를 실행하여 LlamaIndex와 관련된 설정을 구성합니다. 이 설정은 다음을 지정합니다:

OpenAI는 데이터에 대한 질문에 답변 위해 애플리케이션 에서 사용하는 LLM입니다.
text-embedding-ada-002 를 애플리케이션에서 데이터에서 벡터 임베딩을 생성하는 데 사용하는 임베딩 모델로 사용합니다.
청크 크기와 겹침 을 통해 LlamaIndex가 저장 위해 데이터를 분할하는 방법을 사용자 지정할 수 있습니다.

Settings.llm = OpenAI()
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
Settings.chunk_size = 100
Settings.chunk_overlap = 10

MongoDB 벡터 저장소로 사용

샘플 데이터를 불러옵니다.

샘플 데이터를 로드하려면 다음 코드 스니펫을 실행합니다. 다음을 수행합니다.

data 이라는 새 디렉토리를 만듭니다.
지정된 URL에서 PDF를 검색하여 디렉토리에 파일로 저장합니다.
SimpleDirectoryReader 데이터 connector 사용하여 파일 에서 원시 텍스트와 메타데이터 추출합니다. 또한 데이터의 형식을 문서로 지정합니다.

# Load the sample data
from urllib.request import urlretrieve
urlretrieve("https://investors.mongodb.com/node/13176/pdf", "mongodb-earnings-report.pdf")
sample_data = SimpleDirectoryReader(input_files=["./data/mongodb-earnings-report.pdf"]).load_data()
# Print the first document
sample_data[0]

Document(id_='62b7cace-30c0-4687-9d87-e178547ae357', embedding=None,
metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf',
'file_path': 'data/mongodb-earnings-report.pdf', 'file_type':
'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28',
'last_modified_date': '2025-05-28'},
excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size',
'creation_date', 'last_modified_date', 'last_accessed_date'],
excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size',
'creation_date', 'last_modified_date', 'last_accessed_date'],
relationships={}, metadata_template='{key}: {value}', metadata_separator='\n',
text_resource=MediaResource(embeddings=None, data=None, text='MongoDB, Inc.
Announces Fourth Quarter and Full Year Fiscal 2025 Financial Results\nMarch 5,
2025\nFourth Quarter Fiscal 2025 Total Revenue of $548.4 million, up 20%
Year-over-Year\nFull Year Fiscal 2025 Total Revenue of $2.01 billion, up 19%
Year-over-Year\nContinued Strong Customer Growth with Over 54,500 Customers as
of January 31, 2025\nMongoDB Atlas Revenue up 24% Year-over-Year; 71% of Total
Q4 Revenue\nNEW YORK , March 5, 2025 /PRNewswire/ -- MongoDB, Inc. (NASDAQ:
MDB) today announced its financial results for the fourth quarter and
fiscal\nyear ended January 31, 2025.\n\xa0\n  \xa0\n"MongoDB  delivered a
strong end to fiscal 2025 with 24% Atlas revenue growth and significant margin
expansion. Atlas consumption in the quarter\nwas better than expected and we
continue to see good performance in new workload wins due to the flexibility,
scalability and performance of the\nMongoDB  platform. In fiscal year 2026 we
expect to see stable consumption growth in Atlas, our main growth driver,"
said Dev Ittycheria, President\nand Chief Executive Officer of MongoDB
.\n"Looking ahead, we remain incredibly excited about our long-term growth
opportunity. MongoDB  removes the constraints of legacy databases,\nenabling
businesses to innovate at AI speed with our flexible document model and
seamless scalability. Following the Voyage AI acquisition, we\ncombine
real-time data, sophisticated embedding and retrieval models and semantic
search directly in the database, simplifying the development of\ntrustworthy
AI-powered apps."\nFourth Quarter Fiscal 2025 Financial Highlights\nRevenue:
Total revenue was $548.4 million for the fourth quarter of fiscal 2025, an
increase of 20% year-over-year.\nSubscription revenue was $531.0 million, an
increase of 19% year-over-year, and services revenue was $17.4 million,
an\nincrease of 34% year-over-year.\nGross Profit: Gross profit was $399.4
million for the fourth quarter of fiscal 2025, representing a 73% gross
margin\ncompared to 75% in the year-ago period. Non-GAAP gross profit was
$411.7 million, representing a 75% non-GAAP gross\nmargin, compared to a
non-GAAP gross margin of 77% in the year-ago period.\nLoss from Operations:
Loss from operations was $18.6 million for the fourth quarter of fiscal 2025,
compared to a loss\nfrom operations of $71.0 million in the year-ago period.
Non-GAAP income from operations was $112.5 million, compared\nto non-GAAP
income from operations of $69.2 million in the year-ago period.\nNet Income
(Loss): Net income was $15.8 million, or $0.20 per share, based on 77.6
million weighted-average shares\noutstanding, for the fourth quarter of fiscal
2025. This compares to a net loss of $55.5 million, or $0.77 per share, in
the\nyear-ago period. Non-GAAP net income was $108.4 million, or $1.28 per
share, based on 84.6 million fully diluted\nweighted-average shares
outstanding. This compares to a non-GAAP net income of $71.1 million, or $0.86
per share, in\nthe year-ago period.\nCash Flow: As of January 31, 2025,
MongoDB  had $2.3 billion in cash, cash equivalents, short-term investments
and\nrestricted cash. During the three months ended January 31, 2025, MongoDB
generated $50.5 million of cash from\noperations, compared to $54.6 million of
cash from operations in the year-ago period. MongoDB  used $26.0 million of
cash\nin capital expenditures and used $1.6 million of cash in principal
payments of finance leases, leading to free cash flow of\n$22.9 million,
compared to free cash flow of $50.5 million in the year-ago period.\nFull Year
Fiscal 2025 Financial Highlights\nRevenue: Total revenue was $2.01 billion for
the full year fiscal 2025, an increase of 19% year-over-year.
Subscription\nrevenue was $1.94 billion, an increase of 19% year-over-year,
and services revenue was $62.6 million, an increase of
12%\nyear-over-year.\nGross Profit: Gross profit was $1.47 billion for the
full year fiscal 2025, representing a 73% gross margin compared to',
path=None, url=None, mimetype=None), image_resource=None, audio_resource=None,
video_resource=None, text_template='{metadata_str}\n\n{content}')

벡터 저장소를 인스턴스화합니다.

다음 코드를 실행하여 다음을 지정하는 MongoDBAtlasVectorSearch 메서드를 사용하여 벡터 저장 만듭니다.

MongoDB cluster 에 대한 연결.
llamaindex_db.test 문서를 저장 데 사용되는 MongoDB database 및 컬렉션 으로 사용됩니다.
vector_index 를 벡터 저장소를 쿼리하는 데 사용할 인덱스로 사용합니다.

그런 다음 저장 위한 데이터를 준비하는 데 사용되는 LlamaIndex 컨테이너 객체 인 저장 컨텍스트 에 벡터 저장 저장합니다.

# Connect to your MongoDB cluster
mongo_client = pymongo.MongoClient(MONGODB_URI)
# Instantiate the vector store
vector_store = MongoDBAtlasVectorSearch(
    mongo_client,
    db_name = "llamaindex_db",
    collection_name = "test",
    vector_index_name = "vector_index"
)
vector_store_context = StorageContext.from_defaults(vector_store=vector_store)

데이터를 벡터 임베딩으로 저장합니다.

데이터를 로드하고 MongoDB 벡터 저장 로 인스턴스화한 후에는 데이터에서 벡터 임베딩을 생성하여 MongoDB 에 저장 . 이렇게 하려면 벡터 저장 인덱스를 빌드 해야 합니다. 이 유형의 인덱스 는 데이터를 분할, 임베딩한 후 벡터 저장 에 저장하는 LlamaIndex 데이터 구조입니다.

참고

이 메서드는 환경을 설정하다 때 구성한 임베딩 모델 및 청크 설정을 사용합니다.

vector_store_index = VectorStoreIndex.from_documents(
   sample_data, storage_context=vector_store_context, show_progress=True
)

팁

샘플 코드를 실행 후 Atlas 사용하는 경우 llamaindex_db.test Atlas UI 의 네임스페이스 로이동하여 벡터 임베딩을 확인할 수 있습니다.

MongoDB Vector Search 인덱스 만들기

벡터 저장 에서 벡터 검색 쿼리를 활성화 하려면 llamaindex_db.test 컬렉션 에 MongoDB Vector Search 인덱스 만듭니다.

노트북에서 다음 코드를 실행하여 다음 필드를 인덱싱하는 vectorSearch 유형의 인덱스를 생성합니다.

embedding 필드를 벡터 유형으로 지정합니다. embedding 필드에는 OpenAI의 text-embedding-ada-002 임베딩 모델을 사용하여 생성된 임베딩이 포함되어 있습니다. 인덱스 정의는 1536 벡터 차원을 지정하고 cosine 를 사용하여 유사성을 측정합니다.
metadata.page_label 필드를 PDF의 페이지 번호를 기준으로 데이터를 사전 필터링하는 필터 유형으로 지정합니다.

# Specify the collection for which to create the index
collection = mongo_client["llamaindex_db"]["test"]
# Create your index model, then create the search index
search_index_model = SearchIndexModel(
  definition={
    "fields": [
      {
        "type": "vector",
        "path": "embedding",
        "numDimensions": 1536,
        "similarity": "cosine"
      },
      {
        "type": "filter",
        "path": "metadata.page_label"
      }
    ]
  },
  name="vector_index",
  type="vectorSearch"
)
collection.create_search_index(model=search_index_model)

Vector Search 쿼리 실행

MongoDB 인덱스 빌드하면 노트북으로 돌아가 데이터에 대해 벡터 검색 쿼리를 실행 . 다음 예제는 벡터화된 데이터에 대해 실행 수 있는 다양한 쿼리를 보여줍니다.

이 예에서는 문자열 MongoDB Atlas security 에 대한 기본적인 시맨틱 검색을 수행하고 관련성 점수 에 따라 순위가 지정된 문서 목록을 반환합니다. 또한 다음을 지정합니다.

시맨틱 검색 수행하는 리트리버로서의 MongoDB Vector Search.
similarity_top_k 매개변수를 사용하면 가장 관련성이 높은 세 개의 문서만 반환됩니다.

retriever = vector_store_index.as_retriever(similarity_top_k=3)
nodes = retriever.retrieve("MongoDB acquisition")
for node in nodes:
    print(node)

Node ID: 479446ef-8a32-410d-a5e0-8650bd10d78d
Text: MongoDB  completed the redemption of 2026 Convertible Notes,
eliminating all debt from the balance sheet. Additionally, in
conjunction with the acquisition of Voyage, MongoDB  is announcing a
stock buyback program of $200 million, to offset the dilutive impact
of the acquisition consideration.
Score:  0.914
Node ID: 453137d9-8902-4fae-8d81-5f5d9b0836eb
Text: "Looking ahead, we remain incredibly excited about our long-term
growth opportunity. MongoDB  removes the constraints of legacy
databases, enabling businesses to innovate at AI speed with our
flexible document model and seamless scalability. Following the Voyage
AI acquisition, we combine real-time data, sophisticated embedding and
retrieval mod...
Score:  0.914
Node ID: f3c35db6-43e5-4da7-a297-d9b009b9d300
Text: Lombard Odier, a Swiss private bank, partnered with MongoDB  to
migrate and modernize its legacy banking technology systems on MongoDB
with generative AI. The initiative enabled the bank to migrate code
50-60 times quicker and move applications from a legacy relational
database to MongoDB  20 times faster than previous migrations.
Score:  0.912

인덱스된 필드를 컬렉션의 다른 값과 비교하는 MQL 일치 표현식을 사용하여 데이터를 사전 필터링할 수 있습니다. 필터링하려는 모든 메타데이터 필드를 filter 유형으로 인덱싱해야 합니다. 자세한 내용은 벡터 검색을 위해 필드를 인덱싱하는 방법을 참조하세요.

참고

이 튜토리얼의 인덱스 생성할 때 metadata.page_label 필드 필터하다 로 지정했습니다.

이 예시는 string MongoDB Atlas security에 대한 시맨틱 검색을 수행하고 관련성 점수에 따라 순위가 매겨진 문서 목록을 반환합니다. 또한 다음을 지정합니다.

시맨틱 검색 수행하는 리트리버로서의 MongoDB Vector Search.
similarity_top_k 매개변수를 사용하면 가장 관련성이 높은 세 개의 문서만 반환됩니다.
MongoDB Vector Search가 2페이지에 나타나는 문서만 검색하도록 metadata.page_label 필드 에 대한 필터하다 .

# Specify metadata filters
metadata_filters = MetadataFilters(
   filters=[ExactMatchFilter(key="metadata.page_label", value="2")]
)
retriever = vector_store_index.as_retriever(similarity_top_k=3, filters=metadata_filters)
nodes = retriever.retrieve("MongoDB acquisition")
for node in nodes:
    print(node)

Node ID: 479446ef-8a32-410d-a5e0-8650bd10d78d
Text: MongoDB  completed the redemption of 2026 Convertible Notes,
eliminating all debt from the balance sheet. Additionally, in
conjunction with the acquisition of Voyage, MongoDB  is announcing a
stock buyback program of $200 million, to offset the dilutive impact
of the acquisition consideration.
Score:  0.914
Node ID: f3c35db6-43e5-4da7-a297-d9b009b9d300
Text: Lombard Odier, a Swiss private bank, partnered with MongoDB  to
migrate and modernize its legacy banking technology systems on MongoDB
with generative AI. The initiative enabled the bank to migrate code
50-60 times quicker and move applications from a legacy relational
database to MongoDB  20 times faster than previous migrations.
Score:  0.912
Node ID: 82a2a0c0-80b9-4a9e-a848-529b4ff8f301
Text: Fourth Quarter Fiscal 2025 and Recent Business Highlights
MongoDB  acquired Voyage AI, a pioneer in state-of-the-art embedding
and reranking models that power next-generation AI applications.
Integrating Voyage AI's technology with MongoDB  will enable
organizations to easily build trustworthy, AI-powered applications by
offering highly accurate...
Score:  0.911

데이터에 대한 질문에 답변

이 섹션에서는 MongoDB Vector Search 및 LlamaIndex를 사용하여 애플리케이션에서 RAG 를 구현 방법을 설명합니다. 이제 벡터 검색 쿼리를 실행하여 의미적으로 유사한 문서를 조회하는 방법을 알아보았으니, MongoDB Vector Search를 사용하여 문서를 조회하고 LlamaIndex 쿼리 엔진 를 사용하여 해당 문서를 기반으로 질문에 답변하는 코드를 실행하세요.

이 예제는 다음을 수행합니다:

MongoDB Vector Search를 벡터 저장소를 위한 특정 유형의 리트리버인 벡터 인덱스 리트리버로 인스턴스화합니다. 여기에는 similarity_top_k 매개변수가 포함되어 있어 MongoDB Vector Search가 가장 관련성이 높은 5 문서만 검색할 수 있습니다.

RetrieverQueryEngine 쿼리 엔진을 인스턴스화하여 데이터에 대한 질문에 답변합니다. 메시지가 표시되면 쿼리 엔진은 다음 조치를 수행합니다.
- MongoDB Vector Search를 리트리버로 사용하여 프롬프트에 따라 의미적으로 유사한 문서를 쿼리 .
- 조회된 문서를 기반으로 컨텍스트 인식 응답을 생성하기 위해 환경 설정할 때 지정한 LLM을 호출합니다.
LLM에 Atlas 보안 권장 사항에 대한 샘플 쿼리를 제공합니다.
LLM의 응답과 컨텍스트로 사용된 문서를 반환합니다. 생성된 응답은 다를 수 있습니다.

# Instantiate MongoDB Vector Search as a retriever
vector_store_retriever = VectorIndexRetriever(index=vector_store_index, similarity_top_k=5)
# Pass the retriever into the query engine
query_engine = RetrieverQueryEngine(retriever=vector_store_retriever)
# Prompt the LLM
response = query_engine.query("What was MongoDB's latest acquisition?")
print(response)
print("\nSource documents: ")
pprint.pprint(response.source_nodes)

MongoDB's latest acquisition was Voyage AI, a pioneer in embedding and reranking models for next-generation AI applications.
Source documents:
[NodeWithScore(node=TextNode(id_='82a2a0c0-80b9-4a9e-a848-529b4ff8f301', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='8cfe6680-8dec-486e-92c5-89ac1733b6c8', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='b6c412af868c29d67a6b030f266cd0e680f4a578a34c209c1818ff9a366c9d44'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='479446ef-8a32-410d-a5e0-8650bd10d78d', node_type='1', metadata={}, hash='b805543bf0ef0efc25492098daa9bd9c037043fb7228fb0c3270de235e668341')}, metadata_template='{key}: {value}', metadata_separator='\n', text="Fourth Quarter Fiscal 2025 and Recent Business Highlights\nMongoDB  acquired Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation\nAI applications. Integrating Voyage AI's technology with MongoDB  will enable organizations to easily build trustworthy,\nAI-powered applications by offering highly accurate and relevant information retrieval deeply integrated with operational\ndata.", mimetype='text/plain', start_char_idx=1678, end_char_idx=2101, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9279670119285583),
 NodeWithScore(node=TextNode(id_='453137d9-8902-4fae-8d81-5f5d9b0836eb', embedding=None, metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='62b7cace-30c0-4687-9d87-e178547ae357', node_type='4', metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='cb1dbd172c17e53682296ccc966ebdbb5605acb4fbf3872286e3a202c1d3650d'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='b6ae7c13-5bec-47f5-887f-835fc7bae374', node_type='1', metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='a4835102686cdf03d1106946237d50031d00a0861eea892e38b928dd5e44e295'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='3d4034d3-bac5-4985-8926-9213f8a87318', node_type='1', metadata={}, hash='f103b351f2bda28ec3d2f1bb4f40d93ac1698ea5f7630a5297688a4caa419389')}, metadata_template='{key}: {value}', metadata_separator='\n', text='"Looking ahead, we remain incredibly excited about our long-term growth opportunity. MongoDB  removes the constraints of legacy databases,\nenabling businesses to innovate at AI speed with our flexible document model and seamless scalability. Following the Voyage AI acquisition, we\ncombine real-time data, sophisticated embedding and retrieval models and semantic search directly in the database, simplifying the development of\ntrustworthy AI-powered apps."', mimetype='text/plain', start_char_idx=1062, end_char_idx=1519, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.921961784362793),
 NodeWithScore(node=TextNode(id_='85dd431c-2d4c-4336-ab39-e87a97b30c59', embedding=None, metadata={'page_label': '4', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='311532cc-f526-4fc3-adb6-49e76afdd580', node_type='4', metadata={'page_label': '4', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='37f0ad7fcb7f204226ea7c6c475360e2db55bb77447f1742a164efb9c1da5dc0'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='6175bcb6-9e2a-4196-85f7-0585bcbbdd3b', node_type='1', metadata={}, hash='0e92e55a50f8b6dbfe7bcaedb0ccc42345a185048efcd440e3ee1935875e7cbf')}, metadata_template='{key}: {value}', metadata_separator='\n', text="Headquartered in New York, MongoDB's mission is to empower innovators to create, transform, and disrupt industries with software and data.\nMongoDB's unified, intelligent data platform was built to power the next generation of applications, and MongoDB  is the most widely available, globally\ndistributed database on the market.", mimetype='text/plain', start_char_idx=0, end_char_idx=327, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9217028021812439),
 NodeWithScore(node=TextNode(id_='f3c35db6-43e5-4da7-a297-d9b009b9d300', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='3008736c-29f0-4b41-ac0f-efdb469319b9', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='cd3647350e6d7fcd89e2303fe1995b8f91b633c5f33e14b3b4c18a16738ea86f'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='c9bef874-77ee-40bc-a1fe-ca42d1477cb3', node_type='1', metadata={}, hash='c7d7af8a1b43b587a9c47b27f57e7cb8bc35bd90390a078db21e3f5253ee7cc1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Lombard Odier, a Swiss private bank, partnered with MongoDB  to migrate and modernize its legacy banking technology\nsystems on MongoDB  with generative AI. The initiative enabled the bank to migrate code 50-60 times quicker and move\napplications from a legacy relational database to MongoDB  20 times faster than previous migrations.', mimetype='text/plain', start_char_idx=2618, end_char_idx=2951, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9197831153869629),
 NodeWithScore(node=TextNode(id_='479446ef-8a32-410d-a5e0-8650bd10d78d', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='82a2a0c0-80b9-4a9e-a848-529b4ff8f301', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='688872b911c388c239669970f562d4014aaec4753903e75f4bdfcf1eb1daf5ab'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='3008736c-29f0-4b41-ac0f-efdb469319b9', node_type='1', metadata={}, hash='a854a9bf103e429ce78b45603df9e2341e5d0692aa95e544e6c82616be29b28e')}, metadata_template='{key}: {value}', metadata_separator='\n', text='MongoDB  completed the redemption of 2026 Convertible Notes, eliminating all debt from the balance sheet. Additionally, in\nconjunction with the acquisition of Voyage, MongoDB  is announcing a stock buyback program of $200 million, to offset the\ndilutive impact of the acquisition consideration.', mimetype='text/plain', start_char_idx=2102, end_char_idx=2396, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9183852672576904)]

이 예제는 다음을 수행합니다:

MongoDB Vector Search가 2페이지에 나타나는 문서만 검색하도록 metadata.page_label 필드 에 메타데이터 필터하다 정의합니다.
MongoDB Vector Search를 벡터 저장소를 위한 특정 유형의 리트리버인 벡터 인덱스 리트리버로 인스턴스화합니다. 여기에는 정의한 메타데이터 필터와 similarity_top_k 매개 변수가 포함되어 있어 MongoDB Vector Search가 2페이지에서 가장 관련성이 높은 5 문서만 검색합니다.

RetrieverQueryEngine 쿼리 엔진을 인스턴스화하여 데이터에 대한 질문에 답변합니다. 메시지가 표시되면 쿼리 엔진은 다음 조치를 수행합니다.
- MongoDB Vector Search를 리트리버로 사용하여 프롬프트에 따라 의미적으로 유사한 문서를 쿼리 .
- 조회된 문서를 기반으로 컨텍스트 인식 응답을 생성하기 위해 환경 설정할 때 지정한 LLM을 호출합니다.
LLM에 Atlas 보안 권장 사항에 대한 샘플 쿼리를 제공합니다.
LLM의 응답과 컨텍스트로 사용된 문서를 반환합니다. 생성된 응답은 다를 수 있습니다.

# Specify metadata filters
metadata_filters = MetadataFilters(
   filters=[ExactMatchFilter(key="metadata.page_label", value="2")]
)
# Instantiate MongoDB Vector Search as a retriever
vector_store_retriever = VectorIndexRetriever(index=vector_store_index, filters=metadata_filters, similarity_top_k=5)
# Pass the retriever into the query engine
query_engine = RetrieverQueryEngine(retriever=vector_store_retriever)
# Prompt the LLM
response = query_engine.query("What was MongoDB's latest acquisition?")
print(response)
print("\nSource documents: ")
pprint.pprint(response.source_nodes)

MongoDB's latest acquisition was Voyage AI, a pioneer in embedding and reranking models that power next-generation AI applications.
Source documents:
[NodeWithScore(node=TextNode(id_='82a2a0c0-80b9-4a9e-a848-529b4ff8f301', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='8cfe6680-8dec-486e-92c5-89ac1733b6c8', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='b6c412af868c29d67a6b030f266cd0e680f4a578a34c209c1818ff9a366c9d44'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='479446ef-8a32-410d-a5e0-8650bd10d78d', node_type='1', metadata={}, hash='b805543bf0ef0efc25492098daa9bd9c037043fb7228fb0c3270de235e668341')}, metadata_template='{key}: {value}', metadata_separator='\n', text="Fourth Quarter Fiscal 2025 and Recent Business Highlights\nMongoDB  acquired Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation\nAI applications. Integrating Voyage AI's technology with MongoDB  will enable organizations to easily build trustworthy,\nAI-powered applications by offering highly accurate and relevant information retrieval deeply integrated with operational\ndata.", mimetype='text/plain', start_char_idx=1678, end_char_idx=2101, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9280173778533936),
 NodeWithScore(node=TextNode(id_='f3c35db6-43e5-4da7-a297-d9b009b9d300', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='3008736c-29f0-4b41-ac0f-efdb469319b9', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='cd3647350e6d7fcd89e2303fe1995b8f91b633c5f33e14b3b4c18a16738ea86f'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='c9bef874-77ee-40bc-a1fe-ca42d1477cb3', node_type='1', metadata={}, hash='c7d7af8a1b43b587a9c47b27f57e7cb8bc35bd90390a078db21e3f5253ee7cc1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Lombard Odier, a Swiss private bank, partnered with MongoDB  to migrate and modernize its legacy banking technology\nsystems on MongoDB  with generative AI. The initiative enabled the bank to migrate code 50-60 times quicker and move\napplications from a legacy relational database to MongoDB  20 times faster than previous migrations.', mimetype='text/plain', start_char_idx=2618, end_char_idx=2951, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9198455214500427),
 NodeWithScore(node=TextNode(id_='479446ef-8a32-410d-a5e0-8650bd10d78d', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='82a2a0c0-80b9-4a9e-a848-529b4ff8f301', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='688872b911c388c239669970f562d4014aaec4753903e75f4bdfcf1eb1daf5ab'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='3008736c-29f0-4b41-ac0f-efdb469319b9', node_type='1', metadata={}, hash='a854a9bf103e429ce78b45603df9e2341e5d0692aa95e544e6c82616be29b28e')}, metadata_template='{key}: {value}', metadata_separator='\n', text='MongoDB  completed the redemption of 2026 Convertible Notes, eliminating all debt from the balance sheet. Additionally, in\nconjunction with the acquisition of Voyage, MongoDB  is announcing a stock buyback program of $200 million, to offset the\ndilutive impact of the acquisition consideration.', mimetype='text/plain', start_char_idx=2102, end_char_idx=2396, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.918432891368866),
 NodeWithScore(node=TextNode(id_='3008736c-29f0-4b41-ac0f-efdb469319b9', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='479446ef-8a32-410d-a5e0-8650bd10d78d', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='833c2af73d617c1fef7d04111e010bfe06eeeb36c71225c0fb72987cd164526b'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='f3c35db6-43e5-4da7-a297-d9b009b9d300', node_type='1', metadata={}, hash='c39c6258ff9fe34b650dd2782ae20e1ed57ed20465176cbf455ee9857e57dba0')}, metadata_template='{key}: {value}', metadata_separator='\n', text='For the third consecutive year, MongoDB  was named a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud\nDatabase Management Systems. Gartner evaluated 20 vendors based on Ability to Execute and Completeness of Vision.', mimetype='text/plain', start_char_idx=2397, end_char_idx=2617, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.917201817035675),
 NodeWithScore(node=TextNode(id_='d50a3746-84ac-4928-a252-4eda3515f9fc', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='25e4f1c9-41ba-4344-b775-842a0a15c207', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='28af4302a69924722e2ccd2015b8d64fa83790b4f0d4759898ede48e40668fa1'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='13da6584-75b4-4eb8-a071-8297087ce12c', node_type='1', metadata={}, hash='e316923acbe01dede55287258f9649bb9865ef2357f2316e190b97aef84f22ec')}, metadata_template='{key}: {value}', metadata_separator='\n', text="as amended, including statements concerning MongoDB's financial guidance\nfor the first fiscal quarter and full year fiscal 2026 and underlying assumptions, our expectations regarding Atlas consumption growth and the benefits\nof the Voyage AI acquisition.", mimetype='text/plain', start_char_idx=5174, end_char_idx=5428, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9084539413452148)]

다음 단계

데이터 커넥터, 인덱스 및 쿼리 엔진을 포함하여 RAG 애플리케이션을 위한 LlamaIndex의 전체 도구 라이브러리를 탐색하려면 LlamaHub.를 참조하세요.

이 튜토리얼의 애플리케이션 확장하여 주고받는 대화를 나누려면 채팅 엔진을 참조하세요.

MongoDB는 다음과 같은 개발자 리소스도 제공합니다.

팁

돌아가기

LangChain4j

Python 통합