Get Started with the LlamaIndex Integration

You can integrate MongoDB Vector Search with LlamaIndex to implement retrieval-augmented generation (RAG) in your LLM application. This tutorial demonstrates how to start using MongoDB Vector Search with LlamaIndex to perform semantic search on your data and build a RAG implementation. Specifically, you perform the following actions:

Set up the environment.
Store custom data in MongoDB.
Create a MongoDB Vector Search index on your data.
Run the following vector search queries:
- Semantic search.
- Semantic search with metadata pre-filtering.
Implement RAG by using MongoDB Vector Search to answer questions on your data.

Work with a runnable version of this tutorial as a Python notebook.

Background

LlamaIndex is an open-source framework designed to simplify how you connect custom data sources to LLMs. It provides several tools, such as data connectors, indexes, and query engines, to help you load and prepare vector embeddings for RAG applications.

By integrating MongoDB Vector Search with LlamaIndex, you can use MongoDB as a vector database and use MongoDB Vector Search to implement RAG by retrieving semantically similar documents from your data. To learn more about RAG, see Retrieval-Augmented Generation (RAG) with MongoDB.
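
The retrieve-then-generate flow described above can be sketched without any external services. The following is a minimal illustration only: the retrieve and build_prompt helpers and the toy word-overlap scoring are hypothetical stand-ins, not part of the LlamaIndex or MongoDB APIs. A real implementation ranks documents by vector similarity rather than by shared words.

```python
def retrieve(question, documents, top_k=2):
    """Rank documents by a toy word-overlap score and keep the best matches."""
    question_words = set(question.lower().split())
    scored = []
    for doc in documents:
        overlap = len(question_words & set(doc.lower().split()))
        scored.append((overlap, doc))
    # Highest overlap first; documents with no overlap are discarded
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(question, context_docs):
    """Combine the retrieved documents and the question into one LLM prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )


documents = [
    "MongoDB acquired Voyage AI",
    "Total revenue was $548.4 million",
    "The weather is sunny today",
]
question = "Which company did MongoDB acquire?"
context_docs = retrieve(question, documents)
prompt = build_prompt(question, context_docs)
print(prompt)
```

The prompt that reaches the LLM contains only the retrieved context, which is what grounds the generated answer in your own data.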

Procedure

Prerequisites

To complete this tutorial, you must have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created using the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
A Voyage AI API key. To create an API key, see Model API Keys.
An environment to run interactive Python notebooks, such as Colab.

Set Up the Environment

Set up the environment for this tutorial. Create an interactive Python notebook by saving a file with the .ipynb extension. This notebook allows you to run Python code snippets individually, and you use it to run the code in this tutorial.

To set up your notebook environment:

Install and import dependencies.

Run the following command:

pip install --quiet --upgrade llama-index llama-index-vector-stores-mongodb llama-index-llms-openai llama-index-embeddings-voyageai pymongo

Then, run the following code to import the required packages:

import os, pymongo, pprint
from pymongo.operations import SearchIndexModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.core.settings import Settings
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.vector_stores import MetadataFilter, MetadataFilters, ExactMatchFilter, FilterOperator
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.embeddings.voyageai import VoyageEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

Define environment variables.

Run the following code, replacing the placeholders with the following values:

Your OpenAI API key.
Your Voyage AI API key.
Your MongoDB cluster's SRV connection string.

os.environ["OPENAI_API_KEY"] = "<openai-api-key>"
os.environ["VOYAGEAI_API_KEY"] = "<voyageai-api-key>"
MONGODB_URI = "<connection-string>"

Note

Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.

For an Atlas cluster, your connection string should use the following format:

mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

To learn more, see Connect to a Cluster via Client Libraries.

For a local Atlas deployment, your connection string should use the following format:

mongodb://localhost:<port-number>/?directConnection=true

To learn more, see Connection Strings.
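
If your database username or password contains reserved characters such as @, :, or /, they must be percent-encoded before you place them in the connection string. The following is a minimal sketch of both formats above. The build_srv_uri and build_local_uri helpers, the sample credentials, and the host name are hypothetical and shown for illustration only.

```python
from urllib.parse import quote_plus


def build_srv_uri(username, password, host):
    """Build an SRV connection string with percent-encoded credentials."""
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}"


def build_local_uri(port):
    """Build a connection string for a deployment listening on localhost."""
    return f"mongodb://localhost:{port}/?directConnection=true"


# Hypothetical credentials and host, used only to show the encoding
uri = build_srv_uri("app_user", "p@ss:word/1", "cluster0.example.mongodb.net")
print(uri)
print(build_local_uri(27017))
```

You could then assign the result to MONGODB_URI instead of pasting a hand-edited string.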

Configure LlamaIndex settings.

Run the following code to configure settings specific to LlamaIndex. These settings specify the following:

voyage-3-large as the embedding model used by your application to generate vector embeddings from your data.
OpenAI as the LLM used by your application to answer questions on your data.
Chunk size and overlap to customize how LlamaIndex partitions your data for storage.

from llama_index.embeddings.voyageai import VoyageEmbedding
# Use voyage-3-large to generate vector embeddings
embed_model = VoyageEmbedding(
  voyage_api_key = os.environ["VOYAGEAI_API_KEY"],
  model_name = "voyage-3-large",
)
# Use OpenAI as the LLM and apply the chunking settings
Settings.llm = OpenAI()
Settings.embed_model = embed_model
Settings.chunk_size = 100
Settings.chunk_overlap = 10
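
To build intuition for the chunk_size and chunk_overlap values, the following sketch splits a list of words into overlapping windows. It is a simplified illustration only: chunk_words is a hypothetical helper that counts whitespace-separated words, whereas LlamaIndex's own splitter counts tokens and respects sentence boundaries, so real chunk boundaries differ.

```python
def chunk_words(words, chunk_size, chunk_overlap):
    """Split a list of words into windows that share chunk_overlap words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        # Stop once a window has reached the end of the input
        if start + chunk_size >= len(words):
            break
    return chunks


# 250 placeholder words stand in for the text of a document
words = [f"w{i}" for i in range(250)]
chunks = chunk_words(words, chunk_size=100, chunk_overlap=10)
print([len(chunk) for chunk in chunks])  # [100, 100, 70]
```

The last 10 words of each chunk reappear at the start of the next one, which helps preserve context that would otherwise be cut at a chunk boundary.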

Use MongoDB as a Vector Store

Next, load custom data into MongoDB and instantiate your MongoDB cluster as a vector database, also called a vector store. Copy and paste the following code snippets into your notebook.

Load the sample data.

For this tutorial, you use a publicly accessible PDF document that contains a recent MongoDB earnings report as the data source for your vector store. This document describes MongoDB's financial results for the fourth quarter and full year of fiscal 2025.

To load the sample data, run the following code snippet. It does the following:

Creates a new directory called data.
Retrieves the PDF from the specified URL and saves it as a file in the directory.
Uses the SimpleDirectoryReader data connector to extract raw text and metadata from the file. It also formats the data into documents.

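# Create the data directory and download the sample PDF into it
from urllib.request import urlretrieve
os.makedirs("data", exist_ok=True)
urlretrieve("https://investors.mongodb.com/node/13176/pdf", "data/mongodb-earnings-report.pdf")
# Extract text and metadata from the PDF and format them as documents
sample_data = SimpleDirectoryReader(input_files=["data/mongodb-earnings-report.pdf"]).load_data()
# Print the first document
sample_data[0]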

Document(id_='62b7cace-30c0-4687-9d87-e178547ae357', embedding=None,
metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf',
'file_path': 'data/mongodb-earnings-report.pdf', 'file_type':
'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28',
'last_modified_date': '2025-05-28'},
excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size',
'creation_date', 'last_modified_date', 'last_accessed_date'],
excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size',
'creation_date', 'last_modified_date', 'last_accessed_date'],
relationships={}, metadata_template='{key}: {value}', metadata_separator='\n',
text_resource=MediaResource(embeddings=None, data=None, text='MongoDB, Inc.
Announces Fourth Quarter and Full Year Fiscal 2025 Financial Results\nMarch 5,
2025\nFourth Quarter Fiscal 2025 Total Revenue of $548.4 million, up 20%
Year-over-Year\nFull Year Fiscal 2025 Total Revenue of $2.01 billion, up 19%
Year-over-Year\nContinued Strong Customer Growth with Over 54,500 Customers as
of January 31, 2025\nMongoDB Atlas Revenue up 24% Year-over-Year; 71% of Total
Q4 Revenue\nNEW YORK , March 5, 2025 /PRNewswire/ -- MongoDB, Inc. (NASDAQ:
MDB) today announced its financial results for the fourth quarter and
fiscal\nyear ended January 31, 2025.\n\xa0\n  \xa0\n"MongoDB  delivered a
strong end to fiscal 2025 with 24% Atlas revenue growth and significant margin
expansion. Atlas consumption in the quarter\nwas better than expected and we
continue to see good performance in new workload wins due to the flexibility,
scalability and performance of the\nMongoDB  platform. In fiscal year 2026 we
expect to see stable consumption growth in Atlas, our main growth driver,"
said Dev Ittycheria, President\nand Chief Executive Officer of MongoDB
.\n"Looking ahead, we remain incredibly excited about our long-term growth
opportunity. MongoDB  removes the constraints of legacy databases,\nenabling
businesses to innovate at AI speed with our flexible document model and
seamless scalability. Following the Voyage AI acquisition, we\ncombine
real-time data, sophisticated embedding and retrieval models and semantic
search directly in the database, simplifying the development of\ntrustworthy
AI-powered apps."\nFourth Quarter Fiscal 2025 Financial Highlights\nRevenue:
Total revenue was $548.4 million for the fourth quarter of fiscal 2025, an
increase of 20% year-over-year.\nSubscription revenue was $531.0 million, an
increase of 19% year-over-year, and services revenue was $17.4 million,
an\nincrease of 34% year-over-year.\nGross Profit: Gross profit was $399.4
million for the fourth quarter of fiscal 2025, representing a 73% gross
margin\ncompared to 75% in the year-ago period. Non-GAAP gross profit was
$411.7 million, representing a 75% non-GAAP gross\nmargin, compared to a
non-GAAP gross margin of 77% in the year-ago period.\nLoss from Operations:
Loss from operations was $18.6 million for the fourth quarter of fiscal 2025,
compared to a loss\nfrom operations of $71.0 million in the year-ago period.
Non-GAAP income from operations was $112.5 million, compared\nto non-GAAP
income from operations of $69.2 million in the year-ago period.\nNet Income
(Loss): Net income was $15.8 million, or $0.20 per share, based on 77.6
million weighted-average shares\noutstanding, for the fourth quarter of fiscal
2025. This compares to a net loss of $55.5 million, or $0.77 per share, in
the\nyear-ago period. Non-GAAP net income was $108.4 million, or $1.28 per
share, based on 84.6 million fully diluted\nweighted-average shares
outstanding. This compares to a non-GAAP net income of $71.1 million, or $0.86
per share, in\nthe year-ago period.\nCash Flow: As of January 31, 2025,
MongoDB  had $2.3 billion in cash, cash equivalents, short-term investments
and\nrestricted cash. During the three months ended January 31, 2025, MongoDB
generated $50.5 million of cash from\noperations, compared to $54.6 million of
cash from operations in the year-ago period. MongoDB  used $26.0 million of
cash\nin capital expenditures and used $1.6 million of cash in principal
payments of finance leases, leading to free cash flow of\n$22.9 million,
compared to free cash flow of $50.5 million in the year-ago period.\nFull Year
Fiscal 2025 Financial Highlights\nRevenue: Total revenue was $2.01 billion for
the full year fiscal 2025, an increase of 19% year-over-year.
Subscription\nrevenue was $1.94 billion, an increase of 19% year-over-year,
and services revenue was $62.6 million, an increase of
12%\nyear-over-year.\nGross Profit: Gross profit was $1.47 billion for the
full year fiscal 2025, representing a 73% gross margin compared to',
path=None, url=None, mimetype=None), image_resource=None, audio_resource=None,
video_resource=None, text_template='{metadata_str}\n\n{content}')

Instantiate the vector store.

Run the following code to create a vector store by using the MongoDBAtlasVectorSearch class, which specifies the following:

A connection to your MongoDB cluster.
llamaindex_db.test as the MongoDB database and collection used to store the documents.
vector_index as the index to use for querying the vector store.

Then, you save the vector store to a storage context, which is a LlamaIndex container object used to prepare your data for storage.

# Connect to your MongoDB cluster
mongo_client = pymongo.MongoClient(MONGODB_URI)
# Instantiate the vector store
vector_store = MongoDBAtlasVectorSearch(
    mongo_client,
    db_name = "llamaindex_db",
    collection_name = "test",
    vector_index_name = "vector_index"
)
vector_store_context = StorageContext.from_defaults(vector_store=vector_store)

Store your data as vector embeddings.

After you load your data and instantiate MongoDB as a vector store, generate vector embeddings from your data and store them in MongoDB. To do this, you must build a vector store index. This type of index is a LlamaIndex data structure that splits, embeds, and then stores your data in the vector store.

The following code uses the VectorStoreIndex.from_documents method to build the vector store index on your sample data. It turns your sample data into vector embeddings and stores these embeddings as documents in the llamaindex_db.test collection in your MongoDB cluster, as specified by the vector store's storage context.

Note

This method uses the embedding model and chunk settings that you configured when you set up your environment.

vector_store_index = VectorStoreIndex.from_documents(
   sample_data, storage_context=vector_store_context, show_progress=True
)

Tip

After you run the sample code, if you're using Atlas, you can verify your vector embeddings by navigating to the llamaindex_db.test namespace in the Atlas UI.

Create the MongoDB Vector Search Index

To enable vector search queries on your vector store, create a MongoDB Vector Search index on the llamaindex_db.test collection.

In your notebook, run the following code to create an index of the vectorSearch type that indexes the following fields:

embedding field as the vector type. The embedding field contains the embeddings created using Voyage AI's voyage-3-large embedding model. The index definition specifies 1024 vector dimensions and measures similarity using cosine.
metadata.page_label field as the filter type for pre-filtering data by the page number in the PDF.
# Specify the collection for which to create the index
collection = mongo_client["llamaindex_db"]["test"]
# Create your index model, then create the search index
search_index_model = SearchIndexModel(
  definition={
    "fields": [
      {
        "type": "vector",
        "path": "embedding",
        "numDimensions": 1024,
        "similarity": "cosine"
      },
      {
        "type": "filter",
        "path": "metadata.page_label"
      }
    ]
  },
  name="vector_index",
  type="vectorSearch"
)
collection.create_search_index(model=search_index_model)

The index should take about one minute to build. While it builds, the index is in an initial sync state. When it finishes building, you can start querying the data in your collection.
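
Rather than waiting a fixed amount of time, you can poll the index status from your notebook. The following is a minimal sketch, not part of the original tutorial: wait_until_queryable is a hypothetical helper, the timeout and polling values are arbitrary, and it assumes that each document returned by PyMongo's list_search_indexes method includes a boolean queryable field.

```python
import time


def wait_until_queryable(collection, index_name, timeout_s=300, poll_s=5):
    """Poll a search index until it reports queryable, or until the timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        # list_search_indexes returns one document per matching index
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable") is True:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

In the notebook, you might call wait_until_queryable(collection, "vector_index") right after create_search_index and run queries only if it returns True.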

Prerequisites

To complete this tutorial with OpenAI embeddings instead of Voyage AI embeddings, you must have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created using the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
An environment to run interactive Python notebooks, such as Colab.

Set Up the Environment

To set up your notebook environment:

Install and import dependencies.

Run the following command:

pip install --quiet --upgrade llama-index llama-index-vector-stores-mongodb llama-index-embeddings-openai pymongo

Then, run the following code to import the required packages:

import os, pymongo, pprint
from pymongo.operations import SearchIndexModel
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, StorageContext
from llama_index.core.settings import Settings
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.vector_stores import MetadataFilter, MetadataFilters, ExactMatchFilter, FilterOperator
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

Define environment variables.

Run the following code, replacing the placeholders with the following values:

Your OpenAI API key.
Your MongoDB cluster's connection string.

os.environ["OPENAI_API_KEY"] = "<api-key>"
MONGODB_URI = "<connection-string>"

Note

Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.

For an Atlas cluster, your connection string should use the following format:

mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

To learn more, see Connect to a Cluster via Client Libraries.

For a local Atlas deployment, your connection string should use the following format:

mongodb://localhost:<port-number>/?directConnection=true

To learn more, see Connection Strings.

Configure LlamaIndex settings.

Run the following code to configure settings specific to LlamaIndex. These settings specify the following:

OpenAI as the LLM used by your application to answer questions on your data.
text-embedding-ada-002 as the embedding model used by your application to generate vector embeddings from your data.
Chunk size and overlap to customize how LlamaIndex partitions your data for storage.

Settings.llm = OpenAI()
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
Settings.chunk_size = 100
Settings.chunk_overlap = 10

Use MongoDB as a Vector Store

Load the sample data.

To load the sample data, run the following code snippet. It does the following:

Creates a new directory called data.
Retrieves the PDF from the specified URL and saves it as a file in the directory.
Uses the SimpleDirectoryReader data connector to extract raw text and metadata from the file. It also formats the data into documents.

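# Create the data directory and download the sample PDF into it
from urllib.request import urlretrieve
os.makedirs("data", exist_ok=True)
urlretrieve("https://investors.mongodb.com/node/13176/pdf", "data/mongodb-earnings-report.pdf")
# Extract text and metadata from the PDF and format them as documents
sample_data = SimpleDirectoryReader(input_files=["data/mongodb-earnings-report.pdf"]).load_data()
# Print the first document
sample_data[0]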

Instantiate the vector store.

Run the following code to create a vector store by using the MongoDBAtlasVectorSearch class, which specifies the following:

A connection to your MongoDB cluster.
llamaindex_db.test as the MongoDB database and collection used to store the documents.
vector_index as the index to use for querying the vector store.

Then, you save the vector store to a storage context, which is a LlamaIndex container object used to prepare your data for storage.

# Connect to your MongoDB cluster
mongo_client = pymongo.MongoClient(MONGODB_URI)
# Instantiate the vector store
vector_store = MongoDBAtlasVectorSearch(
    mongo_client,
    db_name = "llamaindex_db",
    collection_name = "test",
    vector_index_name = "vector_index"
)
vector_store_context = StorageContext.from_defaults(vector_store=vector_store)

Store your data as vector embeddings.

After you load your data and instantiate MongoDB as a vector store, generate vector embeddings from your data and store them in MongoDB. To do this, you must build a vector store index. This type of index is a LlamaIndex data structure that splits, embeds, and then stores your data in the vector store.

Note

This method uses the embedding model and chunk settings that you configured when you set up your environment.

vector_store_index = VectorStoreIndex.from_documents(
   sample_data, storage_context=vector_store_context, show_progress=True
)

Tip

After you run the sample code, if you're using Atlas, you can verify your vector embeddings by navigating to the llamaindex_db.test namespace in the Atlas UI.

Create the MongoDB Vector Search Index

To enable vector search queries on your vector store, create a MongoDB Vector Search index on the llamaindex_db.test collection.

In your notebook, run the following code to create an index of the vectorSearch type that indexes the following fields:

embedding field as the vector type. The embedding field contains the embeddings created using OpenAI's text-embedding-ada-002 embedding model. The index definition specifies 1536 vector dimensions and measures similarity using cosine.
metadata.page_label field as the filter type for pre-filtering data by the page number in the PDF.

# Specify the collection for which to create the index
collection = mongo_client["llamaindex_db"]["test"]
# Create your index model, then create the search index
search_index_model = SearchIndexModel(
  definition={
    "fields": [
      {
        "type": "vector",
        "path": "embedding",
        "numDimensions": 1536,
        "similarity": "cosine"
      },
      {
        "type": "filter",
        "path": "metadata.page_label"
      }
    ]
  },
  name="vector_index",
  type="vectorSearch"
)
collection.create_search_index(model=search_index_model)
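
The numDimensions value in the index definition has to match the length of the vectors that your embedding model produces: 1024 for voyage-3-large and 1536 for text-embedding-ada-002, as stated in this tutorial. The following sketch shows one way to keep the two in step. The MODEL_DIMENSIONS table and the build_index_definition and embedding_matches helpers are hypothetical conveniences, not part of PyMongo or LlamaIndex.

```python
# Embedding sizes for the two models used in this tutorial
MODEL_DIMENSIONS = {
    "voyage-3-large": 1024,
    "text-embedding-ada-002": 1536,
}


def build_index_definition(model_name, filter_paths=()):
    """Build a vectorSearch index definition sized for the given model."""
    fields = [{
        "type": "vector",
        "path": "embedding",
        "numDimensions": MODEL_DIMENSIONS[model_name],
        "similarity": "cosine",
    }]
    # Add one filter field per metadata path used for pre-filtering
    fields.extend({"type": "filter", "path": path} for path in filter_paths)
    return {"fields": fields}


def embedding_matches(definition, embedding):
    """Check that an embedding has the length the index definition expects."""
    vector_field = next(f for f in definition["fields"] if f["type"] == "vector")
    return len(embedding) == vector_field["numDimensions"]


definition = build_index_definition("text-embedding-ada-002", ["metadata.page_label"])
print(definition)
```

The resulting dictionary has the same shape as the definition argument passed to SearchIndexModel in the tutorial code.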

Run Vector Search Queries

After MongoDB builds your index, return to your notebook and run vector search queries on your data. The following examples demonstrate different queries that you can run on your vectorized data.

This example performs a basic semantic search for the string MongoDB acquisition and returns a list of documents ranked by relevance score. It also specifies the following:

MongoDB Vector Search as a retriever to perform semantic search.
The similarity_top_k parameter to return only the three most relevant documents.

retriever = vector_store_index.as_retriever(similarity_top_k=3)
nodes = retriever.retrieve("MongoDB acquisition")
for node in nodes:
    print(node)

Node ID: 479446ef-8a32-410d-a5e0-8650bd10d78d
Text: MongoDB  completed the redemption of 2026 Convertible Notes,
eliminating all debt from the balance sheet. Additionally, in
conjunction with the acquisition of Voyage, MongoDB  is announcing a
stock buyback program of $200 million, to offset the dilutive impact
of the acquisition consideration.
Score:  0.914
Node ID: 453137d9-8902-4fae-8d81-5f5d9b0836eb
Text: "Looking ahead, we remain incredibly excited about our long-term
growth opportunity. MongoDB  removes the constraints of legacy
databases, enabling businesses to innovate at AI speed with our
flexible document model and seamless scalability. Following the Voyage
AI acquisition, we combine real-time data, sophisticated embedding and
retrieval mod...
Score:  0.914
Node ID: f3c35db6-43e5-4da7-a297-d9b009b9d300
Text: Lombard Odier, a Swiss private bank, partnered with MongoDB  to
migrate and modernize its legacy banking technology systems on MongoDB
with generative AI. The initiative enabled the bank to migrate code
50-60 times quicker and move applications from a legacy relational
database to MongoDB  20 times faster than previous migrations.
Score:  0.912
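
To interpret the Score values above, it can help to compute a score by hand. The following sketch assumes the normalization that the MongoDB Vector Search documentation describes for cosine similarity, score = (1 + cosine) / 2, which maps results to the range 0 to 1. The helper names and the two-dimensional toy vectors are hypothetical; real embeddings in this tutorial have 1024 or 1536 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def normalized_score(a, b):
    """Map a cosine similarity from [-1, 1] to [0, 1] (assumed normalization)."""
    return (1 + cosine_similarity(a, b)) / 2


query = [1.0, 0.0]
print(normalized_score(query, [2.0, 0.0]))   # same direction: 1.0
print(normalized_score(query, [0.0, 3.0]))   # orthogonal: 0.5
print(normalized_score(query, [-1.0, 0.0]))  # opposite direction: 0.0
```

Under this assumption, a score close to 1 means that the document embedding points in nearly the same direction as the query embedding.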

You can pre-filter your data by using an MQL match expression that compares the indexed field with a value that you specify. You must index any metadata fields that you want to filter by as the filter type. To learn more, see How to Index Fields for Vector Search.

Note

You specified the metadata.page_label field as a filter when you created the index for this tutorial.

This example performs a semantic search for the string MongoDB acquisition and returns a list of documents ranked by relevance score. It also specifies the following:

MongoDB Vector Search as a retriever to perform semantic search.
The similarity_top_k parameter to return only the three most relevant documents.
A filter on the metadata.page_label field so that MongoDB Vector Search searches only documents that appear on page two.

# Specify metadata filters
metadata_filters = MetadataFilters(
   filters=[ExactMatchFilter(key="metadata.page_label", value="2")]
)
retriever = vector_store_index.as_retriever(similarity_top_k=3, filters=metadata_filters)
nodes = retriever.retrieve("MongoDB acquisition")
for node in nodes:
    print(node)

Node ID: 479446ef-8a32-410d-a5e0-8650bd10d78d
Text: MongoDB  completed the redemption of 2026 Convertible Notes,
eliminating all debt from the balance sheet. Additionally, in
conjunction with the acquisition of Voyage, MongoDB  is announcing a
stock buyback program of $200 million, to offset the dilutive impact
of the acquisition consideration.
Score:  0.914
Node ID: f3c35db6-43e5-4da7-a297-d9b009b9d300
Text: Lombard Odier, a Swiss private bank, partnered with MongoDB  to
migrate and modernize its legacy banking technology systems on MongoDB
with generative AI. The initiative enabled the bank to migrate code
50-60 times quicker and move applications from a legacy relational
database to MongoDB  20 times faster than previous migrations.
Score:  0.912
Node ID: 82a2a0c0-80b9-4a9e-a848-529b4ff8f301
Text: Fourth Quarter Fiscal 2025 and Recent Business Highlights
MongoDB  acquired Voyage AI, a pioneer in state-of-the-art embedding
and reranking models that power next-generation AI applications.
Integrating Voyage AI's technology with MongoDB  will enable
organizations to easily build trustworthy, AI-powered applications by
offering highly accurate...
Score:  0.911
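
Conceptually, a pre-filtered search narrows the candidates by metadata first and then ranks only the remaining documents by similarity. The following in-memory sketch illustrates that order of operations. It is not how MongoDB executes the query internally, and the search helper, the sample records, and the two-dimensional embeddings are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def search(query_vector, records, top_k=3, page_label=None):
    """Optionally pre-filter by page label, then return the top_k closest records."""
    candidates = [
        record for record in records
        if page_label is None or record["metadata"]["page_label"] == page_label
    ]
    ranked = sorted(
        candidates,
        key=lambda record: cosine(query_vector, record["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


records = [
    {"text": "A", "embedding": [1.0, 0.0], "metadata": {"page_label": "1"}},
    {"text": "B", "embedding": [0.9, 0.1], "metadata": {"page_label": "2"}},
    {"text": "C", "embedding": [0.0, 1.0], "metadata": {"page_label": "2"}},
    {"text": "D", "embedding": [0.5, 0.5], "metadata": {"page_label": "2"}},
]
unfiltered = [r["text"] for r in search([1.0, 0.0], records, top_k=2)]
filtered = [r["text"] for r in search([1.0, 0.0], records, top_k=2, page_label="2")]
print(unfiltered)  # ['A', 'B']
print(filtered)    # ['B', 'D']
```

Because record A is on page 1, the filtered search never considers it, even though it is the closest match to the query vector.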

Answer Questions on Your Data

This section demonstrates how to implement RAG in your application with MongoDB Vector Search and LlamaIndex. Now that you've learned how to run vector search queries to retrieve semantically similar documents, run the following code to use MongoDB Vector Search to retrieve documents and a LlamaIndex query engine to answer questions based on those documents.

This example does the following:

Instantiates MongoDB Vector Search as a vector index retriever, a specific type of retriever for vector stores. It includes the similarity_top_k parameter so that MongoDB Vector Search retrieves only the five most relevant documents.
Instantiates the RetrieverQueryEngine query engine to answer questions on your data. When prompted, the query engine performs the following actions:
- Uses MongoDB Vector Search as a retriever to query for semantically similar documents based on the prompt.
- Calls the LLM that you specified when you set up your environment to generate a context-aware response based on the retrieved documents.
Prompts the LLM with a sample query about MongoDB's latest acquisition.
Returns the LLM's response and the documents used as context. The generated response might vary.

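# Imports repeated here in case they are not already in scope from the environment setup
import pprint
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever

# Instantiate MongoDB Vector Search as a retriever
# (vector_store_index is the index created earlier in this tutorial)
vector_store_retriever = VectorIndexRetriever(index=vector_store_index, similarity_top_k=5)
# Pass the retriever into the query engine
query_engine = RetrieverQueryEngine(retriever=vector_store_retriever)
# Prompt the LLM
response = query_engine.query("What was MongoDB's latest acquisition?")
print(response)
print("\nSource documents: ")
pprint.pprint(response.source_nodes)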

MongoDB's latest acquisition was Voyage AI, a pioneer in embedding and reranking models for next-generation AI applications.
Source documents:
[NodeWithScore(node=TextNode(id_='82a2a0c0-80b9-4a9e-a848-529b4ff8f301', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='8cfe6680-8dec-486e-92c5-89ac1733b6c8', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='b6c412af868c29d67a6b030f266cd0e680f4a578a34c209c1818ff9a366c9d44'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='479446ef-8a32-410d-a5e0-8650bd10d78d', node_type='1', metadata={}, hash='b805543bf0ef0efc25492098daa9bd9c037043fb7228fb0c3270de235e668341')}, metadata_template='{key}: {value}', metadata_separator='\n', text="Fourth Quarter Fiscal 2025 and Recent Business Highlights\nMongoDB  acquired Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation\nAI applications. 
Integrating Voyage AI's technology with MongoDB  will enable organizations to easily build trustworthy,\nAI-powered applications by offering highly accurate and relevant information retrieval deeply integrated with operational\ndata.", mimetype='text/plain', start_char_idx=1678, end_char_idx=2101, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9279670119285583),
 NodeWithScore(node=TextNode(id_='453137d9-8902-4fae-8d81-5f5d9b0836eb', embedding=None, metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='62b7cace-30c0-4687-9d87-e178547ae357', node_type='4', metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='cb1dbd172c17e53682296ccc966ebdbb5605acb4fbf3872286e3a202c1d3650d'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='b6ae7c13-5bec-47f5-887f-835fc7bae374', node_type='1', metadata={'page_label': '1', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='a4835102686cdf03d1106946237d50031d00a0861eea892e38b928dd5e44e295'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='3d4034d3-bac5-4985-8926-9213f8a87318', node_type='1', metadata={}, hash='f103b351f2bda28ec3d2f1bb4f40d93ac1698ea5f7630a5297688a4caa419389')}, metadata_template='{key}: {value}', metadata_separator='\n', text='"Looking ahead, we remain incredibly excited about our long-term growth opportunity. MongoDB  removes the constraints of legacy databases,\nenabling businesses to innovate at AI speed with our flexible document model and seamless scalability. 
Following the Voyage AI acquisition, we\ncombine real-time data, sophisticated embedding and retrieval models and semantic search directly in the database, simplifying the development of\ntrustworthy AI-powered apps."', mimetype='text/plain', start_char_idx=1062, end_char_idx=1519, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.921961784362793),
 NodeWithScore(node=TextNode(id_='85dd431c-2d4c-4336-ab39-e87a97b30c59', embedding=None, metadata={'page_label': '4', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='311532cc-f526-4fc3-adb6-49e76afdd580', node_type='4', metadata={'page_label': '4', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='37f0ad7fcb7f204226ea7c6c475360e2db55bb77447f1742a164efb9c1da5dc0'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='6175bcb6-9e2a-4196-85f7-0585bcbbdd3b', node_type='1', metadata={}, hash='0e92e55a50f8b6dbfe7bcaedb0ccc42345a185048efcd440e3ee1935875e7cbf')}, metadata_template='{key}: {value}', metadata_separator='\n', text="Headquartered in New York, MongoDB's mission is to empower innovators to create, transform, and disrupt industries with software and data.\nMongoDB's unified, intelligent data platform was built to power the next generation of applications, and MongoDB  is the most widely available, globally\ndistributed database on the market.", mimetype='text/plain', start_char_idx=0, end_char_idx=327, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9217028021812439),
 NodeWithScore(node=TextNode(id_='f3c35db6-43e5-4da7-a297-d9b009b9d300', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='3008736c-29f0-4b41-ac0f-efdb469319b9', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='cd3647350e6d7fcd89e2303fe1995b8f91b633c5f33e14b3b4c18a16738ea86f'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='c9bef874-77ee-40bc-a1fe-ca42d1477cb3', node_type='1', metadata={}, hash='c7d7af8a1b43b587a9c47b27f57e7cb8bc35bd90390a078db21e3f5253ee7cc1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Lombard Odier, a Swiss private bank, partnered with MongoDB  to migrate and modernize its legacy banking technology\nsystems on MongoDB  with generative AI. 
The initiative enabled the bank to migrate code 50-60 times quicker and move\napplications from a legacy relational database to MongoDB  20 times faster than previous migrations.', mimetype='text/plain', start_char_idx=2618, end_char_idx=2951, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9197831153869629),
 NodeWithScore(node=TextNode(id_='479446ef-8a32-410d-a5e0-8650bd10d78d', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='82a2a0c0-80b9-4a9e-a848-529b4ff8f301', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='688872b911c388c239669970f562d4014aaec4753903e75f4bdfcf1eb1daf5ab'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='3008736c-29f0-4b41-ac0f-efdb469319b9', node_type='1', metadata={}, hash='a854a9bf103e429ce78b45603df9e2341e5d0692aa95e544e6c82616be29b28e')}, metadata_template='{key}: {value}', metadata_separator='\n', text='MongoDB  completed the redemption of 2026 Convertible Notes, eliminating all debt from the balance sheet. 
Additionally, in\nconjunction with the acquisition of Voyage, MongoDB  is announcing a stock buyback program of $200 million, to offset the\ndilutive impact of the acquisition consideration.', mimetype='text/plain', start_char_idx=2102, end_char_idx=2396, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9183852672576904)]
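
The raw source_nodes output above is dense. As a minimal sketch, the hypothetical helper below (summarize_sources is not part of LlamaIndex) condenses each retrieved node to its score, page label, and a short text preview. It assumes only the attributes visible in the printed output (score, node.metadata, and node.text), and the demo uses stand-in objects so that it runs without a database or an API key.

```python
from types import SimpleNamespace


def summarize_sources(source_nodes, max_chars=80):
    """Return one short line per retrieved node: score, page label, text preview."""
    lines = []
    for item in source_nodes:
        page = item.node.metadata.get("page_label", "?")
        # Collapse newlines and repeated spaces so each preview stays on one line
        preview = " ".join(item.node.text.split())[:max_chars]
        lines.append(f"{item.score:.3f} | page {page} | {preview}")
    return lines


# Stand-in objects (hypothetical) that mimic only the attributes used above
demo_nodes = [
    SimpleNamespace(
        node=SimpleNamespace(
            metadata={"page_label": "2"},
            text="MongoDB  acquired Voyage AI,\na pioneer in embedding models.",
        ),
        score=0.9279670119285583,
    ),
]

for line in summarize_sources(demo_nodes):
    print(line)
```

In the tutorial itself, you could pass response.source_nodes to the helper instead of demo_nodes.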

This example does the following:

Defines a metadata filter on the metadata.page_label field so that MongoDB Vector Search searches only for documents that appear on page two.
Instantiates MongoDB Vector Search as a vector index retriever, a specific type of retriever for vector stores. It includes the metadata filters that you defined and the similarity_top_k parameter so that MongoDB Vector Search retrieves only the 5 most relevant documents from page two.

Instantiates the RetrieverQueryEngine query engine to answer questions on your data. When prompted, the query engine performs the following actions:
- Uses MongoDB Vector Search as a retriever to query for semantically similar documents based on the prompt.
- Calls the LLM that you specified when you set up your environment to generate a context-aware response based on the retrieved documents.
Prompts the LLM with a sample query about MongoDB's latest acquisition.
Returns the LLM's response and the documents used as context. The generated response might vary.

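# Imports repeated here in case they are not already in scope from the environment setup
import pprint
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

# Specify metadata filters
metadata_filters = MetadataFilters(
   filters=[ExactMatchFilter(key="metadata.page_label", value="2")]
)
# Instantiate MongoDB Vector Search as a retriever
# (vector_store_index is the index created earlier in this tutorial)
vector_store_retriever = VectorIndexRetriever(index=vector_store_index, filters=metadata_filters, similarity_top_k=5)
# Pass the retriever into the query engine
query_engine = RetrieverQueryEngine(retriever=vector_store_retriever)
# Prompt the LLM
response = query_engine.query("What was MongoDB's latest acquisition?")
print(response)
print("\nSource documents: ")
pprint.pprint(response.source_nodes)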

MongoDB's latest acquisition was Voyage AI, a pioneer in embedding and reranking models that power next-generation AI applications.
Source documents:
[NodeWithScore(node=TextNode(id_='82a2a0c0-80b9-4a9e-a848-529b4ff8f301', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='8cfe6680-8dec-486e-92c5-89ac1733b6c8', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='b6c412af868c29d67a6b030f266cd0e680f4a578a34c209c1818ff9a366c9d44'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='479446ef-8a32-410d-a5e0-8650bd10d78d', node_type='1', metadata={}, hash='b805543bf0ef0efc25492098daa9bd9c037043fb7228fb0c3270de235e668341')}, metadata_template='{key}: {value}', metadata_separator='\n', text="Fourth Quarter Fiscal 2025 and Recent Business Highlights\nMongoDB  acquired Voyage AI, a pioneer in state-of-the-art embedding and reranking models that power next-generation\nAI applications. 
Integrating Voyage AI's technology with MongoDB  will enable organizations to easily build trustworthy,\nAI-powered applications by offering highly accurate and relevant information retrieval deeply integrated with operational\ndata.", mimetype='text/plain', start_char_idx=1678, end_char_idx=2101, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9280173778533936),
 NodeWithScore(node=TextNode(id_='f3c35db6-43e5-4da7-a297-d9b009b9d300', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='3008736c-29f0-4b41-ac0f-efdb469319b9', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='cd3647350e6d7fcd89e2303fe1995b8f91b633c5f33e14b3b4c18a16738ea86f'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='c9bef874-77ee-40bc-a1fe-ca42d1477cb3', node_type='1', metadata={}, hash='c7d7af8a1b43b587a9c47b27f57e7cb8bc35bd90390a078db21e3f5253ee7cc1')}, metadata_template='{key}: {value}', metadata_separator='\n', text='Lombard Odier, a Swiss private bank, partnered with MongoDB  to migrate and modernize its legacy banking technology\nsystems on MongoDB  with generative AI. 
The initiative enabled the bank to migrate code 50-60 times quicker and move\napplications from a legacy relational database to MongoDB  20 times faster than previous migrations.', mimetype='text/plain', start_char_idx=2618, end_char_idx=2951, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9198455214500427),
 NodeWithScore(node=TextNode(id_='479446ef-8a32-410d-a5e0-8650bd10d78d', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='82a2a0c0-80b9-4a9e-a848-529b4ff8f301', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='688872b911c388c239669970f562d4014aaec4753903e75f4bdfcf1eb1daf5ab'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='3008736c-29f0-4b41-ac0f-efdb469319b9', node_type='1', metadata={}, hash='a854a9bf103e429ce78b45603df9e2341e5d0692aa95e544e6c82616be29b28e')}, metadata_template='{key}: {value}', metadata_separator='\n', text='MongoDB  completed the redemption of 2026 Convertible Notes, eliminating all debt from the balance sheet. 
Additionally, in\nconjunction with the acquisition of Voyage, MongoDB  is announcing a stock buyback program of $200 million, to offset the\ndilutive impact of the acquisition consideration.', mimetype='text/plain', start_char_idx=2102, end_char_idx=2396, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.918432891368866),
 NodeWithScore(node=TextNode(id_='3008736c-29f0-4b41-ac0f-efdb469319b9', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='479446ef-8a32-410d-a5e0-8650bd10d78d', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='833c2af73d617c1fef7d04111e010bfe06eeeb36c71225c0fb72987cd164526b'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='f3c35db6-43e5-4da7-a297-d9b009b9d300', node_type='1', metadata={}, hash='c39c6258ff9fe34b650dd2782ae20e1ed57ed20465176cbf455ee9857e57dba0')}, metadata_template='{key}: {value}', metadata_separator='\n', text='For the third consecutive year, MongoDB  was named a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud\nDatabase Management Systems. 
Gartner evaluated 20 vendors based on Ability to Execute and Completeness of Vision.', mimetype='text/plain', start_char_idx=2397, end_char_idx=2617, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.917201817035675),
 NodeWithScore(node=TextNode(id_='d50a3746-84ac-4928-a252-4eda3515f9fc', embedding=None, metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='2171a7d3-482c-4f83-beee-8c37e0ebc747', node_type='4', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='ef623ef7400aa6e120f821b455b2ddce99b94c57365e7552b676abaa3eb23640'), <NodeRelationship.PREVIOUS: '2'>: RelatedNodeInfo(node_id='25e4f1c9-41ba-4344-b775-842a0a15c207', node_type='1', metadata={'page_label': '2', 'file_name': 'mongodb-earnings-report.pdf', 'file_path': 'data/mongodb-earnings-report.pdf', 'file_type': 'application/pdf', 'file_size': 150863, 'creation_date': '2025-05-28', 'last_modified_date': '2025-05-28'}, hash='28af4302a69924722e2ccd2015b8d64fa83790b4f0d4759898ede48e40668fa1'), <NodeRelationship.NEXT: '3'>: RelatedNodeInfo(node_id='13da6584-75b4-4eb8-a071-8297087ce12c', node_type='1', metadata={}, hash='e316923acbe01dede55287258f9649bb9865ef2357f2316e190b97aef84f22ec')}, metadata_template='{key}: {value}', metadata_separator='\n', text="as amended, including statements concerning MongoDB's financial guidance\nfor the first fiscal quarter and full year fiscal 2026 and underlying assumptions, our expectations regarding Atlas consumption growth and the benefits\nof the Voyage AI acquisition.", 
mimetype='text/plain', start_char_idx=5174, end_char_idx=5428, metadata_seperator='\n', text_template='{metadata_str}\n\n{content}'), score=0.9084539413452148)]
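
To make the effect of pre-filtering concrete, the following toy sketch filters candidates by an exact metadata match first and only then ranks the remaining documents by similarity, keeping the top k. It is a conceptual illustration in plain Python, not how MongoDB Vector Search is implemented. The function names, the two-dimensional embeddings, and the choice of cosine similarity are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def prefiltered_top_k(query_vec, docs, key, value, k):
    """Hypothetical helper: exact-match filter first, then rank by similarity."""
    # Step 1: keep only documents whose metadata field equals the requested value
    candidates = [d for d in docs if d["metadata"].get(key) == value]
    # Step 2: rank the remaining candidates by similarity to the query vector
    ranked = sorted(
        candidates,
        key=lambda d: cosine(query_vec, d["embedding"]),
        reverse=True,
    )
    # Step 3: return at most k results (the role similarity_top_k plays above)
    return ranked[:k]


# Toy documents with made-up embeddings
docs = [
    {"id": "a", "metadata": {"page_label": "1"}, "embedding": [1.0, 0.0]},
    {"id": "b", "metadata": {"page_label": "2"}, "embedding": [0.9, 0.1]},
    {"id": "c", "metadata": {"page_label": "2"}, "embedding": [0.0, 1.0]},
    {"id": "d", "metadata": {"page_label": "2"}, "embedding": [0.7, 0.7]},
]

top = prefiltered_top_k([1.0, 0.0], docs, "page_label", "2", k=2)
print([d["id"] for d in top])  # prints ['b', 'd']
```

Document "a" is the closest match to the query vector, but it never reaches the ranking step because it is on page 1. In the same way, the filtered results above contain only nodes whose page_label is '2'.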

Next Steps

To explore LlamaIndex's full library of tools for RAG applications, which includes data connectors, indexes, and query engines, see LlamaHub.

To extend the application in this tutorial to have back-and-forth conversations, see Chat Engine.

MongoDB also provides the following developer resources:
