Get Started with the LangChainGo Integration

You can integrate MongoDB Vector Search with LangChainGo to build large language model (LLM) applications and implement retrieval-augmented generation (RAG). This tutorial demonstrates how to start using MongoDB Vector Search with LangChainGo to perform semantic search on your data and build a RAG implementation. Specifically, you perform the following actions:

Set up the environment.
Store custom data in MongoDB.
Create a MongoDB Vector Search index on your data.
Run the following vector search queries:
- Semantic search.
- Semantic search with metadata pre-filtering.
Implement RAG by using MongoDB Vector Search to answer questions about your data.

Background

LangChainGo is the Go programming language implementation of LangChain. It is a community-driven, third-party port of the LangChain framework.

LangChain is an open-source framework that simplifies the creation of LLM applications through the use of "chains." Chains are LangChain-specific components that can be combined for a variety of AI use cases, including RAG.

By integrating MongoDB Vector Search with LangChain, you can use MongoDB as a vector database and use MongoDB Vector Search to implement RAG by retrieving semantically similar documents from your data. To learn more about RAG, see Retrieval-Augmented Generation (RAG) with MongoDB.

LangChainGo facilitates the orchestration of LLMs for AI applications by bringing the capabilities of LangChain to the Go ecosystem. It also lets developers connect to their preferred vector store-compatible databases, including MongoDB.

Procedure

Prerequisites

To complete this tutorial, you must have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created by using the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
A Voyage AI API key. To create an account and an API key, see the Voyage AI website.
A terminal and code editor to run your Go project.
Go installed on your machine.

Set Up the Environment

You must first set up the environment for this tutorial. Complete the following steps.

Initialize your Go project.

Run the following commands in your terminal to create a new directory named langchaingo-mongodb and initialize your project:

mkdir langchaingo-mongodb
cd langchaingo-mongodb
go mod init langchaingo-mongodb

Install dependencies.

Run the following commands:

go get github.com/joho/godotenv
go get github.com/tmc/langchaingo/chains
go get github.com/tmc/langchaingo/llms
go get github.com/tmc/langchaingo/prompts
go get github.com/tmc/langchaingo/vectorstores/mongovector
go get github.com/tmc/langchaingo/embeddings/voyageai
go get go.mongodb.org/mongo-driver/v2/mongo
go mod tidy

Initialize your environment variables.

In your langchaingo-mongodb project directory, create a .env file and add the following lines:

OPENAI_API_KEY="<openai-api-key>"
VOYAGEAI_API_KEY="<voyage-api-key>"
MONGODB_URI="<connection-string>"

Replace the placeholder values with your OpenAI API key, your Voyage AI API key, and the SRV connection string for your MongoDB cluster. Your connection string should use the following format:

mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/<dbname>
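
If your username or password contains reserved characters such as @ or :, they must be percent-encoded before they go into the connection string. The following is a minimal sketch, using only the Go standard library, of one way to assemble a string in the format above. The helper name buildSRVURI and all argument values are hypothetical and not part of the tutorial.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSRVURI assembles a mongodb+srv connection string and percent-encodes
// reserved characters in the credentials. This helper is illustrative only.
func buildSRVURI(username, password, host, dbname string) string {
	u := url.URL{
		Scheme: "mongodb+srv",
		User:   url.UserPassword(username, password),
		Host:   host,
		Path:   "/" + dbname,
	}
	return u.String()
}

func main() {
	// Hypothetical credentials and host; replace them with your own values.
	uri := buildSRVURI("appUser", "p@ss:word", "cluster0.example.mongodb.net", "langchaingo_db")
	fmt.Println(uri)
}
```

You could paste the printed value into the MONGODB_URI entry of your .env file.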

Create your main file.

In your langchaingo-mongodb project directory, create a file named main.go. You add code to this file throughout the tutorial.

Use MongoDB as a Vector Store

In this section, you write code in your main() function that loads custom data into MongoDB and instantiates MongoDB as a vector database, also called a vector store.

Import the following dependencies.

Add the following imports to the top of your main.go file.

 package main
 import (
   "context"
   "log"
   "os"
   "github.com/joho/godotenv"
   "github.com/tmc/langchaingo/embeddings/voyageai"
   "github.com/tmc/langchaingo/schema"
   "github.com/tmc/langchaingo/vectorstores/mongovector"
   "go.mongodb.org/mongo-driver/v2/mongo"
   "go.mongodb.org/mongo-driver/v2/mongo/options"
)

Define the vector store details.

The following code performs these actions:

Configures Atlas as a vector store by specifying the following:
- langchaingo_db.test as the collection in Atlas that stores the documents.
- vector_index as the index to use for querying the vector store.
- text as the name of the field that contains the raw text content.
- embeddings as the name of the field that contains the vector embeddings.
Prepares your custom data by doing the following:
- Defines the text for each document.
- Uses the mongovector package from LangChainGo to generate embeddings for the texts. This package stores document embeddings in MongoDB and enables searches on the stored embeddings.
- Constructs documents that include text, embeddings, and metadata.
Ingests the constructed documents into Atlas and instantiates the vector store.
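
To make the field layout above concrete, the following sketch prints the approximate shape of one stored document as JSON. It is an illustration under stated assumptions, not the library's internal type: the storedDoc struct and sampleDoc helper are hypothetical, and the three-element vector is a toy stand-in for a real embedding with many more dimensions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// storedDoc approximates the document shape described above: raw text,
// a vector under the "embeddings" path, and filterable metadata.
type storedDoc struct {
	Text       string         `json:"text"`
	Embeddings []float32      `json:"embeddings"`
	Metadata   map[string]any `json:"metadata"`
}

// sampleDoc returns a hypothetical document with a toy 3-element vector.
func sampleDoc() storedDoc {
	return storedDoc{
		Text:       "Proper tuber planting involves site selection...",
		Embeddings: []float32{0.12, -0.03, 0.87},
		Metadata:   map[string]any{"author": "A", "type": "post"},
	}
}

func main() {
	out, err := json.MarshalIndent(sampleDoc(), "", "  ")
	if err != nil {
		log.Fatalf("failed to marshal document: %v", err)
	}
	fmt.Println(string(out))
}
```

The nested metadata keys are what the index later references as metadata.author and metadata.type.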

Paste the following code into your main.go file:

// Defines the document structure
type Document struct {
	PageContent string            `bson:"text"`
	Embedding   []float32         `bson:"embedding"`
	Metadata    map[string]string `bson:"metadata"`
}
func main() {
	const (
		voyageAIEmbeddingDim = 1024
		similarityAlgorithm  = "dotProduct"
		indexName            = "vector_index"
		databaseName         = "langchaingo_db"
		collectionName       = "test"
	)
	if err := godotenv.Load(); err != nil {
		log.Fatal("No .env file found")
	}
	// Loads the MongoDB URI from environment
	uri := os.Getenv("MONGODB_URI")
	if uri == "" {
		log.Fatal("Set your 'MONGODB_URI' environment variable in the .env file")
	}
	// Loads the API key from environment
	voyageApiKey := os.Getenv("VOYAGEAI_API_KEY")
	if voyageApiKey == "" {
		log.Fatal("Set your VOYAGEAI_API_KEY environment variable in the .env file")
	}
	// Connects to MongoDB cluster
	client, err := mongo.Connect(options.Client().ApplyURI(uri))
	if err != nil {
		log.Fatalf("Failed to connect to server: %v", err)
	}
	defer func() {
		if err := client.Disconnect(context.Background()); err != nil {
			log.Fatalf("Error disconnecting the client: %v", err)
		}
	}()
	log.Println("Connected to MongoDB.")
	// Selects the database and collection
	coll := client.Database(databaseName).Collection(collectionName)
	// Creates an embedder client
	embedder, err := voyageai.NewVoyageAI(
		voyageai.WithModel("voyage-3-large"),
	)
	if err != nil {
		log.Fatalf("Failed to create an embedder: %v", err)
	}
	// Creates a new MongoDB vector store
	store := mongovector.New(coll, embedder, mongovector.WithIndex(indexName), mongovector.WithPath("embeddings"))
	// Checks if the collection is empty, and if empty, adds documents to the MongoDB vector store
	if isCollectionEmpty(coll) {
		documents := []schema.Document{
			{
				PageContent: "Proper tuber planting involves site selection, proper timing, and exceptional care. Choose spots with well-drained soil and adequate sun exposure. Tubers are generally planted in spring, but depending on the plant, timing varies. Always plant with the eyes facing upward at a depth two to three times the tuber's height. Ensure 4 inch spacing between small tubers, expand to 12 inches for large ones. Adequate moisture is needed, yet do not overwater. Mulching can help preserve moisture and prevent weed growth.",
				Metadata: map[string]any{
					"author": "A",
					"type":   "post",
				},
			},
			{
				PageContent: "Successful oil painting necessitates patience, proper equipment, and technique. Begin with a carefully prepared, primed canvas. Sketch your composition lightly before applying paint. Use high-quality brushes and oils to create vibrant, long-lasting artworks. Remember to paint 'fat over lean,' meaning each subsequent layer should contain more oil to prevent cracking. Allow each layer to dry before applying another. Clean your brushes often and avoid solvents that might damage them. Finally, always work in a well-ventilated space.",
				Metadata: map[string]any{
					"author": "B",
					"type":   "post",
				},
			},
			{
				PageContent: "For a natural lawn, selection of the right grass type suitable for your climate is crucial. Balanced watering, generally 1 to 1.5 inches per week, is important; overwatering invites disease. Opt for organic fertilizers over synthetic versions to provide necessary nutrients and improve soil structure. Regular lawn aeration helps root growth and prevents soil compaction. Practice natural pest control and consider overseeding to maintain a dense sward, which naturally combats weeds and pest.",
				Metadata: map[string]any{
					"author": "C",
					"type":   "post",
				},
			},
		}
		_, err := store.AddDocuments(context.Background(), documents)
		if err != nil {
			log.Fatalf("Error adding documents: %v", err)
		}
		log.Printf("Successfully added %d documents to the collection.\n", len(documents))
	} else {
		log.Println("Documents already exist in the collection, skipping document addition.")
	}
}
func isCollectionEmpty(coll *mongo.Collection) bool {
	count, err := coll.EstimatedDocumentCount(context.Background())
	if err != nil {
		log.Fatalf("Failed to count documents in the collection: %v", err)
	}
	return count == 0
}

Run your Go project.

Save the file, then run the following command to load your data into MongoDB.

go run main.go

Connected to MongoDB.
Successfully added 3 documents to the collection.

Tip

After running main.go, if you are using Atlas, you can verify your vector embeddings by navigating to the langchaingo_db.test namespace in the Atlas UI.

Create the MongoDB Vector Search Index

To enable vector search queries on your vector store, create a MongoDB Vector Search index on the langchaingo_db.test collection.

Add the following imports to the top of your main.go file:

import (
  // Other imports...
  "fmt"
  "time"
  "go.mongodb.org/mongo-driver/v2/bson"
)

Define the following functions in your main.go file outside of your main() function. These functions create and manage a vector search index for your MongoDB collection:

The SearchIndexExists function checks whether a search index with the specified name exists and is queryable.
The CreateVectorSearchIndex function creates a vector search index on the specified collection. This function blocks until the index is created and queryable.

// Checks if the search index exists
func SearchIndexExists(ctx context.Context, coll *mongo.Collection, idx string) (bool, error) {
	log.Println("Checking if search index exists.")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idx).SetType("vectorSearch")
	cursor, err := view.List(ctx, siOpts)
	if err != nil {
		return false, fmt.Errorf("failed to list search indexes: %w", err)
	}
	for cursor.Next(ctx) {
		index := struct {
			Name      string `bson:"name"`
			Queryable bool   `bson:"queryable"`
		}{}
		if err := cursor.Decode(&index); err != nil {
			return false, fmt.Errorf("failed to decode search index: %w", err)
		}
		if index.Name == idx && index.Queryable {
			return true, nil
		}
	}
	if err := cursor.Err(); err != nil {
		return false, fmt.Errorf("cursor error: %w", err)
	}
	return false, nil
}
// Creates a vector search index. This function blocks until the index has been
// created.
func CreateVectorSearchIndex(
	ctx context.Context,
	coll *mongo.Collection,
	idxName string,
	voyageAIEmbeddingDim int,
	similarityAlgorithm string,
) (string, error) {
	type vectorField struct {
		Type          string `bson:"type,omitempty"`
		Path          string `bson:"path,omitempty"`
		NumDimensions int    `bson:"numDimensions,omitempty"`
		Similarity    string `bson:"similarity,omitempty"`
	}
	fields := []vectorField{
		{
			Type:          "vector",
			Path:          "embeddings",
			NumDimensions: voyageAIEmbeddingDim,
			Similarity:    similarityAlgorithm,
		},
		{
			Type: "filter",
			Path: "metadata.author",
		},
		{
			Type: "filter",
			Path: "metadata.type",
		},
	}
	def := struct {
		Fields []vectorField `bson:"fields"`
	}{
		Fields: fields,
	}
	log.Println("Creating vector search index...")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idxName).SetType("vectorSearch")
	searchName, err := view.CreateOne(ctx, mongo.SearchIndexModel{Definition: def, Options: siOpts})
	if err != nil {
		return "", fmt.Errorf("failed to create the search index: %w", err)
	}
	// Awaits the creation of the index
	var doc bson.Raw
	for doc == nil {
		cursor, err := view.List(ctx, options.SearchIndexes().SetName(searchName))
		if err != nil {
			return "", fmt.Errorf("failed to list search indexes: %w", err)
		}
		if !cursor.Next(ctx) {
			break
		}
		name := cursor.Current.Lookup("name").StringValue()
		queryable := cursor.Current.Lookup("queryable").Boolean()
		if name == searchName && queryable {
			doc = cursor.Current
		} else {
			time.Sleep(5 * time.Second)
		}
	}
	return searchName, nil
}
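
The index above uses the dotProduct similarity function. As a rough illustration of what that measures, the following standalone sketch computes the dot product of unit-length vectors and maps it to a score between 0 and 1. The (1 + dotProduct) / 2 mapping is stated here as an assumption about how scores are normalized, and the vectors are toy values rather than real embeddings.

```go
package main

import (
	"fmt"
	"math"
)

// dot returns the dot product of two equal-length vectors.
func dot(a, b []float64) float64 {
	var sum float64
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// normalize scales v to unit length; a zero vector is returned unchanged.
func normalize(v []float64) []float64 {
	norm := math.Sqrt(dot(v, v))
	if norm == 0 {
		return v
	}
	out := make([]float64, len(v))
	for i, x := range v {
		out[i] = x / norm
	}
	return out
}

// score maps a dot product in [-1, 1] to the range [0, 1].
// Assumption: this mirrors a (1 + dotProduct) / 2 normalization.
func score(a, b []float64) float64 {
	return (1 + dot(a, b)) / 2
}

func main() {
	query := normalize([]float64{3, 4})
	same := normalize([]float64{6, 8})
	orthogonal := normalize([]float64{-4, 3})
	fmt.Printf("same direction: %.2f\n", score(query, same))
	fmt.Printf("orthogonal:     %.2f\n", score(query, orthogonal))
}
```

Dot product is only a meaningful similarity measure when the vectors have unit length, which is why the sketch normalizes its inputs first.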

Create the vector store collection and index by calling the preceding functions in your main() function. Add the following code to the end of your main() function:

// SearchIndexExists will return true if the provided index is defined for the
// collection. This operation blocks until the search completes.
if ok, _ := SearchIndexExists(context.Background(), coll, indexName); !ok {
	// Creates the vector store collection
	err = client.Database(databaseName).CreateCollection(context.Background(), collectionName)
	if err != nil {
		log.Fatalf("failed to create vector store collection: %v", err)
	}
	_, err = CreateVectorSearchIndex(context.Background(), coll, indexName, voyageAIEmbeddingDim, similarityAlgorithm)
	if err != nil {
		log.Fatalf("failed to create index: %v", err)
	}
	log.Println("Successfully created vector search index.")
} else {
	log.Println("Vector search index already exists.")
}

Save the file, then run the following command to create your MongoDB Vector Search index.

go run main.go

Checking if search index exists.
Creating vector search index...
Successfully created vector search index.

Tip

After running main.go, you can view your vector search index in the Atlas UI by navigating to the langchaingo_db.test collection in your cluster.
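
The index creation function waits in a loop until the index becomes queryable. If you want that wait to be bounded, one option is to poll under a context deadline. The following is a minimal, standard-library-only sketch of that pattern. The names waitUntilReady and demoWait are hypothetical, and the check function is a stand-in for a real readiness check such as SearchIndexExists.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitUntilReady calls check repeatedly, pausing for interval between calls,
// until check reports true, check fails, or ctx is cancelled or times out.
func waitUntilReady(ctx context.Context, interval time.Duration, check func(context.Context) (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ready, err := check(ctx)
		if err != nil {
			return err
		}
		if ready {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// demoWait simulates an index that becomes ready on the third check and
// returns the number of checks performed.
func demoWait() (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	calls := 0
	err := waitUntilReady(ctx, 10*time.Millisecond, func(context.Context) (bool, error) {
		calls++
		return calls >= 3, nil
	})
	return calls, err
}

func main() {
	calls, err := demoWait()
	fmt.Println("checks:", calls, "error:", err)
}
```

With a real check, a longer interval such as the five seconds used in the tutorial code would be more appropriate than the short interval in this demo.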

Run Vector Search Queries

This section demonstrates various queries that you can run on your vectorized data. Now that you've created the index, you can run vector search queries.

The following examples show a basic semantic search and a semantic search with filtering.

Add the following code to your main function and save the file.

Semantic search retrieves information that is semantically related to a query. The following code uses the SimilaritySearch() method to perform a semantic search for the string "Prevent weeds" and limits the results to the first document.

// Performs basic semantic search
docs, err := store.SimilaritySearch(context.Background(), "Prevent weeds", 1)
if err != nil {
   fmt.Println("Error performing search:", err)
}
fmt.Println("Semantic Search Results:", docs)

Run the following command to execute the query.

go run main.go

Semantic Search Results: [{For a natural lawn, selection of
the right grass type suitable for your climate is crucial.
Balanced watering, generally 1 to 1.5 inches per week, is
important; overwatering invites disease. Opt for organic
fertilizers over synthetic versions to provide necessary
nutrients and improve soil structure. Regular lawn aeration
helps root growth and prevents soil compaction. Practice
natural pest control and consider overseeding to maintain a
dense sward, which naturally combats weeds and pest.
map[author:C type:post] 0.69752026}]
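
Conceptually, a similarity search scores every candidate against the query vector and returns the highest-scoring results. The following sketch shows that idea as an exhaustive in-memory scan. It is not how the database executes the query, which typically relies on an index rather than a full scan. The item labels and three-dimensional vectors are made-up toy values.

```go
package main

import (
	"fmt"
	"sort"
)

type item struct {
	Text   string
	Vector []float64
}

type hit struct {
	Text  string
	Score float64
}

// dotProduct returns the dot product of two equal-length vectors.
func dotProduct(a, b []float64) float64 {
	var sum float64
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// topK scores every item against the query and returns the k best hits,
// ordered from highest to lowest score.
func topK(query []float64, items []item, k int) []hit {
	hits := make([]hit, 0, len(items))
	for _, it := range items {
		hits = append(hits, hit{Text: it.Text, Score: dotProduct(query, it.Vector)})
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

// sampleItems returns hypothetical unit-length vectors for three documents.
func sampleItems() []item {
	return []item{
		{Text: "tuber planting", Vector: []float64{0.6, 0.8, 0}},
		{Text: "oil painting", Vector: []float64{0, 0, 1}},
		{Text: "lawn care", Vector: []float64{0.8, 0.6, 0}},
	}
}

func main() {
	query := []float64{1, 0, 0} // toy stand-in for an embedded query string
	for _, h := range topK(query, sampleItems(), 1) {
		fmt.Printf("%s (%.2f)\n", h.Text, h.Score)
	}
}
```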

You can pre-filter your data by using an MQL match expression that compares the indexed field with another value in your collection. You must index any metadata fields that you want to filter by as the filter type. To learn more, see How to Index Fields for Vector Search.

Add the following dependency.

Add the following import to your main.go file:

import (
  // Other imports...
  "github.com/tmc/langchaingo/vectorstores"
)

Add the following code to your main function and save the file. If you already added the basic semantic search code, the docs and err variables are already declared, so assign to them with = instead of :=.

The following code uses the SimilaritySearch() method to perform a semantic search for the string "Tulip care". It specifies the following parameters:

The number of documents to return as 1.
A score threshold of 0.60.

It returns the document that matches the filter metadata.type: post and meets the score threshold.

// Performs semantic search with metadata filter
filter := map[string]interface{}{
   "metadata.type": "post",
}
docs, err := store.SimilaritySearch(context.Background(), "Tulip care", 1,
   vectorstores.WithScoreThreshold(0.60),
   vectorstores.WithFilters(filter))
if err != nil {
   fmt.Println("Error performing search:", err)
}
fmt.Println("Filter Search Results:", docs)

Run the following command to execute the query.

go run main.go

Filter Search Results: [{Proper tuber planting involves site
selection, proper timing, and exceptional care. Choose spots
with well-drained soil and adequate sun exposure. Tubers are
generally planted in spring, but depending on the plant,
timing varies. Always plant with the eyes facing upward at a
depth two to three times the tuber's height. Ensure 4 inch
spacing between small tubers, expand to 12 inches for large
ones. Adequate moisture is needed, yet do not overwater.
Mulching can help preserve moisture and prevent weed growth.
map[author:A type:post] 0.64432365}]
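
The filtered query combines three steps: narrow the candidates by metadata, drop results below the score threshold, and keep the best k. The following in-memory sketch shows that logic with precomputed, hypothetical scores. It assumes that a result is kept when its score is greater than or equal to the threshold, and it is not the library's implementation.

```go
package main

import (
	"fmt"
	"sort"
)

type scoredDoc struct {
	Text     string
	Metadata map[string]string
	Score    float64
}

// matches reports whether meta contains every key/value pair in filter.
func matches(meta, filter map[string]string) bool {
	for k, v := range filter {
		if meta[k] != v {
			return false
		}
	}
	return true
}

// search applies the metadata filter first, then the score threshold,
// and returns at most k documents ordered by descending score.
func search(docs []scoredDoc, filter map[string]string, threshold float64, k int) []scoredDoc {
	var out []scoredDoc
	for _, d := range docs {
		if matches(d.Metadata, filter) && d.Score >= threshold {
			out = append(out, d)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if k >= 0 && k < len(out) {
		out = out[:k]
	}
	return out
}

// sampleDocs returns hypothetical documents with made-up scores.
func sampleDocs() []scoredDoc {
	return []scoredDoc{
		{Text: "tuber planting", Metadata: map[string]string{"author": "A", "type": "post"}, Score: 0.64},
		{Text: "oil painting", Metadata: map[string]string{"author": "B", "type": "post"}, Score: 0.41},
		{Text: "lawn care", Metadata: map[string]string{"author": "C", "type": "draft"}, Score: 0.71},
	}
}

func main() {
	results := search(sampleDocs(), map[string]string{"type": "post"}, 0.60, 1)
	for _, d := range results {
		fmt.Printf("%s (%.2f)\n", d.Text, d.Score)
	}
}
```

In this toy data, the highest-scoring document is excluded by the filter, which shows why the filter fields must be indexed: the filter decides which documents are even eligible for scoring.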

Answer Questions on Your Data

This section demonstrates a RAG implementation that uses MongoDB Vector Search and LangChainGo. Now that you've used MongoDB Vector Search to retrieve semantically similar documents, use the following code example to prompt the LLM to answer questions based on the documents that MongoDB Vector Search returns.

Import the following dependencies.

Add the following imports to the top of your main.go file, skipping any that are already present.

import (
  // Other imports...
  "strings"
  "github.com/tmc/langchaingo/llms/openai"
  "github.com/tmc/langchaingo/chains"
  "github.com/tmc/langchaingo/prompts"
  "github.com/tmc/langchaingo/vectorstores"
)

Add the following code to the end of your main function and save the file.

This code does the following:

Instantiates MongoDB Vector Search as a retriever to query for semantically similar documents.
Defines a LangChainGo prompt template that instructs the LLM to use the retrieved documents as context for your query. LangChainGo populates these documents into the {{.context}} input variable and your query into the {{.question}} variable.
Constructs a chain that uses OpenAI's chat model to generate context-aware responses based on the provided prompt template.
Sends a sample query about painting for beginners to the chain, using the prompt and the retriever to gather relevant context.
Returns and prints the LLM's response and the documents used as context.

// Implements RAG to answer questions on your data
optionsVector := []vectorstores.Option{
	vectorstores.WithScoreThreshold(0.60),
}
retriever := vectorstores.ToRetriever(&store, 1, optionsVector...)
// Loads OpenAI API key from environment
openaiApiKey := os.Getenv("OPENAI_API_KEY")
if openaiApiKey == "" {
	log.Fatal("Set your OPENAI_API_KEY environment variable in the .env file")
}
// Creates an OpenAI LLM client
llm, err := openai.New(openai.WithToken(openaiApiKey), openai.WithModel("gpt-4o"))
if err != nil {
	log.Fatalf("Failed to create an LLM client: %v", err)
}
prompt := prompts.NewPromptTemplate(
	`Answer the question based on the following context:
	{{.context}}
	Question: {{.question}}`,
	[]string{"context", "question"},
)
llmChain := chains.NewLLMChain(llm, prompt)
ctx := context.Background()
const question = "How do I get started painting?"
documents, err := retriever.GetRelevantDocuments(ctx, question)
if err != nil {
	log.Fatalf("Failed to retrieve documents: %v", err)
}
var contextBuilder strings.Builder
for i, document := range documents {
	contextBuilder.WriteString(fmt.Sprintf("Document %d: %s\n", i+1, document.PageContent))
}
contextStr := contextBuilder.String()
inputs := map[string]interface{}{
	"context":  contextStr,
	"question": question,
}
out, err := chains.Call(ctx, llmChain, inputs)
if err != nil {
	log.Fatalf("Failed to run LLM chain: %v", err)
}
log.Println("Source documents:")
for i, doc := range documents {
	log.Printf("Document %d: %s\n", i+1, doc.PageContent)
}
responseText, ok := out["text"].(string)
if !ok {
	log.Println("Unexpected response type")
	return
}
log.Println("Question:", question)
log.Println("Generated Answer:", responseText)

Run the following command to execute your file.

After you save the file, run the following command. The generated response might vary.

go run main.go

Source documents:
Document 1: "Successful oil painting necessitates patience,
proper equipment, and technique. Begin with a carefully
prepared, primed canvas. Sketch your composition lightly before
applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over
lean,' meaning each subsequent layer should contain more oil to
prevent cracking. Allow each layer to dry before applying
another. Clean your brushes often and avoid solvents that might
damage them. Finally, always work in a well-ventilated space."
Question: How do I get started painting?
Generated Answer: To get started painting, you should begin with a
carefully prepared, primed canvas. Sketch your composition lightly
before applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over lean,'
meaning each subsequent layer should contain more oil to prevent
cracking. Allow each layer to dry before applying another. Clean
your brushes often and avoid solvents that might damage them.
Finally, always work in a well-ventilated space.
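
The {{.context}} and {{.question}} placeholders in the prompt use Go template syntax. To see how the final prompt text is assembled before it reaches the LLM, the following sketch renders the same template text with the standard library's text/template package. It illustrates the substitution only and does not call LangChainGo. The renderPrompt helper is hypothetical.

```go
package main

import (
	"fmt"
	"log"
	"strings"
	"text/template"
)

// promptText matches the template text used in the RAG example,
// without the leading indentation.
const promptText = `Answer the question based on the following context:
{{.context}}
Question: {{.question}}`

// renderPrompt fills the template with the retrieved context and the question.
func renderPrompt(contextStr, question string) (string, error) {
	tmpl, err := template.New("prompt").Parse(promptText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	data := map[string]any{"context": contextStr, "question": question}
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderPrompt("Document 1: Begin with a primed canvas.", "How do I get started painting?")
	if err != nil {
		log.Fatalf("failed to render prompt: %v", err)
	}
	fmt.Println(out)
}
```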

Prerequisites

To complete this variant of the tutorial, which uses an OpenAI embedding model instead of Voyage AI, you must have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created by using the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
A terminal and code editor to run your Go project.
Go installed on your machine.

Set Up the Environment

You must first set up the environment for this tutorial. Complete the following steps.

Initialize your Go project.

Run the following commands in your terminal to create a new directory named langchaingo-mongodb and initialize your project:

mkdir langchaingo-mongodb
cd langchaingo-mongodb
go mod init langchaingo-mongodb

Install dependencies.

Run the following commands:

go get github.com/joho/godotenv
go get github.com/tmc/langchaingo/chains
go get github.com/tmc/langchaingo/llms
go get github.com/tmc/langchaingo/prompts
go get github.com/tmc/langchaingo/vectorstores/mongovector
go get go.mongodb.org/mongo-driver/v2/mongo
go mod tidy

Initialize your environment variables.

In your langchaingo-mongodb project directory, create a .env file and add the following lines:

OPENAI_API_KEY="<api-key>"
MONGODB_URI="<connection-string>"

Replace the placeholder values with your OpenAI API key and the SRV connection string for your MongoDB cluster. Your connection string should use the following format:

mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/<dbname>
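
Before connecting, it can help to confirm that every required variable is actually set, so that a missing value fails early with a clear message. The following is a small standard-library sketch of such a check. The missingEnv helper is hypothetical, and it accepts the lookup function as a parameter so that it can be exercised without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// missingEnv returns the keys that are unset or empty according to lookup.
func missingEnv(lookup func(string) (string, bool), keys ...string) []string {
	var missing []string
	for _, k := range keys {
		if v, ok := lookup(k); !ok || v == "" {
			missing = append(missing, k)
		}
	}
	return missing
}

func main() {
	// os.LookupEnv reads the process environment, which godotenv populates
	// from the .env file in the tutorial code.
	if missing := missingEnv(os.LookupEnv, "OPENAI_API_KEY", "MONGODB_URI"); len(missing) > 0 {
		fmt.Println("Missing environment variables:", missing)
		return
	}
	fmt.Println("All required environment variables are set.")
}
```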

Create your main file.

In your langchaingo-mongodb project directory, create a file named main.go. You add code to this file throughout the tutorial.

Use MongoDB as a Vector Store

In this section, you write code in your main() function that loads custom data into MongoDB and instantiates MongoDB as a vector database, also called a vector store.

Import the following dependencies.

Add the following imports to the top of your main.go file.

 package main
 import (
   "context"
   "log"
   "os"
   "github.com/joho/godotenv"
   "github.com/tmc/langchaingo/embeddings"
   "github.com/tmc/langchaingo/llms/openai"
   "github.com/tmc/langchaingo/schema"
   "github.com/tmc/langchaingo/vectorstores/mongovector"
   "go.mongodb.org/mongo-driver/v2/mongo"
   "go.mongodb.org/mongo-driver/v2/mongo/options"
)

Define the vector store details.

The following code performs these actions:

Configures Atlas as a vector store by specifying the following:
- langchaingo_db.test as the collection in Atlas that stores the documents.
- vector_index as the index to use for querying the vector store.
- text as the name of the field that contains the raw text content.
- embeddings as the name of the field that contains the vector embeddings.
Prepares your custom data by doing the following:
- Defines the text for each document.
- Uses the mongovector package from LangChainGo to generate embeddings for the texts. This package stores document embeddings in MongoDB and enables searches on the stored embeddings.
- Constructs documents that include text, embeddings, and metadata.
Ingests the constructed documents into Atlas and instantiates the vector store.

Paste the following code into your main.go file:

// Defines the document structure
type Document struct {
	PageContent string            `bson:"text"`
	Embedding   []float32         `bson:"embedding"`
	Metadata    map[string]string `bson:"metadata"`
}
func main() {
	const (
		openAIEmbeddingModel = "text-embedding-3-small"
		openAIEmbeddingDim   = 1536
		similarityAlgorithm  = "dotProduct"
		indexName            = "vector_index"
		databaseName         = "langchaingo_db"
		collectionName       = "test"
	)
	if err := godotenv.Load(); err != nil {
		log.Fatal("No .env file found")
	}
	// Loads the MongoDB URI from environment
	uri := os.Getenv("MONGODB_URI")
	if uri == "" {
		log.Fatal("Set your 'MONGODB_URI' environment variable in the .env file")
	}
	// Loads the API key from environment
	apiKey := os.Getenv("OPENAI_API_KEY")
	if apiKey == "" {
		log.Fatal("Set your OPENAI_API_KEY environment variable in the .env file")
	}
	// Connects to MongoDB
	client, err := mongo.Connect(options.Client().ApplyURI(uri))
	if err != nil {
		log.Fatalf("Failed to connect to server: %v", err)
	}
	defer func() {
		if err := client.Disconnect(context.Background()); err != nil {
			log.Fatalf("Error disconnecting the client: %v", err)
		}
	}()
	log.Println("Connected to MongoDB.")
	// Selects the database and collection
	coll := client.Database(databaseName).Collection(collectionName)
	// Creates an OpenAI LLM embedder client
	llm, err := openai.New(openai.WithEmbeddingModel(openAIEmbeddingModel))
	if err != nil {
		log.Fatalf("Failed to create an embedder client: %v", err)
	}
	// Creates an embedder from the embedder client
	embedder, err := embeddings.NewEmbedder(llm)
	if err != nil {
		log.Fatalf("Failed to create an embedder: %v", err)
	}
	// Creates a new MongoDB vector store
	store := mongovector.New(coll, embedder, mongovector.WithIndex(indexName), mongovector.WithPath("embeddings"))
	// Checks if the collection is empty, and if empty, adds documents to the MongoDB database vector store
	if isCollectionEmpty(coll) {
		documents := []schema.Document{
			{
				PageContent: "Proper tuber planting involves site selection, proper timing, and exceptional care. Choose spots with well-drained soil and adequate sun exposure. Tubers are generally planted in spring, but depending on the plant, timing varies. Always plant with the eyes facing upward at a depth two to three times the tuber's height. Ensure 4 inch spacing between small tubers, expand to 12 inches for large ones. Adequate moisture is needed, yet do not overwater. Mulching can help preserve moisture and prevent weed growth.",
				Metadata: map[string]any{
					"author": "A",
					"type":   "post",
				},
			},
			{
				PageContent: "Successful oil painting necessitates patience, proper equipment, and technique. Begin with a carefully prepared, primed canvas. Sketch your composition lightly before applying paint. Use high-quality brushes and oils to create vibrant, long-lasting artworks. Remember to paint 'fat over lean,' meaning each subsequent layer should contain more oil to prevent cracking. Allow each layer to dry before applying another. Clean your brushes often and avoid solvents that might damage them. Finally, always work in a well-ventilated space.",
				Metadata: map[string]any{
					"author": "B",
					"type":   "post",
				},
			},
			{
				PageContent: "For a natural lawn, selection of the right grass type suitable for your climate is crucial. Balanced watering, generally 1 to 1.5 inches per week, is important; overwatering invites disease. Opt for organic fertilizers over synthetic versions to provide necessary nutrients and improve soil structure. Regular lawn aeration helps root growth and prevents soil compaction. Practice natural pest control and consider overseeding to maintain a dense sward, which naturally combats weeds and pest.",
				Metadata: map[string]any{
					"author": "C",
					"type":   "post",
				},
			},
		}
		_, err := store.AddDocuments(context.Background(), documents)
		if err != nil {
			log.Fatalf("Error adding documents: %v", err)
		}
		log.Printf("Successfully added %d documents to the collection.\n", len(documents))
	} else {
		log.Println("Documents already exist in the collection, skipping document addition.")
	}
}
func isCollectionEmpty(coll *mongo.Collection) bool {
	count, err := coll.EstimatedDocumentCount(context.Background())
	if err != nil {
		log.Fatalf("Failed to count documents in the collection: %v", err)
	}
	return count == 0
}

Run your Go project.

Save the file, then run the following command to load your data into MongoDB.

go run main.go

Connected to MongoDB.
Successfully added 3 documents to the collection.

Tip

After running main.go, if you are using Atlas, you can verify your vector embeddings by navigating to the langchaingo_db.test namespace in the Atlas UI.
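
The number of dimensions in each stored vector must match the numDimensions value of the vector search index, which is 1536 in this variant of the tutorial. The following sketch shows a simple length check that you could run on a vector before relying on it. The validateDim helper is hypothetical, and the vectors here are zero-filled placeholders rather than real embeddings.

```go
package main

import "fmt"

// validateDim returns an error when the vector length differs from the
// number of dimensions that the index expects.
func validateDim(vec []float32, want int) error {
	if len(vec) != want {
		return fmt.Errorf("embedding has %d dimensions, index expects %d", len(vec), want)
	}
	return nil
}

func main() {
	const openAIEmbeddingDim = 1536 // matches the constant in the tutorial code

	// Placeholder vectors; a real embedding comes from the embedder.
	fmt.Println(validateDim(make([]float32, 1536), openAIEmbeddingDim))
	fmt.Println(validateDim(make([]float32, 1024), openAIEmbeddingDim))
}
```

A mismatch usually means that the embedding model was changed after the index was defined, in which case the index definition needs to be updated to match.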

Create the MongoDB Vector Search Index

To enable vector search queries on your vector store, create a MongoDB Vector Search index on the langchaingo_db.test collection.

Add the following imports to the top of your main.go file:

import (
  // Other imports...
  "fmt"
  "time"
  "go.mongodb.org/mongo-driver/v2/bson"
)

Define the following functions in your main.go file outside of your main() function. These functions create and manage a vector search index for your MongoDB collection:

The SearchIndexExists function checks whether a search index with the specified name exists and is queryable.
The CreateVectorSearchIndex function creates a vector search index on the specified collection. This function blocks until the index is created and queryable.

// Checks if the search index exists
func SearchIndexExists(ctx context.Context, coll *mongo.Collection, idx string) (bool, error) {
	log.Println("Checking if search index exists.")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idx).SetType("vectorSearch")
	cursor, err := view.List(ctx, siOpts)
	if err != nil {
		return false, fmt.Errorf("failed to list search indexes: %w", err)
	}
	// Release the cursor's server-side resources when the function returns
	defer cursor.Close(ctx)
	for cursor.Next(ctx) {
		index := struct {
			Name      string `bson:"name"`
			Queryable bool   `bson:"queryable"`
		}{}
		if err := cursor.Decode(&index); err != nil {
			return false, fmt.Errorf("failed to decode search index: %w", err)
		}
		if index.Name == idx && index.Queryable {
			return true, nil
		}
	}
	if err := cursor.Err(); err != nil {
		return false, fmt.Errorf("cursor error: %w", err)
	}
	return false, nil
}

// Creates a vector search index. This function blocks until the index has been
// created.
func CreateVectorSearchIndex(
	ctx context.Context,
	coll *mongo.Collection,
	idxName string,
	openAIEmbeddingDim int,
	similarityAlgorithm string,
) (string, error) {
	type vectorField struct {
		Type          string `bson:"type,omitempty"`
		Path          string `bson:"path,omitempty"`
		NumDimensions int    `bson:"numDimensions,omitempty"`
		Similarity    string `bson:"similarity,omitempty"`
	}
	fields := []vectorField{
		{
			Type:          "vector",
			Path:          "embeddings",
			NumDimensions: openAIEmbeddingDim,
			Similarity:    similarityAlgorithm,
		},
		{
			Type: "filter",
			Path: "metadata.author",
		},
		{
			Type: "filter",
			Path: "metadata.type",
		},
	}
	def := struct {
		Fields []vectorField `bson:"fields"`
	}{
		Fields: fields,
	}
	log.Println("Creating vector search index...")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idxName).SetType("vectorSearch")
	searchName, err := view.CreateOne(ctx, mongo.SearchIndexModel{Definition: def, Options: siOpts})
	if err != nil {
		return "", fmt.Errorf("failed to create the search index: %w", err)
	}
	// Awaits the creation of the index
	var doc bson.Raw
	for doc == nil {
		cursor, err := view.List(ctx, options.SearchIndexes().SetName(searchName))
		if err != nil {
			return "", fmt.Errorf("failed to list search indexes: %w", err)
		}
		if !cursor.Next(ctx) {
			break
		}
		name := cursor.Current.Lookup("name").StringValue()
		queryable := cursor.Current.Lookup("queryable").Boolean()
		if name == searchName && queryable {
			doc = cursor.Current
		} else {
			time.Sleep(5 * time.Second)
		}
	}
	return searchName, nil
}

Create the vector store collection and index by calling the preceding functions in your main() function. Add the following code to the end of your main() function:

// SearchIndexExists will return true if the provided index is defined for the
// collection. This operation blocks until the search completes.
if ok, _ := SearchIndexExists(context.Background(), coll, indexName); !ok {
	// Creates the vector store collection
	err = client.Database(databaseName).CreateCollection(context.Background(), collectionName)
	if err != nil {
		log.Fatalf("failed to create vector store collection: %v", err)
	}
	_, err = CreateVectorSearchIndex(context.Background(), coll, indexName, openAIEmbeddingDim, similarityAlgorithm)
	if err != nil {
		log.Fatalf("failed to create index: %v", err)
	}
	log.Println("Successfully created vector search index.")
} else {
	log.Println("Vector search index already exists.")
}

Save the file and run the following command to create your MongoDB Vector Search index.

go run main.go

Checking if search index exists.
Creating vector search index...
Successfully created vector search index.

Tip

After running main.go, you can view your vector search index in the Atlas UI by navigating to the langchaingo_db.test collection in your cluster.

Run Vector Search queries

This section demonstrates various queries that you can run on your vectorized data. Now that you've created the index, you can run vector search queries.

The examples below cover two cases: Basic Semantic Search and Semantic Search with Filtering. Both declare docs and err with :=, so add only one of them to your main function at a time.

For a basic semantic search, add the following code to your main function and save the file.

// Performs basic semantic search
docs, err := store.SimilaritySearch(context.Background(), "Prevent weeds", 1)
if err != nil {
   fmt.Println("Error performing search:", err)
}
fmt.Println("Semantic Search Results:", docs)

Run the following command to execute the query.

go run main.go

Semantic Search Results: [{For a natural lawn, selection of
the right grass type suitable for your climate is crucial.
Balanced watering, generally 1 to 1.5 inches per week, is
important; overwatering invites disease. Opt for organic
fertilizers over synthetic versions to provide necessary
nutrients and improve soil structure. Regular lawn aeration
helps root growth and prevents soil compaction. Practice
natural pest control and consider overseeding to maintain a
dense sward, which naturally combats weeds and pest.
map[author:C type:post] 0.69752026}]

Add the following dependency.

For a semantic search with filtering, add the following import to your main.go file:

import (
  // Other imports...
  "github.com/tmc/langchaingo/vectorstores"
)

Add the following code to your main function and save the file.

The following code uses the SimilaritySearch() method to run a semantic search for the string "Tulip care". It specifies the following parameters:

The number of documents to return as 1.
A score threshold of 0.60.

The query returns the document that matches the filter metadata.type: post and meets the score threshold.

// Performs semantic search with metadata filter
filter := map[string]interface{}{
   "metadata.type": "post",
}
docs, err := store.SimilaritySearch(context.Background(), "Tulip care", 1,
   vectorstores.WithScoreThreshold(0.60),
   vectorstores.WithFilters(filter))
if err != nil {
   fmt.Println("Error performing search:", err)
}
fmt.Println("Filter Search Results:", docs)

Run the following command to execute the query.

go run main.go

Filter Search Results: [{Proper tuber planting involves site
selection, proper timing, and exceptional care. Choose spots
with well-drained soil and adequate sun exposure. Tubers are
generally planted in spring, but depending on the plant,
timing varies. Always plant with the eyes facing upward at a
depth two to three times the tuber's height. Ensure 4 inch
spacing between small tubers, expand to 12 inches for large
ones. Adequate moisture is needed, yet do not overwater.
Mulching can help preserve moisture and prevent weed growth.
map[author:A type:post] 0.64432365}]
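
To illustrate how the three parameters interact (metadata filter, score threshold, and number of documents), here is a small in-memory sketch. It is a conceptual model only, not how MongoDB or LangChainGo implement the query: the scoredDoc type, the selectDocs helper, and the scores are all hypothetical, the metadata keys are plain "type" rather than the indexed "metadata.type" path, and treating the threshold as inclusive is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// scoredDoc is a hypothetical stand-in for a search result with its score.
type scoredDoc struct {
	Content  string
	Metadata map[string]any
	Score    float32
}

// selectDocs keeps documents whose metadata equals every filter entry and
// whose score is at or above threshold, then returns the k best by score.
// Filter values must be comparable types such as strings or numbers.
func selectDocs(docs []scoredDoc, filter map[string]any, threshold float32, k int) []scoredDoc {
	var kept []scoredDoc
	for _, d := range docs {
		match := true
		for key, want := range filter {
			if d.Metadata[key] != want {
				match = false
				break
			}
		}
		if !match || d.Score < threshold {
			continue
		}
		kept = append(kept, d)
	}
	// Highest score first; stable so ties keep their input order.
	sort.SliceStable(kept, func(i, j int) bool { return kept[i].Score > kept[j].Score })
	if len(kept) > k {
		kept = kept[:k]
	}
	return kept
}

func main() {
	// Made-up documents and scores, for illustration only.
	docs := []scoredDoc{
		{Content: "post about tubers", Metadata: map[string]any{"type": "post"}, Score: 0.64},
		{Content: "article about bulbs", Metadata: map[string]any{"type": "article"}, Score: 0.90},
		{Content: "post about lawns", Metadata: map[string]any{"type": "post"}, Score: 0.55},
	}
	filter := map[string]any{"type": "post"}
	for _, d := range selectDocs(docs, filter, 0.60, 1) {
		fmt.Println(d.Content, d.Score)
	}
}
```

In this model the highest-scoring document is excluded because it fails the filter, and the lowest-scoring post is excluded by the threshold, which leaves a single candidate for the top-1 result.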

Answer questions on your data

Import the following dependencies.

Add the following imports to the top of your main.go file.

import (
  // Other imports...
  "strings"
  "github.com/tmc/langchaingo/chains"
  "github.com/tmc/langchaingo/prompts"
  "github.com/tmc/langchaingo/vectorstores"
)

Add the following code to the end of your main function and save the file.

This code does the following:

Instantiates MongoDB Vector Search as a retriever to query for semantically similar documents.
Defines a LangChainGo prompt template that instructs the LLM to use the retrieved documents as context for your query. LangChainGo populates these documents into the {{.context}} input variable and your query into the {{.question}} variable.
Constructs a chain that uses OpenAI's chat model to generate context-aware responses based on the provided prompt template.
Sends a sample query about painting for beginners to the chain, using the prompt and the retriever to gather relevant context.
Returns and prints the LLM's response and the documents used as context.

// Implements RAG to answer questions on your data
optionsVector := []vectorstores.Option{
	vectorstores.WithScoreThreshold(0.60),
}
retriever := vectorstores.ToRetriever(&store, 1, optionsVector...)
prompt := prompts.NewPromptTemplate(
	`Answer the question based on the following context:
	{{.context}}
	Question: {{.question}}`,
	[]string{"context", "question"},
)
llmChain := chains.NewLLMChain(llm, prompt)
ctx := context.Background()
const question = "How do I get started painting?"
documents, err := retriever.GetRelevantDocuments(ctx, question)
if err != nil {
	log.Fatalf("Failed to retrieve documents: %v", err)
}
var contextBuilder strings.Builder
for i, document := range documents {
	contextBuilder.WriteString(fmt.Sprintf("Document %d: %s\n", i+1, document.PageContent))
}
contextStr := contextBuilder.String()
inputs := map[string]interface{}{
	"context":  contextStr,
	"question": question,
}
out, err := chains.Call(ctx, llmChain, inputs)
if err != nil {
	log.Fatalf("Failed to run LLM chain: %v", err)
}
log.Println("Source documents:")
for i, doc := range documents {
	log.Printf("Document %d: %s\n", i+1, doc.PageContent)
}
responseText, ok := out["text"].(string)
if !ok {
	log.Println("Unexpected response type")
	return
}
log.Println("Question:", question)
log.Println("Generated Answer:", responseText)

Run the following command to execute your file.

After you save the file, run the following command. The generated response might vary.

go run main.go

Source documents:
Document 1: "Successful oil painting necessitates patience,
proper equipment, and technique. Begin with a carefully
prepared, primed canvas. Sketch your composition lightly before
applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over
lean,' meaning each subsequent layer should contain more oil to
prevent cracking. Allow each layer to dry before applying
another. Clean your brushes often and avoid solvents that might
damage them. Finally, always work in a well-ventilated space."
Question: How do I get started painting?
Generated Answer: To get started painting, you should begin with a
carefully prepared, primed canvas. Sketch your composition lightly
before applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over lean,'
meaning each subsequent layer should contain more oil to prevent
cracking. Allow each layer to dry before applying another. Clean
your brushes often and avoid solvents that might damage them.
Finally, always work in a well-ventilated space.
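
To show what the {{.context}} and {{.question}} placeholders turn into before the prompt reaches the LLM, the following sketch renders the same template text with the standard library's text/template package. This mimics the substitution only; it is not LangChainGo's own renderer, and renderPrompt is a hypothetical helper introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// promptText matches the wording of the tutorial's prompt template.
const promptText = `Answer the question based on the following context:
{{.context}}
Question: {{.question}}`

// renderPrompt fills the two placeholders and returns the final prompt.
func renderPrompt(contextStr, question string) (string, error) {
	tmpl, err := template.New("rag").Parse(promptText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	inputs := map[string]any{
		"context":  contextStr,
		"question": question,
	}
	if err := tmpl.Execute(&sb, inputs); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// The context string follows the "Document N: ..." format that the
	// tutorial code builds from the retrieved documents.
	contextStr := "Document 1: Begin with a carefully prepared, primed canvas.\n"
	prompt, err := renderPrompt(contextStr, "How do I get started painting?")
	if err != nil {
		fmt.Println("failed to render prompt:", err)
		return
	}
	fmt.Println(prompt)
}
```

Printing the rendered prompt like this can help when a generated answer looks off, because it shows exactly which retrieved text the model was given.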

After completing this tutorial, you have successfully integrated MongoDB Vector Search with LangChainGo to build a RAG application. You have accomplished the following:

Started and configured the environment needed to support your application
Stored custom data in MongoDB and instantiated MongoDB as a vector store
Created a MongoDB Vector Search index on your data, enabling semantic search capabilities
Used vector embeddings to retrieve semantically relevant data
Enhanced search results by incorporating metadata filters
Implemented a RAG workflow using MongoDB Vector Search to provide meaningful answers to questions based on your data

Next steps

To learn more about getting started with MongoDB Vector Search, see the MongoDB Vector Search Quick Start and select Go from the drop-down menu.
To learn more about vector embeddings, see How to Create Vector Embeddings and select Go from the drop-down menu.
To learn how to integrate LangChainGo and Hugging Face, see Retrieval-Augmented Generation (RAG) with MongoDB.
To learn how to implement RAG without the need for API keys or credits, see Build a Local RAG Implementation with MongoDB Vector Search.

MongoDB also provides the following developer resource:

AI Learning Hub

Tip

To learn more about integrating LangChainGo, OpenAI, and MongoDB, see Using MongoDB Atlas as a Vector Store with OpenAI Embeddings.
