Get Started with the LangChainGo Integration

You can integrate MongoDB Vector Search with LangChainGo to build large language model (LLM) applications and implement retrieval-augmented generation (RAG). This tutorial demonstrates how to start using MongoDB Vector Search with LangChainGo to perform semantic search on your data and build a RAG implementation. Specifically, you perform the following actions:

Set up the environment.
Store custom data in MongoDB.
Create a MongoDB Vector Search index on your data.
Run the following vector search queries:
- Semantic search.
- Semantic search with metadata pre-filtering.
Implement RAG by using MongoDB Vector Search to answer questions on your data.

Background

LangChainGo is the Go programming language implementation of LangChain. It is a community-driven, third-party port of the LangChain framework.

LangChain is an open-source framework that simplifies the creation of LLM applications through the use of chains. Chains are LangChain-specific components that can be combined for a variety of AI use cases, including RAG.

By integrating MongoDB Vector Search with LangChain, you can use MongoDB as a vector database and use MongoDB Vector Search to implement RAG by retrieving semantically similar documents from your data. To learn more about RAG, see Retrieval-Augmented Generation (RAG) with MongoDB.

LangChainGo facilitates the orchestration of LLMs for AI applications, bringing the capabilities of LangChain to the Go ecosystem. It also lets developers connect to their preferred vector-store-compatible databases, including MongoDB.

Procedure

Prerequisites

To complete this tutorial, you must have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created using the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
A Voyage AI API key. To create an API key, see Model API Keys.
A terminal and code editor to run your Go project.
Go installed on your machine.

Set Up the Environment

First, set up the environment for this tutorial by completing the following steps.

Initialize your Go project.

Run the following commands in your terminal to create a new directory named langchaingo-mongodb and initialize your project:

mkdir langchaingo-mongodb
cd langchaingo-mongodb
go mod init langchaingo-mongodb

Install dependencies.

Run the following commands:

go get github.com/joho/godotenv
go get github.com/tmc/langchaingo/chains
go get github.com/tmc/langchaingo/llms
go get github.com/tmc/langchaingo/prompts
go get github.com/tmc/langchaingo/vectorstores/mongovector
go get github.com/tmc/langchaingo/embeddings/voyageai
go get go.mongodb.org/mongo-driver/v2/mongo
go mod tidy

Initialize your environment variables.

In your langchaingo-mongodb project directory, create a .env file and add the following lines:

OPENAI_API_KEY="<openai-api-key>"
VOYAGEAI_API_KEY="<voyage-api-key>"
MONGODB_URI="<connection-string>"

Replace the placeholder values with your OpenAI API key, your Voyage AI API key, and the SRV connection string for your MongoDB cluster. Your connection string should use the following format:

mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/<dbname>
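A quick sanity check on the connection string can catch leftover placeholders before the driver tries to connect. The following is a minimal sketch using only the Go standard library. The helper name checkSRVURI is hypothetical and not part of the tutorial or the MongoDB driver, and the check applies only to SRV strings (a local deployment typically uses the mongodb:// scheme instead).

```go
package main

import (
	"fmt"
	"net/url"
)

// checkSRVURI is a hypothetical helper that performs a basic sanity check
// on an SRV connection string. It does not contact the server.
func checkSRVURI(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("cannot parse connection string: %w", err)
	}
	if u.Scheme != "mongodb+srv" {
		return fmt.Errorf("expected scheme mongodb+srv, got %q", u.Scheme)
	}
	if u.Host == "" {
		return fmt.Errorf("connection string has no host")
	}
	if u.User == nil || u.User.Username() == "" {
		return fmt.Errorf("connection string has no username")
	}
	return nil
}

func main() {
	// Example values only; substitute your own credentials and cluster name.
	uri := "mongodb+srv://user:pass@cluster0.example.mongodb.net/langchaingo_db"
	if err := checkSRVURI(uri); err != nil {
		fmt.Println("invalid:", err)
		return
	}
	fmt.Println("connection string looks well formed")
}
```

If your password contains special characters, percent-encode them before placing them in the connection string.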

Create your main file.

In your langchaingo-mongodb project directory, create a file named main.go. You will add code to this file throughout this tutorial.

Use MongoDB as a Vector Store

In this section, you write code in your main() function that loads custom data into MongoDB and instantiates MongoDB as a vector database, also called a vector store.

Import the following dependencies.

Add the following imports to the top of your main.go file.

package main

import (
	"context"
	"log"
	"os"

	"github.com/joho/godotenv"
	"github.com/tmc/langchaingo/embeddings/voyageai"
	"github.com/tmc/langchaingo/schema"
	"github.com/tmc/langchaingo/vectorstores/mongovector"
	"go.mongodb.org/mongo-driver/v2/mongo"
	"go.mongodb.org/mongo-driver/v2/mongo/options"
)

Define the vector store details.

The following code performs these actions:

Configures Atlas as a vector store by specifying the following:
- langchaingo_db.test as the collection in Atlas to store the documents.
- vector_index as the index to use for querying the vector store.
- text as the name of the field containing the raw text content.
- embeddings as the name of the field containing the vector embeddings.
Prepares your custom data by doing the following:
- Defines text for each document.
- Uses LangChainGo's mongovector package to generate embeddings for the texts. This package stores document embeddings in MongoDB and enables searches on stored embeddings.
- Constructs documents that include text, embeddings, and metadata.
Ingests the constructed documents into Atlas and instantiates the vector store.

Paste the following code into your main.go file:

// Defines the document structure
type Document struct {
	PageContent string            `bson:"text"`
	Embedding   []float32         `bson:"embedding"`
	Metadata    map[string]string `bson:"metadata"`
}
func main() {
	const (
		voyageAIEmbeddingDim = 1024
		similarityAlgorithm  = "dotProduct"
		indexName            = "vector_index"
		databaseName         = "langchaingo_db"
		collectionName       = "test"
	)
	if err := godotenv.Load(); err != nil {
		log.Fatal("No .env file found")
	}
	// Loads the MongoDB URI from environment
	uri := os.Getenv("MONGODB_URI")
	if uri == "" {
		log.Fatal("Set your 'MONGODB_URI' environment variable in the .env file")
	}
	// Loads the API key from environment
	voyageApiKey := os.Getenv("VOYAGEAI_API_KEY")
	if voyageApiKey == "" {
		log.Fatal("Set your VOYAGEAI_API_KEY environment variable in the .env file")
	}
	// Connects to MongoDB cluster
	client, err := mongo.Connect(options.Client().ApplyURI(uri))
	if err != nil {
		log.Fatalf("Failed to connect to server: %v", err)
	}
	defer func() {
		if err := client.Disconnect(context.Background()); err != nil {
			log.Fatalf("Error disconnecting the client: %v", err)
		}
	}()
	log.Println("Connected to MongoDB.")
	// Selects the database and collection
	coll := client.Database(databaseName).Collection(collectionName)
	// Creates an embedder client
	embedder, err := voyageai.NewVoyageAI(
		voyageai.WithModel("voyage-3-large"),
	)
	if err != nil {
		log.Fatalf("Failed to create an embedder: %v", err)
	}
	// Creates a new MongoDB vector store
	store := mongovector.New(coll, embedder, mongovector.WithIndex(indexName), mongovector.WithPath("embeddings"))
	// Checks if the collection is empty, and if empty, adds documents to the MongoDB vector store
	if isCollectionEmpty(coll) {
		documents := []schema.Document{
			{
				PageContent: "Proper tuber planting involves site selection, proper timing, and exceptional care. Choose spots with well-drained soil and adequate sun exposure. Tubers are generally planted in spring, but depending on the plant, timing varies. Always plant with the eyes facing upward at a depth two to three times the tuber's height. Ensure 4 inch spacing between small tubers, expand to 12 inches for large ones. Adequate moisture is needed, yet do not overwater. Mulching can help preserve moisture and prevent weed growth.",
				Metadata: map[string]any{
					"author": "A",
					"type":   "post",
				},
			},
			{
				PageContent: "Successful oil painting necessitates patience, proper equipment, and technique. Begin with a carefully prepared, primed canvas. Sketch your composition lightly before applying paint. Use high-quality brushes and oils to create vibrant, long-lasting artworks. Remember to paint 'fat over lean,' meaning each subsequent layer should contain more oil to prevent cracking. Allow each layer to dry before applying another. Clean your brushes often and avoid solvents that might damage them. Finally, always work in a well-ventilated space.",
				Metadata: map[string]any{
					"author": "B",
					"type":   "post",
				},
			},
			{
				PageContent: "For a natural lawn, selection of the right grass type suitable for your climate is crucial. Balanced watering, generally 1 to 1.5 inches per week, is important; overwatering invites disease. Opt for organic fertilizers over synthetic versions to provide necessary nutrients and improve soil structure. Regular lawn aeration helps root growth and prevents soil compaction. Practice natural pest control and consider overseeding to maintain a dense sward, which naturally combats weeds and pest.",
				Metadata: map[string]any{
					"author": "C",
					"type":   "post",
				},
			},
		}
		_, err := store.AddDocuments(context.Background(), documents)
		if err != nil {
			log.Fatalf("Error adding documents: %v", err)
		}
		log.Printf("Successfully added %d documents to the collection.\n", len(documents))
	} else {
		log.Println("Documents already exist in the collection, skipping document addition.")
	}
}
func isCollectionEmpty(coll *mongo.Collection) bool {
	count, err := coll.EstimatedDocumentCount(context.Background())
	if err != nil {
		log.Fatalf("Failed to count documents in the collection: %v", err)
	}
	return count == 0
}

Run your Go project.

Save the file, then run the following command to load your data into MongoDB.

go run main.go

Connected to MongoDB.
Successfully added 3 documents to the collection.

Tip

After running main.go, if you're using Atlas, you can verify your vector embeddings by navigating to the langchaingo_db.test namespace in the Atlas UI.

Create the MongoDB Vector Search Index

To enable vector search queries on your vector store, create a MongoDB Vector Search index on the langchaingo_db.test collection.

Add the following imports to the top of your main.go file:

import (
  // Other imports...
  "fmt"
  "time"
  "go.mongodb.org/mongo-driver/v2/bson"
)

Define the following functions in your main.go file, outside of your main() function. These functions create and manage a vector search index for your MongoDB collection:

The SearchIndexExists function checks whether a search index with the specified name exists and is queryable.
The CreateVectorSearchIndex function creates a vector search index on the specified collection. This function blocks until the index is created and queryable.

// Checks if the search index exists
func SearchIndexExists(ctx context.Context, coll *mongo.Collection, idx string) (bool, error) {
	log.Println("Checking if search index exists.")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idx).SetType("vectorSearch")
	cursor, err := view.List(ctx, siOpts)
	if err != nil {
		return false, fmt.Errorf("failed to list search indexes: %w", err)
	}
	for cursor.Next(ctx) {
		index := struct {
			Name      string `bson:"name"`
			Queryable bool   `bson:"queryable"`
		}{}
		if err := cursor.Decode(&index); err != nil {
			return false, fmt.Errorf("failed to decode search index: %w", err)
		}
		if index.Name == idx && index.Queryable {
			return true, nil
		}
	}
	if err := cursor.Err(); err != nil {
		return false, fmt.Errorf("cursor error: %w", err)
	}
	return false, nil
}
// Creates a vector search index. This function blocks until the index has been
// created.
func CreateVectorSearchIndex(
	ctx context.Context,
	coll *mongo.Collection,
	idxName string,
	voyageAIEmbeddingDim int,
	similarityAlgorithm string,
) (string, error) {
	type vectorField struct {
		Type          string `bson:"type,omitempty"`
		Path          string `bson:"path,omitempty"`
		NumDimensions int    `bson:"numDimensions,omitempty"`
		Similarity    string `bson:"similarity,omitempty"`
	}
	fields := []vectorField{
		{
			Type:          "vector",
			Path:          "embeddings",
			NumDimensions: voyageAIEmbeddingDim,
			Similarity:    similarityAlgorithm,
		},
		{
			Type: "filter",
			Path: "metadata.author",
		},
		{
			Type: "filter",
			Path: "metadata.type",
		},
	}
	def := struct {
		Fields []vectorField `bson:"fields"`
	}{
		Fields: fields,
	}
	log.Println("Creating vector search index...")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idxName).SetType("vectorSearch")
	searchName, err := view.CreateOne(ctx, mongo.SearchIndexModel{Definition: def, Options: siOpts})
	if err != nil {
		return "", fmt.Errorf("failed to create the search index: %w", err)
	}
	// Awaits the creation of the index
	var doc bson.Raw
	for doc == nil {
		cursor, err := view.List(ctx, options.SearchIndexes().SetName(searchName))
		if err != nil {
			return "", fmt.Errorf("failed to list search indexes: %w", err)
		}
		if !cursor.Next(ctx) {
			break
		}
		name := cursor.Current.Lookup("name").StringValue()
		queryable := cursor.Current.Lookup("queryable").Boolean()
		if name == searchName && queryable {
			doc = cursor.Current
		} else {
			time.Sleep(5 * time.Second)
		}
	}
	return searchName, nil
}

Create the vector store collection and index by calling the preceding functions in your main() function. Add the following code to the end of your main() function:

// SearchIndexExists returns true if the provided index is defined for the
// collection. This operation blocks until the search completes.
ok, err := SearchIndexExists(context.Background(), coll, indexName)
if err != nil {
	log.Fatalf("failed to check for search index: %v", err)
}
if !ok {
	// Creates the vector store collection
	err = client.Database(databaseName).CreateCollection(context.Background(), collectionName)
	if err != nil {
		log.Fatalf("failed to create vector store collection: %v", err)
	}
	_, err = CreateVectorSearchIndex(context.Background(), coll, indexName, voyageAIEmbeddingDim, similarityAlgorithm)
	if err != nil {
		log.Fatalf("failed to create index: %v", err)
	}
	log.Println("Successfully created vector search index.")
} else {
	log.Println("Vector search index already exists.")
}

Save the file, then run the following command to create your MongoDB Vector Search index.

go run main.go

Checking if search index exists.
Creating vector search index...
Successfully created vector search index.

Tip

After running main.go, you can view your vector search index in the Atlas UI by navigating to the langchaingo_db.test collection in your cluster.

Run Vector Search Queries

This section demonstrates various queries that you can run on your vectorized data. Now that you've created the index, you can run vector search queries.

The examples in this section cover basic semantic search and semantic search with filtering.

Add the following code to your main function and save the file.

Semantic search retrieves information that is meaningfully related to a query. The following code uses the SimilaritySearch() method to perform a semantic search for the string "Prevent weeds" and limits the results to the first document.

// Performs basic semantic search
docs, err := store.SimilaritySearch(context.Background(), "Prevent weeds", 1)
if err != nil {
   fmt.Println("Error performing search:", err)
}
fmt.Println("Semantic Search Results:", docs)

Run the following command to execute the query.

go run main.go

Semantic Search Results: [{For a natural lawn, selection of
the right grass type suitable for your climate is crucial.
Balanced watering, generally 1 to 1.5 inches per week, is
important; overwatering invites disease. Opt for organic
fertilizers over synthetic versions to provide necessary
nutrients and improve soil structure. Regular lawn aeration
helps root growth and prevents soil compaction. Practice
natural pest control and consider overseeding to maintain a
dense sward, which naturally combats weeds and pest.
map[author:C type:post] 0.69752026}]
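Conceptually, a semantic search scores each stored embedding against the query embedding and returns the best matches. The following sketch shows that idea as a brute-force scan over toy two-dimensional vectors. It is an illustration only: the type scoredDoc and the function topK are hypothetical, the vectors are invented, and MongoDB Vector Search uses a search index rather than scanning every document.

```go
package main

import (
	"fmt"
	"sort"
)

// scoredDoc pairs a document's text with its similarity score.
type scoredDoc struct {
	Text  string
	Score float64
}

// topK ranks documents by dot product with the query vector and returns the
// k highest-scoring ones. It assumes every vector has the query's length.
func topK(query []float64, texts []string, vectors [][]float64, k int) []scoredDoc {
	results := make([]scoredDoc, 0, len(texts))
	for i, vec := range vectors {
		var score float64
		for j := range query {
			score += query[j] * vec[j]
		}
		results = append(results, scoredDoc{Text: texts[i], Score: score})
	}
	// Sort in descending order of score.
	sort.Slice(results, func(a, b int) bool { return results[a].Score > results[b].Score })
	if k < len(results) {
		results = results[:k]
	}
	return results
}

func main() {
	// Toy "embeddings"; real embeddings come from an embedding model.
	texts := []string{"tuber planting", "oil painting", "lawn care"}
	vectors := [][]float64{{0.8, 0.6}, {0.0, 1.0}, {1.0, 0.0}}
	query := []float64{1.0, 0.0}

	for _, r := range topK(query, texts, vectors, 1) {
		fmt.Printf("%s (score %.2f)\n", r.Text, r.Score)
	}
}
```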

You can pre-filter your data by using an MQL match expression that compares the indexed field with another value in your collection. You must index any metadata fields that you want to filter by as the filter type. To learn more, see How to Index Fields for Vector Search.

Add the following dependency.

Add the following import to your main.go file:

import (
  // Other imports...
  "github.com/tmc/langchaingo/vectorstores"
)

Add the following code to your main function and save the file.

The following code uses the SimilaritySearch() method to perform a semantic search for the string "Tulip care". It specifies the following parameters:

The number of documents to return as 1.
A score threshold of 0.60.

It returns the document that matches the filter metadata.type: post and meets the score threshold.

// Performs semantic search with metadata filter
filter := map[string]interface{}{
   "metadata.type": "post",
}
docs, err := store.SimilaritySearch(context.Background(), "Tulip care", 1,
   vectorstores.WithScoreThreshold(0.60),
   vectorstores.WithFilters(filter))
if err != nil {
   fmt.Println("Error performing search:", err)
}
fmt.Println("Filter Search Results:", docs)

Run the following command to execute the query.

go run main.go

Filter Search Results: [{Proper tuber planting involves site
selection, proper timing, and exceptional care. Choose spots
with well-drained soil and adequate sun exposure. Tubers are
generally planted in spring, but depending on the plant,
timing varies. Always plant with the eyes facing upward at a
depth two to three times the tuber's height. Ensure 4 inch
spacing between small tubers, expand to 12 inches for large
ones. Adequate moisture is needed, yet do not overwater.
Mulching can help preserve moisture and prevent weed growth.
map[author:A type:post] 0.64432365}]
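The combined effect of a metadata filter and a score threshold can be sketched in plain Go. The example below is an in-memory illustration under stated assumptions: the candidate type, the filterCandidates function, and all scores are invented for the example, and the filter keys are plain metadata keys rather than dotted paths such as metadata.type. In MongoDB Vector Search, the filter is applied by the server as part of the query rather than in application code.

```go
package main

import "fmt"

// candidate is a hypothetical search hit with metadata and a similarity score.
type candidate struct {
	Text     string
	Metadata map[string]string
	Score    float64
}

// filterCandidates keeps candidates whose metadata matches every key in
// filter and whose score is at least threshold.
func filterCandidates(cands []candidate, filter map[string]string, threshold float64) []candidate {
	var kept []candidate
	for _, c := range cands {
		if c.Score < threshold {
			continue
		}
		match := true
		for key, want := range filter {
			if c.Metadata[key] != want {
				match = false
				break
			}
		}
		if match {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	// Invented toy data for illustration.
	cands := []candidate{
		{"tuber planting", map[string]string{"type": "post"}, 0.64},
		{"release notes", map[string]string{"type": "changelog"}, 0.90},
		{"oil painting", map[string]string{"type": "post"}, 0.35},
	}
	for _, c := range filterCandidates(cands, map[string]string{"type": "post"}, 0.60) {
		fmt.Println(c.Text, c.Score)
	}
}
```

Only the first candidate survives: the second fails the metadata filter and the third falls below the threshold.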

Answer Questions on Your Data

This section demonstrates a RAG implementation using MongoDB Vector Search and LangChainGo. Now that you've used MongoDB Vector Search to retrieve semantically similar documents, use the following code example to prompt the LLM to answer questions about the documents returned by MongoDB Vector Search.

Import the following dependencies.

Add the following imports to the top of your main.go file.

import (
  // Other imports...
  "strings"
  "github.com/tmc/langchaingo/llms/openai"
  "github.com/tmc/langchaingo/chains"
  "github.com/tmc/langchaingo/prompts"
  "github.com/tmc/langchaingo/vectorstores"
)

Add the following code to the end of your main function and save the file.

This code does the following:

Instantiates MongoDB Vector Search as a retriever to query for semantically similar documents.
Defines a LangChainGo prompt template that instructs the LLM to use the retrieved documents as context for your query. LangChainGo populates these documents into the {{.context}} input variable and your query into the {{.question}} variable.
Constructs a chain that uses OpenAI's chat model to generate context-aware responses based on the provided prompt template.
Sends a sample query about painting for beginners to the chain, using the prompt and the retriever to gather relevant context.
Returns and prints the LLM's response and the documents used as context.

// Implements RAG to answer questions on your data
optionsVector := []vectorstores.Option{
	vectorstores.WithScoreThreshold(0.60),
}
retriever := vectorstores.ToRetriever(&store, 1, optionsVector...)
// Loads OpenAI API key from environment
openaiApiKey := os.Getenv("OPENAI_API_KEY")
if openaiApiKey == "" {
	log.Fatal("Set your OPENAI_API_KEY environment variable in the .env file")
}
// Creates an OpenAI LLM client
// The chat client does not generate embeddings in this step, so no embedding model is set
llm, err := openai.New(openai.WithToken(openaiApiKey), openai.WithModel("gpt-4o"))
if err != nil {
	log.Fatalf("Failed to create an LLM client: %v", err)
}
prompt := prompts.NewPromptTemplate(
	`Answer the question based on the following context:
	{{.context}}
	Question: {{.question}}`,
	[]string{"context", "question"},
)
llmChain := chains.NewLLMChain(llm, prompt)
ctx := context.Background()
const question = "How do I get started painting?"
documents, err := retriever.GetRelevantDocuments(ctx, question)
if err != nil {
	log.Fatalf("Failed to retrieve documents: %v", err)
}
var contextBuilder strings.Builder
for i, document := range documents {
	contextBuilder.WriteString(fmt.Sprintf("Document %d: %s\n", i+1, document.PageContent))
}
contextStr := contextBuilder.String()
inputs := map[string]interface{}{
	"context":  contextStr,
	"question": question,
}
out, err := chains.Call(ctx, llmChain, inputs)
if err != nil {
	log.Fatalf("Failed to run LLM chain: %v", err)
}
log.Println("Source documents:")
for i, doc := range documents {
	log.Printf("Document %d: %s\n", i+1, doc.PageContent)
}
responseText, ok := out["text"].(string)
if !ok {
	log.Println("Unexpected response type")
	return
}
log.Println("Question:", question)
log.Println("Generated Answer:", responseText)
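The {{.context}} and {{.question}} placeholders in the prompt above follow Go template syntax. To see how the substitution produces the final prompt text, the following sketch renders the same template with the standard library's text/template package. It is an illustration of the substitution step only, not LangChainGo's implementation, and the function name renderPrompt is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// promptText mirrors the template used in the tutorial's RAG code.
const promptText = `Answer the question based on the following context:
{{.context}}
Question: {{.question}}`

// renderPrompt fills the template with the given values and returns an
// error if a referenced variable is missing from the map.
func renderPrompt(values map[string]any) (string, error) {
	tmpl, err := template.New("prompt").Option("missingkey=error").Parse(promptText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, values); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Build the context string the same way the tutorial does.
	docs := []string{"Begin with a carefully prepared, primed canvas."}
	var ctxBuilder strings.Builder
	for i, d := range docs {
		fmt.Fprintf(&ctxBuilder, "Document %d: %s\n", i+1, d)
	}

	out, err := renderPrompt(map[string]any{
		"context":  ctxBuilder.String(),
		"question": "How do I get started painting?",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Printing the rendered prompt this way can help when you debug why the LLM did or did not use a retrieved document.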

Run the following command to execute your file.

After you save the file, run the following command. The generated response might vary.

go run main.go

Source documents:
Document 1: "Successful oil painting necessitates patience,
proper equipment, and technique. Begin with a carefully
prepared, primed canvas. Sketch your composition lightly before
applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over
lean,' meaning each subsequent layer should contain more oil to
prevent cracking. Allow each layer to dry before applying
another. Clean your brushes often and avoid solvents that might
damage them. Finally, always work in a well-ventilated space."
Question: How do I get started painting?
Generated Answer: To get started painting, you should begin with a
carefully prepared, primed canvas. Sketch your composition lightly
before applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over lean,'
meaning each subsequent layer should contain more oil to prevent
cracking. Allow each layer to dry before applying another. Clean
your brushes often and avoid solvents that might damage them.
Finally, always work in a well-ventilated space.

Prerequisites

To complete this tutorial with an OpenAI embedding model instead of Voyage AI, you must have the following:

One of the following MongoDB cluster types:
- An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.
- A local Atlas deployment created using the Atlas CLI. To learn more, see Create a Local Atlas Deployment.
- A MongoDB Community or Enterprise cluster with Search and Vector Search installed.
An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.
A terminal and code editor to run your Go project.
Go installed on your machine.

Set Up the Environment

First, set up the environment for this tutorial by completing the following steps.

Initialize your Go project.

Run the following commands in your terminal to create a new directory named langchaingo-mongodb and initialize your project:

mkdir langchaingo-mongodb
cd langchaingo-mongodb
go mod init langchaingo-mongodb

Install dependencies.

Run the following commands:

go get github.com/joho/godotenv
go get github.com/tmc/langchaingo/chains
go get github.com/tmc/langchaingo/llms
go get github.com/tmc/langchaingo/prompts
go get github.com/tmc/langchaingo/vectorstores/mongovector
go get go.mongodb.org/mongo-driver/v2/mongo
go mod tidy

Initialize your environment variables.

In your langchaingo-mongodb project directory, create a .env file and add the following lines:

OPENAI_API_KEY="<api-key>"
MONGODB_URI="<connection-string>"

Replace the placeholder values with your OpenAI API key and the SRV connection string for your MongoDB cluster. Your connection string should use the following format:

mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/<dbname>

Create your main file.

In your langchaingo-mongodb project directory, create a file named main.go. You will add code to this file throughout this tutorial.

Use MongoDB as a Vector Store

In this section, you write code in your main() function that loads custom data into MongoDB and instantiates MongoDB as a vector database, also called a vector store.

Import the following dependencies.

Add the following imports to the top of your main.go file.

package main

import (
	"context"
	"log"
	"os"

	"github.com/joho/godotenv"
	"github.com/tmc/langchaingo/embeddings"
	"github.com/tmc/langchaingo/llms/openai"
	"github.com/tmc/langchaingo/schema"
	"github.com/tmc/langchaingo/vectorstores/mongovector"
	"go.mongodb.org/mongo-driver/v2/mongo"
	"go.mongodb.org/mongo-driver/v2/mongo/options"
)

Define the vector store details.

The following code performs these actions:

Configures Atlas as a vector store by specifying the following:
- langchaingo_db.test as the collection in Atlas to store the documents.
- vector_index as the index to use for querying the vector store.
- text as the name of the field containing the raw text content.
- embeddings as the name of the field containing the vector embeddings.
Prepares your custom data by doing the following:
- Defines text for each document.
- Uses LangChainGo's mongovector package to generate embeddings for the texts. This package stores document embeddings in MongoDB and enables searches on stored embeddings.
- Constructs documents that include text, embeddings, and metadata.
Ingests the constructed documents into Atlas and instantiates the vector store.

Paste the following code into your main.go file:

// Defines the document structure
type Document struct {
	PageContent string            `bson:"text"`
	Embedding   []float32         `bson:"embedding"`
	Metadata    map[string]string `bson:"metadata"`
}
func main() {
	const (
		openAIEmbeddingModel = "text-embedding-3-small"
		openAIEmbeddingDim   = 1536
		similarityAlgorithm  = "dotProduct"
		indexName            = "vector_index"
		databaseName         = "langchaingo_db"
		collectionName       = "test"
	)
	if err := godotenv.Load(); err != nil {
		log.Fatal("No .env file found")
	}
	// Loads the MongoDB URI from environment
	uri := os.Getenv("MONGODB_URI")
	if uri == "" {
		log.Fatal("Set your 'MONGODB_URI' environment variable in the .env file")
	}
	// Loads the API key from environment
	apiKey := os.Getenv("OPENAI_API_KEY")
	if apiKey == "" {
		log.Fatal("Set your OPENAI_API_KEY environment variable in the .env file")
	}
	// Connects to MongoDB
	client, err := mongo.Connect(options.Client().ApplyURI(uri))
	if err != nil {
		log.Fatalf("Failed to connect to server: %v", err)
	}
	defer func() {
		if err := client.Disconnect(context.Background()); err != nil {
			log.Fatalf("Error disconnecting the client: %v", err)
		}
	}()
	log.Println("Connected to MongoDB.")
	// Selects the database and collection
	coll := client.Database(databaseName).Collection(collectionName)
	// Creates an OpenAI LLM embedder client
	llm, err := openai.New(openai.WithEmbeddingModel(openAIEmbeddingModel))
	if err != nil {
		log.Fatalf("Failed to create an embedder client: %v", err)
	}
	// Creates an embedder from the embedder client
	embedder, err := embeddings.NewEmbedder(llm)
	if err != nil {
		log.Fatalf("Failed to create an embedder: %v", err)
	}
	// Creates a new MongoDB vector store
	store := mongovector.New(coll, embedder, mongovector.WithIndex(indexName), mongovector.WithPath("embeddings"))
	// Checks if the collection is empty, and if empty, adds documents to the MongoDB database vector store
	if isCollectionEmpty(coll) {
		documents := []schema.Document{
			{
				PageContent: "Proper tuber planting involves site selection, proper timing, and exceptional care. Choose spots with well-drained soil and adequate sun exposure. Tubers are generally planted in spring, but depending on the plant, timing varies. Always plant with the eyes facing upward at a depth two to three times the tuber's height. Ensure 4 inch spacing between small tubers, expand to 12 inches for large ones. Adequate moisture is needed, yet do not overwater. Mulching can help preserve moisture and prevent weed growth.",
				Metadata: map[string]any{
					"author": "A",
					"type":   "post",
				},
			},
			{
				PageContent: "Successful oil painting necessitates patience, proper equipment, and technique. Begin with a carefully prepared, primed canvas. Sketch your composition lightly before applying paint. Use high-quality brushes and oils to create vibrant, long-lasting artworks. Remember to paint 'fat over lean,' meaning each subsequent layer should contain more oil to prevent cracking. Allow each layer to dry before applying another. Clean your brushes often and avoid solvents that might damage them. Finally, always work in a well-ventilated space.",
				Metadata: map[string]any{
					"author": "B",
					"type":   "post",
				},
			},
			{
				PageContent: "For a natural lawn, selection of the right grass type suitable for your climate is crucial. Balanced watering, generally 1 to 1.5 inches per week, is important; overwatering invites disease. Opt for organic fertilizers over synthetic versions to provide necessary nutrients and improve soil structure. Regular lawn aeration helps root growth and prevents soil compaction. Practice natural pest control and consider overseeding to maintain a dense sward, which naturally combats weeds and pest.",
				Metadata: map[string]any{
					"author": "C",
					"type":   "post",
				},
			},
		}
		_, err := store.AddDocuments(context.Background(), documents)
		if err != nil {
			log.Fatalf("Error adding documents: %v", err)
		}
		log.Printf("Successfully added %d documents to the collection.\n", len(documents))
	} else {
		log.Println("Documents already exist in the collection, skipping document addition.")
	}
}
func isCollectionEmpty(coll *mongo.Collection) bool {
	count, err := coll.EstimatedDocumentCount(context.Background())
	if err != nil {
		log.Fatalf("Failed to count documents in the collection: %v", err)
	}
	return count == 0
}
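The openAIEmbeddingDim constant above (1536 for text-embedding-3-small) is later passed to the index definition as numDimensions, so the vectors you store must have exactly that length. The following sketch shows one way to check this before ingestion. The helper checkDimensions is hypothetical and the zero-filled vectors are placeholders rather than real embeddings.

```go
package main

import "fmt"

// checkDimensions is a hypothetical helper that verifies every embedding has
// the length the index definition expects.
func checkDimensions(vectors [][]float32, want int) error {
	for i, v := range vectors {
		if len(v) != want {
			return fmt.Errorf("vector %d has %d dimensions, index expects %d", i, len(v), want)
		}
	}
	return nil
}

func main() {
	const openAIEmbeddingDim = 1536

	// Placeholder vectors: the second one has the wrong length on purpose.
	vectors := [][]float32{make([]float32, 1536), make([]float32, 1024)}
	if err := checkDimensions(vectors, openAIEmbeddingDim); err != nil {
		fmt.Println("dimension mismatch:", err)
	}
}
```

A mismatch like this typically appears when you switch embedding models without recreating the index.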

Run your Go project.

Save the file, then run the following command to load your data into MongoDB.

go run main.go

Connected to MongoDB.
Successfully added 3 documents to the collection.

Tip

Después de main.go ejecutar, si está usando Atlas, puede verificar sus incrustaciones vectoriales navegando al espacio de nombres en langchaingo_db.test la interfaz de usuario de Atlas.

Create the MongoDB Vector Search Index

To enable vector search queries on your vector store, create a MongoDB Vector Search index on the langchaingo_db.test collection.

Add the following imports to the top of your main.go file:

import (
  // Other imports...
  "fmt"
  "time"
  "go.mongodb.org/mongo-driver/v2/bson"
)

Define the following functions in your main.go file, outside of the main() function. These functions create and manage a vector search index for your MongoDB collection:

The SearchIndexExists function checks whether a search index with the specified name exists and is queryable.
The CreateVectorSearchIndex function creates a vector search index on the specified collection. This function blocks until the index is created and queryable.

// Checks if the search index exists
func SearchIndexExists(ctx context.Context, coll *mongo.Collection, idx string) (bool, error) {
	log.Println("Checking if search index exists.")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idx).SetType("vectorSearch")
	cursor, err := view.List(ctx, siOpts)
	if err != nil {
		return false, fmt.Errorf("failed to list search indexes: %w", err)
	}
	for cursor.Next(ctx) {
		index := struct {
			Name      string `bson:"name"`
			Queryable bool   `bson:"queryable"`
		}{}
		if err := cursor.Decode(&index); err != nil {
			return false, fmt.Errorf("failed to decode search index: %w", err)
		}
		if index.Name == idx && index.Queryable {
			return true, nil
		}
	}
	if err := cursor.Err(); err != nil {
		return false, fmt.Errorf("cursor error: %w", err)
	}
	return false, nil
}
// Creates a vector search index. This function blocks until the index has been
// created.
func CreateVectorSearchIndex(
	ctx context.Context,
	coll *mongo.Collection,
	idxName string,
	openAIEmbeddingDim int,
	similarityAlgorithm string,
) (string, error) {
	type vectorField struct {
		Type          string `bson:"type,omitempty"`
		Path          string `bson:"path,omitempty"`
		NumDimensions int    `bson:"numDimensions,omitempty"`
		Similarity    string `bson:"similarity,omitempty"`
	}
	fields := []vectorField{
		{
			Type:          "vector",
			Path:          "embeddings",
			NumDimensions: openAIEmbeddingDim,
			Similarity:    similarityAlgorithm,
		},
		{
			Type: "filter",
			Path: "metadata.author",
		},
		{
			Type: "filter",
			Path: "metadata.type",
		},
	}
	def := struct {
		Fields []vectorField `bson:"fields"`
	}{
		Fields: fields,
	}
	log.Println("Creating vector search index...")
	view := coll.SearchIndexes()
	siOpts := options.SearchIndexes().SetName(idxName).SetType("vectorSearch")
	searchName, err := view.CreateOne(ctx, mongo.SearchIndexModel{Definition: def, Options: siOpts})
	if err != nil {
		return "", fmt.Errorf("failed to create the search index: %w", err)
	}
	// Awaits the creation of the index
	var doc bson.Raw
	for doc == nil {
		cursor, err := view.List(ctx, options.SearchIndexes().SetName(searchName))
		if err != nil {
			return "", fmt.Errorf("failed to list search indexes: %w", err)
		}
		if !cursor.Next(ctx) {
			break
		}
		name := cursor.Current.Lookup("name").StringValue()
		queryable := cursor.Current.Lookup("queryable").Boolean()
		if name == searchName && queryable {
			doc = cursor.Current
		} else {
			time.Sleep(5 * time.Second)
		}
	}
	return searchName, nil
}
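
The wait loop in CreateVectorSearchIndex sleeps for five seconds between checks and has no upper bound of its own. The following is a minimal sketch, using only the standard library, of the same polling idea with a deadline. The names waitUntilQueryable and simulatePolling, and the check callback, are hypothetical stand-ins for listing the search index and reading its queryable field; they are not part of the MongoDB Go driver.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// waitUntilQueryable calls check at the given interval until it reports true,
// returns an error, or ctx is done. The check callback is a hypothetical
// stand-in for listing the search index and reading its "queryable" field.
func waitUntilQueryable(ctx context.Context, interval time.Duration, check func(context.Context) (bool, error)) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		ok, err := check(ctx)
		if err != nil {
			return fmt.Errorf("failed to check index status: %w", err)
		}
		if ok {
			return nil
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("index not queryable before deadline: %w", ctx.Err())
		case <-ticker.C:
			// Interval elapsed; poll again.
		}
	}
}

// simulatePolling runs waitUntilQueryable against a fake index that becomes
// queryable on poll number readyAfter, and reports how many polls were made.
func simulatePolling(readyAfter, timeoutMillis int) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMillis)*time.Millisecond)
	defer cancel()
	polls := 0
	err := waitUntilQueryable(ctx, 10*time.Millisecond, func(context.Context) (bool, error) {
		polls++
		return polls >= readyAfter, nil
	})
	return polls, err
}

func main() {
	polls, err := simulatePolling(3, 2000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("index queryable after %d polls\n", polls)
}
```

In a real program, the callback would wrap the List call on the collection's search index view, and the timeout would be chosen to suit how long index builds take on your cluster.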

Create the vector store collection and index by calling the preceding functions in your main() function. Add the following code to the end of your main() function:

// SearchIndexExists will return true if the provided index is defined for the
// collection. This operation blocks until the search completes.
if ok, _ := SearchIndexExists(context.Background(), coll, indexName); !ok {
	// Creates the vector store collection
	err = client.Database(databaseName).CreateCollection(context.Background(), collectionName)
	if err != nil {
		log.Fatalf("failed to create vector store collection: %v", err)
	}
	_, err = CreateVectorSearchIndex(context.Background(), coll, indexName, openAIEmbeddingDim, similarityAlgorithm)
	if err != nil {
		log.Fatalf("failed to create index: %v", err)
	}
	log.Println("Successfully created vector search index.")
} else {
	log.Println("Vector search index already exists.")
}

Save the file, then run the following command to create your MongoDB Vector Search index.

go run main.go

Checking if search index exists.
Creating vector search index...
Successfully created vector search index.

Tip

After you run main.go, you can view your vector search index in the Atlas UI by navigating to the langchaingo_db.test collection in your cluster.
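
To see the shape of the index definition that CreateVectorSearchIndex sends, it can help to print it as JSON. The sketch below mirrors the tutorial's vectorField struct but uses JSON tags instead of BSON tags. The helper names buildDefinition and definitionJSON are hypothetical, and the values 1536 and "dotProduct" are assumptions standing in for the openAIEmbeddingDim and similarityAlgorithm constants defined earlier in main.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// vectorField mirrors the struct used in CreateVectorSearchIndex, with JSON
// tags in place of BSON tags so the definition can be printed.
type vectorField struct {
	Type          string `json:"type,omitempty"`
	Path          string `json:"path,omitempty"`
	NumDimensions int    `json:"numDimensions,omitempty"`
	Similarity    string `json:"similarity,omitempty"`
}

type indexDefinition struct {
	Fields []vectorField `json:"fields"`
}

// buildDefinition returns one vector field plus the two metadata filter
// fields that the tutorial indexes.
func buildDefinition(dims int, similarity string) indexDefinition {
	return indexDefinition{Fields: []vectorField{
		{Type: "vector", Path: "embeddings", NumDimensions: dims, Similarity: similarity},
		{Type: "filter", Path: "metadata.author"},
		{Type: "filter", Path: "metadata.type"},
	}}
}

// definitionJSON renders the definition as indented JSON.
func definitionJSON(dims int, similarity string) (string, error) {
	b, err := json.MarshalIndent(buildDefinition(dims, similarity), "", "  ")
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Assumed values; use the constants from your own main.go.
	out, err := definitionJSON(1536, "dotProduct")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Because of the omitempty tags, the two filter fields serialize with only type and path, which is why a single struct type can describe both kinds of field.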

Run Vector Search Queries

This section demonstrates two queries that you can run on your vectorized data now that you've created the index: a basic semantic search and a semantic search with metadata filtering.

For the basic semantic search, add the following code to your main function and save the file.

// Performs basic semantic search
docs, err := store.SimilaritySearch(context.Background(), "Prevent weeds", 1)
if err != nil {
   fmt.Println("Error performing search:", err)
}
fmt.Println("Semantic Search Results:", docs)

Run the following command to execute the query.

go run main.go

Semantic Search Results: [{For a natural lawn, selection of
the right grass type suitable for your climate is crucial.
Balanced watering, generally 1 to 1.5 inches per week, is
important; overwatering invites disease. Opt for organic
fertilizers over synthetic versions to provide necessary
nutrients and improve soil structure. Regular lawn aeration
helps root growth and prevents soil compaction. Practice
natural pest control and consider overseeding to maintain a
dense sward, which naturally combats weeds and pest.
map[author:C type:post] 0.69752026}]
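
Semantic search ranks documents by how close their embeddings are to the query's embedding. The following toy sketch illustrates that ranking with cosine similarity. The three-dimensional vectors are made up for illustration; real embeddings come from the embedding model and have many more dimensions, and the scores that MongoDB Vector Search reports are normalized, so they won't match these raw values.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

type scored struct {
	Label string
	Score float64
}

// rank scores every candidate against the query and sorts by descending score.
func rank(query []float64, candidates map[string][]float64) []scored {
	out := make([]scored, 0, len(candidates))
	for label, vec := range candidates {
		out = append(out, scored{Label: label, Score: cosine(query, vec)})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	// Made-up vectors standing in for the three sample documents.
	candidates := map[string][]float64{
		"tubers":   {0.9, 0.1, 0.2},
		"painting": {0.1, 0.9, 0.1},
		"lawn":     {0.8, 0.2, 0.6},
	}
	// Made-up vector standing in for the query "Prevent weeds".
	query := []float64{0.7, 0.1, 0.7}
	for _, s := range rank(query, candidates) {
		fmt.Printf("%-8s %.3f\n", s.Label, s.Score)
	}
}
```

With these vectors, the lawn entry ranks first and the painting entry last, which mirrors why the lawn care document is returned for the query above.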

Add the following dependency.

For the semantic search with metadata filtering, add the following import to your main.go file:

import (
  // Other imports...
  "github.com/tmc/langchaingo/vectorstores"
)

Add the following code to your main function and save the file.

The following code uses the SimilaritySearch() method to perform a semantic search for the string "Tulip care". It specifies the following parameters:

The number of documents to return as 1.
A score threshold of 0.60.

The query returns the document that matches the filter metadata.type: post and meets the score threshold.

// Performs semantic search with metadata filter
filter := map[string]interface{}{
   "metadata.type": "post",
}
docs, err := store.SimilaritySearch(context.Background(), "Tulip care", 1,
   vectorstores.WithScoreThreshold(0.60),
   vectorstores.WithFilters(filter))
if err != nil {
   fmt.Println("Error performing search:", err)
}
fmt.Println("Filter Search Results:", docs)

Run the following command to execute the query.

go run main.go

Filter Search Results: [{Proper tuber planting involves site
selection, proper timing, and exceptional care. Choose spots
with well-drained soil and adequate sun exposure. Tubers are
generally planted in spring, but depending on the plant,
timing varies. Always plant with the eyes facing upward at a
depth two to three times the tuber's height. Ensure 4 inch
spacing between small tubers, expand to 12 inches for large
ones. Adequate moisture is needed, yet do not overwater.
Mulching can help preserve moisture and prevent weed growth.
map[author:A type:post] 0.64432365}]
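
The interplay of the metadata filter, the score threshold, and the document limit can be sketched in memory. This is only an illustration of the selection logic, not how MongoDB Vector Search is implemented: the database applies the filter before comparing vectors, whereas this sketch starts from scores that are already computed. The result type, the search and matches helpers, the sample scores, and the "draft" entry are all hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

type result struct {
	Text     string
	Metadata map[string]string
	Score    float64
}

// matches reports whether md contains every key/value pair in filter.
func matches(md, filter map[string]string) bool {
	for key, want := range filter {
		if md[key] != want {
			return false
		}
	}
	return true
}

// search keeps results whose metadata matches the filter and whose score is
// at least threshold, then returns at most k of them, best first.
func search(results []result, filter map[string]string, threshold float64, k int) []result {
	var kept []result
	for _, r := range results {
		if r.Score < threshold || !matches(r.Metadata, filter) {
			continue
		}
		kept = append(kept, r)
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].Score > kept[j].Score })
	if len(kept) > k {
		kept = kept[:k]
	}
	return kept
}

func main() {
	// Hypothetical precomputed scores for a query such as "Tulip care".
	results := []result{
		{"Tuber planting", map[string]string{"author": "A", "type": "post"}, 0.64},
		{"Oil painting", map[string]string{"author": "B", "type": "post"}, 0.52},
		{"Lawn care", map[string]string{"author": "C", "type": "post"}, 0.61},
		{"Draft note", map[string]string{"author": "A", "type": "draft"}, 0.90},
	}
	for _, r := range search(results, map[string]string{"type": "post"}, 0.60, 1) {
		fmt.Printf("%s (%.2f)\n", r.Text, r.Score)
	}
}
```

The highest-scoring entry is excluded because its type does not match the filter, and the painting entry is excluded because it falls below the threshold, so only the tuber entry remains when the limit is 1.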

Answer Questions on Your Data

Import the following dependencies.

Add the following imports to the top of your main.go file.

import (
  // Other imports...
  "strings"
  "github.com/tmc/langchaingo/chains"
  "github.com/tmc/langchaingo/prompts"
  "github.com/tmc/langchaingo/vectorstores"
)

Add the following code to the end of your main function and save the file.

This code does the following:

Instantiates MongoDB Vector Search as a retriever to query for semantically similar documents.
Defines a LangChainGo prompt template that instructs the LLM to use the retrieved documents as context for your query. LangChainGo populates these documents into the {{.context}} input variable and your query into the {{.question}} variable.
Constructs a chain that uses OpenAI's chat model to generate context-aware responses based on the provided prompt template.
Sends a sample query about painting for beginners to the chain, using the prompt and the retriever to gather relevant context.
Returns and prints the LLM's response and the documents used as context.

// Implements RAG to answer questions on your data
optionsVector := []vectorstores.Option{
	vectorstores.WithScoreThreshold(0.60),
}
retriever := vectorstores.ToRetriever(&store, 1, optionsVector...)
prompt := prompts.NewPromptTemplate(
	`Answer the question based on the following context:
	{{.context}}
	Question: {{.question}}`,
	[]string{"context", "question"},
)
llmChain := chains.NewLLMChain(llm, prompt)
ctx := context.Background()
const question = "How do I get started painting?"
documents, err := retriever.GetRelevantDocuments(ctx, question)
if err != nil {
	log.Fatalf("Failed to retrieve documents: %v", err)
}
var contextBuilder strings.Builder
for i, document := range documents {
	contextBuilder.WriteString(fmt.Sprintf("Document %d: %s\n", i+1, document.PageContent))
}
contextStr := contextBuilder.String()
inputs := map[string]interface{}{
	"context":  contextStr,
	"question": question,
}
out, err := chains.Call(ctx, llmChain, inputs)
if err != nil {
	log.Fatalf("Failed to run LLM chain: %v", err)
}
log.Println("Source documents:")
for i, doc := range documents {
	log.Printf("Document %d: %s\n", i+1, doc.PageContent)
}
responseText, ok := out["text"].(string)
if !ok {
	log.Println("Unexpected response type")
	return
}
log.Println("Question:", question)
log.Println("Generated Answer:", responseText)

Run your file.

After you save the file, run the following command. The generated response might vary.

go run main.go

Source documents:
Document 1: "Successful oil painting necessitates patience,
proper equipment, and technique. Begin with a carefully
prepared, primed canvas. Sketch your composition lightly before
applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over
lean,' meaning each subsequent layer should contain more oil to
prevent cracking. Allow each layer to dry before applying
another. Clean your brushes often and avoid solvents that might
damage them. Finally, always work in a well-ventilated space."
Question: How do I get started painting?
Generated Answer: To get started painting, you should begin with a
carefully prepared, primed canvas. Sketch your composition lightly
before applying paint. Use high-quality brushes and oils to create
vibrant, long-lasting artworks. Remember to paint 'fat over lean,'
meaning each subsequent layer should contain more oil to prevent
cracking. Allow each layer to dry before applying another. Clean
your brushes often and avoid solvents that might damage them.
Finally, always work in a well-ventilated space.
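
To make the prompt step more concrete, the following sketch renders the same template text with Go's standard text/template package, so you can see what the LLM receives once {{.context}} and {{.question}} are filled in. It assumes LangChainGo's default Go template format; LangChainGo's own rendering may differ in details. The buildContext and renderPrompt helpers are hypothetical names, and the passage is shortened from the sample data.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

const ragPrompt = `Answer the question based on the following context:
{{.context}}
Question: {{.question}}`

// buildContext numbers each retrieved passage, as the tutorial code does.
func buildContext(passages []string) string {
	var b strings.Builder
	for i, p := range passages {
		fmt.Fprintf(&b, "Document %d: %s\n", i+1, p)
	}
	return b.String()
}

// renderPrompt fills the template's context and question variables.
func renderPrompt(passages []string, question string) (string, error) {
	tmpl, err := template.New("rag").Parse(ragPrompt)
	if err != nil {
		return "", fmt.Errorf("failed to parse template: %w", err)
	}
	var b strings.Builder
	err = tmpl.Execute(&b, map[string]any{
		"context":  buildContext(passages),
		"question": question,
	})
	if err != nil {
		return "", fmt.Errorf("failed to render template: %w", err)
	}
	return b.String(), nil
}

func main() {
	out, err := renderPrompt(
		[]string{"Begin with a carefully prepared, primed canvas."},
		"How do I get started painting?",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Printing the rendered prompt this way is a quick check that the retrieved passages, and not an empty string, reach the model, which can happen when the score threshold filters out every document.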

After completing this tutorial, you have successfully integrated MongoDB Vector Search with LangChainGo to build a RAG application. You have accomplished the following:

Started and configured the environment needed to support your application.
Stored custom data in MongoDB and instantiated MongoDB as a vector store.
Created a MongoDB Vector Search index on your data, enabling semantic search capabilities.
Used vector embeddings to retrieve semantically relevant data.
Enhanced search results by incorporating metadata filters.
Implemented a RAG workflow using MongoDB Vector Search to provide meaningful answers to questions based on your data.

Next Steps

To learn more about getting started with MongoDB Vector Search, see the MongoDB Vector Search Quick Start and select Go from the drop-down menu.
To learn more about vector embeddings, see How to Create Vector Embeddings and select Go from the drop-down menu.
To learn how to integrate LangChainGo and Hugging Face, see Retrieval-Augmented Generation (RAG) with MongoDB.
To learn how to implement RAG without the need for API keys or credits, see Build a Local RAG Implementation with MongoDB Vector Search.

MongoDB also provides the following developer resource:

AI Learning Hub

Tip

To learn more about integrating LangChainGo, OpenAI, and MongoDB, see Using MongoDB Atlas as a Vector Store with OpenAI Embeddings.
