
How to Create Vector Embeddings

You can store vector embeddings alongside your other MongoDB data. These embeddings capture meaningful relationships in your data and allow you to perform semantic search and implement RAG.

Use the following tutorial to learn how to create vector embeddings and query them using vector search. Specifically, you perform the following actions:

  1. Define a function that uses an embedding model to generate vector embeddings.

    Select whether you want to use a proprietary or open-source model. For state-of-the-art embeddings, use Voyage AI.

  2. Create embeddings from your data and store them in MongoDB.

    Select whether you want to create embeddings from new data or from existing data that you already have in a MongoDB collection.

  3. Create embeddings from your search terms and run a vector search query.

For production applications, you typically write a script to generate vector embeddings. You can start with the sample code on this page and customize it for your use case.

To complete this tutorial, you must have the following:

In this section, you create vector embeddings from your data using the function that you defined, and then you store these embeddings in a MongoDB collection.
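As a rough sketch of this step, the following Python pairs each piece of text with its embedding so that both can be stored in the same document. The `build_documents` helper, the `text` and `embedding` field names, and the `toy_embed` placeholder are assumptions made for illustration. In a real application, you would pass in the embedding function that you defined for your chosen model.

```python
from typing import Callable, Dict, Iterable, List

# Any function that maps a string to a list of floats can act as the embedder.
Embedder = Callable[[str], List[float]]


def build_documents(texts: Iterable[str], embed: Embedder,
                    field: str = "embedding") -> List[Dict]:
    """Pair each text with its embedding so both live in one document."""
    docs = []
    for text in texts:
        vector = embed(text)
        docs.append({"text": text, field: [float(x) for x in vector]})
    return docs


def toy_embed(text: str) -> List[float]:
    # Placeholder only: this is NOT a real embedding model. It returns three
    # simple text statistics so the sketch runs without any external service.
    return [float(len(text)), float(text.count(" ") + 1),
            float(sum(map(ord, text)) % 97)]


docs = build_documents(["A ship sinks on its first voyage",
                        "A young lion reclaims his kingdom"], toy_embed)
# With PyMongo, a list like this could then be stored by calling
# collection.insert_many(docs) on your collection.
```

Keeping the embedding function as a parameter means you can switch models without changing the code that builds and stores documents.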

In this section, you index the vector embeddings in your collection and create an embedding that you use to run a sample vector search query.
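The query side of this step can be sketched as an aggregation pipeline that starts with a `$vectorSearch` stage. The helper name `build_vector_search_pipeline`, the index name `vector_index`, and the `text` and `embedding` field names are assumptions for illustration. The query vector must come from the same embedding model that produced the stored embeddings.

```python
from typing import Dict, List, Sequence


def build_vector_search_pipeline(query_vector: Sequence[float],
                                 index_name: str = "vector_index",
                                 path: str = "embedding",
                                 limit: int = 5,
                                 num_candidates: int = 100) -> List[Dict]:
    """Build an aggregation pipeline that runs a vector search query."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,          # assumed index name
                "path": path,                 # field that holds the embeddings
                "queryVector": [float(x) for x in query_vector],
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        {
            # Return the text and the similarity score, but not the vector itself.
            "$project": {
                "_id": 0,
                "text": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], limit=3)
# With PyMongo, you would run it with collection.aggregate(pipeline).
```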

The vector search returns documents whose embeddings are closest in distance to the embedding from your query. This indicates that they are similar in meaning.
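To make "closest in distance" concrete, here is a small brute-force sketch that ranks documents by cosine similarity to a query embedding. It only illustrates the idea of comparing vectors and is not how MongoDB Vector Search executes a query. The function names and the two-dimensional sample vectors are made up for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: Sequence[float], docs: List[Dict],
                       limit: int = 3) -> List[Tuple[float, str]]:
    """Score every document against the query and keep the best matches."""
    scored = [(cosine_similarity(query, d["embedding"]), d["text"]) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical two-dimensional embeddings; real models use hundreds or
# thousands of dimensions.
sample_docs = [
    {"text": "ocean voyage", "embedding": [1.0, 0.0]},
    {"text": "desert hike", "embedding": [0.0, 1.0]},
    {"text": "river cruise", "embedding": [0.9, 0.1]},
]
ranked = rank_by_similarity([1.0, 0.0], sample_docs, limit=2)
```

In this sample, the query vector points in the same direction as "ocean voyage" and almost the same direction as "river cruise", so those two documents rank ahead of "desert hike".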

Consider the following factors when creating vector embeddings:

To create vector embeddings, you must use an embedding model. Embedding models are algorithms that generate numerical representations of your data. Choose one of the following ways to access an embedding model:

  • Load an open-source model: If you don't have an API key for a proprietary embedding model, load an open-source embedding model locally from your application.

  • Use a proprietary model: Most AI providers offer APIs for their proprietary embedding models that you can use to create vector embeddings. For state-of-the-art embeddings, use Voyage AI.

  • Leverage an integration: You can integrate MongoDB Vector Search with open-source frameworks and AI services to quickly connect to both open-source and proprietary embedding models and generate vector embeddings for MongoDB Vector Search.

    To learn more, see Integrate MongoDB with AI Technologies.

The embedding model you choose affects your query results and determines the number of dimensions you specify in your MongoDB Vector Search index. Each model offers different advantages depending on your data and use case. For state-of-the-art embeddings, including multi-modal and domain-specific embedding models, use Voyage AI.
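Because the model determines the number of dimensions in the index, one way to keep the two in sync is to derive the index definition from a sample embedding. The following is a minimal sketch. The helper name `vector_index_definition` and the `embedding` field path are assumptions, and the similarity function shown is only a default for the example.

```python
from typing import Dict, Sequence

ALLOWED_SIMILARITY = {"cosine", "dotProduct", "euclidean"}


def vector_index_definition(sample_embedding: Sequence[float],
                            path: str = "embedding",
                            similarity: str = "cosine") -> Dict:
    """Build a vector search index definition sized to the model's output."""
    if len(sample_embedding) == 0:
        raise ValueError("sample_embedding must not be empty")
    if similarity not in ALLOWED_SIMILARITY:
        raise ValueError(f"similarity must be one of {sorted(ALLOWED_SIMILARITY)}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                # Must match the length of the vectors the model produces.
                "numDimensions": len(sample_embedding),
                "similarity": similarity,
            }
        ]
    }


# A stand-in for one embedding returned by a model with 1024 dimensions.
definition = vector_index_definition([0.0] * 1024)
```

If you later switch to a model with a different output length, regenerate both the embeddings and the index definition, since vectors of a different length do not match the indexed dimensions.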

When choosing an embedding model for MongoDB Vector Search, consider the following metrics:

  • Embedding Dimensions: The length of the vector embedding.

    Smaller embeddings are more storage efficient, while larger embeddings can capture more nuanced relationships in your data. The model you choose should strike a balance between efficiency and complexity.

  • Max Tokens: The maximum number of tokens that the model can encode into a single embedding.

  • Model Size: The size of the model in gigabytes.

    Larger models generally perform better, but they require more computational resources as you scale MongoDB Vector Search to production.

  • Retrieval Average: A score that measures the performance of retrieval systems.

    A higher score indicates that the model is better at ranking relevant documents higher in the list of retrieved results. This score is important when choosing a model for RAG applications.

If you have a large number of float vectors and want to reduce the storage and WiredTiger footprint (such as disk and memory usage) in mongod, compress your embeddings by converting them to binData vectors.

binData is a BSON data type that stores binary data. The default type for vector embeddings is an array of 32-bit floats (float32). Binary data is more storage efficient than the default array format and requires about one third of the disk space.

Storing binData vectors improves query performance because fewer resources are needed to load a document into the working set. This can significantly improve query speed for vector queries that return more than 20 documents. If you compress your float32 embeddings, you can query them with either float32 or binData vectors.

The tutorial on this page includes an example function that you can use to convert your float32 vectors to binData vectors.
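That function is not reproduced here. As a standard-library illustration of what such a conversion involves, the sketch below packs a float vector into a compact byte payload and unpacks it again. The two-byte header (a dtype byte of `0x27` for float32 followed by a padding byte) reflects our understanding of the BSON binary vector layout and should be treated as an assumption. In an application, prefer your driver's own helper (recent PyMongo versions provide `Binary.from_vector` for this purpose) instead of packing bytes by hand.

```python
import struct
from typing import List, Sequence

FLOAT32_DTYPE = 0x27  # assumed dtype byte for float32 vectors
HEADER_SIZE = 2       # dtype byte + padding byte


def pack_float32_vector(vector: Sequence[float]) -> bytes:
    """Serialize a vector as a 2-byte header plus little-endian float32 values."""
    header = struct.pack("<BB", FLOAT32_DTYPE, 0)
    return header + struct.pack(f"<{len(vector)}f", *vector)


def unpack_float32_vector(payload: bytes) -> List[float]:
    """Reverse pack_float32_vector and validate the header."""
    if len(payload) < HEADER_SIZE:
        raise ValueError("payload is too short to contain a header")
    dtype, padding = struct.unpack_from("<BB", payload, 0)
    if dtype != FLOAT32_DTYPE or padding != 0:
        raise ValueError("payload is not a float32 vector")
    count, remainder = divmod(len(payload) - HEADER_SIZE, 4)
    if remainder:
        raise ValueError("payload length is not a whole number of float32 values")
    return list(struct.unpack_from(f"<{count}f", payload, HEADER_SIZE))


packed = pack_float32_vector([0.5, -1.25, 2.0])
restored = unpack_float32_vector(packed)
```

Each value occupies exactly 4 bytes in the packed form. Values that are not exactly representable in float32 lose a small amount of precision when packed.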

BSON BinData vectors are supported by the following drivers:

Float vectors are typically difficult to compress because each element in the array has its own type (despite most vectors being uniformly typed). For this reason, converting the float vector output of an embedding model to a binData vector with subtype float32 is a more efficient serialization scheme. binData vectors store a single type descriptor for the entire vector, which reduces storage overhead.
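The per-element overhead can be estimated with some byte arithmetic. The sketch below assumes the standard BSON encoding, in which every array element carries a type byte, its index as a NUL-terminated string key, and an 8-byte double, and it assumes a binData vector with a 2-byte header and 4 bytes per float32 value. The function names are made up for the example, and the sizes cover only the field value, not the rest of the document.

```python
def bson_double_array_size(num_elements: int) -> int:
    """Estimate the encoded size of a BSON array of doubles, in bytes."""
    elements = sum(
        1                  # type byte, repeated for every element
        + len(str(i)) + 1  # array index stored as a NUL-terminated key
        + 8                # 64-bit double value
        for i in range(num_elements)
    )
    return 4 + elements + 1  # int32 length prefix + elements + terminator


def bson_float32_vector_size(num_elements: int) -> int:
    """Estimate the encoded size of a binData float32 vector, in bytes."""
    # int32 length prefix + subtype byte + 2-byte vector header + payload
    return 4 + 1 + 2 + 4 * num_elements


dimensions = 1024
array_bytes = bson_double_array_size(dimensions)     # 13231
vector_bytes = bson_float32_vector_size(dimensions)  # 4103
ratio = array_bytes / vector_bytes
```

For a 1024-dimension embedding, the estimate is 13,231 bytes for the array and 4,103 bytes for the binData vector, a ratio of a little over 3 to 1. The exact ratio depends on the number of dimensions because longer arrays need longer index keys.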

Consider the following strategies to ensure that your embeddings are correct and optimal:

Once you've learned how to create embeddings and query your embeddings with MongoDB Vector Search, start building generative AI applications by implementing retrieval-augmented generation (RAG):

You can also quantize your 32-bit float vector embeddings into fewer bits to further reduce resource consumption and improve query speed. To learn more, see Vector Quantization.
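To show the general idea of representing each value with fewer bits, here is a minimal min-max scalar quantization sketch that maps floats to 8-bit integer codes. It is a conceptual illustration only and is not a description of how MongoDB performs quantization. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantize_int8(vector: Sequence[float]) -> Tuple[List[int], float, float]:
    """Map each float to an integer code in [-128, 127].

    Returns the codes plus the offset and scale needed to approximate
    the original values again.
    """
    if len(vector) == 0:
        raise ValueError("vector must not be empty")
    low, high = min(vector), max(vector)
    # Use a scale of 1.0 when all values are equal to avoid dividing by zero.
    scale = (high - low) / 255 if high > low else 1.0
    codes = [round((x - low) / scale) - 128 for x in vector]
    return codes, low, scale


def dequantize_int8(codes: Sequence[int], low: float, scale: float) -> List[float]:
    """Approximate the original floats from their integer codes."""
    return [(code + 128) * scale + low for code in codes]


original = [-1.0, -0.2, 0.3, 1.0]
codes, low, scale = quantize_int8(original)
approx = dequantize_int8(codes, low, scale)
```

Each code fits in one byte instead of four, at the cost of a rounding error of at most half the scale per value.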
