MongoDB Vector Search Quick Start

MongoDB Search and MongoDB Vector Search with MongoDB Community are in Preview. The features and the corresponding documentation might change at any time during the Preview period. To learn more, see Preview Features.


Learning Summary

This quick start focused on retrieving documents from your cluster that contain text that is semantically related to a provided query. However, you can create a vector search index on embeddings that represent any type of data that you might write to your cluster, such as images or videos.

Sample Data

This quick start uses the sample_mflix.embedded_movies collection, which contains details about movies. In each document in the collection, the plot_embedding_voyage_3_large field contains a vector embedding that represents the string in the plot field. For more information on the schema of the documents in the collection, see Sample Mflix Dataset.

By storing your source data and its corresponding vector embeddings in the same document, you can leverage both fields for complex queries or hybrid search. You can even store vector embeddings generated from different embedding models in the same document to streamline your workflow as you test the performance of different vector embedding models for your specific use case.
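As a rough illustration of that layout, the sketch below shows a single document that carries its source text alongside two embedding fields. Only plot and plot_embedding_voyage_3_large come from this quick start. The plot_embedding_other_model field, the embedding_fields helper, the title and plot values, and the very short vectors are hypothetical placeholders (real embeddings have far more dimensions).

```python
# Hypothetical document shape: source text and its embeddings side by side.
movie = {
    "title": "Example Movie",  # placeholder value
    "plot": "A crew travels through time to save the world.",  # placeholder value
    # Field name used by the quick start's sample collection (vector shortened here).
    "plot_embedding_voyage_3_large": [0.12, -0.03, 0.44],
    # Hypothetical second field for comparing a different embedding model.
    "plot_embedding_other_model": [0.08, 0.21, -0.17, 0.05],
}


def embedding_fields(doc):
    """Return {field name: number of dimensions} for every list-of-numbers field."""
    return {
        name: len(value)
        for name, value in doc.items()
        if isinstance(value, list)
        and all(isinstance(x, (int, float)) for x in value)
    }


print(embedding_fields(movie))
```

Because each model writes to its own field, each field can be indexed and queried separately while the source text stays in one place.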

Vector Embeddings

The vector embeddings in the sample_mflix.embedded_movies collection and in the example query were created using the Voyage AI voyage-3-large embedding model. Your choice of embedding model informs the vector dimensions and vector similarity function you use in your vector search index. You can use any embedding model you like, and it is worth experimenting with different models as accuracy can vary from model to model depending on your specific use case.
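To make the idea of a similarity function more concrete, here is a minimal sketch of plain cosine similarity between two vectors, written with the Python standard library only. The function name cosine_similarity is an assumption for illustration. This is the textbook formula, not MongoDB's internal implementation, and the score that a vector search returns may be scaled differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        # Query and stored vectors must share the same number of dimensions.
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

The dimension check mirrors why the index must declare the same number of dimensions as the embedding model produces: vectors of different lengths cannot be compared.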

To learn how to create vector embeddings of your own data, see How to Create Vector Embeddings.

Vector Index Definition

An index is a data structure that holds a subset of the data from a collection's documents in order to improve database performance for specific queries. A vector search index points to the fields that contain your vector embeddings. It also specifies the number of dimensions of your vectors and the function used to measure similarity between query vectors and the vectors stored in the database.

Because the voyage-3-large embedding model used in this quick start converts data into vector embeddings with 2048 dimensions and supports the cosine function, this vector search index specifies the same number of vector dimensions and similarity function.
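The index described above might be expressed as follows. This is a sketch, assuming the PyMongo driver and an index named vector_index. The index name and the helper names build_vector_index_definition and create_vector_index are illustrative assumptions, while the field path, dimensions, and similarity function come from this quick start.

```python
def build_vector_index_definition(path, num_dimensions, similarity):
    """Return a vector search index definition as a plain dict."""
    return {
        "fields": [
            {
                "type": "vector",                 # index this field as a vector
                "path": path,                     # field that stores the embedding
                "numDimensions": num_dimensions,  # must match the embedding model
                "similarity": similarity,         # for example "cosine"
            }
        ]
    }


definition = build_vector_index_definition(
    path="plot_embedding_voyage_3_large",
    num_dimensions=2048,
    similarity="cosine",
)


def create_vector_index(collection, name="vector_index"):
    """Create the index on a PyMongo collection (requires a running deployment)."""
    # Imported lazily so the definition above can be built without PyMongo installed.
    from pymongo.operations import SearchIndexModel

    model = SearchIndexModel(definition=definition, name=name, type="vectorSearch")
    return collection.create_search_index(model=model)
```

If you switch to an embedding model with a different output size, only the num_dimensions argument (and possibly similarity) needs to change.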

Vector Search Query

The query you ran in this quick start is an aggregation pipeline. Its $vectorSearch stage performs an Approximate Nearest Neighbor (ANN) search, and the $project stage that follows refines the results. To see all the options for a vector search query, including how to run an Exact Nearest Neighbor (ENN) search or narrow the scope of your vector search with the filter option, see Run Vector Search Queries.
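A pipeline of this shape could be assembled as in the sketch below. The helper name build_vector_search_pipeline, the index name vector_index, the default values for limit and num_candidates, and the projected title field are assumptions for illustration and may differ from the query you actually ran.

```python
def build_vector_search_pipeline(
    query_vector,
    limit=10,
    num_candidates=150,
    exact=False,
    index="vector_index",
    path="plot_embedding_voyage_3_large",
):
    """Build a $vectorSearch + $project pipeline as a list of plain dicts."""
    stage = {
        "index": index,
        "path": path,
        "queryVector": list(query_vector),
        "limit": limit,
    }
    if exact:
        # ENN search: every indexed vector is compared, so numCandidates is omitted.
        stage["exact"] = True
    else:
        # ANN search: numCandidates bounds how many neighbors are considered.
        if num_candidates < limit:
            raise ValueError("num_candidates must be at least limit")
        stage["numCandidates"] = num_candidates
    return [
        {"$vectorSearch": stage},
        {
            "$project": {
                "_id": 0,
                "title": 1,
                "plot": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


ann_pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], limit=5)
enn_pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], limit=5, exact=True)
```

With PyMongo, you would pass either pipeline to collection.aggregate(), using a query vector produced by the same embedding model as the stored vectors.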

Next Steps

To learn how to create embeddings from data and load them into Atlas, see Create Embeddings.
To learn how to implement retrieval-augmented generation (RAG), see Retrieval-Augmented Generation (RAG) with MongoDB.
To integrate MongoDB Vector Search with popular AI frameworks and services, see MongoDB AI Integrations.
To build production-ready AI chatbots using MongoDB Vector Search, see the MongoDB Chatbot Framework.
To learn how to implement RAG without the need for API keys or credits, see Build a Local RAG Implementation with MongoDB Vector Search.
