Perform Semantic Search with Atlas Vector Search
Overview
You can perform semantic search on data in your Atlas cluster running MongoDB v6.0.11, v7.0.2, or later using Atlas Vector Search. You can store vector embeddings for any kind of data alongside the other data in your collection on the Atlas cluster. Atlas Vector Search supports embeddings with up to 2048 dimensions.
When you define an Atlas Vector Search index on your collection, you can seamlessly index vector data along with your other data and then perform semantic search against the indexed fields.
Atlas Vector Search uses the Hierarchical Navigable Small Worlds (HNSW) algorithm to perform the semantic search. You can use Atlas Vector Search support for approximate nearest neighbor (aNN) queries to find results similar to a selected product, to search for images, and for other similarity-based use cases.
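To make the idea of a nearest neighbor query concrete, the following sketch scores a handful of toy embeddings against a query vector with cosine similarity and returns the closest matches. It is an exact, brute-force search written only for illustration; it is not how Atlas Vector Search works internally, since an aNN search approximates this kind of ranking without comparing against every document. The function names, the embedding field name, and the sample data are hypothetical.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Exact (brute-force) top-k search: score every document, sort, keep k.
function exactNearest(docs, queryVector, k) {
  return docs
    .map((doc) => ({
      _id: doc._id,
      score: cosineSimilarity(doc.embedding, queryVector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dimensional embeddings (hypothetical data).
const docs = [
  { _id: "a", embedding: [1, 0, 0] },
  { _id: "b", embedding: [0.9, 0.1, 0] },
  { _id: "c", embedding: [0, 1, 0] },
];

const top2 = exactNearest(docs, [1, 0, 0], 2);
console.log(top2.map((d) => d._id));
```

Brute-force scoring like this grows linearly with the number of documents, which is the cost an approximate index is designed to avoid.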
About Atlas Vector Search Indexes
You must index the fields that contain vector embeddings of BSON double data type for performing semantic search against the vector data. To perform a vector search against your data, create an Atlas Vector Search vectorSearch type index for your data. In the Atlas Vector Search index definition, you must index the field that contains the vector data as the vector type.
You can also optionally index fields for pre-filtering the data against which you want to run your Atlas Vector Search query. You can pre-filter your data by boolean, numeric, and string values. To pre-filter the data, you must index your boolean, numeric, and string fields as the filter type in the Atlas Vector Search index definition that indexes your vector data field.
To learn more about indexing your fields for Atlas Vector Search, see How to Index Fields for Vector Search.
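As an illustration of the index shape described above, the following sketch builds a vectorSearch index definition with one vector field and optional filter fields. The helper name, the index name vector_index, and the field names plot_embedding, category, and year are hypothetical. The validation only mirrors the limits stated on this page (up to 2048 dimensions and the three similarity functions) and is not an official client-side check.

```javascript
// Similarity functions accepted for a "vector" type field.
const SIMILARITIES = ["euclidean", "cosine", "dotProduct"];
// Dimension limit stated in the overview of this page.
const MAX_DIMENSIONS = 2048;

// Hypothetical helper: returns a plain object in the vectorSearch index format.
function buildVectorIndexDefinition({
  name,
  vectorPath,
  numDimensions,
  similarity,
  filterPaths = [],
}) {
  if (
    !Number.isInteger(numDimensions) ||
    numDimensions < 1 ||
    numDimensions > MAX_DIMENSIONS
  ) {
    throw new RangeError(
      `numDimensions must be an integer from 1 to ${MAX_DIMENSIONS}`
    );
  }
  if (!SIMILARITIES.includes(similarity)) {
    throw new RangeError(
      `similarity must be one of: ${SIMILARITIES.join(", ")}`
    );
  }
  return {
    name,
    type: "vectorSearch",
    fields: [
      // The field that holds the embeddings.
      { type: "vector", path: vectorPath, numDimensions, similarity },
      // Optional boolean, numeric, or string fields used for pre-filtering.
      ...filterPaths.map((path) => ({ type: "filter", path })),
    ],
  };
}

// Example values are assumptions chosen for illustration.
const indexDefinition = buildVectorIndexDefinition({
  name: "vector_index",
  vectorPath: "plot_embedding",
  numDimensions: 1536,
  similarity: "cosine",
  filterPaths: ["category", "year"],
});

console.log(JSON.stringify(indexDefinition, null, 2));
```

The value of numDimensions should match the output size of whichever embedding model produced the stored vectors.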
About Atlas Vector Search Queries
Atlas Vector Search queries take the shape of an aggregation pipeline stage. The Atlas Vector Search $vectorSearch stage must be the first stage in the pipeline. You can run Atlas Vector Search queries only against fields indexed as the vector type in a vectorSearch type index.
You can pre-filter data against which you want to perform semantic search using an MQL match expression that compares an indexed field with boolean, number, or string values. Atlas Vector Search supports some comparison query and aggregation pipeline operators in the filter in your $vectorSearch query.
To learn more about pre-filtering and querying your data using Atlas Vector Search, see Run Vector Search Queries.
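A pre-filtered query of the kind described above might be assembled as in the following sketch. The helper name, the index name, and the field names plot_embedding, category, and year are hypothetical, and the query vector is a placeholder; in practice it would come from the same embedding model as the stored vectors. The stage fields path, queryVector, numCandidates, and limit are not defined on this page, so treat their use here as an assumption to confirm against Run Vector Search Queries.

```javascript
// Hypothetical helper: wraps the given options in a $vectorSearch stage.
function buildVectorSearchStage({
  index,
  path,
  queryVector,
  numCandidates,
  limit,
  filter,
}) {
  // Assumed constraint: the candidate pool cannot be smaller than the
  // number of results requested.
  if (numCandidates < limit) {
    throw new RangeError("numCandidates must be greater than or equal to limit");
  }
  const options = { index, path, queryVector, numCandidates, limit };
  if (filter !== undefined) {
    // MQL match expression on fields indexed as the filter type.
    options.filter = filter;
  }
  return { $vectorSearch: options };
}

// Placeholder vector; its length must equal the index's numDimensions.
const queryVector = new Array(1536).fill(0.01);

const stage = buildVectorSearchStage({
  index: "vector_index",
  path: "plot_embedding",
  queryVector,
  numCandidates: 100,
  limit: 5,
  filter: {
    $and: [{ year: { $gte: 2000 } }, { category: { $eq: "documentary" } }],
  },
});

console.log(Object.keys(stage.$vectorSearch));
```

Only fields indexed as the filter type in the same vectorSearch index can appear in the filter expression.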
Procedure
This section describes how to index vector embeddings in your data on an Atlas cluster and run queries that search vector embeddings.
Define your Atlas Vector Search index.
Index the field that contains vector data in your collection as the vector type in a vectorSearch type index definition. You can optionally also index boolean, numeric, and string fields that you want to pre-filter your data by as the filter type in the same index.
{
  "name": "<index-name>",
  "type": "vectorSearch",
  "fields": [
    {
      "type": "vector",
      "path": "<field-name>",
      "numDimensions": <number-of-dimensions>,
      "similarity": "euclidean | cosine | dotProduct"
    },
    {
      "type": "filter",
      "path": "<field-name>"
    },
    ...
  ]
}
To learn more, see How to Index Fields for Vector Search.
Construct your vector search query.
Use the $vectorSearch pipeline stage in your query.
{ "$vectorSearch": { "index": "<index-name>", ... } }
To learn more about this pipeline stage, see Run Vector Search Queries.
(Optional) Specify whether you want the score for the documents in the results.
Use the $meta expression with the vectorSearchScore value in a $project stage that follows your $vectorSearch stage to retrieve the score for the documents in the results. To learn more, see Score the Documents in the Results.
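A pipeline that returns the score alongside selected fields might look like the following sketch. The index name, the field names plot_embedding and title, and the placeholder query vector are hypothetical, and the $vectorSearch options shown (path, queryVector, numCandidates, limit) should be checked against Run Vector Search Queries.

```javascript
// Placeholder vector; replace with an embedding of the search input.
const queryVector = new Array(1536).fill(0.01);

const pipeline = [
  {
    // $vectorSearch must be the first stage in the pipeline.
    $vectorSearch: {
      index: "vector_index",
      path: "plot_embedding",
      queryVector,
      numCandidates: 100,
      limit: 5,
    },
  },
  {
    // Keep the title and expose the similarity score as "score".
    $project: {
      _id: 0,
      title: 1,
      score: { $meta: "vectorSearchScore" },
    },
  },
];

console.log(pipeline.map((stage) => Object.keys(stage)[0]));
```

The output field name score is arbitrary; the $meta value vectorSearchScore is what selects the score.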
Run your query.
Verify your query syntax and then run it using mongosh or a supported driver.
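As one way to carry out this step with the Node.js driver, the sketch below checks that $vectorSearch is the first stage and then runs the pipeline through the collection's aggregate() method. The helper names are hypothetical, and the connection string, database, and collection names in the usage comment are placeholders. The check covers only the stage-ordering rule stated on this page, not full syntax validation.

```javascript
// Verifies the rule stated above: $vectorSearch must be the first stage.
function assertVectorSearchFirst(pipeline) {
  if (!Array.isArray(pipeline) || pipeline.length === 0) {
    throw new Error("pipeline must be a non-empty array of stages");
  }
  const firstStageName = Object.keys(pipeline[0])[0];
  if (firstStageName !== "$vectorSearch") {
    throw new Error(
      `first stage must be $vectorSearch, found ${firstStageName}`
    );
  }
}

// Runs the pipeline against a driver Collection (or any object exposing
// aggregate(pipeline).toArray()) and resolves to the matching documents.
async function runVectorSearch(collection, pipeline) {
  assertVectorSearchFirst(pipeline);
  return collection.aggregate(pipeline).toArray();
}

// Usage with the Node.js driver (requires the "mongodb" package and an
// Atlas cluster; all angle-bracket values are placeholders):
//
//   const { MongoClient } = require("mongodb");
//   const client = new MongoClient("<connection-string>");
//   const collection = client.db("<database>").collection("<collection>");
//   const results = await runVectorSearch(collection, pipeline);
//   await client.close();
```

In mongosh, the same pipeline array can be passed directly to the aggregate() method of the collection.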