vector field type and vectorSearch operator are available as Preview features. The feature and the corresponding documentation might change at any time during the Preview period. To learn more, see Preview Features.

You can use the vector type to index vector embeddings. The
vector field must contain an array of numbers of the following types:
- BSON int32, int64, or double data types
- BSON double data type
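
As a rough illustration of the constraint above, a client-side check before storing an embedding might look like the following. The helper name `is_valid_vector` and the `expected_dims` argument are hypothetical and not part of MongoDB Search; the sketch only confirms that a value is a flat list of plain numbers, which drivers such as PyMongo typically encode as BSON int32, int64, or double.

```python
from numbers import Real


def is_valid_vector(value, expected_dims):
    """Return True if value is a flat list of expected_dims plain numbers.

    Hypothetical pre-insert check; MongoDB Search performs its own validation.
    """
    if not isinstance(value, list) or len(value) != expected_dims:
        return False
    # bool is a subclass of int in Python, so reject it explicitly.
    return all(isinstance(x, Real) and not isinstance(x, bool) for x in value)
```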
You can use the vectorSearch operator,
similar to the $vectorSearch stage, in your
$search aggregation pipeline to query fields indexed as
the vector type.
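
A minimal sketch of such a query, built as a Python pipeline, is shown here. The option names inside `vectorSearch` (`path`, `queryVector`, `numCandidates`, `limit`) are an assumption that the operator mirrors the `$vectorSearch` stage; the helper name `build_vector_query` and the projected `title` field are also illustrative. Check the operator reference before relying on any of them.

```python
def build_vector_query(index_name, path, query_vector, num_candidates, limit):
    """Build a $search pipeline that uses the vectorSearch operator.

    Assumption: the operator accepts the same option names as the
    $vectorSearch stage (path, queryVector, numCandidates, limit).
    """
    return [
        {
            "$search": {
                "index": index_name,
                "vectorSearch": {
                    "path": path,
                    "queryVector": list(query_vector),
                    "numCandidates": num_candidates,
                    "limit": limit,
                },
            }
        },
        # Return the title and the relevance score computed by $search.
        {"$project": {"_id": 0, "title": 1, "score": {"$meta": "searchScore"}}},
    ]
```

With PyMongo, the resulting list could be passed to `collection.aggregate(...)`.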
vector Type Limitations
The following limitations apply:
- You can't index fields with arrays of objects (MongoDB Search embeddedDocuments type) as vector type.
- You can't set storedSource to true in index definitions that contain vector type. Instead, use include to specify the fields to store on mongot or use exclude to exclude the vector type field from storage.
- You can't use the $vectorSearch stage to query fields indexed as the vector type.
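
The storedSource limitation can be sketched as follows. The helper `with_stored_source` is hypothetical; it assumes the usual MongoDB Search shape in which `storedSource` takes an object with either an `include` or an `exclude` list of field names, and it never sets `storedSource` to `true`.

```python
def with_stored_source(definition, vector_field, fields_to_store=None):
    """Return a copy of an index definition with storedSource configured.

    If fields_to_store is given, only those fields are stored (include).
    Otherwise every field except the vector field is stored (exclude).
    """
    updated = dict(definition)  # shallow copy; the input is left untouched
    if fields_to_store:
        updated["storedSource"] = {"include": list(fields_to_store)}
    else:
        updated["storedSource"] = {"exclude": [vector_field]}
    return updated
```

For example, `with_stored_source(definition, "plot_embedding_voyage_3_large")` stores everything except the embedding field.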
Define the Index for the vector Type
Configure vector Field Properties
The MongoDB Search vector type takes the following parameters:
| Option | Type | Necessity | Description |
|---|---|---|---|
| | | Required | Human-readable label that identifies this field type. Value must be |
| | Int | Required | Number of vector dimensions that MongoDB Search enforces at index-time and query-time. You can set this field only for For indexing quantized vectors or BinData, you can specify one of the following values: The embedding model you choose determines the number of dimensions in your vector embeddings, with some models having multiple options for how many dimensions are output. To learn more, see Choosing a Method to Create Embeddings. |
| | String | Required | Vector similarity function to use to search for top K-nearest neighbors. You can set this field only for You can specify one of the following values: To learn more, see About the Similarity Functions. |
| | String | Optional | Type of automatic vector quantization for your vectors. Use this setting only if your embeddings are You can specify one of the following values: To learn more, see Vector Quantization. |
| hnswOptions | Object | Optional | Parameters to use for Hierarchical Navigable Small Worlds graph construction. If omitted, uses the default values for the IMPORTANT: This is available as a Preview feature. Modifying the default values might negatively impact your MongoDB Search index and queries. |
| hnswOptions.maxEdges | Int | Optional | Maximum number of edges (or connections) that a node can have in the Hierarchical Navigable Small Worlds graph. Value can be between A higher number improves recall (accuracy of search results) because the graph is better connected. However, this also increases query and indexing time by increasing the number of neighbors to evaluate per graph node, and requires more memory to store the additional nodes for each connection in the Hierarchical Navigable Small Worlds graph. |
| hnswOptions.numEdgeCandidates | Int | Optional | Analogous to A higher number provides a graph with high-quality connections, which can improve search quality (recall), but it can also increase query latency. |
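
To show where the two hnswOptions settings sit within a field definition, here is a small sketch. The helper `vector_field_with_hnsw` is hypothetical, and the sibling keys `type`, `numDimensions`, and `similarity` are assumed names for the options described in the table. Any values you pass for `maxEdges` and `numEdgeCandidates` are your own choice; nothing here implies a recommended setting or valid range.

```python
def vector_field_with_hnsw(num_dimensions, similarity, max_edges, num_edge_candidates):
    """Build a vector field definition with explicit HNSW graph settings.

    Key names other than hnswOptions.maxEdges and
    hnswOptions.numEdgeCandidates are assumptions.
    """
    return {
        "type": "vector",
        "numDimensions": num_dimensions,
        "similarity": similarity,
        "hnswOptions": {
            # More edges: better-connected graph, more memory and indexing time.
            "maxEdges": max_edges,
            # More candidates: higher-quality connections, possibly higher latency.
            "numEdgeCandidates": num_edge_candidates,
        },
    }
```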
Try an Example for the vector Type
The following index definition example uses the
sample_mflix.embedded_movies collection in the sample data. After you load the collection, you can use the
following example to index the plot_embedding_voyage_3_large
field as the vector type for running queries using the
vectorSearch operator. For a sample query to run against this
index, see Examples.
This index definition automatically indexes all the dynamically indexable fields using the default typeSet and also indexes the
plot_embedding_voyage_3_large field as vector type with the
following settings:
- 2048 number of dimensions
- dotProduct similarity function
- scalar quantization
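
An index definition matching this description might look like the following sketch, written as a Python dictionary and printed as JSON. The key names `mappings`, `dynamic`, `fields`, `numDimensions`, `similarity`, and `quantization` are assumptions based on the usual MongoDB Search index layout; only the field name and the three setting values come from the description above.

```python
import json

# Sketch of the index definition; key names are assumed, values are from the text.
index_definition = {
    "mappings": {
        # Dynamically index all indexable fields using the default typeSet.
        "dynamic": True,
        "fields": {
            "plot_embedding_voyage_3_large": {
                "type": "vector",
                "numDimensions": 2048,
                "similarity": "dotProduct",
                "quantization": "scalar",
            }
        },
    }
}

print(json.dumps(index_definition, indent=2))
```

One way to apply a definition like this from Python is PyMongo's `Collection.create_search_index`, run against the sample_mflix.embedded_movies collection.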