knnBeta


The knnBeta operator uses the Hierarchical Navigable Small Worlds (HNSW) algorithm to perform semantic search. You can use Atlas Search support for kNN queries to, for example, find items similar to a selected product or search for images.

knnBeta has the following syntax:

{
  $search: {
    "index": "<index name>", // optional, defaults to "default"
    "knnBeta": {
      "vector": [<array-of-numbers>],
      "path": "<field-to-search>",
      "filter": {<filter-specification>},
      "k": <number>,
      "score": {<options>}
    }
  }
}

| Field | Type | Description | Necessity |
| --- | --- | --- | --- |
| filter | document | Any Atlas Search operator to filter the documents based on metadata or certain search criteria, which can help narrow down the scope of vector search. | Optional |
| k | number | Number of nearest neighbors to return. You can specify a number higher than the number of documents to return ($limit) to increase accuracy. | Required |
| path | string | Indexed knnVector type field to search. See Path Construction for more information. | Required |
| score | document | Score assigned to matching documents in the results. To learn more, see scoring behavior. | Optional |
| vector | array of numbers | Array of numbers of BSON types int or double that represent the query vector. The array size must match the number of vector dimensions specified in the index for the field. | Required |
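
As a rough illustration of how the options in the table fit together, the following sketch assembles a $search stage and checks the required fields before the query is sent. The helper name buildKnnBetaStage and its dimensions parameter are hypothetical and not part of Atlas Search or any MongoDB driver; the caller is assumed to supply the dimension count from the index definition, and treating k as a positive integer is also an assumption.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $search stage that uses
// knnBeta and validates the required options listed in the table.
function buildKnnBetaStage({ vector, path, k, filter, score, index, dimensions }) {
  const isNumberArray =
    Array.isArray(vector) &&
    vector.every((x) => typeof x === "number" && Number.isFinite(x));
  if (!isNumberArray) {
    throw new TypeError("vector must be an array of numbers");
  }
  if (typeof path !== "string" || path.length === 0) {
    throw new TypeError("path must be a non-empty string");
  }
  // Assumption: k is treated here as a positive integer.
  if (!Number.isInteger(k) || k < 1) {
    throw new RangeError("k must be a positive integer");
  }
  // Assumption: `dimensions` is the value the caller read from the index definition.
  if (dimensions !== undefined && vector.length !== dimensions) {
    throw new RangeError(
      `vector has ${vector.length} elements, but the index expects ${dimensions}`
    );
  }

  const knnBeta = { vector, path, k };
  if (filter !== undefined) knnBeta.filter = filter; // optional
  if (score !== undefined) knnBeta.score = score; // optional

  const search = { knnBeta };
  if (index !== undefined) search.index = index; // omitted means "default"
  return { $search: search };
}
```

Checking the vector length on the client is only a convenience here: it surfaces a dimension mismatch before the query reaches the server.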

You can run kNN queries only against fields that were indexed as the Atlas Search knnVector type.

You can use $limit after the $search stage to limit the number of documents in the knnBeta query results. We recommend setting the value for k higher than the value for $limit. This overrequest pattern is the main way to trade off latency and recall in your approximate nearest neighbor searches. Empirically, we have seen a multiplier of 5-10 work well for many use cases, but we recommend tuning this on your specific dataset.

Example

The following query finds the 150 nearest neighbors of the query vector and limits the results to 50.

db.<collection>.aggregate([
  {
    "$search": {
      "knnBeta": {
        "vector": <array-of-numbers-to-search>,
        "path": "<indexed-field-to-search>",
        "k": 150
      }
    }
  },
  {
    "$limit": 50
  }
])
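
The overrequest pattern can be sketched as a small helper that derives k from the number of results you want. The function name overrequestPipeline and the default multiplier of 5 are illustrative assumptions, not part of Atlas Search; tune the multiplier on your own dataset.

```javascript
// Hypothetical helper: derives k from the desired result count and an
// overrequest multiplier, then pairs the $search stage with $limit.
function overrequestPipeline(vector, path, limit, multiplier = 5) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  if (!(multiplier >= 1)) {
    throw new RangeError("multiplier must be at least 1");
  }
  // Request more neighbors than are returned, so $limit trims the extras.
  const k = Math.ceil(limit * multiplier);
  return [
    { $search: { knnBeta: { vector, path, k } } },
    { $limit: limit },
  ];
}
```

With a limit of 50 and a multiplier of 3, this produces the same k of 150 as the query above.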

Unless you need every field in the results, use the $project stage to return only the fields you need; this improves query performance. We recommend excluding the vector field in the $project stage.

You can use the score field with the $meta expression searchScore in the $project stage to return the score for the documents in the results.
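
A minimal sketch of such a $project stage follows. The helper name projectWithScore is hypothetical, and the field names in the usage note are assumptions about your documents. Because the stage lists only the fields to include, the vector field is left out of the results without being named.

```javascript
// Hypothetical helper: builds an inclusion-style $project stage that returns
// the named fields plus the search score.
function projectWithScore(fields) {
  const spec = { _id: 0 }; // suppress _id; every other field must be listed to appear
  for (const field of fields) {
    spec[field] = 1;
  }
  spec.score = { $meta: "searchScore" }; // score assigned by the $search stage
  return { $project: spec };
}
```

For example, projectWithScore(["title", "plot"]) returns a stage that keeps title, plot, and score; append it to the pipeline after $search and $limit.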

Atlas Search scores the results of kNN queries in a fixed range from 0 to 1. For cosine and dotProduct similarities, Atlas Search normalizes the score using the following formula, where similarity(v1, v2) is the cosine similarity or the dot product of the two vectors:

score = (1 + similarity(v1, v2)) / 2
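
To make the normalization concrete, the following sketch computes it on the client for small vectors. The function names are illustrative; Atlas Search performs this calculation itself, so the code is only a way to check the arithmetic. Note that the result stays between 0 and 1 only when the similarity lies between -1 and 1, which always holds for cosine similarity and holds for the dot product when both vectors have unit length.

```javascript
// Dot product of two equal-length numeric arrays.
function dotProduct(a, b) {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Cosine similarity: the dot product divided by the product of the magnitudes.
function cosineSimilarity(a, b) {
  return dotProduct(a, b) / (Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b)));
}

// Maps a similarity in [-1, 1] to a score in [0, 1].
function normalizedScore(similarity) {
  return (1 + similarity) / 2;
}
```

Identical directions give a score of 1, orthogonal vectors give 0.5, and opposite directions give 0.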

The knnBeta operator must be the top-level operator in your queries; you can't nest it inside another operator.

You can't use the knnBeta operator to query fields indexed using a vectorSearch type index. You can't use the $search sort option with the knnBeta operator.

We don't recommend paginating your Atlas Search results using $skip and $limit after the $search stage.

The following queries search the sample sample_mflix.embedded_movies collection using the knnBeta operator. The queries search the plot_embedding field, which contains embeddings created using OpenAI's text-embedding-ada-002 embeddings model. If you added the sample collection to your Atlas cluster and created the sample index definition for the collection, you can switch to the sample_mflix database and run the following queries against the collection.
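
As a hedged sketch of what a query against this collection might look like, the helper below builds a pipeline that searches plot_embedding and narrows the candidates with a range filter. Several details are assumptions rather than facts from this page: the index name "default", the title, plot, and year fields, the chosen year range, that year is indexed so it can be used in the filter, and that the index was defined with 1536 dimensions to match the output size of text-embedding-ada-002. The function name moviesKnnPipeline is hypothetical.

```javascript
// text-embedding-ada-002 produces 1536-dimensional embeddings.
const ADA_002_DIMENSIONS = 1536;

// Hypothetical helper: builds a knnBeta pipeline for embedded_movies.
function moviesKnnPipeline(queryVector, limit = 5) {
  if (!Array.isArray(queryVector) || queryVector.length !== ADA_002_DIMENSIONS) {
    throw new RangeError(`query vector must have ${ADA_002_DIMENSIONS} elements`);
  }
  return [
    {
      $search: {
        index: "default", // assumed index name
        knnBeta: {
          vector: queryVector,
          path: "plot_embedding",
          k: limit * 5, // overrequest relative to $limit
          // Assumed field and illustrative range; any Atlas Search operator can go here.
          filter: { range: { path: "year", gte: 1955, lte: 1960 } },
        },
      },
    },
    { $limit: limit },
    // Inclusion projection: the vector field is not returned.
    { $project: { _id: 0, title: 1, plot: 1, year: 1, score: { $meta: "searchScore" } } },
  ];
}
```

In mongosh, after switching to the sample_mflix database, you could pass the result to db.embedded_movies.aggregate() with a query vector generated by the same embedding model.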
