Overview
In this guide, you can learn how to use Django MongoDB Backend to perform Atlas Vector Search queries. This feature allows you to perform a semantic search on your documents. A semantic search is a type of search that locates information that is similar in meaning, but not necessarily identical, to your provided search term or phrase.
Sample Data
The examples in this guide use the MovieWithEmbeddings model, which represents the sample_mflix.embedded_movies collection from the Atlas sample datasets. The MovieWithEmbeddings model class has the following definition:
from django.db import models
from django_mongodb_backend.fields import ArrayField

class MovieWithEmbeddings(models.Model):
    title = models.CharField(max_length=200)
    runtime = models.IntegerField(default=0)
    plot_embedding = ArrayField(models.FloatField(), size=1536, null=True, blank=True)

    class Meta:
        db_table = "embedded_movies"
        managed = False

    def __str__(self):
        return self.title
The MovieWithEmbeddings model includes an inner Meta class, which specifies model metadata, and a __str__() method, which defines the model's string representation. To learn about these model features, see Define a Model in the Create Models guide.
Run Code Examples
You can use the Python interactive shell to run the code examples. To enter the shell, run the following command from your project's root directory:
python manage.py shell
After entering the Python shell, ensure that you import the following models and modules:
from <your application name>.models import MovieWithEmbeddings
To learn how to create a Django application that uses the Movie model and the Python interactive shell to interact with MongoDB documents, visit the Get Started tutorial.
Perform a Vector Search
Important
Query Requirements
Before you can perform Atlas Vector Search queries, you must create an Atlas Vector Search index on your collection. To learn how to use Django MongoDB Backend to create an Atlas Vector Search index, see Atlas Vector Search Indexes in the Create Indexes guide.
You can use Django MongoDB Backend to query your data based on its semantic meaning. An Atlas Vector Search query returns results based on a query vector, which is an array of numbers that represents the meaning of your search term or phrase. MongoDB compares this query vector to the vectors stored in your documents' vector fields.
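To build intuition for how a query vector is compared with stored vectors, the following minimal sketch computes cosine similarity in plain Python. It is only an illustration: the similarity function that Atlas actually applies depends on how the vector search index is configured, and the cosine_similarity helper and the three-dimensional toy vectors are hypothetical names and data, not part of Django MongoDB Backend.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy three-dimensional vectors (hypothetical data); the embeddings in this
# guide have 1536 dimensions.
query_vector = [1.0, 0.0, 1.0]
stored_vectors = {
    "doc_a": [1.0, 0.0, 1.0],  # same direction as the query
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
    "doc_c": [1.0, 1.0, 1.0],  # partially aligned with the query
}

# Rank the toy documents from most to least similar to the query.
ranked = sorted(
    stored_vectors,
    key=lambda name: cosine_similarity(query_vector, stored_vectors[name]),
    reverse=True,
)
print(ranked)
```

Documents whose vectors point in a direction closer to the query vector rank higher, which is the general idea behind returning semantically similar results.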
To specify your Atlas Vector Search criteria, create an instance of the SearchVector expression class provided by the django_mongodb_backend.expressions module. This expression corresponds to the $vectorSearch MongoDB pipeline stage. Pass the following arguments to the SearchVector() constructor:
path: The field to query.
query_vector: An array of numbers that represents your search criteria. To learn more about query vectors, see vectors in the MongoDB Atlas documentation.
limit: The maximum number of results to return.
num_candidates: (Optional) The number of documents to consider for the query.
exact: (Optional) A boolean value that indicates whether to perform an Exact Nearest Neighbor (ENN) search. The default value is False. To learn more about ENN searches, see ENN (Exact Nearest Neighbor) Search in the MongoDB Atlas documentation.
filter: (Optional) A filter to apply to the query results.
Then, run your Atlas Vector Search query by passing your SearchVector instance to the annotate method from Django's QuerySet API. The following code shows the syntax for performing an Atlas Vector Search query:
from django_mongodb_backend.expressions import SearchVector

Model.objects.annotate(
    score=SearchVector(
        path="<field name>",
        query_vector=[<vector values>],
        limit=<number>,
        num_candidates=<number>,
        exact=<boolean>,
        filter=<filter expression>
    )
)
Basic Vector Search Example
This example runs an Atlas Vector Search query on the sample_mflix.embedded_movies collection. The query performs the following actions:
Queries the plot_embedding vector field.
Limits the results to 5 documents.
Specifies an Approximate Nearest Neighbor (ANN) vector search that considers 150 candidates. To learn more about ANN searches, see ANN (Approximate Nearest Neighbor) Search in the MongoDB Atlas documentation.
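from django_mongodb_backend.expressions import SearchVector

# Arbitrary 1536-dimensional query vector, used for demonstration only
vector_values = [float(i % 10) * 0.1 for i in range(1536)]

MovieWithEmbeddings.objects.annotate(
    score=SearchVector(
        path="plot_embedding",
        query_vector=vector_values,
        limit=5,
        num_candidates=150,
        exact=False,
    )
)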
<QuerySet [<MovieWithEmbeddings: Berserk: The Golden Age Arc I - The Egg of the King>, <MovieWithEmbeddings: Rollerball>, <MovieWithEmbeddings: After Life>, <MovieWithEmbeddings: What Women Want>, <MovieWithEmbeddings: Truth About Demons>]>
Tip
The preceding code example passes an arbitrary vector to the query_vector argument. To learn how to generate a vector that represents the meaning of a search term or phrase, see How to Create Vector Embeddings in the MongoDB Atlas documentation.
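Because the plot_embedding field in the MovieWithEmbeddings model is declared with size=1536, a query vector generally needs the same number of dimensions as the indexed field. The following sketch shows one way to check a vector before passing it to the query_vector argument. The validate_query_vector helper and the EXPECTED_DIMENSIONS constant are hypothetical names introduced for illustration and aren't part of Django MongoDB Backend.

```python
import math

# Assumption: matches size=1536 on the plot_embedding field in this guide.
EXPECTED_DIMENSIONS = 1536


def validate_query_vector(vector, expected_dimensions=EXPECTED_DIMENSIONS):
    """Return the vector as a list of floats, or raise ValueError if unusable."""
    values = [float(v) for v in vector]
    if len(values) != expected_dimensions:
        raise ValueError(
            f"expected {expected_dimensions} dimensions, got {len(values)}"
        )
    if not all(math.isfinite(v) for v in values):
        raise ValueError("query vector contains NaN or infinite values")
    return values


# The same arbitrary vector used in the preceding example.
vector_values = validate_query_vector(
    [float(i % 10) * 0.1 for i in range(1536)]
)
print(len(vector_values))
```

Running a check like this in application code can surface a mismatched embedding model early, before the query reaches the database.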
Vector Search Score Example
MongoDB assigns a relevance score to every document returned in an Atlas Vector Search query. The documents included in a result set are ordered from highest to lowest relevance score.
To include this score in your query results, you can use the values() method from Django's QuerySet API. Pass the score field as an argument to the values() method.
The following example shows how to run the same vector search query as the preceding example and print the documents' vector search relevance scores:
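from django_mongodb_backend.expressions import SearchVector

# Arbitrary 1536-dimensional query vector, used for demonstration only
vector_values = [float(i % 10) * 0.1 for i in range(1536)]

# values() returns dictionaries that include the annotated score
MovieWithEmbeddings.objects.annotate(
    score=SearchVector(
        path="plot_embedding",
        query_vector=vector_values,
        limit=5,
        num_candidates=150,
        exact=False,
    )
).values("title", "score")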
<QuerySet [{'title': 'Berserk: The Golden Age Arc I - The Egg of the King', 'score': 0.47894009947776794}, {'title': 'Rollerball', 'score': 0.45006513595581055}, {'title': 'After Life', 'score': 0.42825883626937866}, {'title': 'What Women Want', 'score': 0.4211753308773041}, {'title': 'Truth About Demons', 'score': 0.4194544851779938}]>
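Because values() returns dictionaries, you can process the scores with ordinary Python. The following sketch formats rows shaped like the preceding output. The format_scores helper is a hypothetical name, and the rows list copies two entries from that output in place of a live QuerySet so that the snippet runs without a database connection.

```python
def format_scores(rows, digits=3):
    """Return one 'title: score' string per row, rounding each score."""
    return [f"{row['title']}: {row['score']:.{digits}f}" for row in rows]


# Stand-in for the QuerySet returned by .values("title", "score").
rows = [
    {"title": "Rollerball", "score": 0.45006513595581055},
    {"title": "After Life", "score": 0.42825883626937866},
]

for line in format_scores(rows):
    print(line)
```

In a project with a configured database, you could likely pass the QuerySet itself to a helper like this, since iterating over it yields the same kind of dictionaries.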
Additional Information
To learn more about Atlas Vector Search, see the following resources from the MongoDB Atlas documentation: