Overview
In this guide, you can learn how to model relationships between collections in Django MongoDB Backend. Because MongoDB is a document database, Django MongoDB Backend provides embedded model fields that store related data in a single document instead of across multiple collections. Django MongoDB Backend also provides an array field, which allows you to store a list of related scalar values in the parent document.
This guide explains when to use each approach and provides examples to demonstrate the strategies.
Sample Data
The examples in this guide define models that represent the following collections
in the sample_mflix database:
movies: Stores information about movies
embedded_movies: Extends movies with vector plot embeddings
users: Stores information about movie viewers
comments: Stores information about movie comments
To learn more about the sample_mflix database, see Sample Mflix Dataset in the MongoDB Atlas documentation.
Denormalization Strategies
Relational databases normalize data into separate tables and use joins to combine related data at query time. MongoDB's document model allows you to embed related data directly inside a parent document instead, which is called denormalization. Django MongoDB Backend supports both approaches, but we recommend that you denormalize your data for better performance.
To denormalize, choose one of the following strategies:
Embed Related Models: Store related data inside the parent document by using an EmbeddedModelField or EmbeddedModelArrayField. Embedded data is stored in the same MongoDB document as the parent and retrieved in a single read operation.
Store Array Data: Store a list of related scalar values directly in the parent document by using an ArrayField. Array data is retrieved in a single read operation without $lookup operations.
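The difference between the two layouts can be sketched with plain Python dictionaries standing in for MongoDB documents. This is only an illustration: the sample values, the director_id key, and the helper names read_normalized and read_denormalized are assumptions made for the sketch, not part of Django MongoDB Backend.

```python
# Normalized layout: the director lives in a separate collection and must be
# fetched with a second lookup (a join in SQL, $lookup in MongoDB).
directors = {1: {"first_name": "Jane", "last_name": "Doe"}}
movies_normalized = [{"_id": 10, "title": "A Sample Movie", "director_id": 1}]


def read_normalized(movie_id, movies, directors):
    """Two steps: find the movie, then resolve its director reference."""
    movie = next(m for m in movies if m["_id"] == movie_id)
    return {**movie, "director": directors[movie["director_id"]]}


# Denormalized layout: the related data is stored inside the parent document.
movies_denormalized = [
    {
        "_id": 10,
        "title": "A Sample Movie",
        "director": {"first_name": "Jane", "last_name": "Doe"},  # embedded model
        "cast": ["Actor One", "Actor Two"],  # array of scalar values
    }
]


def read_denormalized(movie_id, movies):
    """One step: everything needed is already in the document."""
    return next(m for m in movies if m["_id"] == movie_id)


joined = read_normalized(10, movies_normalized, directors)
embedded = read_denormalized(10, movies_denormalized)
```

Both reads yield the same director data, but the denormalized read touches one document only.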
Embed Related Models
Use an embedded model when all of the following conditions apply:
The related data is always read together with the parent document.
The related data belongs to a single parent and is not shared across multiple documents.
The number of related items is bounded and predictable.
To embed related data, define an embedded model class as a subclass of
EmbeddedModel, then use an EmbeddedModelField
to store it in the parent model.
The following example defines a Director embedded model, and then
defines a Movie model that stores an instance of the Director
model:
    from django.db import models
    from django_mongodb_backend.models import EmbeddedModel
    from django_mongodb_backend.fields import EmbeddedModelField

    class Director(EmbeddedModel):
        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

    class Movie(models.Model):
        title = models.CharField(max_length=200)
        director = EmbeddedModelField(Director, null=True, blank=True)
To learn more about embedded models, see Store Embedded Model Data in the Create Models guide.
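To picture the resulting document, the following sketch uses standard-library dataclasses as stand-ins for the Director and Movie models. It does not use Django MongoDB Backend, and it assumes that model field names map directly to document keys. A real stored document also carries an _id field, which is omitted here.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Director:
    """Stand-in for the Director embedded model."""
    first_name: str
    last_name: str


@dataclass
class Movie:
    """Stand-in for the Movie model; director mirrors null=True."""
    title: str
    director: Optional[Director] = None


# asdict() converts nested dataclasses recursively, giving a nested dict
# that resembles a parent document with an embedded subdocument.
movie = Movie(title="A Sample Movie", director=Director("Jane", "Doe"))
doc = asdict(movie)

# A movie without a director keeps the key and stores a null value.
no_director = asdict(Movie(title="Untitled"))
```

In this sketch, doc nests the director under a single director key, so reading the movie also returns the director without a second query.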
Store Array Data
Use an ArrayField
to store a list of related scalar values directly on the parent document.
Array data is stored in the same MongoDB document and retrieved in a
single read operation without performing $lookup operations. Use
this strategy when the following conditions apply:
The related values are simple scalars, such as strings or integers.
The array size is bounded and predictable.
The following example defines a Movie model that stores a list of
strings in the cast field:
    from django.db import models
    from django_mongodb_backend.fields import ArrayField

    class Movie(models.Model):
        title = models.CharField(max_length=200)
        cast = ArrayField(models.CharField(max_length=100), blank=True)
To store arrays of structured documents instead of scalar values, use an EmbeddedModelArrayField instead. To learn more, see Store Embedded Model Array Data in the Create Models guide.
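As a rough illustration of why scalar arrays suit this strategy, the sketch below models movie documents as plain dictionaries and checks array membership in application code. The helper names and the MAX_CAST limit are hypothetical. The limit shows one way an application might keep an array bounded and is not a feature of ArrayField.

```python
MAX_CAST = 50  # hypothetical application-level bound, not an ArrayField option

movies = [
    {"title": "First Sample", "cast": ["Actor One", "Actor Two"]},
    {"title": "Second Sample", "cast": ["Actor Three"]},
    {"title": "Third Sample", "cast": []},
]


def movies_featuring(actor, movies):
    """Return titles whose cast array contains the given name."""
    return [m["title"] for m in movies if actor in m["cast"]]


def add_cast_member(movie, actor, limit=MAX_CAST):
    """Append to the array only while it stays within the chosen bound."""
    if len(movie["cast"]) >= limit:
        raise ValueError(f"cast already holds {limit} entries")
    movie["cast"].append(actor)
```

For example, movies_featuring("Actor One", movies) inspects each document's own cast list and never consults a second collection.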
Relational Fields
Important
Queries across relational fields use MongoDB's $lookup
operator, which can be slow for large collections. When possible,
use embedded models instead. To learn more about performance
considerations, see Performance Limitations.
If your data is structured similarly to a relational database and you need to model large hierarchical datasets, you can use the following relational Django fields to link documents across separate collections: ForeignKey, OneToOneField, and ManyToManyField.
Important
You cannot use relational fields inside embedded model classes or as
the base_field of an ArrayField.
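To make the cost of $lookup concrete, the following sketch writes a $lookup stage as a Python dictionary and simulates its effect on in-memory lists. The stage syntax is standard MongoDB aggregation syntax, but the movie_id key is an assumption about how a reference might be stored, and simulate_lookup is a hypothetical helper. A real server can use an index on the foreign field, which this naive loop does not model.

```python
# A $lookup stage that attaches matching movies documents to each comment.
lookup_stage = {
    "$lookup": {
        "from": "movies",           # collection to join against
        "localField": "movie_id",   # assumed reference key on comments
        "foreignField": "_id",      # key on movies
        "as": "movie",              # output array field
    }
}


def simulate_lookup(local_docs, foreign_docs, stage):
    """Naive nested-loop version of $lookup: every local document scans
    every foreign document, and matches are stored as an array."""
    spec = stage["$lookup"]
    joined = []
    for doc in local_docs:
        matches = [
            foreign
            for foreign in foreign_docs
            if foreign.get(spec["foreignField"]) == doc.get(spec["localField"])
        ]
        joined.append({**doc, spec["as"]: matches})
    return joined


movies = [{"_id": 1, "title": "A Sample Movie"}]
comments = [
    {"_id": 100, "movie_id": 1, "text": "First comment"},
    {"_id": 101, "movie_id": 99, "text": "Comment on a missing movie"},
]
result = simulate_lookup(comments, movies, lookup_stage)
```

Each comment gains a movie array, which is empty when no referenced document exists. The work grows with the size of both collections, whereas embedded data requires no join at all.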
ForeignKey
Use a ForeignKey field to create a
many-to-one relationship between two models. Each document in the referencing model links to
one document in the referenced model. Pass the following arguments
to the ForeignKey() constructor:
to: The model class to link to
on_delete: The deletion behavior when you delete the referenced document
The following example links a Comment model to a Movie model
by using a ForeignKey field. When you delete a
sample_mflix.movies document, all related
sample_mflix.comments documents are also deleted:
    from django.db import models

    class Movie(models.Model):
        title = models.CharField(max_length=200)

    class Comment(models.Model):
        movie = models.ForeignKey(
            Movie,
            on_delete=models.CASCADE,
        )
        text = models.TextField()
Tip
To learn more about on_delete options, see
ForeignKey.on_delete
in the Django documentation.
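The cascading behavior described above can be sketched with in-memory collections. This is a simplified simulation of what on_delete=models.CASCADE means, not how Django or Django MongoDB Backend implements it. The movie_id key and the delete_movie_cascade helper are assumptions made for the illustration.

```python
movies = {
    1: {"title": "A Sample Movie"},
    2: {"title": "Another Sample Movie"},
}
comments = [
    {"_id": 100, "movie_id": 1, "text": "First comment"},
    {"_id": 101, "movie_id": 1, "text": "Second comment"},
    {"_id": 102, "movie_id": 2, "text": "Comment on the other movie"},
]


def delete_movie_cascade(movie_id, movies, comments):
    """Delete a movie and every comment that references it.
    Returns the number of comments removed."""
    del movies[movie_id]
    remaining = [c for c in comments if c["movie_id"] != movie_id]
    removed = len(comments) - len(remaining)
    comments[:] = remaining  # update the list in place
    return removed


removed_count = delete_movie_cascade(1, movies, comments)
```

After the call, only the comment that references movie 2 remains, which mirrors the relationship between sample_mflix.movies and sample_mflix.comments in the example.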
OneToOneField
Use a OneToOneField field to create a
one-to-one relationship between two models. Each document in the referencing model links to
exactly one document in the referenced model. Pass the following
arguments to the OneToOneField() constructor:
to: The model class to link to
on_delete: The deletion behavior when the referenced document is deleted
The following example defines an EmbeddedMovie model that links to
a Movie model by using a OneToOneField field. When you delete
a sample_mflix.movies document, its linked
sample_mflix.embedded_movies document is also deleted:
    from django.db import models
    from django_mongodb_backend.fields import ArrayField

    class Movie(models.Model):
        title = models.CharField(max_length=200)

    class EmbeddedMovie(models.Model):
        movie = models.OneToOneField(
            Movie,
            on_delete=models.CASCADE,
        )
        plot_embedding = ArrayField(models.FloatField(), blank=True)
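Conceptually, a one-to-one link behaves like a many-to-one link with a uniqueness rule on the reference. The sketch below simulates that rule with plain Python lists. The movie_id key, the OneToOneViolation exception, and the add_embedded_movie helper are hypothetical names used only for this illustration.

```python
class OneToOneViolation(Exception):
    """Raised when a second document tries to link to the same movie."""


def add_embedded_movie(embedded_movies, movie_id, plot_embedding):
    """Insert a linked document only if no other document already
    references the same movie."""
    if any(doc["movie_id"] == movie_id for doc in embedded_movies):
        raise OneToOneViolation(f"movie {movie_id} is already linked")
    embedded_movies.append(
        {"movie_id": movie_id, "plot_embedding": list(plot_embedding)}
    )


embedded_movies = []
add_embedded_movie(embedded_movies, 1, [0.12, -0.04, 0.33])

try:
    add_embedded_movie(embedded_movies, 1, [0.5, 0.5, 0.5])
    duplicate_rejected = False
except OneToOneViolation:
    duplicate_rejected = True
```

The second insert is rejected, so each movie has at most one linked embedded_movies document.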
ManyToManyField
Use a ManyToManyField field to create
a many-to-many relationship between two models. Each document in either model can link to multiple
documents in the other model. Pass the model class to link to as the
first argument to the ManyToManyField() constructor.
The following example links a Viewer model to a Movie model.
Each viewer can watch multiple movies, and each movie can be
watched by multiple viewers:
    from django.db import models

    class Movie(models.Model):
        title = models.CharField(max_length=200)

    class Viewer(models.Model):
        name = models.CharField(max_length=100)
        email = models.CharField(max_length=200)
        watched = models.ManyToManyField(Movie, blank=True)
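Django represents a many-to-many relationship through an intermediary structure that pairs the two sides. The following sketch imitates that with a list of link records. The viewer_id and movie_id keys, the sample data, and the helper names are assumptions for illustration and may not match what Django MongoDB Backend actually stores.

```python
movies = {1: {"title": "First Sample"}, 2: {"title": "Second Sample"}}
viewers = {10: {"name": "Viewer A"}, 11: {"name": "Viewer B"}}

# Each record pairs one viewer with one movie.
watched_links = [
    {"viewer_id": 10, "movie_id": 1},
    {"viewer_id": 10, "movie_id": 2},
    {"viewer_id": 11, "movie_id": 1},
]


def movies_watched_by(viewer_id, links, movies):
    """Resolve viewer -> links -> movies (two hops)."""
    return [
        movies[link["movie_id"]]["title"]
        for link in links
        if link["viewer_id"] == viewer_id
    ]


def viewers_of(movie_id, links, viewers):
    """Resolve movie -> links -> viewers (the reverse direction)."""
    return [
        viewers[link["viewer_id"]]["name"]
        for link in links
        if link["movie_id"] == movie_id
    ]
```

Either direction needs two hops through the link records, which is why many-to-many queries tend to be the most join-heavy of the three relational fields.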
Convert Relational to Embedded Data
If you have existing models that use relational fields and want to
improve read performance, you can convert them to use embedded models
instead. This eliminates $lookup operations and stores all related
data in a single MongoDB document.
This example defines a Movie model that has
the following relational fields:
ForeignKey, which links to Director
OneToOneField, which links to Award
ManyToManyField, which links to Writer
    from django.db import models

    class Director(models.Model):
        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

    class Award(models.Model):
        wins = models.IntegerField(default=0)
        nominations = models.IntegerField(default=0)
        text = models.CharField(max_length=100)

    class Writer(models.Model):
        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

    class Movie(models.Model):
        title = models.CharField(max_length=200)
        director = models.ForeignKey(
            Director,
            null=True,
            on_delete=models.SET_NULL,
        )
        awards = models.OneToOneField(
            Award,
            null=True,
            on_delete=models.SET_NULL,
        )
        writers = models.ManyToManyField(Writer, blank=True)
To convert these relational fields to embedded fields, change each
related model to extend EmbeddedModel instead of
models.Model, then replace the Movie model's relational fields with
the corresponding embedded fields:
Replace the ForeignKey with an EmbeddedModelField
Replace the OneToOneField with an EmbeddedModelField
Replace the ManyToManyField with an EmbeddedModelArrayField
The following example shows the converted Movie model:
    from django.db import models
    from django_mongodb_backend.models import EmbeddedModel
    from django_mongodb_backend.fields import (
        EmbeddedModelField,
        EmbeddedModelArrayField,
    )

    class Director(EmbeddedModel):
        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

    class Award(EmbeddedModel):
        wins = models.IntegerField(default=0)
        nominations = models.IntegerField(default=0)
        text = models.CharField(max_length=100)

    class Writer(EmbeddedModel):
        first_name = models.CharField(max_length=100)
        last_name = models.CharField(max_length=100)

    class Movie(models.Model):
        title = models.CharField(max_length=200)
        director = EmbeddedModelField(Director, null=True, blank=True)
        awards = EmbeddedModelField(Award, null=True, blank=True)
        writers = EmbeddedModelArrayField(Writer, blank=True)
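Changing the model definitions does not by itself reshape documents that already exist. As a rough sketch of the target shape, the following plain-Python function folds referenced documents into a single movie document. The director_id and awards_id keys, the movie_writers link records, and the helper names are all assumptions for illustration. A real migration would depend on how your existing data is stored.

```python
def without_id(doc):
    """Copy a document without its _id (assumes embedded data needs no key)."""
    if doc is None:
        return None
    return {key: value for key, value in doc.items() if key != "_id"}


def embed_movie(movie, directors, awards, writers, movie_writers):
    """Build one self-contained movie document from referenced documents."""
    writer_ids = [
        link["writer_id"]
        for link in movie_writers
        if link["movie_id"] == movie["_id"]
    ]
    return {
        "_id": movie["_id"],
        "title": movie["title"],
        "director": without_id(directors.get(movie.get("director_id"))),
        "awards": without_id(awards.get(movie.get("awards_id"))),
        "writers": [without_id(writers[writer_id]) for writer_id in writer_ids],
    }


directors = {1: {"_id": 1, "first_name": "Jane", "last_name": "Doe"}}
awards = {5: {"_id": 5, "wins": 2, "nominations": 4, "text": "2 wins, 4 nominations"}}
writers = {
    7: {"_id": 7, "first_name": "Sam", "last_name": "Roe"},
    8: {"_id": 8, "first_name": "Alex", "last_name": "Poe"},
}
movie_writers = [{"movie_id": 10, "writer_id": 7}, {"movie_id": 10, "writer_id": 8}]
movie = {"_id": 10, "title": "A Sample Movie", "director_id": 1, "awards_id": 5}

converted = embed_movie(movie, directors, awards, writers, movie_writers)
```

The converted document carries its director, awards, and writers inline, so a single read returns everything that previously required lookups across four collections.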
Additional Information
To learn how to query data across related models, see Advanced Field Queries in the Specify a Query guide.
To learn more about Django relational fields, see the Model field reference in the Django documentation.
To learn more about embedded models, see Store Embedded Model Data in the Create Models guide.