
Model Relationships and Denormalization

In this guide, you can learn how to model relationships between collections in Django MongoDB Backend. Because MongoDB is a document database, Django MongoDB Backend provides embedded model fields that store related data in a single document instead of across multiple collections. Django MongoDB Backend also provides an array field, which allows you to store a list of related scalar values in the parent document.

This guide explains when to use each approach and provides examples to demonstrate the strategies.

The examples in this guide define models that represent the following collections in the sample_mflix database:

  • movies: Stores information about movies

  • embedded_movies: Extends movies with vector plot embeddings

  • users: Stores information about movie viewers

  • comments: Stores information about movie comments

To learn more about the sample_mflix database, see Sample Mflix Dataset in the MongoDB Atlas documentation.

Relational databases normalize data into separate tables and use joins to combine related data at query time. MongoDB's document model allows you to embed related data directly inside a parent document instead, which is called denormalization. Django MongoDB Backend supports both approaches, but we recommend that you denormalize your data for better performance.
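The difference between the two layouts can be sketched with plain Python dictionaries. This is only an illustration: the in-memory `directors` and `movies_normalized` structures, the `director_id` key, and the placeholder values are all hypothetical stand-ins, not the storage format of any particular backend.

```python
# Hypothetical in-memory stand-ins for two normalized "collections".
directors = {1: {"first_name": "Dana", "last_name": "Example"}}
movies_normalized = [{"_id": 10, "title": "Example Movie", "director_id": 1}]


def read_normalized(movie):
    """Resolve the director reference with a second lookup, as a join would."""
    resolved = {k: v for k, v in movie.items() if k != "director_id"}
    resolved["director"] = directors[movie["director_id"]]
    return resolved


# Denormalized: the director data lives inside the movie document itself,
# so one read returns everything.
movie_denormalized = {
    "_id": 10,
    "title": "Example Movie",
    "director": {"first_name": "Dana", "last_name": "Example"},
}

joined = read_normalized(movies_normalized[0])
print(joined == movie_denormalized)
```

Both reads produce the same result, but the normalized read needs an extra lookup per reference, while the denormalized document already contains the related data.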

To denormalize, choose one of the following strategies:

  • Embed Related Models: Store related data inside the parent document by using an EmbeddedModelField or EmbeddedModelArrayField. Embedded data is stored in the same MongoDB document as the parent and retrieved in a single read operation.

  • Store Array Data: Store a list of related scalar values directly in the parent document by using an ArrayField. Array data is retrieved in a single read operation without $lookup operations.

Use an embedded model when all of the following conditions apply:

  • The related data is always read together with the parent document.

  • The related data belongs to a single parent and is not shared across multiple documents.

  • The number of related items is bounded and predictable.

To embed related data, define an embedded model class as a subclass of EmbeddedModel, then use an EmbeddedModelField to store it in the parent model.

The following example defines a Director embedded model, and then defines a Movie model that stores an instance of the Director model:

from django.db import models
from django_mongodb_backend.models import EmbeddedModel
from django_mongodb_backend.fields import EmbeddedModelField

class Director(EmbeddedModel):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)

class Movie(models.Model):
    title = models.CharField(max_length=200)
    director = EmbeddedModelField(Director, null=True, blank=True)

To learn more about embedded models, see Store Embedded Model Data in the Create Models guide.

Use an ArrayField to store a list of related scalar values directly on the parent document. Array data is stored in the same MongoDB document and retrieved in a single read operation without performing $lookup operations. Use this strategy when the following conditions apply:

  • The related values are simple scalars, such as strings or integers.

  • The array size is bounded and predictable.

The following example defines a Movie model that stores a list of strings in the cast field:

from django.db import models
from django_mongodb_backend.fields import ArrayField

class Movie(models.Model):
    title = models.CharField(max_length=200)
    cast = ArrayField(models.CharField(max_length=100), blank=True)

To store arrays of structured documents instead of scalar values, use an EmbeddedModelArrayField instead. To learn more, see Store Embedded Model Array Data in the Create Models guide.
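To make the distinction concrete, the following sketch contrasts the two array shapes by using plain dictionaries. The document shapes are assumptions about how such fields would typically appear in a stored document, and the `writers` field, the placeholder values, and the `holds_only_scalars` helper are hypothetical.

```python
# Assumed document shape for an ArrayField of strings: a list of scalars.
movie_with_cast = {
    "title": "Example Movie",
    "cast": ["Actor One", "Actor Two"],
}

# Assumed document shape for an EmbeddedModelArrayField: a list of subdocuments.
movie_with_writers = {
    "title": "Example Movie",
    "writers": [
        {"first_name": "Writer", "last_name": "One"},
        {"first_name": "Writer", "last_name": "Two"},
    ],
}


def holds_only_scalars(values):
    """Return True if every item is a simple scalar rather than a document."""
    return all(isinstance(v, (str, int, float, bool)) for v in values)


cast_is_scalar = holds_only_scalars(movie_with_cast["cast"])
writers_is_scalar = holds_only_scalars(movie_with_writers["writers"])
print(cast_is_scalar, writers_is_scalar)
```

If each item needs more than one attribute, as with the hypothetical `writers` list, an array of embedded models is likely the better fit.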

Important

Queries across relational fields use MongoDB's $lookup operator, which can be slow for large collections. When possible, use embedded models instead. To learn more about performance considerations, see Performance Limitations.

If your data is structured similarly to a relational database and you need to model large hierarchical datasets, you can use the following relational Django fields to link documents across separate collections: ForeignKey, OneToOneField, and ManyToManyField.

Important

You cannot use relational fields inside embedded model classes or as the base_field of an ArrayField.

Use a ForeignKey field to create a many-to-one relationship between two models. Each document in the referencing model links to one document in the referenced model. Pass the following arguments to the ForeignKey() constructor:

  • to: The model class to link to

  • on_delete: The deletion behavior when you delete the referenced document

The following example links a Comment model to a Movie model by using a ForeignKey field. When you delete a sample_mflix.movies document, all related sample_mflix.comments documents are also deleted:

from django.db import models

class Movie(models.Model):
    title = models.CharField(max_length=200)

class Comment(models.Model):
    movie = models.ForeignKey(
        Movie,
        on_delete=models.CASCADE,
    )
    text = models.TextField()
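The effect of on_delete=models.CASCADE can be illustrated with a small in-memory simulation. This sketch does not use Django and is not how Django implements cascading deletes. The `movies` and `comments` structures, the `movie_id` key, and the `delete_movie_cascade` helper are hypothetical.

```python
# Hypothetical in-memory stand-ins for the movies and comments collections.
movies = {1: {"title": "Example Movie"}, 2: {"title": "Another Movie"}}
comments = [
    {"_id": 100, "movie_id": 1, "text": "First comment"},
    {"_id": 101, "movie_id": 1, "text": "Second comment"},
    {"_id": 102, "movie_id": 2, "text": "Third comment"},
]


def delete_movie_cascade(movie_id):
    """Delete a movie and every comment that references it."""
    movies.pop(movie_id)
    # Keep only the comments that point at other movies.
    comments[:] = [c for c in comments if c["movie_id"] != movie_id]


delete_movie_cascade(1)
remaining_ids = [c["_id"] for c in comments]
print(remaining_ids)
```

After the call, only the comment that references the surviving movie remains.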

Tip

To learn more about on_delete options, see ForeignKey.on_delete in the Django documentation.

Use a OneToOneField field to create a one-to-one relationship between two models. Each document in the referencing model links to exactly one document in the referenced model. Pass the following arguments to the OneToOneField() constructor:

  • to: The model class to link to

  • on_delete: The deletion behavior when you delete the referenced document

The following example defines an EmbeddedMovie model that links to a Movie model by using a OneToOneField field. When you delete a sample_mflix.movies document, its linked sample_mflix.embedded_movies document is also deleted:

from django.db import models
from django_mongodb_backend.fields import ArrayField

class Movie(models.Model):
    title = models.CharField(max_length=200)

class EmbeddedMovie(models.Model):
    movie = models.OneToOneField(
        Movie,
        on_delete=models.CASCADE,
    )
    plot_embedding = ArrayField(models.FloatField(), blank=True)
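In Django, a OneToOneField behaves much like a ForeignKey with a uniqueness constraint, so each referenced document can be linked at most once. The following sketch simulates that constraint in plain Python. The `embedded_movies` dictionary and the `link_embedding` helper are hypothetical and only illustrate the idea.

```python
# Keyed by movie_id so that each movie can have at most one linked entry.
embedded_movies = {}


def link_embedding(movie_id, plot_embedding):
    """Attach an embedding to a movie, rejecting a second link to the same movie."""
    if movie_id in embedded_movies:
        raise ValueError(f"movie {movie_id} already has a linked document")
    embedded_movies[movie_id] = {
        "movie_id": movie_id,
        "plot_embedding": plot_embedding,
    }


link_embedding(1, [0.1, 0.2])

try:
    link_embedding(1, [0.3, 0.4])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(duplicate_rejected)
```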

Use a ManyToManyField field to create a many-to-many relationship between two models. Each document in either model can link to multiple documents in the other model. Pass the model class to link to as the first argument to the ManyToManyField() constructor.

The following example links a Viewer model to a Movie model. Each viewer can watch multiple movies, and each movie can be watched by multiple viewers:

from django.db import models

class Movie(models.Model):
    title = models.CharField(max_length=200)

class Viewer(models.Model):
    name = models.CharField(max_length=100)
    email = models.CharField(max_length=200)
    watched = models.ManyToManyField(Movie, blank=True)
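Django models a many-to-many relationship through an intermediate model that pairs one row from each side. The following sketch shows that idea with in-memory data, assuming the pairs live in their own collection. The `watched_links` list, its key names, and the helper functions are hypothetical.

```python
# Hypothetical stand-ins for the two linked collections.
movies = {1: "Example Movie", 2: "Another Movie"}
viewers = {10: "Viewer A", 11: "Viewer B"}

# Each entry pairs one viewer with one movie, like a row in a join table.
watched_links = [
    {"viewer_id": 10, "movie_id": 1},
    {"viewer_id": 10, "movie_id": 2},
    {"viewer_id": 11, "movie_id": 1},
]


def movies_watched_by(viewer_id):
    """Follow the links from one viewer to the movies that viewer watched."""
    return [movies[l["movie_id"]] for l in watched_links if l["viewer_id"] == viewer_id]


def viewers_of(movie_id):
    """Follow the links in the other direction, from a movie to its viewers."""
    return [viewers[l["viewer_id"]] for l in watched_links if l["movie_id"] == movie_id]


watched_by_a = movies_watched_by(10)
viewers_of_first = viewers_of(1)
print(watched_by_a, viewers_of_first)
```

Because every traversal goes through the link entries, reads in either direction involve more than one collection, which is the cost that embedding avoids.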

If you have existing models that use relational fields and want to improve read performance, you can convert them to use embedded models instead. This eliminates $lookup operations and stores all related data in a single MongoDB document.

The following example defines a Movie model that uses ForeignKey, OneToOneField, and ManyToManyField relational fields:

from django.db import models

class Director(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)

class Award(models.Model):
    wins = models.IntegerField(default=0)
    nominations = models.IntegerField(default=0)
    text = models.CharField(max_length=100)

class Writer(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)

class Movie(models.Model):
    title = models.CharField(max_length=200)
    director = models.ForeignKey(
        Director,
        null=True,
        on_delete=models.SET_NULL,
    )
    awards = models.OneToOneField(
        Award,
        null=True,
        on_delete=models.SET_NULL,
    )
    writers = models.ManyToManyField(Writer, blank=True)

To convert these relational fields to embedded fields, change each related model to extend EmbeddedModel instead of models.Model, then replace the Movie model's relational fields with the corresponding embedded fields:

  • Replace the ForeignKey with an EmbeddedModelField

  • Replace the OneToOneField with an EmbeddedModelField

  • Replace the ManyToManyField with an EmbeddedModelArrayField

The following example shows the converted Movie model:

from django.db import models
from django_mongodb_backend.models import EmbeddedModel
from django_mongodb_backend.fields import (
    EmbeddedModelField,
    EmbeddedModelArrayField,
)

class Director(EmbeddedModel):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)

class Award(EmbeddedModel):
    wins = models.IntegerField(default=0)
    nominations = models.IntegerField(default=0)
    # Kept from the relational Award model so the conversion preserves every field
    text = models.CharField(max_length=100)

class Writer(EmbeddedModel):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)

class Movie(models.Model):
    title = models.CharField(max_length=200)
    director = EmbeddedModelField(Director, null=True, blank=True)
    awards = EmbeddedModelField(Award, null=True, blank=True)
    writers = EmbeddedModelArrayField(Writer, blank=True)
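Changing the model classes does not by itself reshape documents that are already stored, so existing data may need a one-time conversion. The following sketch shows the kind of transformation involved by using plain dictionaries. It is not a migration tool: the key names (`director_id`, `awards_id`, `writer_id`, `movie_id`), the sample values, and both helper functions are hypothetical.

```python
def without_id(doc):
    """Return a copy of doc without its _id, or None if doc is missing."""
    if doc is None:
        return None
    return {k: v for k, v in doc.items() if k != "_id"}


def to_embedded(movie, directors, awards, writers, writer_links):
    """Build a single document that embeds the data a movie used to reference."""
    writer_ids = [l["writer_id"] for l in writer_links if l["movie_id"] == movie["_id"]]
    return {
        "_id": movie["_id"],
        "title": movie["title"],
        "director": without_id(directors.get(movie.get("director_id"))),
        "awards": without_id(awards.get(movie.get("awards_id"))),
        "writers": [without_id(writers[w]) for w in writer_ids],
    }


# Hypothetical referenced data, keyed by _id.
directors = {1: {"_id": 1, "first_name": "Dana", "last_name": "Example"}}
awards = {5: {"_id": 5, "wins": 2, "nominations": 4, "text": "2 wins"}}
writers = {
    7: {"_id": 7, "first_name": "Writer", "last_name": "One"},
    8: {"_id": 8, "first_name": "Writer", "last_name": "Two"},
}
writer_links = [{"movie_id": 10, "writer_id": 7}, {"movie_id": 10, "writer_id": 8}]
movie = {"_id": 10, "title": "Example Movie", "director_id": 1, "awards_id": 5}

embedded = to_embedded(movie, directors, awards, writers, writer_links)
print(embedded)
```

A movie with no director or awards reference ends up with None in those fields, which corresponds to the null=True setting on the embedded fields.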

To learn how to query data across related models, see Advanced Field Queries in the Specify a Query guide.

To learn more about Django relational fields, see the Model field reference in the Django documentation.

To learn more about embedded models, see Store Embedded Model Data in the Create Models guide.
