Embedded Data Versus References

On this page

  • Embedded Data Models
  • Use Cases
  • Query Embedded Data
  • References
  • Use Cases
  • Query Normalized Data Models
  • Learn More

Effective data models support your application's needs. One key decision for your schema design is whether to embed data or use references.

Embedded Data Models

You can embed related data in a single document. In the following example, the contact and access fields are embedded documents:

Data model with embedded fields that contain all related information.
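As a rough sketch of what such a document might look like, here is the embedded model written as a plain Python dictionary. The field values and the integer `_id` are illustrative assumptions, not taken from this page (MongoDB would normally generate an ObjectId for `_id`).

```python
# Hypothetical user document with two embedded documents, "contact" and "access".
user = {
    "_id": 1,  # placeholder value; MongoDB usually assigns an ObjectId
    "username": "123xyz",
    "contact": {"phone": "123-456-7890", "email": "xyz@example.com"},
    "access": {"level": 5, "group": "dev"},
}

# Everything related to the user is reachable from this one record.
email = user["contact"]["email"]
access_level = user["access"]["level"]
```

Because the contact and access details live inside the user document, reading the user also reads both of them.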

Embedded data models are often denormalized because frequently accessed data is duplicated in multiple collections.

Embedded data models let applications query related pieces of information in the same database record. As a result, applications require fewer queries and updates to complete common operations.

Use Cases

Use embedded data models in the following scenarios:

Embedding provides the following benefits:

  • Better performance for read operations

  • The ability to retrieve related data in a single database operation

  • The ability to update related data in a single atomic write operation
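The single-write benefit can be illustrated with one update document that changes fields in two embedded documents at once. This is a sketch only: the document, the new values, and the `apply_set` helper are assumptions made for illustration. The helper mimics a `$set` update on a plain dictionary so the example runs without a server. With a driver such as PyMongo, the same update document would typically be passed to `update_one`, and MongoDB applies a write to a single document atomically.

```python
import copy

# Hypothetical user document with embedded "contact" and "access" documents.
user = {
    "_id": 1,
    "username": "123xyz",
    "contact": {"phone": "123-456-7890", "email": "xyz@example.com"},
    "access": {"level": 5, "group": "dev"},
}

# One update document that touches both embedded documents.
# "$set" with dot-notation keys is MongoDB update syntax; the values are made up.
update = {"$set": {"contact.phone": "555-0100", "access.level": 7}}


def apply_set(document, update_spec):
    """In-memory stand-in for a "$set" update (hypothetical helper, not a driver API)."""
    updated = copy.deepcopy(document)
    for dotted_path, value in update_spec["$set"].items():
        *parents, leaf = dotted_path.split(".")
        target = updated
        # Walk down to the embedded document that holds the field.
        for key in parents:
            target = target.setdefault(key, {})
        target[leaf] = value
    return updated


updated_user = apply_set(user, update)
```

Both changes are expressed in one operation against one document, so there is no intermediate state in which only the phone number or only the access level has changed.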

Query Embedded Data

To query data within embedded documents, use dot notation. For examples of querying data in arrays and embedded documents, see:
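As a minimal illustration of dot notation, the filter `{"access.group": "dev"}` below addresses the `group` field inside the embedded `access` document. The sample documents and the `resolve_path` and `matches` helpers are hypothetical. They emulate equality matching on plain dictionaries so the sketch runs without a database, and they do not handle arrays or query operators.

```python
def resolve_path(document, dotted_path):
    """Follow a dot-notation path such as "contact.email" through nested dicts.

    Returns None when any segment is missing. Hypothetical helper, not a MongoDB API.
    """
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(document, query):
    """Equality-only matching for a filter such as {"access.group": "dev"}."""
    return all(resolve_path(document, path) == expected
               for path, expected in query.items())


# Hypothetical sample data.
users = [
    {"username": "123xyz", "contact": {"email": "xyz@example.com"},
     "access": {"level": 5, "group": "dev"}},
    {"username": "456abc", "contact": {"email": "abc@example.com"},
     "access": {"level": 2, "group": "ops"}},
]

# The same filter document could be passed to a driver's find method.
query = {"access.group": "dev"}
found = [u["username"] for u in users if matches(u, query)]
```

The key point is the quoted path: the field name of the embedded document, a dot, then the field inside it.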


Document Size Limit

Documents in MongoDB must be smaller than 16 megabytes.

For large binary data, consider GridFS.

References

References store relationships between data by including links, called references, from one document to another. In the following example, the contact and access documents contain a reference to the user document.

Data model using references to link documents. Both the contact document and the access document contain a reference to the user document.
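A referenced version of the same data might look like the sketch below, written as plain Python dictionaries. The `user_id` field name and all values are assumptions for illustration. The point is only that the contact and access documents store the user's `_id` instead of being nested inside the user.

```python
# Hypothetical documents that would live in three separate collections.
user = {"_id": 1, "username": "123xyz"}

contact = {
    "_id": 10,
    "user_id": 1,  # reference to user["_id"]
    "phone": "123-456-7890",
    "email": "xyz@example.com",
}

access = {
    "_id": 20,
    "user_id": 1,  # reference to user["_id"]
    "level": 5,
    "group": "dev",
}

# Resolving a reference takes a second lookup, emulated here with a dict index.
users_by_id = {user["_id"]: user}
contact_owner = users_by_id[contact["user_id"]]
```

Compared with the embedded model, no user data is duplicated, but reading a contact together with its user takes an extra step.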

References result in normalized data models because data is divided into multiple collections and not duplicated.

Use Cases

Use references to link related data in the following scenarios:

  • Embedding would result in duplication of data but would not provide sufficient read performance advantages to outweigh the implications of the duplication. For example, this is the case when the embedded data changes frequently.

  • You need to represent complex many-to-many relationships or large hierarchical data sets.

  • The related entity is frequently queried on its own. For example, if you have employee and department data, you may consider embedding department information in the employee documents. However, if you often query for a list of departments, your application will perform best with a separate department collection that is linked to the employee collection with a reference.

Query Normalized Data Models

To query normalized data in multiple collections, MongoDB provides the following aggregation stages:

  • $lookup

  • $graphLookup
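As one possible sketch, a `$lookup` stage can join the employee and department collections from the earlier example. The collection name `departments`, the field names, and the sample data are assumptions. The `lookup_in_memory` helper is hypothetical and only emulates a basic equality-match `$lookup` on Python lists so that the result shape can be inspected without a server.

```python
# Aggregation pipeline with a single $lookup stage (MongoDB stage syntax).
pipeline = [
    {
        "$lookup": {
            "from": "departments",          # assumed name of the collection to join
            "localField": "department_id",  # assumed reference field on each employee
            "foreignField": "_id",          # field on each department to match against
            "as": "department",             # output array field added to each employee
        }
    }
]


def lookup_in_memory(local_docs, foreign_docs, stage):
    """Emulate a basic equality-match $lookup on lists of dicts (illustration only)."""
    spec = stage["$lookup"]
    joined = []
    for doc in local_docs:
        related = [f for f in foreign_docs
                   if f.get(spec["foreignField"]) == doc.get(spec["localField"])]
        # The matching documents are attached as an array under the "as" name.
        joined.append({**doc, spec["as"]: related})
    return joined


# Hypothetical sample data.
departments = [{"_id": 1, "name": "Engineering"}, {"_id": 2, "name": "Sales"}]
employees = [
    {"_id": 100, "name": "Ana", "department_id": 1},
    {"_id": 101, "name": "Raj", "department_id": 3},  # no matching department
]

result = lookup_in_memory(employees, departments, pipeline[0])
```

With a driver, the `pipeline` list would be passed to the employee collection's aggregate method. Each output document carries a `department` array, which is empty when no department matches.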

Learn More

For an example of normalized data models, see Model One-to-Many Relationships with Document References.

For examples of various tree models, see Model Tree Structures.

For more information on data modeling with MongoDB, download the MongoDB Application Modernization Guide.

The download includes the following resources:

  • Presentation on the methodology of data modeling with MongoDB

  • White paper covering best practices and considerations for migrating to MongoDB from an RDBMS data model

  • Reference MongoDB schema with its RDBMS equivalent

  • Application Modernization scorecard
