/ /

MongoDB CRUD Concepts

Docs Home

Development

CRUD Operations

MongoDB CRUD Concepts

Docs Home

Development

CRUD Operations

MongoDB CRUD Concepts

Distributed Queries

Learn how MongoDB handles read and write operations in replica sets and sharded clusters. This page explains the importance of cluster architecture on query routing and data consistency.

Read Operations to Replica Sets

By default, clients read from a replica set's primary. However, you can specify a read preference to direct read operations to other members. For example, configure read preferences to read from secondaries or from the nearest member to:

Reduce latency in deployments with multiple data centers.
Perform backup operations.
Allow reads until a new primary is elected.

Default read preference routes to primary; ``nearest`` routes to closest member.

click to enlarge

Read operations from secondary members of replica sets may not reflect the current state of the primary. Read preferences that direct read operations to different servers may result in non-monotonic reads.

Clients can use causally consistent sessions, which provides various guarantees including monotonic reads.

You can configure the read preference on a per-connection or per-operation basis. For more information on read preference or on the read preference modes, see Read Preference and Read Preference Modes.

Write Operations on Replica Sets

In replica sets, all write operations go to the set's primary. The primary applies the write operation and records the operations on the primary's operation log or oplog. The oplog is a reproducible sequence of operations to the data set. secondary members of the set continuously replicate the oplog and apply the operations to themselves in an asynchronous process.

Diagram of default routing of reads and writes to the primary.

click to enlarge

For more information on replica sets and write operations, see Replication and Write Concern.

Read Operations to Sharded Clusters

Sharded clusters allow you to partition a data set among a cluster of mongod instances in a way that is nearly transparent to the application. For an overview of sharded clusters, see the Sharding section of this manual.

For a sharded cluster, applications issue operations to one of the mongos instances associated with the cluster.

click to enlarge

Read operations on sharded clusters are most efficient when directed to a specific shard. Queries to sharded collections should include the collection's shard key. When a query includes a shard key, the mongos can use cluster metadata from the config database to route the queries to shards.

Targeted query: ``mongos`` routes to specific shards when the shard key is included.

If a query does not include the shard key, the mongos must direct the query to all shards in the cluster. These scatter gather queries can be inefficient. On larger clusters, scatter gather queries are unfeasible for routine operations.

Scatter-gather query: ``mongos`` broadcasts to all shards when no shard key is specified.

For replica set shards, read operations from secondary members of replica sets may not reflect the current state of the primary. Read preferences that direct read operations to different servers may result in non-monotonic reads.

Note

Clients can use causally consistent sessions, which provides various guarantees, including monotonic reads.
All members of a shard replica set, not just the primary, maintain the metadata regarding chunk metadata. This prevents reads from the secondaries from returning orphaned data if not using read concern "available". In earlier versions, reads from secondaries, regardless of the read concern, could return orphaned documents.

For more information on read operations in sharded clusters, see the Routing with mongos and Shard Keys sections.

Write Operations on Sharded Clusters

For sharded collections in a sharded cluster, the mongos directs write operations from applications to the shards that are responsible for the specific portion of the data set. The mongos uses the cluster metadata from the config database to route the write operation to the appropriate shards.

click to enlarge

MongoDB partitions data in a sharded collection into ranges based on the values of the shard key. Then, MongoDB distributes these chunks to shards. The shard key determines the distribution of chunks to shards. This can affect the performance of write operations in the cluster.

Diagram of the shard key value space segmented into smaller ranges or chunks.

Important

Update operations that affect a single document must include the shard key or the _id field. Updates that affect multiple documents are more efficient in some situations if they have the shard key, but can be broadcast to all shards.

If the value of the shard key increases or decreases with every insert, all insert operations target a single shard. As a result, the capacity of a single shard becomes the limit for the insert capacity of the sharded cluster.

For more information, see Sharding and Bulk Write Operations.

Tip

Retryable Writes

Change Streams and Orphan Documents

Starting in MongoDB 5.3, during range migration, change stream events are not generated for updates to orphaned documents.

Back

Atomicity & Transactions

Periods & Dollar Signs