Shard Keys¶
The shard key determines the distribution of the collection’s documents among the cluster’s shards. The shard key is either an indexed field or an indexed compound field that exists in every document in the collection.
MongoDB partitions data in the collection using ranges of shard key values. Each range, or chunk, defines a non-overlapping range of shard key values. MongoDB distributes the chunks, and their documents, among the shards in the cluster.

When a chunk grows beyond the chunk size, MongoDB attempts to split the chunk into smaller chunks, always based on ranges in the shard key.
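As a rough illustration of range-based partitioning, the following Python sketch models chunks as half-open ranges of shard key values and looks up the chunk, and therefore the shard, that owns a given value. The boundary values, the shard names, and the helpers `find_chunk` and `find_shard` are hypothetical and are not part of MongoDB; in a real cluster this metadata lives on the config servers.

```python
from bisect import bisect_right

# Hypothetical chunk boundaries: chunk i covers [BOUNDS[i], BOUNDS[i + 1]).
# float("-inf") and float("inf") stand in for MongoDB's MinKey and MaxKey.
BOUNDS = [float("-inf"), 100, 200, 300, float("inf")]

# Hypothetical assignment of each chunk to a shard.
CHUNK_TO_SHARD = ["shard0", "shard1", "shard0", "shard1"]


def find_chunk(key):
    """Return the index of the chunk whose range contains key."""
    return bisect_right(BOUNDS, key) - 1


def find_shard(key):
    """Return the shard that holds the chunk containing key."""
    return CHUNK_TO_SHARD[find_chunk(key)]
```

Because the ranges do not overlap, every shard key value maps to exactly one chunk. For example, `find_shard(150)` resolves to the shard that holds the `[100, 200)` chunk.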
Considerations¶
Shard keys are immutable and cannot be changed after insertion. See the system limits for sharded clusters for more information.
The index on the shard key cannot be a multikey index.
Hashed Shard Keys¶
New in version 2.4.
Hashed shard keys use a hashed index of a single field as the shard key to partition data across your sharded cluster.
The field you choose as your hashed shard key should have high cardinality, that is, a large number of different values. Hashed keys work well with fields that increase monotonically, like ObjectId values or timestamps.
If you shard an empty collection using a hashed shard key, MongoDB will automatically create and migrate chunks so that each shard has two chunks. You can control how many chunks MongoDB will create with the numInitialChunks parameter to shardCollection or by manually creating chunks on the empty collection using the split command.
To shard a collection using a hashed shard key, see Shard a Collection Using a Hashed Shard Key.
Tip
MongoDB automatically computes the hashes when resolving queries using hashed indexes. Applications do not need to compute hashes.
Impacts of Shard Keys on Cluster Operations¶
The shard key affects write and query performance by determining how MongoDB partitions data in the cluster and how effectively the mongos instances can direct operations to the cluster. Consider the following operational impacts of shard key selection:
Write Scaling¶
Some possible shard keys will allow your application to take advantage of the increased write capacity that the cluster can provide, while others will not. Consider the following example, where you shard by the values of the default _id field, which holds an ObjectId.
MongoDB generates ObjectId values upon document creation to produce a unique identifier for the object. However, the most significant bits of data in this value represent a timestamp, which means that they increment in a regular and predictable pattern. Even though this value has high cardinality, when you use it, a date, or any other monotonically increasing number as the shard key, all insert operations store data in a single chunk and, therefore, on a single shard. As a result, the write capacity of this shard defines the effective write capacity of the cluster.
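The effect described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not MongoDB's implementation: the chunk boundaries, the shard names, and the helpers `shard_for` and `count_inserts` are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical chunk boundaries: chunk i covers [BOUNDS[i], BOUNDS[i + 1]).
BOUNDS = [float("-inf"), 1000, 2000, 3000, float("inf")]
# Hypothetical placement: one chunk per shard.
CHUNK_TO_SHARD = ["shard0", "shard1", "shard2", "shard3"]


def shard_for(key):
    """Return the shard whose chunk range contains key."""
    return CHUNK_TO_SHARD[bisect_right(BOUNDS, key) - 1]


def count_inserts(keys):
    """Count how many inserts each shard would receive."""
    return Counter(shard_for(k) for k in keys)


# New keys always exceed every existing key, as with timestamps
# or the leading bits of ObjectId values.
monotonic_keys = range(3000, 3500)
hot = count_inserts(monotonic_keys)
```

In this model every one of the 500 inserts lands in the last chunk, so `hot` contains a single entry for `shard3` while the other three shards receive no writes.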
A shard key that increases monotonically will not hinder performance if you have a very low insert rate, or if most of your write operations are update() operations distributed through your entire data set. Generally, choose shard keys that have high cardinality and that distribute write operations across the entire cluster.
Typically, a computed shard key that has some amount of “randomness,” such as one that includes a cryptographic hash (e.g. MD5 or SHA1) of other content in the document, will allow the cluster to scale write operations. However, random shard keys do not typically provide query isolation, which is another important characteristic of shard keys.
New in version 2.4: MongoDB makes it possible to shard a collection on a hashed index. This can greatly improve write scaling. See Shard a Collection Using a Hashed Shard Key.
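To see why a hashed or otherwise “random” key helps write scaling, consider the following Python sketch. It is an illustration only: `hashed_key` uses the first 8 bytes of an MD5 digest of the value's string form as a stand-in hash and does not reproduce how MongoDB hashes BSON values, and `NUM_SHARDS` and `shard_for` are hypothetical names.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4        # hypothetical cluster size
HASH_SPACE = 2 ** 64  # size of the 64-bit hash space used in this sketch


def hashed_key(value):
    """Stand-in hash: first 8 bytes of MD5 as an unsigned 64-bit integer."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(value):
    """Map a hash to one of NUM_SHARDS equal-width ranges of the hash space."""
    return hashed_key(value) * NUM_SHARDS // HASH_SPACE


# Monotonically increasing inputs, such as counters or timestamps.
counts = Counter(shard_for(k) for k in range(4000))
```

Although the input values increase monotonically, their hashes scatter across the hash space, so `counts` shows each of the four ranges receiving roughly a quarter of the inserts. The same scattering is why a range query on the original field can no longer be routed to a single shard.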
Querying¶
The mongos provides an interface for applications to interact with sharded clusters that hides the complexity of data partitioning. A mongos receives queries from applications and uses metadata from the config server to route queries to the mongod instances with the appropriate data. While the mongos makes all queries operational in sharded environments, the shard key you select can have a profound effect on query performance.
See also
The Sharded Cluster Query Routing and config server sections for a more general overview of querying in sharded environments.
Targeted Operations vs. Broadcast Operations¶
Generally, the fastest queries in a sharded environment are those that mongos will route to a single shard, using the shard key and the cluster metadata from the config server. For queries that don’t include the shard key, mongos must query all shards, wait for their responses, and then return the result to the application. These “scatter/gather” queries can be long-running operations.
If your query includes the first component of a compound shard key [1], the mongos can route the query directly to a single shard, or a small number of shards, which provides better performance. Even if you query values of the shard key that reside in different chunks, the mongos will route queries directly to specific shards.
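The difference between targeted and broadcast operations can be sketched as follows. This is a deliberately simplified model, not the mongos routing algorithm: it handles only equality conditions, it assumes chunk boundaries fall on the first shard key field alone, and the shard key, boundaries, shard names, and `route` function are hypothetical.

```python
from bisect import bisect_right

# Hypothetical compound shard key; field names are illustrative only.
SHARD_KEY = ("zipcode", "username")

# Simplification: chunk i covers [BOUNDS[i], BOUNDS[i + 1]) on the first field.
BOUNDS = [float("-inf"), 30000, 60000, float("inf")]
CHUNK_TO_SHARD = ["shard0", "shard1", "shard2"]
ALL_SHARDS = sorted(set(CHUNK_TO_SHARD))


def route(query):
    """Return the shards that an equality query must contact."""
    first_field = SHARD_KEY[0]
    if first_field not in query:
        # No shard key prefix in the query: broadcast to every shard.
        return ALL_SHARDS
    # The first shard key component is present: target the owning chunk's shard.
    chunk = bisect_right(BOUNDS, query[first_field]) - 1
    return [CHUNK_TO_SHARD[chunk]]
```

Here `route({"zipcode": 45000})` contacts one shard, while `route({"username": "al"})` must contact all three because the query omits the first component of the shard key.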
To select a shard key for a collection:
- determine the most commonly included fields in queries for a given application
- identify which of these operations are most performance dependent.
If the field you identify has low cardinality (i.e. it is not sufficiently selective), you should add a second field to the shard key, making a compound shard key. The data may become more splittable with a compound shard key.
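One informal way to compare candidate shard keys is to count the distinct values each would produce on a sample of documents. The Python sketch below does this for a single field and for a compound key; the sample documents, the field names, and the `cardinality` helper are hypothetical.

```python
# Hypothetical sample documents; field names are illustrative only.
docs = [
    {"state": "NY", "username": "ana"},
    {"state": "NY", "username": "bo"},
    {"state": "CA", "username": "cy"},
    {"state": "CA", "username": "di"},
]


def cardinality(documents, fields):
    """Count the distinct shard key values for the given list of fields."""
    return len({tuple(doc[f] for f in fields) for doc in documents})


single = cardinality(docs, ["state"])                # distinct states
compound = cardinality(docs, ["state", "username"])  # distinct (state, username) pairs
```

On this sample the single field yields 2 distinct values and the compound key yields 4, so the compound key offers more possible split points.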
See Sharded Cluster Query Routing for more information on query operations in the context of sharded clusters.
[1] In many ways, you can think of the shard key as a cluster-wide index. However, be aware that sharded systems cannot enforce cluster-wide unique indexes unless the unique field is in the shard key. See the Index Concepts page for more information on indexes and compound indexes.
Sorting¶
In sharded systems, the mongos performs a merge-sort of all sorted query results from the shards. See Sharded Cluster Query Routing and Use Indexes to Sort Query Results for more information.
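The merge step can be illustrated with Python's standard `heapq.merge`, which combines inputs that are already sorted. This is a sketch of the general technique rather than the mongos implementation; the per-shard result lists and the `merge_sorted` helper are hypothetical.

```python
import heapq

# Hypothetical per-shard results, each already sorted by the sort field.
shard_results = [
    [{"score": 1}, {"score": 4}, {"score": 9}],
    [{"score": 2}, {"score": 3}],
    [{"score": 5}, {"score": 8}],
]


def merge_sorted(results, sort_field):
    """Merge already-sorted result lists into one sorted list."""
    return list(heapq.merge(*results, key=lambda doc: doc[sort_field]))


merged = merge_sorted(shard_results, "score")
```

Because each shard returns its results in sorted order, the merge only has to compare the current head of each list and never re-sorts the full result set.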
Indivisible Chunks¶
An insufficiently granular shard key can result in chunks that are “unsplittable”. See Create a Shard Key that is Easily Divisible for more information.
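A small Python sketch shows why such a chunk cannot be split. It assumes a chunk is split at a shard key value `v` into the ranges below `v` and from `v` upward, so a usable split value must be greater than the smallest key in the chunk. The `split_point` helper is hypothetical and does not reproduce MongoDB's split logic.

```python
def split_point(keys):
    """Return a value that splits keys into two non-empty ranges, or None."""
    ordered = sorted(keys)
    if not ordered:
        return None
    smallest = ordered[0]
    median = ordered[len(ordered) // 2]
    # Prefer the median so the two halves are roughly balanced.
    if median > smallest:
        return median
    # The median equals the smallest key: fall back to the first larger value.
    for key in ordered:
        if key > smallest:
            return key
    # Every document shares one shard key value: no split is possible.
    return None
```

For a chunk in which every document has the shard key value `"NY"`, `split_point` returns `None`: there is no boundary that separates the documents, no matter how large the chunk grows.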