Shard Keys
The shard key is either a single indexed field or multiple fields covered by a compound index that determines the distribution of the collection's documents among the cluster's shards.
MongoDB divides the span of shard key values (or hashed shard key values) into non-overlapping ranges. Each range is associated with a chunk, and MongoDB attempts to distribute chunks evenly among the shards in the cluster.
The shard key has a direct relationship to the effectiveness of chunk distribution. See Choose a Shard Key.
Shard Key Indexes
All sharded collections must have an index that supports the shard key. The index can be an index on the shard key or a compound index where the shard key is a prefix of the index.
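The prefix rule can be illustrated with a small plain-JavaScript check. This is a simplified sketch, not MongoDB's own logic: `indexSupportsShardKey` is a hypothetical helper that compares key patterns only, ignores any other index options, and conservatively assumes that each field's index type must match as well as its name.

```javascript
// Hypothetical helper: returns true when the shard key pattern is identical to,
// or a prefix of, the index key pattern.
// Key patterns are plain objects such as { x: 1, y: 1 }; field order matters.
function indexSupportsShardKey(indexKey, shardKey) {
  const indexFields = Object.entries(indexKey);
  const shardFields = Object.entries(shardKey);
  if (shardFields.length === 0 || shardFields.length > indexFields.length) {
    return false;
  }
  // Assumption: both the field name and the index type must match, in order.
  return shardFields.every(
    ([field, type], i) => indexFields[i][0] === field && indexFields[i][1] === type
  );
}

console.log(indexSupportsShardKey({ x: 1, y: 1, z: 1 }, { x: 1, y: 1 })); // true
console.log(indexSupportsShardKey({ y: 1, x: 1 }, { x: 1 }));             // false
```

In the second call the index starts with `y`, so `{ x: 1 }` is not a prefix of it.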
If the collection is empty, sh.shardCollection() creates the index on the shard key if such an index does not already exist.
If the collection is not empty, you must create the index before using sh.shardCollection().
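For a non-empty collection, the order of operations might look like the following sketch. The helper name `createIndexThenShard`, the `records` database, the `people` collection and the `zipcode` field are all illustrative assumptions. `db` and `sh` are the objects that mongosh provides; they are passed in as parameters here only so that the function is self-contained.

```javascript
// Sketch: create the supporting index first, then shard the collection.
// `db` and `sh` are expected to behave like the mongosh globals of the same name.
function createIndexThenShard(db, sh, collectionName, shardKey) {
  const namespace = db.getName() + "." + collectionName;
  // Step 1: create the index that supports the shard key.
  db.getCollection(collectionName).createIndex(shardKey);
  // Step 2: shard the collection on the same key pattern.
  return sh.shardCollection(namespace, shardKey);
}

// Possible usage in mongosh, connected to a mongos of a sharded cluster
// (database, collection and field names are hypothetical):
// createIndexThenShard(db.getSiblingDB("records"), sh, "people", { zipcode: 1 });
```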
If you drop the last valid index for the shard key, recover by recreating an index on just the shard key.
Unique Indexes
MongoDB can enforce a uniqueness constraint on a ranged shard key index. Through the use of a unique index on the shard key, MongoDB enforces uniqueness on the entire key combination and not individual components of the shard key.
For a ranged sharded collection, only the following indexes can be unique:
The index on the shard key
A compound index where the shard key is a prefix
The default _id index.
Important
Sharded clusters only enforce the uniqueness constraint on _id fields across the cluster when the _id field is also the shard key.
If the _id field is not the shard key or if it is only the prefix to the shard key, the uniqueness constraint applies only to the shard that stores the document. This means that two or more documents can have the same _id value, provided they occur on different shards.
In cases where the _id field is not the shard key, MongoDB expects applications to enforce the uniqueness of _id values across the shards.
The unique index constraints mean that:
For a to-be-sharded collection, you cannot shard the collection if the collection has other unique indexes.
For an already-sharded collection, you cannot create unique indexes on other fields.
A unique index stores a null value for a document missing the indexed field; that is, a missing indexed field is treated as another instance of a null index key value. For more information, see Missing Document Field in a Unique Single-Field Index.
To enforce uniqueness on the shard key values, pass the unique parameter as true to the sh.shardCollection() method:
If the collection is empty, sh.shardCollection() creates the unique index on the shard key if such an index does not already exist.
If the collection is not empty, you must create the index before using sh.shardCollection().
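A minimal sketch of the unique case for a ranged shard key, assuming a non-empty collection: the unique index is created explicitly, and true is then passed as the third (unique) argument of sh.shardCollection(). The helper name `shardWithUniqueKey` and the names in the usage comment are hypothetical; `db` and `sh` stand for the mongosh globals.

```javascript
// Sketch: enforce uniqueness on a ranged shard key.
// `db` and `sh` are expected to behave like the mongosh globals of the same name.
function shardWithUniqueKey(db, sh, collectionName, shardKey) {
  const namespace = db.getName() + "." + collectionName;
  // Create a unique index on exactly the shard key.
  db.getCollection(collectionName).createIndex(shardKey, { unique: true });
  // The third argument is the `unique` parameter.
  return sh.shardCollection(namespace, shardKey, true);
}

// Possible usage in mongosh (names are hypothetical):
// shardWithUniqueKey(db.getSiblingDB("records"), sh, "accounts", { email: 1 });
```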
Although you can have a unique compound index where the shard key is a prefix, if you use the unique parameter, the collection must have a unique index on the shard key.
You cannot specify a unique constraint on a hashed index.
To maintain uniqueness on a field that is not your shard key, see Unique Constraints on Arbitrary Fields.
Missing Shard Key Fields
Documents in sharded collections can be missing the shard key fields. To set missing shard key fields, see Set Missing Shard Key Fields.
Chunk Range and Missing Shard Key Fields
Missing shard key fields fall within the same chunk range as shard keys with null values. For example, if the shard key is on the fields { x: 1, y: 1 }, then:
Document Missing Shard Key | Falls into Same Range As |
---|---|
{ x: "hello" } | { x: "hello", y: null } |
{ y: "goodbye" } | { x: null, y: "goodbye" } |
{ z: "oops" } | { x: null, y: null } |
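The mapping in the table can be modeled in a few lines of plain JavaScript. This is only an illustration of the rule: `effectiveShardKey` is a hypothetical helper, and it handles top-level fields only.

```javascript
// Hypothetical helper: returns the shard key values a document is treated as
// having for chunk-range purposes, with missing fields replaced by null.
function effectiveShardKey(doc, shardKeyPattern) {
  const key = {};
  for (const field of Object.keys(shardKeyPattern)) {
    const present = Object.prototype.hasOwnProperty.call(doc, field);
    key[field] = present ? doc[field] : null;
  }
  return key;
}

const pattern = { x: 1, y: 1 };
console.log(effectiveShardKey({ x: "hello" }, pattern));   // { x: 'hello', y: null }
console.log(effectiveShardKey({ y: "goodbye" }, pattern)); // { x: null, y: 'goodbye' }
console.log(effectiveShardKey({ z: "oops" }, pattern));    // { x: null, y: null }
```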
Read/Write Operations and Missing Shard Key Fields
To target documents with missing shard key fields, you can use the { $exists: false } filter condition on the shard key fields. For example, if the shard key is on the fields { x: 1, y: 1 }, you can find the documents with missing shard key fields by running this query:
db.shardedcollection.find( { $or: [ { x: { $exists: false } }, { y: { $exists: false } } ] } )
If you specify a null equality match filter condition (e.g. { x: null }), the filter matches both those documents with missing shard key fields and those with shard key fields set to null.
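The difference between the two filter conditions can be sketched with a simplified in-memory model. The functions below are hypothetical stand-ins for { $exists: false } and a null equality match on a single top-level field; they are not MongoDB's query engine, and the sample documents are invented for illustration.

```javascript
const hasField = (doc, field) => Object.prototype.hasOwnProperty.call(doc, field);

// Models { <field>: { $exists: false } }: matches only documents missing the field.
function matchesExistsFalse(doc, field) {
  return !hasField(doc, field);
}

// Models { <field>: null }: matches documents missing the field
// and documents where the field is explicitly null.
function matchesNullEquality(doc, field) {
  return !hasField(doc, field) || doc[field] === null;
}

// Invented sample documents for a shard key on { x: 1, y: 1 }.
const docs = [
  { _id: 1, x: "hello", y: "world" },
  { _id: 2, x: "hello", y: null },
  { _id: 3, x: "hello" },
];

const missingY = docs.filter((d) => matchesExistsFalse(d, "y")).map((d) => d._id);
const nullOrMissingY = docs.filter((d) => matchesNullEquality(d, "y")).map((d) => d._id);
console.log(missingY);       // [ 3 ]
console.log(nullOrMissingY); // [ 2, 3 ]
```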
Some write operations, such as a write with an upsert specification, require an equality match on the shard key. In those cases, to target a document that is missing the shard key, include another filter condition in addition to the null equality match.
For example:
{ _id: <value>, <shardkeyfield>: null } // _id of the document missing shard key
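One way to assemble such a filter is sketched below. `buildTargetFilter` is a hypothetical helper: it adds an equality condition for every shard key field, using null for the fields the document is missing, and uses _id as the additional condition. The update shown in the usage comment is illustrative only.

```javascript
// Hypothetical helper: builds an equality filter on _id plus every shard key field.
// Fields absent from `knownValues` are matched with null.
function buildTargetFilter(id, shardKeyPattern, knownValues) {
  const filter = { _id: id };
  for (const field of Object.keys(shardKeyPattern)) {
    const known = Object.prototype.hasOwnProperty.call(knownValues, field);
    filter[field] = known ? knownValues[field] : null;
  }
  return filter;
}

// A document { _id: 42, x: "hello" } that is missing y, with shard key { x: 1, y: 1 }:
const filter = buildTargetFilter(42, { x: 1, y: 1 }, { x: "hello" });
console.log(filter); // { _id: 42, x: 'hello', y: null }

// Possible usage in mongosh (the update document is illustrative):
// db.shardedcollection.updateOne(filter, { $set: { reviewed: true } }, { upsert: true });
```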