Docs Menu
Docs Home
/
MongoDB Manual
/ / /

shardCollection

On this page

  • Definition
  • Syntax
  • Command Fields
  • Considerations
  • Example
shardCollection

Shards a collection to distribute its documents across shards. The shardCollection command must be run against the admin database.

Note

Changed in version 6.0.

Starting in MongoDB 6.0, sharding a collection does not require you to first run the enableSharding command to configure the database.

Tip

In mongosh, this command can also be run through the sh.shardCollection() helper method.

Helper methods are convenient for mongosh users, but they may not return the same level of information as database commands. In cases where the convenience is not needed or the additional return fields are required, use the database command.

To run shardCollection, use the db.runCommand( { <command> } ) method.

The command has the following form:

db.adminCommand(
{
shardCollection: "<database>.<collection>",
key: { <field1>: <1|"hashed">, ... },
unique: <boolean>,
numInitialChunks: <integer>,
presplitHashedZones: <boolean>,
collation: { locale: "simple" },
timeseries: <object>
}
)

The command takes the following fields:

Field
Type
Description
shardCollection
string
The namespace of the collection to shard in the form <database>.<collection>.
key
document

The document that specifies the field or fields to use as the shard key.

{ <field1>: <1|"hashed">, ... }

Set the field values to either:

shard key must be supported by an index. Unless the collection is empty, the index must exist prior to the shardCollection command. If the collection is empty, MongoDB creates the index prior to sharding the collection if the index that can support the shard key does not already exist.

See also Shard Key Indexes

unique
boolean

Specify true to ensure that the underlying index enforces a unique constraint. Defaults to false.

You cannot specify true when using hashed shard keys.

numInitialChunks
integer

Specifies the initial number of chunks to create across all shards in the cluster when sharding an empty collection with a hashed shard key. MongoDB will then create and balance chunks across the cluster. The numInitialChunks must result in less than 8192 per shard.

If the collection is not empty or the shard key does not contain a hashed field, the operation returns an error.

  • If sharding with presplitHashedZones: true, MongoDB attempts to evenly distribute the specified number of chunks across the zones in the cluster.

  • If sharding with presplitHashedZones: false or omitted and no zones and zone ranges are defined for the empty collection, MongoDB attempts to evenly distributed the specified number of chunks across the shards in the cluster.

  • If sharding with presplitHashedZones: false or omitted and zones and zone ranges have been defined for the empty collection, numInitChunks has no effect.

collation
document
Optional. If the collection specified to shardCollection has a default collation, you must include a collation document with { locale : "simple" }, or the shardCollection command fails. At least one of the indexes whose fields support the shard key pattern must have the simple collation.
boolean

Optional. Specify true to perform initial chunk creation and distribution for an empty or non-existing collection based on the defined zones and zone ranges for the collection. For hashed sharding only.

shardCollection with presplitHashedZones: true returns an error if any of the following are true:

object

Optional. Specify this option to create a new sharded time series collection.

To shard an existing time series collection, omit this parameter.

When the collection specified to shardCollection is a time series collection and the timeseries option is not specified, MongoDB uses the values that define the existing time series collection to populate the timeseries field.

For detailed syntax, see Time Series Options.

New in version 5.1.

New in version 5.1.

To create a new time series collection that is sharded, specify the timeseries option to shardCollection.

The timeseries option takes the following fields:

Field
Type
Description
timeField
string

Required. The name of the field which contains the date in each time series document. Documents in a time series collection must have a valid BSON date as the value for the timeField.

metaField
string

Optional. The name of the field which contains metadata in each time series document. The metadata in the specified field should be data that is used to label a unique series of documents. The metadata should rarely, if ever, change The name of the specified field may not be _id or the same as the timeseries.timeField. The field can be of any data type.

Although the metaField field is optional, using metadata can improve query optimization. For example, MongoDB automatically creates a compound index on the metaField and timeField fields for new collections. If you do not provide a value for this field, the data is bucketed solely based on time.

granularity
string

Optional. Possible values are:

  • "seconds"

  • "minutes"

  • "hours"

By default, MongoDB sets the granularity to "seconds" for high-frequency ingestion.

Manually set the granularity parameter to improve performance by optimizing how data in the time series collection is stored internally. To select a value for granularity, choose the closest match to the time span between consecutive incoming measurements.

If you specify the timeseries.metaField, consider the time span between consecutive incoming measurements that have the same unique value for the metaField field. Measurements often have the same unique value for the metaField field if they come from the same source.

If you do not specify timeseries.metaField, consider the time span between all measurements that are inserted in the collection.

If you set the granularity parameter, you can't set the bucketMaxSpanSeconds and bucketRoundingSeconds parameters.

Do not run more than one shardCollection command on the same collection at the same time.

Once a collection has been sharded, MongoDB provides no method to unshard a sharded collection.

While you can change your shard key later, it is important to carefully consider your shard key choice to avoid scalability and perfomance issues.

Tip

See also:

When sharding time series collections, you can only specify the following fields in the shard key:

  • The metaField

  • Sub-fields of metaField

  • The timeField

You may specify combinations of these fields in the shard key. No other fields, including _id, are allowed in the shard key pattern.

When you specify the shard key:

Tip

Avoid specifying only the timeField as the shard key. Since the timeField increases monotonically, it may result in all writes appearing on a single chunk within the cluster. Ideally, data is evenly distributed across chunks.

To learn how to best choose a shard key, see:

Hashed shard keys use a hashed index or a compound hashed index as the shard key.

Use the form field: "hashed" to specify a hashed shard key field.

Note

If chunk migrations are in progress while creating a hashed shard key collection, the initial chunk distribution may be uneven until the balancer automatically balances the collection.

Tip

See also:

The shard collection operation (i.e. shardCollection command and the sh.shardCollection() helper) can perform initial chunk creation and distribution for an empty or a non-existing collection if zones and zone ranges have been defined for the collection. Initial chunk distribution allows for a faster setup of zoned sharding. After the initial distribution, the balancer manages the chunk distribution going forward per usual.

See Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection for an example. If sharding a collection using a ranged or single-field hashed shard key, the numInitialChunks option has no effect if zones and zone ranges have been defined for the empty collection.

To shard a collection using a compound hashed index, see Zone Sharding and Compound Hashed Indexes.

MongoDB supports sharding collections on compound hashed indexes. When sharding an empty or non-existing collection using a compound hashed shard key, additional requirements apply in order for MongoDB to perform initial chunk creation and distribution.

The numInitialChunks option has no effect if zones and zone ranges have been defined for the empty collection and presplitHashedZones is false.

See Pre-Define Zones and Zone Ranges for an Empty or Non-Existing Collection for an example.

If specifying unique: true:

  • If the collection is empty, shardCollection creates the unique index on the shard key if such an index does not already exist.

  • If the collection is not empty, you must create the index first before using shardCollection.

Although you can have a unique compound index where the shard key is a prefix, if using unique parameter, the collection must have a unique index that is on the shard key.

See also Sharded Collection and Unique Indexes

If the collection has a default collation, the shardCollection command must include a collation parameter with the value { locale: "simple" }. For non-empty collections with a default collation, you must have at least one index with the simple collation whose fields support the shard key pattern.

You do not need to specify the collation option for collections without a collation. If you do specify the collation option for a collection with no collation, it will have no effect.

mongos uses "majority" for the write concern of the shardCollection command and its helper sh.shardCollection().

The following operation enables sharding for the people collection in the records database and uses the zipcode field as the shard key:

db.adminCommand( { shardCollection: "records.people", key: { zipcode: 1 } } )

Back

setAllowMigrations

Next

shardingState