Tag Aware Sharding
For sharded clusters, MongoDB makes it possible to associate specific ranges of a shard key with a specific shard or subset of shards. This association dictates the policy of the cluster balancer process as it balances the chunks around the cluster. This capability enables the following deployment patterns:
- isolating a specific subset of data on a specific set of shards.
- controlling the balancing policy so that in a geographically distributed cluster the most relevant portions of the data set reside on the shards with greatest proximity to the application servers.
This document describes the behavior, operation, and use of tag aware sharding in MongoDB deployments.
Note
Shard key range tags are entirely distinct from replica set member tags.
Behavior and Operations
Tags in a sharded cluster are pieces of metadata that dictate the policy and behavior of the cluster balancer. You may associate individual shards in a cluster with one or more tags, then assign a tag string to a range of shard key values for a sharded collection. When migrating a chunk, the balancer selects a destination shard based on the configured tag ranges.
The balancer migrates chunks in tagged ranges to shards with those tags, if tagged shards are not balanced. [1]
Note
Because a single chunk may span different tagged shard key ranges, the balancer may migrate to a tagged shard a chunk that contains values exceeding the upper bound of the selected tag range.
Example
Given a sharded collection with two configured tag ranges, such that:
- Shard key values between 100 and 200 have tags to direct corresponding chunks to shards tagged NYC.
- Shard key values between 200 and 300 have tags to direct corresponding chunks to shards tagged SFO.
In this cluster, the balancer will migrate a chunk with shard key values ranging between 150 and 220 to a shard tagged NYC, since the chunk's lower bound, 150, falls below 200, the upper bound of the NYC range.
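The selection rule described in footnote [1] can be sketched in plain JavaScript. This is an illustrative model only, not the balancer's actual code: the function name tagForChunk and the data layout are hypothetical, and the extra check that a range must start at or below the chunk's lower bound is an assumption that the text does not state.

```javascript
// Illustrative model only: this is not MongoDB's balancer code.
// Tag ranges from the example above; min is inclusive, max is exclusive.
const tagRanges = [
  { min: 100, max: 200, tag: "NYC" },
  { min: 200, max: 300, tag: "SFO" },
];

// Return the tag that governs a chunk, or null if no range applies.
// Rule from footnote [1]: take the first range (ordered by upper bound)
// whose upper bound is greater than the chunk's lower bound.
// Assumption (not stated in the text): the range must also start at or
// below the chunk's lower bound, so chunks outside every range stay untagged.
function tagForChunk(chunk, ranges) {
  const sorted = [...ranges].sort((a, b) => a.max - b.max);
  const candidate = sorted.find((r) => r.max > chunk.min);
  if (!candidate || chunk.min < candidate.min) {
    return null;
  }
  return candidate.tag;
}

console.log(tagForChunk({ min: 150, max: 220 }, tagRanges)); // NYC
console.log(tagForChunk({ min: 200, max: 260 }, tagRanges)); // SFO
console.log(tagForChunk({ min: 10, max: 50 }, tagRanges));   // null
```

Only the chunk's lower bound takes part in the decision, which is why the chunk spanning 150 to 220 lands on an NYC shard even though part of it lies in the SFO range.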
After configuring tags on shards and ranges of the shard key, the cluster may take some time to reach the proper distribution of data, depending on the division of chunks (i.e. splits) and the current distribution of data in the cluster. Once configured, the balancer will respect tag ranges during future balancing rounds.
[1] To migrate chunks in a tagged environment, the balancer selects a target shard with a tag range that has an upper bound that is greater than the migrating chunk’s lower bound. If a shard with a matching tagged range exists, the balancer will migrate the chunk to that shard.
Administer Shard Tags
Associate tags with a particular shard using the sh.addShardTag() method when connected to a mongos instance. A single shard may have multiple tags, and multiple shards may also have the same tag.
Example
The following example adds the tag NYC to two shards, and the tags SFO and NRT to a third shard:
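A minimal sketch of those calls follows. The shard names (shard0000, shard0001, shard0002) are hypothetical placeholders, and the wrapper function tagExampleShards is introduced here only so that the dependency on the shell's sh helper is explicit.

```javascript
// Sketch for a mongo shell session connected to a mongos instance.
// The shard names are hypothetical; substitute the names shown by sh.status().
function tagExampleShards(sh) {
  // Two shards share the NYC tag.
  sh.addShardTag("shard0000", "NYC");
  sh.addShardTag("shard0001", "NYC");
  // A third shard carries two tags.
  sh.addShardTag("shard0002", "SFO");
  sh.addShardTag("shard0002", "NRT");
}
```

In the shell, this would be invoked as tagExampleShards(sh); the four sh.addShardTag() calls can equally be typed directly at the prompt.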
You may remove tags from a particular shard using the sh.removeShardTag() method when connected to a mongos instance, as in the following example, which removes the NRT tag from a shard:
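A minimal sketch of the removal, assuming the tag was added to a shard with the hypothetical name shard0002; the wrapper function removeNrtTag is likewise introduced here only for illustration.

```javascript
// Sketch for a mongo shell session connected to a mongos instance.
// "shard0002" is a hypothetical shard name.
function removeNrtTag(sh) {
  // Detach the NRT tag; other tags on the shard are left in place.
  sh.removeShardTag("shard0002", "NRT");
}
```

In the shell, this would be invoked as removeNrtTag(sh).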
Tag a Shard Key Range
To assign a tag to a range of shard keys, use the sh.addTagRange() method when connected to a mongos instance. Any given shard key range may only have one assigned tag. You cannot overlap defined ranges, or tag the same range more than once.
Example
Given a collection named users in the records database, sharded by the zipcode field, the following operations assign:
- two ranges of zip codes in Manhattan and Brooklyn the NYC tag
- one range of zip codes in San Francisco the SFO tag
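A minimal sketch of these operations follows. The zip code boundaries are illustrative values chosen for this sketch, not figures given in this document, and the wrapper function tagZipcodeRanges is hypothetical. The sketch also assumes that zipcode values are stored as strings.

```javascript
// Sketch for a mongo shell session connected to a mongos instance.
// sh.addTagRange(namespace, minimum, maximum, tag)
// All zip code boundaries below are illustrative assumptions.
function tagZipcodeRanges(sh) {
  // Manhattan
  sh.addTagRange("records.users", { zipcode: "10001" }, { zipcode: "10281" }, "NYC");
  // Brooklyn
  sh.addTagRange("records.users", { zipcode: "11201" }, { zipcode: "11240" }, "NYC");
  // San Francisco
  sh.addTagRange("records.users", { zipcode: "94102" }, { zipcode: "94135" }, "SFO");
}
```

In the shell, this would be invoked as tagZipcodeRanges(sh). The three ranges do not overlap, as the method requires.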
Note
Shard ranges are always inclusive of the lower boundary and exclusive of the upper boundary.
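The boundary rule, together with the requirement that ranges not overlap, can be modeled in a few lines of plain JavaScript. This is a sketch for reasoning about ranges, not MongoDB code: the helper names inRange and rangesOverlap are hypothetical, and the zip code values are illustrative strings compared lexicographically.

```javascript
// Illustrative helpers only; these are not part of the mongo shell.
// A range is half-open: min is included, max is excluded.
function inRange(key, range) {
  return key >= range.min && key < range.max;
}

// Two half-open ranges overlap only if each starts before the other ends.
// Ranges that merely share a boundary value do not overlap.
function rangesOverlap(a, b) {
  return a.min < b.max && b.min < a.max;
}

const manhattan = { min: "10001", max: "10281" };   // illustrative bounds
const adjacent = { min: "10281", max: "10300" };    // starts where manhattan ends
const conflicting = { min: "10200", max: "10300" }; // cuts into manhattan

console.log(inRange("10001", manhattan));           // true
console.log(inRange("10281", manhattan));           // false
console.log(rangesOverlap(manhattan, adjacent));    // false
console.log(rangesOverlap(manhattan, conflicting)); // true
```

Because the upper boundary is exclusive, one range may end at exactly the value where the next begins without the two counting as overlapping.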
Remove a Tag From a Shard Key Range
MongoDB does not provide a helper for removing a tag range. You may delete a tag assignment from a shard key range by removing the corresponding document from the tags collection of the config database.
Each document in the tags collection holds the namespace of the sharded collection and a minimum shard key value.
Example
The following example removes the NYC tag assignment for the range of zip codes within Manhattan:
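A minimal sketch of the removal follows. It assumes that the Manhattan range was created on records.users with a lower bound of "10001" (an illustrative value, not one given in this document) and that each tag document's _id combines the namespace and the minimum shard key value. The wrapper function removeManhattanTagRange is hypothetical.

```javascript
// Sketch for a mongo shell session connected to a mongos instance.
// Assumes the range to remove starts at zipcode "10001" (illustrative value).
function removeManhattanTagRange(db) {
  // Tag ranges live in the tags collection of the config database.
  const configDb = db.getSiblingDB("config");
  return configDb.tags.remove({
    _id: { ns: "records.users", min: { zipcode: "10001" } },
    tag: "NYC",
  });
}
```

In the shell, this would be invoked as removeManhattanTagRange(db). Matching on both the _id and the tag avoids deleting a range that has since been reassigned to a different tag.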
View Existing Shard Tags
The output from sh.status() lists tags associated with a shard, if any, for each shard. A shard’s tags exist in the shard’s document in the shards collection of the config database. To return all shards with a specific tag, use a sequence of operations that resemble the following, which will return only those shards tagged with NYC:
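A minimal sketch of such a query, assuming that each shard document stores its tags in an array field named tags; the wrapper function findShardsTaggedNyc is hypothetical.

```javascript
// Sketch for a mongo shell session connected to a mongos instance.
// Assumes shard documents keep their tags in an array field named "tags".
function findShardsTaggedNyc(db) {
  const configDb = db.getSiblingDB("config");
  // Matching an array field against a single value returns every
  // document whose array contains that value.
  return configDb.shards.find({ tags: "NYC" });
}
```

In the shell, findShardsTaggedNyc(db) returns a cursor over the matching shard documents.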
You can find tag ranges for all namespaces in the tags collection of the config database. The output of sh.status() displays all tag ranges. To return all shard key ranges tagged with NYC, use the following sequence of operations:
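A minimal sketch of such a query, assuming that each document in the tags collection records its tag in a string field named tag; the wrapper function findRangesTaggedNyc is hypothetical.

```javascript
// Sketch for a mongo shell session connected to a mongos instance.
// Assumes each tag range document stores its tag in a field named "tag".
function findRangesTaggedNyc(db) {
  const configDb = db.getSiblingDB("config");
  // Each returned document describes one tagged shard key range.
  return configDb.tags.find({ tag: "NYC" });
}
```

In the shell, findRangesTaggedNyc(db) returns a cursor over the matching tag range documents, across all namespaces.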