Sharded Cluster Architectures¶
This document describes the organization and design of sharded cluster deployments.
Restriction on the Use of the localhost Interface¶
Because all components of a sharded cluster must communicate with each other over the network, there are special restrictions regarding the use of localhost addresses:
If you use either “localhost” or “127.0.0.1” as the host identifier, then you must use “localhost” or “127.0.0.1” for all host settings for any MongoDB instance in the cluster. This applies both to the host argument to addShard and to the value of the mongos --configdb runtime option. If you mix localhost addresses with remote host addresses, MongoDB will produce errors.
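For example, a consistent all-localhost configuration for a single-machine deployment might look like the following sketch (the ports and the remote hostname are illustrative):

```javascript
// mongos started with: mongos --configdb localhost:27019
// From a mongo shell connected to that mongos:
sh.addShard("localhost:27018")           // consistent: localhost everywhere
// sh.addShard("db1.example.net:27018")  // mixing a remote address with
//                                       // localhost would produce an error
```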
Test Cluster Architecture¶
You can deploy a very minimal cluster for testing and development. These non-production clusters have the following components:
- One config server.
- At least one shard, running as either a replica set or a standalone mongod instance.
- One mongos instance.
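A minimal sketch of starting these components on a single machine (the ports and dbpath values are illustrative):

```shell
mongod --configsvr --port 27019 --dbpath /data/configdb   # one config server
mongod --port 27018 --dbpath /data/shard0                 # one standalone shard
mongos --configdb localhost:27019 --port 27017            # one mongos instance
# From a mongo shell connected to the mongos, add the shard:
#   sh.addShard("localhost:27018")
```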
Warning
Use the test cluster architecture for testing and development only.
Production Cluster Architecture¶
In a production cluster, you must ensure that data is redundant and that your systems are highly available. To that end, a production-level cluster must have the following components:
Three config servers, each residing on a discrete system.
A single sharded cluster must have exclusive use of its config servers. If you have multiple sharded clusters, you must have a separate group of config servers for each cluster.
Two or more replica sets to serve as shards. For information on replica sets, see Replication.
Two or more mongos instances. Typically, you deploy a single mongos instance on each application server. Alternatively, you may deploy several mongos instances and let your application connect to them via a load balancer.
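For example, each mongos in such a deployment might be started with all three config servers listed in its --configdb option (the hostnames here are illustrative; each mongos should specify the config servers using the same string):

```shell
mongos --configdb cfg1.example.net:27019,cfg2.example.net:27019,cfg3.example.net:27019
```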
Sharded and Non-Sharded Data¶
Sharding operates on the collection level. You can shard multiple collections within a database, or have multiple databases with sharding enabled. [1] However, in production deployments some databases and collections will use sharding, while other databases and collections will reside only on a single standalone instance or replica set (i.e. a shard).
Regardless of the data architecture of your sharded cluster, ensure that all queries and operations use the mongos router to access the data cluster. Use the mongos even for operations that do not impact the sharded data.
Every database has a “primary” [2] shard that holds all un-sharded collections in that database. Use the movePrimary command to change the primary shard for a database. Use the db.printShardingStatus() command or the sh.status() helper to see an overview of the cluster, including information about the chunk and database distribution within the cluster.
Warning
The movePrimary command can be expensive because it copies all non-sharded data to the new shard, during which that data will be unavailable for other operations.
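A mongo-shell sketch of these operations (the database and shard names are illustrative):

```javascript
// Print an overview of the cluster, including chunk and
// database distribution across shards:
sh.status()
// Change the primary shard for the "test" database to shard0001;
// movePrimary must run against the admin database:
db.adminCommand({ movePrimary: "test", to: "shard0001" })
```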
When you deploy a new sharded cluster, the “first shard” becomes the primary for all databases before sharding is enabled. Databases created subsequently may reside on any shard in the cluster.
[1] As you configure sharding, you will use the enableSharding command to enable sharding for a database. This simply makes it possible to use the shardCollection command on a collection within that database.
[2] The term “primary” in the context of databases and sharding has nothing to do with the term primary in the context of replica sets.
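As footnote [1] describes, enabling sharding on a database only makes it possible to shard its collections. A sketch from the mongo shell (the database, collection, and shard key are illustrative):

```javascript
sh.enableSharding("records")                          // per-database step
sh.shardCollection("records.people", { zipcode: 1 })  // per-collection step
```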
High Availability and MongoDB¶
A production cluster has no single point of failure. This section introduces the availability concerns for MongoDB deployments and highlights potential failure scenarios and available resolutions:
Application servers or mongos instances become unavailable.
If each application server has its own mongos instance, other application servers can continue to access the database. Furthermore, mongos instances do not maintain persistent state, and they can restart and become unavailable without losing any state or data. When a mongos instance starts, it retrieves a copy of the config database and can begin routing queries.
A single mongod becomes unavailable in a shard.
Replica sets provide high availability for shards. If the unavailable mongod is a primary, then the replica set will elect a new primary. If the unavailable mongod is a secondary and it disconnects, the primary and the remaining secondaries will continue to hold all data. In a three-member replica set, even if a single member of the set experiences catastrophic failure, the two other members have full copies of the data. [3]
Always investigate availability interruptions and failures. If a system is unrecoverable, replace it and create a new member of the replica set as soon as possible to replace the lost redundancy.
All members of a replica set become unavailable.
If all members of a replica set within a shard are unavailable, all data held in that shard is unavailable. However, the data on all other shards remains available, and it is possible to read and write data to the other shards. Your application must be able to deal with partial results, and you should investigate the cause of the interruption and attempt to recover the shard as soon as possible.
One or two config databases become unavailable.
Three distinct mongod instances provide the config database, using a special two-phase commit to maintain consistent state between these mongod instances. Cluster operation will continue as normal, but chunk migration will stop and the cluster can create no new chunk splits. Replace the config server as soon as possible. If all config databases become unavailable, the cluster can become inoperable.
Note
All config servers must be running and available when you first initiate a sharded cluster.
[3] If an unavailable secondary becomes available while it still has current oplog entries, it can catch up to the latest state of the set using the normal replication process; otherwise it must perform an initial sync.