
Restore a Sharded Cluster

Important

In version 3.4, MongoDB removes support for SCCC config servers. To upgrade your config servers from SCCC to CSRS, see Upgrade Config Servers to Replica Set.

The following procedure applies to 3.4 config servers. For earlier versions of MongoDB, refer to the corresponding version of the MongoDB Manual.

This procedure restores a sharded cluster from an existing backup, such as LVM snapshots or database dumps. The source and target sharded cluster must have the same number of shards. For complete documentation on sharded cluster backups, see Back Up a Sharded Cluster with Database Dumps and Back Up a Sharded Cluster with File System Snapshots.

MongoDB Cloud Backups

MongoDB Cloud Services provides built-in backup and restoration features that can automatically restore sharded cluster backups.

For more information, see MongoDB Atlas, MongoDB Cloud Manager, and MongoDB Ops Manager.

A. (Optional) Review Replica Set Configurations

This procedure initiates a new replica set for the Config Server Replica Set (CSRS) and each shard replica set using the default configuration. To use a different replica set configuration for your restored CSRS and shards, you must reconfigure the replica set(s).

If your source cluster is healthy and accessible, connect a mongo shell to the primary replica set member in each replica set and run rs.conf() to view the replica configuration document.
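For example, connect a mongo shell to each primary and record the member settings you plan to reapply after the restore. A minimal sketch (the fields shown are the common ones to carry over, not an exhaustive list):

rs.conf().members.forEach( function (m) {
  printjson( { "host" : m.host, "priority" : m.priority, "votes" : m.votes, "tags" : m.tags } )
} )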

If you cannot access one or more components of the source sharded cluster, reference any existing internal documentation to reconstruct the configuration requirements for each shard replica set and the config server replica set.

B. Prepare the Target Host for Restoration

Storage Space Requirements
Ensure the target host hardware has sufficient open storage space for the restored data. If the target host contains existing sharded cluster data that you want to keep, ensure that you have enough storage space for both the existing data and the restored data.
LVM Requirements
For LVM snapshots, you must have at least one LVM-managed volume group and a logical volume with enough free space for the extracted snapshot data.
MongoDB Version Requirements

Ensure the target host and source host have the same MongoDB Server version. To check the version of MongoDB available on a host machine, run mongod --version from the terminal or shell.

For complete documentation on installation, see Install MongoDB.

Shut Down Running MongoDB Processes

If restoring to an existing cluster, shut down the mongod or mongos process on the target host.

For hosts running mongos, connect a mongo shell to the mongos and run db.shutdownServer() from the admin database:

use admin
db.shutdownServer()

For hosts running a mongod, connect a mongo shell to the mongod and run db.isMaster():

  • If ismaster is false, the mongod is a secondary member of a replica set. You can shut it down by running db.shutdownServer() from the admin database.

  • If ismaster is true, the mongod is the primary member of a replica set. Shut down the secondary members of the replica set first. Use rs.status() to identify the other members of the replica set.

    The primary automatically steps down after it detects a majority of members are offline. After it steps down (db.isMaster() returns ismaster: false), you can safely shut down the mongod, as sketched below.
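The following is a minimal sketch of that check-and-shutdown sequence, assuming a mongo shell connected to the mongod in question:

var adminDB = db.getSiblingDB("admin")
if ( adminDB.isMaster().ismaster ) {
  // Primary: list the members so you can shut down the secondaries first.
  rs.status().members.forEach( function (m) { print( m.name + " : " + m.stateStr ) } )
} else {
  // Secondary, or a primary that has stepped down: safe to shut down.
  adminDB.shutdownServer()
}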

Prepare Data Directory

Create a directory on the target host for the restored database files. Ensure that the user that runs the mongod has read, write, and execute permissions for all files and subfolders in that directory:

mkdir /path/to/mongodb
chown -R mongodb:mongodb /path/to/mongodb
chmod -R 770 /path/to/mongodb

Substitute /path/to/mongodb with the path to the data directory you created.

Prepare Log Directory

Create a directory on the target host for the mongod log files. Ensure that the user that runs the mongod has read, write, and execute permissions for all files and subfolders in that directory:

mkdir /path/to/mongodb/logs
chown -R mongodb:mongodb /path/to/mongodb/logs
chmod -R 770 /path/to/mongodb/logs

Substitute /path/to/mongodb/logs with the path to the log directory you created.

Create Configuration File

This procedure assumes starting a mongod with a configuration file.

Create the configuration file in your preferred location. Ensure that the user that runs the mongod has read and write permissions on the configuration file:

touch /path/to/mongodb/mongod.conf
chown mongodb:mongodb /path/to/mongodb/mongod.conf
chmod 644 /path/to/mongodb/mongod.conf

Open the configuration file in your preferred text editor and modify it as required by your deployment. Alternatively, if you have access to the original configuration file for the mongod, copy it to your preferred location on the target host.

Important

Validate that your configuration file includes the following settings: storage.dbPath must point to the data directory created in Prepare Data Directory, and systemLog.path must point to the log directory created in Prepare Log Directory.
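The following is a minimal sketch of such a configuration file, assuming the directory paths created above. Adjust net.bindIp and net.port as required by your deployment, and leave the replication and sharding settings commented out until the restore steps below instruct you to enable them:

storage:
  dbPath: /path/to/mongodb
systemLog:
  destination: file
  path: /path/to/mongodb/logs/mongod.log
  logAppend: true
net:
  bindIp: localhost
  port: 27017
#replication:
#  replSetName: myNewCSRSName
#sharding:
#  clusterRole: configsvr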

C. Restore Config Server Replica Set

1

Restore the CSRS primary mongod data files.

Select the option that corresponds to your preferred backup method:

LVM Snapshot

  1. Mount the LVM snapshot on the target host machine. The specific steps for mounting an LVM snapshot depend on your LVM configuration.

    The following example assumes an LVM snapshot created using the Create a Snapshot step in the Back Up and Restore with Filesystem Snapshots procedure.

    lvcreate --size 250GB --name mongod-datafiles-snapshot vg0
    gzip -d -c mongod-datafiles-snapshot.gz | dd of=/dev/vg0/mongod-datafiles-snapshot
    mount /dev/vg0/mongod-datafiles-snapshot /snap/mongodb
    

    This example may not apply to all possible LVM configurations. Refer to the LVM documentation for your system for more complete guidance on LVM restoration.

  2. Copy the mongod data files from the snapshot mount to the data directory created in B. Prepare the Target Host for Restoration:

    cp -a /snap/mongodb/path/to/mongodb /path/to/mongodb
    

    The -a option recursively copies the contents of the source path to the destination path while preserving folder and file permissions.

  3. Comment out or omit the following configuration file settings:

    #replication:
    #  replSetName: myCSRSName
    #sharding:
    #  clusterRole: configsvr
    

    To start the mongod using a configuration file, specify the --config option on the command line with the full path to the configuration file:

    mongod --config /path/to/mongodb/mongod.conf
    

    If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.

    After the mongod starts, connect to it using the mongo shell.

Database Dumps

  1. Comment out or omit the following configuration file settings:

    #replication:
    #  replSetName: myCSRSName
    #sharding:
    #  clusterRole: configsvr
    

    To start the mongod using a configuration file, specify the --config option on the command line with the full path to the configuration file:

    mongod --config /path/to/mongodb/mongod.conf
    

    If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.

  2. Use mongorestore to restore the data captured by mongodump into the mongod. If you ran mongodump with --oplog, you must run mongorestore with --oplogReplay to restore the captured oplog entries.

    The following operation restores a mongodump dump created with the --gzip, --archive, and --oplog options:

    mongorestore --host localhost --port 27017 \
      --oplogReplay \
      --gzip --archive="/path/to/dump.gz"
    

    Add any additional options as required by your deployment. Change the hostname and port based on the configuration of the target mongod.

Other Backup Files

  1. Make the data files stored in your selected backup medium accessible on the host. This may require mounting the backup volume, opening the backup in a software utility, or using another tool to extract the data to disk. Refer to the documentation for your preferred backup tool for instructions on accessing the data contained in the backup.

  2. Copy the mongod data files from the backup data location to the data directory created in B. Prepare the Target Host for Restoration:

    cp -a /backup/mongodb/path/to/mongodb /path/to/mongodb
    

    The -a option recursively copies the contents of the source path to the destination path while preserving folder and file permissions.

  3. Comment out or omit the following configuration file settings:

    #replication:
    #  replSetName: myCSRSName
    #sharding:
    #  clusterRole: configsvr
    

    To start the mongod using a configuration file, specify the --config option on the command line with the full path to the configuration file:

    mongod --config /path/to/mongodb/mongod.conf
    

    If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.

    After the mongod starts, connect to it using the mongo shell.

2

Drop the local database.

Use db.dropDatabase() to drop the local database:

use local
db.dropDatabase()

3

For any planned or completed shard hostname or replica set name changes, update the metadata in config.shards.

You can skip this step if all of the following are true:

  • No shard member host machine hostname has or will change during this procedure.
  • No shard replica set name has or will change during this procedure.

Issue the following find() method on the shards collection in the Config Database. Replace <shardName> with the name of the shard. By default, the shard name is its replica set name. If you added the shard using the addShard command and specified a custom name, you must specify that custom name for <shardName>.

use config
db.shards.find( { "_id" : "<shardName>" } )

This operation returns a document that resembles the following:

{
   "_id" : "shard1",
   "host" : "myShardName/alpha.example.net:27018,beta.example.net:27018,charlie.example.net:27018",
   "state" : 1
}

Important

The _id value must match the shardName value in the _id : "shardIdentity" document on the corresponding shard. When restoring the shards later in this procedure, validate that the _id field in shards matches the shardName value on the shard.
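When you restore the shards later in this procedure, one way to cross-check the two values is the following sketch. Run the first query on the CSRS primary and the second on the corresponding shard primary:

db.getSiblingDB("config").shards.find( {}, { "_id" : 1 } )

db.getSiblingDB("admin").system.version.find(
  { "_id" : "shardIdentity" },
  { "shardName" : 1, "_id" : 0 }
)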

Use the updateOne() method to update the host string to reflect the planned replica set name and hostname list for the shard. For example, the following operation updates the host connection string for the shard with "_id" : "shard1":

db.shards.updateOne(
  { "_id" : "shard1" },
  { $set : { "host" : "myNewShardName/repl1.example.net:27018,repl2.example.net:27018,repl3.example.net:27018" } }
)

Repeat this process until all shard metadata accurately reflects the planned replica set name and hostname list for each shard in the cluster.

Note

If you do not know the shard name, issue the find() method on the shards collection with an empty filter document {}:

use config
db.shards.find({})

Each document in the result set represents one shard in the cluster. For each document, check the host field for a connection string that matches the shard in question, i.e. a matching replica set name and member hostname list. Use the _id of that document in place of <shardName>.

4

Restart the mongod as a new single-node replica set.

Shut down the mongod. Uncomment or add the following configuration file options:

replication
  replSetName: myNewCSRSName
sharding
  clusterRole: configsvr

If you want to change the replica set name, you must update the replSetName field with the new name before proceeding.

Start the mongod with the updated configuration file:

mongod --config /path/to/mongodb/mongod.conf

If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.

After the mongod starts, connect to it using the mongo shell.

5

Initiate the new replica set.

Initiate the replica set using rs.initiate() with the default settings.

rs.initiate()

Once the operation completes, use rs.status() to check that the member has become the primary.
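For example, the following convenience sketch prints the name of the current primary:

rs.status().members.forEach( function (m) {
  if ( m.stateStr === "PRIMARY" ) { print( m.name ) }
} )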

6

Add additional replica set members.

For each replica set member in the CSRS, start the mongod on its host machine. Once you have started up all remaining members of the cluster successfully, connect a mongo shell to the primary replica set member. From the primary, use the rs.add() method to add each member of the replica set, specifying the hostname and port of the member’s mongod process:

rs.add("config2.example.net:27019")
rs.add("config3.example.net:27019")

If you want to add the member with specific replica set member configuration settings, you can pass a document to rs.add() that defines the member hostname and any members[n] settings your deployment requires.

rs.add(
 {
   "host" : "config2.example.net:27019",
   priority: <int>,
   votes: <int>,
   tags: <document>
 }
)

Each new member performs an initial sync to catch up to the primary. Depending on factors such as the amount of data to sync, your network topology and health, and the power of each host machine, initial sync may take an extended period of time to complete.

The replica set may elect a new primary while you add additional members. Use rs.status() to identify which member is the current primary. You can only run rs.add() from the primary.

7

Configure any additional required replication settings.

The rs.reconfig() method updates the replica set configuration based on a configuration document passed in as a parameter. You must run rs.reconfig() against the primary member of the replica set.

Reference the original configuration output of the replica set as identified in step A. Review Replica Set Configurations and apply settings as needed.
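For example, to reapply a member's priority and votes as recorded in step A, you might run the following from the primary. The member index and the values shown are assumptions; substitute the settings from your original configuration:

var cfg = rs.conf()
cfg.members[1].priority = 2   // assumed value recorded from the source cluster
cfg.members[1].votes = 1      // assumed value recorded from the source cluster
rs.reconfig(cfg)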

D. Restore Each Shard Replica Set

1

Restore the shard primary mongod data files.

Select the option that corresponds to your preferred backup method:

LVM Snapshot

  1. Mount the LVM snapshot on the target host machine. The specific steps for mounting an LVM snapshot depend on your LVM configuration.

    The following example assumes an LVM snapshot created using the Create a Snapshot step in the Back Up and Restore with Filesystem Snapshots procedure.

    lvcreate --size 250GB --name mongod-datafiles-snapshot vg0
    gzip -d -c mongod-datafiles-snapshot.gz | dd of=/dev/vg0/mongod-datafiles-snapshot
    mount /dev/vg0/mongod-datafiles-snapshot /snap/mongodb
    

    This example may not apply to all possible LVM configurations. Refer to the LVM documentation for your system for more complete guidance on LVM restoration.

  2. Copy the mongod data files from the snapshot mount to the data directory created in B. Prepare the Target Host for Restoration:

    cp -a /snap/mongodb/path/to/mongodb /path/to/mongodb
    

    The -a option recursively copies the contents of the source path to the destination path while preserving folder and file permissions.

  3. Comment out or omit the following configuration file settings:

    #replication:
    #  replSetName: myShardName
    #sharding:
    #  clusterRole: shardsvr
    

    To start the mongod using a configuration file, specify the --config option on the command line with the full path to the configuration file:

    mongod --config /path/to/mongodb/mongod.conf
    

    If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.

    After the mongod starts, connect to it using the mongo shell.

Database Dumps

  1. Comment out or omit the following configuration file settings:

    #replication:
    #  replSetName: myShardName
    #sharding:
    #  clusterRole: shardsvr
    

    To start the mongod using a configuration file, specify the --config option on the command line with the full path to the configuration file:

    mongod --config /path/to/mongodb/mongod.conf
    

    If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.

  2. Use mongorestore to restore the data captured by mongodump into the mongod. If you ran mongodump with --oplog, you must run mongorestore with --oplogReplay to restore the captured oplog entries.

    The following operation restores a mongodump dump created with the --gzip, --archive, and --oplog options:

    mongorestore --host localhost --port 27017 \
      --oplogReplay \
      --gzip --archive="/path/to/dump.gz"
    

    Add any additional options as required by your deployment. Change the hostname and port based on the configuration of the target mongod.

Other Backup Files

  1. Make the data files stored in your selected backup medium accessible on the host. This may require mounting the backup volume, opening the backup in a software utility, or using another tool to extract the data to disk. Refer to the documentation for your preferred backup tool for instructions on accessing the data contained in the backup.

  2. Copy the mongod data files from the backup data location to the data directory created in B. Prepare the Target Host for Restoration:

    cp -a /backup/mongodb/path/to/mongodb /path/to/mongodb
    

    The -a option recursively copies the contents of the source path to the destination path while preserving folder and file permissions.

  3. Comment out or omit the following configuration file settings:

    #replication:
    #  replSetName: myShardName
    #sharding:
    #  clusterRole: shardsvr
    

    To start the mongod using a configuration file, specify the --config option on the command line with the full path to the configuration file:

    mongod --config /path/to/mongodb/mongod.conf
    

    If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.

    After the mongod starts, connect to it using the mongo shell.

2

Create a temporary user with the __system role.

During this procedure you will modify documents in the admin.system.version collection. For clusters enforcing authentication, only the __system role grants permission to modify this collection. You can skip this step if the cluster does not enforce authentication.

Warning

The __system role entitles its holder to take any action against any object in the database. This procedure includes instructions for removing the user created in this step. Do not keep this user active beyond the scope of this procedure.

  1. Authenticate as a user with the userAdmin role on the admin database or userAdminAnyDatabase role:

    use admin
    db.auth("myUserAdmin","mySecurePassword")
    
  2. Create a user with the __system role:

    db.createUser(
      {
        user: "mySystemUser",
        pwd: "<replaceMeWithAStrongPassword>",
        roles: [ "__system" ]
      }
    )
    

    Passwords should be random, long, and complex to ensure system security and to prevent or delay malicious access.

  3. Authenticate as the privileged user:

    db.auth("mySystemUser","<replaceMeWithAStrongPassword>")
    
3

Drop the local database.

Use db.dropDatabase() to drop the local database:

use local
db.dropDatabase()

4

Optional: For any CSRS hostname or replica set name changes, update shard metadata in each shard’s identity document.

You can skip this step if all of the following are true:

  • The hostnames for any CSRS host did not change during this procedure.
  • The CSRS replica set name did not change during this procedure.

The system.version collection on the admin database contains metadata related to the shard, including the CSRS connection string. If either the CSRS name or any member hostnames changed while restoring the CSRS, you must update this metadata.

Issue the following find() method on the system.version collection in the admin database:

use admin
db.system.version.find( {"_id" : "shardIdentity" } )

The find() method returns a document that resembles the following:

{
  "_id" : "shardIdentity",
  "clusterId" : ObjectId("2bba123c6eeedcd192b19024"),
  "shardName" : "myNewShardName",
  "configsvrConnectionString" : "myCSRSName/alpha.example.net:27019,beta.example.net:27019,charlie.example.net:27019" }

If the shard does not have a shardIdentity document, skip to the next step.

The following updateOne() method updates the document so that the configsvrConnectionString value reflects the most current CSRS connection string:

db.system.version.updateOne(
  { "_id" : "shardIdentity" },
  { $set :
    { "configsvrConnectionString" : "myNewCSRSName/config1.example.net:27019,config2.example.net:27019,config3.example.net:27019"}
  }
)

Important

The shardName value must match the _id value in the shards collection on the CSRS. Validate that the metadata on the CSRS matches the metadata for the shard. Refer to step 3 in the C. Restore Config Server Replica Set portion of this procedure for instructions on viewing the CSRS metadata.

5

Delete or update the minOpTimeRecovery document in the admin.system.version collection.

If the shard has a shardIdentity document:

Issue the following deleteOne() method on the system.version collection in the admin database:

use admin
db.system.version.deleteOne( { _id: "minOpTimeRecovery" } )

If the shard does not have a shardIdentity document:

Issue the following find() method on the system.version collection in the admin database.

use admin
db.system.version.find( {"_id" : "minOpTimeRecovery" } )

The find() method returns a document that resembles the following:

{
  "_id" : "minOpTimeRecovery",
  "configsvrConnectionString" : "myCSRSName/alpha.example.net:27019,beta.example.net:27019,charlie.example.net:27019",
  "minOpTime" : {
    "ts" : Timestamp(1554740782, 2),
    "t" : NumberLong(1)
  },
  "minOpTimeUpdaters" : 1,
  "shardName" : "myNewShardName"
}

The following updateOne() method updates the document such that:

  • The configsvrConnectionString string represents the most current CSRS connection string
  • The minOpTime.ts timestamp is zeroed
  • The minOpTimeUpdaters field is zeroed

use admin
db.system.version.updateOne(
  { "_id" : "minOpTimeRecovery" },
  {
    $set : {
      "configsvrConnectionString" : "myNewCSRSName/config1.example.net:27019,config2.example.net:27019,config3.example.net:27019",
      "minOpTime" : { "ts" : new Timestamp(), "t" : new NumberLong(0)},
      "minOpTimeUpdaters" : 0
    }
  }
)

6

Restart the mongod as a new single-node replica set.

Shut down the mongod. Uncomment or add the following configuration file options:

replication
  replSetName: myNewShardName
sharding
  clusterRole: shardsvr

If you want to change the replica set name, you must update the replSetName field with the new name before proceeding.

Start the mongod with the updated configuration file:

mongod --config /path/to/mongodb/mongod.conf

If you have mongod configured to run as a system service, start it using the recommended process for your system service manager.

After the mongod starts, connect to it using the mongo shell.

7

Initiate the new replica set.

Initiate the replica set using rs.initiate() with the default settings.

rs.initiate()

Once the operation completes, use rs.status() to check that the member has become the primary.

8

Add additional replica set members.

For each replica set member in the shard replica set, start the mongod on its host machine. Once you have started up all remaining members of the cluster successfully, connect a mongo shell to the primary replica set member. From the primary, use the rs.add() method to add each member of the replica set, specifying the hostname and port of the member’s mongod process:

rs.add("repl2.example.net:27018")
rs.add("repl3.example.net:27018")

If you want to add the member with specific replica set member configuration settings, you can pass a document to rs.add() that defines the member hostname and any members[n] settings your deployment requires.

rs.add(
 {
   "host" : "repl2.example.net:27018",
   priority: <int>,
   votes: <int>,
   tags: <document>
 }
)

Each new member performs an initial sync to catch up to the primary. Depending on factors such as the amount of data to sync, your network topology and health, and the power of each host machine, initial sync may take an extended period of time to complete.

The replica set may elect a new primary while you add additional members. Use rs.status() to identify which member is the current primary. You can only run rs.add() from the primary.

9

Configure any additional required replication settings.

The rs.reconfig() method updates the replica set configuration based on a configuration document passed in as a parameter. You must run rs.reconfig() against the primary member of the replica set.

Reference the original configuration output of the replica set as identified in step A. Review Replica Set Configurations and apply settings as needed.

10

Remove the temporary privileged user.

For clusters enforcing authentication, remove the privileged user created earlier in this procedure:

  1. Authenticate as a user with the userAdmin role on the admin database or userAdminAnyDatabase role:

    use admin
    db.auth("myUserAdmin","mySecurePassword")
    
  2. Delete the privileged user:

    db.dropUser("mySystemUser")
    

E. Restart Each mongos

Restart each mongos in the cluster.

mongos --config /path/to/config/mongos.conf

Include all other command line options as required by your deployment.

If the CSRS replica set name or any member hostname changed, update the sharding.configDB setting in each mongos configuration file with the updated config server connection string:

sharding:
  configDB: "myNewCSRSName/config1.example.net:27019,config2.example.net:27019,config3.example.net:27019"

F. Validate Cluster Accessibility

Connect a mongo shell to one of the mongos processes for the cluster. Use sh.status() to check the overall cluster status. If sh.status() indicates that the balancer is not running, use sh.startBalancer() to restart the balancer.

To confirm that all shards are accessible and communicating, insert test data into a temporary sharded collection. Confirm that data is being split and migrated between each shard in your cluster. You can connect a mongo shell to each shard primary and use db.collection.find() to validate that the data was sharded as expected.
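The following is a minimal sketch of such a test, run from a mongo shell connected to a mongos. The test.restoreCheck namespace is a hypothetical placeholder; use any throwaway collection:

sh.enableSharding("test")
sh.shardCollection("test.restoreCheck", { "_id" : "hashed" } )

var coll = db.getSiblingDB("test").restoreCheck
for ( var i = 0; i < 1000; i++ ) { coll.insert( { "i" : i } ) }

coll.getShardDistribution()   // prints the document and data distribution per shard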