Use mongosync on Sharded Clusters

On this page

  • Configure a Single mongosync Instance
  • Configure Multiple mongosync Instances
  • Start Multiple mongosync Instances
  • Check Progress
  • Pause a mongosync Instance
  • Resume Synchronization
  • Commit Synchronization From Multiple mongosync Instances
  • Reverse the Synchronization Direction

There are two ways to synchronize sharded clusters. You can use either a single mongosync instance or several mongosync instances. For best performance with large or heavily loaded clusters, use multiple mongosync instances, one for each shard in the cluster.

Warning

Syncing large documents to a sharded destination cluster can cause the destination cluster to initiate chunk migration. When the destination cluster migrates chunks during sync, it may trigger a bug that can result in data loss.

To avoid this, it is recommended that you run the balancerStop command on the destination cluster before starting a sync. Once mongosync completes the sync, you can restart the balancer with the balancerStart command.

To configure a single mongosync, follow the connection instructions for your cluster architecture to connect to the mongos instance in your cluster.

When you connect a single mongosync to a sharded cluster, do not use the replicaSet option or the id option.

The rest of this page addresses cluster-to-cluster synchronization using multiple mongosync instances.

To configure multiple mongosync instances:

  1. Verify cluster configuration

  2. Stop the balancer on the destination cluster

  3. Determine the shard IDs

  4. Connect the instances

1

Starting in mongosync 1.1, you can sync between clusters with different numbers of shards. However, if you want to reverse the sync, the source cluster and destination cluster must have the same number of shards.

2

To stop the balancer on the destination cluster, connect to the destination cluster and call the sh.stopBalancer() method:

sh.stopBalancer()

3

To get the shard IDs, connect to the source cluster mongos and run the listShards command.

db.adminCommand( { listShards: 1 } )

The information is in the shards array.

shards: [
  {
    _id: 'shard01',
    host: 'shard01/localhost:27501,localhost:27502,localhost:27503',
    state: 1,
    topologyTime: Timestamp({ t: 1656612236, i: 2 })
  },
  {
    _id: 'shard02',
    host: 'shard02/localhost:27504,localhost:27505,localhost:27506',
    state: 1,
    topologyTime: Timestamp({ t: 1656612240, i: 4 })
  }
]

4

These instructions use a generic connection string. To modify the connection string for your cluster architecture, refer to the architecture specific connection details.

Tip

A single host server can run multiple mongosync instances. To improve performance, run mongosync on multiple host servers.

Run the first mongosync instance:

mongosync \
--cluster0 "mongodb://user:password@cluster0host:27500" \
--cluster1 "mongodb://user:password@cluster1host:27500" \
--id shard01 --port 27601

When running multiple mongosync instances, the number of instances must equal the number of shards. Each mongosync instance must be started with the --id option or id setting to specify the shard it replicates.

Run a new mongosync instance for each shard in the source cluster. Edit the --id and --port fields for each additional mongosync instance.

mongosync \
--cluster0 "mongodb://user:password@cluster0host:27500" \
--cluster1 "mongodb://user:password@cluster1host:27500" \
--id shard02 --port 27602

The connection strings for the --cluster0 and --cluster1 options should point to mongos instances. In the example, both mongosync instances connect through the same mongos instance in each cluster.

Each mongosync instance:

  • Connects to mongos instances in the source cluster.

  • Connects to mongos instances in the destination cluster.

  • Replicates a single shard from the source cluster, identified by the --id option.

  • Specifies a unique port to use during synchronization. Consider designating a range of ports to simplify scripting Cluster-to-Cluster Sync operations.
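Because each instance differs only in its --id and --port values, the launch commands can be generated from the list of shard IDs. The following is a minimal sketch, not part of mongosync itself: the function name print_mongosync_cmds, the two URI variables, and the base port 27600 are assumptions chosen to match the examples on this page. It only prints the commands so you can review them before running anything.

```shell
# Hypothetical helper: print one mongosync launch command per shard ID.
# $1 is a base port; each shard gets base port + 1, base port + 2, and so on.
print_mongosync_cmds() {
  local base_port=$1; shift
  local i=1 shard
  for shard in "$@"; do
    printf 'mongosync --cluster0 "%s" --cluster1 "%s" --id %s --port %d\n' \
      "$CLUSTER0_URI" "$CLUSTER1_URI" "$shard" "$((base_port + i))"
    i=$((i + 1))
  done
}

# Generic connection strings from the examples above (assumed values).
CLUSTER0_URI="mongodb://user:password@cluster0host:27500"
CLUSTER1_URI="mongodb://user:password@cluster1host:27500"

# Shard IDs as reported in the shards array of listShards.
print_mongosync_cmds 27600 shard01 shard02
```

Pass every shard ID from the source cluster so that the number of generated commands equals the number of shards.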

Use curl or another HTTP client to issue the start command to each of the mongosync instances.

curl mongosync01Host:27601/api/v1/start -XPOST --data \
'{ "source": "cluster0", "destination": "cluster1",
"reversible": false, "enableUserWriteBlocking": false }'
curl mongosync02Host:27602/api/v1/start -XPOST --data \
'{ "source": "cluster0", "destination": "cluster1",
"reversible": false, "enableUserWriteBlocking": false }'

The start command options must be the same for all of the mongosync instances.
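One way to keep the options identical is to define the request body once and send it to every instance in a loop. This is a sketch under stated assumptions: the INSTANCES list, the start_all function, and the DRY_RUN switch are hypothetical names, and the host names and ports are the ones used in the examples on this page.

```shell
# Hypothetical list of mongosync API endpoints, one host:port per shard.
INSTANCES="mongosync01Host:27601 mongosync02Host:27602"

# A single shared payload, so every instance receives the same start options.
START_PAYLOAD='{ "source": "cluster0", "destination": "cluster1", "reversible": false, "enableUserWriteBlocking": false }'

# Send the start command to each instance in turn.
# With DRY_RUN=1, print the curl commands instead of running them.
start_all() {
  local instance
  for instance in $INSTANCES; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      printf "curl -sS -XPOST %s/api/v1/start --data '%s'\n" "$instance" "$START_PAYLOAD"
    else
      curl -sS -XPOST "$instance/api/v1/start" --data "$START_PAYLOAD" || return 1
      echo
    fi
  done
}

# Usage: preview with "DRY_RUN=1 start_all", then run "start_all".
```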

The mongosync instances do not aggregate progress information across shards. To review synchronization progress for a particular shard, use curl or another HTTP client to issue the progress command to the mongosync instance syncing that shard.

curl mongosync02Host:27602/api/v1/progress -XGET

This command checks the progress of the mongosync instance that is running on mongosync02Host and using port 27602 for synchronization. To check progress on other shards, update the host and port number, then repeat the API call to each mongosync instance.
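Since progress is not aggregated, a small loop can collect one summary line per instance. The sketch below is illustrative only: INSTANCES, json_field, and progress_all are hypothetical names, and the text-based extraction assumes that the response contains the canCommit and lagTimeSeconds fields as plain "name": value pairs. A dedicated JSON parser would be more robust if one is available on your hosts.

```shell
# Hypothetical list of mongosync API endpoints, one host:port per shard.
INSTANCES="mongosync01Host:27601 mongosync02Host:27602"

# Print the first unquoted value of field $1 from JSON text on stdin.
# Simple pattern match, not a JSON parser: suitable for booleans and numbers.
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*\([A-Za-z0-9.]*\).*/\1/p' | head -n 1
}

# Query every instance and print one summary line per shard.
progress_all() {
  local instance body
  for instance in $INSTANCES; do
    body=$(curl -sS -XGET "$instance/api/v1/progress") || body=""
    printf '%s canCommit=%s lagTimeSeconds=%s\n' "$instance" \
      "$(printf '%s' "$body" | json_field canCommit)" \
      "$(printf '%s' "$body" | json_field lagTimeSeconds)"
  done
}

# Usage: progress_all
```

An empty value in the summary means the instance did not respond or the field was not found in its response.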

The pause command will temporarily halt the synchronization process on a single shard. It does not pause any other mongosync instances that may be running. Use curl or another HTTP client to issue the pause command to a mongosync instance.

curl mongosync01Host:27601/api/v1/pause -XPOST --data '{}'

This command pauses the mongosync instance that is running on mongosync01Host and using port 27601 for synchronization. To pause synchronization on other shards, update the host and port number, then repeat the API call to each mongosync instance.

If one or more mongosync instances are paused, you can use the resume command to resume syncing. Run a separate resume command against each paused mongosync instance to continue syncing.

Use curl or another HTTP client to issue the resume command to each mongosync instance.

curl mongosync01Host:27601/api/v1/resume -XPOST --data '{}'

This command resumes synchronization on the mongosync instance that is running on mongosync01Host and using port 27601. To resume synchronization on other shards, update the host and port number, then repeat the API call to each mongosync instance.

When you want to complete synchronization, issue the progress command and check the values for canCommit and lagTimeSeconds.

To minimize write blocking on the source cluster, you should only run the commit command when the lagTimeSeconds value is small enough for your application.

If the lagTimeSeconds value is small enough, and canCommit is true, issue the commit command to commit synchronization. Repeat the process on all of the mongosync instances.

The commit operation is blocking. The commit command will not return until commit has been called on every mongosync instance.

# Check progress
curl mongosync01Host:27601/api/v1/progress -XGET
# Commit
curl mongosync01Host:27601/api/v1/commit -XPOST --data '{}'

These commands only check progress and commit synchronization for the mongosync instance that is running on mongosync01Host and using port 27601. To synchronize all of the shards, make additional calls to progress and commit on any other mongosync instances that may be running.
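Because each commit call does not return until commit has been called on every instance, a script that issues the calls one after another would wait on the first call indefinitely. The following sketch sends the requests in the background and then waits for all of them. The names INSTANCES, commit_all, and the CURL override are hypothetical; the override exists only so the command can be swapped out for a preview. Check canCommit and lagTimeSeconds on every instance before running it.

```shell
# Hypothetical list of mongosync API endpoints, one host:port per shard.
INSTANCES="mongosync01Host:27601 mongosync02Host:27602"

# Issue commit to every instance concurrently, then wait for all requests.
# Set CURL=echo to print the request arguments instead of sending them.
commit_all() {
  local instance pid pids="" status=0
  for instance in $INSTANCES; do
    # Run in the background: the call blocks until all instances receive commit.
    "${CURL:-curl}" -sS -XPOST "$instance/api/v1/commit" --data '{}' &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || status=1
  done
  return "$status"
}

# Usage: commit_all
```

The function returns a non-zero status if any request fails, so a calling script can stop before restarting the balancer.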

Once this is done, use sh.startBalancer() to restart the balancer on the destination cluster:

sh.startBalancer()

To reverse synchronization so that the original destination cluster acts as the source cluster:

  • If you have not already done so, issue the commit command to each mongosync instance and wait until all of the commits finish.

  • Issue the reverse command to each mongosync instance.

The reverse operation is blocking. The reverse command will not return until reverse has been called on every mongosync instance.

curl mongosync01Host:27601/api/v1/reverse -XPOST --data '{}'

This command reverses synchronization on the mongosync instance that is running on mongosync01Host and using port 27601. Make additional calls to reverse on any other mongosync instances that may be running.

Note

Reverse synchronization is only possible if reversible and enableUserWriteBlocking are both set to true when the start API initiates mongosync.

© 2023 MongoDB, Inc.