There are two ways to synchronize sharded clusters. You can use either one mongosync or several
mongosync instances. For best performance with large or heavily
loaded clusters, use one mongosync instance for each shard on the
source cluster.
Important
When the source or destination cluster is a sharded cluster, you must stop
the balancer on both clusters and not run the moveChunk or
moveRange commands for the duration of the migration. To stop
the balancer, run the balancerStop command and wait for the
command to complete.
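As a minimal sketch of the step above, the following shell helper runs balancerStop against a mongos on each cluster through mongosh. The helper name stop_balancer is hypothetical, and the connection strings in the usage comment are the placeholder values used elsewhere on this page.

```sh
# stop_balancer URI...
# Runs balancerStop against each mongos URI in turn and stops at the
# first failure. (Hypothetical helper, not part of mongosync.)
stop_balancer() {
  for uri in "$@"; do
    mongosh "$uri" --quiet --eval 'db.adminCommand({ balancerStop: 1 })' || return 1
  done
}

# Example (placeholder hosts and credentials):
# stop_balancer "mongodb://user:password@cluster0host:27500" \
#               "mongodb://user:password@cluster1host:27500"
```

Running the helper against both clusters before you start any mongosync instance keeps the two balancers in the same state for the whole migration.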
Configure a Single mongosync Instance
To configure a single mongosync, follow the connection
instructions for your cluster architecture to
connect to the mongos instance in your cluster.
When you connect a single mongosync to a sharded cluster, do not use
the replicaSet option or the id
option.
The rest of this page addresses cluster to cluster synchronization
using multiple mongosync instances.
Configure Multiple mongosync Instances
The number of mongosync instances must match the number of shards on
the source cluster. You must use the same version of mongosync
across all instances. For a replica set source, you can only use one
mongosync instance.
When you configure multiple mongosync instances to sync between
sharded clusters, you must send identical API endpoint commands to each
mongosync instance.
To configure multiple mongosync instances:
Determine the shard IDs
To get the shard IDs, connect to the source cluster
mongos and run the listShards command.
db.adminCommand( { listShards: 1 } )
The information is in the shards array.
shards: [
  {
    _id: 'shard01',
    host: 'shard01/localhost:27501,localhost:27502,localhost:27503',
    state: 1,
    topologyTime: Timestamp({ t: 1656612236, i: 2 })
  },
  {
    _id: 'shard02',
    host: 'shard02/localhost:27504,localhost:27505,localhost:27506',
    state: 1,
    topologyTime: Timestamp({ t: 1656612240, i: 4 })
  }
]
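For scripting, the shard IDs can also be pulled out non-interactively. The sketch below assumes mongosh is installed on the machine that runs it; the helper name list_shard_ids is hypothetical.

```sh
# list_shard_ids MONGOS_URI
# Prints one shard _id per line, suitable for use as mongosync --id values.
# (Hypothetical helper; it only wraps the listShards command shown above.)
list_shard_ids() {
  mongosh "$1" --quiet --eval \
    'db.adminCommand({ listShards: 1 }).shards.forEach(s => print(s._id))'
}

# Example (placeholder host and credentials):
# list_shard_ids "mongodb://user:password@cluster0host:27500"
```

For the example cluster above, this should print shard01 and shard02 on separate lines.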
Connect the mongosync instances
These instructions use a generic connection string. To modify the connection string for your cluster architecture, refer to the architecture-specific connection details.
Tip
A single host server can run multiple mongosync instances. To
improve performance, run mongosync on multiple host servers.
Run the first mongosync instance:
mongosync \
  --cluster0 "mongodb://user:password@cluster0host:27500" \
  --cluster1 "mongodb://user:password@cluster1host:27500" \
  --id shard01 --port 27601
When running multiple mongosync instances, the number of instances
must equal the number of shards. Each mongosync instance must be
started with the --id option or id setting to
specify the shard it replicates.
Run a new mongosync instance for each shard in the source cluster.
Edit the --id and --port fields for each additional
mongosync instance.
mongosync \
  --cluster0 "mongodb://user:password@cluster0host:27500" \
  --cluster1 "mongodb://user:password@cluster1host:27500" \
  --id shard02 --port 27602
The connection strings for the --cluster0 and
--cluster1 options should point to mongos instances.
In the example, both mongosync instances connect to the same mongos instance on each cluster.
Each mongosync instance:
- Connects to mongos instances in the source cluster.
- Connects to mongos instances in the destination cluster.
- Replicates a single shard from the source cluster, identified by the --id option.
- Specifies a unique port to use during synchronization. Consider designating a range of ports to simplify scripting mongosync operations.
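The per-shard launch described above can be scripted with a contiguous port range. This is a sketch only: the helper name launch_per_shard and the base port are assumptions, and it starts every instance in the background on the local host, which may not suit a production layout.

```sh
# launch_per_shard BASE_PORT SRC_URI DST_URI SHARD_ID...
# Starts one background mongosync per shard ID. The first instance listens
# on BASE_PORT + 1, the second on BASE_PORT + 2, and so on.
# (Hypothetical helper; uses only the options shown on this page.)
launch_per_shard() {
  port=$1
  src=$2
  dst=$3
  shift 3
  for id in "$@"; do
    port=$((port + 1))
    mongosync --cluster0 "$src" --cluster1 "$dst" --id "$id" --port "$port" &
  done
}

# Example reproducing the two commands above (ports 27601 and 27602):
# launch_per_shard 27600 \
#   "mongodb://user:password@cluster0host:27500" \
#   "mongodb://user:password@cluster1host:27500" \
#   shard01 shard02
```

Deriving each port from the shard's position in the argument list keeps the shard-to-port mapping predictable for later API calls.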
Start Multiple mongosync Instances
Use curl or another HTTP client to issue the start command to each of the mongosync instances.
curl mongosync01Host:27601/api/v1/start -XPOST --data \
  '{ "source": "cluster0", "destination": "cluster1",
     "reversible": false, "enableUserWriteBlocking": false }'

curl mongosync02Host:27602/api/v1/start -XPOST --data \
  '{ "source": "cluster0", "destination": "cluster1",
     "reversible": false, "enableUserWriteBlocking": false }'
The start command options must be the same for all of the mongosync
instances.
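One way to guarantee that every instance receives the same options is to keep the request body in a single variable and loop over the endpoints. The sketch below assumes the two example hosts from this page; START_BODY and start_all are hypothetical names.

```sh
# The request body is defined once so that every instance receives
# exactly the same options.
START_BODY='{ "source": "cluster0", "destination": "cluster1",
  "reversible": false, "enableUserWriteBlocking": false }'

# start_all ENDPOINT...
# Posts START_BODY to the start endpoint of every mongosync instance,
# where each ENDPOINT has the form host:port. (Hypothetical helper.)
start_all() {
  for endpoint in "$@"; do
    curl -sS -XPOST --data "$START_BODY" "$endpoint/api/v1/start" || return 1
    echo
  done
}

# Example:
# start_all mongosync01Host:27601 mongosync02Host:27602
```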
Check Progress
To review synchronization progress for a particular
shard, use curl or another HTTP client to issue the
progress command to the mongosync
instance syncing that shard.
curl mongosync02Host:27602/api/v1/progress -XGET
This command checks the progress of the mongosync instance that is
running on mongosync02Host and using port 27602 for
synchronization. To check progress on other shards, update the host and
port number, then repeat the API call for each mongosync instance.
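To avoid editing the host and port by hand, the progress call can be wrapped in a loop. This is a minimal sketch; progress_all is a hypothetical helper name and the endpoints in the usage comment are the example hosts from this page.

```sh
# progress_all ENDPOINT...
# Prints each endpoint (host:port) followed by the raw progress response
# returned by that mongosync instance. (Hypothetical helper.)
progress_all() {
  for endpoint in "$@"; do
    printf '%s: ' "$endpoint"
    curl -sS -XGET "$endpoint/api/v1/progress"
    echo
  done
}

# Example:
# progress_all mongosync01Host:27601 mongosync02Host:27602
```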
Pause a mongosync Instance
The pause command will temporarily halt the
synchronization process on a single shard. It does not pause any other
mongosync instances that may be running. Use curl or another
HTTP client to issue the pause command to a mongosync instance.
curl mongosync01Host:27601/api/v1/pause -XPOST --data '{}'
This command pauses the mongosync instance that is running on
mongosync01Host and using port 27601 for synchronization. To
pause synchronization on other shards, update the host and port number,
then repeat the API call for each mongosync instance.
Resume Synchronization
If one or more mongosync instances are paused, you can use the
resume command to resume syncing. Run a
separate resume command against each paused mongosync instance
to continue syncing.
Use curl or another HTTP client to issue the resume command to each mongosync instance.
curl mongosync01Host:27601/api/v1/resume -XPOST --data '{}'
This command resumes synchronization on the mongosync instance that
is running on mongosync01Host and using port 27601. To
resume synchronization on other shards, update the host and port number,
then repeat the API call for each mongosync instance.
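Because pause and resume both take an empty JSON body, a single loop can cover either command. The sketch below is one possible shape; post_all is a hypothetical helper and the endpoints in the usage comments are the example hosts from this page.

```sh
# post_all ACTION ENDPOINT...
# Sends an empty-body POST for ACTION (for example pause or resume) to
# every listed mongosync endpoint. (Hypothetical helper.)
post_all() {
  action=$1
  shift
  for endpoint in "$@"; do
    curl -sS -XPOST --data '{}' "$endpoint/api/v1/$action" || return 1
    echo
  done
}

# Examples:
# post_all pause  mongosync01Host:27601 mongosync02Host:27602
# post_all resume mongosync01Host:27601 mongosync02Host:27602
```

When resuming, pass only the endpoints of instances that are actually paused.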
Commit Synchronization From Multiple mongosync Instances
When you want to complete synchronization, issue the progress command and check the values for canCommit
and lagTimeSeconds.
To minimize write blocking on the source cluster, you should only run
the commit command when the lagTimeSeconds value is small enough for your application.
If the lagTimeSeconds value is small enough, and canCommit is
true, issue the commit command to commit
synchronization. Repeat the process on all of the mongosync
instances.
The commit operation is blocking. The commit command will not
return until commit has been called on every mongosync
instance.
# Check progress
curl mongosync01Host:27601/api/v1/progress -XGET

# Commit
curl mongosync01Host:27601/api/v1/commit -XPOST --data '{}'
These commands only check progress and commit synchronization for the
mongosync instance that is running on mongosync01Host and using
port 27601. To synchronize all of the shards, make additional calls
to progress and commit on any other mongosync instances
that may be running.
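Because the commit call is described above as blocking until every instance has received it, a script that commits the instances one after another could stall on the first call. The sketch below checks every instance first and then sends the commit requests in parallel. The helper names and the MAX_LAG threshold are hypothetical, and the readiness check is a crude text match on the canCommit and lagTimeSeconds fields rather than a real JSON parser.

```sh
# can_commit_now MAX_LAG
# Reads one progress response on stdin. Succeeds only if canCommit is true
# and lagTimeSeconds is at most MAX_LAG. Crude text matching, not JSON parsing.
can_commit_now() {
  body=$(tr -d ' \t\n')
  case $body in
    *'"canCommit":true'*) ;;
    *) return 1 ;;
  esac
  lag=$(printf '%s\n' "$body" |
    sed -n 's/.*"lagTimeSeconds":\([0-9][0-9]*\).*/\1/p')
  [ -n "$lag" ] && [ "$lag" -le "$1" ]
}

# commit_all MAX_LAG ENDPOINT...
# Verifies that every instance is ready, then issues all commit requests in
# the background so that one blocking call does not hold up the others.
commit_all() {
  max_lag=$1
  shift
  for endpoint in "$@"; do
    curl -sS -XGET "$endpoint/api/v1/progress" | can_commit_now "$max_lag" || {
      echo "not ready: $endpoint" >&2
      return 1
    }
  done
  for endpoint in "$@"; do
    curl -sS -XPOST --data '{}' "$endpoint/api/v1/commit" &
  done
  wait
}

# Example: commit only if every instance reports a lag of 5 seconds or less.
# commit_all 5 mongosync01Host:27601 mongosync02Host:27602
```

Checking all instances before committing any of them avoids a state where only some shards have been committed.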
Data Verification
Before transferring your application load from the source cluster to the destination, check your data to ensure that the sync was successful.
For more information, see Verify Data Transfer.
Reverse the Synchronization Direction
Note
For an in-depth tutorial on reversing your synchronization direction, see Reverse Sync Direction.
To reverse synchronization so that the original destination cluster acts as the source cluster:
1. If you have not already done so, issue the commit command to each mongosync instance and wait until all of the commits finish. To check whether the sync process has been committed, issue the progress command to all mongosync instances and confirm that each response's state field contains the value COMMITTED.
2. Issue the reverse command to each mongosync instance.
The reverse operation is blocking. The reverse command will not
return until reverse has been called on every mongosync
instance.
curl mongosync01Host:27601/api/v1/reverse -XPOST --data '{}'
This command reverses synchronization on the mongosync
instance that is running on mongosync01Host and using port
27601. Make additional calls to reverse on any other
mongosync instances that may be running.
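The two steps above can be combined in a script that confirms the COMMITTED state on every instance before reversing any of them. This is a sketch under assumptions: the helper names are hypothetical, the state check is a plain text match on the progress response, and the reverse requests run in parallel because the call is described above as blocking.

```sh
# is_committed
# Reads one progress response on stdin and succeeds if its state field is
# COMMITTED. Crude text matching, not JSON parsing.
is_committed() {
  tr -d ' \t\n' | grep -q '"state":"COMMITTED"'
}

# reverse_all ENDPOINT...
# Refuses to proceed unless every instance reports COMMITTED, then issues
# all reverse requests in the background and waits for them to return.
reverse_all() {
  for endpoint in "$@"; do
    curl -sS -XGET "$endpoint/api/v1/progress" | is_committed || {
      echo "not committed: $endpoint" >&2
      return 1
    }
  done
  for endpoint in "$@"; do
    curl -sS -XPOST --data '{}' "$endpoint/api/v1/reverse" &
  done
  wait
}

# Example:
# reverse_all mongosync01Host:27601 mongosync02Host:27602
```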
Note
Reverse synchronization is only possible if reversible and
enableUserWriteBlocking are both set to true when the
start API initiates mongosync.