Cluster-to-Cluster Sync Quickstart
Overview
MongoDB Cluster-to-Cluster Sync connects MongoDB clusters and provides a way to synchronize data between them. The tool that makes the connection is mongosync. For more details, please see the rest of the Cluster-to-Cluster Sync documentation.
mongosync syncs data between two clusters.
Each cluster can be a replica set or a sharded cluster. If either is a sharded cluster, consult the sharded cluster limitations for mongosync.
Sharded clusters are not required to have the same number of shards.
The destination cluster must be the same version or up to two versions ahead of the source cluster. The patch version is not important, so long as they meet the minimum patch version requirements.
Follow the instructions below to set up Cluster-to-Cluster Sync, connect your clusters, and synchronize your data.
Setup
Define a source and a destination cluster
If you already have a MongoDB cluster, either self-managed or hosted in MongoDB Atlas, use that cluster as the source cluster. If you don't have a cluster to work with, you will need to create one.
This Quickstart works when the destination cluster and the source cluster are both replica sets.
To sync from a replica set to a sharded cluster, see Sync a Replica Set to a Sharded Cluster. To sync between sharded clusters, see Sync Sharded Clusters.
Tip
See also:
You can migrate between clusters on the same or different MongoDB versions. For more information on MongoDB server version compatibility or cross-version migrations, see MongoDB Server Version Compatibility.
The number of nodes in the destination replica set does not have to equal the number of nodes in the source replica set.
If your clusters are self-managed, they must be MongoDB Enterprise clusters. Cluster-to-Cluster Sync is only supported on MongoDB Community Edition in a limited number of cases. For more information on using Cluster-to-Cluster Sync with MongoDB Community Edition, contact a MongoDB sales representative.
Creating a cluster is beyond the scope of this guide. If you need help, refer to the documentation to create an Atlas cluster or to create a self-managed cluster.
Define administrative users
If either cluster is hosted in Atlas, or if either of them requires authentication, you must create a database user that has permissions in both clusters.
Source Cluster Authentication Requirements
The source user must have the following roles:
readAnyDatabase role
clusterMonitor role
backup role
In addition, the source user must be able to:
Run the getParameter command
If the source cluster is hosted in Atlas, the user must have the Atlas admin role. The user must also be able to read the change stream for the cluster.
Destination Cluster Authentication Requirements
If the destination cluster is hosted in Atlas, the user must have the Atlas admin role.
Cluster Authentication Notes
To add an Atlas user, see: Configure Database Users.
To add a user to a self-managed cluster, see: Create a User on Self-Managed Deployments.
To verify user permissions, run db.getUser().
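As a rough illustration of the check, the sketch below compares the roles listed in a db.getUser() result against the three source roles named above. It assumes the result has been copied into a Python dictionary with the usual roles array of role/db pairs. The example_user document and the helper name missing_source_roles are hypothetical. It only looks at directly granted roles, so roles inherited through custom roles or the Atlas admin role would not show up here.

```python
# Roles this guide lists for the source user on a self-managed cluster.
REQUIRED_SOURCE_ROLES = {"readAnyDatabase", "clusterMonitor", "backup"}


def missing_source_roles(user_document):
    """Return the required source roles absent from a getUser() result, sorted."""
    granted = {entry["role"] for entry in user_document.get("roles", [])}
    return sorted(REQUIRED_SOURCE_ROLES - granted)


# Hypothetical getUser() output, for illustration only.
example_user = {
    "user": "clusterAdmin",
    "db": "admin",
    "roles": [
        {"role": "readAnyDatabase", "db": "admin"},
        {"role": "clusterMonitor", "db": "admin"},
    ],
}

print(missing_source_roles(example_user))  # prints ['backup']
```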
Download and install mongosync
mongosync is the tool that connects the source and destination clusters. You can host mongosync on its own hardware; it does not have to run on the hardware that hosts your MongoDB clusters.
To install mongosync:
Locate a host server for the mongosync executable that has network connectivity to your source and destination clusters.
Go to the MongoDB Download Center.
Download the mongosync package for your host system.
Unpack the mongosync package. The mongosync executable is in the bin directory.
For operating system specific installation instructions, see Installation.
Connect the Clusters
Format your connection strings
A connection string contains the network and authentication details that mongosync needs to connect to the source and destination clusters.
Determine the hostname or IP address and port for your source and destination clusters. You will use this information and the user authentication details to construct the connection strings.
The standard URI connection scheme has the form:
mongodb://[username:password@]host1[:port1][,...hostN[:portN]][/[defaultauthdb][?options]]
Your connection strings will resemble:
cluster0: mongodb://clusterAdmin:superSecret@clusterOne01.fancyCorp.com:20020,clusterOne02.fancyCorp.com:20020,clusterOne03.fancyCorp.com:20020
cluster1: mongodb://clusterAdmin:superSecret@clusterTwo01.fancyCorp.com:20020,clusterTwo02.fancyCorp.com:20020,clusterTwo03.fancyCorp.com:20020
For more details, see Connecting mongosync.
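The following sketch shows one way to assemble a standard-scheme connection string from its parts. The helper name build_connection_string is hypothetical, and the hosts and credentials are the sample values from this page. It percent-encodes the username and password, because characters such as @, : and / in credentials would otherwise be read as URI delimiters.

```python
from urllib.parse import quote


def build_connection_string(user, password, hosts):
    """Return a mongodb:// URI for a list of (hostname, port) pairs."""
    # safe="" forces reserved characters such as '@', ':' and '/' to be encoded.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    host_list = ",".join(f"{host}:{port}" for host, port in hosts)
    return f"mongodb://{credentials}@{host_list}"


cluster0 = build_connection_string(
    "clusterAdmin",
    "superSecret",
    [
        ("clusterOne01.fancyCorp.com", 20020),
        ("clusterOne02.fancyCorp.com", 20020),
        ("clusterOne03.fancyCorp.com", 20020),
    ],
)
print(cluster0)
```

A password such as p@ss:w/rd would appear in the URI as p%40ss%3Aw%2Frd.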
Initialize mongosync
mongosync must create an initial connection to the source and destination clusters before it can start to sync data. To create the initial connection, issue the following command with your connection strings on a single line (the command is reformatted here for clarity):
./bin/mongosync \
      --logPath /var/log/mongosync \
      --cluster0 "mongodb://clusterAdmin:superSecret@clusterOne01.fancyCorp.com:20020,clusterOne02.fancyCorp.com:20020,clusterOne03.fancyCorp.com:20020" \
      --cluster1 "mongodb://clusterAdmin:superSecret@clusterTwo01.fancyCorp.com:20020,clusterTwo02.fancyCorp.com:20020,clusterTwo03.fancyCorp.com:20020"
Initialization Notes
When mongosync first connects to the source and destination clusters, it is in the IDLE state.
mongosync does not synchronize data until it receives the start command.
Designate the source and destination clusters with the start command. "cluster0" and "cluster1" are just labels; either cluster can be cluster0 or cluster1.
Synchronize Data Between Clusters
The start endpoint initiates data synchronization.
To start syncing, use curl or a similar program to issue the start request:
curl localhost:27182/api/v1/start -XPOST \
--data '
   {
      "source": "cluster0",
      "destination": "cluster1"
   } '
If the start request is successful, mongosync returns { "success": true } and starts to synchronize existing data on the source cluster with the destination cluster. At this point, mongosync enters the RUNNING state and applies subsequent source cluster writes to the destination cluster.
To check the status of the sync, call the progress endpoint:
curl localhost:27182/api/v1/progress -XGET
If the progress response includes the field canCommit: true, the clusters are in sync and the destination cluster continuously replicates data from the source cluster.
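Rather than calling the progress endpoint by hand, you can poll it until canCommit becomes true. The sketch below is a minimal illustration, not an official client. The helper names, the polling interval, and the max_polls cap are hypothetical, and the code assumes the response nests its fields under a top-level "progress" key, falling back to the top level if that key is absent.

```python
import json
import time
import urllib.request

# Default HTTP API port used throughout this guide.
PROGRESS_URL = "http://localhost:27182/api/v1/progress"


def fetch_progress(url=PROGRESS_URL):
    """GET the progress endpoint and return the decoded JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def can_commit(body):
    """Return True when the response reports canCommit: true."""
    # Assumption: fields live under "progress"; otherwise read the top level.
    progress = body.get("progress") or body
    return progress.get("canCommit") is True


def wait_until_can_commit(fetch=fetch_progress, interval=5.0, max_polls=None):
    """Poll until canCommit is true.

    Returns the number of polls made, or None if max_polls ran out first.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        if can_commit(fetch()):
            return polls
        time.sleep(interval)
    return None
```

Calling wait_until_can_commit() with no arguments polls the local mongosync instance every five seconds. Passing a different fetch function makes the loop easy to exercise without a running instance.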
The command interface for mongosync is an HTTP server that publishes an HTTP API. To control mongosync, use the API endpoints. The API documentation provides details on using the following endpoints:
Endpoint | Description |
---|---|
start | Starts the synchronization between a source and destination cluster. |
progress | Returns the status of the synchronization process. |
pause | Pauses the current synchronization operation. |
resume | Resumes a paused synchronization session based on data stored on the destination cluster. |
commit | Commits the synchronization operation to the destination cluster. |
reverse | Reverses the direction of a committed sync operation. |
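To show how these endpoints fit together, here is a small sketch of a wrapper around the HTTP API using only the Python standard library. The names build_request and call are hypothetical. It assumes the API listens on the default localhost:27182, that progress is a GET request, and that the other endpoints are POST requests that accept a JSON body, as in the curl examples on this page. Check the API documentation for the parameters each endpoint accepts.

```python
import json
import urllib.request

API_BASE = "http://localhost:27182/api/v1"

# HTTP method per endpoint: progress only reads state, the rest change it.
ENDPOINT_METHODS = {
    "start": "POST",
    "progress": "GET",
    "pause": "POST",
    "resume": "POST",
    "commit": "POST",
    "reverse": "POST",
}


def build_request(endpoint, body=None, api_base=API_BASE):
    """Build (but do not send) the HTTP request for one API endpoint."""
    method = ENDPOINT_METHODS[endpoint]
    data = None
    if method == "POST":
        # POST endpoints take a JSON document; an empty one when no options are given.
        data = json.dumps(body if body is not None else {}).encode("utf-8")
    return urllib.request.Request(f"{api_base}/{endpoint}", data=data, method=method)


def call(endpoint, body=None, api_base=API_BASE):
    """Send the request to a running mongosync and return the decoded JSON reply."""
    request = build_request(endpoint, body, api_base)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

For example, call("start", {"source": "cluster0", "destination": "cluster1"}) mirrors the curl start request shown earlier, and call("progress") mirrors the progress request.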
Finalize Cutover Process
You can finalize a migration and transfer your application workload from the source to the destination cluster using the mongosync cutover process.
For more information, see Finalize Cutover Process.
One-Time Sync
After initializing data synchronization, call the progress endpoint to see the status of the synchronization process:
curl localhost:27182/api/v1/progress -XGET
For a one-time sync, verify that the progress response includes the following field values:
state: "RUNNING"
canCommit: true
lagTimeSeconds is near 0 (Recommended, but not required)
Then, call the commit endpoint to commit the synchronization operation to the destination cluster and stop continuous replication:
curl localhost:27182/api/v1/commit -XPOST --data '{ }'
If the commit request is successful, mongosync returns { "success": true } and enters the COMMITTING state. After the sync is complete, mongosync enters the COMMITTED state and the clusters are no longer in continuous sync.
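The pre-commit checks above can be expressed as a small predicate. This is a sketch under stated assumptions: ready_for_commit and commit are hypothetical helper names, max_lag_seconds is an illustrative threshold for "near 0" (this guide does not define one), and the progress argument is the object holding the state, canCommit, and lagTimeSeconds fields. The sketch treats low lag as required, which is stricter than the recommendation above.

```python
import json
import urllib.request

API_BASE = "http://localhost:27182/api/v1"


def ready_for_commit(progress, max_lag_seconds=5):
    """Return True when the progress fields meet the one-time sync conditions."""
    lag = progress.get("lagTimeSeconds")
    return (
        progress.get("state") == "RUNNING"
        and progress.get("canCommit") is True
        # Stricter than the guide: low lag is only recommended there.
        and lag is not None
        and lag <= max_lag_seconds
    )


def commit(api_base=API_BASE):
    """POST an empty JSON document to the commit endpoint, like the curl call."""
    request = urllib.request.Request(f"{api_base}/commit", data=b"{ }", method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

A caller would fetch the progress response, pass its fields to ready_for_commit, and call commit() only when the predicate returns True.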
Data Verification
Before transferring your application load from the source cluster to the destination, check your data to ensure that the sync was successful.
For more information, see Verify Data Transfer.
Synchronization Notes
The default port for the HTTP API is 27182. Use the --port option with mongosync to configure another port.
mongosync can swap the source and destination clusters to enable reverse synchronization. For more information, see the reverse endpoint.
The user specified in the mongosync connection string must have the required permissions on the source and destination clusters. The permissions vary depending on your environment and on whether you want to run a write-blocking or reverse sync. To determine the correct user permissions for your use case, see User Permissions.
You may need to increase the file descriptor ulimits on the host that is running mongosync. This applies to any UNIX-like system, but macOS in particular has low defaults. See UNIX ulimit settings.
To estimate the size of oplog needed for initial synchronization, see oplog Sizing.
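To see the file descriptor limits on the mongosync host before deciding whether to raise them, you can query them with a short script such as the sketch below. It relies on the resource module, which is available only on UNIX-like systems. The threshold of 64000 is an illustrative assumption, not a value stated in this guide, so consult UNIX ulimit settings for the recommended figure.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def soft_limit_is_low(soft, threshold=64000):
    """Return True when the soft limit is finite and below the threshold."""
    # threshold is an assumed example value; adjust it to your own guidance.
    return soft != resource.RLIM_INFINITY and soft < threshold


soft, hard = open_file_limits()
print(f"open files: soft={soft} hard={hard}")
if soft_limit_is_low(soft):
    print("Soft limit is below the example threshold; consider raising it.")
```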