Cluster-to-Cluster Sync Quickstart

Overview

MongoDB Cluster-to-Cluster Sync connects MongoDB clusters and provides a way to synchronize data between them. The tool that makes the connection is mongosync. For more details, please see the rest of the Cluster-to-Cluster Sync documentation.

mongosync syncs data between two clusters.

Each cluster can be a replica set or a sharded cluster. If either is a sharded cluster, consult the sharded cluster limitations for mongosync.
Sharded clusters are not required to have the same number of shards.
The destination cluster must be the same version or up to two versions ahead of the source cluster. The patch versions do not need to match, as long as both clusters meet the minimum patch version requirements.
The source cluster can remain active until commit because mongosync syncs writes on the source cluster for the duration of the migration until commit is called.

Follow the instructions below to set up Cluster-to-Cluster Sync, connect your clusters, and synchronize your data.

Setup

Define a source and a destination cluster

If you already have a MongoDB cluster, either self-managed or hosted in MongoDB Atlas, use that cluster as the source cluster. If you don't have a cluster to work with, you will need to create one.

This Quickstart works when the destination cluster and the source cluster are both replica sets.

To sync from a replica set to a sharded cluster, see Sync a Replica Set to a Sharded Cluster. To sync between sharded clusters, see Sync Sharded Clusters.

Important

The destination cluster must have enough disk storage to accommodate the logical data size being migrated and the destination oplog entries from the initial sync. For example, to migrate 10 GB of data, the destination cluster must have at least 10 GB available for the data and another 10 GB for the insert oplog entries from the initial sync.
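
The sizing rule in the example above can be sketched as simple arithmetic. This is only an estimate that assumes, as the example does, that the insert oplog entries take about as much space as the data itself. The function name `required_destination_gb` is hypothetical.

```shell
# Estimate the free disk space (in GB) the destination needs for a migration.
# Assumption (from the example above): oplog entries from the initial sync
# take about as much space as the logical data being migrated.
required_destination_gb() {
  local data_gb="$1" oplog_gb
  oplog_gb="$data_gb"
  echo $((data_gb + oplog_gb))
}

required_destination_gb 10   # prints 20
```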

Important

To use embedded verification, you must have a larger oplog on the destination. If you enable the embedded verifier and reduce the size of the destination oplog, the embedded verifier might not be able to keep up, causing mongosync to error.

If you need to reduce the overhead of the destination oplog entries and the embedded verifier is disabled, you can:

Use the oplogSizeMB setting to lower the destination cluster's oplog size.
Use the oplogMinRetentionHours setting to lower or remove the destination cluster's minimum oplog retention period.

If your clusters are self-managed, they must be MongoDB Enterprise clusters. Cluster-to-Cluster Sync is only supported on MongoDB Community Edition in a limited number of cases. For more information on using Cluster-to-Cluster Sync with MongoDB Community Edition, contact a MongoDB sales representative.

Creating a cluster is beyond the scope of this guide. If you need help, refer to the documentation to create an Atlas cluster or to create a self-managed cluster.

Define administrative users

If either cluster is hosted in Atlas, or if either of them requires authentication, you must create a database user that has permissions in both clusters.

Source Cluster Authentication Requirements

The source user must have the following roles:

In addition, the source user must be able to:

Run the getParameter command

If the source cluster is hosted in Atlas, the user must have the Atlas admin role. The user must also be able to read the change stream for the cluster.

Destination Cluster Authentication Requirements

If the destination cluster is hosted in Atlas, the user must have the Atlas admin role.

Cluster Authentication Notes

To add an Atlas user, see: Configure Database Users.
To add a user to a self-managed cluster, see: Create a User on Self-Managed Deployments.
To verify user permissions, run db.getUser().
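
As a rough illustration of the last note, the following sketch prints a user document, including its roles, by running db.getUser() through mongosh. It assumes that mongosh is installed and that the user was created in the admin database. The function names are hypothetical, and clusterAdmin is the example user name used later in this guide.

```shell
# Build the mongosh expression that looks up a user.
# Assumption: the user was created in the "admin" database.
get_user_expr() {
  printf 'db.getSiblingDB("admin").getUser("%s")' "$1"
}

# Print the user document (including its roles) for one cluster.
# Usage: show_user "<connection string>" clusterAdmin
show_user() {
  mongosh "$1" --quiet --eval "$(get_user_expr "$2")"
}
```

Run show_user once against the source cluster and once against the destination cluster, then compare the listed roles with the requirements above.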

Download and install mongosync

mongosync is the tool that connects the source and destination clusters. You can host mongosync on its own hardware; it does not have to run on the hardware that hosts your MongoDB clusters.

To install mongosync:

Locate a host server for the mongosync executable that has network connectivity to your source and destination clusters.
Go to the MongoDB Download Center.
Download the mongosync package for your host system.
Unpack the mongosync package. The mongosync executable is in the bin directory.

For operating-system-specific installation instructions, see Installation.
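
The unpack step might look like the sketch below. It assumes the downloaded package is a gzip-compressed tar archive. The function name, archive name, and install directory are placeholders, not names from the MongoDB documentation.

```shell
# Unpack a downloaded mongosync archive and print the path of the executable.
# Assumptions: the package is a .tgz/.tar.gz archive; the archive path and the
# destination directory are chosen by the caller.
unpack_mongosync() {
  local archive="$1" dest="$2"
  mkdir -p "$dest" || return 1
  tar -xzf "$archive" -C "$dest" || return 1
  # The executable is in the bin directory of the unpacked package.
  find "$dest" -type f -path '*/bin/mongosync' | head -n 1
}

# Example (hypothetical file name):
#   unpack_mongosync ./mongosync-package.tgz "$HOME/mongosync"
```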

Connect the Clusters

Format your connection strings

A connection string contains the network and authentication details that mongosync needs to connect to the source and destination clusters.

Determine the hostname or IP address and port for your source and destination clusters. You will use this information and the user authentication details to construct the connection strings.

The standard URI connection scheme has the form:

mongodb://[username:password@]host1[:port1][,...hostN[:portN]][/[defaultauthdb][?options]]

Your connection strings will resemble:

cluster0:
mongodb://clusterAdmin:superSecret@clusterOne01.fancyCorp.com:20020,clusterOne02.fancyCorp.com:20020,clusterOne03.fancyCorp.com:20020
cluster1:
mongodb://clusterAdmin:superSecret@clusterTwo01.fancyCorp.com:20020,clusterTwo02.fancyCorp.com:20020,clusterTwo03.fancyCorp.com:20020
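
To avoid typing the host list by hand, a small helper can assemble a string in the standard form shown above. This is a minimal sketch: build_uri is a hypothetical name, and the helper does not percent-encode special characters in the user name or password, which you would need to do yourself if your credentials contain them.

```shell
# Build a standard mongodb:// connection string.
# Usage: build_uri USER PASSWORD HOST:PORT [HOST:PORT ...]
# Note: credentials are inserted as-is, without percent-encoding.
build_uri() {
  local user="$1" password="$2" hosts host
  shift 2
  hosts="$1"
  shift
  for host in "$@"; do
    hosts="$hosts,$host"
  done
  printf 'mongodb://%s:%s@%s\n' "$user" "$password" "$hosts"
}

build_uri clusterAdmin superSecret \
  clusterOne01.fancyCorp.com:20020 \
  clusterOne02.fancyCorp.com:20020 \
  clusterOne03.fancyCorp.com:20020
```

The call at the end prints the cluster0 example string from this section.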

For more details, see Connecting mongosync.

Initialize mongosync

mongosync must create an initial connection to the source and destination clusters before it can start to sync data. To create the initial connection, issue the following command with your connection strings on a single line (the command is reformatted here for clarity):

./bin/mongosync \
      --logPath /var/log/mongosync \
      --cluster0 "mongodb://clusterAdmin:superSecret@clusterOne01.fancyCorp.com:20020,clusterOne02.fancyCorp.com:20020,clusterOne03.fancyCorp.com:20020" \
      --cluster1 "mongodb://clusterAdmin:superSecret@clusterTwo01.fancyCorp.com:20020,clusterTwo02.fancyCorp.com:20020,clusterTwo03.fancyCorp.com:20020"

Initialization Notes

When mongosync first connects to the source and destination clusters it is in the IDLE state.
mongosync does not synchronize data until it receives the start command.
Designate the source and destination clusters with the start command. "cluster0" and "cluster1" are just labels; either cluster can be the source or the destination.

Synchronize Data Between Clusters

The start endpoint initiates data synchronization. To start syncing, use curl or a similar program to issue the start request:

curl localhost:27182/api/v1/start -XPOST \
--data '
   {
      "source": "cluster0",
      "destination": "cluster1"
   } '

If the start request is successful, mongosync returns { "success": true } and starts to synchronize existing data on the source cluster with the destination cluster. At this point, mongosync enters the RUNNING state and applies subsequent source cluster writes to the destination cluster.
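
In a script, you may want to stop when the start request is rejected. The sketch below checks the response for the success field described above. The function names are hypothetical, and the check is a plain text match rather than a full JSON parse.

```shell
# Return success (exit status 0) when a mongosync API response reports success.
# Reads the JSON response on standard input; tolerates optional whitespace.
api_succeeded() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Send the start request and report an error if mongosync did not accept it.
# Assumption: mongosync listens on localhost at the default port 27182.
start_sync() {
  curl -s localhost:27182/api/v1/start -XPOST \
    --data '{ "source": "cluster0", "destination": "cluster1" }' |
    api_succeeded || { echo "start request failed" >&2; return 1; }
}
```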

To check the status of the sync, call the progress endpoint:

curl localhost:27182/api/v1/progress -XGET

If the progress response includes the field canCommit: true, the clusters are in sync and the destination cluster continuously replicates data from the source cluster.
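
Rather than calling the progress endpoint by hand, you could poll it until canCommit is reported as true. The following is a sketch only: the function names, the default of 60 attempts, and the 5 second delay are illustrative choices, not mongosync defaults.

```shell
# Fetch the current progress document.
# Assumption: mongosync listens on localhost at the default port 27182.
fetch_progress() {
  curl -s localhost:27182/api/v1/progress -XGET
}

# Poll until the progress response contains canCommit: true.
# Usage: wait_for_can_commit [MAX_ATTEMPTS] [SECONDS_BETWEEN_ATTEMPTS]
wait_for_can_commit() {
  local attempts="${1:-60}" delay="${2:-5}" i=0
  while [ "$i" -lt "$attempts" ]; do
    if fetch_progress | grep -Eq '"canCommit"[[:space:]]*:[[:space:]]*true'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "canCommit was not reported within $attempts attempts" >&2
  return 1
}
```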

The command interface for mongosync is an HTTP server that publishes an HTTP API. To control mongosync, use the API endpoints. The API documentation provides details on using the following endpoints:

Endpoint	Description
`start`	Starts the synchronization between a source and destination cluster.
`progress`	Returns the status of the synchronization process.
`pause`	Pauses the current synchronization operation.
`resume`	Resumes a paused synchronization session based on data stored on the destination cluster.
`commit`	Commits the synchronization operation to the destination cluster.
`reverse`	Reverses the direction of a committed sync operation.
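
The endpoints in the table can be wrapped in one small helper. In this sketch, mongosync_api, api_method, and the MONGOSYNC_URL variable are names introduced for illustration. The helper sends an empty JSON document when no request body is given, as the commit example in this guide does.

```shell
# Base address of the mongosync HTTP API (hypothetical variable name).
MONGOSYNC_URL=${MONGOSYNC_URL:-localhost:27182}

# Pick the HTTP method for an endpoint: progress is read with GET,
# the other endpoints in the table are called with POST.
api_method() {
  case "$1" in
    progress) echo GET ;;
    start|pause|resume|commit|reverse) echo POST ;;
    *) echo "unknown endpoint: $1" >&2; return 1 ;;
  esac
}

# Call an endpoint. The optional second argument is the JSON request body.
mongosync_api() {
  local endpoint="$1" body="$2" method
  method=$(api_method "$endpoint") || return 1
  [ -n "$body" ] || body='{ }'
  if [ "$method" = GET ]; then
    curl -s -X GET "$MONGOSYNC_URL/api/v1/$endpoint"
  else
    curl -s -X POST "$MONGOSYNC_URL/api/v1/$endpoint" --data "$body"
  fi
}
```

For example, `mongosync_api pause` pauses the sync and `mongosync_api progress` returns its status.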

Finalize Cutover Process

You can finalize a migration and transfer your application workload from the source to the destination cluster using the mongosync cutover process.

For more information, see Finalize Cutover Process.

One-Time Sync

After initializing data synchronization, call the progress endpoint to see the status of the synchronization process:

curl localhost:27182/api/v1/progress -XGET

For a one-time sync, verify that the progress response includes the following field values:

state: "RUNNING"
canCommit: true
lagTimeSeconds is near 0 (Recommended, but not required)
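
The three conditions above can be checked together before you commit. This sketch treats the response as text and extracts the first occurrence of each field, so it is not a full JSON parser. The function names and the 5 second lag limit are assumptions; choose a limit that suits your workload.

```shell
# Extract the first occurrence of a scalar field (string, number, or boolean)
# from JSON read on standard input. This is a text match, not a JSON parser.
json_field() {
  grep -Eo "\"$1\"[[:space:]]*:[[:space:]]*(\"[^\"]*\"|[^,}[:space:]]+)" |
    head -n 1 |
    sed -E 's/^"[^"]*"[[:space:]]*:[[:space:]]*//; s/^"//; s/"$//'
}

# Succeed when a progress response meets the required conditions.
# Usage: ready_to_commit "$json" [MAX_LAG_SECONDS]
ready_to_commit() {
  local json="$1" max_lag="${2:-5}" state can_commit lag
  state=$(printf '%s\n' "$json" | json_field state)
  can_commit=$(printf '%s\n' "$json" | json_field canCommit)
  lag=$(printf '%s\n' "$json" | json_field lagTimeSeconds)
  [ "$state" = RUNNING ] || return 1
  [ "$can_commit" = true ] || return 1
  # A low lag is recommended, not required, so only warn about it.
  case "$lag" in
    ''|*[!0-9]*) echo "lagTimeSeconds is not available" >&2 ;;
    *) [ "$lag" -le "$max_lag" ] || echo "lag is ${lag}s" >&2 ;;
  esac
  return 0
}
```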

Then, call the commit endpoint to commit the synchronization operation to the destination cluster and stop continuous replication:

curl localhost:27182/api/v1/commit -XPOST --data '{ }'

If the commit request is successful, mongosync returns { "success": true } and enters the COMMITTING state. After the sync is complete, mongosync enters the COMMITTED state and the clusters are no longer in continuous sync.

Data Verification

Before transferring your application load from the source cluster to the destination, check your data to ensure that the sync was successful.

Note

If mongosync stops during commit, before the /progress endpoint reports canWrite: true, you must restart the entire migration to ensure that it's verified.

For more information, see Verify Data Transfer.

Synchronization Notes

The default port for the HTTP API is 27182. Use the --port option with mongosync to configure another port.
mongosync can swap the source and destination clusters to enable reverse synchronization.
For more information, see the reverse endpoint.
The user specified in the mongosync connection string must have the required permissions on the source and destination clusters. The permissions vary depending on your environment and if you want to modify write-blocking settings or use reverse sync.
To determine the correct user permissions for your use case, see User Permissions.
You may need to increase the file descriptor ulimits on the host that is running mongosync. This applies to any UNIX-like system, but macOS in particular has low defaults. See UNIX ulimit settings.
To estimate the size of oplog needed for initial synchronization, see oplog Sizing.
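
To check the file descriptor limit mentioned above, you could compare the current soft limit with a minimum before starting mongosync. The threshold of 64000 in this sketch is an assumption based on the value commonly recommended for MongoDB server processes, and fd_limit_ok is a hypothetical name. See UNIX ulimit settings for the values that apply to your system.

```shell
# Compare a soft limit on open file descriptors with a minimum.
# Assumption: 64000 is used as the default minimum; adjust it as needed.
fd_limit_ok() {
  local current="$1" minimum="${2:-64000}"
  [ "$current" = unlimited ] && return 0
  [ "$current" -ge "$minimum" ]
}

current=$(ulimit -n)
if fd_limit_ok "$current"; then
  echo "open file limit is sufficient: $current"
else
  echo "open file limit is low: $current (consider raising it with ulimit -n)"
fi
```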
