Cluster-to-Cluster Sync Quickstart

On this page

  • Overview
  • Setup
  • Define a source and a destination cluster
  • Define administrative users
  • Download and install mongosync
  • Connect the Clusters
  • Format your connection strings
  • Initialize mongosync
  • Initialization Notes
  • Synchronize Data Between Clusters
  • One-Time Sync
  • Data Verification
  • Synchronization Notes

MongoDB Cluster-to-Cluster Sync connects MongoDB clusters and provides a way to synchronize data between them. The tool that makes the connection is mongosync. This page provides a short introduction to help you get started with Cluster-to-Cluster Sync. For more details, please see the rest of the Cluster-to-Cluster Sync documentation.

mongosync syncs data between two clusters.

  • Each cluster can be a replica set or a sharded cluster. If either is a sharded cluster, consult the sharded cluster limitations for mongosync.

  • Sharded clusters are not required to have the same number of shards.

  • The destination cluster must be the same version as the source cluster or up to two versions ahead of it. The patch version is not important, as long as both clusters meet the minimum patch version requirements.

Follow the instructions below to set up Cluster-to-Cluster Sync, connect your clusters, and synchronize your data.

1. Define a source and a destination cluster

If you already have a MongoDB cluster, either self-managed or hosted in MongoDB Atlas, use that cluster as the source cluster. If you don't have a cluster to work with, you will need to create one.

This Quickstart works when the destination cluster and the source cluster are both replica sets. To sync from a replica set to a sharded cluster, or between sharded clusters, see: Use mongosync on Sharded Clusters.

Tip

See also:

For more information on MongoDB server version compatibility or cross-version migrations, see MongoDB Server Version Compatibility and Support.

The number of nodes in the destination replica set does not have to equal the number of nodes in the source replica set.

If your clusters are self-managed, they must be MongoDB Enterprise clusters. Cluster-to-Cluster Sync is only supported on MongoDB Community Edition in a limited number of cases. For more information on using Cluster-to-Cluster Sync with MongoDB Community Edition, contact a MongoDB sales representative.

Creating a cluster is beyond the scope of this guide. If you need help, refer to the documentation to create an Atlas cluster or to create a self-managed cluster.

2. Define administrative users

If either cluster is hosted in Atlas, or if either of them requires authentication, you must create a database user that has permissions in both clusters.

The source user must have the following roles:

In addition, the source user must be able to:

If the source cluster is hosted in Atlas, the user must have the Atlas admin role. The user must also be able to read the change stream for the cluster.

If the destination cluster is hosted in Atlas, the user must have the Atlas admin role.

3. Download and install mongosync

mongosync is the tool that connects the source and destination clusters. You can host mongosync on its own hardware; it does not have to run on the hardware that hosts your MongoDB clusters.

To install mongosync:

  1. Locate a host server for the mongosync executable that has network connectivity to your source and destination clusters.

  2. Go to the MongoDB Download Center.

  3. Download the mongosync package for your host system.

  4. Unpack the mongosync package. The mongosync executable is in the bin directory.

For operating system specific installation instructions, see Installation.

1. Format your connection strings

A connection string contains the network and authentication details that mongosync needs to connect to the source and destination clusters.

Determine the hostname or IP address and port for your source and destination clusters. You will use this information and the user authentication details to construct the connection strings.

The standard URI connection scheme has the form:

mongodb://[username:password@]host1[:port1][,...hostN[:portN]][/[defaultauthdb][?options]]

Your connection strings will resemble:

cluster0:
mongodb://clusterAdmin:superSecret@clusterOne01.fancyCorp.com:20020,clusterOne02.fancyCorp.com:20020,clusterOne03.fancyCorp.com:20020
cluster1:
mongodb://clusterAdmin:superSecret@clusterTwo01.fancyCorp.com:20020,clusterTwo02.fancyCorp.com:20020,clusterTwo03.fancyCorp.com:20020

For more details, see Connecting mongosync.
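If the username or password contains characters such as @, :, or /, they must be percent-encoded before they go into a connection string. The following is a minimal sketch of assembling the string in POSIX shell. The function names urlencode and mongodb_uri are hypothetical helpers, not part of mongosync, and the encoder assumes ASCII input.

```shell
# Hypothetical helper: percent-encode every character outside the
# RFC 3986 unreserved set. Assumes ASCII input. The subshell body
# keeps the working variables from leaking into the caller.
urlencode() (
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s' "$out"
)

# Hypothetical helper: build a standard connection string.
# $1 = username, $2 = password, $3 = comma-separated host:port list
mongodb_uri() {
  printf 'mongodb://%s:%s@%s' "$(urlencode "$1")" "$(urlencode "$2")" "$3"
}
```

For example, `mongodb_uri clusterAdmin 'p@ss' 'clusterOne01.fancyCorp.com:20020'` yields a string in which the @ inside the password is written as %40, so it cannot be mistaken for the separator before the host list.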

2. Initialize mongosync

mongosync must create an initial connection to the source and destination clusters before it can start to sync data. To create the initial connection, issue the following command with your connection strings on a single line (the command is reformatted here for clarity):

./bin/mongosync \
--logPath /var/log/mongosync \
--cluster0 "mongodb://clusterAdmin:superSecret@clusterOne01.fancyCorp.com:20020,clusterOne02.fancyCorp.com:20020,clusterOne03.fancyCorp.com:20020" \
--cluster1 "mongodb://clusterAdmin:superSecret@clusterTwo01.fancyCorp.com:20020,clusterTwo02.fancyCorp.com:20020,clusterTwo03.fancyCorp.com:20020"

  • When mongosync first connects to the source and destination clusters, it is in the IDLE state.

  • mongosync does not synchronize data until it receives the start command.

  • Designate the source and destination clusters with the start command. "cluster0" and "cluster1" are only labels; either cluster can be cluster0 or cluster1.

The start endpoint initiates data synchronization. To start syncing, use curl or a similar program to issue the start request:

curl localhost:27182/api/v1/start -XPOST \
--data '
{
"source": "cluster0",
"destination": "cluster1"
} '

If the start request is successful, mongosync returns { "success": true } and starts to synchronize existing data on the source cluster with the destination cluster. At this point, mongosync enters the RUNNING state and applies subsequent source cluster writes to the destination cluster.
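Because cluster0 and cluster1 are only labels, it is easy to send a start request with the roles swapped or with the same label twice. The sketch below builds the request body and rejects anything other than one of each label. The function name start_payload is a hypothetical helper, not part of mongosync.

```shell
# Hypothetical helper: print the JSON body for the start request.
# $1 = source label, $2 = destination label
start_payload() {
  case "$1:$2" in
    cluster0:cluster1|cluster1:cluster0) ;;
    *)
      echo "source and destination must be cluster0 and cluster1, one each" >&2
      return 1
      ;;
  esac
  printf '{ "source": "%s", "destination": "%s" }' "$1" "$2"
}

# Usage (not run here; requires a running mongosync):
#   curl localhost:27182/api/v1/start -XPOST \
#     --data "$(start_payload cluster0 cluster1)"
```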

To check the status of the sync, call the progress endpoint:

curl localhost:27182/api/v1/progress -XGET

If the progress response includes the field canCommit: true, the clusters are in sync and the destination cluster continuously replicates data from the source cluster.
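Checking progress by hand can be replaced with a small polling loop. The sketch below assumes the response is JSON text in which canCommit appears as a literal true or false, and it matches that text with grep instead of a JSON parser. The function names and the MONGOSYNC_PORT variable are hypothetical.

```shell
# Fetch the raw progress document. Redefine this function to target
# another host. MONGOSYNC_PORT is a hypothetical variable.
fetch_progress() {
  curl -sS -X GET "http://localhost:${MONGOSYNC_PORT:-27182}/api/v1/progress"
}

# Succeeds when the JSON on stdin contains "canCommit": true.
can_commit() {
  grep -Eq '"canCommit"[[:space:]]*:[[:space:]]*true'
}

# Poll until canCommit is true.
# $1 = maximum attempts (default 60), $2 = seconds between attempts (default 5)
wait_for_can_commit() {
  attempts=${1:-60}
  delay=${2:-5}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if fetch_progress | can_commit; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1   # gave up before the clusters reported canCommit: true
}
```

Calling `wait_for_can_commit 120 10` would poll every 10 seconds for up to 20 minutes. The limits are arbitrary and should reflect the size of the data set.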

The command interface for mongosync is an HTTP server that publishes an HTTP API. To control mongosync, use the API endpoints. The API documentation provides details on using the following endpoints:

Endpoint    Description
start       Starts the synchronization between a source and destination cluster.
progress    Returns the status of the synchronization process.
pause       Pauses the current synchronization operation.
resume      Resumes a paused synchronization session based on data stored on the destination cluster.
commit      Commits the synchronization operation to the destination cluster.
reverse     Reverses the direction of a committed sync operation.
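A thin wrapper can keep the endpoint URLs and HTTP methods in one place. The sketch below only prints the curl command it would run, so it can be inspected before anything is sent. It assumes that progress is read with GET and that every other endpoint takes a POST with a JSON body, which matches the start and commit examples on this page. The endpoint names pause and resume are inferred from the descriptions in the table. api_cmd and MONGOSYNC_URL are hypothetical names.

```shell
# Base URL of the mongosync HTTP API (hypothetical variable).
MONGOSYNC_URL=${MONGOSYNC_URL:-http://localhost:27182}

# Hypothetical helper: print the curl command for an endpoint (dry run).
# $1 = endpoint name, $2 = optional JSON body for POST endpoints
api_cmd() {
  endpoint=$1
  body=${2:-}
  [ -n "$body" ] || body='{ }'
  case $endpoint in
    progress)
      printf 'curl -sS -X GET %s/api/v1/%s\n' "$MONGOSYNC_URL" "$endpoint"
      ;;
    start|pause|resume|commit|reverse)
      printf "curl -sS -X POST %s/api/v1/%s --data '%s'\n" \
        "$MONGOSYNC_URL" "$endpoint" "$body"
      ;;
    *)
      echo "unknown endpoint: $endpoint" >&2
      return 1
      ;;
  esac
}
```

Printing the command instead of executing it is a deliberate choice for a tool that can commit or reverse a migration: the operator sees exactly what will be sent.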

After initializing data synchronization, call the progress endpoint to see the status of the synchronization process:

curl localhost:27182/api/v1/progress -XGET

For a one-time sync, verify that the progress response includes the following field values:

  • state: "RUNNING"

  • canCommit: true

  • lagTimeSeconds is near 0 (Recommended, but not required)

Then, call the commit endpoint to commit the synchronization operation to the destination cluster and stop continuous replication:

curl localhost:27182/api/v1/commit -XPOST --data '{ }'

If the commit request is successful, mongosync returns { "success": true } and enters the COMMITTING state. After the sync is complete, mongosync enters the COMMITTED state and the clusters are no longer in continuous sync.
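The three checks that precede a commit can be scripted as a gate. The sketch below assumes the progress response is JSON in which state, canCommit, and lagTimeSeconds each appear once as simple scalar values, and it extracts them with sed instead of a JSON parser. The function names and the default lag threshold of 5 seconds are hypothetical choices. Note that the gate treats low lag as mandatory, which is stricter than the recommendation above.

```shell
# Hypothetical helper: print the value of a scalar field from JSON on stdin.
# $1 = field name. Handles only simple string, number, and boolean values.
json_field() {
  tr -d '\n' |
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\{0,1\}\([^",}[:space:]]*\)"\{0,1\}.*/\1/p'
}

# Succeeds only when the progress JSON meets all three conditions.
# $1 = progress JSON, $2 = maximum acceptable lag in seconds (default 5)
ready_to_commit() {
  max_lag=${2:-5}
  state=$(printf '%s' "$1" | json_field state)
  commit_ok=$(printf '%s' "$1" | json_field canCommit)
  lag=$(printf '%s' "$1" | json_field lagTimeSeconds)
  [ "$state" = "RUNNING" ] || return 1
  [ "$commit_ok" = "true" ] || return 1
  case $lag in
    ''|*[!0-9]*) return 1 ;;   # missing or non-numeric lag value
  esac
  [ "$lag" -le "$max_lag" ]
}

# Usage (not run here; requires a running mongosync):
#   progress=$(curl -sS localhost:27182/api/v1/progress -XGET)
#   ready_to_commit "$progress" &&
#     curl localhost:27182/api/v1/commit -XPOST --data '{ }'
```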

Before transferring your application load from the source cluster to the destination, check your data to ensure that the sync was successful.

For more information, see Verify Data Transfer.

  • The default port for the HTTP API is 27182. Use the --port option with mongosync to configure another port.

  • mongosync can swap the source and destination clusters to enable reverse synchronization.

    For more information, see the reverse endpoint.

  • The user specified in the mongosync connection string must have the required permissions on the source and destination clusters. The permissions vary depending on your environment and on whether you want to run a write-blocking or reverse sync.

    To determine the correct user permissions for your use case, see User Permissions.

  • You may need to increase the file descriptor ulimits on the host that is running mongosync. This applies to any UNIX-like system, but macOS in particular has low defaults. See UNIX ulimit settings.

  • To estimate the size of oplog needed for initial synchronization, see oplog Sizing.
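The file descriptor note above can be checked before launching mongosync. The sketch below compares the current soft limit for open files with a threshold. The default of 64000 is an assumption borrowed from MongoDB's general ulimit guidance for server processes, not a documented mongosync requirement, and the function names are hypothetical.

```shell
# Hypothetical helper: succeeds when a limit meets the required minimum.
# $1 = current limit (a number or "unlimited"), $2 = required minimum
fd_limit_ok() {
  case $1 in
    unlimited) return 0 ;;
    ''|*[!0-9]*) return 2 ;;   # not a value this check understands
  esac
  [ "$1" -ge "$2" ]
}

# Report whether the shell's open-file soft limit looks sufficient.
# $1 = required minimum (default 64000, an assumed threshold)
check_fd_limit() {
  required=${1:-64000}
  current=$(ulimit -n)
  if fd_limit_ok "$current" "$required"; then
    echo "open-file limit OK: $current"
  else
    echo "open-file limit is $current; consider: ulimit -n $required" >&2
    return 1
  fi
}
```

Run `check_fd_limit` in the same shell session that will start mongosync, because a limit raised with ulimit applies only to that shell and its child processes.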
