The mongosync binary is the primary process used in
MongoDB Cluster-to-Cluster Sync. mongosync migrates data from one
cluster to another.
For an overview of the mongosync process, see About mongosync.
To get started with mongosync, refer to the Quick Start Guide.
For more detailed information, refer to the
Installation or Connecting mongosync page that best fits your
situation.
Embedded Verifier Disclaimer
Starting in 1.9, mongosync includes an embedded verifier that
performs a series of verification checks on all supported
collections on the destination cluster to confirm that
mongosync successfully transferred documents from the
source cluster to the destination.
When you start the mongosync process, it provides the following disclaimer:
Embedded verification is enabled by default. Verification checks for data consistency between the source and destination clusters. Verification will cause mongosync to fail if any inconsistencies are detected, but it does not check for all possible data inconsistencies. Please see the documentation at https://www.mongodb.com/docs/cluster-to-cluster-sync/current/reference/verification/embedded for more details. Verification requires approximately 0.5 GB of memory per 1 million documents on the source cluster and will fail if insufficient memory is available. Accepting this disclaimer indicates that you understand the limitations and memory requirements for this tool. To skip this disclaimer prompt, use --acceptDisclaimer. To disable the embedded verifier, specify 'verification: false' when starting mongosync. Please see https://www.mongodb.com/docs/cluster-to-cluster-sync/current/reference/verification/ for alternative verification methods. Do you want to continue? (y/n):
If you have already read and accepted the disclaimer, you can
start mongosync with the --acceptDisclaimer option
to skip this notification.
Settings
Cluster Independence
mongosync syncs collection data between a source cluster and
destination cluster. mongosync does not synchronize users or roles. As a result, you can create
users with different access permissions on each cluster.
Configuration File
Options for mongosync can be set in a YAML configuration file. Use
the --config option. For example:
mongosync --config /etc/mongosync.conf
For information on available settings, see Configuration.
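As a sketch, a minimal configuration file might look like the following; the connection strings and paths are placeholders, and the available options may vary by mongosync version:

```yaml
# /etc/mongosync.conf (hypothetical hosts and paths)
cluster0: "mongodb://source.example.net:27017"       # source cluster connection string
cluster1: "mongodb://destination.example.net:27017"  # destination cluster connection string
logPath: "/var/log/mongosync"
verbosity: "INFO"
```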
Cluster and Collection Types
Sharded Clusters
mongosync supports replication between sharded clusters.
mongosync replicates individual shards in parallel from the source
cluster to the destination cluster. However, mongosync does not
preserve the source cluster's sharding configuration.
Important
You must always disable the balancer on a sharded destination
cluster by using balancerStop.
After stopping the balancer, wait 15 minutes before
starting mongosync. This gives the cluster time to
finish any in-progress chunk migrations.
If the source or destination cluster is a sharded cluster
and you are not running mongosync with namespace
filtering,
you must also disable the source cluster's balancer
by running the balancerStop command and waiting 15 minutes
for any in-progress chunk migrations to complete.
If the source or destination cluster is a sharded cluster and you
are running mongosync with namespace filtering, you can
globally enable the source cluster's
balancer but you must disable it for
all collections within the namespace filter.
See Disabling Balancer for Collections in Filtered Sync. You can also fully disable
the source cluster's balancer.
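Stopping a balancer and confirming its state can be sketched in mongosh as follows; the mongos host is a placeholder:

```shell
# Connect to the sharded cluster's mongos (hypothetical host) and stop the balancer.
mongosh "mongodb://mongos.example.net:27017" --eval '
  sh.stopBalancer();      // waits for in-progress chunk migrations to finish or time out
  sh.getBalancerState();  // should now report false
'
# Wait 15 minutes after this before starting mongosync.
```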
During migration, do not run the moveChunk or
moveRange commands. If you have enabled the source cluster's
balancer, but disabled it for collections within the namespace
filter, do not run shardCollection on collections
within the namespace filter. If you run shardCollection on
collections within the namespace filter during the migration, mongosync
returns an error and stops, which requires you to start the migration
from scratch.
Automatic Balancer Disabling
Starting in version 1.17, mongosync disables the balancer on source
and destination clusters during initialization if it detects that the
balancers are not disabled.
This only applies during initialization. If mongosync detects that
either balancer is enabled after the migration begins, mongosync fails.
After disabling the balancer, mongosync waits 15 minutes to ensure that
in-progress chunk migrations complete before continuing with the migration.
If the migration is not reversible and mongosync disabled the source or destination
balancer during initialization, mongosync re-enables those balancers after a
successful commit. If the migration is reversible, mongosync does not re-enable
any balancers, to avoid making users wait 15 minutes.
IMPORTANT: If mongosync disables the balancer for either cluster
and then fails before commit, you must re-enable the balancer(s) manually
by using the balancerStart database command if you do not plan
to run mongosync again.
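If you need to re-enable a balancer that mongosync disabled, a minimal sketch in mongosh (hypothetical host) is:

```shell
mongosh "mongodb://mongos.example.net:27017" --eval '
  // balancerStart is the database command; sh.startBalancer() is the shell helper.
  db.adminCommand({ balancerStart: 1 });
  sh.getBalancerState();  // should now report true
'
```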
Disabling Balancer for Collections in Filtered Sync
If you are using a namespace filter
and want to enable your source cluster's balancer for
collections outside the namespace filter,
follow these instructions before
you start mongosync.
Enable the balancer for the source cluster.
Before starting mongosync with a namespace filter, enable the balancer for the source cluster
by running the sh.startBalancer() method in mongosh.
Disable the balancer for each collection.
Disable the balancer for each collection within the
namespace filter by running the setAllowMigrations command:
db.adminCommand( { setAllowMigrations: "<db>.<collection>", allowMigrations: false } )
Run the preceding command for every collection within the namespace filter.
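For example, a sketch that disables migrations for every filtered collection; the namespace list is hypothetical:

```shell
mongosh "mongodb://mongos.example.net:27017" --eval '
  // Hypothetical namespaces included in the namespace filter.
  const namespaces = ["sales.orders", "sales.customers"];
  for (const ns of namespaces) {
    db.adminCommand({ setAllowMigrations: ns, allowMigrations: false });
  }
'
```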
Important
If you enable the source cluster's balancer but do not use a
namespace filter, or if you do not disable the balancer for all
collections within the namespace filter, mongosync fails.
Pre-Split Chunks
When mongosync syncs to a sharded destination cluster, it pre-splits chunks
for sharded collections on the destination cluster. For each sharded collection,
mongosync attempts to create 90 chunks.
Chunk Distribution
Important
Even if the source cluster is balanced, mongosync doesn't
ensure balance of the destination cluster. Because mongosync
doesn't support the execution of sharding operations during
migration, you must wait until it is safe to accept writes
before rebalancing the destination cluster. See Sharded Cluster Balancer
for guidance on how to rebalance the cluster and
sharded cluster limitations
for information on sharded cluster limitations in mongosync.
mongosync does not preserve chunk distribution from the source to
the destination, even with multiple mongosync instances. It is not
possible to reproduce a particular pre-split of chunks from a source
cluster on the destination cluster.
The only sharding configuration that mongosync preserves from the
source cluster to the destination cluster is the sharding key. Once the
migration finishes, you can enable the destination cluster's balancer which
distributes documents independently of the source cluster's distribution.
Primary Shards
When you sync to a sharded destination cluster, mongosync assigns a
primary shard to each database in a round-robin fashion.
Warning
Running movePrimary on the source or destination cluster
during migration may result in a fatal error or require you to
restart the migration from the start. For more information, see
Sharded Clusters.
Config Shard Cluster
Starting in 8.0, MongoDB introduces support for config shard clusters, also known as embedded config server clusters.
mongosync supports sync from dedicated config server sharded clusters to
embedded config server sharded clusters, and vice versa. Additionally, mongosync
supports sync from replica sets to config shard clusters, but not vice versa.
To learn more about embedded config servers, see Config Shard.
Multiple Clusters
To sync a source cluster to multiple destination clusters, use one
mongosync instance for each destination cluster. For more
information, see Multiple Clusters Limitations.
Capped Collections
Starting in 1.3.0, mongosync supports capped collections with some limitations:
convertToCapped is not supported. If you run convertToCapped, mongosync exits with an error.
cloneCollectionAsCapped is not supported.
Capped collections on the source cluster work normally during sync.
Capped collections on the destination cluster have temporary changes during sync:
There is no maximum number of documents.
The maximum collection size is 1PB.
mongosync restores the original values for the maximum number of
documents and the maximum collection size during commit.
Reads and Writes
Write Blocking
By default, mongosync enables destination-only
write-blocking on the destination cluster.
mongosync unblocks writes right before the
/progress endpoint reports
that canWrite is true. You can explicitly
enable destination-only write-blocking by using
the /start endpoint to set
enableUserWriteBlocking to "destinationOnly".
You can enable dual write-blocking.
If you enable dual write-blocking, mongosync blocks writes:
On the destination cluster during the migration. mongosync unblocks writes right before it sets canWrite to true.
On the source cluster after you call /commit.
To enable dual write-blocking, use /start
to set enableUserWriteBlocking to "sourceAndDestination".
You can use
/start
to set enableUserWriteBlocking to "none".
You cannot enable dual write-blocking or disable write-blocking after the sync starts.
If you want to use reverse synchronization later,
you must enable dual write-blocking when you start mongosync.
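As a sketch, starting mongosync with dual write-blocking over its HTTP API might look like the following, assuming the default API port 27182; the cluster labels follow the /start endpoint's conventions:

```shell
curl -sS -X POST "http://localhost:27182/api/v1/start" \
  -H "Content-Type: application/json" \
  -d '{
        "source": "cluster0",
        "destination": "cluster1",
        "enableUserWriteBlocking": "sourceAndDestination"
      }'
```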
User Permissions
To set enableUserWriteBlocking, the mongosync user must have a
role that includes the setUserWriteBlockMode and
bypassWriteBlockingMode ActionTypes.
Note
When using enableUserWriteBlocking, writes are only blocked for users
that do not have the bypassWriteBlockingMode ActionType. Users
who have this ActionType are able to perform writes.
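Granting these action types can be sketched with a custom role; the role name and host are hypothetical:

```shell
mongosh "mongodb://destination.example.net:27017" --eval '
  db.getSiblingDB("admin").createRole({
    role: "mongosyncWriteBlocking",   // hypothetical role name
    privileges: [
      { resource: { cluster: true },
        actions: [ "setUserWriteBlockMode", "bypassWriteBlockingMode" ] }
    ],
    roles: []
  });
'
```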
Permissible Reads
Read operations on the source cluster are always permitted.
When the /progress endpoint reports canWrite is
true, the data on the source and destination clusters is consistent.
Permissible Writes
To see what state mongosync is in, call the /progress API endpoint. The /progress output includes a
boolean value, canWrite.
When canWrite is true, it is safe to write to the destination cluster.
When canWrite is false, do not write to the destination cluster.
You can safely write to the source cluster while mongosync is
syncing. Do not write to the destination cluster unless canWrite is
true.
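A minimal check of canWrite via the /progress endpoint might look like this, assuming the default API port 27182:

```shell
curl -sS "http://localhost:27182/api/v1/progress" | grep -o '"canWrite":[a-z]*'
# Only write to the destination once this prints "canWrite":true
```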
Read and Write Concern
By default, mongosync sets the read concern level to
"majority" for reads on the source cluster. For writes on
the destination cluster, mongosync sets the write concern level to
"majority" with j: true.
For more information on read and write concern configuration and behavior, see Read Concern and Write Concern.
Read Preference
mongosync requires the primary read preference when
connecting to the source and destination clusters. For more information,
see Read Preference Options.
Legacy Index Handling
mongosync rewrites legacy index values, like 0 or an empty
string, to 1 on the destination. mongosync also removes any
invalid index options on the destination.
Mid-sync Considerations
mongosync replication is different from replication of data
within a replica set.
mongosync combines and reorders writes
from the source to destination cluster during the sync, and also
temporarily modifies various collection
characteristics.
As a result, the destination is not guaranteed to match the source cluster at any point while the sync is still executing, including when the sync is paused. To ensure the destination and source clusters match before cutting over, call the commit endpoint.
The relationship between the source and destination cluster terminates upon commit, unless you use the reverse functionality. For information on mid-sync constraints, see Limitations.
Important
Until you call the commit endpoint and canWrite
returns true, the migrated collections on the destination cluster cannot accept
application read or write traffic. Do not use mongosync to maintain secondary clusters for disaster recovery,
analytics, or other similar use cases.
Temporary Changes to Collection Characteristics
mongosync temporarily alters the following collection characteristics during
synchronization. The original values are restored during the commit process.
| Change | Description |
|---|---|
| Unique Indexes | Unique indexes on the source cluster are synced as non-unique indexes on the destination cluster. |
| TTL Indexes | Synchronization suspends TTL deletion on the destination during sync. |
| Hidden Indexes | Synchronization replicates hidden indexes as non-hidden. |
| Write Blocking | If you enable dual write-blocking, mongosync blocks writes on the destination cluster during the migration and on the source cluster after you call /commit. To learn more, see Write Blocking. |
| Capped Collections | Synchronization sets capped collections to the maximum allowable size. |
| Dummy Indexes | In some cases, synchronization may create dummy indexes on the destination to support writes on sharded or collated collections. |
Rolling Index Builds
mongosync does not support rolling index builds during migration. To avoid building
indexes in a rolling fashion during migration, use one of the following
methods to ensure that your destination indexes match your source
indexes:
Build the index on the source before migration.
Build the index on the source during migration with a default index build.
Build the index on the destination after migration.
mongosync Metadata
mongosync stores its metadata in one or more databases
during migration. The metadata databases can be named any of the following:
mongosync_reserved_for_internal_use
Anything beginning with mongosync_internal_
Anything beginning with mongosync_reserved_for_verification_
You should drop any metadata databases after a successful migration. After dropping metadata, it is not possible to reverse the migration.
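After a successful commit, and once you are sure you will not reverse the migration, dropping the metadata databases can be sketched as follows; the destination host is a placeholder:

```shell
mongosh "mongodb://destination.example.net:27017" --eval '
  for (const { name } of db.adminCommand({ listDatabases: 1 }).databases) {
    if (name === "mongosync_reserved_for_internal_use" ||
        name.startsWith("mongosync_internal_") ||
        name.startsWith("mongosync_reserved_for_verification_")) {
      db.getSiblingDB(name).dropDatabase();
    }
  }
'
```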
Destination Clusters
Consistency
mongosync supports eventual consistency on the destination
cluster. Read consistency is not guaranteed on the destination cluster until
commit. Before committing, the source and destination clusters may differ at a
given point in time. To learn more, see Mid-sync Considerations.
While mongosync is syncing, mongosync may reorder or combine writes
as it relays them from source to destination. For a given document, the total
number of writes may differ between source and destination.
Transactions might not appear atomically on the destination cluster. Retryable writes may not be retryable on the destination cluster.
Profiling
If profiling is enabled on a source database, MongoDB creates a special
collection named <db>.system.profile. After synchronization is
complete, mongosync does not drop the
<db>.system.profile collection from the destination, even if the
source database is dropped at a later time. The <db>.system.profile
collection does not change the accuracy of user data on the
destination.
Views
If a database with views is dropped on the source, the destination may
show an empty system.views collection in that database. The empty
system.views collection will not change the accuracy of user
data on the destination.
System Collections
mongosync does not replicate system collections to the destination cluster.
If you issue a dropDatabase command on the source cluster,
this change is not directly applied on the destination cluster. Instead,
mongosync drops user collections and views in the database
on the destination cluster, but it does not drop system collections
on that database.
For example, on the destination cluster:
The drop operation does not affect a user-created system.js collection.
If you enable profiling, the system.profile collection remains.
If you create views on the source cluster and then drop the database, replicating the drop removes the views, but leaves an empty system.views collection.
In these cases, the replication of dropDatabase removes all user-created
collections from the database, but leaves its system collections on the
destination cluster.
UUIDs
mongosync creates collections with new UUIDs on the destination cluster. There is no
relationship between UUIDs on the source cluster and the destination
cluster. If applications contain hard-coded UUIDs (which MongoDB does
not recommend), you may need to update those applications before they
work properly with the migrated cluster.
Sorting
mongosync inserts documents on the destination cluster in an
undefined order which does not preserve natural sort order from the
source cluster. If applications depend on document order but don't have
a defined sort method, you may need to update those applications to
specify the expected sort order before the applications work properly
with the migrated cluster.
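For example, instead of relying on natural order, an application can request an explicit order; the collection and field names here are hypothetical:

```shell
mongosh "mongodb://destination.example.net:27017" --eval '
  // Relies on insertion order, which mongosync does not preserve:
  // db.orders.find()
  // An explicit sort order survives migration:
  db.orders.find().sort({ createdAt: 1, _id: 1 });
'
```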
Performance
Resilience
mongosync is resilient and able to handle non-fatal errors. Logs
that contain the word "error" or "failure" do not indicate that
mongosync is failing or corrupting data. For example, if a network
error occurs, the mongosync log may contain the word "error" but
mongosync is still able to complete the sync. If a
sync does not complete, mongosync writes a fatal log entry.
Data Definition Language (DDL) Operations
Using DDL operations (operations that act on collections or databases,
such as db.createCollection() and db.dropDatabase())
during sync increases the risk of migration failure and may negatively
impact mongosync performance. For best performance, refrain from
performing DDL operations on the source cluster while the sync is in
progress.
For more information on DDL operations, see Pending DDL Operations and Transactions.
Network Latency
Network latency or long physical distances between migration components can negatively affect sync speed.
- Latency between mongosync and destination shards: For each operation on the source cluster, mongosync does two roundtrips to the destination server. The larger the latency, the slower the sync.
- Latency between destination shards: mongosync runs operations and updates its own metadata in batches in a transaction on the destination cluster. This can result in cross-shard transactions, which may be more costly if the shards are far apart.
- Latency between the nodes of any replica set on the source or destination cluster: mongosync uses "majority" writes and "majority" reads, which require acknowledgement from multiple nodes in a replica set, including shard-backing replica sets. If the majority of these nodes aren't in the same region, there will be negative performance implications.
Interruptions During Sync
The following considerations pertain to interruptions during the
mongosync process.
Errors and Crashes
If mongosync encounters an error or becomes unavailable during
synchronization, you can resume your mongosync operation from where
it stopped. The mongosync binary is stateless and stores the
metadata for a restart on the destination cluster.
To continue the sync, restart mongosync once it becomes available again
and use the same parameters as your interrupted sync. Once you restart
mongosync, the process resumes from where it stopped.
Cluster Availability
If your source or destination cluster crashes unexpectedly, you can safely
restart mongosync from where it left off. Once your cluster is available
again, restart mongosync and use the same parameters as your interrupted
sync.
Paused Sync
If mongosync is in the PAUSED state,
mongosync does not support the following actions:
Upgrading the MongoDB version of the source or destination cluster
Enabling and then disabling the balancer
You can upgrade mongosync while it is in the PAUSED state.