In order to maintain up-to-date copies of the shared data set, secondary members of a replica set sync or replicate data from other members. MongoDB uses two forms of data synchronization: initial sync to populate new members with the full data set, and replication to apply ongoing changes to the entire data set.
Initial sync copies all the data from one member of the replica set to another member. See Initial Sync Source Selection for more information on initial sync source selection criteria.
Starting in MongoDB 5.2, initial syncs can be logical or file copy based.
When you perform a logical initial sync, MongoDB:
Builds all collection indexes as the documents are copied for each collection.
Pulls newly added oplog records during the data copy. Ensure that the target member has enough disk space in the local database to temporarily store these oplog records for the duration of this data copy stage.
Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect the current state of the replica set.
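The logical initial sync stages above can be pictured with a small in-memory model. This is only a sketch under simplifying assumptions: documents are a plain dictionary keyed by _id, the oplog entry shape (`op`, `_id`, `doc`) is a simplified stand-in rather than MongoDB's real oplog format, and the function name `logical_initial_sync` is hypothetical.

```python
from copy import deepcopy


def logical_initial_sync(source_docs, oplog_during_copy):
    """Toy model: clone the documents, then replay oplog entries
    that were buffered while the copy was running."""
    # Data copy stage: copy every document from the source.
    target = deepcopy(source_docs)

    # Entries written on the source during the copy are buffered
    # (a real member stores them temporarily in its local database).
    buffered = list(oplog_during_copy)

    # Apply stage: replay the buffered entries in their original order.
    for entry in buffered:
        if entry["op"] == "i":    # insert
            target[entry["_id"]] = entry["doc"]
        elif entry["op"] == "u":  # update, modeled as a field merge
            target.setdefault(entry["_id"], {}).update(entry["doc"])
        elif entry["op"] == "d":  # delete
            target.pop(entry["_id"], None)
    return target


source = {1: {"x": 1}, 2: {"x": 2}}
oplog = [
    {"op": "u", "_id": 1, "doc": {"x": 10}},
    {"op": "i", "_id": 3, "doc": {"x": 3}},
    {"op": "d", "_id": 2},
]
synced = logical_initial_sync(source, oplog)
```

The model leaves out index builds and the real oplog format; it only shows why the buffered oplog records must be kept until the copy finishes.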
To perform an initial sync, see Resync a Member of a Replica Set.
Available in MongoDB Enterprise only.
File copy based initial sync runs the initial sync process by copying and moving files on the file system. This sync method can be faster than logical initial sync.
File copy based initial sync may cause inaccurate counts
To enable file copy based initial sync, set the initialSyncMethod parameter to fileCopyBased on the destination member for the initial sync. This parameter can only be set at startup.
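As an illustration, server parameters such as initialSyncMethod can be passed to mongod with its --setParameter option. The helper below is a sketch that only assembles the argument list; the function name `mongod_args`, the data path, and the replica set name are hypothetical placeholders.

```python
VALID_SYNC_METHODS = {"logical", "fileCopyBased"}


def mongod_args(dbpath, repl_set, initial_sync_method="logical"):
    """Build a mongod command line for the destination member.

    Only constructs the argument list; it does not start a process.
    """
    if initial_sync_method not in VALID_SYNC_METHODS:
        raise ValueError(f"unknown initial sync method: {initial_sync_method}")
    return [
        "mongod",
        "--dbpath", dbpath,
        "--replSet", repl_set,
        "--setParameter", f"initialSyncMethod={initial_sync_method}",
    ]


# Hypothetical path and replica set name, for illustration only.
args = mongod_args("/data/db", "rs0", "fileCopyBased")
```

The resulting list could be handed to a process launcher such as subprocess.run on a host where MongoDB Enterprise is installed.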
File copy based initial sync replaces the local database on the member being synced to with the local database from the member being synced from.
During a file copy based initial sync:
You cannot run a backup on the member that is being synced to or the member that is being synced from.
You cannot write to the local database on the member that is being synced to.
You can only run an initial sync from one given member at a time.
When using the encrypted storage engine, MongoDB uses the source key to encrypt the destination.
If a secondary performing initial sync encounters a non-transient (i.e. persistent) network error during the sync process, the secondary restarts the initial sync process from the beginning.
Starting in MongoDB 4.4, a secondary performing initial sync can attempt to resume the sync process if interrupted by a transient (i.e. temporary) network error, collection drop, or collection rename. The sync source must also run MongoDB 4.4 to support resumable initial sync. If the sync source runs MongoDB 4.2 or earlier, the secondary must restart the initial sync process as if it encountered a non-transient network error.
By default, the secondary tries to resume initial sync for 24 hours.
MongoDB 4.4 adds a parameter for controlling the amount of time the secondary attempts to resume initial sync. If the secondary cannot successfully resume the initial sync process during the configured time period, it selects a new healthy source from the replica set and restarts the initial synchronization process from the beginning.
The secondary attempts to restart the initial sync up to
before returning a fatal error.
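The resume-or-restart behavior described above can be summarized as a small decision function. This is a sketch of the rules as stated in this section, not MongoDB's implementation: the names `next_action`, `run_initial_sync`, and `max_restarts` are hypothetical, and the restart limit is left for the caller to supply rather than assumed.

```python
DEFAULT_RETRY_PERIOD_SECS = 24 * 60 * 60  # the 24-hour default noted above


def next_action(error_kind, outage_secs,
                retry_period_secs=DEFAULT_RETRY_PERIOD_SECS):
    """Return "resume" or "restart" for one interruption."""
    if error_kind == "transient" and outage_secs <= retry_period_secs:
        return "resume"   # keep trying to resume within the window
    return "restart"      # persistent error, or the window was exceeded


def run_initial_sync(interruptions, max_restarts,
                     retry_period_secs=DEFAULT_RETRY_PERIOD_SECS):
    """Walk through (error_kind, outage_secs) pairs and report the outcome."""
    restarts = 0
    for kind, outage in interruptions:
        if next_action(kind, outage, retry_period_secs) == "restart":
            restarts += 1
            if restarts > max_restarts:
                return "fatal"
    return "synced"
```

For example, a one-hour transient outage yields "resume", while a 25-hour outage or any persistent error yields "restart".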
chaining is disabled), select the primary as the sync source. If the primary is unavailable or unreachable, log an error and periodically check for primary availability.
primaryPreferred (default for voting replica set members), attempt to select the primary as the sync source. If the primary is unavailable or unreachable, perform sync source selection from the remaining replica set members.
For all other supported read modes, perform sync source selection from the replica set members.
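The branching by read mode can be sketched as follows. Several details are assumptions: the member dictionaries and the function name `select_initial_sync_source` are hypothetical, the mode name `primary` for the primary-only case is taken from MongoDB's standard read preference modes, and choosing the lowest ping time stands in for the real selection passes, which this section does not spell out.

```python
def select_initial_sync_source(read_preference, members):
    """Return the name of the chosen sync source, or None if none qualifies.

    members: list of dicts with "name", "is_primary", "reachable", "ping_ms".
    """
    reachable = [m for m in members if m["reachable"]]
    primary = next((m for m in reachable if m["is_primary"]), None)

    if read_preference == "primary":
        # Primary only: the caller would log an error and check again later.
        return primary["name"] if primary else None

    if read_preference == "primaryPreferred" and primary:
        return primary["name"]

    # Other modes, or primaryPreferred without a usable primary:
    # pick among the reachable members (lowest ping is a stand-in rule).
    if not reachable:
        return None
    return min(reachable, key=lambda m: m["ping_ms"])["name"]


members = [
    {"name": "a", "is_primary": True, "reachable": False, "ping_ms": 1},
    {"name": "b", "is_primary": False, "reachable": True, "ping_ms": 20},
    {"name": "c", "is_primary": False, "reachable": True, "ping_ms": 5},
]
```

With the primary unreachable, as in this sample list, the primary-only mode finds no source while primaryPreferred falls back to another member.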
Members performing initial sync source selection make two passes through the list of all replica set members:
If the member cannot select an initial sync source after two passes, it logs an error and waits 1 second before restarting the selection process. The secondary mongod can restart the initial sync source selection process up to 10 times before exiting with an error.
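The wait-and-restart loop can be sketched like this. The 1 second wait and the limit of 10 restarts come from the text above; the function name `select_with_restarts`, the `try_select` callback, and the reading that the limit counts restarts after a first attempt are assumptions made for illustration.

```python
import time


def select_with_restarts(try_select, max_restarts=10, wait_secs=1.0,
                         sleep=time.sleep):
    """Call try_select() until it returns a source or restarts run out.

    try_select: callable returning a source name, or None if both
    passes over the member list found nothing.
    """
    restarts = 0
    while True:
        source = try_select()
        if source is not None:
            return source
        if restarts == max_restarts:
            raise RuntimeError("no initial sync source found")
        sleep(wait_secs)  # wait before restarting the selection process
        restarts += 1
```

Passing a custom `sleep` callable makes the loop easy to exercise without real delays.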
Secondaries may automatically change their sync from source as needed based on changes in the ping time and state of other members' replication. See Replication Sync Source Selection for more information on sync source selection criteria.
Starting in version 4.2, secondary members of a replica set now log oplog entries that take longer than the slow operation threshold to apply. These slow oplog messages:
Starting in MongoDB 4.4, sync from sources send a continuous stream of oplog entries to their syncing secondaries. Streaming replication mitigates replication lag in high-load and high-latency networks. It also:
Reduces staleness for reads from secondaries.
Reduces risk of losing write operations with w: 1 due to primary failover.
Prior to MongoDB 4.4, secondaries fetched batches of oplog entries by issuing a request to their sync from source and waiting for a response. This required a network roundtrip for each batch of oplog entries. MongoDB 4.4 adds the oplogFetcherUsesExhaust startup parameter for disabling streaming replication and using the older replication behavior.
Set the oplogFetcherUsesExhaust parameter to false only if there are resource constraints on the sync from source or if you wish to limit MongoDB's usage of network bandwidth for replication.
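The difference between the two fetch styles can be illustrated by counting requests. This is a deliberately simplified model, not the wire protocol: batches are plain lists, and the function names `pull_fetch` and `streaming_fetch` are hypothetical.

```python
def pull_fetch(batches):
    """Older behavior: one request, and so one roundtrip, per batch."""
    requests = 0
    received = []
    for batch in batches:
        requests += 1            # ask for the next batch, then wait
        received.extend(batch)
    return received, requests


def streaming_fetch(batches):
    """Streaming behavior: one request, then the source keeps sending."""
    requests = 1                 # a single request opens the stream
    received = []
    for batch in batches:
        received.extend(batch)   # batches arrive without further requests
    return received, requests


batches = [[1, 2], [3], [4, 5]]
pulled, pull_requests = pull_fetch(batches)
streamed, stream_requests = streaming_fetch(batches)
```

Both styles deliver the same entries; only the number of requests the secondary has to issue differs, which is where the latency saving comes from.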
MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups batches by document ID (WiredTiger) and simultaneously applies each group of operations using a different thread. MongoDB always applies write operations to a given document in their original write order.
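A minimal sketch of this grouping idea, assuming an in-memory dictionary as the data store and a simplified operation shape (`op`, `_id`, `doc`): operations are bucketed by document ID so that all writes to one document land in the same group, and each group is applied in order on its own thread. The function name `apply_batch` and the hash-based bucketing are illustrative assumptions, not MongoDB's actual scheme.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def apply_batch(store, ops, num_threads=4):
    """Apply ops (given in oplog order) to store using several threads."""
    # Bucket by document ID: every op for a given _id goes to one group,
    # so per-document write order is preserved inside that group.
    groups = defaultdict(list)
    for op in ops:
        groups[hash(op["_id"]) % num_threads].append(op)

    def apply_group(group):
        for op in group:
            if op["op"] == "d":
                store.pop(op["_id"], None)
            else:
                # Inserts and updates both replace the whole document here.
                store[op["_id"]] = op["doc"]

    # Different groups touch disjoint documents, so they can run in parallel.
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(apply_group, groups.values()))
    return store


ops = [
    {"op": "i", "_id": 1, "doc": {"v": 1}},
    {"op": "u", "_id": 1, "doc": {"v": 2}},
    {"op": "i", "_id": 2, "doc": {"v": 9}},
    {"op": "d", "_id": 2},
    {"op": "u", "_id": 1, "doc": {"v": 3}},
]
result = apply_batch({}, ops)
```

Because ordering only matters between operations on the same document, operations on different documents can safely be applied concurrently.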
Read operations that target secondaries and are configured with a read concern level of "majority" read from a WiredTiger snapshot of the data if the read takes place on a secondary where replication batches are being applied.
Reading from a snapshot guarantees a consistent view of the data, and allows the read to occur simultaneously with the ongoing replication without the need for a lock. As a result, secondary reads requiring these read concern levels no longer need to wait for replication batches to be applied, and can be handled as they are received.
Starting in MongoDB 4.2, administrators can limit the rate at which the primary applies its writes with the goal of keeping the committed lag under a configurable maximum value.
By default, flow control is
For more information, see Flow Control.
Replication sync source selection depends on the replica set's settings.chainingAllowed setting:
With chaining enabled (default), perform sync source selection from the replica set members.
With chaining disabled, select the primary as the sync source. If the primary is unavailable or unreachable, log an error and periodically check for primary availability.
Members performing replication sync source selection make two passes through the list of all replica set members:
If the member cannot select a sync source after two passes, it logs an error and waits 1 second before restarting the selection process.
The number of times a source can be changed per hour is
configurable by setting the
Starting in MongoDB 4.4, the startup parameter initialSyncSourceReadPreference takes precedence over the replica set's settings.chainingAllowed setting when selecting an initial sync source. After a replica set member successfully performs initial sync, it defers to the value of chainingAllowed when selecting a replication sync source.
See Initial Sync Source Selection for more information on initial sync source selection.