Replica Set Data Synchronization¶

To maintain up-to-date copies of the shared data set, secondary members of a replica set sync, or replicate, data from other members. MongoDB uses two forms of data synchronization: initial sync, which populates new members with the full data set, and replication, which applies ongoing changes to the entire data set.

Initial Sync¶

Initial sync copies all the data from one member of the replica set to another member.

Process¶

When you perform an initial sync, MongoDB:

1. Clones all databases except the local database. To clone, the mongod scans every collection in each source database and inserts all data into its own copies of these collections.

   Changed in version 3.4: Initial sync builds all collection indexes as the documents are copied for each collection. In earlier versions of MongoDB, only the _id indexes are built during this stage.

   Changed in version 3.4: Initial sync pulls newly added oplog records during the data copy. Ensure that the target member has enough disk space in the local database to temporarily store these oplog records for the duration of this data copy stage.

2. Applies all changes to the data set. Using the oplog from the source, the mongod updates its data set to reflect the current state of the replica set.

When the initial sync finishes, the member transitions from STARTUP2 to SECONDARY.
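
The two stages above can be pictured with a small toy model. This is only a sketch under stated assumptions, not how the mongod implements initial sync: the function `initial_sync`, the nested-dictionary layout, and the shape of the oplog-like entries are all hypothetical and exist only for illustration.

```python
from copy import deepcopy


def initial_sync(source_dbs, ops_during_copy):
    """Toy model of initial sync; not MongoDB's implementation.

    source_dbs: {db_name: {collection_name: {_id: document}}}
    ops_during_copy: oplog-like entries (hypothetical shape) written on
        the source while the copy was running, in their original order.
    """
    state = "STARTUP2"

    # Stage 1: clone every database except "local".
    target = {
        db: deepcopy(colls)
        for db, colls in source_dbs.items()
        if db != "local"
    }

    # Entries that arrive during the copy are stored temporarily.
    buffered = list(ops_during_copy)

    # Stage 2: apply the buffered changes in order.
    for op in buffered:
        coll = target.setdefault(op["db"], {}).setdefault(op["coll"], {})
        if op["op"] == "delete":
            coll.pop(op["_id"], None)
        else:  # "insert" or "update": store the full document
            coll[op["_id"]] = op["doc"]

    state = "SECONDARY"
    return target, state


source = {
    "local": {"oplog.rs": {}},
    "shop": {"orders": {1: {"_id": 1, "qty": 2}}},
}
ops = [
    {"op": "update", "db": "shop", "coll": "orders", "_id": 1,
     "doc": {"_id": 1, "qty": 5}},
    {"op": "insert", "db": "shop", "coll": "orders", "_id": 2,
     "doc": {"_id": 2, "qty": 1}},
]
data, state = initial_sync(source, ops)
```

In this sketch the local database is never copied, the writes that happened during the copy are replayed afterwards, and the member only reports SECONDARY once both stages are done.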

To perform an initial sync, see Resync a Member of a Replica Set.

Fault Tolerance¶

To recover from transient network or operation failures, initial sync has built-in retry logic.

Changed in version 3.4: MongoDB 3.4 improves the initial sync retry logic to be more resilient to intermittent failures on the network.
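
As a generic illustration of retrying after a transient failure, the sketch below wraps a step in a bounded retry loop. It does not reflect MongoDB's actual retry logic, which this page does not describe; `TransientError`, `with_retries`, and `flaky_copy` are hypothetical names, and the attempt limit is an arbitrary choice.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a recoverable network or operation failure."""


def with_retries(step, max_attempts=3, delay_seconds=0.0):
    """Run step(), retrying on TransientError up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up after the last allowed attempt
            time.sleep(delay_seconds)


calls = {"count": 0}


def flaky_copy():
    """Fails twice with a transient error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("connection reset")
    return "copied"


result = with_retries(flaky_copy)
```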

Replication¶

Secondary members replicate data continuously after the initial sync. Secondary members copy the oplog from their sync source (the member they sync from) and apply these operations in an asynchronous process. [1]

Secondaries may automatically change their sync source as needed based on changes in the ping time and the state of other members’ replication.

Changed in version 3.2: MongoDB 3.2 replica set members with 1 vote cannot sync from members with 0 votes.

Secondaries avoid syncing from delayed members and hidden members.

If a secondary member has members[n].buildIndexes set to true, it can only sync from other members where buildIndexes is true. Members where buildIndexes is false can sync from any other member, barring other sync restrictions. buildIndexes is true by default.
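
The restrictions above can be read as a filter over candidate members. The following is a minimal sketch, assuming each member is described by a plain dictionary with hypothetical field names (`votes`, `delayed`, `hidden`, `buildIndexes`, `ping_ms`). Treating "avoid" as a hard exclusion and ordering the remaining members by ping time are simplifying assumptions, not the server's actual selection algorithm.

```python
def eligible_sync_sources(member, others):
    """Filter candidate sync sources using the rules listed above (toy model)."""
    candidates = []
    for other in others:
        # Members with 1 vote cannot sync from members with 0 votes.
        if member["votes"] == 1 and other["votes"] == 0:
            continue
        # Secondaries avoid delayed and hidden members
        # (modeled here as a hard exclusion, which is an assumption).
        if other["delayed"] or other["hidden"]:
            continue
        # A member that builds indexes needs a source that also builds them.
        if member["buildIndexes"] and not other["buildIndexes"]:
            continue
        candidates.append(other)
    # Assumption: prefer the lowest ping time among the remaining members.
    return sorted(candidates, key=lambda m: m["ping_ms"])


def make_member(name, votes=1, delayed=False, hidden=False,
                build_indexes=True, ping_ms=0):
    """Build a hypothetical member description for the example."""
    return {"name": name, "votes": votes, "delayed": delayed,
            "hidden": hidden, "buildIndexes": build_indexes,
            "ping_ms": ping_ms}


me = make_member("me")
others = [
    make_member("a", ping_ms=40),
    make_member("b", ping_ms=12),
    make_member("c", votes=0, ping_ms=5),            # non-voting
    make_member("d", hidden=True, ping_ms=8),        # hidden
    make_member("e", build_indexes=False, ping_ms=3),  # no index builds
]
names = [m["name"] for m in eligible_sync_sources(me, others)]
```

A member with `buildIndexes` set to false passes the third check for every candidate, which matches the statement that such members can sync from any other member, barring the other restrictions.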

[1] Starting in version 4.0.6, secondary members of a replica set now log oplog entries that take longer than the slow operation threshold to apply. These slow oplog messages are logged for the secondaries in the diagnostic log under the REPL component with the text applied op: <oplog entry> took <num>ms. These slow oplog entries depend only on the slow operation threshold. They do not depend on the log levels (either at the system or component level), or the profiling level, or the slow operation sample rate. The profiler does not capture slow oplog entries. For more information, see Slow Oplog Application.
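
The threshold check described in the footnote can be sketched as follows. This is a toy model, not server code: `apply_with_slow_log` and `fake_clock` are hypothetical helpers, the 100 ms default is an illustrative parameter, and only the message text follows the format quoted above.

```python
import time


def apply_with_slow_log(entry, apply_fn, slow_ms=100, clock=time.perf_counter):
    """Apply one oplog-like entry; return a log line if it was slow, else None."""
    start = clock()
    apply_fn(entry)
    elapsed_ms = int((clock() - start) * 1000)
    # Only the threshold matters here; log level and profiling are not consulted.
    if elapsed_ms > slow_ms:
        return f"applied op: {entry} took {elapsed_ms}ms"
    return None


def fake_clock(times):
    """Return a clock that yields preset timestamps (keeps the example deterministic)."""
    it = iter(times)
    return lambda: next(it)


slow_line = apply_with_slow_log({"op": "u"}, lambda e: None,
                                clock=fake_clock([0.0, 0.25]))
fast_line = apply_with_slow_log({"op": "u"}, lambda e: None,
                                clock=fake_clock([0.0, 0.005]))
```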

Multithreaded Replication¶

MongoDB applies write operations in batches using multiple threads to improve concurrency. MongoDB groups batches by namespace (MMAPv1) or by document id (WiredTiger) and simultaneously applies each group of operations using a different thread. MongoDB always applies write operations to a given document in their original write order.
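
The grouping idea can be illustrated with a short sketch that groups a batch by document id, as described for WiredTiger, and hands each group to a worker thread. It is a simplified model under assumptions: `apply_batch`, the dictionary-based store, and the one-group-per-id scheme are hypothetical and do not reflect how the server assigns operations to threads.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def apply_batch(store, batch, workers=4):
    """Apply a batch of writes in parallel, grouped by document id (toy model)."""
    groups = defaultdict(list)
    for op in batch:  # the batch is in original write order
        groups[op["_id"]].append(op)

    def apply_group(ops):
        # One thread handles all writes for a given document, in order,
        # so per-document write order is preserved.
        for op in ops:
            store[op["_id"]] = op["value"]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Consume the iterator so any worker exception is raised here.
        list(pool.map(apply_group, groups.values()))
    return store


batch = [
    {"_id": "a", "value": 1},
    {"_id": "b", "value": 10},
    {"_id": "a", "value": 2},
    {"_id": "a", "value": 3},
    {"_id": "b", "value": 20},
]
result = apply_batch({}, batch)
```

Because every write to a given document lands in the same group, the groups can run concurrently without reordering the writes to any single document.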

Changed in version 4.0.

Starting in MongoDB 4.0, read operations that target secondaries and are configured with a read concern level of "local" or "majority" read from a WiredTiger snapshot of the data if the read takes place on a secondary where replication batches are being applied. Reading from a snapshot guarantees a consistent view of the data and allows the read to occur simultaneously with the ongoing replication without the need for a lock. As a result, secondary reads requiring these read concern levels no longer need to wait for replication batches to be applied, and can be handled as they are received.
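
The effect of reading from a snapshot can be shown with a toy model. This sketch uses a full copy of the data as a stand-in for a snapshot, which is an assumption made for simplicity and is not how WiredTiger snapshots work; `ToySecondary` and its methods are hypothetical.

```python
from copy import deepcopy


class ToySecondary:
    """Toy secondary that serves reads from the last consistent point."""

    def __init__(self, data):
        self._data = data
        self._snapshot = deepcopy(data)  # last consistent point

    def apply_batch(self, batch):
        """Apply a batch one write at a time, pausing after each write."""
        for op in batch:
            self._data[op["_id"]] = op["value"]
            yield  # pause mid-batch so the example can read during application
        # Publish a new consistent point once the whole batch is applied.
        self._snapshot = deepcopy(self._data)

    def read(self, _id):
        """Read without waiting for the batch that is being applied."""
        return self._snapshot.get(_id)


node = ToySecondary({"x": 1, "y": 1})
steps = node.apply_batch([{"_id": "x", "value": 2}, {"_id": "y", "value": 2}])

next(steps)  # "x" has been written, "y" has not
mid_batch = (node.read("x"), node.read("y"))

for _ in steps:  # finish the batch
    pass
after_batch = (node.read("x"), node.read("y"))
```

A read issued in the middle of the batch sees neither write, and a read issued after the batch sees both, so the reader never observes a half-applied batch.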

Pre-Fetching Indexes to Improve Replication Throughput¶

Note

Applies to MMAPv1 only.

With the MMAPv1 storage engine, MongoDB fetches memory pages that hold affected data and indexes to help improve the performance of applying oplog entries. This pre-fetch stage minimizes the amount of time MongoDB holds write locks while applying oplog entries. By default, secondaries pre-fetch all indexes.

Optionally, you can disable all pre-fetching or only pre-fetch the index on the _id field. See the secondaryIndexPrefetch setting for more information.
