More info on MongoDB initial sync architecture

kevinadi · October 1, 2020, 3:32am

I’d like to directly answer some of your questions:

If I understand this correctly, while the data copy is happening from the source to the target, any data changes picked up on the source are written to the local database on the target. Once the data copy is complete, the changes in the target local are then executed to ensure the source and target are in sync? This is the behaviour from 3.4 onwards.

Correct.

The Oplog is 150GB. My thinking is that if we generated 300GB of data changes, they would be synced to the local on target and then replayed once the data copy stage has completed?

Correct.

Previously in MongoDB pre-3.4, initial sync happens in two distinct phases: data copy → oplog pull & apply. The primary would need to have enough oplog to cover the whole initial sync time. In some cases, this could be hard to approximate.

Starting from 3.4, the oplog was tailed during the data copy phase, eliminating the need for the primary to have a large oplog to cover the data copy phase. However, the primary would still need to have oplog coverage during the oplog apply phase. This is typically a much smaller window than the data copy phase, though.

Further, in the latest MongoDB 4.4, there are further improvements in replication (see the 4.4 release notes); namely, initial sync would automatically resume in the face of transient network error, streaming replication, and the ability to set the minimum oplog retention period in hours.

Best regards,
Kevin