What is the impact of different oplog sizes?

Hi Team,

What would happen if

  1. Oplog size of a DB is reduced from number A to B (A>B)?
  2. Oplog size of a replica set Primary is higher than that of the secondaries. Will secondary nodes notice any difference in syncing due to the oplog size difference? Is it advisable to have different oplog sizes for members of the same replica set?

Hi @Joanne,

Reducing the size of the oplog reduces the amount of data it can hold. You want your oplog to hold at least as much data as would be written during your longest potential downtime. If you have some sort of monitoring on your system, you can trend how much of a time window your oplog covers. If the oplog is too small, a resync can run into problems: the oldest entries could be overwritten before the sync completes, and you would be stuck in a loop of trying to sync data.
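As a rough sketch of that sizing rule, the oplog window can be approximated as oplog size divided by the rate at which oplog data is written. The function names, the safety factor, and all the numbers below are hypothetical and only for illustration; real write rates vary over time, so treat the result as an estimate rather than a guarantee.

```python
def oplog_window_hours(oplog_size_mb: float, write_rate_mb_per_hour: float) -> float:
    """Approximate hours of history an oplog of the given size can hold."""
    if write_rate_mb_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_mb / write_rate_mb_per_hour


def covers_downtime(
    oplog_size_mb: float,
    write_rate_mb_per_hour: float,
    longest_downtime_hours: float,
    safety_factor: float = 1.5,  # assumed headroom, not a MongoDB setting
) -> bool:
    """Check whether the estimated window exceeds the longest downtime plus headroom."""
    window = oplog_window_hours(oplog_size_mb, write_rate_mb_per_hour)
    return window >= longest_downtime_hours * safety_factor


# Hypothetical case: shrinking from A = 10240 MB to B = 2048 MB at 512 MB/hour.
print(oplog_window_hours(10240, 512))                         # 20.0
print(oplog_window_hours(2048, 512))                          # 4.0
print(covers_downtime(10240, 512, longest_downtime_hours=8))  # True
print(covers_downtime(2048, 512, longest_downtime_hours=8))   # False
```

In this made-up case, size B would no longer cover an 8 hour outage, which is the situation where a member falls too far behind and needs a full resync.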

My recommendation is to keep the oplogs the same size on all machines in a replica set. You never know when your primary will go down, and if the secondary that takes over has a smaller oplog it will hold less data, which could cause problems. Also note that all the data in the Primary oplog is replicated down to the Secondary oplogs, so a Secondary with a smaller oplog would have a shorter oplog window.

Hi,

To further clarify Doug’s comment: the primary should be considered a transient role, since a secondary can be elected primary in the event of failover. If you have different replica set member configurations in terms of oplog size or system resources, failover may result in unanticipated consequences (for example, reducing the time you have to get a former primary back online before it becomes stale).

Varying member configurations are supported, but you should have good reasons for doing so and should definitely try to model failover scenarios. You will encounter fewer operational challenges using the default deployment settings with identically configured replica set members.

Some replica set configuration/failover considerations:

  • Failover is not always a result of failure. For example, regular maintenance activities such as upgrading software versions may also require you to briefly restart services on a replica set member.

  • If your deployment is distributed across multiple data centres, consider the effect of chained replication (which is enabled by default). With chained replication, a secondary can choose to replicate from another secondary of the replica set that is closer (based on network ping time) than the current primary.

  • The current duration of the oplog is estimated from the difference between the timestamps of its first and last entries. If data insert or update patterns change significantly in your workload, the oplog duration will also be affected. Note: the upcoming MongoDB 4.4 server release adds a new Minimum Oplog Retention Period to provide better assurance on oplog duration.
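To illustrate the last point, here is a minimal sketch of how the oplog duration falls out of the first and last entry timestamps. The function name and the sample timestamps are hypothetical; on a live deployment the shell helper rs.printReplicationInfo() reports a comparable figure.

```python
from datetime import datetime, timezone


def oplog_duration_hours(first_entry: datetime, last_entry: datetime) -> float:
    """Hours covered by the oplog, from its oldest entry to its newest entry."""
    if last_entry < first_entry:
        raise ValueError("last entry must not be older than first entry")
    return (last_entry - first_entry).total_seconds() / 3600


# Hypothetical timestamps for the oldest and newest oplog entries.
first = datetime(2020, 5, 1, 0, 0, tzinfo=timezone.utc)
last = datetime(2020, 5, 2, 6, 0, tzinfo=timezone.utc)
print(oplog_duration_hours(first, last))  # 30.0
```

Because the oplog has a fixed size, a heavier write workload pushes the oldest entry forward in time faster, so this duration shrinks even though the configured size has not changed.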

If you want to better understand behaviour for a proposed deployment configuration, I suggest standing up a replica set in a test environment to simulate scenarios.

If oplog sizes vary, members with a larger oplog will be able to store more history. The oplog sizes on each replica set member are not determined by the size of the source oplog.
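One way to model a failover scenario on paper before testing it: since every member applies the same writes, each member's window depends only on its own oplog size. The sketch below assumes a single, constant write rate and uses hypothetical member names and sizes.

```python
from typing import Dict, Tuple


def member_windows(sizes_mb: Dict[str, float], write_rate_mb_per_hour: float) -> Dict[str, float]:
    """Estimated oplog window in hours for each member, given a shared write rate."""
    if write_rate_mb_per_hour <= 0:
        raise ValueError("write rate must be positive")
    return {name: size / write_rate_mb_per_hour for name, size in sizes_mb.items()}


def shortest_window(windows: Dict[str, float]) -> Tuple[str, float]:
    """Member with the smallest window, i.e. the worst case if it is the sync source."""
    name = min(windows, key=windows.get)
    return name, windows[name]


# Hypothetical replica set: the current primary has a larger oplog than the secondaries.
sizes = {"node-a": 20480, "node-b": 10240, "node-c": 10240}
windows = member_windows(sizes, write_rate_mb_per_hour=1024)
print(windows)                   # {'node-a': 20.0, 'node-b': 10.0, 'node-c': 10.0}
print(shortest_window(windows))  # ('node-b', 10.0)
```

In this example, if node-a fails and node-b is elected, node-a has roughly 10 hours rather than 20 to come back online before it becomes stale.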

Regards,
Stennie