I have a MongoDB replica set in us-west-1 with three members. A fourth member sits in us-east-1 and is designated for DR.
The member in us-east-1 died and we had to restart the server. Ever since, we have been unable to sync it with the primary. We took a snapshot from the primary, copied it to us-east-1, created a volume from it, attached the volume to the DR member and started mongo.
The DR member reports that it is 3.67 hours behind the primary and is unable to catch up. When I run
db.printSlaveReplicationInfo()
the lag keeps increasing until the member falls so far behind the primary that its status changes to
could not find member to sync from
The question is, what could prevent this member from catching up with the primary? I suspect a combination of network latency and a high volume of write operations. I read up a bit on write concern, but it doesn't seem like it would solve the issue.
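One thing I have not ruled out is that the lag has grown past the primary's oplog window, in which case the member could no longer resume from where it left off. Below is a minimal sketch of that comparison, not a definitive diagnosis. canCatchUp is a hypothetical helper of my own (not a MongoDB API), and the 2 hour oplog window is a made-up figure for illustration; only the 3.67 hour lag comes from what I actually see.

```javascript
// Hypothetical helper (not a MongoDB API): compares a member's replication lag
// with the sync source's oplog window. All arguments are seconds since epoch.
function canCatchUp(oplogFirstSecs, oplogLastSecs, memberOptimeSecs) {
  const windowHours = (oplogLastSecs - oplogFirstSecs) / 3600;
  const lagHours = (oplogLastSecs - memberOptimeSecs) / 3600;
  // The member can resume only if its last applied entry is still in the oplog.
  const ok = memberOptimeSecs >= oplogFirstSecs;
  return { windowHours, lagHours, ok };
}

// Illustrative numbers only: an assumed 2 hour oplog window and the 3.67 hour lag.
const oplogLast = 1000000;
const oplogFirst = oplogLast - 2 * 3600;
const memberOptime = oplogLast - Math.round(3.67 * 3600);
const result = canCatchUp(oplogFirst, oplogLast, memberOptime);
console.log(result.ok ? "can still catch up" : "fell off the oplog, needs a resync");
```

As far as I understand, the real inputs would come from db.getReplicationInfo() on the primary (tFirst, tLast, timeDiffHours) and from the DR member's optimeDate in rs.status().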
Any ideas how to overcome this issue and sync the DR member successfully?