Hi,
I have a mongo cluster in us-west-1 with three members. I have another member in us-east-1 that is designated for DR.
The member in us-east-1 died and we had to restart the server. Since then, we have been unable to sync it against the primary. We took a snapshot from the primary, copied it to us-east-1, created a volume from it, attached the volume to the DR member and started mongo.
The DR member shows that it’s 3.67 hours behind the primary and is unable to catch up. When I run db.printSlaveReplicationInfo(), the lag keeps increasing until it reaches a point where it’s so far behind the primary that it transitions to RECOVERING, and rs.status() says "could not find member to sync from".
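To make the lag figure concrete, here is a minimal sketch of how it can be derived from rs.status()-style output. It assumes a document with a "members" list whose entries carry "name", "stateStr" and "optimeDate", which is the shape replSetGetStatus returns. The host names and timestamps below are made up for illustration; they are not from my cluster.

```python
from datetime import datetime, timedelta


def replication_lag(status):
    """Return {member name: lag behind the primary} for every non-primary member."""
    members = status["members"]
    # The primary's last applied operation time is the reference point.
    primary = next(m for m in members if m["stateStr"] == "PRIMARY")
    return {
        m["name"]: primary["optimeDate"] - m["optimeDate"]
        for m in members
        if m["stateStr"] != "PRIMARY"
    }


# Hypothetical sample shaped like rs.status() output (names and times are invented).
sample_status = {
    "members": [
        {"name": "west-a:27017", "stateStr": "PRIMARY",
         "optimeDate": datetime(2020, 1, 1, 12, 0, 0)},
        {"name": "east-dr:27017", "stateStr": "RECOVERING",
         "optimeDate": datetime(2020, 1, 1, 8, 19, 48)},
    ]
}

lags = replication_lag(sample_status)
for name, lag in lags.items():
    print(f"{name} is {lag.total_seconds() / 3600:.2f} hours behind the primary")
```

Against a live cluster, the same function should work on the result of pymongo's client.admin.command("replSetGetStatus"), assuming pymongo is installed and the connection is allowed to run that command.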
The question is, what could cause this member not to catch up with the primary? I suspect a combination of network latency between the regions and a high volume of write operations on the primary. I read up a bit on write concern, but it doesn’t seem like it would solve the issue.
Any ideas how to overcome this issue and sync the DR member successfully?