One of the two secondary nodes did not start an election when the primary node was down for a long time

Hi,

I have deployed MongoDB as a replica set on an RKE2 cluster with 3 replicas. One of them acts as the PRIMARY node and the other 2 replicas act as SECONDARY nodes.

Unfortunately, the worker node on which the PRIMARY node is scheduled was taken for maintenance, and that replica kept trying to come up until the maintenance window was finished.

During this time the worker node stayed in READY state, so the pod was not rescheduled on a different node. The pod also did not reach READY state until the maintenance was finished.

An election should then have happened between the two SECONDARY nodes, but it did not.

Please help me understand why no election happened between the remaining 2 SECONDARY nodes when the PRIMARY was not available.

MongoDB version: 4.4.12
Replica set with 3 replicas (1 PRIMARY & 2 SECONDARY) plus 1 ARBITER

I would assume it has to do with your replica set configuration. You need an ODD number of voting members in the replica set, but you have an EVEN number. An arbiter is a voting member in the election process, so you have 4 voting members.

An even number of voting members can cause issues in elections and voting.

Deploy an Odd Number of Members

Ensure that the replica set has an odd number of voting members. A replica set can have up to 7 voting members. If you have an even number of voting members, deploy another data bearing voting member or, if constraints prohibit against another data bearing voting member, an arbiter.

An arbiter does not store a copy of the data and requires fewer resources. As a result, you may run an arbiter on an application server or other shared resource. With no copy of the data, it may be possible to place an arbiter into environments that you would not place other members of the replica set. Consult your security policies.
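
To illustrate the arithmetic behind the odd-number advice, here is a minimal Python sketch. It is not MongoDB code, and the function names are made up for illustration. It only assumes that electing a primary needs a strict majority of the voting members.

```python
def majority_needed(voting_members: int) -> int:
    """Votes required to elect a primary: a strict majority (hypothetical helper)."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many voting members can be down while a majority is still reachable."""
    return voting_members - majority_needed(voting_members)


# Compare the set sizes discussed in this thread.
for n in (3, 4, 5):
    print(f"{n} voting members: majority={majority_needed(n)}, "
          f"can lose={fault_tolerance(n)}")
```

Going from 3 to 4 voting members raises the required majority from 2 to 3 without tolerating any additional failure, while 5 voting members can lose 2.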


Thanks for the suggestion. We are now planning to increase the total number of voting members to 5 by adding one more arbiter.
Can you please let us know how to decrease the time taken for a primary election? We have a lot of writes that will fail until the new primary is elected.