Hi team, I have been reading the MongoDB documentation about elections. I see that a replica set needs at least 3 nodes to make sure a majority can be reached. So, as I understand it, if a 2-node replica set loses its primary, the replica set is dead.
Because you need both nodes up and running to reach a majority and complete an election.
But when I tried to verify this theory, I found that when the primary of a 2-node replica set is dead, the secondary can still complete an election and promote itself to primary.
This is really confusing because I don’t think this is the expected behaviour.
Here is how I tested it:
-
Start 2 k8s pods: one is mongo-shard-0-0, the other is mongo-shard-0-1. This is what it looks like when I enter mongo-shard-0-0 and run rs.status(); the result is this:
{
“set” : “mongo-shard-rs-0”,
“date” : ISODate(“2024-10-16T05:41:27.347Z”),
“myState” : 2,
“term” : NumberLong(46),
“syncSourceHost” : “mongo-shard-0-1.mongo-shard-0-svc.default.svc.cluster.local:27017”,
“syncSourceId” : 1,
“heartbeatIntervalMillis” : NumberLong(2000),
“majorityVoteCount” : 2,
“writeMajorityCount” : 2,
“votingMembersCount” : 2,
“writableVotingMembersCount” : 2,
“optimes” : {
“lastCommittedOpTime” : {
“ts” : Timestamp(1729057285, 1),
“t” : NumberLong(46)
},
“lastCommittedWallTime” : ISODate(“2024-10-16T05:41:25.147Z”),
“readConcernMajorityOpTime” : {
“ts” : Timestamp(1729057285, 1),
“t” : NumberLong(46)
},
“appliedOpTime” : {
“ts” : Timestamp(1729057285, 1),
“t” : NumberLong(46)
},
“durableOpTime” : {
“ts” : Timestamp(1729057285, 1),
“t” : NumberLong(46)
},
“lastAppliedWallTime” : ISODate(“2024-10-16T05:41:25.147Z”),
“lastDurableWallTime” : ISODate(“2024-10-16T05:41:25.147Z”)
},
“lastStableRecoveryTimestamp” : Timestamp(1729057275, 1),
“electionParticipantMetrics” : {
“votedForCandidate” : true,
“electionTerm” : NumberLong(46),
“lastVoteDate” : ISODate(“2024-10-16T05:40:25.036Z”),
“electionCandidateMemberId” : 1,
“voteReason” : “”,
“lastAppliedOpTimeAtElection” : {
“ts” : Timestamp(1729057182, 1),
“t” : NumberLong(44)
},
“maxAppliedOpTimeInSet” : {
“ts” : Timestamp(1729057182, 1),
“t” : NumberLong(44)
},
“priorityAtElection” : 1,
“newTermStartDate” : ISODate(“2024-10-16T05:40:25.060Z”),
“newTermAppliedDate” : ISODate(“2024-10-16T05:40:27.177Z”)
},
“members” : [
{
“_id” : 0,
“name” : “mongo-shard-0-0.mongo-shard-0-svc.default.svc.cluster.local:27017”,
“health” : 1,
“state” : 2,
“stateStr” : “SECONDARY”,
“uptime” : 77,
“optime” : {
“ts” : Timestamp(1729057285, 1),
“t” : NumberLong(46)
},
“optimeDate” : ISODate(“2024-10-16T05:41:25Z”),
“lastAppliedWallTime” : ISODate(“2024-10-16T05:41:25.147Z”),
“lastDurableWallTime” : ISODate(“2024-10-16T05:41:25.147Z”),
“syncSourceHost” : “mongo-shard-0-1.mongo-shard-0-svc.default.svc.cluster.local:27017”,
“syncSourceId” : 1,
“infoMessage” : “”,
“configVersion” : 1153530,
“configTerm” : 46,
“self” : true,
“lastHeartbeatMessage” : “”
},
{
“_id” : 1,
“name” : “mongo-shard-0-1.mongo-shard-0-svc.default.svc.cluster.local:27017”,
“health” : 1,
“state” : 1,
“stateStr” : “PRIMARY”,
“uptime” : 69,
“optime” : {
“ts” : Timestamp(1729057285, 1),
“t” : NumberLong(46)
},
“optimeDurable” : {
“ts” : Timestamp(1729057285, 1),
“t” : NumberLong(46)
},
“optimeDate” : ISODate(“2024-10-16T05:41:25Z”),
“optimeDurableDate” : ISODate(“2024-10-16T05:41:25Z”),
“lastAppliedWallTime” : ISODate(“2024-10-16T05:41:25.147Z”),
“lastDurableWallTime” : ISODate(“2024-10-16T05:41:25.147Z”),
“lastHeartbeat” : ISODate(“2024-10-16T05:41:25.570Z”),
“lastHeartbeatRecv” : ISODate(“2024-10-16T05:41:27.060Z”),
“pingMs” : NumberLong(0),
“lastHeartbeatMessage” : “”,
“syncSourceHost” : “”,
“syncSourceId” : -1,
“infoMessage” : “”,
“electionTime” : Timestamp(1729057225, 1),
“electionDate” : ISODate(“2024-10-16T05:40:25Z”),
“configVersion” : 1153530,
“configTerm” : 46
}
],
“ok” : 1,
“$gleStats” : {
“lastOpTime” : Timestamp(0, 0),
“electionId” : ObjectId(“000000000000000000000000”)
},
“lastCommittedOpTime” : Timestamp(1729057285, 1),
“$configServerState” : {
“opTime” : {
“ts” : Timestamp(1729057284, 1),
“t” : NumberLong(-1)
}
},
“$clusterTime” : {
“clusterTime” : Timestamp(1729057285, 1),
“signature” : {
“hash” : BinData(0,“AAAAAAAAAAAAAAAAAAAAAAAAAAA=”),
“keyId” : NumberLong(0)
}
},
“operationTime” : Timestamp(1729057285, 1)
} -
Then I scaled mongo shard 0 down to only one pod, so mongo-shard-0-1 was killed and only mongo-shard-0-0 remains. When I enter mongo-shard-0-0 and run rs.status(), the result is this:
{
“set” : “mongo-shard-rs-0”,
“date” : ISODate(“2024-10-16T05:48:18.841Z”),
“myState” : 1,
“term” : NumberLong(48),
“syncSourceHost” : “”,
“syncSourceId” : -1,
“heartbeatIntervalMillis” : NumberLong(2000),
“majorityVoteCount” : 1,
“writeMajorityCount” : 1,
“votingMembersCount” : 1,
“writableVotingMembersCount” : 1,
“optimes” : {
“lastCommittedOpTime” : {
“ts” : Timestamp(1729057690, 1),
“t” : NumberLong(48)
},
“lastCommittedWallTime” : ISODate(“2024-10-16T05:48:10.632Z”),
“readConcernMajorityOpTime” : {
“ts” : Timestamp(1729057690, 1),
“t” : NumberLong(48)
},
“appliedOpTime” : {
“ts” : Timestamp(1729057690, 1),
“t” : NumberLong(48)
},
“durableOpTime” : {
“ts” : Timestamp(1729057690, 1),
“t” : NumberLong(48)
},
“lastAppliedWallTime” : ISODate(“2024-10-16T05:48:10.632Z”),
“lastDurableWallTime” : ISODate(“2024-10-16T05:48:10.632Z”)
},
“lastStableRecoveryTimestamp” : Timestamp(1729057630, 1),
“electionCandidateMetrics” : {
“lastElectionReason” : “electionTimeout”,
“lastElectionDate” : ISODate(“2024-10-16T05:45:30.617Z”),
“electionTerm” : NumberLong(48),
“lastCommittedOpTimeAtElection” : {
“ts” : Timestamp(1729057495, 1),
“t” : NumberLong(46)
},
“lastSeenOpTimeAtElection” : {
“ts” : Timestamp(1729057495, 1),
“t” : NumberLong(46)
},
“numVotesNeeded” : 1,
“priorityAtElection” : 1,
“electionTimeoutMillis” : NumberLong(10000),
“newTermStartDate” : ISODate(“2024-10-16T05:45:30.628Z”),
“wMajorityWriteAvailabilityDate” : ISODate(“2024-10-16T05:45:30.639Z”)
},
“electionParticipantMetrics” : {
“votedForCandidate” : true,
“electionTerm” : NumberLong(46),
“lastVoteDate” : ISODate(“2024-10-16T05:40:25.036Z”),
“electionCandidateMemberId” : 1,
“voteReason” : “”,
“lastAppliedOpTimeAtElection” : {
“ts” : Timestamp(1729057182, 1),
“t” : NumberLong(44)
},
“maxAppliedOpTimeInSet” : {
“ts” : Timestamp(1729057182, 1),
“t” : NumberLong(44)
},
“priorityAtElection” : 1
},
“members” : [
{
“_id” : 0,
“name” : “mongo-shard-0-0.mongo-shard-0-svc.default.svc.cluster.local:27017”,
“health” : 1,
“state” : 1,
“stateStr” : “PRIMARY”,
“uptime” : 488,
“optime” : {
“ts” : Timestamp(1729057690, 1),
“t” : NumberLong(48)
},
“optimeDate” : ISODate(“2024-10-16T05:48:10Z”),
“lastAppliedWallTime” : ISODate(“2024-10-16T05:48:10.632Z”),
“lastDurableWallTime” : ISODate(“2024-10-16T05:48:10.632Z”),
“syncSourceHost” : “”,
“syncSourceId” : -1,
“infoMessage” : “”,
“electionTime” : Timestamp(1729057530, 1),
“electionDate” : ISODate(“2024-10-16T05:45:30Z”),
“configVersion” : 1182308,
“configTerm” : -1,
“self” : true,
“lastHeartbeatMessage” : “”
}
],
“ok” : 1,
“$gleStats” : {
“lastOpTime” : Timestamp(0, 0),
“electionId” : ObjectId(“7fffffff0000000000000030”)
},
“lastCommittedOpTime” : Timestamp(1729057690, 1),
“$configServerState” : {
“opTime” : {
“ts” : Timestamp(1729057697, 2),
“t” : NumberLong(-1)
}
},
“$clusterTime” : {
“clusterTime” : Timestamp(1729057697, 2),
“signature” : {
“hash” : BinData(0,“AAAAAAAAAAAAAAAAAAAAAAAAAAA=”),
“keyId” : NumberLong(0)
}
},
“operationTime” : Timestamp(1729057690, 1)
}