readConcern with more than 1 secondary and an arbiter

MaBeuLux88_xxx · October 12, 2021, 5:04pm

Arbiters are not recommended in production environments in general because they can create “weird” situations that can be dangerous for the stability of the RS.

Let’s take your example: 4 nodes + 1 arbiter. Your majority = 3. So this means that – in theory – you should still be up and running if 2 nodes are suddenly out of the picture (data center that contains 2 nodes has a connection failure for example).
Yet, it’s not the case here. If 2 nodes fails, you will end up with 2 nodes (P+S) and one arbiter. The arbiter participates in the majority for the votes, but not in the majority for the majority commit point. This will force the majority commit point to lag behind which will start to create some cache pressure on your storage engine that will need to keep all the changes that happen after the commit point on disk to retain a durable history.

See Performance Issues with PSA replica sets and Mitigate Performance Issues with PSA Replica Set.

The consequence is that if 2 nodes fail, your majority commit point cannot move forward and you cannot read anything with readConcern=majority + you are building cache pressure (==timed bomb).

Same problem if you try to write something with writeConcern=“majority” (==3). As only 2 nodes are really bearing data in your PSA cluster (S+S down) => You cannot write.

Conclusion: Don’t use an Arbiter in production environment to avoid issues. If you do, you shouldn’t use writeConcern=“majority”. You are stuck at maximum w=2 in this scenario and you cannot use readConcern=“majority”. Note that doing this doesn’t solve the problem of the cache pressure that will start to build up in this situation as soon as you are in this state.

Cheers,
Maxime.