How Enabling FCV on Sharded Clusters Works

Cristi_Radan · September 15, 2020, 12:47pm

Hello,

I’m trying to understand how setting FCV on a sharded cluster works, taking version 4.0 as an example.
According to the documentation, setting a new FCV on a sharded cluster is done through a mongos instance.
I tried to check the code and found the following workflow:

1. FCV is triggered on mongos: mongo/cluster_set_feature_compatibility_version_cmd.cpp at 91e3352a1aa717674575fce3cc6edb2f279a4479 · mongodb/mongo · GitHub
2. mongos sends the command to the config servers: mongo/cluster_set_feature_compatibility_version_cmd.cpp at 91e3352a1aa717674575fce3cc6edb2f279a4479 · mongodb/mongo · GitHub
3. The config replica set executes mongo/set_feature_compatibility_version_command.cpp at v4.0 · mongodb/mongo · GitHub
4. It does that through the block dedicated to the config server: mongo/set_feature_compatibility_version_command.cpp at v4.0 · mongodb/mongo · GitHub
5. Within this block, the config server will update to the new FCV and trigger upgradeChunksHistory for each collection.
6. It will trigger the FCV upgrade on each shard: mongo/set_feature_compatibility_version_command.cpp at v4.0 · mongodb/mongo · GitHub.
7. Finally, FCV is set on the config replica set.
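
To make the ordering concrete, here is a minimal Python sketch of the steps above as I read them. It only models the order of operations and does not talk to a real cluster. `ReplicaSet`, `Cluster`, `config_server_set_fcv` and `mongos_set_fcv` are hypothetical names for illustration, not MongoDB APIs, and the starting FCV of "3.6" is an assumption. The real server code may order these steps differently.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicaSet:
    """Hypothetical stand-in for one replica set (a shard or the config servers)."""
    name: str
    fcv: str = "3.6"  # assumed starting FCV before the upgrade to 4.0


@dataclass
class Cluster:
    """Hypothetical model of a sharded cluster, used only to record ordering."""
    config: ReplicaSet
    shards: list
    collections: list
    log: list = field(default_factory=list)  # ordered record of what happened


def config_server_set_fcv(cluster, version):
    """Config-server branch, in the order listed in the steps above."""
    # Chunk history is upgraded for every sharded collection.
    for coll in cluster.collections:
        cluster.log.append(f"upgradeChunksHistory {coll}")
    # The FCV change is forwarded to each shard, one replica set at a time.
    for shard in cluster.shards:
        shard.fcv = version
        cluster.log.append(f"setFCV {shard.name}")
    # Only then is the new FCV recorded on the config replica set itself.
    cluster.config.fcv = version
    cluster.log.append(f"setFCV {cluster.config.name}")


def mongos_set_fcv(cluster, version):
    """In this model, mongos does no local work and only forwards the command."""
    cluster.log.append("mongos -> config")
    config_server_set_fcv(cluster, version)


cluster = Cluster(
    config=ReplicaSet("configRS"),
    shards=[ReplicaSet("shard0"), ReplicaSet("shard1")],
    collections=["db.users"],
)
mongos_set_fcv(cluster, "4.0")
for entry in cluster.log:
    print(entry)
```

In this model the config replica set is the last one to record the new FCV, which is the behaviour I am asking about.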
===============================
As I understand from the above, FCV is set on the config replica set after it has been set on all shards.
Locks, at least in version 4.0, seem to be taken only at the individual replica set level, correct?
Is this the correct flow for a sharded cluster?

Thank you,
Cristian

Joe_Drumgoole · September 16, 2020, 4:33pm

Hi Cristian, why is it important to know this? This is the kind of internal detail that is likely to change in future versions of MongoDB and shouldn’t make a difference to end users.

Cristi_Radan · September 22, 2020, 9:39am

Hi,

Sorry for the late reply. I’ll try to give some context on why we’re interested in how this works, especially for 4.0.
We have some very large clusters in PROD (over 20 shards, multiple databases and collections, and a config database with over 1M documents). We’ve observed an impact when setting FCV to 4.0 on such a cluster: the whole cluster “freezes” while the cache chunks are refreshed on each shard.
I think that understanding this process would help us find a way to do this with minimal impact on the overall cluster. I hope to understand which factors can cause impact when setting FCV.

Thank you,
Cristian

Joe_Drumgoole · September 22, 2020, 10:20am

Hi Cristian, you should raise a SERVER ticket on jira.mongodb.org. Our core engineering team may have some answers.

Cristi_Radan · September 22, 2020, 10:48am

Thank you Joe.
I will do that.

Best Regards,
Cristian