Hey,
we have a very simple MongoDB replica set with 3 nodes, and our custom software uses MongoDB as its backend.
Last week I upgraded (or tried to, actually) the cluster, first from 3.4 → 3.6 (Thursday) and then from 3.6 → 4.0 (Friday morning). Later on Friday we noticed an odd problem with one component of our custom software for which we couldn’t find any cause, so I had to downgrade back to 3.6, which solved the issue.
The only change, besides the mongod upgrade, was adding the readPreference parameter (readPreference=secondaryPreferred) to the MongoDB connection string of our software components during the 3.6 → 4.0 upgrade, and removing it again (while debugging our own software) at the last jump (bindIp 0.0.0.0), which explains the changes in the blue area.
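To make the connection-string change concrete, here is a minimal sketch of the two variants. The host names and the replica set name are placeholders I made up for illustration, not our real values; only the readPreference=secondaryPreferred option is what we actually added and later removed.

```python
from urllib.parse import urlencode

# Hypothetical host names and replica set name, for illustration only.
HOSTS = [
    "mongo1.example.com:27017",
    "mongo2.example.com:27017",
    "mongo3.example.com:27017",
]
REPLICA_SET = "rs0"


def build_uri(read_preference=None):
    """Build a MongoDB connection string, optionally with readPreference."""
    options = {"replicaSet": REPLICA_SET}
    if read_preference is not None:
        options["readPreference"] = read_preference
    return "mongodb://" + ",".join(HOSTS) + "/?" + urlencode(options)


# Connection string without the option (before the change, and after removing it).
uri_before = build_uri()
# Connection string with the option added during the 3.6 → 4.0 upgrade.
uri_after = build_uri("secondaryPreferred")

print(uri_before)
print(uri_after)
```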
We have very simple monitoring for this cluster using Zabbix and mongostat. Today I noticed that on the primary node the number of getmore commands jumped during the upgrades. Log level 1 tells me that, at least currently (I have no idea about before), 99.9% of those are against the collection oplog.rs (local.oplog.rs). The graph below shows statistics for the primary node:
So my questions are:
A. Is this something adding the readPreference=secondaryPreferred option would cause?
B. Is this something a cluster upgrade would cause?
C. Is there some other logical explanation?
I can provide more details about the upgrade/downgrade process if that helps, but mostly I’m just wondering whether this is “normal” between these versions or something that should be investigated further.
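For context on the numbers, here is a rough sketch of how a per-second getmore rate can be derived from two samples of the cumulative getmore counter that serverStatus reports under opcounters. This is an assumption about the calculation, not our actual Zabbix item, and the sample values are made up. The counter restarts from zero when mongod is restarted, which happens on every upgrade or downgrade step, so the sketch treats a drop as a reset.

```python
def getmore_rate(prev, curr, interval_seconds):
    """Per-second getmore rate between two opcounters snapshots.

    `prev` and `curr` are dicts shaped like serverStatus()["opcounters"].
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr["getmore"] - prev["getmore"]
    if delta < 0:
        # The counter went backwards, so mongod was restarted in between;
        # count only what has accumulated since the restart.
        delta = curr["getmore"]
    return delta / interval_seconds


# Made-up sample values, 60 seconds apart.
normal = getmore_rate({"getmore": 1000}, {"getmore": 1600}, 60)
after_restart = getmore_rate({"getmore": 1600}, {"getmore": 120}, 60)
print(normal, after_restart)
```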
Any advice/comment would be helpful,
Thanks!
