Hi all,
I had a PSA architecture set up and running with no problems. Then I was tasked with swapping the arbiter and the secondary around to cut cross-AZ bandwidth costs within a single AWS region.
First I removed the arbiter, converted it to a secondary, and added it back to the cluster. At that point the set was PSS.
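For reference, roughly the sequence I used for that step (the old arbiter's hostname below is a placeholder, not the real one):

```
// On the primary: drop the old arbiter from the replica set config
rs.remove("mongoB.somewhere.com:27017")  // placeholder hostname for the old arbiter

// On that host: stop mongod, clear out the (tiny) arbiter dbPath, restart mongod
// with the same replSetName, then from the primary add it back as a data-bearing
// member so it can initial sync
rs.add("mongoB.somewhere.com:27017")
```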
Then I removed the secondary that I'm converting into an arbiter from the replica set. I cleared out its data directory and started mongod on the new arbiter, which is the exact same instance class I was using before, a t4g.small. More than enough. I get the typical init-failure/checkpoint logs of a node waiting for replica set information. Totally normal.
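Roughly what that looked like from the primary (the stop/wipe happened on the host itself):

```
// On the primary: drop the secondary that is becoming the new arbiter
rs.remove("mongoA.somewhere.com:27017")

// Then on mongoA: stop mongod, delete everything under its dbPath, and start
// mongod again with the same replSetName as before. Its logs then just sit
// there waiting for a replica set config, as described above.
```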
When I go to add the arbiter back with `rs.addArb("mongoA.somewhere.com:27017")`, it stays ARBITER and healthy for about 10 seconds, then the load on the mongoA host completely skyrockets, with no connections from anywhere except the P and S in the replSet (per `netstat -natp`). Something like 2k+ sysload. The little box dies, and the cluster marks the arbiter unhealthy.
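For what it's worth, this is roughly how I'm watching it from the primary; it looks fine for a few seconds and then falls over:

```
// Run from the primary right after rs.addArb()
rs.status().members.filter(m => m.name === "mongoA.somewhere.com:27017")
// For ~10 seconds: stateStr "ARBITER", health 1; then mongoA's load spikes
// and the member is reported as unhealthy
```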
Is there something strange about re-adding an arbiter that was a secondary? Is reusing the hostname bad?
It seems pretty bonkers that the initialization fails. Adding the arbiter originally didn't do anything of the sort and only increased CPU a bit. Obviously I don't want to spin up something massive just to init the arbiter, since I'd then have to downsize it and add/remove the arbiter all over again.
Self-hosted MongoDB 4.4.23 on all three nodes. Data size is about 350 GB.