Mongo cluster availability issues

From time to time, always in the same environment, our MongoDB cluster (installed with the Percona operator) becomes unavailable.

Until we get a chance to rebuild the environment, our workaround is to scale MongoDB down and then back up.

Percona version 11.2.1
Mongo version 5.0.7

Some log entries that caught my attention:

mongo-cluster-cfg-0:

"s":"I",  "c":"CONNPOOL", "id":22576,   "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"mongodb-cluster-rs0-1.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017"}}

"s":"W",  "c":"NETWORK",  "id":23235,   "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"SSL peer certificate validation failed","attr":{"reason":"self signed certificate"}}

"s":"I",  "c":"NETWORK",  "id":4333213, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM Topology Change","attr":{"replicaSet":"rs0","newTopologyDescription":"{ id: \"1cfeadb1-bb62-4744-912b-c19ce0c33385\", topologyType: \"ReplicaSetNoPrimary\", servers: { mongodb-cluster-rs0-0.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-rs0-0.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", type: \"Unknown\", minWireVersion: 0, maxWireVersion: 0, lastUpdateTime: new Date(-9223372036854775808), hosts: {}, arbiters: {}, passives: {} }, mongodb-cluster-rs0-1.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-rs0-1.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", topologyVersion: { processId: ObjectId('63a050118196af8adf3e85d9'), counter: 3 }, roundTripTime: 792, lastWriteDate: new Date(1676554432000), opTime: { ts: Timestamp(1676554432, 1), t: 32 }, type: \"RSSecondary\", minWireVersion: 13, maxWireVersion: 13, me: \"mongodb-cluster-rs0-1.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", setName: \"rs0\", setVersion: 108768, primary: \"mongodb-cluster-rs0-2.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", lastUpdateTime: new Date(1676554439781), logicalSessionTimeoutMinutes: 30, hosts: { 0: \"mongodb-cluster-rs0-0.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", 1: \"mongodb-cluster-rs0-1.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", 2: \"mongodb-cluster-rs0-2.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\" }, arbiters: {}, passives: {}, tags: { podName: \"mongodb-cluster-rs0-1\", serviceName: \"mongodb-cluster\" } }, mongodb-cluster-rs0-2.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-rs0-2.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", type: \"Unknown\", minWireVersion: 0, maxWireVersion: 0, lastUpdateTime: new Date(-9223372036854775808), hosts: {}, arbiters: {}, passives: {} } }, logicalSessionTimeoutMinutes: 30, setName: 
\"rs0\", compatible: true, maxElectionIdSetVersion: { electionId: ObjectId('7fffffff0000000000000020'), setVersion: 108768 } }","previousTopologyDescription":"{ id: \"f3fd6d57-5393-4493-8c11-c37d7262fbcc\", topologyType: \"ReplicaSetNoPrimary\", servers: { mongodb-cluster-rs0-0.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-rs0-0.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", type: \"Unknown\", minWireVersion: 0, maxWireVersion: 0, lastUpdateTime: new Date(-9223372036854775808), hosts: {}, arbiters: {}, passives: {} }, mongodb-cluster-rs0-1.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-rs0-1.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", type: \"Unknown\", minWireVersion: 0, maxWireVersion: 0, lastUpdateTime: new Date(-9223372036854775808), hosts: {}, arbiters: {}, passives: {} }, mongodb-cluster-rs0-2.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-rs0-2.mongodb-cluster-rs0.mongodb.svc.cluster.local:27017\", type: \"Unknown\", minWireVersion: 0, maxWireVersion: 0, lastUpdateTime: new Date(-9223372036854775808), hosts: {}, arbiters: {}, passives: {} } }, setName: \"rs0\", compatible: true, maxElectionIdSetVersion: { electionId: ObjectId('7fffffff0000000000000020'), setVersion: 108768 } }"}}

cluster-mongoS-0:

"s":"I",  "c":"NETWORK",  "id":4333213, "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"RSM Topology Change","attr":{"replicaSet":"cfg","newTopologyDescription":"{ id: \"61b03df2-d4a2-453f-b841-8483b2c25bb7\", topologyType: \"ReplicaSetWithPrimary\", servers: { mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", topologyVersion: { processId: ObjectId('63ee2aac594f4fe7eab6c104'), counter: 5 }, roundTripTime: 997, lastWriteDate: new Date(1676554296000), opTime: { ts: Timestamp(1676554296, 4), t: 39 }, type: \"RSPrimary\", minWireVersion: 13, maxWireVersion: 13, me: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", setName: \"cfg\", setVersion: 137515, electionId: ObjectId('7fffffff0000000000000027'), primary: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", lastUpdateTime: new Date(1676554296828), logicalSessionTimeoutMinutes: 30, hosts: { 0: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 1: \"mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 2: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\" }, arbiters: {}, passives: {}, tags: { podName: \"mongodb-cluster-cfg-0\", serviceName: \"mongodb-cluster\" } }, mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", topologyVersion: { processId: ObjectId('63ee2a68c3b77cbea7abef9b'), counter: 80 }, roundTripTime: 765, lastWriteDate: new Date(1676554296000), opTime: { ts: Timestamp(1676554296, 4), t: 39 }, type: \"RSSecondary\", minWireVersion: 13, maxWireVersion: 13, me: \"mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", setName: \"cfg\", setVersion: 137515, primary: 
\"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", lastUpdateTime: new Date(1676554296920), logicalSessionTimeoutMinutes: 30, hosts: { 0: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 1: \"mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 2: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\" }, arbiters: {}, passives: {}, tags: { podName: \"mongodb-cluster-cfg-1\", serviceName: \"mongodb-cluster\" } }, mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", topologyVersion: { processId: ObjectId('63b71bb65ff176f83ec5c1b1'), counter: 6 }, roundTripTime: 992, lastWriteDate: new Date(1676552843000), opTime: { ts: Timestamp(1676552843, 2), t: 38 }, type: \"RSOther\", minWireVersion: 13, maxWireVersion: 13, me: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", setName: \"cfg\", setVersion: 137515, lastUpdateTime: new Date(1676554295898), logicalSessionTimeoutMinutes: 30, hosts: { 0: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 1: \"mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 2: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\" }, arbiters: {}, passives: {}, tags: { podName: \"mongodb-cluster-cfg-2\", serviceName: \"mongodb-cluster\" } } }, logicalSessionTimeoutMinutes: 30, setName: \"cfg\", compatible: true, maxElectionIdSetVersion: { electionId: ObjectId('7fffffff0000000000000027'), setVersion: 137515 } }","previousTopologyDescription":"{ id: \"3da4bfe1-ae74-47ab-9598-a2b6aa66224a\", topologyType: \"ReplicaSetWithPrimary\", servers: { mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", topologyVersion: { processId: 
ObjectId('63ee2aac594f4fe7eab6c104'), counter: 5 }, roundTripTime: 997, lastWriteDate: new Date(1676554296000), opTime: { ts: Timestamp(1676554296, 4), t: 39 }, type: \"RSPrimary\", minWireVersion: 13, maxWireVersion: 13, me: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", setName: \"cfg\", setVersion: 137515, electionId: ObjectId('7fffffff0000000000000027'), primary: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", lastUpdateTime: new Date(1676554296828), logicalSessionTimeoutMinutes: 30, hosts: { 0: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 1: \"mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 2: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\" }, arbiters: {}, passives: {}, tags: { podName: \"mongodb-cluster-cfg-0\", serviceName: \"mongodb-cluster\" } }, mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", type: \"Unknown\", minWireVersion: 0, maxWireVersion: 0, lastUpdateTime: new Date(-9223372036854775808), hosts: {}, arbiters: {}, passives: {} }, mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017: { address: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", topologyVersion: { processId: ObjectId('63b71bb65ff176f83ec5c1b1'), counter: 6 }, roundTripTime: 992, lastWriteDate: new Date(1676552843000), opTime: { ts: Timestamp(1676552843, 2), t: 38 }, type: \"RSOther\", minWireVersion: 13, maxWireVersion: 13, me: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", setName: \"cfg\", setVersion: 137515, lastUpdateTime: new Date(1676554295898), logicalSessionTimeoutMinutes: 30, hosts: { 0: \"mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 1: 
\"mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\", 2: \"mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017\" }, arbiters: {}, passives: {}, tags: { podName: \"mongodb-cluster-cfg-2\", serviceName: \"mongodb-cluster\" } } }, logicalSessionTimeoutMinutes: 30, setName: \"cfg\", compatible: true, maxElectionIdSetVersion: { electionId: ObjectId('7fffffff0000000000000027'), setVersion: 137515 } }"}}

"s":"I",  "c":"SHARDING", "id":471693,  "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Updating the shard registry with confirmed replica set","attr":{"connectionString":"cfg/mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017,mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017,mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017"}}

"s":"I",  "c":"SHARDING", "id":22846,   "ctx":"UpdateReplicaSetOnConfigServer","msg":"Updating sharding state with confirmed replica set","attr":{"connectionString":"cfg/mongodb-cluster-cfg-0.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017,mongodb-cluster-cfg-1.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017,mongodb-cluster-cfg-2.mongodb-cluster-cfg.mongodb.svc.cluster.local:27017"}}

Do you have any idea what’s going on?

I didn’t attach the full log files because I wasn’t sure whether I was allowed to.
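The "RSM Topology Change" entries above are hard to read by eye, so here is a minimal Python sketch that pulls out each member's reported type from one such entry. It assumes the input is a complete structured JSON log line (the excerpts above are missing the opening brace and timestamp, so they would need to be taken from the original log file). The function name `summarize_topology` and the shortened host names in the sample are hypothetical and only for illustration.

```python
import json
import re

# Inside a topology description, each server block starts with
# `address: "<host:port>"` and later carries `type: "<member type>"`.
SERVER_TYPE = re.compile(r'address: "([^"]+)".*?type: "([^"]+)"')


def summarize_topology(log_line: str) -> dict:
    """Map host:port to member type for one "RSM Topology Change" log line.

    Returns an empty dict for any other kind of log entry.
    """
    entry = json.loads(log_line)
    if entry.get("msg") != "RSM Topology Change":
        return {}
    description = entry["attr"]["newTopologyDescription"]
    return dict(SERVER_TYPE.findall(description))


# Shortened, made-up sample in the same shape as the entries in this thread.
sample_description = (
    '{ id: "x", topologyType: "ReplicaSetNoPrimary", servers: { '
    'rs0-0:27017: { address: "rs0-0:27017", type: "Unknown" }, '
    'rs0-1:27017: { address: "rs0-1:27017", roundTripTime: 792, '
    'type: "RSSecondary" } } }'
)
sample_line = json.dumps({
    "s": "I",
    "c": "NETWORK",
    "id": 4333213,
    "ctx": "ReplicaSetMonitor-TaskExecutor",
    "msg": "RSM Topology Change",
    "attr": {"replicaSet": "rs0", "newTopologyDescription": sample_description},
})

print(summarize_topology(sample_line))
```

Run against the rs0 entry from mongo-cluster-cfg-0, a summary like this would make it easier to see that only rs0-1 is reported as a secondary while rs0-0 and rs0-2 are "Unknown", which matches the "ReplicaSetNoPrimary" topology type in that line.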

Hi @Rafael_Carvalho2, welcome to the community!

By “Percona” do you mean Percona Operator for MongoDB?

If yes, then I’m afraid Percona is a separate entity not affiliated with MongoDB, and thus we cannot tell you what went wrong.

From the logs you posted, it seems that you wanted to deploy a sharded cluster using TLS? For official MongoDB servers, you might find these links helpful:

Note that deploying a sharded cluster is considered to be an advanced MongoDB topic. A sharded cluster is a great deployment when you need horizontal scaling and more parallelization, but it requires careful planning and more advanced operational skills vs. a more basic replica set deployment.

Alternatively, you might want to check out MongoDB Atlas if you prefer to offload the operational concerns.

Best regards
Kevin
