Mongo primary stopped accepting connections - No election happened

  • We faced an issue in our production environment: one primary VM stopped accepting connections, and all production APIs failed for the duration.

  • Attempts to log in to the primary member manually failed, and manually starting MongoDB on the VM was also unsuccessful.

  • Because the affected VM was still reported as “primary” by rs.status(), no election took place among the available secondaries.

  1. Has anyone faced this issue?
  2. It would be a great help if you could suggest what the reason for this behavior might be.

We checked the mongos logs: connections were being accepted and no errors were logged until 07:11 UTC. The errors started abruptly at 07:12 UTC.

{"t":{"$date":"2023-03-29T07:11:58.639+00:00"},"s":"I",  "c":"CONNPOOL", "id":22576,   "ctx":"establishCursors cleanup","msg":"Connecting","attr":{"hostAndPort":"hostname:port"}}

{"t":{"$date":"2023-03-29T07:12:51.493+00:00"},"s":"I",  "c":"NETWORK",  "id":4712102, "ctx":"conn29626580","msg":"Host failed in replica set","attr":{"replicaSet":"rs1","host":"hostname:port","error":{"code":202,"codeName":"NetworkInterfaceExceededTimeLimit","errmsg":"Couldn't get a connection within the time limit of 8ms"},"action":{"dropConnections":false,"requestImmediateCheck":false,"outcome":{"host":"hostname:port","success":false,"errorMessage":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit of 8ms"}}}}
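For anyone scanning a larger mongos log for the moment the errors began, a minimal sketch along these lines may help. It assumes the structured JSON log format shown above (one JSON document per line). The helper names `summarizeLogLine` and `firstError` are made up for illustration, and the second sample line is trimmed from the one posted above.

```javascript
// Reduce one structured (JSON) log line to the fields of interest.
function summarizeLogLine(line) {
  const entry = JSON.parse(line);
  const error = entry.attr && entry.attr.error;
  return {
    time: entry.t.$date,
    severity: entry.s,
    component: entry.c,
    message: entry.msg,
    errorCode: error ? error.codeName : null,
  };
}

// Return the summary of the first line that carries attr.error, or null.
function firstError(lines) {
  for (const line of lines) {
    if (line.trim() === "") continue; // skip blank lines
    const summary = summarizeLogLine(line);
    if (summary.errorCode !== null) return summary;
  }
  return null;
}

// Sample input: the two lines from the post (the second one trimmed).
const sampleLines = [
  '{"t":{"$date":"2023-03-29T07:11:58.639+00:00"},"s":"I","c":"CONNPOOL","id":22576,"ctx":"establishCursors cleanup","msg":"Connecting","attr":{"hostAndPort":"hostname:port"}}',
  '{"t":{"$date":"2023-03-29T07:12:51.493+00:00"},"s":"I","c":"NETWORK","id":4712102,"ctx":"conn29626580","msg":"Host failed in replica set","attr":{"replicaSet":"rs1","host":"hostname:port","error":{"code":202,"codeName":"NetworkInterfaceExceededTimeLimit"}}}',
];

console.log(firstError(sampleLines));
```

In practice the lines would come from the log file itself, for example by splitting its contents on newlines.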

Additional details:
We are using MongoDB 4.4.7 Community Edition with the following configuration:
config replica set - 1 Primary, 2 Secondary
shard1 - 1 Primary, 2 Secondary
shard2 - 1 Primary, 2 Secondary
shard3 - 1 Primary, 2 Secondary
2 query routers

We checked the connection counts on our query routers during the incident:
QR1
"current" : 7654,
"available" : 43546,
"totalCreated" : 134309236,
"active" : 2890,
"exhaustIsMaster" : 487,
"exhaustHello" : 229,
"awaitingTopologyChanges" : 716

QR2
"current" : 7746,
"available" : 43454,
"totalCreated" : 134299931,
"active" : 2997,
"exhaustIsMaster" : 487,
"exhaustHello" : 229,
"awaitingTopologyChanges" : 716
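
To put these counters in perspective, here is a small sketch of the arithmetic. It assumes that "current" plus "available" equals the router's incoming connection limit, which is how these two fields are documented. The function name `connectionUtilization` is made up for illustration.

```javascript
// Fraction of the incoming connection limit that is in use.
// Assumption: limit = current + available.
function connectionUtilization(conn) {
  const limit = conn.current + conn.available;
  return { limit, usedFraction: conn.current / limit };
}

// Values copied from the QR1 and QR2 output above.
const qr1 = { current: 7654, available: 43546 };
const qr2 = { current: 7746, available: 43454 };

for (const [name, conn] of Object.entries({ qr1, qr2 })) {
  const { limit, usedFraction } = connectionUtilization(conn);
  console.log(`${name}: ${(usedFraction * 100).toFixed(1)}% of ${limit} in use`);
}
```

Both routers come out at roughly 15% of a 51200 limit, so incoming connections were not close to exhausted. These counters only cover connections coming into the router. The pools a router keeps open towards the shards are reported separately (by connPoolStats), so this does not rule out pressure on that side.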

Hi @Debalina_Saha and welcome to the MongoDB community forum!!

The error message you are seeing can have several causes, so there could be multiple resolutions for the same error.

Primarily, the issue could be resolved by increasing the connection pool size, as mentioned by a user on the community post.

Also, please upgrade to the latest available 4.4.x version (4.4.19) for the bug fixes and new features it introduces.

Further, the documentation on setting maxPoolSize in the connection string could also be a good starting point.
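
As a rough illustration of the connection string change, the sketch below appends or replaces the maxPoolSize option on an existing URI. maxPoolSize is a standard connection string option, but the helper `withMaxPoolSize`, the host names and the value 200 are placeholders chosen for the example, not recommendations for this deployment.

```javascript
// Return the URI with maxPoolSize set to `size`, replacing any existing value.
function withMaxPoolSize(uri, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const q = uri.indexOf("?");
  let base = q === -1 ? uri : uri.slice(0, q);
  // Keep all other options untouched; option names are case-insensitive.
  const options = (q === -1 ? "" : uri.slice(q + 1))
    .split("&")
    .filter((o) => o !== "" && !o.toLowerCase().startsWith("maxpoolsize="));
  options.push(`maxPoolSize=${size}`);
  // The format requires a "/" between the host list and the options.
  const afterScheme = base.slice(base.indexOf("://") + 3);
  if (!afterScheme.includes("/")) base += "/";
  return `${base}?${options.join("&")}`;
}

// Hypothetical URI pointing at two query routers.
const exampleUri =
  "mongodb://qr1.example.net:27017,qr2.example.net:27017/app?retryWrites=true";
console.log(withMaxPoolSize(exampleUri, 200));
```

Note that this option sizes the driver's pool towards the query routers. Whether it is the right knob for the error above depends on which pool is actually running out.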

Also, please share the complete output of the following from all shards, including the config server replica set:

  1. sh.status()
  2. rs.status()
  3. rs.conf()
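
When comparing the rs.status() output from different members, a short summary per member can make it easier to spot a node that still reports itself as primary. The sketch below is only an illustration: `summarizeMembers` is a made-up helper, and the sample object is hypothetical, using the documented `set`, `members[].name`, `members[].stateStr` and `members[].health` fields.

```javascript
// Condense an rs.status()-shaped document to name, state and health.
function summarizeMembers(status) {
  const members = status.members.map((m) => ({
    name: m.name,
    state: m.stateStr,
    healthy: m.health === 1,
  }));
  const primaries = members
    .filter((m) => m.state === "PRIMARY")
    .map((m) => m.name);
  return { set: status.set, members, primaries };
}

// Hypothetical sample; real input would be the output of rs.status().
const sampleStatus = {
  set: "rs1",
  members: [
    { name: "host1:27018", stateStr: "PRIMARY", health: 1 },
    { name: "host2:27018", stateStr: "SECONDARY", health: 1 },
    { name: "host3:27018", stateStr: "SECONDARY", health: 1 },
  ],
};

console.log(summarizeMembers(sampleStatus));
```

In the shell this could be called as summarizeMembers(rs.status()) on each member in turn.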

Let us know if you have any further questions.

Regards
Aasawari

Hi @Aasawari ,
Thank you for your suggestions and response.
I will check and analyse the connection pool size further.

We are planning a version upgrade to MongoDB 5.x. Kindly confirm the stable release for 5.x.

I have attached the output of sh.status(), rs.conf() and rs.status() from all shards (rs1, rs2, rs3) and the config replica set. Kindly check and suggest.
reqd_details.txt (42.0 KB)