MongoDB cluster: one node failing - "Heartbeat failed after max retries"

I'm new to MongoDB. In my current project we have set up a MongoDB cluster (1 master and 2 slaves) for a document management service in the development environment (in progress).

The setup already existed and was running correctly. There is a REST API endpoint that uploads documents to MongoDB.

Because we need to do some performance testing with MongoDB, I started a load test with JMeter/Groovy and created a test plan that inserts data directly into MongoDB, simulating 5 users.

This worked fine, and I was able to run the test for a couple of hours without any problem.

Later I ran the test for more than 5 hours. When I tried to stop the load test, it seemed to keep running: I checked the mongod nodes and data was still being inserted into the database. Since this continued for some time, I stopped each mongod service (mongod01, mongod02, mongod03) and then tried to start the services again.

Now, of mongod01 and mongod02, only one will start at a time, while mongod03 works fine and is currently the PRIMARY.

The error is:

"ctx":"ReplCoord-5","msg":"Heartbeat failed after max retries","attr":{"target":":27018","maxHeartbeatRetries":2,"error":{"code":6,"codeName":"HostUnreachable","errmsg":"Error connecting to :27018 (10.103.58.45:27018) :: caused by :: Connection refused"}}}

I would really appreciate any comment or idea, thank you.

My setup:

Node mongodb01

storage:
dbPath: /u01/mongo/cluster/mongo01

# network interfaces
replication:
  replSetName: "rs0"
net:
  port: 27017
  bindIp: 0.0.0.0  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.

Node mongodb02

storage:
dbPath: /u01/mongo/cluster/mongo02


# network interfaces
replication:
   replSetName: "rs0"
net:
  port: 27018
  bindIp: 0.0.0.0 

Node mongodb03

dbPath: /u01/mongo/cluster/mongo03

storage:
# network interfaces
replication:
   replSetName: "rs0"
net:
  port: 27019
  bindIp: 0.0.0.0  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.

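Hi, the "HostUnreachable" error with "Connection refused" means the other members could not open a TCP connection to port 27018. Either nothing is listening on that port (which would fit with one of your mongod services failing to start), or the network or a firewall rule between your nodes is rejecting the connection. Besides that, the dbPath entries in your configuration files look wrong to me: dbPath should be indented under storage:, and in the mongodb03 file it even appears above the storage: key.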
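
To narrow down which of the two it is, you could test each member's port from one of the other nodes. Below is a minimal sketch using only the Python standard library. The function names port_is_open and check_members are made up for this example, and I'm assuming all three members are reachable on the IP shown in your error message (10.103.58.45); adjust the host if they run on different machines.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "Connection refused", timeouts, and unreachable hosts.
        return False


def check_members(host: str, ports, timeout: float = 3.0) -> dict:
    """Map each port to True (accepting connections) or False (not reachable)."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Example call (assumed host, ports taken from the three config files):
#   check_members("10.103.58.45", [27017, 27018, 27019])
```

If a port comes back False even when you run the check on the machine that hosts that member, the mongod process itself is most likely not running, so its own log file is the next place to look. If it only fails from the other nodes, the network or firewall is the more likely suspect.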
