MongoDB command failed with error 91 (ShutdownInProgress)

We have a MongoDB cluster, a 3-node replica set, which we connect to from a Spring Boot app. The cluster has a daily maintenance window of 20-30 seconds during which one of the nodes goes down; it comes back up automatically after that green zone.

We are using the MongoDB Java sync driver (via Spring Data MongoDB) version 4.6.1, and the MongoDB cluster is on version 5.0.14.

In the logs, I can see it continuously logs the error “Exception in monitor thread - Command failed with error 91 - The server is in Quiesce mode and will shutdown” around 1000 times during this period (~30 seconds). After that it automatically reconnects, with the log “Monitor thread successfully connected”.

My questions are:

  1. How can we avoid logging so many redundant exceptions from our app? It makes it difficult to monitor the useful logs.
  2. Why does the app try to reconnect so frequently? Is there any way to increase the interval between retry attempts so that the logging improves?
  3. The error says “Command failed”. What exactly is the command it tries to execute so frequently (~30/sec) during the green zone? It is not coming from our app, as we don’t have any built-in health-check monitor.
  4. Do these failures count as login failure attempts?
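On the logging question: the monitor-thread messages come from the driver’s own loggers, which all live under the `org.mongodb.driver` prefix (the cluster-monitoring ones under `org.mongodb.driver.cluster`). One common approach is to raise the log level for that logger so heartbeat noise during the maintenance window is suppressed while application-level errors remain visible. A minimal sketch for a Spring Boot `application.properties`, assuming the messages are emitted at or below WARN (check the actual level in your logs before choosing the threshold):

```properties
# Quiet the driver's server-monitoring logger during planned maintenance.
# "Exception in monitor thread" messages are logged under this category;
# ERROR keeps genuinely unexpected failures visible.
logging.level.org.mongodb.driver.cluster=ERROR
```

The trade-off is that legitimate cluster-topology warnings from the same logger are also suppressed, so some teams prefer a log-aggregation-side dedupe rule instead.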

Is the log from your app or from a MongoDB node?

Is the node shut down, or is only the network down for this node (with MongoDB still running)?

MongoDB drivers and cluster nodes all maintain health-check connections to the other nodes (so that they know which nodes are down or up); I’m not sure whether the messages you see come from those checks.


We are connecting through Spring Data MongoDB, so we are checking the logs in our Spring Boot app, which connects to the 3-node replica-set cluster. We have written a configuration class to connect to the cluster, like below:

MongoClient mongoClient = MongoClients.create(MongoClientSettings.builder()
        .applyToClusterSettings(builder -> builder.hosts(Arrays.asList(
                new ServerAddress(maasReplicaSetHost1, port),
                new ServerAddress(maasReplicaSetHost2, port),
                new ServerAddress(maasReplicaSetHost3, port))))
        .build());

This is planned downtime that happens every day at the same time, and whenever it happens we see this error. We are not sure whether the network is down.

In that case, I’m guessing it’s the cluster-monitoring thread (health check) in the driver code. You can check the driver documentation to see whether there is any configuration to increase the check interval.
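The driver does expose these knobs. Server monitoring is configured through `ServerSettings`: `heartbeatFrequency` (default 10 seconds) controls the normal check interval, and `minHeartbeatFrequency` (default 500 ms) controls how soon the monitor retries after a *failed* heartbeat, which is why a ~30-second quiesce window can produce a flood of error logs. A minimal sketch, assuming the class and hostnames are placeholders and that slowing failed-heartbeat retries (at the cost of slower failure detection) is acceptable:

```java
import java.util.Arrays;
import java.util.concurrent.TimeUnit;

import com.mongodb.MongoClientSettings;
import com.mongodb.ServerAddress;

public class MongoHeartbeatConfig {

    // Build client settings whose monitor thread retries failed heartbeats
    // less aggressively, reducing log noise during planned maintenance.
    public static MongoClientSettings slowHeartbeatSettings(
            String host1, String host2, String host3, int port) {
        return MongoClientSettings.builder()
                .applyToClusterSettings(builder -> builder.hosts(Arrays.asList(
                        new ServerAddress(host1, port),
                        new ServerAddress(host2, port),
                        new ServerAddress(host3, port))))
                .applyToServerSettings(builder -> builder
                        // Normal interval between successful heartbeats.
                        .heartbeatFrequency(10, TimeUnit.SECONDS)
                        // Minimum wait before retrying after a FAILED heartbeat
                        // (default 500 ms); raising it throttles the error flood.
                        .minHeartbeatFrequency(5, TimeUnit.SECONDS))
                .build();
    }
}
```

Note the trade-off: a larger `minHeartbeatFrequency` means the driver also takes longer to notice that the node has come back up. As for question 3, the command the monitor runs is the server handshake/heartbeat (`hello`, formerly `isMaster`), not anything from your application, and it is not an authentication attempt.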