MongoDB command failed with error 91 (ShutdownInProgress)

We have a MongoDB cluster, a 3-node replica set, which we connect to from a Spring Boot app. The cluster has a daily maintenance window of 20-30 seconds during which one of the nodes goes down; it comes back up automatically after that green zone.

We are using the MongoDB Java sync driver (via Spring Data MongoDB) version 4.6.1, and the MongoDB cluster is on version 5.0.14.

In the logs, I can see it continuously logs the error “Exception in monitor thread - Command failed with error 91 - The server is in Quiesce mode and will shutdown” around 1000 times during this period (~30 seconds). After that it automatically reconnects, with the log “Monitor thread successfully connected”.

My questions are:

  1. How can we avoid logging so many redundant exceptions from our app? It makes it difficult to monitor the useful logs.
  2. Why does the app try to reconnect so frequently? Is there any way to increase the interval between retry attempts so that the logging improves?
  3. The error says “Command failed”. What exactly is the command it tries to execute so frequently (~30/sec) during the green zone? It is not coming from our app, as we don’t have any built-in health-check monitor.
  4. Do these failures count as login failure attempts?
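On the logging question: the monitor-thread messages come from the driver’s own loggers, which all live under the `org.mongodb.driver` prefix (the cluster-monitoring ones under `org.mongodb.driver.cluster`). One common approach is to raise the log level for that logger so heartbeat noise during the maintenance window is suppressed while application-level errors remain visible. A minimal sketch for a Spring Boot `application.properties`, assuming the messages are emitted at or below WARN (check the actual level in your logs before choosing the threshold):

```properties
# Quiet the driver's server-monitoring logger during planned maintenance.
# "Exception in monitor thread" messages are logged under this category;
# ERROR keeps genuinely unexpected failures visible.
logging.level.org.mongodb.driver.cluster=ERROR
```

The trade-off is that legitimate cluster-topology warnings from the same logger are also suppressed, so some teams prefer a log-aggregation-side dedupe rule instead.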

Is the log from your app or from a MongoDB node?

Is the node shut down, or is only the network down for this node (with MongoDB still running)?

MongoDB drivers and cluster nodes all maintain health-check connections to the other nodes (so that they know which nodes are down or up); I’m not sure whether the messages you see come from those checks.


We are connecting through Spring Data MongoDB, so we are checking the logs in our Spring Boot app, which connects to the 3-node replica-set cluster. We have written a configuration class to connect to the cluster, like below:

MongoClient mongoClient = MongoClients.create(MongoClientSettings.builder()
        .applyToClusterSettings(builder -> builder.hosts(Arrays.asList(
                new ServerAddress(maasReplicaSetHost1, port),
                new ServerAddress(maasReplicaSetHost2, port),
                new ServerAddress(maasReplicaSetHost3, port))))
        .build());

This is planned downtime that happens every day at the same time, and whenever it happens we see this error. We are not sure whether the network is down.

In that case, I’m guessing it’s the cluster-monitoring thread (health check) in the driver code. You can check the driver documentation to see whether there is any configuration to increase the check interval.
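The driver does expose these knobs. Server monitoring is configured through `ServerSettings`: `heartbeatFrequency` (default 10 seconds) controls the normal check interval, and `minHeartbeatFrequency` (default 500 ms) controls how soon the monitor retries after a *failed* heartbeat, which is why a ~30-second quiesce window can produce a flood of error logs. A minimal sketch, assuming the class and hostnames are placeholders and that slowing failed-heartbeat retries (at the cost of slower failure detection) is acceptable:

```java
import java.util.Arrays;
import java.util.concurrent.TimeUnit;

import com.mongodb.MongoClientSettings;
import com.mongodb.ServerAddress;

public class MongoHeartbeatConfig {

    // Build client settings whose monitor thread retries failed heartbeats
    // less aggressively, reducing log noise during planned maintenance.
    public static MongoClientSettings slowHeartbeatSettings(
            String host1, String host2, String host3, int port) {
        return MongoClientSettings.builder()
                .applyToClusterSettings(builder -> builder.hosts(Arrays.asList(
                        new ServerAddress(host1, port),
                        new ServerAddress(host2, port),
                        new ServerAddress(host3, port))))
                .applyToServerSettings(builder -> builder
                        // Normal interval between successful heartbeats.
                        .heartbeatFrequency(10, TimeUnit.SECONDS)
                        // Minimum wait before retrying after a FAILED heartbeat
                        // (default 500 ms); raising it throttles the error flood.
                        .minHeartbeatFrequency(5, TimeUnit.SECONDS))
                .build();
    }
}
```

Note the trade-off: a larger `minHeartbeatFrequency` means the driver also takes longer to notice that the node has come back up. As for question 3, the command the monitor runs is the server handshake/heartbeat (`hello`, formerly `isMaster`), not anything from your application, and it is not an authentication attempt.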