Operator: Detected unclean shutdown - mongod never seems to shut down gracefully

John_Moser1 · November 12, 2022, 5:36pm

Hi there

It looks to me like the mongod processes in the StatefulSet pods are not shut down properly.

During startup I see the error “Detected unclean shutdown”. Since there are a lot of collections/files to be scanned, startup takes more than 1 hour.

I was wondering whether the shutdown can be performed gracefully, for example like this (or rather, why is this not built in?):

        lifecycle:
          preStop:
            exec:
              command:
              - /bin/bash
              - -c
              - <gracefully shutdown mongodb>

Regards
John

Yilmaz_Durmaz · November 12, 2022, 11:50pm

How are you shutting down the servers?

A mongod instance is shut down with a database command run against the admin database by a user with admin rights.

check this link: shutdown — MongoDB Manual

You can also issue the database command from the terminal with the mongo shell. Here is an example:

mongo --eval "db.getSiblingDB('admin').shutdownServer()"

John_Moser1 · November 13, 2022, 12:12pm

I am talking about Kubernetes and the operator, where you don’t have explicit control of mongod.

A pod with a container running mongod can be shut down at any time (for example, to be rescheduled onto another node). Not sure if you are aware that the operator manages at least 3 pods in a replica set.

=> so your proposal does not make sense at all.

The shutdown needs to be part of the pod’s lifecycle (and then, as you suggest, it should be a graceful shutdown).

Yilmaz_Durmaz · November 13, 2022, 12:49pm

It is not possible to run a server without some control of your own (or at least of some admin user). Each server is, after all, an image that you can either start with customized parameters or replace with a customized image if something is missing from it, for example the mongo shell.

And the command to use in your preStop hook can be the one I wrote above.

Your problem may also be caused by the default grace period of 30 seconds. If the servers take longer than that to shut down, increase it, for example with terminationGracePeriodSeconds: 60. In fact, why don’t you give this a shot first, before diving into the possibly more complicated waters of troubleshooting?
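
To make the two suggestions above concrete, here is a minimal sketch of a pod template fragment that combines the preStop hook with a longer grace period. The container name and the 600-second value are assumptions for illustration only. How the operator lets you override the pod template is not covered here. The command is the one from earlier in this thread; it assumes the image ships the legacy mongo shell (newer MongoDB images provide mongosh instead) and that no extra credentials are needed to run it.

```yaml
# Hypothetical pod template fragment; names and values are assumptions.
spec:
  # Assumed value: size it to how long a clean shutdown actually takes.
  # The time spent in the preStop hook counts against this period.
  terminationGracePeriodSeconds: 600
  containers:
    - name: mongod   # assumed container name
      lifecycle:
        preStop:
          exec:
            command:
              - /bin/bash
              - -c
              # Ask mongod to shut down cleanly before the pod is stopped.
              - mongo --eval "db.getSiblingDB('admin').shutdownServer()"
```

If the hook or mongod itself is still running when the grace period expires, the container is killed anyway, so the grace period has to cover the whole shutdown.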

PS: Kubernetes does not stop pods instantly. By default it waits up to 30 seconds for the containers to exit before stopping them completely.

John_Moser1 · November 13, 2022, 2:45pm

Again … I am not asking for possibilities. I assume this is a well known issue and the solution should be well known.

PS: we are running an app with 1000 cores on GKE … I think I should know the basics.

steevej · November 13, 2022, 10:59pm

If it works for 10k but not for 100k, I would tend to assume that the termination logic is implemented correctly.

Note that 100k collections means at least 200k files, plus one more file for each index on each collection. It is quite possible that the problem is related to something taking too much time. The following makes a lot of sense

so is the proposed idea:

From k8s’ documentation:

Once the grace period has expired, the KILL signal is sent to any remaining processes

Receiving the KILL signal will generate the “Detected unclean shutdown” error,

and the KILL signal is sent to processes that are still running after terminationGracePeriodSeconds, as might be the case when trying to flush and close 200k files.