It looks to me as if the mongod processes in the StatefulSet pods are not being shut down properly.
During startup I can see the error “Detected unclean shutdown”. Since there are a lot of collections/files to scan, startup takes more than an hour.
I was wondering whether the shutdown can be performed gracefully (or rather, why is this not built in?)
I am talking about Kubernetes with the operator, where you don’t have explicit control of mongod.
A pod with a container running mongod can be shut down at any time (for example, to be rescheduled onto another node). Not sure if you are aware that the operator manages at least 3 pods in a replica set.
=> So your proposal does not make sense at all.
The shutdown needs to be part of the pod’s lifecycle (and then, as you suggest, be a graceful shutdown).
It is not possible to run a server without some control of your own (or at least of some admin user). Each server is an image, after all: you can either pass it customized parameters or build a new customized image if something is missing from it, for example the mongo shell.
And the command to use in your preStop hook can be the one I wrote above.
Your problem could also be the default grace period of 30 seconds. If the servers take longer than that to stop, increase it, for example with terminationGracePeriodSeconds: 60. In fact, why don’t you give this a shot first, before diving into the possibly more complicated waters of troubleshooting?
PS: Shutdowns in k8s are not instant. By default a container is given up to 30 seconds before it is stopped completely.
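To make the two suggestions above concrete, here is a rough sketch of what the pod template could look like. It is only a fragment under assumptions: the container name, the 600 second value and the preStop command are illustrative, the command assumes mongosh is present in the image and that no credentials are needed on localhost, and how (or whether) your operator lets you override the pod template depends on the operator and its version.

```yaml
# Fragment of a pod template (spec.template.spec of a StatefulSet).
# All values are illustrative assumptions, not operator defaults.
spec:
  # Time Kubernetes waits between starting termination and sending KILL.
  # The default is 30; 600 is an arbitrary example for a slow shutdown.
  terminationGracePeriodSeconds: 600
  containers:
    - name: mongod   # hypothetical container name, check your own pods
      lifecycle:
        preStop:
          exec:
            # Assumes mongosh exists in the image and localhost needs no auth.
            command:
              - mongosh
              - --quiet
              - --eval
              - "db.getSiblingDB('admin').shutdownServer()"
```

Keep in mind that the preStop hook runs inside the same grace period, and that mongod already shuts down cleanly when it receives TERM. So the hook is the optional part here; the grace period is what decides whether mongod gets enough time to finish.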
If it works for 10k but not for 100k, I would tend to assume that the termination logic is implemented correctly.
Note that 100k collections means at least 200k files, plus one more file per index per collection. It is quite possible that the problem comes from something simply taking too long to complete. The following makes a lot of sense:
Once the grace period has expired, the KILL signal is sent to any remaining processes
Receiving the KILL signal will generate the “Detected unclean shutdown” error on the next startup, and the KILL signal is sent to processes that are still running after terminationGracePeriodSeconds, as might be the case when trying to flush and close 200k files.
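As a back-of-the-envelope check of the numbers above: assume one file per collection plus one for its _id index, let k be the number of additional indexes per collection (k = 0 as the lower bound), and assume the whole default grace period T = 30 s is available for closing files. These symbols are only mine, for illustration.

```latex
\begin{align*}
F &= N\,(2 + k) \;\ge\; 2N = 2 \times 100\,000 = 200\,000 \ \text{files} \\
r &= \frac{2N}{T} = \frac{200\,000}{30\ \text{s}} \approx 6\,667 \ \text{files/s}
\end{align*}
```

So with the default grace period, mongod would have to flush and close several thousand files per second to finish before KILL arrives. With 10k collections the required rate is ten times lower, which would fit with it working for 10k but not for 100k.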