We are having problems with a 3-node replica set cluster. Two or three times a week, one of the secondary nodes loses its connection to the primary node and starts reporting the error "mongodb_target_down host: port mongodb error", and nothing we try fixes it. We have to stop the process, delete the data, and start it again so that it resyncs from scratch. Meanwhile, the servers can still reach each other, there are no network problems, and our monitoring (Prometheus), which tracks many metrics, shows nothing wrong. Can anyone help with this issue? We run MongoDB 4.4 in a Docker container.
Welcome to the community, @Anton_Dvornikov! Glad to have you here!
Recently, we were in a similar situation where Docker failed to mount the data volume and lost some files after a restart. That’s our working theory for now, until we dig deeper and find the root cause. Anyway, we downgraded Docker from the latest version to 19.03.9 on CentOS 7, and we haven’t seen the problem since. It’s something to try if any of this applies to you. I’d be curious to hear the final resolution, though.
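While you test that, it may also help to record what the replica set itself reports at the moment the secondary drops out. Here is a minimal Python sketch, not something taken from our setup: it assumes you already have the document returned by the replSetGetStatus admin command (for example via pymongo's client.admin.command("replSetGetStatus")). The helper name unhealthy_members and the sample host names are made up for illustration.

```python
# Sketch: list replica set members that look unhealthy in a
# replSetGetStatus result. The status document would normally come from
#   pymongo.MongoClient(uri).admin.command("replSetGetStatus")
# (assumption: pymongo is installed and the URI points at a member).

OK_STATES = {"PRIMARY", "SECONDARY", "ARBITER"}


def unhealthy_members(status):
    """Return (name, stateStr, lastHeartbeatMessage) for each member
    whose health is not 1 or whose state is not a normal steady state."""
    problems = []
    for member in status.get("members", []):
        healthy = member.get("health") == 1
        steady = member.get("stateStr") in OK_STATES
        if not (healthy and steady):
            problems.append(
                (
                    member.get("name"),
                    member.get("stateStr"),
                    member.get("lastHeartbeatMessage", ""),
                )
            )
    return problems


if __name__ == "__main__":
    # Hypothetical sample data, shaped like a replSetGetStatus reply.
    sample = {
        "set": "rs0",
        "members": [
            {"name": "mongo1:27017", "health": 1, "stateStr": "PRIMARY"},
            {"name": "mongo2:27017", "health": 1, "stateStr": "SECONDARY"},
            {
                "name": "mongo3:27017",
                "health": 0,
                "stateStr": "(not reachable/healthy)",
                "lastHeartbeatMessage": "example heartbeat error text",
            },
        ],
    }
    for name, state, message in unhealthy_members(sample):
        print(name, state, message)
```

Running something like this on a schedule and logging the result next to the container's restart times might show whether the secondary goes unhealthy right after a Docker restart, which would fit the volume-mount theory.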
Thank you for your answer. We will try the downgrade and see what happens; I hope it helps. Have a nice day!