Is anyone using MongoDB 3.6.17 in a CentOS 8 environment in their projects without facing any issues?
Some time back we upgraded our product from CentOS 7 to CentOS 8, as part of which we had to upgrade MongoDB from 3.6.9 to 3.6.17.
Since then we have been experiencing frequent issues: after 1-2 days of system run time, the MongoDB secondary members start to lag far behind the primary. To recover a member we have to manually stop the service, delete the db path, and then restart it. So far we have found that this affects only secondary members.
As mentioned, we currently use MongoDB 3.6.17 in our environment, and some of our operations involve opening and closing connections. We have already verified that the connections are closed on the MongoDB side, but the system keeps holding on to them.
This leads to a high number of files being opened by MongoDB, which we can see in the lsof output. Because of this, our server crashes and MongoDB goes into the recovery state. Kindly check and let us know why this is happening in our environment.
I would like to attach an sosreport from the affected systems for your reference, but the tool does not allow this file type to be uploaded.
Also, when the issue is hit, the following errors are found:
2021-02-24T22:50:04.692+0000 I - [listener] pthread_create failed: Resource temporarily unavailable
2021-02-24T22:50:04.692+0000 W EXECUTOR [conn480782] Terminating session due to error: InternalError: failed to create service entry worker thread
2021-02-24T22:50:05.589+0000 I - [listener] pthread_create failed: Resource temporarily unavailable
2021-02-24T22:50:05.589+0000 W EXECUTOR [conn480783] Terminating session due to error: InternalError: failed to create service entry worker thread
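In case it helps with diagnosis, here is a minimal sketch (Python, Linux only) of how the open file descriptor and thread counts of the mongod process could be compared against its limits while the problem builds up. As far as we understand, pthread_create failing with "Resource temporarily unavailable" (EAGAIN) usually points to a thread/process limit or a lack of resources, so these are the numbers we would watch. The function names (parse_limits, usage_ratio, snapshot) are our own and purely illustrative; nothing here is part of MongoDB. Note that "Max processes" is a per-user limit, so it is only a rough indicator for a single process.

```python
import os
import re


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def usage_ratio(current, soft_limit):
    """Return current / soft_limit, or None when the limit is 'unlimited'."""
    if soft_limit == "unlimited":
        return None
    return current / int(soft_limit)


def snapshot(pid):
    """Count open file descriptors and threads of a PID and read its limits.

    Assumes a Linux /proc filesystem and permission to read /proc/<pid>/fd
    (same user as the process, or root).
    """
    base = f"/proc/{pid}"
    open_fds = len(os.listdir(f"{base}/fd"))    # one entry per open descriptor
    threads = len(os.listdir(f"{base}/task"))   # one entry per thread
    with open(f"{base}/limits") as fh:
        limits = parse_limits(fh.read())
    return {
        "open_fds": open_fds,
        "threads": threads,
        "max_open_files": limits.get("Max open files"),
        "max_processes": limits.get("Max processes"),
    }
```

Calling snapshot() with the mongod PID every few minutes and logging the result should show whether the descriptor or thread count climbs steadily towards a limit before the pthread_create errors appear.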
We recover the MongoDB replica set manually, but after some time it hits the same issue again and goes back into recovery mode.
One more observation: we see this issue only on CentOS 8. When we use the same MongoDB version 3.6.17 on CentOS 7, no such issues are reported.
Please help with your inputs/thoughts, as we have been stuck on this issue for a while.