We are seeing an issue while resynchronizing one of the failed nodes in our sharded cluster. The synchronization fails with the error message below every time, and we have not been able to figure out the exact cause. Initially we thought the issue was with the open file limit on the OS. The server is currently configured with the value 1048576, but we are still facing the same error.
Failed to commit collection indexes dbname.tblname: Location16814: error opening file “/shardserver/data/_tmp/extsort-index.218”: errno:24 Too many open files
2023-06-14T00:08:06.979+0000 E INITSYNC [replication-7] collection clone for ‘dbname.tblname’ failed due to Location16814: Error cloning collection ‘dbname.tblname’ :: caused by :: error opening file “/shardserver/data/_tmp/extsort-index.218”: errno:24 Too many open files
Note: The table is very large; its size is mentioned below.
Can anyone provide feedback on the issue we are facing?
Hi @ashwin_reddy1 and welcome to MongoDB community forums!!
Based on the details shared, could you help me with a few more details:
- Can you confirm whether the sharded cluster is deployed in a Kubernetes environment?
- For the limit mentioned in the above post, could you confirm using ulimit -a that the configuration has been set?
- MongoDB version 4.2 reached End of Life in April 2023, so no further updates will be made to it. Would you mind upgrading to the latest stable version, which includes bug fixes and new features, and confirming whether you still face the same issue?
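One thing that may help with the second point: ulimit -a reports the limits of the shell it is run in, which are not necessarily the limits of the running mongod process. On Linux, the limits of a live process can be read from /proc/<pid>/limits. Below is a minimal Python sketch of that check. The helper names are hypothetical, and the PID has to be looked up separately (for example with pgrep mongod).

```python
from pathlib import Path


def parse_max_open_files(limits_text):
    """Return (soft, hard) open-file limits from /proc/<pid>/limits text.

    Values are returned as strings because either may be 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: Max open files  <soft>  <hard>  files
            fields = line.split()
            return fields[3], fields[4]
    raise ValueError("no 'Max open files' row found")


def read_process_open_file_limits(pid):
    """Read the effective open-file limits of a running process (Linux only)."""
    return parse_max_open_files(Path(f"/proc/{pid}/limits").read_text())
```

Calling read_process_open_file_limits with the mongod PID returns the soft and hard limits that actually apply to that process. If they are lower than 1048576, the OS-level setting is not reaching mongod.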
Thanks for your response.
- There is no Kubernetes involved; the MongoDB sharded cluster runs on on-prem machines.
- Yes, the current ulimit -a value is 1048576.
- Yes, we will take it up soon. Before upgrading, we want to fix this issue and make sure that all 3 nodes in the cluster are up. Currently, only 2 nodes are active.
If you need any more information, please let me know.
Thank you for the information shared.
As per the response, it seems the value for the limit has been set. Just to make sure, and as also mentioned in the MongoDB documentation, could you confirm whether the system was started using systemctl, which uses the ulimit setting?
Please refer to the Linux ulimit documentation for further reference.
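For a service started through systemd, the open-file limit comes from the LimitNOFILE setting of the unit rather than from the shell's ulimit, so it is worth checking what systemd reports for the unit. Below is a minimal Python sketch that does this through systemctl show. The unit name mongod is an assumption (use whatever your service is actually called), and the helper names are hypothetical.

```python
import subprocess


def parse_show_output(text):
    """Parse the KEY=VALUE lines printed by 'systemctl show' into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that are not KEY=VALUE
            props[key.strip()] = value.strip()
    return props


def systemd_nofile_limit(unit="mongod"):
    """Return the LimitNOFILE value systemd reports for the unit.

    The default unit name 'mongod' is an assumption, not taken from the thread.
    """
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=LimitNOFILE"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_show_output(result.stdout).get("LimitNOFILE")
```

If the value returned here is much lower than the 1048576 configured at the OS level, the unit file (or a drop-in override) would need its LimitNOFILE raised, followed by a daemon-reload and a restart of the service.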
Yes, the service is configured with systemctl.
Do you have any update on the above issue?