I have deployed a MongoDB replica set using Helm charts on an RKE2 cluster (RHEL 8 VMs) with the “managed-nfs-storage” storage class. The first MongoDB replica came up and is correctly set as PRIMARY, but the second replica keeps crashing with the fsync error below.
`{"t":{"$date":"2023-02-17T18:35:18.407+00:00"},"s":"E", "c":"STORAGE", "id":20557, "ctx":"initandlisten","msg":"DBException in initAndListen, terminating","attr":{"error":"FileStreamFailed: Unable to write process id 1\n to file (fsync failed): /bitnami/mongodb/data/db/mongod.lock Input/output error"}}`
Mongo pods under the database namespace:
Here are some findings on this issue:
- The pods crash only when the “managed-nfs-storage” storage class is used; they come up successfully with the “longhorn” storage class.
- The same MongoDB replica set comes up successfully with the same “managed-nfs-storage” storage class on another RKE2 cluster (RHEL 8 VMs).
- There is enough disk space available on the “nfs-server” VM, and the ownership of the mounted files matches that of the working cluster.
- The database mount files are owned by user ID “1001” and group root.
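Since the error is an fsync failure on `mongod.lock`, one way to check whether the problem lies in the NFS mount itself rather than in MongoDB could be to repeat the same write-then-fsync sequence outside of mongod. Below is a minimal Python sketch, assuming Python is available on a pod or node that mounts the same volume. The function name `check_fsync` and the probe file name are hypothetical, and this has not been run against the affected cluster.

```python
import os


def check_fsync(directory):
    """Write a small probe file in `directory` and fsync it.

    Returns None on success, or the OSError that was raised.
    The probe file name is hypothetical; any unused name would do.
    """
    path = os.path.join(directory, "fsync_probe.tmp")
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            os.write(fd, b"1\n")  # same kind of tiny write as a PID in a lock file
            os.fsync(fd)          # the call that fails in the MongoDB log
        finally:
            os.close(fd)
        return None
    except OSError as exc:
        return exc
    finally:
        # Best-effort cleanup; ignore errors if the file was never created.
        try:
            os.remove(path)
        except OSError:
            pass
```

Calling `check_fsync("/bitnami/mongodb/data/db")` as UID 1001 on the failing mount and on the working cluster would show whether a plain fsync also fails there. A returned error whose `errno` equals `errno.EIO` would correspond to the “Input/output error” in the log and would point at the NFS export or mount rather than at MongoDB.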
I need some input on debugging the root cause of the crashing secondary MongoDB replica. Any suggestions would be appreciated.