Hi there,
I currently have an M10 cluster running v4.4.18 with a 20GB database residing on it. Atlas takes a daily backup snapshot and keeps it for 7 days. Recently I have been trying to download snapshots to my local machine to do some testing, only to find that all the snapshots are corrupt. When I extract the folder from the download and start mongod with it as the data directory, I get errors and then the process terminates (I can’t upload the logs as I’m a new user of the community).
The common error I get from all these snapshots is:
{"t":{"$date":"2023-01-26T13:01:13.273+00:00"},"s":"E", "c":"STORAGE", "id":22435, "ctx":"initandlisten","msg":"WiredTiger error","attr":{"error":-31802,"message":"[1674738073:273058][9008:140719278152144], file:sizeStorer.wt, WT_SESSION.open_cursor: int __cdecl __win_file_read(struct __wt_file_handle *,struct __wt_session *,__int64,unsigned __int64,void *), 288: C:/databases/productionv2\sizeStorer.wt: handle-read: ReadFile: failed to read 4096 bytes at offset 24576: Reached the end of the file.\r\n: WT_ERROR: non-specific WiredTiger error"}}
If it’s of any relevance, the sizeStorer.wt file is exactly 4096 bytes, so the failed read at offset 24576 is well past the end of the file.
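The mismatch between the file size and the read offset in the error can be checked with a few lines of Python. This is only a sketch: the helper name read_would_fit is hypothetical, and the numbers are taken from the log line above and the observed size of sizeStorer.wt.

```python
import os


def read_would_fit(file_size: int, offset: int, length: int) -> bool:
    """Return True if reading `length` bytes at `offset` stays inside a file of `file_size` bytes."""
    return offset + length <= file_size


# Numbers from the error message and the observed file size (all in bytes).
observed_size = 4096   # size of sizeStorer.wt on disk
read_offset = 24576    # "at offset 24576" in the log
read_length = 4096     # "failed to read 4096 bytes" in the log

print(read_would_fit(observed_size, read_offset, read_length))

# To check the real file instead of the hard-coded size (path is an example):
# size = os.path.getsize(r"C:/databases/productionv2/sizeStorer.wt")
# print(read_would_fit(size, read_offset, read_length))
```

The read would need the file to be at least 28672 bytes long, so a 4096-byte file looks truncated or incomplete relative to what WiredTiger expects.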
Right now I have zero faith that any snapshots are of actual use if I ever need to restore to my cluster. With nearly 400,000 users and associated data in the database this is of real concern.
Can anybody please advise on what might be going on and suggest possible solutions? This undermines the exact reason why we’re currently paying for Atlas.
Thanks,
Paul