Error on Container running MongoDB

We have a mongodb 3.4.0 pod running in a kubernetes cluster, we are trying to upgrade however this is blocking us. This pod runs for about 7 minutes before we see the following error:

2022-10-28T17:49:26.597+0000 E STORAGE  [thread2] WiredTiger error (5) [1666979366:596967][1:0x7f187d54c700], log-server: /datastore/db/journal: directory-list: opendir: Input/output error
2022-10-28T17:49:26.597+0000 E STORAGE  [thread2] WiredTiger error (5) [1666979366:597060][1:0x7f187d54c700], log-server: log pre-alloc server error: Input/output error
2022-10-28T17:49:26.597+0000 E STORAGE  [thread2] WiredTiger error (5) [1666979366:597072][1:0x7f187d54c700], log-server: log server error: Input/output error
2022-10-28T17:49:31.001+0000 W FTDC     [ftdc] Uncaught exception in 'FileNotOpen: Failed to open interim file /datastore/db/diagnostic.data/metrics.interim.temp' in full-time diagnostic data capture subsystem. Shutting down the full-time diagnostic data capture subsystem.

2022-10-28T17:51:53.882+0000 E STORAGE  [WTJournalFlusher] WiredTiger error (5) [1666979513:882390][1:0x7f187ad48700], WT_SESSION.log_flush: /datastore/db/journal/WiredTigerLog.0000000671: handle-write: pwrite: failed to write 768 bytes at offset 11904: Input/output error
2022-10-28T17:51:53.882+0000 I -        [WTJournalFlusher] Invariant failure: s->log_flush(s, "sync=on") resulted in status UnknownError: 5: Input/output error at src/mongo/db/storage/wiredtiger/wiredtiger_session_cache.cpp 229
2022-10-28T17:51:53.882+0000 I -        [WTJournalFlusher] 

***aborting after invariant() failure

This is the file it is referencing: mongo/wiredtiger_session_cache.cpp at r3.4.0 · mongodb/mongo · GitHub

We have ran a repair against MongoDB using the following command:

kubectl exec -it mongodb-0 – mongod --dbpath /data/db --port 27020 --bind_ip 0.0.0.0 --repair

Here is the output of that command:

2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten] MongoDB starting : pid=41 port=27020 dbpath=/data/db 64-bit host=mongodb-0
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten] db version v3.4.0
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten] git version: f4240c60f005be757399042dc12f6addbc3170c1
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.1t  3 May 2016
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten] modules: none
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten] build environment:
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten]     distmod: debian81
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten]     distarch: x86_64
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2022-10-28T18:47:30.513+0000 I CONTROL  [initandlisten] options: { net: { bindIp: "0.0.0.0", port: 27020 }, repair: true, storage: { dbPath: "/data/db" } }
2022-10-28T18:47:30.517+0000 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1425M,session_max=20000,eviction=(threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),,log=(enabled=false),
2022-10-28T18:47:30.630+0000 I CONTROL  [initandlisten] 
2022-10-28T18:47:30.630+0000 I CONTROL  [initandlisten] ** WARNING: Access control is not enabled for the database.
2022-10-28T18:47:30.630+0000 I CONTROL  [initandlisten] **          Read and write access to data and configuration is unrestricted.
2022-10-28T18:47:30.630+0000 I CONTROL  [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2022-10-28T18:47:30.630+0000 I CONTROL  [initandlisten] 
2022-10-28T18:47:30.631+0000 I STORAGE  [initandlisten] finished checking dbs
2022-10-28T18:47:30.631+0000 I NETWORK  [initandlisten] shutdown: going to close listening sockets...
2022-10-28T18:47:30.631+0000 I NETWORK  [initandlisten] removing socket file: /tmp/mongodb-27020.sock
2022-10-28T18:47:30.631+0000 I NETWORK  [initandlisten] shutdown: going to flush diaglog...
2022-10-28T18:47:30.631+0000 I STORAGE  [initandlisten] WiredTigerKVEngine shutting down
2022-10-28T18:47:30.642+0000 I STORAGE  [initandlisten] shutdown: removing fs lock...
2022-10-28T18:47:30.642+0000 I CONTROL  [initandlisten] now exiting
2022-10-28T18:47:30.642+0000 I CONTROL  [initandlisten] shutting down with code:0

I am unsure if this is a successful repair output or not, but would really appreciate some guidance in next steps or things to check if anyone has seen or dealt with a failure like this before. Any input would be greatly appreciated.

Thank you.

Hi @Austin_Hauer and welcome to the MongoDB community forum!!

In most cases, Input/output error in the logs was caused by issues in the hardware or OS layer instead of from MongoDB itself. From the log snippet you posted, to me it appears that WiredTiger tried to write a journal file, but failed to do so.

It would be very helpful if you could share a few more details on the concerns that you have mentioned above:

  1. How does the kubernetes deployment looks like and if the deployment is in active use.
  2. If there has been any kind of inserts or updates during this period.
  1. Could you help in understanding if the pod is in the running state for 7 mins and the error comes up or this comes up during the upgrade process.

  2. Are you following any documentation to upgrade the MongoDB on kubernetes setup ? and which version are you trying to upgrade to?

  3. Could you also share the pod logs from the begining upto 7 mins till the failure has been observed

Let us know if you have any further questions.

Best Regards
Aasawari

1 Like