MongoDB Crashes after Upgrade to 6.0

I upgraded our MongoDB server from 4.4 to 6.0 following the instructions to first upgrade to 5.0 and then to 6.0. All seemed to go well. However, I am facing a couple issues.

First, the preferred approach to starting mongod doesn’t work. My server is RHEL 7.9. The preferred startup approach is:

sudo systemctl start mongod

This command results in the following:

Job for mongod.service failed because the control process exited with error code. See “systemctl status mongod.service” and “journalctl -xe” for details.

I looked at systemctl status and journalctl but didn’t see any obvious. The output from systemctl status is:

mongod.service - MongoDB Database Server
   Loaded: loaded (/usr/lib/systemd/system/mongod.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2022-09-28 13:14:07 EDT; 1min 22s ago
     Docs: https://docs.mongodb.org/manual
  Process: 183848 ExecStart=/usr/bin/mongod $OPTIONS (code=exited, status=14)
  Process: 183845 ExecStartPre=/usr/bin/chmod 0755 /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 183842 ExecStartPre=/usr/bin/chown mongod:mongod /var/run/mongodb (code=exited, status=0/SUCCESS)
  Process: 183839 ExecStartPre=/usr/bin/mkdir -p /var/run/mongodb (code=exited, status=0/SUCCESS)
 Main PID: 150176 (code=exited, status=0/SUCCESS)

Sep 28 13:14:06 <server> systemd[1]: Starting MongoDB Database Server...
Sep 28 13:14:06 <server> mongod[183848]: about to fork child process, waiting until server is ready for connections.
Sep 28 13:14:06 <server> mongod[183848]: forked process: 183851
Sep 28 13:14:07 <server> mongod[183848]: ERROR: child process failed, exited with 14
Sep 28 13:14:07 <server> mongod[183848]: To see additional information in this output, start without the "--fork" option.
Sep 28 13:14:07 <server> systemd[1]: mongod.service: control process exited, code=exited status=14
Sep 28 13:14:07 <server> systemd[1]: Failed to start MongoDB Database Server.
Sep 28 13:14:07 <server> systemd[1]: Unit mongod.service entered failed state.
Sep 28 13:14:07 <server> systemd[1]: mongod.service failed.

I don’t know how to start without the “–fork” option. So, I have not tried that option.

The output from journalctl is longer but also not very enlightening:

-- Unit mongod.service has begun starting up.
Sep 28 13:16:07 <server> mongod[184281]: about to fork child process, waiting until server is ready for connections.
Sep 28 13:16:07 <server> mongod[184281]: forked process: 184284
Sep 28 13:16:08 <server> mongod[184281]: ERROR: child process failed, exited with 14
Sep 28 13:16:08 <server> mongod[184281]: To see additional information in this output, start without the "--fork" option.
Sep 28 13:16:08 <server> systemd[1]: mongod.service: control process exited, code=exited status=14
Sep 28 13:16:08 <server> systemd[1]: Failed to start MongoDB Database Server.
-- Subject: Unit mongod.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mongod.service has failed.
--
-- The result is failed.
Sep 28 13:16:08 <server> systemd[1]: Unit mongod.service entered failed state.
Sep 28 13:16:08 <server> systemd[1]: mongod.service failed.
Sep 28 13:16:08 <server> python3[1035]: Job for mongod.service failed because the control process exited with error code. See "systemctl status mongod.service" and "journalctl -xe" f
Sep 28 13:16:17 <server> sshd[132898]: pam_unix(sshd:session): session closed for user dnc23
Sep 28 13:16:17 <server> sudo[150323]: pam_unix(sudo:session): session closed for user root
Sep 28 13:17:34 <server> omiserver[184545]: pam_unix(omi:auth): authentication failure; logname= uid=0 euid=0 tty= ruser= rhost=  user=svc_scom
Sep 28 13:17:34 <server> omiserver[184545]: pam_krb5[184545]: TGT verified using key for 'host/<server>@DHE.DUKE.EDU'
Sep 28 13:17:34 <server> omiserver[184545]: pam_krb5[184545]: authentication succeeds for 'svc_scom' (svc_scom@DHE.DUKE.EDU)
Sep 28 13:17:34 <server> sudo[184550]: svc_scom : TTY=unknown ; PWD=/var/opt/microsoft/scx/tmp ; USER=root ; COMMAND=/bin/sh -c df -kP | awk '{print $6, $3}' | grep -v /dev/
Sep 28 13:17:34 <server> sudo[184550]: pam_unix(sudo:session): session opened for user root by (uid=0)
Sep 28 13:17:34 <server> sudo[184550]: pam_unix(sudo:session): session closed for user root
Sep 28 13:17:37 <server> sudo[184558]: svc_scom : TTY=unknown ; PWD=/var/opt/omi/run ; USER=root ; COMMAND=/opt/microsoft/scx/bin/scxlogfilereader -p
Sep 28 13:17:37 <server> sudo[184558]: pam_unix(sudo:session): session opened for user root by (uid=0)
Sep 28 13:17:37 <server> sudo[184558]: pam_unix(sudo:session): session closed for user root
Sep 28 13:18:08 <server> systemd[1]: Starting MongoDB Database Server...
-- Subject: Unit mongod.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mongod.service has begun starting up.
Sep 28 13:18:08 <server> mongod[184692]: about to fork child process, waiting until server is ready for connections.
Sep 28 13:18:08 <server> mongod[184692]: forked process: 184695
Sep 28 13:18:09 <server> mongod[184692]: ERROR: child process failed, exited with 14
Sep 28 13:18:09 <server> mongod[184692]: To see additional information in this output, start without the "--fork" option.
Sep 28 13:18:09 <server> systemd[1]: mongod.service: control process exited, code=exited status=14
Sep 28 13:18:09 <server> systemd[1]: Failed to start MongoDB Database Server.
-- Subject: Unit mongod.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mongod.service has failed.
--
-- The result is failed.
Sep 28 13:18:09 <server> systemd[1]: Unit mongod.service entered failed state.
Sep 28 13:18:09 <server> systemd[1]: mongod.service failed.
Sep 28 13:18:09 <server> python3[1035]: Job for mongod.service failed because the control process exited with error code. See "systemctl status mongod.service" and "journalctl -xe" f
lines 1708-1757/1757 (END)

NOTE: I replaced my actual server name with in this output to comply with my organization’s security requirements.

Following this attempt to start MongoDB, I used the following command:

mongod --config ~/mongod.conf

With this command the server starts and runs - for about half an hour. Then it disappears. Looking at the log file doesn’t really offer an explanation. The log shows that the server starts and runs. Then after a short time a series of the following block of messages appears. Each block begins with a SERVER RESTARTED message:

{"t":{"$date":"2022-09-28T09:57:38.072-04:00"},"s":"I",  "c":"CONTROL",  "id":20698,   "ctx":"-","msg":"***** SERVER RESTARTED *****"}

{"t":{"$date":"2022-09-28T09:57:38.073-04:00"},"s":"I",  "c":"NETWORK",  "id":4915701, "ctx":"main","msg":"Initialized wire specification","attr":{"spec":{"incomingExternalClient":{"minWireVersion":0,"maxWireVersion":17},"incomingInternalClient":{"minWireVersion":0,"maxWireVersion":17},"outgoing":{"minWireVersion":6,"maxWireVersion":17},"isInternalClient":true}}}

{"t":{"$date":"2022-09-28T09:57:38.076-04:00"},"s":"I",  "c":"CONTROL",  "id":23285,   "ctx":"main","msg":"Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'"}

{"t":{"$date":"2022-09-28T09:57:38.083-04:00"},"s":"I",  "c":"NETWORK",  "id":4648601, "ctx":"main","msg":"Implicit TCP FastOpen unavailable. If TCP FastOpen is required, set tcpFastOpenServer, tcpFastOpenClient, and tcpFastOpenQueueSize."}

{"t":{"$date":"2022-09-28T09:57:38.106-04:00"},"s":"I",  "c":"REPL",     "id":5123008, "ctx":"main","msg":"Successfully registered PrimaryOnlyService","attr":{"service":"TenantMigrationDonorService","namespace":"config.tenantMigrationDonors"}}

{"t":{"$date":"2022-09-28T09:57:38.106-04:00"},"s":"I",  "c":"REPL",     "id":5123008, "ctx":"main","msg":"Successfully registered PrimaryOnlyService","attr":{"service":"TenantMigrationRecipientService","namespace":"config.tenantMigrationRecipients"}}

{"t":{"$date":"2022-09-28T09:57:38.106-04:00"},"s":"I",  "c":"REPL",     "id":5123008, "ctx":"main","msg":"Successfully registered PrimaryOnlyService","attr":{"service":"ShardSplitDonorService","namespace":"config.tenantSplitDonors"}}

{"t":{"$date":"2022-09-28T09:57:38.107-04:00"},"s":"I",  "c":"CONTROL",  "id":5945603, "ctx":"main","msg":"Multi threading initialized"}

{"t":{"$date":"2022-09-28T09:57:38.107-04:00"},"s":"I",  "c":"CONTROL",  "id":4615611, "ctx":"initandlisten","msg":"MongoDB starting","attr":{"pid":144512,"port":27017,"dbPath":"/data/db_storage/mongo","architecture":"64-bit","host":"vlt-metis-01.dhe.duke.edu"}}

{"t":{"$date":"2022-09-28T09:57:38.107-04:00"},"s":"I",  "c":"CONTROL",  "id":23403,   "ctx":"initandlisten","msg":"Build Info","attr":{"buildInfo":{"version":"6.0.1","gitVersion":"32f0f9c88dc44a2c8073a5bd47cf779d4bfdee6b","openSSLVersion":"OpenSSL 1.0.1e-fips 11 Feb 2013","modules":[],"allocator":"tcmalloc","environment":{"distmod":"rhel70","distarch":"x86_64","target_arch":"x86_64"}}}}

{"t":{"$date":"2022-09-28T09:57:38.107-04:00"},"s":"I",  "c":"CONTROL",  "id":51765,   "ctx":"initandlisten","msg":"Operating System","attr":{"os":{"name":"Red Hat Enterprise Linux Server release 7.9 (Maipo)","version":"Kernel 3.10.0-1160.76.1.el7.x86_64"}}}

{"t":{"$date":"2022-09-28T09:57:38.107-04:00"},"s":"I",  "c":"CONTROL",  "id":21951,   "ctx":"initandlisten","msg":"Options set by command line","attr":{"options":{"config":"/etc/mongod.conf","net":{"bindIp":"0.0.0.0","port":27017},"processManagement":{"fork":true,"pidFilePath":"/var/run/mongodb/mongod.pid","timeZoneInfo":"/usr/share/zoneinfo"},"security":{"authorization":"enabled"},"storage":{"dbPath":"/data/db_storage/mongo","journal":{"enabled":true}},"systemLog":{"destination":"file","logAppend":true,"path":"/var/log/mongodb/mongod.log"}}}}

{"t":{"$date":"2022-09-28T09:57:38.108-04:00"},"s":"E",  "c":"NETWORK",  "id":23024,   "ctx":"initandlisten","msg":"Failed to unlink socket file","attr":{"path":"/tmp/mongodb-27017.sock","error":"Operation not permitted"}}

{"t":{"$date":"2022-09-28T09:57:38.108-04:00"},"s":"F",  "c":"ASSERT",   "id":23091,   "ctx":"initandlisten","msg":"Fatal assertion","attr":{"msgid":40486,"file":"src/mongo/transport/transport_layer_asio.cpp","line":1120}}

{"t":{"$date":"2022-09-28T09:57:38.108-04:00"},"s":"F",  "c":"ASSERT",   "id":23092,   "ctx":"initandlisten","msg":"\n\n***aborting after fassert() failure\n\n"}

So, at the moment I am dead in the water. This is the community edition that we continue to evaluate for purposes. In general, we like MongoDB. However, it actually has to work to be useable. Any insights to address our server crashing will be most appreciated.

Hi David, I noticed this line in the log file you pasted:

The socket file might be owned by the root user instead of the user that’s trying to run the service. You can delete this file and it will be recreated on the next start of the process.

1 Like

Thanks, I will give that a try.

Quick question. When starting mongod, should I start it as root? The instructions don’t say.

You should never start the mongod process as the root user. The package you installed this as should have created a mongod (or mongodb) user and that is the user that the service will use when you run sudo systemctl start mongod.

2 Likes

Thanks.
I just got out of a meeting where we discussed this issue. Unknown to me there was another (home grown) process running the background that monitors mongodb and restarts it if it detects that mongodb has stopped.

We stopped that process and now MongoDB seems to be working fine. Will take a closer look at that monitoring process. But I think the preferred approach is to use replication to ensure that the system continues working.

2 Likes