Issue upgrading a replica set from 4.2 to 4.4

While upgrading our replica set from 4.2 to 4.4 (the eventual target is 6.0), we observed the mongod process on the first secondary node crash a few minutes after startup. We are following the upgrade guide here.
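
For context, the rolling-upgrade procedure we are following is roughly the standard one from the docs (paraphrased below as a sketch; the exact install commands depend on how the binaries are packaged in your environment):

# on each secondary, one node at a time:
sudo systemctl stop mongod
# install the 4.4.x binaries (e.g. from the 4.4 yum repo), then restart:
sudo systemctl start mongod
# wait until the member is back in SECONDARY state (check rs.status()) before the next node

# when only the primary remains on 4.2, ask it to step down and upgrade it the same way
# (run in the mongo shell on the primary):
rs.stepDown()

# once every member runs 4.4, enable 4.4 features from the new primary:
db.adminCommand( { setFeatureCompatibilityVersion: "4.4" } )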

The mongod log from that secondary (the first node we upgraded) is below. Any input is appreciated, and thanks in advance :grin:.

{"t":{"$date":"2023-03-15T08:54:59.720+00:00"},"s":"I",  "c":"CONTROL",  "id":23285,   "ctx":"main","msg":"Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'"}
{"t":{"$date":"2023-03-15T08:54:59.736+00:00"},"s":"I",  "c":"NETWORK",  "id":4648601, "ctx":"main","msg":"Implicit TCP FastOpen unavailable. If TCP FastOpen is required, set tcpFastOpenServer, tcpFastOpenClient, and tcpFastOpenQueueSize."}
{"t":{"$date":"2023-03-15T08:54:59.802+00:00"},"s":"I",  "c":"STORAGE",  "id":4615611, "ctx":"initandlisten","msg":"MongoDB starting","attr":{"pid":4500,"port":27017,"dbPath":"/data/mongodb/","architecture":"64-bit","host":"denotsl2716.int.kn"}}
{"t":{"$date":"2023-03-15T08:54:59.802+00:00"},"s":"I",  "c":"CONTROL",  "id":23403,   "ctx":"initandlisten","msg":"Build Info","attr":{"buildInfo":{"version":"4.4.19","gitVersion":"9a996e0ad993148b9650dc402e6d3b1804ad3b8a","openSSLVersion":"OpenSSL 1.0.1e-fips 11 Feb 2013","modules":[],"allocator":"tcmalloc","environment":{"distmod":"rhel70","distarch":"x86_64","target_arch":"x86_64"}}}}
{"t":{"$date":"2023-03-15T08:54:59.802+00:00"},"s":"I",  "c":"CONTROL",  "id":51765,   "ctx":"initandlisten","msg":"Operating System","attr":{"os":{"name":"Red Hat Enterprise Linux Server release 7.9 (Maipo)","version":"Kernel 3.10.0-1160.80.1.el7.x86_64"}}}
{"t":{"$date":"2023-03-15T08:54:59.802+00:00"},"s":"I",  "c":"CONTROL",  "id":21951,   "ctx":"initandlisten","msg":"Options set by command line","attr":{"options":{"config":"/etc/mongod.conf","net":{"bindIp":"127.0.0.1,denotsl2716.int.kn","port":27017},"processManagement":{"fork":true,"pidFilePath":"/var/run/mongodb/mongod.pid","timeZoneInfo":"/usr/share/zoneinfo"},"replication":{"replSetName":"tea-uat-rs"},"security":{"keyFile":"/data/mongodb/mongo-keyfile"},"storage":{"dbPath":"/data/mongodb/","journal":{"enabled":true}},"systemLog":{"destination":"file","logAppend":true,"path":"/var/log/mongodb/mongod.log"}}}}
{"t":{"$date":"2023-03-15T08:54:59.804+00:00"},"s":"I",  "c":"STORAGE",  "id":22270,   "ctx":"initandlisten","msg":"Storage engine to use detected by data files","attr":{"dbpath":"/data/mongodb/","storageEngine":"wiredTiger"}}
{"t":{"$date":"2023-03-15T08:54:59.804+00:00"},"s":"I",  "c":"STORAGE",  "id":22315,   "ctx":"initandlisten","msg":"Opening WiredTiger","attr":{"config":"create,cache_size=3389M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000,close_scan_interval=10,close_handle_minimum=250),statistics_log=(wait=0),verbose=[recovery_progress,checkpoint_progress,compact_progress],"}}
{"t":{"$date":"2023-03-15T08:55:00.287+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1678870500:287119][4500:0x7f1f8648ebc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 215 through 216"}}
{"t":{"$date":"2023-03-15T08:55:00.384+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1678870500:384874][4500:0x7f1f8648ebc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 216 through 216"}}
{"t":{"$date":"2023-03-15T08:55:00.488+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1678870500:488429][4500:0x7f1f8648ebc0], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Main recovery loop: starting at 215/2048 to 216/256"}}
{"t":{"$date":"2023-03-15T08:55:00.488+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1678870500:488824][4500:0x7f1f8648ebc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 215 through 216"}}
{"t":{"$date":"2023-03-15T08:55:00.563+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1678870500:563001][4500:0x7f1f8648ebc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 216 through 216"}}
{"t":{"$date":"2023-03-15T08:55:00.623+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1678870500:623133][4500:0x7f1f8648ebc0], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Set global recovery timestamp: (1678869729, 1)"}}
{"t":{"$date":"2023-03-15T08:55:00.623+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1678870500:623215][4500:0x7f1f8648ebc0], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Set global oldest timestamp: (1678869729, 1)"}}
{"t":{"$date":"2023-03-15T08:55:00.629+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1678870500:629795][4500:0x7f1f8648ebc0], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 5, snapshot max: 5 snapshot count: 0, oldest timestamp: (1678869729, 1) , meta checkpoint timestamp: (1678869729, 1) base write gen: 8716377"}}
{"t":{"$date":"2023-03-15T08:55:00.644+00:00"},"s":"I",  "c":"STORAGE",  "id":4795906, "ctx":"initandlisten","msg":"WiredTiger opened","attr":{"durationMillis":840}}
{"t":{"$date":"2023-03-15T08:55:00.644+00:00"},"s":"I",  "c":"RECOVERY", "id":23987,   "ctx":"initandlisten","msg":"WiredTiger recoveryTimestamp","attr":{"recoveryTimestamp":{"$timestamp":{"t":1678869729,"i":1}}}}
{"t":{"$date":"2023-03-15T08:55:00.649+00:00"},"s":"I",  "c":"STORAGE",  "id":22383,   "ctx":"initandlisten","msg":"The size storer reports that the oplog contains","attr":{"numRecords":9813781,"dataSize":2131948496}}
{"t":{"$date":"2023-03-15T08:55:00.649+00:00"},"s":"I",  "c":"STORAGE",  "id":22386,   "ctx":"initandlisten","msg":"Sampling the oplog to determine where to place markers for truncation"}
{"t":{"$date":"2023-03-15T08:55:00.650+00:00"},"s":"I",  "c":"STORAGE",  "id":22389,   "ctx":"initandlisten","msg":"Sampling from the oplog to determine where to place markers for truncation","attr":{"from":{"$timestamp":{"t":1670371221,"i":6818}},"to":{"$timestamp":{"t":1678869729,"i":1}}}}
{"t":{"$date":"2023-03-15T08:55:00.650+00:00"},"s":"I",  "c":"STORAGE",  "id":22390,   "ctx":"initandlisten","msg":"Taking samples and assuming each oplog section contains","attr":{"numSamples":1001,"containsNumRecords":98002,"containsNumBytes":21289981}}
{"t":{"$date":"2023-03-15T08:55:00.724+00:00"},"s":"I",  "c":"STORAGE",  "id":22393,   "ctx":"initandlisten","msg":"Oplog sampling complete"}
{"t":{"$date":"2023-03-15T08:55:00.724+00:00"},"s":"I",  "c":"STORAGE",  "id":22382,   "ctx":"initandlisten","msg":"WiredTiger record store oplog processing finished","attr":{"durationMillis":74}}
{"t":{"$date":"2023-03-15T08:55:00.744+00:00"},"s":"I",  "c":"STORAGE",  "id":22262,   "ctx":"initandlisten","msg":"Timestamp monitor starting"}
{"t":{"$date":"2023-03-15T08:55:00.750+00:00"},"s":"W",  "c":"CONTROL",  "id":22178,   "ctx":"initandlisten","msg":"/sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'","tags":["startupWarnings"]}
{"t":{"$date":"2023-03-15T08:55:00.751+00:00"},"s":"W",  "c":"CONTROL",  "id":22181,   "ctx":"initandlisten","msg":"/sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'","tags":["startupWarnings"]}
{"t":{"$date":"2023-03-15T08:55:00.794+00:00"},"s":"I",  "c":"STORAGE",  "id":20536,   "ctx":"initandlisten","msg":"Flow Control is enabled on this deployment"}
{"t":{"$date":"2023-03-15T08:55:00.796+00:00"},"s":"I",  "c":"SHARDING", "id":20997,   "ctx":"initandlisten","msg":"Refreshed RWC defaults","attr":{"newDefaults":{}}}
{"t":{"$date":"2023-03-15T08:55:00.798+00:00"},"s":"I",  "c":"FTDC",     "id":20625,   "ctx":"initandlisten","msg":"Initializing full-time diagnostic data capture","attr":{"dataDirectory":"/data/mongodb/diagnostic.data"}}
{"t":{"$date":"2023-03-15T08:55:00.799+00:00"},"s":"I",  "c":"REPL",     "id":6015317, "ctx":"initandlisten","msg":"Setting new configuration state","attr":{"newState":"ConfigStartingUp","oldState":"ConfigPreStart"}}
{"t":{"$date":"2023-03-15T08:55:00.799+00:00"},"s":"I",  "c":"REPL",     "id":4280500, "ctx":"initandlisten","msg":"Attempting to create internal replication collections"}
{"t":{"$date":"2023-03-15T08:55:00.801+00:00"},"s":"I",  "c":"REPL",     "id":4280501, "ctx":"initandlisten","msg":"Attempting to load local voted for document"}
{"t":{"$date":"2023-03-15T08:55:00.801+00:00"},"s":"I",  "c":"REPL",     "id":4280502, "ctx":"initandlisten","msg":"Searching for local Rollback ID document"}
{"t":{"$date":"2023-03-15T08:55:00.802+00:00"},"s":"I",  "c":"REPL",     "id":21529,   "ctx":"initandlisten","msg":"Initializing rollback ID","attr":{"rbid":1}}
{"t":{"$date":"2023-03-15T08:55:00.802+00:00"},"s":"I",  "c":"REPL",     "id":4280504, "ctx":"initandlisten","msg":"Cleaning up any partially applied oplog batches & reading last op from oplog"}
{"t":{"$date":"2023-03-15T08:55:00.803+00:00"},"s":"I",  "c":"REPL",     "id":21544,   "ctx":"initandlisten","msg":"Recovering from stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1678869729,"i":1}},"topOfOplog":{"ts":{"$timestamp":{"t":1678869729,"i":1}},"t":116},"appliedThrough":{"ts":{"$timestamp":{"t":0,"i":0}},"t":-1},"oplogTruncateAfterPoint":{"$timestamp":{"t":0,"i":0}}}}
{"t":{"$date":"2023-03-15T08:55:00.803+00:00"},"s":"I",  "c":"REPL",     "id":21545,   "ctx":"initandlisten","msg":"Starting recovery oplog application at the stable timestamp","attr":{"stableTimestamp":{"$timestamp":{"t":1678869729,"i":1}}}}
{"t":{"$date":"2023-03-15T08:55:00.803+00:00"},"s":"I",  "c":"REPL",     "id":21549,   "ctx":"initandlisten","msg":"No oplog entries to apply for recovery. Start point is at the top of the oplog"}
{"t":{"$date":"2023-03-15T08:55:00.803+00:00"},"s":"I",  "c":"REPL",     "id":4280506, "ctx":"initandlisten","msg":"Reconstructing prepared transactions"}
{"t":{"$date":"2023-03-15T08:55:00.804+00:00"},"s":"I",  "c":"REPL",     "id":4280507, "ctx":"initandlisten","msg":"Loaded replica set config, scheduled callback to set local config"}
{"t":{"$date":"2023-03-15T08:55:00.804+00:00"},"s":"I",  "c":"REPL",     "id":4280508, "ctx":"ReplCoord-0","msg":"Attempting to set local replica set config; validating config for startup"}
{"t":{"$date":"2023-03-15T08:55:00.805+00:00"},"s":"I",  "c":"CONTROL",  "id":20714,   "ctx":"LogicalSessionCacheRefresh","msg":"Failed to refresh session cache, will try again at the next refresh interval","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2023-03-15T08:55:00.805+00:00"},"s":"I",  "c":"CONTROL",  "id":20711,   "ctx":"LogicalSessionCacheReap","msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2023-03-15T08:55:00.805+00:00"},"s":"I",  "c":"REPL",     "id":40440,   "ctx":"initandlisten","msg":"Starting the TopologyVersionObserver"}
{"t":{"$date":"2023-03-15T08:55:00.805+00:00"},"s":"I",  "c":"REPL",     "id":40445,   "ctx":"TopologyVersionObserver","msg":"Started TopologyVersionObserver"}
{"t":{"$date":"2023-03-15T08:55:00.806+00:00"},"s":"I",  "c":"NETWORK",  "id":23015,   "ctx":"listener","msg":"Listening on","attr":{"address":"/tmp/mongodb-27017.sock"}}
{"t":{"$date":"2023-03-15T08:55:00.806+00:00"},"s":"I",  "c":"NETWORK",  "id":23015,   "ctx":"listener","msg":"Listening on","attr":{"address":"10.61.229.171"}}
{"t":{"$date":"2023-03-15T08:55:00.806+00:00"},"s":"I",  "c":"NETWORK",  "id":23015,   "ctx":"listener","msg":"Listening on","attr":{"address":"127.0.0.1"}}
{"t":{"$date":"2023-03-15T08:55:00.806+00:00"},"s":"I",  "c":"NETWORK",  "id":23016,   "ctx":"listener","msg":"Waiting for connections","attr":{"port":27017,"ssl":"off"}}
{"t":{"$date":"2023-03-15T08:55:00.808+00:00"},"s":"I",  "c":"CONTROL",  "id":23377,   "ctx":"SignalHandler","msg":"Received signal","attr":{"signal":15,"error":"Terminated"}}
{"t":{"$date":"2023-03-15T08:55:00.808+00:00"},"s":"I",  "c":"CONTROL",  "id":23378,   "ctx":"SignalHandler","msg":"Signal was sent by kill(2)","attr":{"pid":1,"uid":0}}
{"t":{"$date":"2023-03-15T08:55:00.808+00:00"},"s":"I",  "c":"CONTROL",  "id":23381,   "ctx":"SignalHandler","msg":"will terminate after current cmd ends"}
{"t":{"$date":"2023-03-15T08:55:00.811+00:00"},"s":"I",  "c":"REPL",     "id":4784900, "ctx":"SignalHandler","msg":"Stepping down the ReplicationCoordinator for shutdown","attr":{"waitTimeMillis":10000}}
{"t":{"$date":"2023-03-15T08:55:00.811+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.61.229.248:59082","connectionId":1,"connectionCount":1}}
{"t":{"$date":"2023-03-15T08:55:00.811+00:00"},"s":"I",  "c":"COMMAND",  "id":4784901, "ctx":"SignalHandler","msg":"Shutting down the MirrorMaestro"}
{"t":{"$date":"2023-03-15T08:55:00.811+00:00"},"s":"I",  "c":"REPL",     "id":40441,   "ctx":"SignalHandler","msg":"Stopping TopologyVersionObserver"}
{"t":{"$date":"2023-03-15T08:55:00.811+00:00"},"s":"I",  "c":"REPL",     "id":40447,   "ctx":"TopologyVersionObserver","msg":"Stopped TopologyVersionObserver"}
{"t":{"$date":"2023-03-15T08:55:00.811+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1","msg":"client metadata","attr":{"remote":"10.61.229.248:59082","client":"conn1","doc":{"driver":{"name":"NetworkInterfaceTL","version":"4.2.23"},"os":{"type":"Linux","name":"Red Hat Enterprise Linux Server release 7.9 (Maipo)","architecture":"x86_64","version":"Kernel 3.10.0-1160.80.1.el7.x86_64"}}}}
{"t":{"$date":"2023-03-15T08:55:00.813+00:00"},"s":"I",  "c":"SHARDING", "id":4784902, "ctx":"SignalHandler","msg":"Shutting down the WaitForMajorityService"}
{"t":{"$date":"2023-03-15T08:55:00.813+00:00"},"s":"I",  "c":"CONTROL",  "id":4784903, "ctx":"SignalHandler","msg":"Shutting down the LogicalSessionCache"}
{"t":{"$date":"2023-03-15T08:55:00.813+00:00"},"s":"I",  "c":"NETWORK",  "id":20562,   "ctx":"SignalHandler","msg":"Shutdown: going to close listening sockets"}
{"t":{"$date":"2023-03-15T08:55:00.831+00:00"},"s":"I",  "c":"NETWORK",  "id":23017,   "ctx":"listener","msg":"removing socket file","attr":{"path":"/tmp/mongodb-27017.sock"}}
{"t":{"$date":"2023-03-15T08:55:00.831+00:00"},"s":"I",  "c":"NETWORK",  "id":4784905, "ctx":"SignalHandler","msg":"Shutting down the global connection pool"}
{"t":{"$date":"2023-03-15T08:55:00.831+00:00"},"s":"I",  "c":"STORAGE",  "id":4784906, "ctx":"SignalHandler","msg":"Shutting down the FlowControlTicketholder"}
{"t":{"$date":"2023-03-15T08:55:00.831+00:00"},"s":"I",  "c":"-",        "id":20520,   "ctx":"SignalHandler","msg":"Stopping further Flow Control ticket acquisitions."}
{"t":{"$date":"2023-03-15T08:55:00.831+00:00"},"s":"I",  "c":"REPL",     "id":4784907, "ctx":"SignalHandler","msg":"Shutting down the replica set node executor"}
{"t":{"$date":"2023-03-15T08:55:00.832+00:00"},"s":"I",  "c":"ASIO",     "id":22582,   "ctx":"ReplNodeDbWorkerNetwork","msg":"Killing all outstanding egress activity."}
{"t":{"$date":"2023-03-15T08:55:00.832+00:00"},"s":"I",  "c":"STORAGE",  "id":4784908, "ctx":"SignalHandler","msg":"Shutting down the PeriodicThreadToAbortExpiredTransactions"}
{"t":{"$date":"2023-03-15T08:55:00.832+00:00"},"s":"I",  "c":"STORAGE",  "id":4784934, "ctx":"SignalHandler","msg":"Shutting down the PeriodicThreadToDecreaseSnapshotHistoryCachePressure"}
{"t":{"$date":"2023-03-15T08:55:00.832+00:00"},"s":"I",  "c":"ACCESS",   "id":20250,   "ctx":"conn1","msg":"Authentication succeeded","attr":{"mechanism":"SCRAM-SHA-1","speculative":false,"principalName":"__system","authenticationDatabase":"local","remote":"10.61.229.248:59082","extraInfo":{}}}
{"t":{"$date":"2023-03-15T08:55:00.832+00:00"},"s":"I",  "c":"REPL",     "id":4784909, "ctx":"SignalHandler","msg":"Shutting down the ReplicationCoordinator"}
{"t":{"$date":"2023-03-15T08:55:00.832+00:00"},"s":"I",  "c":"REPL",     "id":21328,   "ctx":"SignalHandler","msg":"Shutting down replication subsystems"}
{"t":{"$date":"2023-03-15T08:55:30.870+00:00"},"s":"I",  "c":"NETWORK",  "id":20125,   "ctx":"ReplCoord-0","msg":"DBClientConnection failed to receive message","attr":{"connString":"denotsl2716.int.kn:27017","error":"NetworkTimeout: Socket operation timed out"}}
{"t":{"$date":"2023-03-15T08:55:30.870+00:00"},"s":"I",  "c":"NETWORK",  "id":20117,   "ctx":"ReplCoord-0","msg":"Can't authenticate as internal user","attr":{"connString":"denotsl2716.int.kn:27017 failed","error":{"code":6,"codeName":"HostUnreachable","errmsg":"network error while attempting to run command 'ismaster' on host 'denotsl2716.int.kn:27017' "}}}
{"t":{"$date":"2023-03-15T08:55:30.870+00:00"},"s":"I",  "c":"NETWORK",  "id":4834701, "ctx":"ReplCoord-0","msg":"isSelf could not authenticate internal user","attr":{"hostAndPort":"denotsl2716.int.kn:27017","error":{"code":6,"codeName":"HostUnreachable","errmsg":"network error while attempting to run command 'ismaster' on host 'denotsl2716.int.kn:27017' "}}}
{"t":{"$date":"2023-03-15T08:55:30.930+00:00"},"s":"W",  "c":"REPL",     "id":21405,   "ctx":"ReplCoord-0","msg":"Locally stored replica set configuration does not have a valid entry for the current node; waiting for reconfig or remote heartbeat","attr":{"error":{"code":74,"codeName":"NodeNotFound","errmsg":"No host described in new configuration with {version: 5, term: -1} for replica set tea-uat-rs maps to this node"},"localConfig":{"_id":"tea-uat-rs","version":5,"protocolVersion":1,"writeConcernMajorityJournalDefault":true,"members":[{"_id":0,"host":"denotsl2715.int.kn:27017","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":1.0,"tags":{},"slaveDelay":0,"votes":1},{"_id":1,"host":"denotsl2716.int.kn:27017","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":0.5,"tags":{},"slaveDelay":0,"votes":1},{"_id":2,"host":"denotsl2717.int.kn:27017","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":0.5,"tags":{},"slaveDelay":0,"votes":1}],"settings":{"chainingAllowed":true,"heartbeatIntervalMillis":2000,"heartbeatTimeoutSecs":10,"electionTimeoutMillis":10000,"catchUpTimeoutMillis":-1,"catchUpTakeoverDelayMillis":30000,"getLastErrorModes":{},"getLastErrorDefaults":{"w":1,"wtimeout":0},"replicaSetId":{"$oid":"5f7291be834f68d9838b2d8a"}}}}}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":4280509, "ctx":"ReplCoord-0","msg":"Local configuration validated for startup"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":6015317, "ctx":"ReplCoord-0","msg":"Setting new configuration state","attr":{"newState":"ConfigSteady","oldState":"ConfigStartingUp"}}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":21392,   "ctx":"ReplCoord-0","msg":"New replica set config in use","attr":{"config":{"_id":"tea-uat-rs","version":5,"protocolVersion":1,"writeConcernMajorityJournalDefault":true,"members":[{"_id":0,"host":"denotsl2715.int.kn:27017","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":1.0,"tags":{},"slaveDelay":0,"votes":1},{"_id":1,"host":"denotsl2716.int.kn:27017","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":0.5,"tags":{},"slaveDelay":0,"votes":1},{"_id":2,"host":"denotsl2717.int.kn:27017","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":0.5,"tags":{},"slaveDelay":0,"votes":1}],"settings":{"chainingAllowed":true,"heartbeatIntervalMillis":2000,"heartbeatTimeoutSecs":10,"electionTimeoutMillis":10000,"catchUpTimeoutMillis":-1,"catchUpTakeoverDelayMillis":30000,"getLastErrorModes":{},"getLastErrorDefaults":{"w":1,"wtimeout":0},"replicaSetId":{"$oid":"5f7291be834f68d9838b2d8a"}}}}}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":21394,   "ctx":"ReplCoord-0","msg":"This node is not a member of the config"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":21358,   "ctx":"ReplCoord-0","msg":"Replica set state transition","attr":{"newState":"REMOVED","oldState":"STARTUP"}}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":21320,   "ctx":"ReplCoord-0","msg":"Updated term","attr":{"term":116}}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"NETWORK",  "id":22991,   "ctx":"ReplCoord-0","msg":"Skip closing connection for connection","attr":{"connectionId":1}}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"ASIO",     "id":22582,   "ctx":"ReplNetwork","msg":"Killing all outstanding egress activity."}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"SHARDING", "id":4784910, "ctx":"SignalHandler","msg":"Shutting down the ShardingInitializationMongoD"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":4784911, "ctx":"SignalHandler","msg":"Enqueuing the ReplicationStateTransitionLock for shutdown"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"-",        "id":4784912, "ctx":"SignalHandler","msg":"Killing all operations for shutdown"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"-",        "id":4695300, "ctx":"SignalHandler","msg":"Interrupted all currently running operations","attr":{"opsKilled":5}}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"COMMAND",  "id":4784913, "ctx":"SignalHandler","msg":"Shutting down all open transactions"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":4784914, "ctx":"SignalHandler","msg":"Acquiring the ReplicationStateTransitionLock for shutdown"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"INDEX",    "id":4784915, "ctx":"SignalHandler","msg":"Shutting down the IndexBuildsCoordinator"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":4784916, "ctx":"SignalHandler","msg":"Reacquiring the ReplicationStateTransitionLock for shutdown"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":4784917, "ctx":"SignalHandler","msg":"Attempting to mark clean shutdown"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"NETWORK",  "id":4784918, "ctx":"SignalHandler","msg":"Shutting down the ReplicaSetMonitor"}
{"t":{"$date":"2023-03-15T08:55:30.931+00:00"},"s":"I",  "c":"REPL",     "id":4784920, "ctx":"SignalHandler","msg":"Shutting down the LogicalTimeValidator"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"SHARDING", "id":4784921, "ctx":"SignalHandler","msg":"Shutting down the MigrationUtilExecutor"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"CONTROL",  "id":4784925, "ctx":"SignalHandler","msg":"Shutting down free monitoring"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"CONTROL",  "id":20609,   "ctx":"SignalHandler","msg":"Shutting down free monitoring"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":4784927, "ctx":"SignalHandler","msg":"Shutting down the HealthLog"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":4784929, "ctx":"SignalHandler","msg":"Acquiring the global lock for shutdown"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":4784930, "ctx":"SignalHandler","msg":"Shutting down the storage engine"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":22320,   "ctx":"SignalHandler","msg":"Shutting down journal flusher thread"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":22321,   "ctx":"SignalHandler","msg":"Finished shutting down journal flusher thread"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":20282,   "ctx":"SignalHandler","msg":"Deregistering all the collections"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":22372,   "ctx":"OplogVisibilityThread","msg":"Oplog visibility thread shutting down."}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":22261,   "ctx":"SignalHandler","msg":"Timestamp monitor shutting down"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":22317,   "ctx":"SignalHandler","msg":"WiredTigerKVEngine shutting down"}
{"t":{"$date":"2023-03-15T08:55:30.932+00:00"},"s":"I",  "c":"STORAGE",  "id":22318,   "ctx":"SignalHandler","msg":"Shutting down session sweeper thread"}
{"t":{"$date":"2023-03-15T08:55:30.933+00:00"},"s":"I",  "c":"STORAGE",  "id":22319,   "ctx":"SignalHandler","msg":"Finished shutting down session sweeper thread"}
{"t":{"$date":"2023-03-15T08:55:30.933+00:00"},"s":"I",  "c":"STORAGE",  "id":22322,   "ctx":"SignalHandler","msg":"Shutting down checkpoint thread"}
{"t":{"$date":"2023-03-15T08:55:30.933+00:00"},"s":"I",  "c":"STORAGE",  "id":22323,   "ctx":"SignalHandler","msg":"Finished shutting down checkpoint thread"}
{"t":{"$date":"2023-03-15T08:55:30.934+00:00"},"s":"I",  "c":"STORAGE",  "id":22324,   "ctx":"SignalHandler","msg":"Closing WiredTiger in preparation for reconfiguring","attr":{"closeConfig":"leak_memory=true,"}}
{"t":{"$date":"2023-03-15T08:55:30.938+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870530:938790][4500:0x7f1f814a6700], close_ckpt: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 8, snapshot max: 8 snapshot count: 0, oldest timestamp: (1678869729, 1) , meta checkpoint timestamp: (1678869729, 1) base write gen: 8716377"}}
{"t":{"$date":"2023-03-15T08:55:31.040+00:00"},"s":"I",  "c":"STORAGE",  "id":4795905, "ctx":"SignalHandler","msg":"WiredTiger closed","attr":{"durationMillis":106}}
{"t":{"$date":"2023-03-15T08:55:31.058+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:58640][4500:0x7f1f814a6700], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 216 through 217"}}
{"t":{"$date":"2023-03-15T08:55:31.112+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:112416][4500:0x7f1f814a6700], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 217 through 217"}}
{"t":{"$date":"2023-03-15T08:55:31.205+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:205621][4500:0x7f1f814a6700], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Main recovery loop: starting at 216/3200 to 217/256"}}
{"t":{"$date":"2023-03-15T08:55:31.206+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:205996][4500:0x7f1f814a6700], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 216 through 217"}}
{"t":{"$date":"2023-03-15T08:55:31.269+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:269720][4500:0x7f1f814a6700], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 217 through 217"}}
{"t":{"$date":"2023-03-15T08:55:31.326+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:326519][4500:0x7f1f814a6700], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Set global recovery timestamp: (1678869729, 1)"}}
{"t":{"$date":"2023-03-15T08:55:31.326+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:326627][4500:0x7f1f814a6700], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Set global oldest timestamp: (1678869729, 1)"}}
{"t":{"$date":"2023-03-15T08:55:31.333+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:333281][4500:0x7f1f814a6700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 8, snapshot max: 8 snapshot count: 0, oldest timestamp: (1678869729, 1) , meta checkpoint timestamp: (1678869729, 1) base write gen: 8716386"}}
{"t":{"$date":"2023-03-15T08:55:31.346+00:00"},"s":"I",  "c":"STORAGE",  "id":4795904, "ctx":"SignalHandler","msg":"WiredTiger re-opened","attr":{"durationMillis":306}}
{"t":{"$date":"2023-03-15T08:55:31.346+00:00"},"s":"I",  "c":"STORAGE",  "id":22325,   "ctx":"SignalHandler","msg":"Reconfiguring","attr":{"newConfig":"compatibility=(release=3.3)"}}
{"t":{"$date":"2023-03-15T08:55:31.472+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:472602][4500:0x7f1f814a6700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 9, snapshot max: 9 snapshot count: 0, oldest timestamp: (1678869729, 1) , meta checkpoint timestamp: (1678869729, 1) base write gen: 8716386"}}
{"t":{"$date":"2023-03-15T08:55:31.487+00:00"},"s":"I",  "c":"STORAGE",  "id":4795903, "ctx":"SignalHandler","msg":"Reconfigure complete","attr":{"durationMillis":141}}
{"t":{"$date":"2023-03-15T08:55:31.487+00:00"},"s":"I",  "c":"STORAGE",  "id":4795902, "ctx":"SignalHandler","msg":"Closing WiredTiger","attr":{"closeConfig":"leak_memory=true,"}}
{"t":{"$date":"2023-03-15T08:55:31.490+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1678870531:490069][4500:0x7f1f814a6700], close_ckpt: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 11, snapshot max: 11 snapshot count: 0, oldest timestamp: (1678869729, 1) , meta checkpoint timestamp: (1678869729, 1) base write gen: 8716386"}}
{"t":{"$date":"2023-03-15T08:55:31.506+00:00"},"s":"I",  "c":"STORAGE",  "id":4795901, "ctx":"SignalHandler","msg":"WiredTiger closed","attr":{"durationMillis":19}}
{"t":{"$date":"2023-03-15T08:55:31.506+00:00"},"s":"I",  "c":"STORAGE",  "id":22279,   "ctx":"SignalHandler","msg":"shutdown: removing fs lock..."}
{"t":{"$date":"2023-03-15T08:55:31.506+00:00"},"s":"I",  "c":"-",        "id":4784931, "ctx":"SignalHandler","msg":"Dropping the scope cache for shutdown"}
{"t":{"$date":"2023-03-15T08:55:31.506+00:00"},"s":"I",  "c":"FTDC",     "id":4784926, "ctx":"SignalHandler","msg":"Shutting down full-time data capture"}
{"t":{"$date":"2023-03-15T08:55:31.506+00:00"},"s":"I",  "c":"FTDC",     "id":20626,   "ctx":"SignalHandler","msg":"Shutting down full-time diagnostic data capture"}
{"t":{"$date":"2023-03-15T08:55:31.507+00:00"},"s":"I",  "c":"CONTROL",  "id":20565,   "ctx":"SignalHandler","msg":"Now exiting"}
{"t":{"$date":"2023-03-15T08:55:31.507+00:00"},"s":"I",  "c":"CONTROL",  "id":23138,   "ctx":"SignalHandler","msg":"Shutting down","attr":{"exitCode":0}}

Anyone have ideas?

From the log it seems mongod is complaining that replication has not been configured, per the two lines below from mongod.log:

{"t":{"$date":"2023-03-15T08:55:00.805+00:00"},"s":"I",  "c":"CONTROL",  "id":20714,   "ctx":"LogicalSessionCacheRefresh","msg":"Failed to refresh session cache, will try again at the next refresh interval","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}
{"t":{"$date":"2023-03-15T08:55:00.805+00:00"},"s":"I",  "c":"CONTROL",  "id":20711,   "ctx":"LogicalSessionCacheReap","msg":"Failed to reap transaction table","attr":{"error":"NotYetInitialized: Replication has not yet been configured"}}

But we do have a functional and healthy replica set running, as shown by rs.status():

tea-uat-rs:PRIMARY> rs.status()
{
        "set" : "tea-uat-rs",
        "date" : ISODate("2023-03-20T08:37:36.640Z"),
        "myState" : 1,
        "term" : NumberLong(114),
        "syncingTo" : "",
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "majorityVoteCount" : 2,
        "writeMajorityCount" : 2,
        "optimes" : {
                "lastCommittedOpTime" : {
                        "ts" : Timestamp(1679301450, 1),
                        "t" : NumberLong(114)
                },
                "lastCommittedWallTime" : ISODate("2023-03-20T08:37:30.514Z"),
                "readConcernMajorityOpTime" : {
                        "ts" : Timestamp(1679301450, 1),
                        "t" : NumberLong(114)
                },
                "readConcernMajorityWallTime" : ISODate("2023-03-20T08:37:30.514Z"),
                "appliedOpTime" : {
                        "ts" : Timestamp(1679301450, 1),
                        "t" : NumberLong(114)
                },
                "durableOpTime" : {
                        "ts" : Timestamp(1679301450, 1),
                        "t" : NumberLong(114)
                },
                "lastAppliedWallTime" : ISODate("2023-03-20T08:37:30.514Z"),
                "lastDurableWallTime" : ISODate("2023-03-20T08:37:30.514Z")
        },
        "lastStableRecoveryTimestamp" : Timestamp(1679301426, 1),
        "lastStableCheckpointTimestamp" : Timestamp(1679301426, 1),
        "electionCandidateMetrics" : {
                "lastElectionReason" : "priorityTakeover",
                "lastElectionDate" : ISODate("2023-03-15T09:08:33.620Z"),
                "electionTerm" : NumberLong(114),
                "lastCommittedOpTimeAtElection" : {
                        "ts" : Timestamp(1678871312, 1),
                        "t" : NumberLong(113)
                },
                "lastSeenOpTimeAtElection" : {
                        "ts" : Timestamp(1678871312, 1),
                        "t" : NumberLong(113)
                },
                "numVotesNeeded" : 2,
                "priorityAtElection" : 1,
                "electionTimeoutMillis" : NumberLong(10000),
                "priorPrimaryMemberId" : 1,
                "numCatchUpOps" : NumberLong(0),
                "newTermStartDate" : ISODate("2023-03-15T09:08:39.988Z"),
                "wMajorityWriteAvailabilityDate" : ISODate("2023-03-15T09:08:40.636Z")
        },
        "electionParticipantMetrics" : {
                "votedForCandidate" : true,
                "electionTerm" : NumberLong(113),
                "lastVoteDate" : ISODate("2023-03-15T09:08:22.867Z"),
                "electionCandidateMemberId" : 1,
                "voteReason" : "",
                "lastAppliedOpTimeAtElection" : {
                        "ts" : Timestamp(1678867807, 2),
                        "t" : NumberLong(112)
                },
                "maxAppliedOpTimeInSet" : {
                        "ts" : Timestamp(1678867807, 2),
                        "t" : NumberLong(112)
                },
                "priorityAtElection" : 1
        },
        "members" : [
                {
                        "_id" : 0,
                        "name" : "denotsl2715.int.kn:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 430198,
                        "optime" : {
                                "ts" : Timestamp(1679301450, 1),
                                "t" : NumberLong(114)
                        },
                        "optimeDate" : ISODate("2023-03-20T08:37:30Z"),
                        "syncingTo" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "electionTime" : Timestamp(1678871313, 1),
                        "electionDate" : ISODate("2023-03-15T09:08:33Z"),
                        "configVersion" : 5,
                        "self" : true,
                        "lastHeartbeatMessage" : ""
                },
                {
                        "_id" : 1,
                        "name" : "denotsl2716.int.kn:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 430164,
                        "optime" : {
                                "ts" : Timestamp(1679301450, 1),
                                "t" : NumberLong(114)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(1679301450, 1),
                                "t" : NumberLong(114)
                        },
                        "optimeDate" : ISODate("2023-03-20T08:37:30Z"),
                        "optimeDurableDate" : ISODate("2023-03-20T08:37:30Z"),
                        "lastHeartbeat" : ISODate("2023-03-20T08:37:36.096Z"),
                        "lastHeartbeatRecv" : ISODate("2023-03-20T08:37:35.811Z"),
                        "pingMs" : NumberLong(1),
                        "lastHeartbeatMessage" : "",
                        "syncingTo" : "denotsl2715.int.kn:27017",
                        "syncSourceHost" : "denotsl2715.int.kn:27017",
                        "syncSourceId" : 0,
                        "infoMessage" : "",
                        "configVersion" : 5
                },
                {
                        "_id" : 2,
                        "name" : "denotsl2717.int.kn:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 430111,
                        "optime" : {
                                "ts" : Timestamp(1679301450, 1),
                                "t" : NumberLong(114)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(1679301450, 1),
                                "t" : NumberLong(114)
                        },
                        "optimeDate" : ISODate("2023-03-20T08:37:30Z"),
                        "optimeDurableDate" : ISODate("2023-03-20T08:37:30Z"),
                        "lastHeartbeat" : ISODate("2023-03-20T08:37:34.847Z"),
                        "lastHeartbeatRecv" : ISODate("2023-03-20T08:37:34.967Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "",
                        "syncingTo" : "denotsl2716.int.kn:27017",
                        "syncSourceHost" : "denotsl2716.int.kn:27017",
                        "syncSourceId" : 1,
                        "infoMessage" : "",
                        "configVersion" : 5
                }
        ],
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1679301450, 1),
                "signature" : {
                        "hash" : BinData(0,"Cw9HLKQnTxCJe0yKQFnK97M2Ue8="),
                        "keyId" : NumberLong("7183697243419967489")
                }
        },
        "operationTime" : Timestamp(1679301450, 1)
}
tea-uat-rs:PRIMARY>

Hi @Daniel_Wan1 and welcome to the community forum!!

The error messages in the log suggest that the shutdown was triggered from outside mongod, i.e. something interrupted the connection and sent a signal asking the process to terminate.
Can you confirm whether any such interruption was triggered manually?

Are these logs from the secondary node? Since the rs.status() output shows all three nodes currently in a healthy state, could you clarify which information is contradictory? Also, would it be possible for you to share the rs.status() output captured during the upgrade process?

Also, can you help us understand at which step of the upgrade the node begins to crash?
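
In the meantime, a quick way to narrow that down is to pull the signal-related entries out of the structured log. This is just a generic sketch (log IDs 23377 and 23378 are the "Received signal" / "Signal was sent by kill(2)" messages visible in your paste, and the path assumes the default log location):

grep -E '"id":(23377|23378)' /var/log/mongodb/mongod.log

# or, if jq is available, including who sent the signal:
jq -c 'select(.id == 23377 or .id == 23378) | {t, msg, attr}' /var/log/mongodb/mongod.log

In the log you posted, these entries show signal 15 (SIGTERM) sent by pid 1, which on RHEL 7 is systemd, so it looks like the service manager stopped the process rather than mongod crashing on its own.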

Thanks
Aasawari


@Aasawari Thanks for your input. We didn't kill mongod manually; we use systemctl on RHEL 7 to bring the service up.

We finally got through the upgrade: instead of using "systemctl start mongod" on the secondary nodes, we brought up the mongod process directly with "mongod --config /etc/mongod.conf".

/etc/mongod.conf is the same options file the systemd unit uses, so we are still not sure why systemd killed the mongod process.
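
For anyone who hits the same behaviour, a sketch of how the two startup paths can be compared and how to see why systemd sent the signal (assuming the stock unit name mongod and that journald keeps the logs; nothing here is MongoDB-specific):

# show the effective unit file (Type, ExecStart, PIDFile, timeouts):
systemctl cat mongod

# systemd's own view of the failed start and the reason it terminated the service:
systemctl status mongod
journalctl -u mongod --since "2023-03-15 08:54" --no-pager

One thing worth cross-checking is that the unit's PIDFile matches processManagement.pidFilePath in /etc/mongod.conf (/var/run/mongodb/mongod.pid in the log above) and that the directory exists and is writable by the mongod user; with a forking service, a PID file systemd cannot read can make it treat the start as failed and stop the daemon.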

We found this workaround described here.

As we have now completed the upgrade to 6.0, I believe we no longer need support in this thread; please close it. Thanks