Primary shard replica set & primary config replica set went down, secondary shard aborted afterwards

Hi,

We are running 3 shards, each a 3-member replica set (primary, secondary and hidden secondary), plus 1 config replica set (primary, secondary and hidden secondary) and 2 routers, on a total of 6 nodes.

We are running Community version 4.4.13 (latest is 4.4.18).

Yesterday 1 node went down; it hosted the primary of the shard2 replica set and the primary of the config replica set.
The node that hosts the shard2 secondary aborted (fatal assertion) after being elected PRIMARY.
Could the abort be related to the config primary being down, with the config secondary either not yet elected primary or not yet known as the new primary to the shard2 secondary (by then elected primary)?

secondary shard2 log (replaced our hostname):

{"t":{"$date":"2023-01-31T08:46:18.246+00:00"},"s":"I",  "c":"NETWORK",  "id":4712102, "ctx":"OplogApplier-0","msg":"Host failed in replica set","attr":{"replicaSet":"configReplSet","host":"primary_config:27019","error":{"code":202,"codeName":"NetworkInterfaceExceededTimeLimit","errmsg":"Couldn't get a connection within the time limit of 104ms"},"action":{"dropConnections":false,"requestImmediateCheck":false,"outcome":{"host":"primary_config:27019","success":false,"errorMessage":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit of 104ms"}}}}
{"t":{"$date":"2023-01-31T08:46:18.247+00:00"},"s":"I",  "c":"SHARDING", "id":22739,   "ctx":"OplogApplier-0","msg":"Operation timed out","attr":{"error":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit of 104ms"}}
{"t":{"$date":"2023-01-31T08:46:18.247+00:00"},"s":"I",  "c":"SHARDING", "id":22079,   "ctx":"OplogApplier-0","msg":"Couldn't create config.changelog collection","attr":{"error":{"code":202,"codeName":"NetworkInterfaceExceededTimeLimit","errmsg":"Couldn't get a connection within the time limit of 104ms"}}}
{"t":{"$date":"2023-01-31T08:46:18.247+00:00"},"s":"F",  "c":"-",        "id":23093,   "ctx":"OplogApplier-0","msg":"Fatal assertion","attr":{"msgid":40107,"error":"NetworkInterfaceExceededTimeLimit: Couldn't get a connection within the time limit of 104ms","file":"src/mongo/db/repl/replication_coordinator_external_state_impl.cpp","line":883}}
{"t":{"$date":"2023-01-31T08:46:18.247+00:00"},"s":"F",  "c":"-",        "id":23094,   "ctx":"OplogApplier-0","msg":"\n\n***aborting after fassert() failure\n\n"}
{"t":{"$date":"2023-01-31T08:46:18.248+00:00"},"s":"F",  "c":"CONTROL",  "id":4757800, "ctx":"OplogApplier-0","msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}}
{"t":{"$date":"2023-01-31T08:46:18.441+00:00"},"s":"I",  "c":"CONTROL",  "id":31431,   "ctx":"OplogApplier-0","msg":"BACKTRACE: {bt}","attr":{"bt":{"backtrace":[{"a":"854096E78A","b":"853DB75000","o":"2DF978A","s":"_ZN5mongo18stack_trace_detail12_GLOBAL__N_119printStackTraceImplERKNS1_7OptionsEPNS_14StackTraceSinkE.constprop.606","s+":"1EA"},{"a":"8540970219","b":"853DB75000","o":"2DFB219","s":"_ZN5mongo15printStackTraceEv","s+":"29"},{"a":"854096D5A6","b":"853DB75000","o":"2DF85A6","s":"_ZN5mongo12_GLOBAL__N_116abruptQuitActionEiP7siginfoPv","s+":"66"},{"a":"7FCB960197E0","b":"7FCB9600A000","o":"F7E0","s":"_L_unlock_16","s+":"2D"},{"a":"7FCB95CA84F5","b":"7FCB95C76000","o":"324F5","s":"gsignal","s+":"35"},{"a":"7FCB95CA9CD5","b":"7FCB95C76000","o":"33CD5","s":"abort","s+":"175"},{"a":"853EAC1E45","b":"853DB75000","o":"F4CE45","s":"_ZN5mongo35fassertFailedWithStatusWithLocationEiRKNS_6StatusEPKcj","s+":"178"},{"a":"853E7D1817","b":"853DB75000","o":"C5C817","s":"_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl34_shardingOnTransitionToPrimaryHookEPNS_16OperationContextE.cold.1173","s+":"4B"},{"a":"853EE63CE7","b":"853DB75000","o":"12EECE7","s":"_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl21onTransitionToPrimaryEPNS_16OperationContextE","s+":"2F7"},{"a":"853EEA5736","b":"853DB75000","o":"1330736","s":"_ZN5mongo4repl26ReplicationCoordinatorImpl19signalDrainCompleteEPNS_16OperationContextEx","s+":"556"},{"a":"853EF36F2E","b":"853DB75000","o":"13C1F2E","s":"_ZN5mongo4repl16OplogApplierImpl4_runEPNS0_11OplogBufferE","s+":"8DE"},{"a":"853EF8D428","b":"853DB75000","o":"1418428","s":"_ZZN5mongo15unique_functionIFvRKNS_8executor12TaskExecutor12CallbackArgsEEE8makeImplIZNS_4repl12OplogApplier7startupEvEUlS5_E_EEDaOT_EN12SpecificImpl4callES5_","s+":"F8"},{"a":"85402E2E73","b":"853DB75000","o":"276DE73","s":"_ZN5mongo8executor22ThreadPoolTaskExecutor11runCallbackESt10shared_ptrINS1_13CallbackStateEE","s+":"113"},{"a":"85402E3282","b":"853DB75000","o":"276E282","
s":"_ZZN5mongo15unique_functionIFvNS_6StatusEEE8makeImplIZNS_8executor22ThreadPoolTaskExecutor23scheduleIntoPool_inlockEPNSt7__cxx114listISt10shared_ptrINS6_13CallbackStateEESaISB_EEERKSt14_List_iteratorISB_ESI_St11unique_lockINS_12latch_detail5LatchEEEUlT_E1_EEDaOSN_EN12SpecificImpl4callEOS1_","s+":"A2"},{"a":"854048BFF2","b":"853DB75000","o":"2916FF2","s":"_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockINS_12latch_detail5LatchEE","s+":"132"},{"a":"854048E636","b":"853DB75000","o":"2919636","s":"_ZN5mongo10ThreadPool13_consumeTasksEv","s+":"86"},{"a":"854048F3E1","b":"853DB75000","o":"291A3E1","s":"_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE","s+":"E1"},{"a":"854048F710","b":"853DB75000","o":"291A710","s":"_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN5mongo4stdx6threadC4IZNS3_10ThreadPool25_startWorkerThread_inlockEvEUlvE2_JELi0EEET_DpOT0_EUlvE_EEEEE6_M_runEv","s+":"60"},{"a":"8540B1907F","b":"853DB75000","o":"2FA407F","s":"execute_native_thread_routine","s+":"F"},{"a":"7FCB96011AA1","b":"7FCB9600A000","o":"7AA1","s":"start_thread","s+":"D1"},{"a":"7FCB95D5EC4D","b":"7FCB95C76000","o":"E8C4D","s":"clone","s+":"6D"}],"processInfo":{"mongodbVersion":"4.4.13","gitVersion":"df25c71b8674a78e17468f48bcda5285decb9246","compiledModules":[],"uname":{"sysname":"Linux","release":"4.1.12-124.48.6.el6uek.x86_64","version":"#2 SMP Tue Mar 16 15:39:03 PDT 2021","machine":"x86_64"},"somap":[{"b":"853DB75000","elfType":3,"buildId":"781A3955310D52A5503CEA4EAC13DEB84CCF5E2C"}]}}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"854096E78A","b":"853DB75000","o":"2DF978A","s":"_ZN5mongo18stack_trace_detail12_GLOBAL__N_119printStackTraceImplERKNS1_7OptionsEPNS_14StackTraceSinkE.constprop.606","s+":"1EA"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"8540970219","b":"853DB75000","o":"2DFB219","s":"_ZN5mongo15printStackTraceEv","s+":"29"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"854096D5A6","b":"853DB75000","o":"2DF85A6","s":"_ZN5mongo12_GLOBAL__N_116abruptQuitActionEiP7siginfoPv","s+":"66"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"7FCB960197E0","b":"7FCB9600A000","o":"F7E0","s":"_L_unlock_16","s+":"2D"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"7FCB95CA84F5","b":"7FCB95C76000","o":"324F5","s":"gsignal","s+":"35"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"7FCB95CA9CD5","b":"7FCB95C76000","o":"33CD5","s":"abort","s+":"175"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"853EAC1E45","b":"853DB75000","o":"F4CE45","s":"_ZN5mongo35fassertFailedWithStatusWithLocationEiRKNS_6StatusEPKcj","s+":"178"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"853E7D1817","b":"853DB75000","o":"C5C817","s":"_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl34_shardingOnTransitionToPrimaryHookEPNS_16OperationContextE.cold.1173","s+":"4B"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"853EE63CE7","b":"853DB75000","o":"12EECE7","s":"_ZN5mongo4repl39ReplicationCoordinatorExternalStateImpl21onTransitionToPrimaryEPNS_16OperationContextE","s+":"2F7"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"853EEA5736","b":"853DB75000","o":"1330736","s":"_ZN5mongo4repl26ReplicationCoordinatorImpl19signalDrainCompleteEPNS_16OperationContextEx","s+":"556"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"853EF36F2E","b":"853DB75000","o":"13C1F2E","s":"_ZN5mongo4repl16OplogApplierImpl4_runEPNS0_11OplogBufferE","s+":"8DE"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"853EF8D428","b":"853DB75000","o":"1418428","s":"_ZZN5mongo15unique_functionIFvRKNS_8executor12TaskExecutor12CallbackArgsEEE8makeImplIZNS_4repl12OplogApplier7startupEvEUlS5_E_EEDaOT_EN12SpecificImpl4callES5_","s+":"F8"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"85402E2E73","b":"853DB75000","o":"276DE73","s":"_ZN5mongo8executor22ThreadPoolTaskExecutor11runCallbackESt10shared_ptrINS1_13CallbackStateEE","s+":"113"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"85402E3282","b":"853DB75000","o":"276E282","s":"_ZZN5mongo15unique_functionIFvNS_6StatusEEE8makeImplIZNS_8executor22ThreadPoolTaskExecutor23scheduleIntoPool_inlockEPNSt7__cxx114listISt10shared_ptrINS6_13CallbackStateEESaISB_EEERKSt14_List_iteratorISB_ESI_St11unique_lockINS_12latch_detail5LatchEEEUlT_E1_EEDaOSN_EN12SpecificImpl4callEOS1_","s+":"A2"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"854048BFF2","b":"853DB75000","o":"2916FF2","s":"_ZN5mongo10ThreadPool10_doOneTaskEPSt11unique_lockINS_12latch_detail5LatchEE","s+":"132"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"854048E636","b":"853DB75000","o":"2919636","s":"_ZN5mongo10ThreadPool13_consumeTasksEv","s+":"86"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"854048F3E1","b":"853DB75000","o":"291A3E1","s":"_ZN5mongo10ThreadPool17_workerThreadBodyEPS0_RKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE","s+":"E1"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"854048F710","b":"853DB75000","o":"291A710","s":"_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN5mongo4stdx6threadC4IZNS3_10ThreadPool25_startWorkerThread_inlockEvEUlvE2_JELi0EEET_DpOT0_EUlvE_EEEEE6_M_runEv","s+":"60"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"8540B1907F","b":"853DB75000","o":"2FA407F","s":"execute_native_thread_routine","s+":"F"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"7FCB96011AA1","b":"7FCB9600A000","o":"7AA1","s":"start_thread","s+":"D1"}}}
{"t":{"$date":"2023-01-31T08:46:18.442+00:00"},"s":"I",  "c":"CONTROL",  "id":31427,   "ctx":"OplogApplier-0","msg":"  Frame: {frame}","attr":{"frame":{"a":"7FCB95D5EC4D","b":"7FCB95C76000","o":"E8C4D","s":"clone","s+":"6D"}}}

{"t":{"$date":"2023-01-31T09:18:11.591+00:00"},"s":"I",  "c":"CONTROL",  "id":20698,   "ctx":"main","msg":"***** SERVER RESTARTED *****"}
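
To pull only the failure chain out of a structured (JSON) mongod log like the one above, a small filter can help. This is a minimal sketch, not an official tool: the function name `failure_chain` and the rule "keep entries with severity F or with an `error` attribute" are assumptions made for illustration. The field names `t`, `s`, `c`, `id`, `msg` and `attr` are the ones visible in the log lines above, and the sample entries are abbreviated copies of those lines.

```python
import json

def failure_chain(lines):
    """Return (timestamp, severity, component, id, msg) for log entries that
    are fatal ("s" == "F") or that carry an "error" attribute."""
    out = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip wrapped or truncated lines
        attr = entry.get("attr", {})
        if entry.get("s") == "F" or "error" in attr:
            out.append((entry["t"]["$date"], entry["s"], entry["c"],
                        entry["id"], entry["msg"]))
    return out

# Abbreviated copies of three entries from the shard2 log (attr trimmed).
sample = """
{"t":{"$date":"2023-01-31T08:46:18.247+00:00"},"s":"I","c":"SHARDING","id":22079,"ctx":"OplogApplier-0","msg":"Couldn't create config.changelog collection","attr":{"error":{"code":202,"codeName":"NetworkInterfaceExceededTimeLimit"}}}
{"t":{"$date":"2023-01-31T08:46:18.247+00:00"},"s":"F","c":"-","id":23093,"ctx":"OplogApplier-0","msg":"Fatal assertion","attr":{"msgid":40107,"file":"src/mongo/db/repl/replication_coordinator_external_state_impl.cpp","line":883}}
{"t":{"$date":"2023-01-31T09:18:11.591+00:00"},"s":"I","c":"CONTROL","id":20698,"ctx":"main","msg":"***** SERVER RESTARTED *****"}
""".splitlines()

chain = failure_chain(sample)
for row in chain:
    print(row)
```

On the sample this keeps the "Couldn't create config.changelog collection" entry and the fatal assertion that follows it, and drops the restart line.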

secondary config log

{"t":{"$date":"2023-01-31T08:46:07.101+00:00"},"s":"I",  "c":"ELECTION", "id":21450,   "ctx":"ReplCoord-5820","msg":"Election succeeded, assuming primary role","attr":{"term":56}}
{"t":{"$date":"2023-01-31T08:46:07.102+00:00"},"s":"I",  "c":"REPL",     "id":21358,   "ctx":"ReplCoord-5820","msg":"Replica set state transition","attr":{"newState":"PRIMARY","oldState":"SECONDARY"}}
{"t":{"$date":"2023-01-31T08:46:07.105+00:00"},"s":"I",  "c":"REPL",     "id":21106,   "ctx":"ReplCoord-5820","msg":"Resetting sync source to empty","attr":{"previousSyncSource":":27017"}}
{"t":{"$date":"2023-01-31T08:46:07.106+00:00"},"s":"I",  "c":"REPL",     "id":21359,   "ctx":"ReplCoord-5820","msg":"Entering primary catch-up mode"}
{"t":{"$date":"2023-01-31T08:46:07.960+00:00"},"s":"I",  "c":"CONNPOOL", "id":22576,   "ctx":"ReplicaSetMonitor-TaskExecutor","msg":"Connecting","attr":{"hostAndPort":"primary_shard:27018"}}
{"t":{"$date":"2023-01-31T08:46:09.831+00:00"},"s":"I",  "c":"REPL",     "id":21364,   "ctx":"ReplCoord-5823","msg":"Caught up to the latest optime known via heartbeats after becoming primary","attr":{"targetOpTime":{"ts":{"$timestamp":{"t":1675154745,"i":6}},"t":55},"myLastApplied":{"ts":{"$timestamp":{"t":1675154745,"i":6}},"t":55}}}
{"t":{"$date":"2023-01-31T08:46:09.831+00:00"},"s":"I",  "c":"REPL",     "id":21363,   "ctx":"ReplCoord-5823","msg":"Exited primary catch-up mode"}
{"t":{"$date":"2023-01-31T08:46:09.831+00:00"},"s":"I",  "c":"REPL",     "id":21107,   "ctx":"ReplCoord-5823","msg":"Stopping replication producer"}
{"t":{"$date":"2023-01-31T08:46:09.831+00:00"},"s":"I",  "c":"REPL",     "id":21239,   "ctx":"ReplBatcher","msg":"Oplog buffer has been drained","attr":{"term":56}}
{"t":{"$date":"2023-01-31T08:46:09.832+00:00"},"s":"I",  "c":"REPL",     "id":21343,   "ctx":"RstlKillOpThread","msg":"Starting to kill user operations"}
{"t":{"$date":"2023-01-31T08:46:09.832+00:00"},"s":"I",  "c":"REPL",     "id":21344,   "ctx":"RstlKillOpThread","msg":"Stopped killing user operations"}
{"t":{"$date":"2023-01-31T08:46:09.832+00:00"},"s":"I",  "c":"REPL",     "id":21340,   "ctx":"RstlKillOpThread","msg":"State transition ops metrics","attr":{"metrics":{"lastStateTransition":"stepUp","userOpsKilled":0,"userOpsRunning":30}}}
{"t":{"$date":"2023-01-31T08:46:09.832+00:00"},"s":"I",  "c":"REPL",     "id":4508103, "ctx":"OplogApplier-0","msg":"Increment the config term via reconfig"}
{"t":{"$date":"2023-01-31T08:46:09.832+00:00"},"s":"I",  "c":"REPL",     "id":6015313, "ctx":"OplogApplier-0","msg":"Replication config state is Steady, starting reconfig"}
{"t":{"$date":"2023-01-31T08:46:09.832+00:00"},"s":"I",  "c":"REPL",     "id":6015317, "ctx":"OplogApplier-0","msg":"Setting new configuration state","attr":{"newState":"ConfigReconfiguring","oldState":"ConfigSteady"}}
{"t":{"$date":"2023-01-31T08:46:09.832+00:00"},"s":"I",  "c":"REPL",     "id":21353,   "ctx":"OplogApplier-0","msg":"replSetReconfig config object parses ok","attr":{"numMembers":3}}
{"t":{"$date":"2023-01-31T08:46:09.832+00:00"},"s":"I",  "c":"REPL",     "id":51814,   "ctx":"OplogApplier-0","msg":"Persisting new config to disk"}
{"t":{"$date":"2023-01-31T08:46:09.833+00:00"},"s":"I",  "c":"REPL",     "id":6015315, "ctx":"OplogApplier-0","msg":"Persisted new config to disk"}
{"t":{"$date":"2023-01-31T08:46:09.833+00:00"},"s":"I",  "c":"REPL",     "id":6015317, "ctx":"OplogApplier-0","msg":"Setting new configuration state","attr":{"newState":"ConfigSteady","oldState":"ConfigReconfiguring"}}
{"t":{"$date":"2023-01-31T08:46:09.834+00:00"},"s":"I",  "c":"REPL",     "id":21392,   "ctx":"OplogApplier-0","msg":"New replica set config in use","attr":{"config":{"_id":"configReplSet","version":119991,"term":56,"configsvr":true,"protocolVersion":1,"writeConcernMajorityJournalDefault":true,"members":[{"_id":0,"host":"primary_config:27019","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":1.0,"tags":{},"slaveDelay":0,"votes":1},{"_id":1,"host":"secondary_config::27019","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":0.5,"tags":{},"slaveDelay":0,"votes":1},{"_id":3,"host":"hidden_config27029","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":0.0,"tags":{},"slaveDelay":0,"votes":1}],"settings":{"chainingAllowed":true,"heartbeatIntervalMillis":2000,"heartbeatTimeoutSecs":10,"electionTimeoutMillis":10000,"catchUpTimeoutMillis":-1,"catchUpTakeoverDelayMillis":30000,"getLastErrorModes":{},"getLastErrorDefaults":{"w":1,"wtimeout":0},"replicaSetId":{"$oid":"5773d2d1374047c92751c502"}}}}}
{"t":{"$date":"2023-01-31T08:46:09.834+00:00"},"s":"I",  "c":"REPL",     "id":21393,   "ctx":"OplogApplier-0","msg":"Found self in config","attr":{"hostAndPort":"primary_config:27019"}}
{"t":{"$date":"2023-01-31T08:46:09.835+00:00"},"s":"I",  "c":"REPL",     "id":6015310, "ctx":"OplogApplier-0","msg":"Starting to transition to primary."}
{"t":{"$date":"2023-01-31T08:46:09.838+00:00"},"s":"I",  "c":"REPL",     "id":6015309, "ctx":"OplogApplier-0","msg":"Logging transition to primary to oplog on stepup"}
{"t":{"$date":"2023-01-31T08:46:09.856+00:00"},"s":"I",  "c":"SHARDING", "id":21856,   "ctx":"Balancer","msg":"CSRS balancer is starting"}
{"t":{"$date":"2023-01-31T08:46:09.857+00:00"},"s":"I",  "c":"SHARDING", "id":22049,   "ctx":"PeriodicShardedIndexConsistencyChecker","msg":"Checking consistency of sharded collection indexes across the cluster"}
{"t":{"$date":"2023-01-31T08:46:09.858+00:00"},"s":"I",  "c":"STORAGE",  "id":20657,   "ctx":"OplogApplier-0","msg":"IndexBuildsCoordinator::onStepUp - this node is stepping up to primary"}
{"t":{"$date":"2023-01-31T08:46:09.858+00:00"},"s":"I",  "c":"REPL",     "id":21331,   "ctx":"OplogApplier-0","msg":"Transition to primary complete; database writes are now permitted"}
{"t":{"$date":"2023-01-31T08:46:09.866+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.178.96.17:43944","connectionId":1918262,"connectionCount":84}}
{"t":{"$date":"2023-01-31T08:46:09.866+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1918262","msg":"client metadata","attr":{"remote":"10.178.96.17:43944","client":"conn1918262","doc":{"driver":{"name":"NetworkInterfaceTL","version":"4.4.13"},"os":{"type":"Linux","name":"Oracle Linux Server release 6.9","architecture":"x86_64","version":"Kernel 4.1.12-124.48.6.el6uek.x86_64"}}}}
{"t":{"$date":"2023-01-31T08:46:09.868+00:00"},"s":"I",  "c":"ACCESS",   "id":20250,   "ctx":"conn1918262","msg":"Authentication succeeded","attr":{"mechanism":"SCRAM-SHA-256","speculative":true,"principalName":"__system","authenticationDatabase":"local","remote":"10.178.96.17:43944","extraInfo":{}}}
{"t":{"$date":"2023-01-31T08:46:09.868+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.80.10.113:64024","connectionId":1918263,"connectionCount":85}}
{"t":{"$date":"2023-01-31T08:46:09.869+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1918263","msg":"client metadata","attr":{"remote":"10.80.10.113:64024","client":"conn1918263","doc":{"driver":{"name":"mongo-go-driver","version":"v1.11.1"},"os":{"type":"linux","architecture":"amd64"},"platform":"go1.19","application":{"name":"pbm-agent"}}}}
{"t":{"$date":"2023-01-31T08:46:09.869+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"10.80.10.113:64026","connectionId":1918264,"connectionCount":86}}
{"t":{"$date":"2023-01-31T08:46:09.871+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1918264","msg":"client metadata","attr":{"remote":"10.80.10.113:64026","client":"conn1918264","doc":{"driver":{"name":"mongo-go-driver","version":"v1.11.1"},"os":{"type":"linux","architecture":"amd64"},"platform":"go1.19","application":{"name":"pbm-agent"}}}}
{"t":{"$date":"2023-01-31T08:46:09.872+00:00"},"s":"I",  "c":"ACCESS",   "id":20250,   "ctx":"conn1918263","msg":"Authentication succeeded","attr":{"mechanism":"SCRAM-SHA-256","speculative":true,"principalName":"pbmuser","authenticationDatabase":"admin","remote":"10.80.10.113:64024","extraInfo":{}}}
{"t":{"$date":"2023-01-31T08:46:09.874+00:00"},"s":"I",  "c":"ACCESS",   "id":20250,   "ctx":"conn1918264","msg":"Authentication succeeded","attr":{"mechanism":"SCRAM-SHA-256","speculative":true,"principalName":"pbmuser","authenticationDatabase":"admin","remote":"10.80.10.113:64026","extraInfo":{}}}
{"t":{"$date":"2023-01-31T08:46:10.115+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"127.0.0.1:57370","connectionId":1918265,"connectionCount":87}}
{"t":{"$date":"2023-01-31T08:46:10.116+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1918265","msg":"client metadata","attr":{"remote":"127.0.0.1:57370","client":"conn1918265","doc":{"driver":{"name":"PyMongo","version":"3.8.0"},"os":{"type":"Linux","name":"Red Hat Enterprise Linux Server 6.9 Santiago","architecture":"x86_64","version":"4.1.12-124.48.6.el6uek.x86_64"},"platform":"CPython 2.7.16.final.0"}}}
{"t":{"$date":"2023-01-31T08:46:10.148+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"127.0.0.1:57371","connectionId":1918266,"connectionCount":88}}
{"t":{"$date":"2023-01-31T08:46:10.203+00:00"},"s":"I",  "c":"NETWORK",  "id":51800,   "ctx":"conn1918266","msg":"client metadata","attr":{"remote":"127.0.0.1:57371","client":"conn1918266","doc":{"driver":{"name":"PyMongo","version":"3.8.0"},"os":{"type":"Linux","name":"Red Hat Enterprise Linux Server 6.9 Santiago","architecture":"x86_64","version":"4.1.12-124.48.6.el6uek.x86_64"},"platform":"CPython 2.7.16.final.0"}}}
{"t":{"$date":"2023-01-31T08:46:10.435+00:00"},"s":"I",  "c":"ACCESS",   "id":20250,   "ctx":"conn1918266","msg":"Authentication succeeded","attr":{"mechanism":"SCRAM-SHA-256","speculative":false,"principalName":"datadog","authenticationDatabase":"admin","remote":"127.0.0.1:57371","extraInfo":{}}}
{"t":{"$date":"2023-01-31T08:46:10.827+00:00"},"s":"I",  "c":"NETWORK",  "id":22943,   "ctx":"listener","msg":"Connection accepted","attr":{"remote":"127.0.0.1:57372","connectionId":1918267,"connectionCount":89}}
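
To put the two logs on one timeline, the timestamps above can be subtracted directly. This is a rough sketch that assumes the clocks on both hosts are in sync; the helper name `seconds_between` is made up for illustration, and the three timestamps are copied from the log entries with ids 21450, 21331 and 23093.

```python
from datetime import datetime

def seconds_between(earlier, later):
    """Difference in seconds between two ISO-8601 timestamps from the logs."""
    delta = datetime.fromisoformat(later) - datetime.fromisoformat(earlier)
    return delta.total_seconds()

# Config replica set log: new primary elected, then writes permitted.
config_election_won = "2023-01-31T08:46:07.101+00:00"      # id 21450
config_writes_permitted = "2023-01-31T08:46:09.858+00:00"  # id 21331
# Shard2 log: fatal assertion on the newly elected primary.
shard2_fatal_assertion = "2023-01-31T08:46:18.247+00:00"   # id 23093

gap = seconds_between(config_writes_permitted, shard2_fatal_assertion)
since_election = seconds_between(config_election_won, shard2_fatal_assertion)
print(f"config primary writable {gap:.3f}s before the shard2 abort")
print(f"config election won {since_election:.3f}s before the shard2 abort")
```

The numbers only establish ordering: by these timestamps the config replica set already had a writable primary several seconds before the shard2 node aborted. They do not show whether the shard2 node had discovered the new config primary by then.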

Hello @Kin_Wai_Cheung,

Welcome back to The MongoDB Community Forums! :wave:

I notice you haven’t had a response to this topic yet - were you able to find a reason for the error?
If not, can you please share more details for me to understand your situation better?

  • Does this mean that the primary shard2 and primary config server share the same hardware? Could you share more details of your deployment, e.g. if any node is running multiple mongod processes, are you using Docker or similar, or any other details that may help troubleshooting?
  • Do other nodes also share hardware with members of other replica sets?
  • Are these errors frequent or was it just this one event?
  • Please also share the full output of the following from all shards, including the config server replica set:
  1. sh.status()
  2. rs.status()
  3. rs.conf()

Regards,
Tarun

That’s correct.

  • The shard2 primary and the config primary were running on the same host, and that node went down due to a hardware issue.
    We have 3 data-bearing shards (each a replica set with a primary, a secondary and a hidden secondary)
    and 1 config replica set (primary, secondary, hidden secondary).
    No Docker is used.
  • The other nodes each host at least 2 mongod/mongos processes:
    node 1: secondary data-bearing shard1, hidden secondary data-bearing shard3, router2
    node 2: secondary data-bearing shard2, hidden secondary shard1, secondary config replica set
    node 3: primary data-bearing shard1, router1
    node 4: primary data-bearing shard2, primary config replica set (this is the node that went down)
    node 5: secondary data-bearing shard3, hidden secondary shard2
    node 6: primary data-bearing shard3, hidden secondary config replica set
  • Just this one event.

The output will be shared once I have reviewed it.

mongod processes on the same host use different ports. (There are no hardware constraints at the moment.)
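
The node layout listed above can be checked mechanically to see which replica sets keep a voting majority when a node fails. This is a minimal sketch under stated assumptions: every member is assumed to have one vote (the config replica set config in the log shows `votes: 1` for all three members, including the priority-0 one; the shard members are assumed to be configured the same way), routers are left out, and the names `PLACEMENT` and `surviving_majority` are made up for illustration.

```python
from collections import Counter

# node -> replica sets for which it hosts one member (from the list above).
PLACEMENT = {
    "node1": ["shard1", "shard3"],
    "node2": ["shard2", "shard1", "config"],
    "node3": ["shard1"],
    "node4": ["shard2", "config"],
    "node5": ["shard3", "shard2"],
    "node6": ["shard3", "config"],
}

def surviving_majority(placement, failed_nodes):
    """Map each replica set to True if more than half of its members
    are on nodes that are still up (one vote per member assumed)."""
    total = Counter(rs for members in placement.values() for rs in members)
    alive = Counter(
        rs
        for node, members in placement.items()
        if node not in failed_nodes
        for rs in members
    )
    return {rs: alive[rs] > total[rs] // 2 for rs in total}

one_down = surviving_majority(PLACEMENT, {"node4"})
two_down = surviving_majority(PLACEMENT, {"node2", "node4"})
print("node4 down:", one_down)
print("node2 and node4 down:", two_down)
```

With only node 4 down, shard2 and the config replica set each keep 2 of 3 members, so both can still elect a new primary. The check says nothing about how long those elections take or in which order they complete.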

My follow-up questions:

Should a MongoDB cluster still work if both the primary of the config replica set and the primary of any data-bearing shard are unavailable at the same time?

I believe the fatal assertion happened on the shard2 secondary because, having already been elected primary, it was unable to connect to the primary of the config replica set. Is that right?

cluster_status_redacted.txt (23.7 KB)

Hello @Tarun_Gaur, do you need more info?