Config server operation exceeded time limit

Hi fellow MongoDB gurus,

I ran into an issue with a sharded cluster. Someone accidentally deleted the data from one of the shards. Since it is a QA database, that is no big deal. I figured I could start from scratch: stop all mongos instances, delete all the data on every data node in every shard, and delete all database files on the config server replica set.

I re-initialized the config server replica set and did the same for each shard's replica set. Then I started mongos and added the following shard config:

sh.addShard("qa_group1/mongo-qa-vm1.fra1.framework:27018,mongo-qa-vm2.fra1.framework:27018")
sh.addShard("qa_group2/mongo-qa-vm3.fra1.framework:27018,mongo-qa-vm3.fra1.framework:27018")

For some reason, I keep getting the errors below. I think that because of them, shard distribution is not working: the data stays on only one shard.

So I want to see if anyone out there can help me to troubleshoot what is going on.

Thanks in advance.
Eric

Logs from config server primary (all nodes have the same error)

`2023-09-30T08:32:31.177Z I COMMAND [conn343] Command on database admin timed out waiting for read concern to be satisfied. Command: { find: "system.keys", filter: { purpose: "HMAC", expiresAt: { $gt: Timestamp(1696062714, 2) } }, sort: { expiresAt: 1 }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp(1696003640, 1), t: 109 } }, maxTimeMS: 30000, $readPreference: { mode: "nearest" }, $replData: 1, $clusterTime: { clusterTime: Timestamp(1696062714, 2), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $configServerState: { opTime: { ts: Timestamp(1696003640, 1), t: 109 } }, $db: "admin" }. Info: ExceededTimeLimit: Error waiting for snapshot not less than { ts: Timestamp(1696003640, 1), t: 109 }, current relevant optime is { ts: Timestamp(1696062748, 1), t: 1 }. :: caused by :: operation exceeded time limit`
 

`2023-09-30T08:32:31.177Z I COMMAND [conn343] command admin.$cmd command: find { find: "system.keys", filter: { purpose: "HMAC", expiresAt: { $gt: Timestamp(1696062714, 2) } }, sort: { expiresAt: 1 }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp(1696003640, 1), t: 109 } }, maxTimeMS: 30000, $readPreference: { mode: "nearest" }, $replData: 1, $clusterTime: { clusterTime: Timestamp(1696062714, 2), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $configServerState: { opTime: { ts: Timestamp(1696003640, 1), t: 109 } }, $db: "admin" } numYields:0 reslen:683 locks:{} protocol:op_msg 30009ms`
`2023-09-30T08:32:41.445Z I COMMAND [conn339] Command on database config timed out waiting for read concern to be satisfied. Command: { find: "collections", filter: { _id: /^config\./ }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp(1696003640, 1), t: 109 } }, maxTimeMS: 30000, $readPreference: { mode: "nearest" }, $replData: 1, $clusterTime: { clusterTime: Timestamp(1696062730, 1), signature: { hash: BinData(0, E6E45379DFB713828A3A464463950B385DC87F98), keyId: 7284513431665770523 } }, $configServerState: { opTime: { ts: Timestamp(1696003640, 1), t: 109 } }, $db: "config" }. Info: ExceededTimeLimit: Error waiting for snapshot not less than { ts: Timestamp(1696003640, 1), t: 109 }, current relevant optime is { ts: Timestamp(1696062748, 1), t: 1 }. :: caused by :: operation exceeded time limit`
`2023-09-30T08:32:41.445Z I COMMAND [conn339] command config.$cmd command: find { find: "collections", filter: { _id: /^config\./ }, readConcern: { level: "majority", afterOpTime: { ts: Timestamp(1696003640, 1), t: 109 } }, maxTimeMS: 30000, $readPreference: { mode: "nearest" }, $replData: 1, $clusterTime: { clusterTime: Timestamp(1696062730, 1), signature: { hash: BinData(0, E6E45379DFB713828A3A464463950B385DC87F98), keyId: 7284513431665770523 } }, $configServerState: { opTime: { ts: Timestamp(1696003640, 1), t: 109 } }, $db: "config" } numYields:0 reslen:683 locks:{} protocol:op_msg 30007ms`

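I finally found the root cause.

One mongos that a user had started was still running and still had a connection open to the config server. As far as I can tell from the logs, it kept sending the config server opTime it had cached from the old cluster (term 109), which the freshly initialized replica set (term 1) could never reach, so each majority read waited the full 30 seconds and then timed out.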
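
For anyone who hits the same symptom, one rough way to spot it in the config server log is to compare the term of the optime the client is waiting for with the term of the current optime. Here is a minimal sketch in JavaScript. `findStaleOpTimeWait` is a hypothetical helper name, not part of MongoDB or the shell, and the regular expressions assume only the log wording shown in the lines above.

```javascript
// Hypothetical helper: inspects one config server log line and reports whether
// the client is waiting for an optime whose term is ahead of the server's.
// Returns null if the line does not contain both optimes.
function findStaleOpTimeWait(logLine) {
  const conn = /\[(conn\d+)\]/.exec(logLine);
  const wanted =
    /snapshot not less than \{ ts: Timestamp\(\d+, \d+\), t: (-?\d+) \}/.exec(logLine);
  const current =
    /current relevant optime is \{ ts: Timestamp\(\d+, \d+\), t: (-?\d+) \}/.exec(logLine);
  if (!wanted || !current) {
    return null;
  }
  const requestedTerm = Number(wanted[1]);
  const currentTerm = Number(current[1]);
  return {
    connection: conn ? conn[1] : null,
    requestedTerm,
    currentTerm,
    // Assumption: a requested term above the current term means the client
    // cached its state from an earlier incarnation of the replica set.
    stale: requestedTerm > currentTerm,
  };
}

// Excerpt of the first log line above (middle of the line left out for brevity).
const sample =
  "2023-09-30T08:32:31.177Z I COMMAND [conn343] Command on database admin timed out " +
  "waiting for read concern to be satisfied. Info: ExceededTimeLimit: Error waiting for " +
  "snapshot not less than { ts: Timestamp(1696003640, 1), t: 109 }, current relevant " +
  "optime is { ts: Timestamp(1696062748, 1), t: 1 }.";

console.log(findStaleOpTimeWait(sample));
```

If `stale` comes back true, the connection name (here `conn343`) gives you something to search for elsewhere in the log to work out which client is behind it. This only flags the pattern. It does not prove that a leftover mongos is the cause.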