Change stream resume point lost after adding a new shard

Oded_Raiches · June 20, 2023, 10:19am

Using MongoDB 6.0.6.
I have a DB with sharded collections.
The DB has a few shards; every shard is a P-S-A replica set (two data-bearing nodes and an arbiter).
My application uses change streams to gather statistics and is constantly reading operations, while also persisting the clusterTime timestamp of the last handled event, so that if the app crashes we can continue from the last handled point.
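For context, the checkpointing pattern described above looks roughly like the sketch below. It is a minimal illustration, not our actual code: the file-based checkpoint, the function names, and the `make_timestamp` parameter are hypothetical. It assumes a pymongo object whose `watch()` accepts `start_at_operation_time` and whose events carry a `clusterTime` with `.time` and `.inc` attributes (as `bson.Timestamp` does).

```python
import json
from pathlib import Path


def load_checkpoint(path):
    """Return the saved (seconds, increment) pair, or None if nothing was saved yet."""
    p = Path(path)
    if not p.exists():
        return None
    data = json.loads(p.read_text())
    return data["t"], data["i"]


def save_checkpoint(path, cluster_time):
    """Persist a clusterTime-like object that exposes .time and .inc."""
    tmp = Path(str(path) + ".tmp")
    tmp.write_text(json.dumps({"t": cluster_time.time, "i": cluster_time.inc}))
    tmp.replace(path)  # rename over the old file so a crash never leaves a half-written checkpoint


def consume(watchable, path, handle, make_timestamp):
    """Read change events, resuming from the saved clusterTime when one exists.

    watchable      -- anything with a pymongo-style watch() (client, database, or collection)
    handle         -- callback invoked once per event (hypothetical; the statistics logic)
    make_timestamp -- builds a timestamp from (seconds, increment), e.g. bson.Timestamp
    """
    kwargs = {}
    saved = load_checkpoint(path)
    if saved is not None:
        # The start time is inclusive, so the last handled event is delivered
        # again after a restart; handle() should tolerate that duplicate.
        kwargs["start_at_operation_time"] = make_timestamp(*saved)
    with watchable.watch(**kwargs) as stream:
        for event in stream:
            handle(event)
            save_checkpoint(path, event["clusterTime"])


# Hypothetical usage against a mongos router (host and names are placeholders):
#   from bson import Timestamp
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://mongos-host:27017")
#   consume(client["mydb"], "checkpoint.json", print, Timestamp)
```

The checkpoint is written only after `handle` returns, so a crash can replay an event but should not skip one.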

We maintain the storage for the shards ourselves and add a shard from time to time.
I have noticed that when a shard is added, the following happens:

- We added the shard at ~9:15 AM; the shard came up and started rebalancing data with the other shards.
- A few hours later, at ~12:30 PM, "Resume of change stream was not possible" errors started showing up, coming from the port of the newly added shard.
- The first available event on the new shard is also very late. Here we added the shard around ~9:30 AM, but its first available event was only at ~14:30, many hours later. On the other shards the first event is earlier than that, so I am not sure how this makes sense, given that this is a totally new shard.

In this case the client, which is connected to the mongos router, is unable to proceed unless we move the start_at_operation_time pointer forward to 14:30 so it can continue reading. This also makes us lose all the updates from ~12:30 PM to ~14:30, which is not acceptable.
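To make the gap concrete, here is a small sketch of the comparison we effectively do by hand: given the earliest event time available on each shard and the saved checkpoint, list the shards whose history starts after the checkpoint. The function name, shard names, and timestamp values are hypothetical, and reading the earliest entry from `local.oplog.rs` on each shard primary is an assumption about where "first available event" comes from.

```python
def shards_behind_checkpoint(earliest_by_shard, checkpoint):
    """Return the shards whose earliest available event is newer than the checkpoint.

    Timestamps are (seconds, increment) pairs; Python compares tuples
    element by element, which matches the ordering of BSON timestamps.
    """
    return sorted(
        name
        for name, earliest in earliest_by_shard.items()
        if earliest > checkpoint
    )


# Assumption: the earliest entry per shard could be read with a direct
# connection to that shard's primary, for example:
#   ts = shard_client.local["oplog.rs"].find_one(sort=[("$natural", 1)])["ts"]
#   earliest = (ts.time, ts.inc)

# Hypothetical values for illustration only.
earliest = {
    "shard-old-1": (1_000, 1),
    "shard-old-2": (1_500, 3),
    "shard-new": (9_000, 1),
}
print(shards_behind_checkpoint(earliest, (5_000, 1)))
```

Any shard this returns cannot serve events from the checkpoint onward, which matches the situation where the stream only resumes after the pointer is moved forward.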

Why is this happening? Isn't the change stream supposed to continue normally and include the new shard's updates once it is ready? Failing like this does not look like normal behavior.
Is there a safe way to add a shard and keep reading incoming updates from the other shards through the mongos router without being stopped by this unsynced shard?
Is this happening because of the P-S-A configuration, and would it not occur with P-S-S? If so, why?

Also observed but not answered in this thread: