Resume/Restart a change stream after adding a new shard

Andrey_N · July 9, 2021, 3:56pm

Hey guys!

Our use case is pretty simple - we have a sharded MongoDB cluster with replicas and multiple shards. Currently, we are watching the changes (by using .watch() and connecting to the mongos). These changes are streamed into other parts of our data pipeline.
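
For context, a forwarding loop like the one described might look roughly like the following in Python with PyMongo. This is only a sketch: the thread does not say which driver or language is in use, and `forward_changes`, `sink`, and `save_token` are hypothetical names. It relies on each change event's `_id` field being that event's resume token.

```python
def forward_changes(stream, sink, save_token):
    """Send each change event to `sink`, then persist its resume token.

    `stream` is any iterable of change events, for example the object
    returned by PyMongo's `collection.watch()` on a mongos connection.
    `sink` and `save_token` are hypothetical callbacks supplied by the
    caller (the downstream pipeline and a durable token store).
    """
    forwarded = 0
    for change in stream:
        sink(change)
        # The event's _id is its resume token; save it only after the
        # event has been handed off, so a crash replays rather than skips.
        save_token(change["_id"])
        forwarded += 1
    return forwarded


# Hypothetical usage against a real cluster (host and names are made up):
#
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://mongos-host:27017")
#   with client["mydb"]["mycoll"].watch() as stream:
#       forward_changes(stream, print, lambda token: None)
```

Saving the token after the hand-off means a restart may deliver an event twice, so the downstream side should tolerate duplicates.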

We are using MongoDB 4.2 Community.

When we added a new shard (because our data is growing), I saw the error “Error on remote shard mongoprodnew:27020 :: caused by :: Resume of change stream was not possible, as the resume point may no longer be in the oplog.le, as the resu…” (I guess the end was truncated). Our replication script crashed, and the whole feature went down with it.

I tried both resumeAfter and startAtOperationTime params to set the starting point. Both caused that error, “Resume of change stream was not possible” - but hey I don’t need to resume, just re-create it for me please?
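
The "just re-create it" behaviour can be approximated on the client side. Below is a minimal sketch in Python, assuming PyMongo-style `collection.watch(resume_after=...)` and assuming the server reports this failure with error code 286 (ChangeStreamHistoryLost); check the code your server version actually returns. `open_stream` is a hypothetical helper, not part of any driver.

```python
# Assumed server error code for "resume point may no longer be in the oplog"
# (ChangeStreamHistoryLost). Verify against your MongoDB version.
HISTORY_LOST = 286


def open_stream(collection, resume_token=None):
    """Open a change stream, falling back to a fresh one if resume fails.

    Returns (stream, resumed). `resumed` is False when the stream starts
    from "now", which means events since the saved token were skipped.
    """
    if resume_token is not None:
        try:
            return collection.watch(resume_after=resume_token), True
        except Exception as exc:
            # PyMongo's OperationFailure exposes the server code as `.code`;
            # anything other than a lost resume point is re-raised untouched.
            if getattr(exc, "code", None) != HISTORY_LOST:
                raise
    # No token, or the token is no longer resumable: start a new stream.
    return collection.watch(), False
```

When `resumed` comes back as False, the stream starts at the current time, so whatever was written between the saved token and now still has to be reconciled some other way (for example a one-off resync). This avoids the crash, not the gap.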

So whenever we need to add/replace a shard now, we have to completely stop the whole logical replication process, add a shard, wait until it fetches the data chunks, and then start the replication again. What’s even worse, we can’t really write anything into the DB because the changes will be lost - we won’t be able to resume from the point that’s in the past, before the shard is really up and running.

Is there any way to do that without such an unpleasant downtime?

Thanks

Oded_Raiches · June 19, 2023, 7:11am

Hi @Andrey_N ,
I noticed this in my application as well.
Did you find any fix for this?

Garaudy_Etienne · June 21, 2023, 1:16am

SERVER-42232 should have fixed this issue on MongoDB 4.2.

Oded_Raiches · June 21, 2023, 8:18am

Thanks @Garaudy_Etienne for the response,
I am seeing this issue again in MongoDB 6.0.
Opened a different thread with the info: