Resume of change stream was not possible, as the resume point may no longer be in the oplog

MaBeuLux88_xxx · December 7, 2021, 7:54am

No, because you can’t keep ALL the oplog entries forever. Otherwise your oplog would contain every single write operation since the beginning of time and its size would grow indefinitely. It can be 1 GB, 50 GB or 100 GB, but you have to draw the line somewhere and back it with the appropriate hardware.
I’d say that if you can find an oplog size that guarantees 1 or 2 weeks of “log length” (see rs.printReplicationInfo()), then it’s OK. But this is use-case dependent, so more or less could also be completely fine.
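If you want to check that “log length” from a script instead of the shell, here is a rough sketch in Python. It assumes a pymongo MongoClient connected to a replica set member, and it reads the oldest and newest entries of local.oplog.rs. The function names are mine, not part of any MongoDB API.

```python
def oplog_window_hours(first_ts_seconds, last_ts_seconds):
    """Hours covered by the oplog, given the oldest and newest entry times (epoch seconds)."""
    return (last_ts_seconds - first_ts_seconds) / 3600.0


def fetch_oplog_window_hours(client):
    """Compute the oplog window from a pymongo MongoClient (assumed to point at a replica set member)."""
    oplog = client.local["oplog.rs"]
    # $natural order is insertion order, so these are the oldest and newest entries.
    first = oplog.find_one(sort=[("$natural", 1)])
    last = oplog.find_one(sort=[("$natural", -1)])
    if first is None or last is None:
        raise RuntimeError("oplog is empty; is this node part of a replica set?")
    # "ts" is a BSON Timestamp; its .time attribute is seconds since the epoch.
    return oplog_window_hours(first["ts"].time, last["ts"].time)
```

Calling fetch_oplog_window_hours(pymongo.MongoClient()) should give roughly the same figure as the “log length start to end” line of rs.printReplicationInfo(), expressed in hours.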

Another way to deal with this is to use an option introduced in MongoDB 4.4: --oplogMinRetentionHours
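As far as I know, the same setting can also be changed on a running node with the replSetResizeOplog admin command and its minRetentionHours field (also 4.4+). A minimal sketch in Python, assuming a pymongo MongoClient connected to the member you want to change; the helper names are hypothetical, and I believe the command only affects the member it is run against.

```python
def build_retention_command(hours):
    """Build the admin command document that sets the minimum oplog retention in hours."""
    if hours < 0:
        raise ValueError("retention hours must not be negative")
    # The command name has to be the first key; Python dicts keep insertion order.
    return {"replSetResizeOplog": 1, "minRetentionHours": float(hours)}


def set_min_retention(client, hours):
    """Send the command through a pymongo MongoClient (assumed to have the required privileges)."""
    return client.admin.command(build_retention_command(hours))
```

For example, set_min_retention(pymongo.MongoClient(), 48) asks the node to keep at least 48 hours of oplog, even if that means growing past the configured oplog size.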

I haven’t played with the MongoDB Source Connector yet, but here is some doc:

There is also a page in the doc dedicated to recovering from an invalid resume token:

A resume token is the _id value of a change event. Doc here.

Here is an example in Python 3:

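import pymongo
from bson.json_util import dumps

# Connects to localhost:27017 by default; change streams need a replica set or a sharded cluster.
client = pymongo.MongoClient()
# Token saved from an earlier change event (its _id value).
resume_token = {'_data': '8261AF0EDD000000022B022C0100296E5A10046803462D32FA458F8D539C1AEC72C0FC46645F6964006461AF0EDD151BABD5ABA613FA0004'}
# Reopen the stream right after the event identified by the token.
with client.test.coll.watch(resume_after=resume_token) as change_stream:
    for change in change_stream:
        print(dumps(change))
        print()
        # This is the token to store if you want to resume after this event later.
        print(change_stream.resume_token)
        print()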
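
If the token has already fallen off the oplog, watch() fails with the error in the title of this thread. Here is a rough sketch of one way to handle that in Python: retry without the token and accept that the events in between are lost. I am assuming the server reports this as error code 286 (ChangeStreamHistoryLost) and that the exception exposes it as a .code attribute, as pymongo's OperationFailure does; the function names are hypothetical.

```python
HISTORY_LOST_CODE = 286  # assumption: server error code for ChangeStreamHistoryLost


def is_history_lost(exc):
    """True if the exception carries the 'resume point no longer in the oplog' error code."""
    return getattr(exc, "code", None) == HISTORY_LOST_CODE


def open_stream(collection, resume_token=None):
    """Open a change stream, falling back to a fresh one if the resume token is too old."""
    try:
        return collection.watch(resume_after=resume_token)
    except Exception as exc:
        if resume_token is not None and is_history_lost(exc):
            # Events between the old token and now cannot be recovered from the oplog.
            # The stream restarts from the current point in time.
            return collection.watch()
        raise
```

Usage would be something like change_stream = open_stream(client.test.coll, resume_token). Whether silently skipping the missed events is acceptable depends on the use case; if not, you probably need a full resync of the collection before restarting the stream.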

Cheers,
Maxime.