Error occurred - 'CollectionScan died due to position in capped collection being deleted'

Our production MongoDB deployment is a PSA (primary-secondary-arbiter) replica set running version 4.2 (read concern "majority" disabled). I have a service that reads the change stream for a specific collection. Today, out of nowhere, I saw this error:

  • Error occurred and seen within onError, error=[{}]
    com.mongodb.MongoQueryException: Query failed with error code 136 and error message 'CollectionScan died due to position in capped collection being deleted. Last seen record id: {number} on server

After this, the stream tried to restart from the last persisted resume token, which also failed:

  • Command failed with error 280 (ChangeStreamFatalError): 'cannot resume stream; the resume token was not found. {_data: "{resume-token}"}' on server. The full response is {"errorLabels": ["NonResumableChangeStreamError"], "operationTime": {"$timestamp": {"t": 1590353747, "i": 3}}, "ok": 0.0, "errmsg": "cannot resume stream; the resume token was not found.

Note:

  • The last persisted resume token was at most 5 minutes old.
  • The oldest oplog entry for the collection I was watching is still 2 days old.

I'm not sure how the resume token got invalidated. Would resuming from a timestamp be a better idea?
Is there a best practice the service should follow to handle this kind of error?
It would be a great help to learn more about these errors and the best practices around them.
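
On the error-handling question, one possible approach is to separate errors where retrying with the same resume token cannot succeed from transient ones. The sketch below is a minimal, driver-free illustration: the class `ChangeStreamErrors` and the method `needsFreshStart` are hypothetical names, and the only codes and label it checks are the ones that appear in the two errors quoted above (136, 280, and `NonResumableChangeStreamError`). In a real service the code and labels would presumably come from the thrown `MongoException` via `getCode()` and `hasErrorLabel(...)`.

```java
import java.util.Set;

/** Hypothetical helper: decides whether a change stream error rules out resuming from the saved token. */
final class ChangeStreamErrors {
    // Code 136 (CappedPositionLost): the cursor's position in the capped oplog was deleted.
    static final int CAPPED_POSITION_LOST = 136;
    // Code 280 (ChangeStreamFatalError): reported here as "the resume token was not found".
    static final int CHANGE_STREAM_FATAL_ERROR = 280;
    // Error label the server attached to the failed resume attempt.
    static final String NON_RESUMABLE_LABEL = "NonResumableChangeStreamError";

    private ChangeStreamErrors() {}

    /**
     * Returns true when retrying with the same resume token is pointless,
     * so the caller has to open a new stream and re-sync the missed data some other way.
     */
    static boolean needsFreshStart(int errorCode, Set<String> errorLabels) {
        return errorCode == CAPPED_POSITION_LOST
                || errorCode == CHANGE_STREAM_FATAL_ERROR
                || errorLabels.contains(NON_RESUMABLE_LABEL);
    }
}
```

When `needsFreshStart` returns true, the events between the lost position and the current end of the oplog are gone from the stream, so the service would need its own way to reconcile (for example, re-reading the collection) before opening a new stream without the old token.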

I have found the answer, and it is just what the error says. Our service that reads from the change stream applies backpressure and had fallen far behind on consumption (latency was much higher than expected). As a result, when the oplog, being a capped collection, rolled over and dropped its oldest entries, the position the change stream cursor was on no longer existed. We fixed the throughput and all is good now.
Thought I'd share.
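
To catch this earlier, a consumer could track how close it is to the tail end of the oplog. Below is a minimal sketch of that arithmetic in plain Java; the class `OplogHeadroom` and its methods are hypothetical names, and the three timestamps are assumed to be supplied by the service (for instance, the `ts` of the oldest and newest entries in `local.oplog.rs` and the cluster time of the last event processed).

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper: estimates how close a change stream consumer is to falling off the oplog. */
final class OplogHeadroom {
    private OplogHeadroom() {}

    /** How far the consumer trails the newest oplog entry. */
    static Duration lag(Instant newestOplogEntry, Instant lastProcessedEvent) {
        return Duration.between(lastProcessedEvent, newestOplogEntry);
    }

    /** How far the consumer's position is ahead of the oldest entry still in the oplog. */
    static Duration headroom(Instant oldestOplogEntry, Instant lastProcessedEvent) {
        return Duration.between(oldestOplogEntry, lastProcessedEvent);
    }

    /** True when the remaining headroom is below the chosen safety margin. */
    static boolean atRisk(Instant oldestOplogEntry, Instant lastProcessedEvent, Duration minHeadroom) {
        return headroom(oldestOplogEntry, lastProcessedEvent).compareTo(minHeadroom) < 0;
    }
}
```

A shrinking headroom means the oplog is rolling over faster than the consumer advances, which is the situation that ended in error 136 here. The safety margin passed to `atRisk` is a tuning choice for the service, not a MongoDB setting.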


@Atil_Pai thanks for coming back and posting the solution to your original issue. This will help others who might come across this same issue in the future.
