I’m evaluating MongoDB change streams or triggers to support auditing. Any change to the database MUST be logged into an audit history collection. How reliable are change streams or Atlas triggers for logging changes to the audit history? In what circumstances might they cause data loss, such that committed data changes could not be logged?
In terms of change streams, they should be very reliable: they use the oplog as their data source.
In terms of Atlas triggers, they also should fire when specified.
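To make this concrete, here is a minimal sketch of the change-stream approach using pymongo. The database and collection names (`mydb`, `orders`, `audit_history`) are illustrative, and `run_audit_stream` is not called here since it needs a live replica set:

```python
from datetime import datetime, timezone

def make_audit_entry(event):
    """Convert a change stream event into an audit document.

    Keeps the operation type, the affected document's _id, and the full
    document (present for inserts/replaces, and for updates when
    full_document='updateLookup' is requested on the stream).
    """
    return {
        "op": event["operationType"],
        "doc_id": event["documentKey"]["_id"],
        "full_document": event.get("fullDocument"),
        "logged_at": datetime.now(timezone.utc),
    }

def run_audit_stream(client):
    """Sketch only (requires a replica set): tail the stream, log every event."""
    db = client["mydb"]
    with db["orders"].watch(full_document="updateLookup") as stream:
        for event in stream:
            db["audit_history"].insert_one(make_audit_entry(event))

# Shape of an insert event as the change stream delivers it:
sample_event = {
    "operationType": "insert",
    "documentKey": {"_id": 1},
    "fullDocument": {"_id": 1, "status": "new"},
}
print(make_audit_entry(sample_event)["op"])
```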
I would encourage you to test both solutions thoroughly with your workload. If you find missing events, definitely report it as a possible bug.
If the operation identified by the resume token passed to the startAfter option has already dropped off the oplog, db.collection.watch() cannot resume the change stream.
So it is theoretically possible to lose events, since oplog entries are not persisted forever. For example, your change stream connection breaks and, before you get a chance to reestablish it, the relevant entries have already rolled off the oplog on a busy server.
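One way to narrow that window is to persist the resume token after every processed event and pass it back via start_after on reconnect. A sketch, assuming the token file path and the `handle` processing step are illustrative:

```python
import json
import os

TOKEN_FILE = "resume_token.json"  # illustrative path

def save_resume_token(token, path=TOKEN_FILE):
    # The resume token is a small document (e.g. {"_data": "<hex string>"}).
    # Write to a temp file and rename so a crash can't leave it half-written.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(token, f)
    os.replace(tmp, path)

def load_resume_token(path=TOKEN_FILE):
    """Return the last saved token, or None on first run."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)

def run_with_resume(collection):
    """Sketch only (requires a replica set): resume from the saved token."""
    token = load_resume_token()
    with collection.watch(start_after=token) as stream:
        for event in stream:
            handle(event)                     # hypothetical processing step
            save_resume_token(event["_id"])   # the token travels in _id
```

If the saved token has already dropped off the oplog, watch() fails as quoted above, so the reconnect deadline is still bounded by the oplog window.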
Correct, but this is not an issue with change streams or Atlas triggers themselves. It basically means the oplog was not sized correctly for the workload, or that a catastrophic issue went unresolved for a long time, long enough for the oplog to roll over. At that point, the whole replica set would have issues, not only the change stream.
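You can monitor how much headroom you have by measuring the oplog window (the wall-clock span between the oldest and newest oplog entries). A pymongo sketch; reading local.oplog.rs directly requires a replica set, so only the arithmetic helper is exercised here:

```python
def window_hours(first_ts_seconds, last_ts_seconds):
    """Oplog window: wall-clock span between oldest and newest entries."""
    return (last_ts_seconds - first_ts_seconds) / 3600.0

def oplog_window_hours(client):
    """Sketch only (requires a replica set): read the first and last
    entries of local.oplog.rs. The BSON Timestamp in "ts" exposes
    .time, the wall-clock seconds component."""
    oplog = client["local"]["oplog.rs"]
    first = oplog.find().sort("$natural", 1).limit(1).next()
    last = oplog.find().sort("$natural", -1).limit(1).next()
    return window_hours(first["ts"].time, last["ts"].time)

# If this window ever shrinks below your worst-case reconnect time,
# events can roll off before a broken change stream resumes.
print(window_hours(0, 86400))  # 86400 s of oplog -> 24.0 hours
```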
That’s a fair point.
But I would personally prefer ingesting the change stream events into a message queue (e.g. Kafka) and setting up consumers on the other end of the queue (and it looks like MongoDB already supports this with its Kafka connector).
By doing this, events will not be lost before the application processes them, and the change stream traffic will never swamp the observer side (imagine there being so many events that a simple change stream cursor from the mongo shell cannot keep up).
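For reference, the official MongoDB Kafka source connector tails the change stream and publishes each event to a Kafka topic. A sketch of its connector config; the connector name, connection URI, and database/collection names are illustrative:

```json
{
  "name": "mongo-audit-source",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSourceConnector",
    "connection.uri": "mongodb://mongo1:27017/?replicaSet=rs0",
    "database": "mydb",
    "collection": "orders",
    "topic.prefix": "audit"
  }
}
```

With this setup Kafka does the buffering and retention, so a slow or temporarily offline consumer reads the backlog from the topic instead of racing the oplog window.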
Thank you all for the great suggestions!