Hi!
I am evaluating MongoDB as the primary operational database for a microservice. This microservice processes events that may be duplicates. An event is considered a duplicate if it is identical to another event that has already been processed by the microservice.
The microservice needs to handle these potentially duplicate events without generating any side effects (considering only database updates as side effects).
Assumptions: all events carry a “message-id” field that can be used to determine whether two events are the same.
With a relational database, this problem can be solved quite easily: we can bundle both the updates to the business entities and the insertion of a record containing the “message-id” of the processed event as the primary key in the same transaction. When the microservice processes duplicate events, the transaction aborts due to the unique key violation, allowing the application to catch this case and detect the duplicate.
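To make the relational recipe concrete, here is a minimal runnable sketch using SQLite; the `accounts` table, the amounts, and the function name are illustrative, but the mechanism is exactly the one described: one transaction bundles the business update with an insert keyed by the “message-id”, and a unique-key violation flags the duplicate.

```python
import sqlite3

# Illustrative schema: a business table plus a dedup table whose
# primary key is the message-id of each processed event.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE processed_messages (message_id TEXT PRIMARY KEY);
    INSERT INTO accounts (id, balance) VALUES (1, 100);
""")

def process_event(conn, message_id, account_id, amount):
    """Apply the business update and record the message-id atomically.

    Returns True if the event was applied, False if it was a duplicate.
    """
    try:
        with conn:  # one transaction: commits on success, rolls back on error
            conn.execute(
                "INSERT INTO processed_messages (message_id) VALUES (?)",
                (message_id,),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:  # unique-key violation => duplicate event
        return False

process_event(conn, "msg-1", 1, 50)  # applied, balance becomes 150
process_event(conn, "msg-1", 1, 50)  # duplicate: rolled back, no side effects
```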
How would you approach the same problem with MongoDB? Transactions are available in MongoDB, but using them extensively might hurt performance.
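For reference, the transactional approach in MongoDB would mirror the relational recipe: inside one transaction, insert a record keyed by the “message-id” into a dedicated collection and apply the business update. The sketch below only builds the documents a driver would send; the collection names, field names, and the wrapper keys of the returned operation descriptions are all illustrative, not a driver API.

```python
def build_transaction_ops(message_id, entity_id, business_changes):
    """Return descriptions of the two operations to run inside one
    MongoDB transaction (illustrative structure, not a driver call)."""
    # Dedup collection: _id carries a unique index by default, so a
    # duplicate event makes this insert fail and aborts the transaction.
    dedup_insert = {
        "collection": "processed_messages",
        "op": "insert_one",
        "document": {"_id": message_id},
    }
    # The business update on the entity's own collection.
    entity_update = {
        "collection": "entities",
        "op": "update_one",
        "filter": {"_id": entity_id},
        "update": {"$set": business_changes},
    }
    return [dedup_insert, entity_update]

ops = build_transaction_ops("msg-7", "order-42", {"status": "shipped"})
```

With a real driver such as pymongo, the two operations would run inside a session transaction (e.g. `session.with_transaction(...)`), and a `DuplicateKeyError` raised by the insert would signal the duplicate, just like the unique-key violation in the relational case.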
Another solution is to enrich every document with a “message-ids” array and append the “message-id” of the event being processed whenever we update that document. Each update can first check whether the current “message-id” is already present in the “message-ids” array, detecting whether the event was already processed. However, I find this approach quite invasive, as it changes the structure of the business entities.
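For completeness, the “message-ids” array approach can be expressed as a single conditional update on one document, so no multi-document transaction is involved. The sketch below builds the raw filter and update documents one would pass to a driver call such as pymongo's `Collection.update_one`; the entity and field names are illustrative.

```python
def dedup_update(entity_id, message_id, business_changes):
    """Build a conditional update that applies the business changes and
    records the message-id in one atomic single-document operation."""
    filter_doc = {
        "_id": entity_id,
        # $ne on an array field matches only documents in which NO
        # element equals the value, i.e. this message-id is unseen.
        "message_ids": {"$ne": message_id},
    }
    update_doc = {
        "$set": business_changes,              # the business update itself
        "$push": {"message_ids": message_id},  # remember the message-id
    }
    return filter_doc, update_doc

f, u = dedup_update("order-42", "msg-7", {"status": "shipped"})
```

With a real driver, `update_one(f, u)` reporting `matched_count == 0` would indicate that the “message-id” was already present, i.e. a duplicate event.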
Am I missing any other possible solutions? How do you handle this kind of problem when using MongoDB as your primary operational database?