Compare 2 documents of the same collection in MongoDB

How can we compare 2 documents in the same collection and find which fields changed between them?

For example, if there is a document in a collection like
{
"property1" : "A",
"property2" : "B"
}

which gets changed to
{
"property1" : "A1",
"property2" : "B"
}

then I wish to produce output like
{
"propertyChanged" : "property1",
"oldValue": "A",
"newValue": "A1"
}

@MaBeuLux88

Hello, welcome to the MongoDB community.

I believe this link can help you understand, let me know your questions.

Hi @Samuel_84194 ,
Thanks for replying. The post has a lot of ideas about how to store data so that we get a history of changes, but there is no MongoDB query that produces the output I want.

Hi @Lakhan_SINGH and welcome to the MongoDB Community :muscle:!

I think it’s a use case for Change Streams, no?

Cheers,
Maxime.

Hi @MaBeuLux88 ,
Thanks for replying.
We are limited to MongoDB 4.4, where we only get the updated document and not the document as it was before the change.
In MongoDB 6+ we get this out of the box.

Could you point me to an aggregation pipeline that could determine the changed properties between 2 documents in a collection?

Unless you have stored the changes, you can't use a pipeline to check a document from yesterday against what it looks like today.
We currently store a document history on a document when a data fix is done so we can review any field that changed on any document.
The better way now (that we’re moving to) is a change stream to track changes, via an Atlas trigger.

If you don't have anything in place currently to track changes, you can't.

We do have scripts in place that can roll up changes for a document, but re-creating a document from history is non-trivial, which is why we were also waiting to upgrade to later versions to get pre-images.

I’m not sure how you could do it in an aggregation but it’s not that hard to do in a script.
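As a rough illustration of the script approach, here is a minimal Python sketch that compares two documents once both versions are available as plain dicts (for example, two results of pymongo's `find_one`). The function name `diff_documents` and the `ignore` parameter are made up for this example, and it only compares top-level fields, as in the sample documents from the question.

```python
# Hypothetical helper, not part of MongoDB or pymongo.
_MISSING = object()  # marks a field that is absent from one of the documents


def diff_documents(old, new, ignore=("_id",)):
    """Return one change record per top-level field that differs."""
    changes = []
    # Keep a stable order: fields of the old document, then new-only fields.
    fields = list(old) + [f for f in new if f not in old]
    for field in fields:
        if field in ignore:
            continue
        old_value = old.get(field, _MISSING)
        new_value = new.get(field, _MISSING)
        if old_value != new_value:
            changes.append({
                "propertyChanged": field,
                # An added or removed field is reported with None on one side.
                "oldValue": None if old_value is _MISSING else old_value,
                "newValue": None if new_value is _MISSING else new_value,
            })
    return changes


old_doc = {"property1": "A", "property2": "B"}
new_doc = {"property1": "A1", "property2": "B"}
print(diff_documents(old_doc, new_doc))
# [{'propertyChanged': 'property1', 'oldValue': 'A', 'newValue': 'A1'}]
```

Nested documents and arrays are compared as whole values here; a recursive version would be needed to report changes inside them.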

Change Streams were introduced in 3.6, so if you are (still) running 4.4, you have them available as long as you are running a Replica Set, which should be the default for a production environment.

As you read the documents from the change stream, you can maintain a side collection that keeps track of the old values as they are updated in the main collection.
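The side-collection idea could be sketched roughly as follows in Python. This is only an outline under some assumptions: `apply_event` and `watch_collection` are hypothetical names, the `snapshots` dict stands in for the side collection, and only flat top-level fields are handled (updates to nested fields arrive in `updatedFields` with dotted keys, which this sketch does not resolve). It relies on the `updateDescription` that update events carry, so it only knows old values for documents it has already seen.

```python
def apply_event(event, snapshots):
    """Turn one change stream event into change records and update snapshots.

    snapshots maps a document _id to its last known state. It is an
    in-memory stand-in for the side collection (assumption for this sketch).
    """
    doc_id = event["documentKey"]["_id"]
    op = event["operationType"]
    old = snapshots.get(doc_id, {})
    changes = []

    if op == "update":
        desc = event["updateDescription"]
        new = dict(old)
        for field, value in desc.get("updatedFields", {}).items():
            changes.append({"propertyChanged": field,
                            "oldValue": old.get(field),
                            "newValue": value})
            new[field] = value
        for field in desc.get("removedFields", []):
            changes.append({"propertyChanged": field,
                            "oldValue": old.get(field),
                            "newValue": None})
            new.pop(field, None)
        snapshots[doc_id] = new
    elif op in ("insert", "replace"):
        # These events carry the whole document; just remember it.
        snapshots[doc_id] = dict(event["fullDocument"])
    elif op == "delete":
        snapshots.pop(doc_id, None)
    return changes


def watch_collection(collection, snapshots):
    """Feed events from a pymongo collection into apply_event.

    Needs a replica set; blocks and prints changes as they arrive.
    """
    with collection.watch() as stream:
        for event in stream:
            for change in apply_event(event, snapshots):
                print(change)
```

In a real setup the `snapshots` dict would be replaced by reads and upserts against the side collection, and the snapshots would need to be seeded from the main collection before the watcher starts, otherwise the first update to each document is reported with an `oldValue` of `None`.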