Updates to documents in aggregation pipelines affecting results?

We have an aggregation pipeline that takes roughly 15-20 minutes to complete. During this pipeline we calculate aggregated values based on properties of the documents. If a value on one of these documents is changed by an update from another process, how does that affect the results of the pipeline currently in progress?

In general, when programming, objects hold references to each other. If an object changes, everything holding a reference to it sees the change. This is why we have to worry about deep vs. shallow copies in some programming languages.
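
To illustrate what I mean, here is a minimal Python sketch of that reference behaviour. The `doc` dictionary is a made-up stand-in, not an actual MongoDB document, and this says nothing about how mongod itself handles documents:

```python
import copy

# Hypothetical in-memory "document" with a nested sub-object.
doc = {"_id": 1, "stats": {"total": 10}}

alias = doc                  # same object, no copy at all
shallow = copy.copy(doc)     # new outer dict, nested dict is still shared
deep = copy.deepcopy(doc)    # fully independent copy

# Simulate an update arriving while a long computation is running.
doc["stats"]["total"] = 99

print(alias["stats"]["total"])    # 99: the alias sees the change
print(shallow["stats"]["total"])  # 99: the nested dict is shared
print(deep["stats"]["total"])     # 10: the deep copy is unaffected
```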

So my question is: if a document is changed and its values updated during this 15-20 minute aggregation pipeline, does that change the results we WOULD have gotten had the document not been changed?

I hope that makes sense.

Hi @Wyatt_Baggett,

Simple question but not so easy to answer :smiley:.

I guess it’s theoretically possible to read from a consistent snapshot by using the read concern “snapshot” in a multi-document transaction that is itself part of a causally consistent session. But multi-document transactions are limited to 60 seconds by default, so that is not very useful in your case unless you increase transactionLifetimeLimitSeconds to some value above 20 minutes. Even then, I guess such long-running transactions would put non-negligible cache pressure on the mongod server, which would have to keep all the different snapshots in memory, so it’s most probably not worth it.

So my simple answer is “yes”. The final value will be impacted if write operations modify data used by the aggregation while it is running.

That being said, once your aggregation has passed its first few stages (match and sort), your mongod is working on an in-memory copy of your documents. From that point on, I guess your final result wouldn’t be impacted anymore, as that copy is no longer kept in sync with the original documents.

My little finger is also telling me that you should keep an eye on MongoDB 5.0, which should come out in a couple of months and COULD contain a solution to this problem…

You didn’t hear that from me.

Cheers,
Maxime.

PS: Dear judge, this post can’t be a proof. I didn’t write this. I swear. It wasn’t me. I was at the cinema at that time.
