How to Maintain Multiple Versions of a Record in MongoDB (2024 Updates)

John Page • 6 min read • Published Aug 12, 2024 • Updated Aug 12, 2024
Over the years, there have been various methods proposed for versioning data in MongoDB. Versioning data means being able to easily get not just the latest version of a document or documents but also view and query the way the documents were at a given point in time.
There was the blog post from Asya Kamsky written roughly 10 years ago, an update from Paul Done (author of Practical MongoDB Aggregations), and also information on the MongoDB website about the version pattern from 2019.
These variously maintain two distinct collections of data — one with the latest version and one with prior versions or updates, allowing you to reconstruct them.
Since then, however, there have been seismic, low-level changes in MongoDB's update and aggregation capabilities. Here, I will show you a relatively simple way to maintain a document history when updating without maintaining any additional collections.
To do this, we use expressive updates, also sometimes called aggregation pipeline updates. Rather than passing an object with update operators such as $push and $set as the second argument to update, we express our update as an aggregation pipeline: an ordered set of changes. By doing this, we can not only make changes but also take the previous values of any fields we change and record those in a different field as a history.
The simplest example of this would be to use the following as the update parameter for an updateOne operation.
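A minimal sketch of such an update parameter, assuming a hypothetical collection named `example` and a hypothetical filter on `_id`. The `db` call is guarded so the snippet only touches a database when run inside the MongoDB shell.

```javascript
// Expressive (pipeline) update: the second argument is an array of stages.
// Within a single $set stage, "$a" refers to the value a had before the update.
const setAndKeepPrevious = [
  { $set: { previous_a: "$a", a: 5 } }
];

// `example` and the filter are hypothetical; this branch only runs in mongosh.
if (typeof db !== "undefined") {
  db.example.updateOne({ _id: 1 }, setAndKeepPrevious);
}
```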
This would explicitly set a to 5 but also set previous_a to whatever a was before the update. This would only give us a history look-back of a single change, though.
Before:
After:
What we want to do is take all the fields we change and construct an object with those prior values, then push it into an array — theoretically, like this:
The above does not work because $push is an update operator, not aggregation syntax, so it gives a syntax error. What we instead need to do is rewrite the push as an array operation, like so:
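A sketch of the array-operation form, built from the operators discussed here ($concatArrays and $ifNull). The new values 5 and 6, the collection name `example`, and the filter are illustrative assumptions.

```javascript
// Prepend an object holding the prior values of a and b to the history array,
// then set the new values, all in one $set stage.
const recordHistoryAndSet = [
  {
    $set: {
      history: {
        $concatArrays: [
          // An array of one: the values as they were before this update.
          [{ _updateTime: "$$NOW", a: "$a", b: "$b" }],
          // The existing history, or an empty array if there is none yet.
          { $ifNull: ["$history", []] }
        ]
      },
      a: 5, // illustrative new values
      b: 6
    }
  }
];

// `example` and the filter are hypothetical; this branch only runs in mongosh.
if (typeof db !== "undefined") {
  db.example.updateOne({ _id: 1 }, recordHistoryAndSet);
}
```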
To talk through what's happening here: I want to add an object, { _updateTime: "$$NOW", a:"$a",b:"$b"}, to the beginning of the array. I cannot use $push, as that is update syntax, and expressive syntax is about generating a document with new versions of fields; effectively, it is just $set. So I need to set the array to the previous array with my new value prepended.
We use $concatArrays to join two arrays, so I wrap my single document containing the old values for fields in an array. Then, the new array is my array of one concatenated with the old array.
I use $ifNull to say that if the value was previously null or missing, treat it as an empty array instead. So the first time, it actually does history = [{ _updateTime: "$$NOW", a:"$a",b:"$b"}] + [].
Before:
After:
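To make the before and after states concrete, here is a plain JavaScript model of the prepend logic described above. It is an analogy only: the server evaluates the aggregation expressions, not this function, and the function name and document values are made up for illustration.

```javascript
// Plain JavaScript model of the pipeline update; not MongoDB code.
function applyVersionedSet(doc, changes, now) {
  // Capture the prior values of just the fields being changed.
  const prior = { _updateTime: now };
  for (const field of Object.keys(changes)) {
    prior[field] = doc[field];
  }
  // Equivalent of $concatArrays with an $ifNull fallback to [].
  const history = [prior].concat(doc.history ?? []);
  return { ...doc, ...changes, history };
}

const before = { _id: 1, a: 3, b: 4 };
const t1 = new Date("2024-01-01T00:00:00Z");
const after = applyVersionedSet(before, { a: 5, b: 6 }, t1);
// after: { _id: 1, a: 5, b: 6, history: [ { _updateTime: t1, a: 3, b: 4 } ] }
```

Applying the function again to `after` puts the newer entry at index 0, so the history reads newest first.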
That's a little hard to write but if we actually write out the code to demonstrate this and declare it as separate objects, it should be a lot clearer. The following is a script you can run in the MongoDB shell either by pasting it in or loading it with load("versioning.js").
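As a sketch of the first part of such a script, the following builds ten records shaped like the sample output shown here. The collection name `data` and the helper name `makeRecords` are assumptions, and the field values are random, so they will differ from any sample output.

```javascript
// Build `count` documents with random integer fields field_1..field_N.
function makeRecords(count, fieldCount) {
  const records = [];
  for (let id = 0; id < count; id++) {
    const doc = { _id: id };
    for (let f = 1; f <= fieldCount; f++) {
      doc[`field_${f}`] = Math.floor(Math.random() * 100); // integer in 0..99
    }
    doc.dateUpdated = new Date();
    records.push(doc);
  }
  return records;
}

const records = makeRecords(10, 4);

// The collection name `data` is an assumption; this branch only runs in mongosh.
if (typeof db !== "undefined") {
  db.data.drop();
  db.data.insertMany(records);
}
```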
This code first generates some simple records:
(index)  _id  field_1  field_2  field_3  field_4  dateUpdated
0        0    34       49       19       74       2024-04-15T13:30:12.788Z
1        1    139434 (column breaks lost)         2024-04-15T13:30:12.836Z
2        2    51       30       96       93       2024-04-15T13:30:12.849Z
3        3    29       44       21       85       2024-04-15T13:30:12.860Z
4        4    4135157 (column breaks lost)        2024-04-15T13:30:12.866Z
5        5    0        85       56       28       2024-04-15T13:30:12.874Z
6        6    85       56       24       78       2024-04-15T13:30:12.883Z
7        7    27       23       96       25       2024-04-15T13:30:12.895Z
8        8    70       40       40       30       2024-04-15T13:30:12.905Z
9        9    6913139 (column breaks lost)        2024-04-15T13:30:12.914Z
Then, we modify the data, recording the history as part of the update operation.
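One way this step could be written is a small helper that turns a set of new values into a pipeline update. This is a sketch under stated assumptions: the helper name `buildVersionedUpdate`, the collection name `data`, and the choice to wrap new values in $literal (so that strings starting with "$" are not read as field paths) are mine, not taken from the original script.

```javascript
// Build a pipeline update that sets the fields in `changes` and prepends
// their prior values, plus a timestamp, to the `history` array.
function buildVersionedUpdate(changes) {
  const newValues = {};
  const priorValues = { _updateTime: "$$NOW" };
  for (const [field, value] of Object.entries(changes)) {
    newValues[field] = { $literal: value }; // treat the new value as a constant
    priorValues[field] = `$${field}`;       // field path to the current value
  }
  // dateUpdated changes on every update, so its prior value is recorded too.
  newValues.dateUpdated = "$$NOW";
  priorValues.dateUpdated = "$dateUpdated";

  return [
    {
      $set: {
        ...newValues,
        history: {
          $concatArrays: [[priorValues], { $ifNull: ["$history", []] }]
        }
      }
    }
  ];
}

// The collection name `data` is an assumption; this branch only runs in mongosh.
if (typeof db !== "undefined") {
  db.data.updateOne({ _id: 3 }, buildVersionedUpdate({ field_2: 50 }));
}
```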
We now have records that look like this — with the current values but also an array reflecting any changes.
We can now use an aggregation pipeline to retrieve any prior version of each document. To do this, we first filter the history to include only changes up to the point in time we want. We then merge them together in order:
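A sketch of such a retrieval pipeline follows. It assumes the history array is stored newest first and that each entry carries an `_updateTime` plus the values as they were just before that update; the function name `versionAsOf` and the collection name `data` are hypothetical. Under those assumptions, it keeps the changes made after the requested point in time and merges them over the current document in array order, so the oldest retained prior value wins.

```javascript
// Build a pipeline that reconstructs each document as it was at `pointInTime`.
function versionAsOf(pointInTime) {
  return [
    // Keep only the changes that happened after the requested point in time.
    {
      $set: {
        history: {
          $filter: {
            input: { $ifNull: ["$history", []] },
            as: "change",
            cond: { $gt: ["$$change._updateTime", pointInTime] }
          }
        }
      }
    },
    // Start from the current document and overlay each retained entry,
    // newest first, so older prior values overwrite newer ones.
    {
      $replaceWith: {
        $reduce: {
          input: "$history",
          initialValue: "$$ROOT",
          in: { $mergeObjects: ["$$value", "$$this"] }
        }
      }
    },
    // Drop the bookkeeping fields from the reconstructed document.
    { $unset: ["history", "_updateTime"] }
  ];
}

// The collection name `data` is an assumption; this branch only runs in mongosh.
if (typeof db !== "undefined") {
  const fiveMinutesAgo = new Date(Date.now() - 5 * 60 * 1000);
  db.data.aggregate(versionAsOf(fiveMinutesAgo)).toArray();
}
```

To reconstruct a single document, a $match stage on its `_id` can be placed in front of these stages.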
This technique came about through discussing the needs of a MongoDB customer. They had exactly this use case: to retain both the current version and its history, and to be able to query and retrieve any version without having to maintain a full copy of the document. It is an ideal choice if changes are relatively small. It could also be adapted to only record a history entry if the field value is different, allowing you to compute deltas even when overwriting the whole record.
As a cautionary note, versioning inside a document like this will make the documents larger. It also means an ever-growing array of edits. If you believe there may be hundreds or thousands of changes, this technique is not suitable and the history should be written to a second document using a transaction. To do that, perform the update with findOneAndUpdate and return the fields you are changing from that call to then insert into a history collection.
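A rough sketch of that alternative, assuming a hypothetical history collection named `data_history` and hypothetical entry fields `docId` and `updateTime`. The helper takes collection objects as arguments so that the caller can obtain them from a session with an open transaction; multi-document transactions require a replica set or sharded cluster.

```javascript
// Update one document and write the prior values of the changed fields
// to a separate history collection.
function updateWithExternalHistory(collection, historyCollection, filter, changes) {
  // Only fetch the fields that are about to change (_id is included by default).
  const projection = {};
  for (const field of Object.keys(changes)) {
    projection[field] = 1;
  }

  // By default, findOneAndUpdate returns the document as it was before the update.
  const prior = collection.findOneAndUpdate(filter, { $set: changes }, { projection });
  if (prior === null) {
    return null; // nothing matched, so there is no history to record
  }

  const { _id, ...priorValues } = prior;
  historyCollection.insertOne({ docId: _id, updateTime: new Date(), ...priorValues });
  return prior;
}

// Collection names are assumptions; this branch only runs in mongosh.
if (typeof db !== "undefined") {
  const session = db.getMongo().startSession();
  session.startTransaction();
  try {
    const sdb = session.getDatabase(db.getName());
    updateWithExternalHistory(sdb.data, sdb.data_history, { _id: 3 }, { field_2: 50 });
    session.commitTransaction();
  } catch (err) {
    session.abortTransaction();
    throw err;
  } finally {
    session.endSession();
  }
}
```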
This isn't intended as a step-by-step tutorial, although you can try the examples above and see how it works. It's one of many sophisticated data modeling techniques you can use to build high-performance services on MongoDB and MongoDB Atlas. If you have a need for record versioning, you can use this. If not, then perhaps spend a little more time seeing what you can create with the aggregation pipeline, a Turing-complete data processing engine that runs alongside your data, saving you the time and cost of fetching it to the client to process. Learn more about aggregation.