Coming in MongoDB 4.2: Pipeline Powered Updates and More Expressive Queries

Dj Walker-Morgan


In this series, we are introducing you to some of the features that will be making it into MongoDB 4.2. For many of you, this will be all the information you need to be ready for when 4.2 goes generally available (GA). We've already looked at wildcard indexing, compression and configuration and this week, we’re looking at enhancements to updates and to the aggregation pipeline.

When we work with MongoDB, for anything more than simple CRUD operations we used to find ourselves reaching for the aggregation framework. It's there that we can assemble a powerful pipeline of operations which can perform transformations of documents. In MongoDB 4.2, that pipeline power has now been brought to the update command, bringing a massive boost to the capabilities of the command. We'll show you how that works and then look at the new aggregation operators and expressions in 4.2 which give you even more options - trigonometry, regular expressions and the current time.

Pipelines Everywhere

There's been an important shift in where you can use aggregation pipelines in MongoDB. Previously, pipelines were tied to the aggregate function. With 4.2, you can now also create and use pipelines as part of update and findAndModify commands. Let's show the changes with an example. Let's get ourselves a document:

> db.exp.insertOne({ _id:"x", val1: 100, val2: 200 });
{ "acknowledged" : true, "insertedId" : "x" }

In the past, if we wanted to get a total of val1 and val2, because we couldn't refer to the fields in the update we would get the document, add them together and then update the document.

> db.exp.update({ _id:"x" }, { $set: { total: 300 } })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.exp.findOne()
{ "_id" : "x", "val1" : 100, "val2" : 200, "total" : 300 }

Of course, there's a round trip involved and it would likely, in a non-example world, be part of other updates. Smart MongoDB users might never calculate the total and save it, knowing the aggregation pipeline has a $sum operator which can add an array of values together and refer to fields using $ notation like so:

> db.exp.aggregate( [ { $set: { total: { $sum:[ "$val1","$val2" ] } } } ])
{ "_id" : "x", "val1" : 100, "val2" : 200, "total" : 300 }

This isn't persistent though:

> db.exp.findOne()
{ "_id" : "x", "val1" : 100, "val2" : 200 }

And as it stands, that aggregation command would do the entire collection. We'd need to add a $match stage to set its scope down to one document...

If only we could bring the power of the aggregation framework to the update command. Well, with MongoDB 4.2 we can:

> db.exp.update({"_id" : "x" }, [ { $set: { total: { $sum:[  "$val1","$val2" ] } } } ] )
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.exp.findOne();
{ "_id" : "x", "val1" : 100, "val2" : 200, "total" : 300 }

We moved our aggregation pipeline into our update and its changes take place on the selected document. When we set a field value, it is written into the document. And it all takes place on the server with no round trips. That includes the aggregation framework's ability to perform conditional statements on the server like this:

> db.exp.update({ _id:"x" }, [ { $set: { status: { $cond: { if: { $gt: [ "$total",200 ] }, then: "Over Limit", else: "Clear" } } } } ] )

Which looks a little clearer formatted as:

> db.exp.update({ _id:"x" }, [
    "$set": {
      "status": {
        "$cond": {
          "if": { "$gt": ["$total", 200] },
          "then": "Over Limit",
          "else": "Clear"

This is part of an ongoing task for MongoDB's developers, to unify the language used in queries and aggregation and offer the same capabilities everywhere. When we talk about improvements in the aggregation framework, they could also apply to the update and findAndModify commands when they use aggregation pipelines.

If you are familiar with the aggregation framework, you may wonder where the $set aggregation stage came from. It too is new in 4.2, but not that new; it's an alias for the existing $addFields stage and exists to make that language unification seamless. It is one of the three aggregation stages that are appropriate to use in an update. Each has a new alias and its original aggregation stage name - $set/$addFields, $unset/$project and $replaceWith/$replaceRoot. These stages let you add fields, remove fields or completely replace the entire document; everything you'd ever want to do when updating a document.

Smooth Math Operators

Being able to calculate with common trigonometry functions is one of those things that you only miss when it isn't available. With MongoDB 4.2, that risk of missing out is averted with a set of trig expressions being added to the aggregation framework. Sin, cos, tan, their inverses and their inverse hyperbolic variants are all supported with $sin, $cos, $tan, $asin, $acos, $atan, $atan2, $asinh, $acosh and $atanh. All of these expressions work with radians, so there's also the $degreesToRadians and $radiansToDegrees conversion expressions.

There's also a new $round expression which will round numerical values to a specific digit or decimal place. Note that the older $trunc expression which truncated values to an integer has now been enhanced to truncate numerical values to a specific digit or decimal place, but it keeps its old behavior when used with it's older syntax.

Let's put those together into a worked query on our previous document:

>db.exp.update({"_id" : "x" }, [ { $set: { sin: { $round: [ { $sin:"$val1" } , 5 ] } } } ] )
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.exp.findOne();
	"_id" : "x",
	"val1" : 100,
	"val2" : 200,
	"total" : 300,
	"sin" : -0.50637

We're getting the sine of our val1 field and then rounding it to 5 decimal places, and writing it back into the document as the new sin field.

Regular Expressions for all

Up until MongoDB 4.2, you could only use the $regex operator in the $match stage in aggregations. That meant you could only use it for matching and not for parsing and extracting parts of a string. With 4.2, that all changes with three new operators, $regexFind, $regexFindAll and $regexMatch. Let's demonstrate with a quick example. First, lets throw some strings into documents:

> db.reg.insertOne( { text:"Looking for 100 numbers" } )
> db.reg.insertOne( { text:"Digging around in 256 digits" } )
> db.reg.insertOne( { text:"Filtering through 65 characters" } )

Now, what we want to capture is the number that appears in that phrase, but only when it's followed by a reference to numbers or digits. We'll use ([0-9]+) (numbers|digits) as our expression. This will capture the numbers and the word following using parentheses in the regular expression. Now, let's run that in an aggregation and see our results:

> db.reg.aggregate( { $set: { found: { $regexFind: { regex: "([0-9]+) (numbers|digits)", input:"$text" } } } } )
{ "_id" : ObjectId("5d3f059c7baafd6bbd862d47"), "text" : "Looking for 100 numbers", "found" : { "match" : "100 numbers", "idx" : 12, "captures" : [ "100", "numbers" ] } }
{ "_id" : ObjectId("5d3f059c7baafd6bbd862d48"), "text" : "Digging around in 256 digits", "found" : { "match" : "256 digits", "idx" : 18, "captures" : [ "256", "digits" ] } }
{ "_id" : ObjectId("5d3f076f7baafd6bbd862d49"), "text" : "Filtering through 65 characters", "found" : null }

If we look at the our resulting found field, we'll see we get back more than a simple true/false for the match:

"found" : { "match" : "100 numbers", "idx" : 12, "captures" : [ "100", "numbers" ] }

Here we see the returned match field gives us the exact string that the regular expression engine matched with. The idx field indicates how far into the source string that match was. Finally, the captures array returns each captured part of the matched string - the first element is the number from the string, the second element, the word "numbers" or "digits". It's ideal for complex string parsing. If there is no match, the $regexFind returns a null.

With $regexFind you just get the first match and captures back. If you have many matches to find, then $regexFindAll will flush out all matching patterns and extract all of them to an array of results like the results from $regexFind. In this case, if there's no match it returns an empty array.

If all you want is a true or false as to whether something matches a regular expression then $regexMatch will do what you need.

Finally, $$NOW

We've heard many times that people may want to inject timestamps into their documents without round-tripping to the client to get what the time was. In 4.2, we're happy to include $$NOW, a variable that can be accessed in aggregation pipelines which returns the current time as an ISODate.

And that's it for this "Coming in 4.2". In the next part, we'll be looking at On-Demand Materialized Views, which can take advantage of many of the features we explored here.

If you are eager and want to explore the very latest in MongoDB technology — and you have a safe testing area away from production — then we’d like to introduce you to MongoDB development releases and release candidates.

Working with a release candidate

There’s an abundance of new features coming in the next release of MongoDB and they are appearing in the development build as each feature is ready. Your essential reference will be the release notes, which you’ll find on the documentation site under its own version tab. In the current case, that’s the Release Notes for MongoDB 4.2 (Release Candidate). This is a living document and regularly modified - always check it.

Downloading the development build.

You can find the development build in the download center. Just select the latest development release in the Version drop down menu and continue as usual, selecting OS, package format and clicking on download.

Try it out on MongoDB Atlas

MongoDB 4.2 release candidates are also available for evaluation on MongoDB Atlas. We have a separate article on how to get started with MongoDB 4.2 in the cloud available and how to get a $200 credit so you can spin up a testing cluster.

Giving feedback and reporting Issues

If you want to give general feedback on MongoDB 4.2 RC4 and later, head over to the MongoDB User Google Group. For bug reporting, follow these instructions on how to submit a bug to the MongoDB JIRA.