Why do we need $addFields in $facet stage for Chapter 2?

Hi everyone,

I was just hoping to understand why exactly we need the following:

movies: [
  {
    $addFields: {
      title: "$title"
    }
  }
]

in the $facet stage for the ticket Faceted Search? Are we doing this simply to get a movies field with all the movie objects? If so, is there another way to do the same thing that makes the intent clearer?

Hey @Sangeeth_48021

I do believe you are correct about this, since $addFields:

Adds new fields to documents. $addFields outputs documents that contain all existing fields from the input documents and newly added fields.

----------------------------------MY ERROR--------------------------------------- With apologies

There are other ways to do the same thing. One that comes to mind is to use $group with passing null for _id.

The _id field is mandatory; however, you can specify an _id value of null, or any other constant value, to calculate accumulated values for all the input documents as a whole.

However, I am not sure this is any clearer, as you asked. I think that is a subjective question, and the answer depends on someone's experience with MongoDB and the aggregation pipeline.

Adding to @natac13’s reply: we just want the movies to pass through without modifying them.

A $group stage wouldn’t be ideal in this situation, as it would combine all the movies into one document and force you to push the movie documents into an array, causing more work after the $facet stage.
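To make the difference concrete, here is a minimal sketch of the two movies sub-pipelines written as plain JavaScript objects. The facet name movies and the $addFields stage come from the lab; the `$push: "$$ROOT"` accumulator is an assumption about how the $group variant would collect the documents.

```javascript
// Pass-through variant from the lab: every input document flows through as-is.
const passThroughFacet = {
  $facet: {
    movies: [{ $addFields: { title: "$title" } }],
  },
};

// $group variant (assumed shape): all documents collapse into ONE grouped
// document, so the movies end up nested one level deeper.
const groupFacet = {
  $facet: {
    movies: [{ $group: { _id: null, movies: { $push: "$$ROOT" } } }],
  },
};

// Shape of each facet result:
//   passThroughFacet -> { movies: [ {movie}, {movie}, ... ] }
//   groupFacet       -> { movies: [ { _id: null, movies: [ {movie}, ... ] } ] }
```

With the $group variant you would likely need an extra stage after $facet to get back to a flat list of movies, which is the additional work mentioned above.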

Kanika

Hi, I find that strange too. It looks more like a hack than a clean way to get all the movies into a property.

Is there a cleaner way to do that?

An alternative could be:

   {
      $match: { _id: { $exists: true } }
   }

Why should we iterate over each document just to collect them all in a property?

It looks like a latency generator.

Where’s the iteration you’re referring to?

A $facet stage is made up of individual pipelines of stages. Each pipeline must contain at least one stage. There’s no special property for $facet to say “bring in all the fields from the parent”.

@007_jb

When we run a $match stage, we run a comparison on each record.

When we run an $addFields stage, we run a function on each record.

The goal of the facet we are talking about is just to copy all the records into a property.
Iterating over each record to copy them is not good for performance.
A different stage that “copies” the list without doing anything at the record level would be the best solution.
But as you said, there seems to be no special property or stage that tells MongoDB to simply copy the list into a dedicated property.

That’s an oversimplified assumption of the internals of the complex algorithms that drive the aggregation framework. Bear in mind that there are query optimisations that happen internally.

The $exists example does an existence check on the field, not the value. In other words, it checks whether a specific key/field exists in each record; it doesn’t compare values.

That aside, what matters most about the usage of $addFields: {title: "$title"} in this lab is whether it meets the brief, which it does, for the following reasons:

  1. Does it return all the relevant documents? Yes.
  2. Does it alter the shape of any of the documents? No.
    i) If title does not exist in a document, does it add it? No, because it uses the field substitution syntax.
    ii) If title exists, does the value change? No. The substituted field has the same name as the substitute field, so it’s like-for-like.
    iii) Will title move to the bottom? No. Its position remains unchanged.
  3. Any performance impact? No. The $facet stage is limited to 20 documents, so the execution time for $addFields is ~0 milliseconds.
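The shape checks in point 2 can be illustrated with a small plain-JavaScript model. This is only a sketch that mimics the behaviour described above; it is not MongoDB code, and the helper name and the two sample documents are hypothetical.

```javascript
// Plain-JavaScript model of { $addFields: { title: "$title" } }.
// NOT MongoDB code: it only mimics the behaviour described in the list above.
function addTitleFromTitle(doc) {
  // "$title" resolves to nothing when the field is missing,
  // so no field is added in that case.
  if (!Object.prototype.hasOwnProperty.call(doc, "title")) {
    return { ...doc };
  }
  // Re-assigning an existing key keeps both its value and its position.
  return { ...doc, title: doc.title };
}

// Hypothetical sample documents, one with a title and one without.
const withTitle = { _id: 1, title: "Heat", year: 1995 };
const withoutTitle = { _id: 2, year: 2001 };

// Comparing the JSON text checks field order as well as values.
const sameWithTitle =
  JSON.stringify(addTitleFromTitle(withTitle)) === JSON.stringify(withTitle);
const sameWithoutTitle =
  JSON.stringify(addTitleFromTitle(withoutTitle)) === JSON.stringify(withoutTitle);

console.log(sameWithTitle, sameWithoutTitle); // prints: true true
```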

Below are some execution stats for comparison. I’ve also included another alternative using $limit:


The output of each $facet pipeline is an array inside a single output document, which is inherently subject to the 16MB document size limit. As a result, the intended use of facets is for faceting aggregated results. And yes, it would be nice to have something like $facet: {pipeline: []} to return all input documents (which is in line with aggregate([])), but for now we don’t, and we have acceptable methods.

So if you’re worried about performance, the only way to know how your pipeline is performing is to run it and examine the Explain Plan.
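As a rough sketch of how that could look in the mongo shell: the helper below is hypothetical, and it assumes you pass it a shell collection handle such as db.movies (the collection name is an assumption, not something confirmed in this thread).

```javascript
// Hypothetical helper: `collection` is expected to be a mongo shell
// collection handle, for example db.movies (assumed name).
function explainFacet(collection, pipeline) {
  // "executionStats" asks the server to run the pipeline and report
  // per-stage execution statistics alongside the plan.
  return collection.explain("executionStats").aggregate(pipeline);
}

// The pass-through facet discussed in this thread.
const pipeline = [
  { $facet: { movies: [{ $addFields: { title: "$title" } }] } },
];

// In the shell: explainFacet(db.movies, pipeline)
```

Swapping the $addFields stage for another candidate and comparing the reported stats is one way to check whether the alternatives really differ for your data.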
