What are some of the biggest mistakes people make in aggregation pipelines?

steevej · November 18, 2020, 5:59pm

One thing I do to make a pipeline more readable is to assign each stage to a variable and have the pipeline be an array of those variables. For example:

match = { "$match" : { ... } } ;
sort = { "$sort" : { ... } } ;
lookup = { "$lookup" : { ... } } ;
group = { "$group" : { ... } } ;
pipeline = [ match , lookup , sort , group ] ;
db.collection.aggregate( pipeline ) ;
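
Here is a minimal sketch of the same layout with the stages filled in, so it can be run as is. The field names (status, createdAt, customerId) and the debug flag are made up for illustration and are not from any real collection. It only builds the arrays; the aggregate call is left as a comment because it needs a live database.

```javascript
// Hypothetical stage contents: the field names are illustrative assumptions.
const match = { $match: { status: "active" } };
const sort = { $sort: { createdAt: -1 } };
const group = { $group: { _id: "$customerId", total: { $sum: 1 } } };
const limit = { $limit: 5 };

// The pipeline is just an array of the stage variables.
const pipeline = [match, sort, group];

// Leaving a stage out while debugging means removing one name from the array.
const withoutSort = [match, group];

// A stage can also be included conditionally; false entries are filtered out.
const debug = false; // hypothetical flag
const conditional = [match, debug && limit, group].filter(Boolean);

// In mongosh, against a real collection, this would be:
// db.collection.aggregate(pipeline);
console.log(JSON.stringify(pipeline));
```

Because each stage is a plain object, a single stage can also be printed or tested on its own before it goes into the array.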

I find a stage easier to modify when it stands by itself rather than being embedded in a myriad of braces. I can also easily remove a stage from the pipeline. As for indentation, being as old as I am, I prefer the Allman brace style, with each brace on its own line. So it would be, taking Prasad_Saya's example:

group = 
{ 
    $group:
    {
        _id: "$workers.order", 
        order_avg:
        { 
            $avg: { $subtract: [ "$workers.endAt", "$workers.startAt"  ] } 
        },
        global_values:
        { 
            $addToSet:
            {
                some_id: "$_id",  
                duration: { $subtract: [ "$endAt", "$startAt" ] } 
            } 
        },
        another_field:
        { 
            ... ... ...
        }
    } 
} ;

And I like to indent with tabs, which makes me not like YAML and Python very much. B-)