Help with Aggregation - self-referencing join

Hi,

First off, I am mid-level in terms of aggregations. Not an expert, but not a Noob either. But, I’m stuck.

Here’s my scenario. I’ve got a collection of test documents in an education app. Those documents contain a score, and a percentile. They also contain a term (Fall, Winter, Spring) and a grade level. I also have a collection of test definitions (dictionary) - that contain templates for indicator bands (my best attempt at describing this). For example, 0 to 50 is low, 51 to 89 is proficient, 90-100 is advanced.

The collections look like this:

//Test
{
    _id: '12345',
    student_id: '3350235',
    dictionary_id: '12345',
    test_score: 35,
    test_percentile: 56,
    term_name: 'fall',
    test_grade: '4',
    school_year: 2024,
    risk_level: 'someRisk'
}

//Dictionary
{
    _id: '12345',
    default_bands: [
        {
            _id: '1',
            term_name: "*", //wildcard
            grade: "*", //wildcard
            field: "percentile",
            bands: [
                {
                    _id:'1',
                    low:0,
                    high:50,
                    background_color:'red'
               },
               {
                    _id:'2',
                    low:51,
                    high:89,
                    background_color:'green'
               },
               {
                    _id:'3',
                    low:90,
                    high:100,
                    background_color:'blue'
               }
            ]
        },
        {
            _id: '2',
            term_name: "fall", 
            grade: "4", 
            field: "score",
            bands: [
                {
                    _id:'1',
                    low:0,
                    high:335,
                    background_color:'red'
               },
               {
                    _id:'2',
                    low:336,
                    high:768,
                    background_color:'green'
               },
               {
                    _id:'3',
                    low:769,
                    high:1000,
                    background_color:'blue'
               }
            ]
        },
        {
            _id: '3',
            term_name: "fall", 
            grade: "*", //wildcard 
            field: "risk_level",
            bands: [
                {
                    _id:'1',
                    low:'highRisk',
                    high:'highRisk',
                    background_color:'red'
               },
               {
                    _id:'2',
                    low:'someRisk',
                    high:'someRisk',
                    background_color:'green'
               },
               {
                    _id:'3',
                    low:'noRisk',
                    high:'noRisk',
                    background_color:'blue'
               }
            ]
        }
    ]
}

I’m trying to write an aggregation that will output the following:

//Output
{
    school_year:2024,
    test_grade: '4',
    term_name: 'fall',
    band_color: 'green',
    students:35, //Number of students who fall into this band
}

So, the goal here is to take the test record, lookup the associated dictionary record, find the proper default band group to use, and then evaluate the score to determine what background color to use.

If you notice, in the dictionary, there are two fields (term_name and test_grade) that can either have a value, or a wildcard (*). I have the aggregation working perfectly if both fields have the wildcard, or both fields have a value - separately. No problem. However, once I try to include mixed case scenarios - it stops working, and I’m not sure why.

This is what I’ve got so far.

db.tests.aggregate([
    { $match: { dictionary_id: "12345", school_year: 2024, test_grade: '4', term_name: 'fall', student_id: "3350235" } }, //filtered down to one student for testing
    {
        $lookup: {
            from: "dictionary",
            localField: "dictionary_id",
            foreignField: "_id",
            as: "ad"
        }
    },
    { $unwind: "$ad" },
    { $unwind: "$ad.default_bands" },
    {
        $match: { //This is the stage in question
            $expr: { //[v,v]
                $and: [
                    { $ne: ["$ad.default_bands.term_name", { $literal: "*" }] },
                    { $ne: ["$ad.default_bands.term_grade", { $literal: "*" }] },
                    { $eq: ["$ad.default_bands.field", { $literal: "percentile" }] }
                ]
            }
        }
    },
    { $unwind: "$ad.default_bands.bands" },
    {
        $match: {
            $expr: {
                $and: [
                    { $gte: ["$test_percentile", { $toDecimal: "$ad.default_bands.bands.low" }] },
                    { $lt: ["$test_percentile", { $toDecimal: "$ad.default_bands.bands.high" }] }
                ]
            }

        }
    },
    {
        $project: {
            _id: 0,
            student_id: "$student_id",
            school_year: "$school_year",
            test_grade: "$test_grade",
            term_name: "$term_name",
            test_score: "$test_score",
            test_percentile: "$test_percentile",
            background_color: "$ad.default_bands.bands.color",
            dictionary_id: "$dictionary_id"
        }
    },
    {
        $group: {
            _id: {
                test_grade: "$test_grade",
                term_name: "$term_name",
                school_year: "$school_year",
                background_color: "$background_color",
                dictionary_id: "$dictionary_id"
            },
            count: { $sum: 1 }
        }
    },
    {
        $project: {
            _id: 0,
            school_year: "$_id.school_year",
            test_grade: "$_id.test_grade",
            term_name: "$_id.term_name",
            background_color: "$_id.background_color",
            students: "$count"
        }
    }
])

This is fantastic and fast. However, in the “stage in question” above, I need to account for all possibilities - including no possibilities.

Combination Structure (in my head) [term_name, grade]
#1 [*,*] //both term and grade are wildcard
#2 [v,*] //term has a value ('fall', etc), and grade is a wildcard
#3 [*,v] //term is wildcard and grade has a value
#4 [v,v] //both term and grade have values
#5 [null,null] //This shouldn't happen - but try to account for it?

I’ve tried to use the $cond operator to set up a series of $if statements to run through the possibilities.

{
        $match: {
            $expr: { 
                $cond: {
                                      
                    if: { //[*,*]
                        $and: [
                            {$eq: ["$ad.default_bands.term_name",{$literal:"*"}]},
                            {$eq: ["$ad.default_bands.term_grade",{$literal:"*"}]},
                            {$eq: ["$ad.default_bands.field",{$literal:"percentile"}]}
                        ]},
                    then: {
                             $and: [
                                  {$eq: [{$literal:"*"}, "$ad.default_bands.term_name"]},
                                  {$eq: [{$literal:"*"}, "$ad.default_bands.grade"]},
                                  {$eq: [{$literal:"percentile"}, "$ad.default_bands.field"]}
                              ]
                          
                        },
                    else:
                        { 
                            if: { //[v,*]
                                $and: [
                                    {$ne: ["$ad.default_bands.term_name",{$literal:"*"}]},
                                    {$eq: ["$ad.default_bands.term_grade",{$literal:"*"}]},
                                    {$eq: ["$ad.default_bands.field",{$literal:"percentile"}]}
                                ]},
                            then: {
                                     $and: [
                                          {$eq: ["$term_name", "$ad.default_bands.term_name"]},
                                          {$eq: [{$literal:"*"}, "$ad.default_bands.grade"]},
                                          {$eq: [{$literal:"percentile"}, "$ad.default_bands.field"]}
                                      ]
                                  
                                },
                            else:
                                { 
                                    if: { //[*,v]
                                        $and: [
                                            {$eq: ["$ad.default_bands.term_name",{$literal:"*"}]},
                                            {$ne: ["$ad.default_bands.term_grade",{$literal:"*"}]},
                                            {$eq: ["$ad.default_bands.field",{$literal:"percentile"}]}
                                        ]},
                                    then: {
                                             $and: [
                                                  {$eq: [{$literal:"*"}, "$ad.default_bands.term_name"]},
                                                  {$eq: ["$test_grade", "$ad.default_bands.grade"]},
                                                  {$eq: [{$literal:"percentile"}, "$ad.default_bands.field"]}
                                              ]
                                          
                                        },
                                    else:
                                        { 
                                            if: { //[v,v]
                                                $and: [
                                                    {$ne: ["$ad.default_bands.term_name",{$literal:"*"}]},
                                                    {$ne: ["$ad.default_bands.term_grade",{$literal:"*"}]},
                                                    {$eq: ["$ad.default_bands.field",{$literal:"percentile"}]}
                                                ]},
                                            then: {
                                                     $and: [
                                                          {$eq: ["$term_name", "$ad.default_bands.term_name"]},
                                                          {$eq: ["$test_grade", "$ad.default_bands.grade"]},
                                                          {$eq: [{$literal:"percentile"}, "$ad.default_bands.field"]}
                                                      ]
                                                  
                                                },
                                            else:
                                                { 
                                                  
                                                } 
                                        } 
                                } 
                        } 
                   }
                } 
        
            }
        },

However, it just skips everything. I’m trying to get to “if x conditions exist, use y set of criteria for the match.”

It’s not working.

I suspect I’m close - or very very far off.

Adding the bands as a subdocument on the test record is not an option. These are flexible and can change. I need them applied at runtime.

Any thoughts or insights would be greatly appreciated!

OK. I figured it out - it’s quite nice.

Facets to the rescue!

{$facet: {
        ww: [
            {$match: {
                $expr:{
                $and: [
                        {$eq:["$ad.default_bands.term_name",{$literal:"*"}]},
                        {$eq:["$ad.default_bands.grade",{$literal:"*"}]},
                        {$eq:["$ad.default_bands.field", {$literal:"percentile"}]}
                    ]
                }
            }}
        ],
        wv: [
            {$match: {
                $expr:{
                $and: [
                        {$eq:["$ad.default_bands.term_name","$term_name"]},
                        {$eq:["$ad.default_bands.grade",{$literal:"*"}]},
                        {$eq:["$ad.default_bands.field", {$literal:"percentile"}]}
                    ]
                }
            }}
        ],
        vw: [
            {$match: {
                $expr:{
                $and: [
                        {$eq:["$ad.default_bands.term_name",{$literal:"*"}]},
                        {$eq:["$ad.default_bands.grade","$test_grade"]},
                        {$eq:["$ad.default_bands.field", {$literal:"percentile"}]}
                    ]
                }
            }}
        ],
        vv: [
            {$match: {
                $expr:{
                $and: [
                        {$eq:["$ad.default_bands.term_name","$term_name"]},
                        {$eq:["$ad.default_bands.grade","$test_grade"]},
                        {$eq:["$ad.default_bands.field", {$literal:"percentile"}]}
                    ]
                }
            }}
        ]
    }},

    {
        $project: {
            result: {
                $setUnion: [ "$ww", "$wv", "$vw", "$ww" ]
            }
        }
    },  
    {$unwind:"$result"},

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.