Filtering a nested array of subdocuments without

I have a collection with a nested array of subdocuments. I’d like to filter out the subdocuments in this nested array that have the field status: REMOVED, and then return the original document unchanged aside from the filtered subdocument array. I have this working in the following aggregate pipeline:

   db.getCollection('inventory').aggregate([
    {
        $match: {
            updatedat: {
                $gte: ISODate("2021-05-10T12:00:00Z"),
                $lte: ISODate("2021-05-11T12:00:00Z")
            }
        }   
    },
    { $unwind: "$sizes" },
    {
        $match: {
            "sizes.status": { $ne: "REMOVED" }
        }
    },
    {
        $group: {
           _id: {
               item: "$item",
               price: "$price",
               updatedat: "$updatedat",
               fees: "$fees"
           },
           sizes: { $push: "$sizes" } 
        }
    },
    {
        $project: { _id: 0, sizes: 1, item: "$_id.item", price: "$_id.price", updatedat: "$_id.updatedat", fees: "$_id.fees" }
    }

Here is an example document in my collection:

    {
        "_id" : ObjectId("60996db251b4a0b97ee405ba"),
        "item" : "A",
        "price" : NumberDecimal("80"),
        "updatedat" : ISODate("2021-05-10T12:00:00.000Z"),
        "fees" : {
            "texttext" : "sold in!",
            "taxes" : [ 
                {
                    "type" : 1.0,
                    "description" : "QC/CA#1234"
                }, 
                {
                    "type" : 2.0,
                    "description" : "QC/CA#2231"
                }
            ]
        },
        "sizes" : [ 
            {
                "size" : "S",
                "status" : "AVAILABLE"
            }, 
            {
                "size" : "M",
                "status" : "REMOVED"
            }, 
            {
                "size" : "L",
                "status" : "AVAILABLE"
            }
        ]
    }

This returns what I need, but managing each root level field by grouping them inside _id, and then projecting them in the final stage is tedious. This is also a test dataset, in reality the documents I’ll be manipulating are much more complex.

I was wondering if there was a better way to handle this than my solution above.

1 Like

I’ve found a better way to handle this, using the $filter operator:

    db.getCollection('inventory').aggregate([
    {
        $match: {
            updatedat: {
                $gte: ISODate("2021-05-10T12:00:00Z"),
                $lte: ISODate("2021-05-11T12:00:00Z")
            }
        }   
    },
    {
        $project: {
            item: 1,
            price: 1,
            sizes: {
                $filter: {
                    input: "$sizes",
                    as: "size",
                    cond: { 
                            $ne: [ "$$size.status", "REMOVED" ] 
                    }
                }
            }
        }
    },
    {
        $match: {
            $nor: [
                { sizes: { $exists: false } },
                { sizes: { $size: 0 } }
            ]
        }
    }
])

I am now wondering if there is a way to prject all root level fields in my $project stage. instead of writting

item: 1,
price: 1,

As I mentioned above, the data I will be actually using is much more complex, and this will be a very large list.

Yes, you can. Use the $addFields stage instead of the $project:

Adds new fields to documents. $addFields outputs documents that contain all existing fields from the input documents and newly added fields.

1 Like

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.