How complex should the aggregate query be?

O have 3 Collections; Task, Pollutant, UOM. I aggregate Task then $lookup Pollutant, then $lookup UOM. This is the query that I have;

db.getCollection("Task")
.aggregate([{
    $match: {
        _id: ObjectId('6220257fd2c66616112409d2')
    }
}, {
    "$unwind": {
        "path": "$actions",
        "preserveNullAndEmptyArrays": True
    }
}, {
    "$lookup": {
        "from": "Pollutant",
        "localField": "actions.custom_record_data._id",
        "foreignField": "_id",
        "pipeline": [{
            "$project": {
                "name": 1,
                "category": 1,
                "selected_unit": 1,
                "uom": 1,
                "variables": {
                    "$ifNull": ["$variables", []]
                },
                "build_equation": 1
            }
        }],
        "as": "_custom_record_data"
    }
}, {
    "$addFields": {
        "actions.custom_record_data": "$_custom_record_data"
    }
}, {
    "$project": {
        "_custom_record_data": 0
    }
}, {
    "$addFields": {
        "vars": {
            "$reduce": {
                "input": "$actions.custom_record_data",
                "initialValue": [],
                "in": {
                    "$concatArrays": ["$$value", "$$this.variables.uom._id"]
                }
            }
        }
    }
}, {
    "$group": {
        "_id": "$_id",
        "root": {
            "$first": "$$ROOT"
        },
        "actions": {
            "$push": "$actions"
        }
    }
}, {
    "$addFields": {
        "root.actions": "$actions"
    }
}, {
    "$replaceWith": "$root"
}, {
    "$addFields": {
        "pollutant_ids": {
            "$reduce": {
                "input": "$actions",
                "initialValue": [],
                "in": {
                    "$concatArrays": ["$$value", "$$this.regulatory_limit.name._id"]
                }
            }
        },
        "uom_ids": {
            "$reduce": {
                "input": "$actions",
                "initialValue": [],
                "in": {
                    "$concatArrays": ["$$value", "$$this.custom_record_data.uom._id", "$vars"]
                }
            }
        }
    }
}, {
    "$lookup": {
        "from": "Pollutant",
        "localField": "pollutant_ids",
        "foreignField": "_id",
        "pipeline": [{
            "$project": {
                "name": 1,
                "category": 1
            }
        }],
        "as": "pollutant_info"
    }
}, {
    "$lookup": {
        "from": "UOM",
        "localField": "uom_ids",
        "foreignField": "_id",
        "pipeline": [{
            "$project": {
                "name": 1,
                "units": 1
            }
        }],
        "as": "uom_info"
    }
}, {
    "$addFields": {
        "actions": {
            "$map": {
                "input": "$actions",
                "in": {
                    "$mergeObjects": ["$$this", {
                        "regulatory_limit": {
                            "$map": {
                                "input": "$$this.regulatory_limit",
                                "in": {
                                    "$mergeObjects": ["$$this", {
                                        "name": {
                                            "$arrayElemAt": ["$pollutant_info", {
                                                "$indexOfArray": ["$pollutant_ids", "$$this.name._id"]
                                            }]
                                        }
                                    }]
                                }
                            }
                        },
                        "custom_record_data": {
                            "$map": {
                                "input": "$$this.custom_record_data",
                                "in": {
                                    "$mergeObjects": ["$$this", {
                                        "uom": {
                                            "$arrayElemAt": ["$uom_info", {
                                                "$indexOfArray": ["$uom_ids", "$$this.uom._id"]
                                            }]
                                        },
                                        "variables": {
                                            "$map": {
                                                "input": "$$this.variables",
                                                "in": {
                                                    "$mergeObjects": ["$$this", {
                                                        "uom": {
                                                            "$arrayElemAt": ["$uom_info", {
                                                                "$indexOfArray": ["$uom_ids", "$$this.uom._id"]
                                                            }]
                                                        }
                                                    }]
                                                }
                                            }
                                        }
                                    }]
                                }
                            }
                        }
                    }]
                }
            }
        }
    }
}])

Is this acceptable query or is it a complex and I should change my data model?

Hi @Aldo_Osm,

Is this acceptable query or is it a complex and I should change my data model?

It’s a difficult question to answer with limited context and information regarding your use case and environment. Additionally, there is no information on the document(s) within each collection.

In saying so, from an initial glance, it does appear to look like there are some stages which can be removed and combined together. For example, the $addFields$project$addFields stages could maybe be combined into a single $addFields stage, but without an example input & output documents, it’s hard to tell for sure.

However, please provide the following information:

  1. Further context regarding the query and the use case(s)
  2. MongoDB Version in use
  3. Sample document(s) from each of the collections mentioned
  4. The current output from the query
  5. Clarify what you mean by “acceptable” and “complex” in this context. This is to help understand if your main concern is the length of the query itself or the performance etc.

Please redact any personal or sensitive information from your answers to the above request before posting here

Regards,
Jason

1 Like