Problem with a graphLookup

Hi everyone, I’m currently working on an interesting query.
In the database I am using the classic parent-child relationship like this:
{“_id”: ObjectId(…), “parent_id”: str, “name”: “elem1”}, {“_id”: ObjectId(…), “parent_id”: “elem1_id”, “name”: “elem2”}, {“_id”: ObjectId(…), “parent_id”: “elem2_id”, “name”: “elem3”}
The WBS type indicates the root element, the WBEs are intermediate elements and the WPs are leaf elements.

My query is currently structured like this:

db.getCollection("mf-budgeting").aggregate([
    {
        "$match": {
            "type": {"$eq": "WBE"}, 
            "root_id": "671220dd5dc4694e9edee501"
        }
    },
    {
        "$addFields": {
            "id": {"$toString": "$_id"}  // Convertiamo l'_id in stringa se parent_id è stringa
        }
    },
    {
        "$graphLookup": {
            "from": "mf-budgeting",
            "startWith": "$id",               // Inizia con l'id (padre)
            "connectFromField": "id",         // Collega l'_id del nodo padre
            "connectToField": "parent_id",    // Con il parent_id dei figli
            "as": "subtree",                  // Il risultato verrà messo in "subtree"
            "maxDepth": 1000,                 // Imposta un limite di profondità adeguato
            "depthField": "depth"             // Aggiungi il campo "depth" per tracciare la profondità
        }
    },
    {
        "$match": {
            "$expr": {
                "$gt": [{"$size": "$subtree"}, 0]  // Filtro per includere solo documenti con un sottoalbero
            }
        }
    },
    {
        "$project": {
            "type": 1,
            "root_id": 1,
            "name": "$anagraphic_section.name",
            "subtree": 1,
            "depth": 1,
            "id": 1,
            "parent_id": 1
        }
    }
])

The problem is that I expected that in the result I would have something like this:
{“_id”: ObjectId(…), “parent_id”: str, “name”: “elem1”, “subtree”: [{“_id”: ObjectId(…), “parent_id”: “elem1_id”, “name”: “elem2”}, {“_id”: ObjectId(…), “parent_id”: “elem2_id”, “name”: “elem3”}]

and so on, where am I going wrong?

Please format your sample documents and expected results as you did for your code.

You should not be storing objectid as strings. Strings take more space are slower to compare and force you to convert every time.

After looking at that more deeply I think the issue stems from

You do add the id field to the original documents, the one that goes thru the $match and $addFields stages. However, the documents lookup-ed by $graphLookup (except the start of the tree which uses startWidth:) do not have the connectFromField: id. They don’t because they do not go thru the $addField. The documents lookup-ed from: mf-budgeting do not have the field id.

If the field parent_id would store the id as an objectid rather than a string, the $graphLookup would work with connectFromField: “_id”.

It looks like it is one more reason to store objectid as objectid rather than their string representation.

1 Like

I will definitely save everything as ObjectId in the future. If I remove the type condition in $match, all documents should have the field created with $addFields, right? Or does $graphLookup in the recursion lose the changes made in the pipeline?

Thanks for the advice, next time I use only ObjectId.
My documents are something like:

{“_id”: ObjectId(…),
“parent_id”: str,
“name”: “elem1”}, 
{“_id”: ObjectId(…),
“parent_id”: “elem1_id”,
“name”: “elem2”}, 
{“_id”: ObjectId(…),
“parent_id”: “elem2_id”,
“name”: “elem3”}

And i want something:

{"_id": ObjectId(...),
"name": "elem1",
"subtree": {
    [
     {“_id”: ObjectId(…), “parent_id”: “elem1_id”, “name”: “elem2”},
     {“_id”: ObjectId(…), “parent_id”: “elem2_id”, “name”: “elem3”}
    ]
}

``
I don't understand why the $graphLookup doesn't go recursively and I receive only the first-level children.

It does. But the documents recursively processed by $graphLookup do not have the id field that you create in $addFields. Said otherwise:

Your pipeline would work if parent_id was an ObjectId and connectFromField: was _id rather than id.

As a temporarily hack until you permanently convert parent_id from string to ObjectId, you could create a view of mf-budgeting that does the $addField and do $graphLookup from the view.

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.