graphLookup missing documents that appear at multiple levels

For anyone else who stumbles upon this: I created a much smaller example that reproduces the behavior, which can be observed @ Mongo playground

But now I’m thinking it’s my understanding of $graphLookup that was incorrect. As far as I can tell, $graphLookup follows ‘connectFromField’ to ‘connectToField’ only once for each connection, so a document that is reachable by more than one path is still returned a single time. Now that I’ve returned to this, I think that makes sense.

The problem is that there could be duplicate children at different depths of the graph. When that’s the case in the example provided, information is omitted. Can anyone provide an explanation for why $graphLookup would not provide an output that includes all depths even if repeated?
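To make the difference concrete, here is a minimal in-memory sketch (no MongoDB involved) that contrasts a breadth-first walk visiting each document once with a plain recursive walk. The toy `DOCS` list and the function names `bfs_visit_once` and `recursive_find` are made up for illustration; the visit-once walk is only my approximation of how $graphLookup appears to behave, not its actual implementation.

```python
# Hypothetical toy data: each dict mimics one document (one parent -> child edge).
DOCS = [
    {"_id": 1, "name": "R", "name_child": "A"},
    {"_id": 2, "name": "R", "name_child": "B"},
    {"_id": 3, "name": "A", "name_child": "C"},
    {"_id": 4, "name": "B", "name_child": "X"},
    {"_id": 5, "name": "X", "name_child": "C"},
    {"_id": 6, "name": "C", "name_child": "D"},
]


def bfs_visit_once(docs, start_name):
    """Breadth-first walk that emits every document at most once (assumed model)."""
    seen = set()
    rows = []
    frontier = {start_name}
    level = 0
    while frontier:
        # Documents whose `name` matches the frontier and that were not emitted yet.
        matched = [d for d in docs if d["name"] in frontier and d["_id"] not in seen]
        for d in matched:
            seen.add(d["_id"])
            rows.append((level, d["name"], d["name_child"]))
        frontier = {d["name_child"] for d in matched}
        level += 1
    return rows


def recursive_find(docs, start_name, level=0):
    """Depth-first walk that re-emits a document for every path that reaches it.

    Assumes the data has no cycles; on cyclic data this would never terminate.
    """
    rows = []
    for d in docs:
        if d["name"] == start_name:
            rows.append((level, d["name"], d["name_child"]))
            rows.extend(recursive_find(docs, d["name_child"], level + 1))
    return rows


for row in bfs_visit_once(DOCS, "R"):
    print("visit-once", row)
for row in recursive_find(DOCS, "R"):
    print("recursive ", row)
```

In this toy graph, C is reachable through both A and X+B, so the recursive walk lists the C -> D document twice (at two different depths) while the visit-once walk lists it a single time. That is the same shape as the missing record described here.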

Referencing a couple of SO posts on the issue:

[mine] - aggregation framework - MongoDB graphLookup results different from manual recursive find() queries - Stack Overflow
<<<< ORIGINAL POST BELOW >>>>

Looking for help determining why I see different results for $graphLookup compared to recursive find() queries.

I tested this with MongoDB 5.0.15, 6.0.11, and 7.0.2 and the results are the same.

I’ve created a sample dataset that displays the problem. I tried to use mongoplayground, but my document count exceeded its limit of 100 documents per collection.

There is a github repo @ GitHub - vendop/mdb_graphlookup that contains the data and python code for running the comparison.

In short, here is my graphLookup query:

coll.aggregate([
        {
            '$match':
                {
                    'name': "BXHzrbVjlJ",
                },
        },
        {
            '$graphLookup':
                {
                    'from': "graphlookup",
                    'startWith': "$name_child",
                    'connectFromField': "name_child",
                    'connectToField': "name",
                    'as': "children",
                    'depthField': "level",
                    'restrictSearchWithMatch': {
                        'name_finished_good': "TL6100NCB",
                    },
                },
        },
    ])

And here is the Python code that produces the output I expected (there is one record missing from the graphLookup result at depth level 10).

def children(docs, level=0):
    for doc in docs:
        # Print the depth alongside each parent/child pair
        print('{} {} {}'.format(level, doc['name'], doc['name_child']))
        children(coll.find({'name': doc['name_child']}), level + 1)

children(coll.find({'name': 'BXHzrbVjlJ'}))

  • In the recursive find method, REQaEjncVH shows up as a child both times cHAJOAjUij is referenced.

  • In the graphLookup method, REQaEjncVH shows up as a child only once for cHAJOAjUij. cHAJOAjUij correctly shows up twice.

I would expect $graphLookup and recursive find() to return the same information. I would much prefer to use $graphLookup because it performs better than issuing recursive find() queries.
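One possible way to keep the single $graphLookup round trip and still get the repeated rows is to expand the deduplicated result on the client. The sketch below assumes the `children` array has already been fetched and contains each document once; the helper name `expand_paths` and the sample data are hypothetical, and the approach assumes the graph has no cycles.

```python
from collections import defaultdict


def expand_paths(children, start_name):
    """Rebuild one row per path from a deduplicated list of edge documents.

    `children` stands in for the array produced by $graphLookup (each document once).
    `start_name` stands in for the matched document's `name_child` value.
    Assumes an acyclic graph; a cycle would recurse without end.
    """
    by_name = defaultdict(list)
    for doc in children:
        by_name[doc["name"]].append(doc)

    rows = []

    def walk(name, level):
        # Re-visit a document every time some path leads to it.
        for doc in by_name.get(name, []):
            rows.append((level, doc["name"], doc["name_child"]))
            walk(doc["name_child"], level + 1)

    walk(start_name, 0)
    return rows


# Hypothetical deduplicated result: C -> D is stored once but reachable twice.
sample_children = [
    {"name": "A", "name_child": "C"},
    {"name": "A", "name_child": "X"},
    {"name": "X", "name_child": "C"},
    {"name": "C", "name_child": "D"},
]

for row in expand_paths(sample_children, "A"):
    print(row)
```

The levels here are recomputed per path, so they will not always equal the `level` value stored by `depthField`, which records a single depth per document.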