Changed in version 3.4.
Definition
$graphLookup
Performs a recursive search on a collection, with options for restricting the search by recursion depth and query filter.
The $graphLookup search process is summarized below:
Input documents flow into the $graphLookup stage of an aggregation operation.
$graphLookup targets the search to the collection designated by the from parameter (see below for full list of search parameters).
For each input document, the search begins with the value designated by startWith.
$graphLookup matches the startWith value against the field designated by connectToField in other documents in the from collection.
For each matching document, $graphLookup takes the value of the connectFromField and checks every document in the from collection for a matching connectToField value. For each match, $graphLookup adds the matching document in the from collection to an array field named by the as parameter.
This step continues recursively until no more matching documents are found, or until the operation reaches a recursion depth specified by the maxDepth parameter. $graphLookup then appends the array field to the input document. $graphLookup returns results after completing its search on all input documents.
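The search steps above can be sketched in plain JavaScript, outside the database. This is a minimal in-memory illustration, not the server's implementation: the function name graphLookupSketch, passing startWith as a function, and deduplicating matched documents by _id are all assumptions made for the sketch. The sample data is the employees collection used in the examples on this page.

```javascript
// In-memory sketch of the $graphLookup search steps (not MongoDB's implementation).
// Assumption: matched documents are deduplicated by their _id value.
const toArray = (v) =>
  v === undefined || v === null ? [] : Array.isArray(v) ? v : [v];

function graphLookupSketch(inputDoc, from, opts) {
  const { startWith, connectFromField, connectToField, as, maxDepth = Infinity } = opts;
  const found = new Map();                     // _id -> matched document
  let frontier = toArray(startWith(inputDoc)); // values to match at the current depth

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const doc of from) {
      if (found.has(doc._id)) continue;        // already traversed
      // A document matches when any connectToField value is in the frontier.
      const hit = toArray(doc[connectToField]).some((v) => frontier.includes(v));
      if (!hit) continue;
      found.set(doc._id, doc);
      // Its connectFromField values are followed at the next depth.
      next.push(...toArray(doc[connectFromField]));
    }
    frontier = next;
  }
  // Append the array of traversed documents to a copy of the input document.
  return { ...inputDoc, [as]: [...found.values()] };
}

const employees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 4, name: "Andrew", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
  { _id: 6, name: "Dan", reportsTo: "Andrew" },
];

const opts = {
  startWith: (doc) => doc.reportsTo,
  connectFromField: "reportsTo",
  connectToField: "name",
  as: "reportingHierarchy",
};

const asya = employees.find((e) => e.name === "Asya");
const result = graphLookupSketch(asya, employees, opts);
console.log(result.reportingHierarchy.map((d) => d.name).sort());
```

The sketch walks the graph breadth first, one depth level per loop iteration, which makes the role of maxDepth easy to see. The order of the resulting array is an artifact of the sketch; as noted for the as field, the stage itself does not guarantee any order.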
$graphLookup has the following prototype form:
{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}
$graphLookup takes a document with the following fields:
Field | Description
from | Target collection for the $graphLookup operation to search, recursively matching the connectFromField to the connectToField. The from collection cannot be sharded and must be in the same database as any other collections used in the operation. For information, see Sharded Collections.
startWith | Expression that specifies the value of the connectFromField with which to start the recursive search. Optionally, startWith may be an array of values, each of which is individually followed through the traversal process.
connectFromField | Field name whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is individually followed through the traversal process.
connectToField | Field name in other documents against which to match the value of the field specified by the connectFromField parameter.
as | Name of the array field added to each output document. Contains the documents traversed in the $graphLookup stage to reach the document. Note: Documents returned in the as field are not guaranteed to be in any order.
maxDepth | Optional. Non-negative integral number specifying the maximum recursion depth.
depthField | Optional. Name of the field to add to each traversed document in the search path. The value of this field is the recursion depth for the document, represented as a NumberLong. Recursion depth value starts at zero, so the first lookup corresponds to zero depth.
restrictSearchWithMatch | Optional. A document specifying additional conditions for the recursive search. The syntax is identical to query filter syntax.
Note
You cannot use any aggregation expression in this filter. For example, a query document such as { lastName: { $ne: "$lastName" } } will not work in this context to find documents in which the lastName value is different from the lastName value of the input document, because "$lastName" will act as a string literal, not a field path.
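To see why the filter in the note does not compare against the input document, the following plain JavaScript sketch mimics a $ne condition whose operand is treated as a literal value. The helper neMatches and the sample documents are hypothetical; this illustrates the behavior described above and is not MongoDB's query engine.

```javascript
// Hypothetical helper: evaluates { <field>: { $ne: <operand> } } with the
// operand taken as a plain value, as in a query filter.
function neMatches(doc, field, operand) {
  return doc[field] !== operand;
}

// Hypothetical input document and candidate documents.
const inputDoc = { name: "Carole", lastName: "Hale" };
const candidates = [
  { _id: 1, lastName: "Hale" },
  { _id: 2, lastName: "Soto" },
];

// What the filter actually does: "$lastName" is just a string, so no
// candidate is excluded unless its lastName is literally "$lastName".
const kept = candidates.filter((doc) => neMatches(doc, "lastName", "$lastName"));

// What the author of such a filter probably intended: compare against the
// input document's own lastName value.
const intended = candidates.filter((doc) => neMatches(doc, "lastName", inputDoc.lastName));

console.log(kept.map((d) => d._id), intended.map((d) => d._id));
```

In the sketch, the literal comparison keeps both candidates, while the intended comparison would keep only the document whose lastName differs from the input document's.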
Considerations
Sharded Collections
The collection specified in from cannot be
sharded. However, the collection on which you run the
aggregate() method can be sharded. That is, in
the following:
db.collection.aggregate([ { $graphLookup: { from: "fromCollection", ... } } ])
The collection can be sharded.
The fromCollection cannot be sharded.
To join multiple sharded collections, consider:
Modifying client applications to perform manual lookups instead of using the $graphLookup aggregation stage.
If possible, using an embedded data model that removes the need to join collections.
Max Depth
Setting the maxDepth field to 0 is equivalent to a
non-recursive $graphLookup search stage.
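As a rough illustration of this equivalence, the sketch below runs a simplified in-memory traversal with a depth limit of 0 and compares it with a single, non-recursive match. The lookupOnce and traverse helpers are hypothetical and only approximate the stage's behavior; the sample data mirrors the airports collection from the examples on this page.

```javascript
const airports = [
  { _id: 0, airport: "JFK", connects: ["BOS", "ORD"] },
  { _id: 1, airport: "BOS", connects: ["JFK", "PWM"] },
  { _id: 2, airport: "ORD", connects: ["JFK"] },
  { _id: 3, airport: "PWM", connects: ["BOS", "LHR"] },
  { _id: 4, airport: "LHR", connects: ["PWM"] },
];

// Single, non-recursive match of the start values against connectToField.
function lookupOnce(from, startValues, connectToField) {
  return from.filter((doc) => startValues.includes(doc[connectToField]));
}

// Simplified recursive traversal, limited by maxDepth (hypothetical helper).
function traverse(from, startValues, connectFromField, connectToField, maxDepth) {
  const found = new Map(); // _id -> matched document
  let frontier = startValues;
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const matches = lookupOnce(from, frontier, connectToField)
      .filter((doc) => !found.has(doc._id));
    matches.forEach((doc) => found.set(doc._id, doc));
    // Values of connectFromField become the start values of the next depth.
    frontier = matches.flatMap((doc) => doc[connectFromField]);
  }
  return [...found.values()];
}

const once = lookupOnce(airports, ["JFK"], "airport");
const depthZero = traverse(airports, ["JFK"], "connects", "airport", 0);
const depthOne = traverse(airports, ["JFK"], "connects", "airport", 1);

console.log(depthZero.map((d) => d.airport)); // same documents as `once`
console.log(depthOne.map((d) => d.airport));  // also includes direct connections
```

With a limit of 0 the loop body runs exactly once, so only the documents that match the start value are returned. Raising the limit to 1 adds the documents reachable through one connectFromField hop.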
Memory
The $graphLookup stage must stay within the 100 megabyte
memory limit. If allowDiskUse: true is specified for the
aggregate() operation, the
$graphLookup stage ignores the option. If there are other
stages in the aggregate() operation,
the allowDiskUse: true option is in effect for these other stages.
See aggregation pipeline limitations for more information.
Views and Collation
If performing an aggregation that involves multiple views, such as
with $lookup or $graphLookup, the views must
have the same collation.
Examples
Within a Single Collection
A collection named employees has the following documents:
{ "_id" : 1, "name" : "Dev" }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }
The following $graphLookup operation recursively matches
on the reportsTo and name fields in the employees
collection, returning the reporting hierarchy for each person:
db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )
The operation returns the following:
{ "_id" : 1, "name" : "Dev", "reportingHierarchy" : [ ] }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" }
] }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" },
   { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
] }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" },
   { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
] }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" },
   { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
   { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
] }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" },
   { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
   { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
] }
The following table provides a traversal path for the
document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:
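Start value | The reportsTo value of the document: "Ron"
Depth 0 | { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
Depth 1 | { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
Depth 2 | { "_id" : 1, "name" : "Dev" }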
The output generates the hierarchy
Asya -> Ron -> Eliot -> Dev.
Across Multiple Collections
Like $lookup, $graphLookup can access
another collection in the same database.
For example, create a database with two collections:
An airports collection with the following documents:
db.airports.insertMany( [
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
] )
A travelers collection with the following documents:
db.travelers.insertMany( [
   { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" },
   { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" },
   { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
] )
For each document in the travelers collection, the following
aggregation operation looks up the nearestAirport value in the
airports collection and recursively matches the connects
field to the airport field. The operation specifies a maximum
recursion depth of 2.
db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )
The operation returns the following results:
{ "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK", "destinations" : [
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(2) },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(1) },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(1) },
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(0) }
] }
{ "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK", "destinations" : [
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(2) },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(1) },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(1) },
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(0) }
] }
{ "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS", "destinations" : [
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(2) },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(1) },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ], "numConnections" : NumberLong(2) },
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(1) },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(0) }
] }
The following table provides a traversal path for the recursive
search, up to depth 2, where the starting airport is JFK:
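Start value | The nearestAirport value from the travelers collection: "JFK"
Depth 0 | { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
Depth 1 | { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }, { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
Depth 2 | { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }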
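The depths reported in numConnections can be reproduced with a breadth-first walk over the sample data. The sketch below is an in-memory approximation under stated assumptions: traverseWithDepth is a hypothetical helper, documents are deduplicated by _id, and the depth is a plain JavaScript number rather than the NumberLong the stage returns.

```javascript
const airports = [
  { _id: 0, airport: "JFK", connects: ["BOS", "ORD"] },
  { _id: 1, airport: "BOS", connects: ["JFK", "PWM"] },
  { _id: 2, airport: "ORD", connects: ["JFK"] },
  { _id: 3, airport: "PWM", connects: ["BOS", "LHR"] },
  { _id: 4, airport: "LHR", connects: ["PWM"] },
];

// Breadth-first sketch that records the depth at which each document is
// first reached (hypothetical helper, not MongoDB's implementation).
function traverseWithDepth(from, startValue, opts) {
  const { connectFromField, connectToField, maxDepth, depthField } = opts;
  const found = new Map(); // _id -> document annotated with its depth
  let frontier = [startValue];
  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    for (const doc of from) {
      if (found.has(doc._id) || !frontier.includes(doc[connectToField])) continue;
      found.set(doc._id, { ...doc, [depthField]: depth });
      next.push(...doc[connectFromField]);
    }
    frontier = next;
  }
  return [...found.values()];
}

const options = {
  connectFromField: "connects",
  connectToField: "airport",
  maxDepth: 2,
  depthField: "numConnections",
};

const fromJFK = traverseWithDepth(airports, "JFK", options);
const fromBOS = traverseWithDepth(airports, "BOS", options);

for (const d of fromJFK) console.log(d.airport, d.numConnections);
```

Starting from JFK, the walk stops before reaching LHR because that airport is three lookups away and maxDepth is 2. Starting from BOS, LHR is reached at depth 2, which agrees with the results listed for the traveler whose nearest airport is BOS.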
With a Query Filter
The following example uses a collection with a set
of documents containing names of people along with arrays of their
friends and their hobbies. An aggregation operation finds one
particular person and traverses her network of connections to find
people who list golf among their hobbies.
A collection named people contains the following documents:
{ "_id" : 1, "name" : "Tanya Jordan", "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ], "hobbies" : [ "tennis", "unicycling", "golf" ] }
{ "_id" : 2, "name" : "Carole Hale", "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ], "hobbies" : [ "archery", "golf", "woodworking" ] }
{ "_id" : 3, "name" : "Terry Hawkins", "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ], "hobbies" : [ "knitting", "frisbee" ] }
{ "_id" : 4, "name" : "Joseph Dennis", "friends" : [ "Angelo Ward", "Carole Hale" ], "hobbies" : [ "tennis", "golf", "topiary" ] }
{ "_id" : 5, "name" : "Angelo Ward", "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ], "hobbies" : [ "travel", "ceramics", "golf" ] }
{ "_id" : 6, "name" : "Shirley Soto", "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ], "hobbies" : [ "frisbee", "set theory" ] }
The following aggregation operation uses three stages:
$match matches on documents with a name field containing the string "Tanya Jordan". Returns one output document.
$graphLookup connects the output document's friends field with the name field of other documents in the collection to traverse Tanya Jordan's network of connections. This stage uses the restrictSearchWithMatch parameter to find only documents in which the hobbies array contains golf. Returns one output document.
$project shapes the output document. The names listed in connections who play golf are taken from the name field of the documents listed in the input document's golfers array.
db.people.aggregate( [
   { $match: { "name": "Tanya Jordan" } },
   {
      $graphLookup: {
         from: "people",
         startWith: "$friends",
         connectFromField: "friends",
         connectToField: "name",
         as: "golfers",
         restrictSearchWithMatch: { "hobbies" : "golf" }
      }
   },
   {
      $project: {
         "name": 1,
         "friends": 1,
         "connections who play golf": "$golfers.name"
      }
   }
] )
The operation returns the following document:
{
   "_id" : 1,
   "name" : "Tanya Jordan",
   "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
   "connections who play golf" : [ "Joseph Dennis", "Tanya Jordan", "Angelo Ward", "Carole Hale" ]
}
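The effect of restrictSearchWithMatch in this example can be approximated in memory: documents that fail the filter are neither returned nor used to continue the traversal. In the sketch below, a predicate function stands in for the query filter document, and the names traverseRestricted and playsGolf are hypothetical. It is an illustration under those assumptions, not the server's implementation.

```javascript
const people = [
  { _id: 1, name: "Tanya Jordan", friends: ["Shirley Soto", "Terry Hawkins", "Carole Hale"], hobbies: ["tennis", "unicycling", "golf"] },
  { _id: 2, name: "Carole Hale", friends: ["Joseph Dennis", "Tanya Jordan", "Terry Hawkins"], hobbies: ["archery", "golf", "woodworking"] },
  { _id: 3, name: "Terry Hawkins", friends: ["Tanya Jordan", "Carole Hale", "Angelo Ward"], hobbies: ["knitting", "frisbee"] },
  { _id: 4, name: "Joseph Dennis", friends: ["Angelo Ward", "Carole Hale"], hobbies: ["tennis", "golf", "topiary"] },
  { _id: 5, name: "Angelo Ward", friends: ["Terry Hawkins", "Shirley Soto", "Joseph Dennis"], hobbies: ["travel", "ceramics", "golf"] },
  { _id: 6, name: "Shirley Soto", friends: ["Angelo Ward", "Tanya Jordan", "Carole Hale"], hobbies: ["frisbee", "set theory"] },
];

// Stand-in for restrictSearchWithMatch: { "hobbies" : "golf" }.
const playsGolf = (doc) => doc.hobbies.includes("golf");

// Traversal in which only documents passing `restrict` can match, so a
// filtered-out document never contributes its friends to the search.
function traverseRestricted(from, startValues, connectFromField, connectToField, restrict) {
  const found = new Map(); // _id -> matched document
  let frontier = startValues;
  while (frontier.length > 0) {
    const next = [];
    for (const doc of from) {
      if (found.has(doc._id)) continue;
      if (!restrict(doc) || !frontier.includes(doc[connectToField])) continue;
      found.set(doc._id, doc);
      next.push(...doc[connectFromField]);
    }
    frontier = next;
  }
  return [...found.values()];
}

const tanya = people.find((p) => p.name === "Tanya Jordan");
const golfers = traverseRestricted(people, tanya.friends, "friends", "name", playsGolf);
console.log(golfers.map((p) => p.name).sort());
```

The sketch reaches the same four people as the operation above. Tanya Jordan appears in her own result because the traversal returns to her through Carole Hale and she passes the filter.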