$graphLookup (aggregation)
Definition
$graphLookup
Changed in version 5.1.
Performs a recursive search on a collection, with options for restricting the search by recursion depth and query filter.
The $graphLookup search process is summarized below:
Input documents flow into the $graphLookup stage of an aggregation operation.
$graphLookup targets the search to the collection designated by the from parameter (see below for the full list of search parameters).
For each input document, the search begins with the value designated by startWith.
$graphLookup matches the startWith value against the field designated by connectToField in other documents in the from collection.
For each matching document, $graphLookup takes the value of the connectFromField and checks every document in the from collection for a matching connectToField value. For each match, $graphLookup adds the matching document in the from collection to an array field named by the as parameter.
This step continues recursively until no more matching documents are found, or until the operation reaches a recursion depth specified by the maxDepth parameter.
$graphLookup then appends the array field to the input document.
$graphLookup returns results after completing its search on all input documents.
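The search process summarized above can be sketched in plain JavaScript over in-memory arrays. This is a minimal illustration of a breadth-first traversal, not MongoDB's implementation: graphLookupSketch and toArray are hypothetical names, startWith is simplified to a plain field name instead of an expression, and the sample documents are a small reporting hierarchy.

```javascript
// Hypothetical helper: treat a missing value as [] and a scalar as [scalar].
function toArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// In-memory sketch of the traversal (assumption: every document has a unique _id).
function graphLookupSketch(inputDocs, fromDocs, spec) {
  const { startWith, connectFromField, connectToField, as, maxDepth = Infinity } = spec;
  return inputDocs.map((input) => {
    const found = new Map();       // _id -> matched document
    const seenValues = new Set();  // values that were already searched for
    let frontier = toArray(input[startWith]);
    for (let depth = 0; frontier.length > 0 && depth <= maxDepth; depth++) {
      const next = [];
      for (const value of frontier) {
        if (seenValues.has(value)) continue;
        seenValues.add(value);
        for (const doc of fromDocs) {
          if (!toArray(doc[connectToField]).includes(value)) continue;
          if (found.has(doc._id)) continue;
          found.set(doc._id, doc);
          // The connectFromField values of each match feed the next round.
          next.push(...toArray(doc[connectFromField]));
        }
      }
      frontier = next;
    }
    // Append the array field to a copy of the input document.
    return { ...input, [as]: [...found.values()] };
  });
}

const employees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
];

const result = graphLookupSketch(employees, employees, {
  startWith: "reportsTo",
  connectFromField: "reportsTo",
  connectToField: "name",
  as: "reportingHierarchy",
});

// The sketch happens to return matches in discovery order; the real stage
// makes no ordering guarantee.
console.log(result[3].reportingHierarchy.map((d) => d.name)); // [ 'Ron', 'Eliot', 'Dev' ]
```

The two Set and Map structures keep the sketch from revisiting a value or a document, which is what lets the loop stop on cyclic data.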
$graphLookup has the following prototype form:
{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}
$graphLookup takes a document with the following fields:
Field | Description
from | Target collection for the $graphLookup operation to search, recursively matching the connectFromField to the connectToField. The from collection must be in the same database as any other collections used in the operation. Starting in MongoDB 5.1, the collection specified in the from parameter can be sharded.
startWith | Expression that specifies the value of the connectFromField with which to start the recursive search. Optionally, startWith may be an array of values, each of which is individually followed through the traversal process.
connectFromField | Field name whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is individually followed through the traversal process.
connectToField | Field name in other documents against which to match the value of the field specified by the connectFromField parameter.
as | Name of the array field added to each output document. Contains the documents traversed in the $graphLookup stage to reach the document. Note: Documents returned in the as field are not guaranteed to be in any order.
maxDepth | Optional. Non-negative integral number specifying the maximum recursion depth.
depthField | Optional. Name of the field to add to each traversed document in the search path. The value of this field is the recursion depth for the document, represented as a NumberLong. The recursion depth starts at zero, so the first lookup corresponds to depth zero.
restrictSearchWithMatch | Optional. A document specifying additional conditions for the recursive search. The syntax is identical to query filter syntax.
Note
You cannot use any aggregation expression in this filter. For example, a query document such as { lastName: { $ne: "$lastName" } } will not work in this context to find documents in which the lastName value is different from the lastName value of the input document, because "$lastName" will act as a string literal, not a field path.
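The effect of the string literal can be imitated in plain JavaScript. The snippet below is only an illustration under assumed data: the candidates array and its values are hypothetical, and the comparison is written by hand instead of being evaluated by MongoDB.

```javascript
// Hypothetical documents in the `from` collection.
const candidates = [
  { _id: 1, lastName: "Soto" },
  { _id: 2, lastName: "Hale" },
  { _id: 3, lastName: "$lastName" },
];

// The filter from the note above.
const filter = { lastName: { $ne: "$lastName" } };

// In a query filter the operand is the plain string "$lastName";
// nothing is read from the input document.
const operand = filter.lastName.$ne;
const kept = candidates.filter((doc) => doc.lastName !== operand);

console.log(kept.map((doc) => doc._id)); // [ 1, 2 ]
```

Only the document whose lastName is literally "$lastName" is excluded, whatever the input document's lastName happens to be.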
Considerations
Sharded Collections
Starting in MongoDB 5.1, you can specify sharded collections in the from parameter of $graphLookup stages.
You cannot use the $graphLookup stage within a transaction while targeting a sharded collection.
Max Depth
Setting the maxDepth field to 0 is equivalent to a non-recursive $graphLookup search stage.
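As a rough illustration of that equivalence, the following plain JavaScript sketch compares a depth-limited traversal with a single equality match. The traverse helper is hypothetical and works on an in-memory array; it is not how the server evaluates the stage.

```javascript
// Sample documents shaped like a small reporting hierarchy (assumed data).
const employees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
];

const asArray = (v) => (v == null ? [] : Array.isArray(v) ? v : [v]);

// Hypothetical helper: breadth-first traversal limited by maxDepth.
function traverse(startValues, fromDocs, connectFromField, connectToField, maxDepth = Infinity) {
  const found = new Map(); // _id -> document
  let frontier = asArray(startValues);
  for (let depth = 0; frontier.length > 0 && depth <= maxDepth; depth++) {
    const next = [];
    for (const doc of fromDocs) {
      if (found.has(doc._id)) continue;
      if (!asArray(doc[connectToField]).some((k) => frontier.includes(k))) continue;
      found.set(doc._id, doc);
      next.push(...asArray(doc[connectFromField]));
    }
    frontier = next;
  }
  return [...found.values()];
}

const asya = employees.find((e) => e.name === "Asya");

// maxDepth 0: only the first lookup runs.
const depthZero = traverse(asya.reportsTo, employees, "reportsTo", "name", 0);

// A single, non-recursive match on the same fields.
const singleLookup = employees.filter((e) => e.name === asya.reportsTo);

console.log(depthZero.map((e) => e.name));    // [ 'Ron' ]
console.log(singleLookup.map((e) => e.name)); // [ 'Ron' ]
```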
Memory
The $graphLookup stage must stay within the 100 megabyte memory limit. If allowDiskUse: true is specified for the aggregate() operation, the $graphLookup stage ignores the option. If there are other stages in the aggregate() operation, the allowDiskUse: true option is in effect for these other stages.
See aggregation pipeline limitations for more information.
Views and Collation
If performing an aggregation that involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.
Examples
Within a Single Collection
A collection named employees has the following documents:
{ "_id" : 1, "name" : "Dev" }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }
The following $graphLookup operation recursively matches on the reportsTo and name fields in the employees collection, returning the reporting hierarchy for each person:
db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )
The operation returns the following:
{ "_id" : 1, "name" : "Dev", "reportingHierarchy" : [ ] }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" } ] }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" },
   { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" } ] }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" },
   { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" } ] }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" },
   { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
   { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" } ] }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew", "reportingHierarchy" : [
   { "_id" : 1, "name" : "Dev" },
   { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
   { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" } ] }
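The following table provides a traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:
Start value | The reportsTo value of the document: "Ron"
Depth 0 | { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
Depth 1 | { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
Depth 2 | { "_id" : 1, "name" : "Dev" }
The output generates the hierarchy Asya -> Ron -> Eliot -> Dev.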
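The traversal path for Asya can be reproduced by hand with a short JavaScript loop. This sketch assumes that names are unique and that each person has at most one reportsTo value, which holds for the sample data but is not required by $graphLookup; the variable names are illustrative.

```javascript
const employees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 4, name: "Andrew", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
  { _id: 6, name: "Dan", reportsTo: "Andrew" },
];

// Index the collection by the connectToField ("name").
const byName = new Map(employees.map((e) => [e.name, e]));

const start = byName.get("Asya");
const path = [];               // one entry per depth level
const seen = new Set();        // guards against cycles in other data sets
let lookupValue = start.reportsTo; // start value: "Ron"

for (let depth = 0; lookupValue !== undefined; depth++) {
  const manager = byName.get(lookupValue);
  if (!manager || seen.has(manager.name)) break;
  seen.add(manager.name);
  path.push({ depth, name: manager.name });
  lookupValue = manager.reportsTo; // connectFromField drives the next lookup
}

const chain = [start.name, ...path.map((p) => p.name)].join(" -> ");
console.log(chain); // Asya -> Ron -> Eliot -> Dev
```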
Across Multiple Collections
Like $lookup, $graphLookup can access another collection in the same database.
For example, create a database with two collections:
An airports collection with the following documents:
db.airports.insertMany( [
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
] )
A travelers collection with the following documents:
db.travelers.insertMany( [
   { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" },
   { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" },
   { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
] )
For each document in the travelers collection, the following aggregation operation looks up the nearestAirport value in the airports collection and recursively matches the connects field to the airport field. The operation specifies a maximum recursion depth of 2.
db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )
The operation returns the following results:
{ "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK", "destinations" : [
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(2) },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(1) },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(1) },
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(0) } ] }
{ "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK", "destinations" : [
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(2) },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(1) },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(1) },
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(0) } ] }
{ "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS", "destinations" : [
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(2) },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(1) },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ], "numConnections" : NumberLong(2) },
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(1) },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(0) } ] }
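The following table provides a traversal path for the recursive search, up to depth 2, where the starting airport is JFK:
Start value | The nearestAirport value from the travelers collection: "JFK"
Depth 0 | { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
Depth 1 | { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }, { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
Depth 2 | { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }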
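The depth values reported in numConnections can be checked with an in-memory sketch in plain JavaScript. The traverseWithDepth helper is a hypothetical name, and the depth is stored as an ordinary JavaScript number here, whereas the stage itself reports a NumberLong.

```javascript
const airports = [
  { _id: 0, airport: "JFK", connects: ["BOS", "ORD"] },
  { _id: 1, airport: "BOS", connects: ["JFK", "PWM"] },
  { _id: 2, airport: "ORD", connects: ["JFK"] },
  { _id: 3, airport: "PWM", connects: ["BOS", "LHR"] },
  { _id: 4, airport: "LHR", connects: ["PWM"] },
];

const asArray = (v) => (v == null ? [] : Array.isArray(v) ? v : [v]);

// Hypothetical helper: breadth-first traversal that records the depth at
// which each document is first reached.
function traverseWithDepth(startValues, fromDocs, spec) {
  const { connectFromField, connectToField, maxDepth = Infinity, depthField } = spec;
  const found = new Map(); // _id -> copy of the document plus its depth
  let frontier = asArray(startValues);
  for (let depth = 0; frontier.length > 0 && depth <= maxDepth; depth++) {
    const next = [];
    for (const doc of fromDocs) {
      if (found.has(doc._id)) continue; // keep the first (shallowest) depth
      if (!asArray(doc[connectToField]).some((k) => frontier.includes(k))) continue;
      found.set(doc._id, { ...doc, [depthField]: depth });
      next.push(...asArray(doc[connectFromField])); // each array element is followed
    }
    frontier = next;
  }
  return [...found.values()];
}

const fromJFK = traverseWithDepth("JFK", airports, {
  connectFromField: "connects",
  connectToField: "airport",
  maxDepth: 2,
  depthField: "numConnections",
});

const depths = Object.fromEntries(fromJFK.map((d) => [d.airport, d.numConnections]));
console.log(depths); // { JFK: 0, BOS: 1, ORD: 1, PWM: 2 }
```

LHR is absent from the result because it would first be reached at depth 3, beyond the maxDepth of 2.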
With a Query Filter
The following example uses a collection with a set of documents containing names of people along with arrays of their friends and their hobbies. An aggregation operation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.
A collection named people contains the following documents:
{ "_id" : 1, "name" : "Tanya Jordan", "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ], "hobbies" : [ "tennis", "unicycling", "golf" ] }
{ "_id" : 2, "name" : "Carole Hale", "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ], "hobbies" : [ "archery", "golf", "woodworking" ] }
{ "_id" : 3, "name" : "Terry Hawkins", "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ], "hobbies" : [ "knitting", "frisbee" ] }
{ "_id" : 4, "name" : "Joseph Dennis", "friends" : [ "Angelo Ward", "Carole Hale" ], "hobbies" : [ "tennis", "golf", "topiary" ] }
{ "_id" : 5, "name" : "Angelo Ward", "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ], "hobbies" : [ "travel", "ceramics", "golf" ] }
{ "_id" : 6, "name" : "Shirley Soto", "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ], "hobbies" : [ "frisbee", "set theory" ] }
The following aggregation operation uses three stages:
$match matches on documents with a name field containing the string "Tanya Jordan". Returns one output document.
$graphLookup connects the output document's friends field with the name field of other documents in the collection to traverse Tanya Jordan's network of connections. This stage uses the restrictSearchWithMatch parameter to find only documents in which the hobbies array contains golf. Returns one output document.
$project shapes the output document. The names listed in connections who play golf are taken from the name field of the documents listed in the input document's golfers array.
db.people.aggregate( [
   { $match: { "name": "Tanya Jordan" } },
   {
      $graphLookup: {
         from: "people",
         startWith: "$friends",
         connectFromField: "friends",
         connectToField: "name",
         as: "golfers",
         restrictSearchWithMatch: { "hobbies" : "golf" }
      }
   },
   {
      $project: {
         "name": 1,
         "friends": 1,
         "connections who play golf": "$golfers.name"
      }
   }
] )
The operation returns the following document:
{ "_id" : 1, "name" : "Tanya Jordan", "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ], "connections who play golf" : [ "Joseph Dennis", "Tanya Jordan", "Angelo Ward", "Carole Hale" ] }
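One way to read this result is that the filter is applied while traversing, so people who do not list golf are neither returned nor followed. The plain JavaScript sketch below imitates that behavior on the sample data; traverseRestricted is a hypothetical helper, and the filter is expressed as a JavaScript predicate instead of a query document.

```javascript
const people = [
  { _id: 1, name: "Tanya Jordan", friends: ["Shirley Soto", "Terry Hawkins", "Carole Hale"], hobbies: ["tennis", "unicycling", "golf"] },
  { _id: 2, name: "Carole Hale", friends: ["Joseph Dennis", "Tanya Jordan", "Terry Hawkins"], hobbies: ["archery", "golf", "woodworking"] },
  { _id: 3, name: "Terry Hawkins", friends: ["Tanya Jordan", "Carole Hale", "Angelo Ward"], hobbies: ["knitting", "frisbee"] },
  { _id: 4, name: "Joseph Dennis", friends: ["Angelo Ward", "Carole Hale"], hobbies: ["tennis", "golf", "topiary"] },
  { _id: 5, name: "Angelo Ward", friends: ["Terry Hawkins", "Shirley Soto", "Joseph Dennis"], hobbies: ["travel", "ceramics", "golf"] },
  { _id: 6, name: "Shirley Soto", friends: ["Angelo Ward", "Tanya Jordan", "Carole Hale"], hobbies: ["frisbee", "set theory"] },
];

const asArray = (v) => (v == null ? [] : Array.isArray(v) ? v : [v]);

// Hypothetical helper: documents that fail `restrict` are skipped entirely,
// so their connectFromField values never enter the frontier.
function traverseRestricted(startValues, fromDocs, connectFromField, connectToField, restrict = () => true) {
  const found = new Map(); // _id -> document
  let frontier = asArray(startValues);
  while (frontier.length > 0) {
    const next = [];
    for (const doc of fromDocs) {
      if (found.has(doc._id) || !restrict(doc)) continue;
      if (!asArray(doc[connectToField]).some((k) => frontier.includes(k))) continue;
      found.set(doc._id, doc);
      next.push(...asArray(doc[connectFromField]));
    }
    frontier = next;
  }
  return [...found.values()];
}

const tanya = people.find((p) => p.name === "Tanya Jordan");
const playsGolf = (doc) => doc.hobbies.includes("golf");

const golfers = traverseRestricted(tanya.friends, people, "friends", "name", playsGolf);
const golferNames = golfers.map((p) => p.name).sort();

console.log(golferNames);
// [ 'Angelo Ward', 'Carole Hale', 'Joseph Dennis', 'Tanya Jordan' ]
```

In this sketch Angelo Ward is reached through Joseph Dennis, because Terry Hawkins and Shirley Soto are filtered out before their friends are followed.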