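$graphLookup (aggregation)
Definition
$graphLookup
Changed in version 5.1.
Performs a recursive search on a collection, with options for restricting the search by recursion depth and query filter.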
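The $graphLookup search process is summarized as follows:
Input documents flow into the $graphLookup stage of an aggregation operation.
$graphLookup targets the search to the collection designated by the from parameter (see below for the full list of search parameters).
For each input document, the search begins with the value designated by startWith.
$graphLookup matches the startWith value against the field designated by connectToField in other documents in the from collection.
For each matching document, $graphLookup takes the value of the connectFromField and checks every document in the from collection for a matching connectToField value. For each match, $graphLookup adds the matching document in the from collection to an array field named by the as parameter.
This step continues recursively until no more matching documents are found, or until the operation reaches the recursion depth specified by the maxDepth parameter. $graphLookup then appends the array field to the input document.
$graphLookup returns results after completing its search on all input documents.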
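The search process above can be sketched in plain JavaScript as a breadth-first traversal over an in-memory array. This is only an illustration under stated assumptions, not how the server implements the stage: graphLookupSketch is a hypothetical helper, startWith is passed as an already evaluated value rather than an expression, and a missing start value is treated as "nothing to search".

```javascript
// In-memory illustration only; NOT MongoDB's implementation.
// `graphLookupSketch` is a hypothetical name, and `startWith` is an already
// evaluated value (or array of values), not an aggregation expression.
function graphLookupSketch(from, { startWith, connectFromField, connectToField, maxDepth = Infinity }) {
  // Assumption: null or missing values contribute nothing to the search.
  const asArray = (v) => (v === undefined || v === null ? [] : Array.isArray(v) ? v : [v]);
  const visited = new Map(); // _id -> copy of the document plus its recursion depth
  let frontier = asArray(startWith);

  for (let depth = 0; frontier.length > 0 && depth <= maxDepth; depth++) {
    const next = [];
    for (const doc of from) {
      const hit = asArray(doc[connectToField]).some((v) => frontier.includes(v));
      if (hit && !visited.has(doc._id)) {
        visited.set(doc._id, { ...doc, depth });
        // Values to follow on the next round of the recursion.
        next.push(...asArray(doc[connectFromField]));
      }
    }
    frontier = next;
  }
  return [...visited.values()];
}

const employees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 4, name: "Andrew", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
  { _id: 6, name: "Dan", reportsTo: "Andrew" },
];

// Start from Asya's reportsTo value and follow reportsTo -> name.
const hierarchy = graphLookupSketch(employees, {
  startWith: "Ron",
  connectFromField: "reportsTo",
  connectToField: "name",
});
console.log(hierarchy.map((d) => `${d.name} (depth ${d.depth})`));
```

The sketch returns documents in discovery order; the real stage makes no ordering guarantee for the as array.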
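$graphLookup has the following prototype form:
{
   $graphLookup: {
      from: <collection>,
      startWith: <expression>,
      connectFromField: <string>,
      connectToField: <string>,
      as: <string>,
      maxDepth: <number>,
      depthField: <string>,
      restrictSearchWithMatch: <document>
   }
}
$graphLookup takes a document with the following fields:
Field | Description
from | Target collection for the $graphLookup operation to search, recursively matching the connectFromField to the connectToField. The from collection must be in the same database as any other collections used in the operation. Starting in MongoDB 5.1, the collection specified in the from parameter can be sharded.
startWith | Expression that specifies the value at which to start the recursive search. $graphLookup matches this value against the connectToField of documents in the from collection.
connectFromField | Field name whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is individually followed through the traversal process.
connectToField | Field name in other documents against which to match the value of the field specified by the connectFromField parameter.
as | Name of the array field added to each output document. Contains the documents traversed in the $graphLookup stage to reach the document. Documents returned in the as field are not guaranteed to be in any order.
maxDepth | Optional. Non-negative integer specifying the maximum recursion depth.
depthField | Optional. Name of the field to add to each traversed document in the search path. The value of this field is the recursion depth for the document, represented as a NumberLong. Recursion depth values start at zero, so the first lookup corresponds to zero depth.
restrictSearchWithMatch | Optional. Document specifying additional conditions for the recursive search, written in query filter syntax.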
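As a rough aid for reading the field list, the following sketch assembles a stage document and checks the constraints stated there. buildGraphLookupStage is a hypothetical helper, not part of any MongoDB driver, and treating every field that is not marked Optional as required is an assumption read off the list.

```javascript
// Hypothetical helper (not a driver API): builds a $graphLookup stage
// document and checks the constraints described in the field list.
function buildGraphLookupStage(options) {
  // Assumption: fields not marked "Optional" in the list are required.
  const required = ["from", "startWith", "connectFromField", "connectToField", "as"];
  const optional = ["maxDepth", "depthField", "restrictSearchWithMatch"];

  for (const name of required) {
    if (options[name] === undefined) {
      throw new Error(`missing required field: ${name}`);
    }
  }
  for (const name of Object.keys(options)) {
    if (!required.includes(name) && !optional.includes(name)) {
      throw new Error(`unknown field: ${name}`);
    }
  }
  // maxDepth must be a non-negative integer when present.
  if (options.maxDepth !== undefined &&
      !(Number.isInteger(options.maxDepth) && options.maxDepth >= 0)) {
    throw new Error("maxDepth must be a non-negative integer");
  }
  return { $graphLookup: { ...options } };
}

const stage = buildGraphLookupStage({
  from: "airports",
  startWith: "$nearestAirport",
  connectFromField: "connects",
  connectToField: "airport",
  maxDepth: 2,
  depthField: "numConnections",
  as: "destinations",
});
console.log(JSON.stringify(stage, null, 2));
```

The returned object can be placed directly in the pipeline array passed to aggregate().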
Considerations
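Sharded Collections
Starting in MongoDB 5.1, you can specify sharded collections in the from parameter of $graphLookup stages.
You cannot use the $graphLookup stage within a transaction while targeting a sharded collection.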
Max Depth
Setting the maxDepth field to 0 is equivalent to a non-recursive $graphLookup search stage.
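To make the non-recursive case concrete, here is a sketch that reuses the employees stage from the Examples section and adds maxDepth: 0. Only the first lookup (depth 0) runs, so each person's array would hold at most the document of their direct manager. The names directManagerPipeline and directManager are illustrative assumptions.

```javascript
// The employees stage from the Examples section with maxDepth: 0 added.
// `directManagerPipeline` and the "directManager" output field are
// hypothetical names chosen for this illustration.
const directManagerPipeline = [
  {
    $graphLookup: {
      from: "employees",
      startWith: "$reportsTo",
      connectFromField: "reportsTo",
      connectToField: "name",
      maxDepth: 0, // stop after the first lookup (depth 0)
      as: "directManager",
    },
  },
];

// In mongosh: db.employees.aggregate(directManagerPipeline)
console.log(JSON.stringify(directManagerPipeline, null, 2));
```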
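Memory
The $graphLookup stage must stay within the 100 megabyte memory limit. If allowDiskUse: true is specified for the aggregate() operation, the $graphLookup stage ignores the option. If there are other stages in the aggregate() operation, the allowDiskUse: true option is in effect for those other stages.
See aggregation pipeline limitations for more information.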
Unsorted Results
The $graphLookup stage does not return sorted results. To sort your results, use the $sortArray operator.
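One possible way to apply $sortArray is to record the recursion depth with depthField and then sort the as array by that field in a following $set stage. The sketch below assumes the employees collection from the Examples section and a server version that provides $sortArray; the names sortedHierarchyPipeline and level are illustrative.

```javascript
// Sketch: order each reportingHierarchy array by recursion depth.
// `sortedHierarchyPipeline` and the "level" depth field are hypothetical
// names; the stage otherwise matches the employees example.
const sortedHierarchyPipeline = [
  {
    $graphLookup: {
      from: "employees",
      startWith: "$reportsTo",
      connectFromField: "reportsTo",
      connectToField: "name",
      depthField: "level",
      as: "reportingHierarchy",
    },
  },
  {
    // Replace the unsorted array with a copy sorted by ascending depth.
    $set: {
      reportingHierarchy: {
        $sortArray: { input: "$reportingHierarchy", sortBy: { level: 1 } },
      },
    },
  },
];

// In mongosh: db.employees.aggregate(sortedHierarchyPipeline)
console.log(JSON.stringify(sortedHierarchyPipeline, null, 2));
```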
Views and Collation
If performing an aggregation that involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.
Examples
Within a Single Collection
A collection named employees contains the following documents:
{ "_id" : 1, "name" : "Dev" }
{ "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
{ "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
{ "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
{ "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
{ "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }
The following $graphLookup operation recursively matches on the reportsTo and name fields in the employees collection, returning the reporting hierarchy for each person:
db.employees.aggregate( [
   {
      $graphLookup: {
         from: "employees",
         startWith: "$reportsTo",
         connectFromField: "reportsTo",
         connectToField: "name",
         as: "reportingHierarchy"
      }
   }
] )
The output resembles the following:
{
   "_id" : 1,
   "name" : "Dev",
   "reportingHierarchy" : [ ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "reportsTo" : "Dev",
   "reportingHierarchy" : [
      { "_id" : 1, "name" : "Dev" }
   ]
}
{
   "_id" : 3,
   "name" : "Ron",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 1, "name" : "Dev" }
   ]
}
{
   "_id" : 4,
   "name" : "Andrew",
   "reportsTo" : "Eliot",
   "reportingHierarchy" : [
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 1, "name" : "Dev" }
   ]
}
{
   "_id" : 5,
   "name" : "Asya",
   "reportsTo" : "Ron",
   "reportingHierarchy" : [
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" },
      { "_id" : 1, "name" : "Dev" }
   ]
}
{
   "_id" : 6,
   "name" : "Dan",
   "reportsTo" : "Andrew",
   "reportingHierarchy" : [
      { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" },
      { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
      { "_id" : 1, "name" : "Dev" }
   ]
}
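The following table provides a traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:
Start value | The reportsTo value of the document: "Ron"
Depth 0 | { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
Depth 1 | { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
Depth 2 | { "_id" : 1, "name" : "Dev" }
The output generates the hierarchy Asya -> Ron -> Eliot -> Dev.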
Across Multiple Collections
Like $lookup, $graphLookup can access another collection in the same database.
For example, create a database with two collections:
An airports collection with the following documents:
db.airports.insertMany( [
   { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] },
   { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] },
   { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] },
   { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] },
   { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }
] )
A travelers collection with the following documents:
db.travelers.insertMany( [
   { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" },
   { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" },
   { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
] )
For each document in the travelers collection, the following aggregation operation looks up the nearestAirport value in the airports collection and recursively matches the connects field to the airport field. The operation specifies a maximum recursion depth of 2.
db.travelers.aggregate( [
   {
      $graphLookup: {
         from: "airports",
         startWith: "$nearestAirport",
         connectFromField: "connects",
         connectToField: "airport",
         maxDepth: 2,
         depthField: "numConnections",
         as: "destinations"
      }
   }
] )
The output resembles the following:
{
   "_id" : 1,
   "name" : "Dev",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(2) },
      { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(1) },
      { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(1) },
      { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 2,
   "name" : "Eliot",
   "nearestAirport" : "JFK",
   "destinations" : [
      { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(2) },
      { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(1) },
      { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(1) },
      { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(0) }
   ]
}
{
   "_id" : 3,
   "name" : "Jeff",
   "nearestAirport" : "BOS",
   "destinations" : [
      { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ], "numConnections" : NumberLong(2) },
      { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ], "numConnections" : NumberLong(1) },
      { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ], "numConnections" : NumberLong(2) },
      { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ], "numConnections" : NumberLong(1) },
      { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ], "numConnections" : NumberLong(0) }
   ]
}
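The following table provides a traversal path for the recursive search, up to depth 2, where the starting airport is JFK:
Start value | The nearestAirport value from the travelers collection: "JFK"
Depth 0 | { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
Depth 1 | { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }, { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
Depth 2 | { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }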
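The depth information written by depthField can also be used on the client side. As a small sketch, the snippet below takes the output document for the traveler Dev shown above and buckets the airport codes by numConnections. airportsByDepth is a hypothetical helper, and the NumberLong values are written as plain numbers (an assumption made so the snippet runs outside mongosh).

```javascript
// Output document for traveler "Dev", copied from the example output.
// Assumption: NumberLong values are replaced by plain JavaScript numbers.
const dev = {
  _id: 1,
  name: "Dev",
  nearestAirport: "JFK",
  destinations: [
    { _id: 3, airport: "PWM", connects: ["BOS", "LHR"], numConnections: 2 },
    { _id: 2, airport: "ORD", connects: ["JFK"], numConnections: 1 },
    { _id: 1, airport: "BOS", connects: ["JFK", "PWM"], numConnections: 1 },
    { _id: 0, airport: "JFK", connects: ["BOS", "ORD"], numConnections: 0 },
  ],
};

// Hypothetical helper: group airport codes by recursion depth.
function airportsByDepth(doc) {
  const buckets = {};
  for (const { airport, numConnections } of doc.destinations) {
    if (!buckets[numConnections]) buckets[numConnections] = [];
    buckets[numConnections].push(airport);
  }
  // The as array is unordered, so sort each bucket for a stable result.
  for (const list of Object.values(buckets)) list.sort();
  return buckets;
}

const byDepth = airportsByDepth(dev);
console.log(byDepth); // { '0': [ 'JFK' ], '1': [ 'BOS', 'ORD' ], '2': [ 'PWM' ] }
```

The buckets line up with the rows of the traversal table: JFK at depth 0, BOS and ORD at depth 1, PWM at depth 2.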
With a Query Filter
The following example uses a collection with a set of documents containing names of people along with arrays of their friends and their hobbies. An aggregation operation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.
A collection named people contains the following documents:
{ "_id" : 1, "name" : "Tanya Jordan", "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ], "hobbies" : [ "tennis", "unicycling", "golf" ] }
{ "_id" : 2, "name" : "Carole Hale", "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ], "hobbies" : [ "archery", "golf", "woodworking" ] }
{ "_id" : 3, "name" : "Terry Hawkins", "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ], "hobbies" : [ "knitting", "frisbee" ] }
{ "_id" : 4, "name" : "Joseph Dennis", "friends" : [ "Angelo Ward", "Carole Hale" ], "hobbies" : [ "tennis", "golf", "topiary" ] }
{ "_id" : 5, "name" : "Angelo Ward", "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ], "hobbies" : [ "travel", "ceramics", "golf" ] }
{ "_id" : 6, "name" : "Shirley Soto", "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ], "hobbies" : [ "frisbee", "set theory" ] }
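The following aggregation operation uses three stages:
$match matches on documents with a name field containing the string "Tanya Jordan". Returns one output document.
$graphLookup connects the output document's friends field with the name field of other documents in the collection to traverse Tanya Jordan's network of connections. This stage uses the restrictSearchWithMatch parameter to find only documents in which the hobbies array contains golf. Returns one output document.
$project shapes the output document. The names listed in connections who play golf are taken from the name field of the documents listed in the input document's golfers array.
db.people.aggregate( [
   { $match: { "name": "Tanya Jordan" } },
   {
      $graphLookup: {
         from: "people",
         startWith: "$friends",
         connectFromField: "friends",
         connectToField: "name",
         as: "golfers",
         restrictSearchWithMatch: { "hobbies" : "golf" }
      }
   },
   {
      $project: {
         "name": 1,
         "friends": 1,
         "connections who play golf": "$golfers.name"
      }
   }
] )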
The operation returns the following document:
{
   "_id" : 1,
   "name" : "Tanya Jordan",
   "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
   "connections who play golf" : [ "Joseph Dennis", "Tanya Jordan", "Angelo Ward", "Carole Hale" ]
}
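The effect of the filter can be illustrated in memory: the condition applies at every step of the recursion, so a document that fails it is neither added to the result nor used to continue the traversal. The sketch below is an illustration only. traverseWithFilter is a hypothetical helper, and the matches predicate stands in for restrictSearchWithMatch, which in the real stage is a query filter document.

```javascript
// In-memory illustration only; `traverseWithFilter` is a hypothetical helper.
// `matches` is a plain predicate standing in for restrictSearchWithMatch.
function traverseWithFilter(from, startValues, connectFromField, connectToField, matches) {
  const found = new Map(); // _id -> document
  let frontier = [...startValues];
  while (frontier.length > 0) {
    const next = [];
    for (const doc of from) {
      // Documents that fail the filter are never added or followed.
      if (found.has(doc._id) || !matches(doc)) continue;
      if (frontier.includes(doc[connectToField])) {
        found.set(doc._id, doc);
        next.push(...doc[connectFromField]);
      }
    }
    frontier = next;
  }
  return [...found.values()];
}

const people = [
  { _id: 1, name: "Tanya Jordan", friends: ["Shirley Soto", "Terry Hawkins", "Carole Hale"], hobbies: ["tennis", "unicycling", "golf"] },
  { _id: 2, name: "Carole Hale", friends: ["Joseph Dennis", "Tanya Jordan", "Terry Hawkins"], hobbies: ["archery", "golf", "woodworking"] },
  { _id: 3, name: "Terry Hawkins", friends: ["Tanya Jordan", "Carole Hale", "Angelo Ward"], hobbies: ["knitting", "frisbee"] },
  { _id: 4, name: "Joseph Dennis", friends: ["Angelo Ward", "Carole Hale"], hobbies: ["tennis", "golf", "topiary"] },
  { _id: 5, name: "Angelo Ward", friends: ["Terry Hawkins", "Shirley Soto", "Joseph Dennis"], hobbies: ["travel", "ceramics", "golf"] },
  { _id: 6, name: "Shirley Soto", friends: ["Angelo Ward", "Tanya Jordan", "Carole Hale"], hobbies: ["frisbee", "set theory"] },
];

const tanya = people[0];
const golfers = traverseWithFilter(
  people, tanya.friends, "friends", "name",
  (doc) => doc.hobbies.includes("golf")
);
console.log(golfers.map((d) => d.name).sort());
```

For this data the sketch finds the same four names as the aggregation output above, in a different order, which is consistent with the as array being unordered.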