Aggregation

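This version of the documentation is archived and no longer supported. View the latest documentation to learn how to upgrade your version of the Java driver.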

Overview

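In this guide, you can learn how to use the Java driver to perform aggregation operations.

Aggregation operations process data in your MongoDB collections and return computed results. The MongoDB Aggregation framework, which is part of the Query API, is modeled on the concept of data processing pipelines. Documents flow through a pipeline made up of one or more stages, and the pipeline transforms them into an aggregated result.

An aggregation operation is similar to a car factory. A car factory has an assembly line, which contains assembly stations with specialized tools, such as drills and welders, for specific jobs. Raw parts enter the factory, and the assembly line transforms and assembles them into a finished product.

The aggregation pipeline is the assembly line, aggregation stages are the assembly stations, and operator expressions are the specialized tools.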
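To make the assembly-line idea concrete, here is a minimal, driver-free sketch in plain Java. It is only an illustration of the pipeline concept, not how the driver or server implements aggregation. The class name PipelineSketch and the run method are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical, in-memory model of a pipeline: each stage is a function
// that takes the current list of items and returns a new list.
final class PipelineSketch {

    // Passes the input through every stage in order, like parts moving
    // from one assembly station to the next.
    static <T> List<T> run(List<T> input, List<Function<List<T>, List<T>>> stages) {
        List<T> current = input;
        for (Function<List<T>, List<T>> stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        // Stage 1: keep only even numbers (a filter, loosely comparable to $match).
        Function<List<Integer>, List<Integer>> keepEven =
            items -> items.stream().filter(n -> n % 2 == 0).collect(Collectors.toList());
        // Stage 2: square each remaining value (a per-item transformation).
        Function<List<Integer>, List<Integer>> square =
            items -> items.stream().map(n -> n * n).collect(Collectors.toList());

        List<Integer> result = run(Arrays.asList(1, 2, 3, 4), Arrays.asList(keepEven, square));
        System.out.println(result); // [4, 16]
    }
}
```

The order of the stages matters: each stage only sees what the previous stage emitted, which is why filtering early usually means later stages have less work to do.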

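Compare Aggregation and Find Operations

You can use find operations to perform the following actions:

Select which documents to return
Select which fields to return
Sort the results

You can use aggregation operations to perform the following actions:

Perform find operations
Rename fields
Calculate fields
Summarize data
Group values

Aggregation operations have some limitations that you must keep in mind:

Returned documents must not violate the BSON document size limit of 16 megabytes.
Pipeline stages have a memory limit of 100 megabytes by default. If required, you can exceed this limit by using the allowDiskUse method.
Important
$graphLookup Exception
The $graphLookup stage has a strict memory limit of 100 megabytes and ignores allowDiskUse.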

Useful References

Runnable Examples

Import Classes

Create a new Java file called AggTour.java and include the following import statements:

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.ExplainVerbosity;
import com.mongodb.client.model.Accumulators;
import com.mongodb.client.model.Aggregates;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Projections;
import org.bson.Document;
import org.bson.json.JsonWriterSettings;
import java.util.Arrays;
import java.util.List;

Connect to a MongoDB Deployment

public class AggTour {
    public static void main(String[] args) {
        // Replace the uri string with your MongoDB deployment's connection string
        String uri = "<connection string>";
        MongoClient mongoClient = MongoClients.create(uri);
        MongoDatabase database = mongoClient.getDatabase("aggregation");
        MongoCollection<Document> collection = database.getCollection("restaurants");
        // Paste the aggregation code here
    }
}

Tip

To learn more about connecting to MongoDB, see the Connection Guide.

Insert Sample Data

collection.insertMany(Arrays.asList(
    new Document("name", "Sun Bakery Trattoria").append("contact", new Document().append("phone", "386-555-0189").append("email", "SunBakeryTrattoria@example.org").append("location", Arrays.asList(-74.0056649, 40.7452371))).append("stars", 4).append("categories", Arrays.asList("Pizza", "Pasta", "Italian", "Coffee", "Sandwiches")),
    new Document("name", "Blue Bagels Grill").append("contact", new Document().append("phone", "786-555-0102").append("email", "BlueBagelsGrill@example.com").append("location", Arrays.asList(-73.92506, 40.8275556))).append("stars", 3).append("categories", Arrays.asList("Bagels", "Cookies", "Sandwiches")),
    new Document("name", "XYZ Bagels Restaurant").append("contact", new Document().append("phone", "435-555-0190").append("email", "XYZBagelsRestaurant@example.net").append("location", Arrays.asList(-74.0707363, 40.59321569999999))).append("stars", 4).append("categories", Arrays.asList("Bagels", "Sandwiches", "Coffee")),
    new Document("name", "Hot Bakery Cafe").append("contact", new Document().append("phone", "264-555-0171").append("email", "HotBakeryCafe@example.net").append("location", Arrays.asList(-73.96485799999999, 40.761899))).append("stars", 4).append("categories", Arrays.asList("Bakery", "Cafe", "Coffee", "Dessert")),
    new Document("name", "Green Feast Pizzeria").append("contact", new Document().append("phone", "840-555-0102").append("email", "GreenFeastPizzeria@example.com").append("location", Arrays.asList(-74.1220973, 40.6129407))).append("stars", 2).append("categories", Arrays.asList("Pizza", "Italian")),
    new Document("name", "ZZZ Pasta Buffet").append("contact", new Document().append("phone", "769-555-0152").append("email", "ZZZPastaBuffet@example.com").append("location", Arrays.asList(-73.9446421, 40.7253944))).append("stars", 0).append("categories", Arrays.asList("Pasta", "Italian", "Buffet", "Cafeteria")),
    new Document("name", "XYZ Coffee Bar").append("contact", new Document().append("phone", "644-555-0193").append("email", "XYZCoffeeBar@example.net").append("location", Arrays.asList(-74.0166091, 40.6284767))).append("stars", 5).append("categories", Arrays.asList("Coffee", "Cafe", "Bakery", "Chocolates")),
    new Document("name", "456 Steak Restaurant").append("contact", new Document().append("phone", "990-555-0165").append("email", "456SteakRestaurant@example.com").append("location", Arrays.asList(-73.9365108, 40.8497077))).append("stars", 0).append("categories", Arrays.asList("Steak", "Seafood")),
    new Document("name", "456 Cookies Shop").append("contact", new Document().append("phone", "604-555-0149").append("email", "456CookiesShop@example.org").append("location", Arrays.asList(-73.8850023, 40.7494272))).append("stars", 4).append("categories", Arrays.asList("Bakery", "Cookies", "Cake", "Coffee")),
    new Document("name", "XYZ Steak Buffet").append("contact", new Document().append("phone", "229-555-0197").append("email", "XYZSteakBuffet@example.org").append("location", Arrays.asList(-73.9799932, 40.7660886))).append("stars", 3).append("categories", Arrays.asList("Steak", "Salad", "Chinese"))
));

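Basic Aggregation Example

To perform an aggregation, pass a list of aggregation stages to the MongoCollection.aggregate() method.

The Java driver provides the Aggregates helper class, which contains builders for aggregation stages.

In the following example, the aggregation pipeline:

Uses a $match stage to filter for documents whose categories array field contains the element Bakery. The example uses Aggregates.match to build the $match stage.
Uses a $group stage to group the matching documents by the stars field, accumulating a count of documents for each distinct value of stars.

Note

You can build the expressions used in this example by using the aggregation builders.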

collection.aggregate(
    Arrays.asList(
        Aggregates.match(Filters.eq("categories", "Bakery")),
        Aggregates.group("$stars", Accumulators.sum("count", 1))
    )
// Prints the result of the aggregation operation as JSON
).forEach(doc -> System.out.println(doc.toJson()));

The preceding aggregation should produce the following results:

{"_id": 4, "count": 2}
{"_id": 5, "count": 1}
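As a way to check the reasoning behind those counts, the following driver-free sketch reproduces the same filter-then-group logic over an in-memory list by using Java streams. It is an analogy for what $match and $group compute, not the driver's API. The MatchGroupSketch and Restaurant names are hypothetical, and the sample list contains only four of the inserted restaurants (the three with a Bakery category plus one without).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class MatchGroupSketch {

    // Hypothetical stand-in for a restaurant document; it holds only the
    // two fields that the pipeline reads.
    static final class Restaurant {
        final int stars;
        final List<String> categories;

        Restaurant(int stars, String... categories) {
            this.stars = stars;
            this.categories = Arrays.asList(categories);
        }
    }

    // Keeps restaurants whose categories contain the given value (like $match),
    // then counts them per stars value (like $group with a sum of 1).
    static Map<Integer, Long> countByStars(List<Restaurant> docs, String category) {
        return docs.stream()
            .filter((Restaurant r) -> r.categories.contains(category))
            .collect(Collectors.groupingBy(
                (Restaurant r) -> r.stars,
                TreeMap::new,           // sorted keys, only to make the printout stable
                Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Restaurant> sample = Arrays.asList(
            new Restaurant(4, "Bakery", "Cafe", "Coffee", "Dessert"),    // Hot Bakery Cafe
            new Restaurant(5, "Coffee", "Cafe", "Bakery", "Chocolates"), // XYZ Coffee Bar
            new Restaurant(4, "Bakery", "Cookies", "Cake", "Coffee"),    // 456 Cookies Shop
            new Restaurant(3, "Bagels", "Cookies", "Sandwiches")         // Blue Bagels Grill
        );
        System.out.println(countByStars(sample, "Bakery")); // {4=2, 5=1}
    }
}
```

Unlike this sketch, the server does not guarantee the order in which $group emits its result documents, so the two result lines may appear in either order.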

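For more information about the methods and classes mentioned in this section, see the API documentation for MongoCollection.aggregate() and Aggregates.match().

Explain Aggregation Example

To view information about how MongoDB executes your operation, use the explain() method of the AggregateIterable class. The explain() method returns execution plans and performance statistics. An execution plan is a potential way MongoDB can complete an operation. The explain() method provides both the winning plan (the plan MongoDB executed) and any rejected plans.

Tip

To learn more about query plans and execution statistics, see Explain Results in the Server manual.

You can specify the level of detail of your explanation by passing a verbosity level to the explain() method.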

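The following table shows all verbosity levels for explanations and their intended use cases:

Verbosity Level	Use Case
ALL_PLANS_EXECUTIONS	You have a problem with your query and want as much information as possible to diagnose the issue.
EXECUTION_STATS	You want to know whether your query is performing well.
QUERY_PLANNER	You want to know which plan MongoDB will choose to run your query.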

The following example prints the JSON representation of the winning plans for any aggregation stages that produce execution plans:

Document explanation = collection.aggregate(
        Arrays.asList(
                Aggregates.match(Filters.eq("categories", "Bakery")),
                Aggregates.group("$stars", Accumulators.sum("count", 1))
        )
).explain(ExplainVerbosity.EXECUTION_STATS);
String winningPlans = explanation
    .getEmbedded(
        Arrays.asList("queryPlanner", "winningPlan", "queryPlan"),
        Document.class
    )
    .toJson(JsonWriterSettings.builder().indent(true).build());
System.out.println(winningPlans);

The example produces the following output because the $group stage is the only stage that generates an execution plan:

{
  "stage": "GROUP",
  "planNodeId": 2,
  "inputStage": {
    "stage": "COLLSCAN",
    "planNodeId": 1,
    "filter": {
      "categories": {
        "$eq": "Bakery"
      }
    },
    "direction": "forward"
  }
}

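For more information about the topics mentioned in this section, see the following resources:

Explain Output Server manual entry
Query Plans Server manual entry
ExplainVerbosity API documentation
explain() API documentation
AggregateIterable API documentation

Aggregation Expression Example

The Java driver provides builders for accumulator expressions for use with $group. You must declare all other expressions in JSON format or a compatible document format.

Tip

The syntax in either of the following examples defines an $arrayElemAt expression.

The $ in front of "categories" tells MongoDB that this is a field path that uses the categories field from the input document.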

new Document("$arrayElemAt", Arrays.asList("$categories", 0))

Document.parse("{ $arrayElemAt: ['$categories', 0] }")

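Alternatively, you can construct expressions by using the Aggregation Expression Operations API. To learn more, see Aggregation Expression Operations.

In the following example, the aggregation pipeline uses a $project stage and various Projections to return the name field and the calculated field firstCategory, whose value is the first element in the categories field.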

collection.aggregate(
    Arrays.asList(
        Aggregates.project(
            Projections.fields(
                Projections.excludeId(),
                Projections.include("name"),
                Projections.computed(
                    "firstCategory",
                    new Document(
                        "$arrayElemAt", 
                        Arrays.asList("$categories", 0)
                    )
                )
            )
        )
    )
).forEach(doc -> System.out.println(doc.toJson()));
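To show what the projected documents are expected to contain, the following driver-free sketch performs the same reshaping on plain Java values. It is only an in-memory analogy for the $project stage above. The ProjectSketch class and its project method are hypothetical names, and leaving out firstCategory for an empty list is an assumption about how $arrayElemAt treats an out-of-range index.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ProjectSketch {

    // Builds the projected shape: the name plus a computed firstCategory.
    // The _id field is never added, which mirrors Projections.excludeId().
    static Map<String, Object> project(String name, List<String> categories) {
        Map<String, Object> out = new LinkedHashMap<>(); // keeps insertion order for printing
        out.put("name", name);
        // In-memory counterpart of { $arrayElemAt: ["$categories", 0] }.
        // Assumption: for an empty array the field is left out of the result.
        if (!categories.isEmpty()) {
            out.put("firstCategory", categories.get(0));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> projected = project(
            "Sun Bakery Trattoria",
            Arrays.asList("Pizza", "Pasta", "Italian", "Coffee", "Sandwiches"));
        System.out.println(projected); // {name=Sun Bakery Trattoria, firstCategory=Pizza}
    }
}
```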