/ /

Aggregation Builder

Overview

In this guide, you can learn how to perform aggregations and construct pipelines by using the Laravel Integration aggregation builder. The aggregation builder lets you use a type-safe syntax to construct a MongoDB aggregation pipeline.

An aggregation pipeline is a data processing pipeline that sequentially performs transformations and computations on data from the MongoDB database, then outputs the results as a new document or set of documents.

An aggregation pipeline is composed of aggregation stages. Aggregation stages use operators to process input data and produce data that the next stage uses as its input.

The Laravel MongoDB aggregation builder lets you build aggregation stages and aggregation pipelines. The following sections show examples of how to use the aggregation builder to create the stages of an aggregation pipeline:

Create Aggregation Stages
Build Aggregation Pipelines
Create a Custom Operator Factory

Tip

The aggregation builder feature is available only in Laravel MongoDB versions 4.3 and later. To learn more about running aggregations without using the aggregation builder, see Aggregations in the Query Builder guide.

Create Aggregation Stages

To start an aggregation pipeline, call the Model::aggregate() method. Then, chain aggregation stage methods and specify the necessary parameters for the stage. For example, you can call the sort() operator method to build a $sort stage.

The aggregation builder includes the following namespaces that you can import to build aggregation stages:

MongoDB\Builder\Accumulator
MongoDB\Builder\Expression
MongoDB\Builder\Query
MongoDB\Builder\Type

This section features the following examples that show how to use common aggregation stages:

Match Stage Example
Group Stage Example
Sort Stage Example
Project Stage Example

To learn more about MongoDB aggregation operators, see Aggregation Stages in the Server manual.

Sample Documents

The following examples run aggregation pipelines on a collection represented by the User model. You can add the sample data by running the following insert() method:

User::insert([
    ['name' => 'Alda Gröndal', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('2002-01-01'))],
    ['name' => 'Francois Soma', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1998-02-02'))],
    ['name' => 'Janet Doe', 'occupation' => 'designer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1987-03-03'))],
    ['name' => 'Eliud Nkosana', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1984-04-04'))],
    ['name' => 'Bran Steafan', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1998-05-05'))],
    ['name' => 'Ellis Lee', 'occupation' => 'designer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1996-06-06'))],
]);

Match Stage Example

You can chain the match() method to your aggregation pipeline to specify a query filter. If you omit this stage, the aggregate() method outputs all the documents in the model's collection for the following stage.

This aggregation stage is often placed first to retrieve the data by using available indexes and reduce the amount of data the subsequent stages process.

Tip

If you omit the match() method, the aggregation pipeline matches all documents in the collection that correspond to the model before other aggregation stages.

This example constructs a query filter for a match aggregation stage by using the MongoDB\Builder\Query builder. The match stage includes the following criteria:

Returns results that match either of the query filters by using the Query::or() function
Matches documents that contain an occupation field with a value of "designer" by using the Query::query() and Query::eq() functions
Matches documents that contain a name field with a value of "Eliud Nkosana" by using the Query::query() and Query::eq() functions

Click the VIEW OUTPUT button to see the documents returned by running the code:

$pipeline = User::aggregate()
    ->match(Query::or(
        Query::query(occupation: Query::eq('designer')),
        Query::query(name: Query::eq('Eliud Nkosana')),
    ));
$result = $pipeline->get();

[
  {
    "_id": ...,
    "name": "Janet Doe",
    "occupation": "designer",
    "birthday": {
      "$date": {
        "$numberLong": "541728000000"
      }
    }
  },
  {
    "_id": ...,
    "name": "Eliud Nkosana",
    "occupation": "engineer",
    "birthday": {
      "$date": {
        "$numberLong": "449884800000"
      }
    }
  },
  {
    "_id": ...,
    "name": "Ellis Lee",
    "occupation": "designer",
    "birthday": {
      "$date": {
        "$numberLong": "834019200000"
      }
    }
  }
]

Tip

The Query::or() function corresponds to the $or MongoDB query operator. To learn more about this operator, see $or in the Server manual.

Group Stage Example

You can chain the group() method to your aggregation pipeline to modify the structure of the data by performing calculations and grouping it by common field values.

This aggregation stage is often placed immediately after a match stage to reduce the data subsequent stages process.

This example uses the MongoDB\Builder\Expression builder to define the group keys in a group aggregation stage. The group stage specifies the following grouping behavior:

Sets the value of the group key, represented by the _id field, to the field value defined by the Expression builder
References the document values in the occupation field by calling the Expression::fieldPath() function

Click the VIEW OUTPUT button to see the documents returned by running the code:

$pipeline = User::aggregate()
    ->group(_id: Expression::fieldPath('occupation'));
$result = $pipeline->get();

[
  { "_id": "engineer" },
  { "_id": "designer" }
]

Tip

This example stage performs a similar task as the distinct() query builder method. To learn more about the distinct() method, see the Retrieve Distinct Field Values usage example.

Sort Stage Example

You can chain the sort() method to your aggregation pipeline to specify the documents' output order.

You can add this aggregation stage anywhere in the pipeline. It is often placed after the group stage since it can depend on the grouped data. We recommend placing the sort stage as late as possible in the pipeline to limit the data it processes.

To specify an sort, set the field value to the Sort::Asc enum for an ascending sort or the Sort::Desc enum for a descending sort.

This example shows a sort() aggregation pipeline stage that sorts the documents by the name field to Sort::Desc, corresponding to reverse alphabetical order. Click the VIEW OUTPUT button to see the documents returned by running the code:

$pipeline = User::aggregate()
    ->sort(name: Sort::Desc);
$result = $pipeline->get();

[
  {
    "_id": ...,
    "name": "Janet Doe",
    "occupation": "designer",
    "birthday": {
      "$date": {
        "$numberLong": "541728000000"
      }
    }
  },
  {
    "_id": ...,
    "name": "Francois Soma",
    "occupation": "engineer",
    "birthday": {
      "$date": {
        "$numberLong": "886377600000"
      }
    }
  },
  {
    "_id": ...,
    "name": "Ellis Lee",
    "occupation": "designer",
    "birthday": {
      "$date": {
        "$numberLong": "834019200000"
      }
    }
  },
  {
    "_id": ...,
    "name": "Eliud Nkosana",
    "occupation": "engineer",
    "birthday": {
      "$date": {
        "$numberLong": "449884800000"
      }
    }
  },
  {
    "_id": ...,
    "name": "Bran Steafan",
    "occupation": "engineer",
    "birthday": {
      "$date": {
        "$numberLong": "894326400000"
      }
    }
  },
  {
    "_id": ...,
    "name": "Alda Gröndal",
    "occupation": "engineer",
    "birthday": {
      "$date": {
        "$numberLong": "1009843200000"
      }
    }
  }
]

Project Stage Example

You can chain the project() method to your aggregation pipeline to specify which fields from the documents to display by this stage.

To specify fields to include, pass the name of a field and a truthy value, such as 1 or true. All other fields are omitted from the output.

Alternatively, to specify fields to exclude, pass each field name and a falsy value, such as 0 or false. All other fields are included in the output.

Tip

When you specify fields to include, the _id field is included by default. To exclude the _id field, explicitly exclude it in the projection stage.

This example shows how to use the project() method aggregation stage to include only the name field and exclude all other fields from the output. Click the VIEW OUTPUT button to see the data returned by running the code:

$pipeline = User::aggregate()
    ->project(_id: 0, name: 1);
$result = $pipeline->get();

[
  { "name": "Alda Gröndal" },
  { "name": "Francois Soma" },
  { "name": "Janet Doe" },
  { "name": "Eliud Nkosana" },
  { "name": "Bran Steafan" },
  { "name": "Ellis Lee" }
]

Build Aggregation Pipelines

To build an aggregation pipeline, call the Model::aggregate() method, then chain the aggregation stages in the sequence you want them to run. The examples in this section are adapted from the Server manual. Each example provides a link to the sample data that you can insert into your database to test the aggregation operation.

This section features the following examples, which show how to use common aggregation stages:

Filter and Group Example
Unwind Embedded Arrays Example
Single Equality Join Example

Filter and Group Example

This example uses the sample data given in the Calculate Count, Sum, and Average section of the $group stage reference in the Server manual.

The following code example calculates the total sales amount, average sales quantity, and sale count for each day in the year 2014. To do so, it uses an aggregation pipeline that contains the following stages:

$match stage to filter for documents that contain a date field in which the year is 2014
$group stage to group the documents by date and calculate the total sales amount, average sales quantity, and sale count for each group
$sort stage to sort the results by the total sale amount for each group in descending order

Click the VIEW OUTPUT button to see the data returned by running the code:

$pipeline = Sale::aggregate()
    ->match(
        date: [
            Query::gte(new UTCDateTime(new DateTimeImmutable('2014-01-01'))),
            Query::lt(new UTCDateTime(new DateTimeImmutable('2015-01-01'))),
        ],
    )
    ->group(
        _id: Expression::dateToString(Expression::dateFieldPath('date'), '%Y-%m-%d'),
        totalSaleAmount: Accumulator::sum(
            Expression::multiply(
                Expression::numberFieldPath('price'),
                Expression::numberFieldPath('quantity'),
            ),
        ),
        averageQuantity: Accumulator::avg(
            Expression::numberFieldPath('quantity'),
        ),
        count: Accumulator::sum(1),
    )
    ->sort(
        totalSaleAmount: Sort::Desc,
    );

[
  { "_id": "2014-04-04", "totalSaleAmount": { "$numberDecimal": "200" }, "averageQuantity": 15, "count": 2 },
  { "_id": "2014-03-15", "totalSaleAmount": { "$numberDecimal": "50" }, "averageQuantity": 10, "count": 1 },
  { "_id": "2014-03-01", "totalSaleAmount": { "$numberDecimal": "40" }, "averageQuantity": 1.5, "count": 2 }
]

Unwind Embedded Arrays Example

This example uses the sample data given in the Unwind Embedded Arrays section of the $unwind stage reference in the Server manual.

The following code example groups sold items by their tags and calculates the total sales amount for each tag. To do so, it uses an aggregation pipeline that contains the following stages:

$unwind stage to output a separate document for each element in the items array
$unwind stage to output a separate document for each element in the items.tags arrays
$group stage to group the documents by the tag value and calculate the total sales amount of items that have each tag

Click the VIEW OUTPUT button to see the data returned by running the code:

$pipeline = Sale::aggregate()
    ->unwind(Expression::arrayFieldPath('items'))
    ->unwind(Expression::arrayFieldPath('items.tags'))
    ->group(
        _id: Expression::fieldPath('items.tags'),
        totalSalesAmount: Accumulator::sum(
            Expression::multiply(
                Expression::numberFieldPath('items.price'),
                Expression::numberFieldPath('items.quantity'),
            ),
        ),
    );

[
  { "_id": "school", "totalSalesAmount": { "$numberDecimal": "104.85" } },
  { "_id": "electronics", "totalSalesAmount": { "$numberDecimal": "800.00" } },
  { "_id": "writing", "totalSalesAmount": { "$numberDecimal": "60.00" } },
  { "_id": "office", "totalSalesAmount": { "$numberDecimal": "1019.60" } },
  { "_id": "stationary", "totalSalesAmount": { "$numberDecimal": "264.45" } }
]

Single Equality Join Example

This example uses the sample data given in the Perform a Single Equality Join with $lookup section of the $lookup stage reference in the Server manual.

The following code example joins the documents from the orders collection with the documents from the inventory collection by using the item field from the orders collection and the sku field from the inventory collection.

To do so, the example uses an aggregation pipeline that contains a $lookup stage that specifies the collection to retrieve data from and the local and foreign field names.

Click the VIEW OUTPUT button to see the data returned by running the code:

$pipeline = Order::aggregate()
    ->lookup(
        from: 'inventory',
        localField: 'item',
        foreignField: 'sku',
        as: 'inventory_docs',
    );

[
  { "_id": 1, "item": "almonds", "price": 12, "quantity": 2, "inventory_docs": [
      { "_id": 1, "sku": "almonds", "description": "product 1", "instock": 120 }
  ] },
  { "_id": 2, "item": "pecans", "price": 20, "quantity": 1, "inventory_docs": [
      { "_id": 4, "sku": "pecans", "description": "product 4", "instock": 70 }
  ] },
  { "_id": 3, "inventory_docs": [
      { "_id": 5, "sku": null, "description": "Incomplete" },
      { "_id": 6 }
  ] }
]

Create a Custom Operator Factory

When using the aggregation builder to create an aggregation pipeline, you can define operations or stages in a custom operator factory. A custom operator factory is a function that returns expressions or stages of an aggregation pipeline. You can create these functions to improve code readability and reuse.

This example shows how to create and use a custom operator factory that returns expressions that extract the year from a specified date field.

The following function accepts the name of a field that contains a date and returns an expression that extracts the year from the date:

public function yearFromField(string $dateFieldName): YearOperator
{
    return Expression::year(
        Expression::dateFieldPath($dateFieldName),
    );
}

The example aggregation pipeline includes the following stages:

addFields(), which calls the custom operator factory function to extract the year from the birthday field and assign it to the birth_year field
project(), which includes only the name and birth_year fields in its output

Click the VIEW OUTPUT button to see the data returned by running the code:

$pipeline = User::aggregate()
    ->addFields(birth_year: $this->yearFromField('birthday'))
    ->project(_id: 0, name: 1, birth_year: 1);

[
  {
    "name": "Alda Gröndal",
    "birth_year": 2002
  },
  {
    "name": "Francois Soma",
    "birth_year": 1998
  },
  {
    "name": "Janet Doe",
    "birth_year": 1987
  },
  {
    "name": "Eliud Nkosana",
    "birth_year": 1984
  },
  {
    "name": "Bran Steafan",
    "birth_year": 1998
  },
  {
    "name": "Ellis Lee",
    "birth_year": 1996
  }
]

Back

Delete

MongoDB Search