Aggregation Builder
On this page
Overview
In this guide, you can learn how to perform aggregations and construct pipelines by using the Laravel Integration aggregation builder. The aggregation builder lets you use a type-safe syntax to construct a MongoDB aggregation pipeline.
An aggregation pipeline is a data processing pipeline that sequentially performs transformations and computations on data from the MongoDB database, then outputs the results as a new document or set of documents.
An aggregation pipeline is composed of aggregation stages. Aggregation stages use operators to process input data and produce data that the next stage uses as its input.
The Laravel MongoDB aggregation builder lets you build aggregation stages and aggregation pipelines. The following sections show examples of how to use the aggregation builder to create the stages of an aggregation pipeline:
Tip
The aggregation builder feature is available only in Laravel MongoDB versions 4.3 and later. To learn more about running aggregations without using the aggregation builder, see Aggregations in the Query Builder guide.
Add the Aggregation Builder Dependency
The aggregation builder is part of the mongodb/builder package. You must add this package as a dependency to your project to use it. Run the following command to add the aggregation builder dependency to your application:
composer require mongodb/builder:^0.2
When the installation completes, verify that the composer.json
file
includes the following line in the require
object:
"mongodb/builder": "^0.2",
Create an Aggregation Pipeline
To start an aggregation pipeline, call the Model::aggregate()
method.
Then, chain the aggregation stage methods in the sequence you want them to
run.
The aggregation builder includes the following namespaces that you can import to build aggregation stages:
MongoDB\Builder\Accumulator
MongoDB\Builder\Expression
MongoDB\Builder\Query
MongoDB\Builder\Type
Tip
To learn more about builder classes, see the mongodb/mongodb-php-builder GitHub repository.
This section features the following examples, which show how to use common aggregation stages and combine stages to build an aggregation pipeline:
To learn more about MongoDB aggregation operators, see Aggregation Stages in the Server manual.
Sample Documents
The following examples run aggregation pipelines on a collection represented
by the User
model. You can add the sample data by running the following
insert()
method:
User::insert([ ['name' => 'Alda Gröndal', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('2002-01-01'))], ['name' => 'Francois Soma', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1998-02-02'))], ['name' => 'Janet Doe', 'occupation' => 'designer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1987-03-03'))], ['name' => 'Eliud Nkosana', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1984-04-04'))], ['name' => 'Bran Steafan', 'occupation' => 'engineer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1998-05-05'))], ['name' => 'Ellis Lee', 'occupation' => 'designer', 'birthday' => new UTCDateTime(new DateTimeImmutable('1996-06-06'))], ]);
Match Stage Example
You can chain the match()
method to your aggregation pipeline to specify
a query filter. If you omit this stage, the aggregate()
method outputs
all the documents in the model's collection for the following stage.
This aggregation stage is often placed first to retrieve the data by using available indexes and reduce the amount of data the subsequent stages process.
Tip
If you omit the match()
method, the aggregation pipeline matches all
documents in the collection that correspond to the model before other
aggregation stages.
This example constructs a query filter for a match aggregation stage by
using the MongoDB\Builder\Query
builder. The match stage includes the
following criteria:
Returns results that match either of the query filters by using the
Query::or()
functionMatches documents that contain an
occupation
field with a value of"designer"
by using theQuery::query()
andQuery::eq()
functionsMatches documents that contain a
name
field with a value of"Eliud Nkosana"
by using theQuery::query()
andQuery::eq()
functions
Click the VIEW OUTPUT button to see the documents returned by running the code:
$pipeline = User::aggregate() ->match(Query::or( Query::query(occupation: Query::eq('designer')), Query::query(name: Query::eq('Eliud Nkosana')), )); $result = $pipeline->get();
[ { "_id": ..., "name": "Janet Doe", "occupation": "designer", "birthday": { "$date": { "$numberLong": "541728000000" } } }, { "_id": ..., "name": "Eliud Nkosana", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "449884800000" } } }, { "_id": ..., "name": "Ellis Lee", "occupation": "designer", "birthday": { "$date": { "$numberLong": "834019200000" } } } ]
Tip
The Query::or()
function corresponds to the $or
MongoDB query operator.
To learn more about this operator, see $or
in the Server manual.
Group Stage Example
You can chain the group()
method to your aggregation pipeline to modify the
structure of the data by performing calculations and grouping it by common
field values.
This aggregation stage is often placed immediately after a match stage to reduce the data subsequent stages process.
This example uses the MongoDB\Builder\Expression
builder to define the group keys in a
group aggregation stage. The group stage specifies the following grouping
behavior:
Sets the value of the group key, represented by the
_id
field, to the field value defined by theExpression
builderReferences the document values in the
occupation
field by calling theExpression::fieldPath()
function
Click the VIEW OUTPUT button to see the documents returned by running the code:
$pipeline = User::aggregate() ->group(_id: Expression::fieldPath('occupation')); $result = $pipeline->get();
[ { "_id": "engineer" }, { "_id": "designer" } ]
Tip
This example stage performs a similar task as the distinct()
query
builder method. To learn more about the distinct()
method, see the
Retrieve Distinct Field Values usage example.
Sort Stage Example
You can chain the sort()
method to your aggregation pipeline to specify
the documents' output order.
You can add this aggregation stage anywhere in the pipeline. It is often placed after the group stage since it can depend on the grouped data. We recommend placing the sort stage as late as possible in the pipeline to limit the data it processes.
To specify an sort, set the field value to the Sort::Asc
enum for an
ascending sort or the Sort::Desc
enum for a descending sort.
This example shows a sort()
aggregation pipeline stage that sorts the
documents by the name
field to Sort::Desc
, corresponding to reverse
alphabetical order. Click the VIEW OUTPUT button to see
the documents returned by running the code:
$pipeline = User::aggregate() ->sort(name: Sort::Desc); $result = $pipeline->get();
[ { "_id": ..., "name": "Janet Doe", "occupation": "designer", "birthday": { "$date": { "$numberLong": "541728000000" } } }, { "_id": ..., "name": "Francois Soma", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "886377600000" } } }, { "_id": ..., "name": "Ellis Lee", "occupation": "designer", "birthday": { "$date": { "$numberLong": "834019200000" } } }, { "_id": ..., "name": "Eliud Nkosana", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "449884800000" } } }, { "_id": ..., "name": "Bran Steafan", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "894326400000" } } }, { "_id": ..., "name": "Alda Gröndal", "occupation": "engineer", "birthday": { "$date": { "$numberLong": "1009843200000" } } } ]
Project Stage Example
You can chain the project()
method to your aggregation pipeline to specify
which fields from the documents to display by this stage.
To specify fields to include, pass the name of a field and a truthy value,
such as 1
or true
. All other fields are omitted from the output.
Alternatively, to specify fields to exclude, pass each field name and
a falsy value, such as 0
or false
. All other fields are included in
the output.
Tip
When you specify fields to include, the _id
field is included by default.
To exclude the _id
field, explicitly exclude it in the projection stage.
This example shows how to use the project()
method aggregation stage to
include only the name
field and exclude all other fields from the output.
Click the VIEW OUTPUT button to see the data returned by
running the code:
$pipeline = User::aggregate() ->project(_id: 0, name: 1); $result = $pipeline->get();
[ { "name": "Alda Gröndal" }, { "name": "Francois Soma" }, { "name": "Janet Doe" }, { "name": "Eliud Nkosana" }, { "name": "Bran Steafan" }, { "name": "Ellis Lee" } ]
Aggregation Pipeline Example
This aggregation pipeline example chains multiple stages. Each stage runs on the output retrieved from each preceding stage. In this example, the stages perform the following operations sequentially:
Add the
birth_year
field to the documents and set the value to the year extracted from thebirthday
field.Group the documents by the value of the
occupation
field and compute the average value ofbirth_year
for each group by using theAccumulator::avg()
function. Assign the result of the computation to thebirth_year_avg
field.Sort the documents by the group key field in ascending order.
Create the
profession
field from the value of the group key field, include thebirth_year_avg
field, and omit the_id
field.
Click the VIEW OUTPUT button to see the data returned by running the code:
$pipeline = User::aggregate() ->addFields( birth_year: Expression::year( Expression::dateFieldPath('birthday'), ), ) ->group( _id: Expression::fieldPath('occupation'), birth_year_avg: Accumulator::avg(Expression::numberFieldPath('birth_year')), ) ->sort(_id: Sort::Asc) ->project(profession: Expression::fieldPath('_id'), birth_year_avg: 1, _id: 0);
[ { "birth_year_avg": 1991.5, "profession": "designer" }, { "birth_year_avg": 1995.5, "profession": "engineer" } ]
Note
Since this pipeline omits the match()
stage, the input for the initial
stage consists of all the documents in the collection.
Create a Custom Operator Factory
When using the aggregation builder to create an aggregation pipeline, you can define operations or stages in a custom operator factory. A custom operator factory is a function that returns expressions or stages of an aggregation pipeline. You can create these functions to improve code readability and reuse.
This example shows how to create and use a custom operator factory that returns expressions that extract the year from a specified date field.
The following function accepts the name of a field that contains a date and returns an expression that extracts the year from the date:
public function yearFromField(string $dateFieldName): YearOperator { return Expression::year( Expression::dateFieldPath($dateFieldName), ); }
The example aggregation pipeline includes the following stages:
addFields()
, which calls the custom operator factory function to extract the year from thebirthday
field and assign it to thebirth_year
fieldproject()
, which includes only thename
andbirth_year
fields in its output
Click the VIEW OUTPUT button to see the data returned by running the code:
$pipeline = User::aggregate() ->addFields(birth_year: $this->yearFromField('birthday')) ->project(_id: 0, name: 1, birth_year: 1);
[ { "name": "Alda Gröndal", "birth_year": 2002 }, { "name": "Francois Soma", "birth_year": 1998 }, { "name": "Janet Doe", "birth_year": 1987 }, { "name": "Eliud Nkosana", "birth_year": 1984 }, { "name": "Bran Steafan", "birth_year": 1998 }, { "name": "Ellis Lee", "birth_year": 1996 } ]