Aggregates Builders
Overview
In this guide, you can learn how to use the Aggregates class, which provides static factory methods that build aggregation pipeline stages in the MongoDB Java driver.
For a more thorough introduction to Aggregation, see our Aggregation guide.
Tip
For brevity, you may choose to import the methods of the following classes statically to make your queries more succinct:
Aggregates
Filters
Projections
Sorts
Accumulators
import static com.mongodb.client.model.Aggregates.*;
import static com.mongodb.client.model.Filters.*;
import static com.mongodb.client.model.Projections.*;
import static com.mongodb.client.model.Sorts.*;
import static com.mongodb.client.model.Accumulators.*;
import static java.util.Arrays.asList;
The examples on this page assume these static imports, in addition to statically importing the asList() method.
Use these methods to construct pipeline stages and specify them in your aggregation as a list:
Bson matchStage = match(eq("some_field", "some_criteria"));
// $sortByCount expects a $-prefixed field path or an expression
Bson sortByCountStage = sortByCount("$some_field");
collection.aggregate(asList(matchStage, sortByCountStage)).forEach(doc -> System.out.println(doc));
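To check what a builder produces without connecting to a server, you can render each stage to its BSON form. The following is a minimal sketch, assuming the driver is on the classpath. The class name PipelinePreview and the helpers render() and buildPipeline() are hypothetical names used only for illustration:

```java
import com.mongodb.MongoClientSettings;
import org.bson.BsonDocument;
import org.bson.Document;
import org.bson.conversions.Bson;

import java.util.List;

import static com.mongodb.client.model.Aggregates.match;
import static com.mongodb.client.model.Aggregates.sortByCount;
import static com.mongodb.client.model.Filters.eq;
import static java.util.Arrays.asList;

// Hypothetical helper class for illustration; not part of the driver.
class PipelinePreview {

    // Renders a single stage to a BsonDocument using the driver's default codecs.
    static BsonDocument render(Bson stage) {
        return stage.toBsonDocument(Document.class, MongoClientSettings.getDefaultCodecRegistry());
    }

    // Builds the two-stage pipeline from the overview example.
    static List<Bson> buildPipeline() {
        Bson matchStage = match(eq("some_field", "some_criteria"));
        // $sortByCount expects a $-prefixed field path or an expression.
        Bson sortByCountStage = sortByCount("$some_field");
        return asList(matchStage, sortByCountStage);
    }

    public static void main(String[] args) {
        // Print each stage as JSON so it can be compared with shell syntax.
        for (Bson stage : buildPipeline()) {
            System.out.println(render(stage).toJson());
        }
    }
}
```

Rendering stages this way can help when comparing a builder-based pipeline with one written in the MongoDB Shell.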
Match
Use the match() method to create a $match pipeline stage that matches incoming documents against the specified query filter, filtering out documents that do not match.
Tip
The filter can be an instance of any class that implements Bson, but it's convenient to combine it with the Filters class.
The following example creates a pipeline stage that matches all documents where the title field is equal to "The Shawshank Redemption":
match(eq("title", "The Shawshank Redemption"));
Project
Use the project() method to create a $project pipeline stage that projects specified document fields. Field projection in aggregation follows the same rules as field projection in queries.
Tip
Though the projection can be an instance of any class that implements Bson, it's convenient to combine it with the Projections class.
The following example creates a pipeline stage that excludes the _id field but includes the title and plot fields:
project(fields(include("title", "plot"), excludeId()));
Projecting Computed Fields
The $project stage can project computed fields as well.
The following example creates a pipeline stage that projects the rated field into a new field called rating, effectively renaming the field:
project(fields(computed("rating", "$rated"), excludeId()));
Sample
Use the sample() method to create a $sample pipeline stage to randomly select documents from input.
The following example creates a pipeline stage that randomly selects 5 documents:
sample(5);
Sort
Use the sort() method to create a $sort pipeline stage to sort by the specified criteria.
Tip
Though the sort criteria can be an instance of any class that implements Bson, it's convenient to combine it with the Sorts class.
The following example creates a pipeline stage that sorts in descending order according to the value of the year field and then in ascending order according to the value of the title field:
sort(orderBy(descending("year"), ascending("title")));
Skip
Use the skip() method to create a $skip pipeline stage to skip over the specified number of documents before passing documents into the next stage.
The following example creates a pipeline stage that skips the first 5 documents:
skip(5);
Limit
Use the limit() method to create a $limit pipeline stage to limit the number of documents passed to the next stage.
The following example creates a pipeline stage that limits the number of documents to 10:
limit(10);
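The sort, skip, and limit stages described above are often combined to page through results. The following is a minimal sketch, assuming zero-based page numbers and ordering by the title field. The class name Paging and the methods pageStages() and render() are hypothetical names used only for illustration:

```java
import com.mongodb.MongoClientSettings;
import org.bson.BsonDocument;
import org.bson.Document;
import org.bson.conversions.Bson;

import java.util.List;

import static com.mongodb.client.model.Aggregates.limit;
import static com.mongodb.client.model.Aggregates.skip;
import static com.mongodb.client.model.Aggregates.sort;
import static com.mongodb.client.model.Sorts.ascending;
import static java.util.Arrays.asList;

// Hypothetical helper class for illustration; not part of the driver.
class Paging {

    // Builds the stages for one page: sort first so the page contents are
    // stable, then skip the preceding pages, then keep a single page.
    static List<Bson> pageStages(int pageNumber, int pageSize) {
        return asList(
                sort(ascending("title")),
                skip(pageNumber * pageSize),
                limit(pageSize));
    }

    // Renders a stage to a BsonDocument using the driver's default codecs.
    static BsonDocument render(Bson stage) {
        return stage.toBsonDocument(Document.class, MongoClientSettings.getDefaultCodecRegistry());
    }

    public static void main(String[] args) {
        // Stages for the third page (page index 2) with 10 documents per page.
        for (Bson stage : pageStages(2, 10)) {
            System.out.println(render(stage).toJson());
        }
    }
}
```

Placing the sort stage before skip and limit matters here: without a defined order, consecutive pages may overlap or omit documents.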
Lookup
Use the lookup() method to create a $lookup pipeline stage to perform joins and uncorrelated subqueries between two collections.
Left Outer Join
The following example creates a pipeline stage that performs a left outer join between the movies and comments collections:
It joins the _id field from movies to the movie_id field in comments.
It outputs the results in the joined_comments field:
lookup("comments", "_id", "movie_id", "joined_comments");
Full Join and Uncorrelated SubQueries
The following example creates a pipeline stage that joins two collections, orders and warehouses, by the item and whether the available quantity is enough to fulfill the ordered quantity:
List<Variable<String>> variables = asList(
        new Variable<>("order_item", "$item"),
        new Variable<>("order_qty", "$ordered"));

List<Bson> pipeline = asList(
        match(expr(new Document("$and", asList(
                new Document("$eq", asList("$$order_item", "$stock_item")),
                new Document("$gte", asList("$instock", "$$order_qty")))))),
        project(fields(exclude("stock_item"), excludeId())));

// lookup() returns a single pipeline stage
Bson innerJoinLookup = lookup("warehouses", variables, pipeline, "stockdata");
Group
Use the group() method to create a $group pipeline stage to group documents by a specified expression and output a document for each distinct grouping.
Tip
The driver includes the Accumulators class with static factory methods for each of the supported accumulators.
The following example creates a pipeline stage that groups documents by the value of the customerId field. Each group accumulates the sum and average of the values of the quantity field into the totalQuantity and averageQuantity fields:
group("$customerId", sum("totalQuantity", "$quantity"), avg("averageQuantity", "$quantity"));
Unwind
Use the unwind() method to create an $unwind pipeline stage to deconstruct an array field from input documents, creating an output document for each array element.
The following example creates a document for each element in the sizes array:
unwind("$sizes");
To preserve documents that have missing or null values for the array field, or where the array is empty:
unwind("$sizes", new UnwindOptions().preserveNullAndEmptyArrays(true));
To include the array index, in this example in a field called "position":
unwind("$sizes", new UnwindOptions().includeArrayIndex("position"));
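The two options shown above can also be set on the same UnwindOptions instance. The following is a minimal sketch that builds such a stage and renders it to BSON for inspection. The class name UnwindBoth and its methods are hypothetical names used only for illustration:

```java
import com.mongodb.MongoClientSettings;
import com.mongodb.client.model.UnwindOptions;
import org.bson.BsonDocument;
import org.bson.Document;
import org.bson.conversions.Bson;

import static com.mongodb.client.model.Aggregates.unwind;

// Hypothetical helper class for illustration; not part of the driver.
class UnwindBoth {

    // Keeps documents whose sizes array is missing, null, or empty, and
    // records each element's array index in the "position" field.
    static Bson unwindSizes() {
        return unwind("$sizes", new UnwindOptions()
                .preserveNullAndEmptyArrays(true)
                .includeArrayIndex("position"));
    }

    // Renders a stage to a BsonDocument using the driver's default codecs.
    static BsonDocument render(Bson stage) {
        return stage.toBsonDocument(Document.class, MongoClientSettings.getDefaultCodecRegistry());
    }

    public static void main(String[] args) {
        System.out.println(render(unwindSizes()).toJson());
    }
}
```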
Out
Use the out() method to create an $out pipeline stage that writes all documents to the specified collection in the same database.
Important
The $out stage must be the last stage in any aggregation pipeline.
The following example writes the results of the pipeline to the authors collection:
out("authors");
Merge
Use the merge() method to create a $merge pipeline stage that merges all documents into the specified collection.
Important
The $merge stage must be the last stage in any aggregation pipeline.
The following example merges the pipeline into the authors collection using the default options:
merge("authors");
The following example merges the pipeline into the customers collection in the reporting database, using options that replace the document if both date and customerId match and insert the document otherwise:
merge(new MongoNamespace("reporting", "customers"),
        new MergeOptions().uniqueIdentifier(asList("date", "customerId"))
                .whenMatched(MergeOptions.WhenMatched.REPLACE)
                .whenNotMatched(MergeOptions.WhenNotMatched.INSERT));
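Because the $merge stage must come last, a pipeline that writes its results usually places the stages that shape the data first. The following is a minimal sketch of that ordering. The status and author fields, the bookCount output field, and the class name MergeLast are hypothetical and used only for illustration:

```java
import com.mongodb.MongoClientSettings;
import org.bson.BsonDocument;
import org.bson.Document;
import org.bson.conversions.Bson;

import java.util.List;

import static com.mongodb.client.model.Accumulators.sum;
import static com.mongodb.client.model.Aggregates.group;
import static com.mongodb.client.model.Aggregates.match;
import static com.mongodb.client.model.Aggregates.merge;
import static com.mongodb.client.model.Filters.eq;
import static java.util.Arrays.asList;

// Hypothetical helper class for illustration; not part of the driver.
class MergeLast {

    // Filters and groups the input, then merges the grouped results into
    // the authors collection. The $merge stage is the final stage.
    static List<Bson> buildPipeline() {
        return asList(
                match(eq("status", "published")),       // hypothetical field
                group("$author", sum("bookCount", 1)),  // hypothetical fields
                merge("authors"));
    }

    // Renders a stage to a BsonDocument using the driver's default codecs.
    static BsonDocument render(Bson stage) {
        return stage.toBsonDocument(Document.class, MongoClientSettings.getDefaultCodecRegistry());
    }

    public static void main(String[] args) {
        for (Bson stage : buildPipeline()) {
            System.out.println(render(stage).toJson());
        }
    }
}
```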
GraphLookup
Use the graphLookup() method to create a $graphLookup pipeline stage that performs a recursive search on a specified collection to match a specified field in one document to a specified field of another document.
The following example computes the social network graph for users in the contacts collection, recursively matching the value in the friends field to the name field:
graphLookup("contacts", "$friends", "friends", "name", "socialNetwork");
Using GraphLookupOptions, you can specify the depth to recurse as well as the name of the depth field, if desired. In this example, $graphLookup will recurse up to two times, and create a field called degrees with the recursion depth information for every document:
graphLookup("contacts", "$friends", "friends", "name", "socialNetwork", new GraphLookupOptions().maxDepth(2).depthField("degrees"));
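To make the effect of maxDepth and the depth field concrete, the following plain Java sketch mimics the recursive search on an in-memory map. It is an illustration of the semantics only, not how the server implements $graphLookup. It assumes that depth 0 corresponds to the documents matched by the initial lookup, and the class name GraphLookupSketch, its methods, and the sample contact names are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import static java.util.Arrays.asList;

// Hypothetical illustration class; it does not use the driver or a server.
class GraphLookupSketch {

    // Maps each contact name to the names in that contact's friends field.
    static Map<String, List<String>> sampleContacts() {
        Map<String, List<String>> contacts = new LinkedHashMap<>();
        contacts.put("Ann", asList("Bob"));
        contacts.put("Bob", asList("Cara"));
        contacts.put("Cara", asList("Dan"));
        contacts.put("Dan", asList("Ann"));
        return contacts;
    }

    // Follows friends -> name links starting from startWith and returns each
    // reached contact with the depth at which it was first found.
    static Map<String, Integer> reachable(Map<String, List<String>> friendsByName,
                                          List<String> startWith, int maxDepth) {
        Map<String, Integer> depthByName = new LinkedHashMap<>();
        List<String> frontier = startWith;
        for (int depth = 0; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String name : frontier) {
                // Only names that exist and were not reached earlier are added.
                if (friendsByName.containsKey(name) && !depthByName.containsKey(name)) {
                    depthByName.put(name, depth);
                    next.addAll(friendsByName.get(name));
                }
            }
            frontier = next;
        }
        return depthByName;
    }

    public static void main(String[] args) {
        // Ann's friends field is ["Bob"], so the search for Ann starts there.
        System.out.println(reachable(sampleContacts(), asList("Bob"), 2));
    }
}
```

In this sketch, raising maxDepth by one allows the search to follow one more link from the contacts found at the previous depth.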
Using GraphLookupOptions, you can specify a filter that documents must match in order for MongoDB to include them in your search. In this example, only links with "golf" in their hobbies field will be included:
graphLookup("contacts", "$friends", "friends", "name", "socialNetwork", new GraphLookupOptions().maxDepth(1).restrictSearchWithMatch(eq("hobbies", "golf")));
SortByCount
Use the sortByCount() method to create a $sortByCount pipeline stage that groups documents by a given expression and then sorts these groups by count in descending order.
Tip
The $sortByCount stage is equivalent to a $group stage with a $sum accumulator followed by a $sort stage.
[
    { "$group": { "_id": <expression to group on>, "count": { "$sum": 1 } } },
    { "$sort": { "count": -1 } }
]
The following example groups documents by the truncated value of the field x and computes the count for each distinct value:
sortByCount(new Document("$floor", "$x"));
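The equivalence between $sortByCount and a $group stage followed by a $sort stage can also be expressed with builders. The following is a minimal sketch that groups on the field x directly rather than on its truncated value. The class name SortByCountEquivalent and its methods are hypothetical names used only for illustration:

```java
import com.mongodb.MongoClientSettings;
import org.bson.BsonDocument;
import org.bson.Document;
import org.bson.conversions.Bson;

import java.util.List;

import static com.mongodb.client.model.Accumulators.sum;
import static com.mongodb.client.model.Aggregates.group;
import static com.mongodb.client.model.Aggregates.sort;
import static com.mongodb.client.model.Aggregates.sortByCount;
import static com.mongodb.client.model.Sorts.descending;
import static java.util.Arrays.asList;

// Hypothetical helper class for illustration; not part of the driver.
class SortByCountEquivalent {

    // The single-stage shorthand.
    static Bson shorthand() {
        return sortByCount("$x");
    }

    // The two-stage form: count the documents in each group, then sort the
    // groups by that count in descending order.
    static List<Bson> expanded() {
        return asList(
                group("$x", sum("count", 1)),
                sort(descending("count")));
    }

    // Renders a stage to a BsonDocument using the driver's default codecs.
    static BsonDocument render(Bson stage) {
        return stage.toBsonDocument(Document.class, MongoClientSettings.getDefaultCodecRegistry());
    }

    public static void main(String[] args) {
        System.out.println(render(shorthand()).toJson());
        for (Bson stage : expanded()) {
            System.out.println(render(stage).toJson());
        }
    }
}
```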
ReplaceRoot
Use the replaceRoot() method to create a $replaceRoot pipeline stage that replaces each input document with the specified document.
The following example replaces each input document with the nested document in the spanish_translation field:
replaceRoot("$spanish_translation");
AddFields
Use the addFields() method to create an $addFields pipeline stage that adds new fields to documents.
Tip
Use $addFields when you do not want to project field inclusion or exclusion.
The following example adds two new fields, a and b, to the input documents:
addFields(new Field("a", 1), new Field("b", 2));
Count
Use the count() method to create a $count pipeline stage that counts the number of documents that enter the stage, and assigns that value to a specified field name. If you do not specify a field, count() defaults the field name to "count".
Tip
The $count stage is syntactic sugar for:
{ "$group":{ "_id": 0, "count": { "$sum" : 1 } } }
The following example creates a pipeline stage that outputs the count of incoming documents in a field called "total":
count("total");
Bucket
Use the bucket() method to create a $bucket pipeline stage that automates the bucketing of data around predefined boundary values.
The following example creates a pipeline stage that groups incoming documents based on the value of their screenSize field, inclusive of the lower boundary and exclusive of the upper boundary:
bucket("$screenSize", asList(0, 24, 32, 50, 70, 200));
Use the BucketOptions class to specify a default bucket for values outside of the specified boundaries, and to specify additional accumulators.
The following example creates a pipeline stage that groups incoming documents based on the value of their screenSize field, counting the number of documents that fall within each bucket, pushing the value of screenSize into a field called matches, and capturing any screen sizes of 70 or greater in a bucket called "monster" for monstrously large screen sizes:
Tip
The driver includes the Accumulators class with static factory methods for each of the supported accumulators.
bucket("$screenSize", asList(0, 24, 32, 50, 70),
        new BucketOptions().defaultBucket("monster")
                .output(sum("count", 1), push("matches", "$screenSize")));
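The boundary rules described in this section (lower boundary inclusive, upper boundary exclusive, default bucket for everything else) can be illustrated in plain Java. The following sketch mimics the bucket assignment for a single value and is not the server's implementation. It assumes each bucket is identified by its lower boundary, and the class name BucketSemantics and method bucketFor() are hypothetical:

```java
import java.util.List;

import static java.util.Arrays.asList;

// Hypothetical illustration class; it does not use the driver or a server.
class BucketSemantics {

    // Returns the lower boundary of the bucket that contains value, or
    // defaultBucket when value falls outside all of the boundaries.
    static Object bucketFor(int value, List<Integer> boundaries, Object defaultBucket) {
        for (int i = 0; i < boundaries.size() - 1; i++) {
            int lower = boundaries.get(i);      // inclusive
            int upper = boundaries.get(i + 1);  // exclusive
            if (value >= lower && value < upper) {
                return lower;
            }
        }
        return defaultBucket;
    }

    public static void main(String[] args) {
        List<Integer> boundaries = asList(0, 24, 32, 50, 70);
        System.out.println(bucketFor(23, boundaries, "monster")); // prints 0
        System.out.println(bucketFor(24, boundaries, "monster")); // prints 24
        System.out.println(bucketFor(70, boundaries, "monster")); // prints monster
    }
}
```

Note that a value equal to the last boundary, 70 in this sketch, lands in the default bucket because the upper boundary is exclusive.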
BucketAuto
Use the bucketAuto() method to create a $bucketAuto pipeline stage that automatically determines the boundaries of each bucket in its attempt to distribute the documents evenly into a specified number of buckets.
The following example creates a pipeline stage that will attempt to create and evenly distribute documents into 10 buckets using the value of their price field:
bucketAuto("$price", 10);
Use the BucketAutoOptions class to specify a scheme based on preferred numbers for setting boundary values, and to specify additional accumulators.
The following example creates a pipeline stage that will attempt to create and evenly distribute documents into 10 buckets using the value of their price field, setting the bucket boundaries at powers of 2 (2, 4, 8, 16, ...). It also counts the number of documents in each bucket, and calculates their average price in a new field called avgPrice:
Tip
The driver includes the Accumulators class with static factory methods for each of the supported accumulators.
bucketAuto("$price", 10,
        new BucketAutoOptions().granularity(BucketGranularity.POWERSOF2)
                .output(sum("count", 1), avg("avgPrice", "$price")));
Facet
Use the facet() method to create a $facet pipeline stage that allows for the definition of parallel pipelines.
The following example creates a pipeline stage that executes two parallel aggregations:
The first aggregation distributes incoming documents into 5 groups according to their attributes.screen_size field.
The second aggregation counts all manufacturers and returns their count, limited to the top 5.
facet(
        new Facet("Screen Sizes",
                bucketAuto("$attributes.screen_size", 5,
                        new BucketAutoOptions().output(sum("count", 1)))),
        new Facet("Manufacturer",
                sortByCount("$attributes.manufacturer"),
                limit(5)));
SetWindowFields
Use the setWindowFields() method to create a $setWindowFields pipeline stage that allows using window operators to perform operations on a specified span of documents in a collection.
Tip
Window Functions
The driver includes the WindowedComputations class with static factory methods for each of the supported window operators.
The following example creates a pipeline stage that computes the accumulated rainfall and the average temperature over the past month for each locality from more fine-grained measurements presented in the rainfall and temperature fields:
Window pastMonth = Windows.timeRange(-1, MongoTimeUnit.MONTH, Windows.Bound.CURRENT);
setWindowFields("$localityId", Sorts.ascending("measurementDateTime"),
        WindowedComputations.sum("monthlyRainfall", "$rainfall", pastMonth),
        WindowedComputations.avg("monthlyAvgTemp", "$temperature", pastMonth));