Aggregates Builders

On this page

  • Overview
  • Match
  • Project
  • Projecting Computed Fields
  • Sample
  • Sort
  • Skip
  • Limit
  • Lookup
  • Left Outer Join
  • Full Join and Uncorrelated SubQueries
  • Group
  • Unwind
  • Out
  • Merge
  • GraphLookup
  • SortByCount
  • ReplaceRoot
  • AddFields
  • Count
  • Bucket
  • BucketAuto
  • Facet
  • SetWindowFields

In this guide, you can learn how to use the Aggregates class, which provides static factory methods that build aggregation pipeline stages in the MongoDB Java driver.

For a more thorough introduction to Aggregation, see our Aggregation guide.

Tip

To make your queries more succinct, you can statically import the methods of the following classes:

  • Aggregates

  • Filters

  • Projections

  • Sorts

  • Accumulators

import static com.mongodb.client.model.Aggregates.*;
import static com.mongodb.client.model.Filters.*;
import static com.mongodb.client.model.Projections.*;
import static com.mongodb.client.model.Sorts.*;
import static com.mongodb.client.model.Accumulators.*;
import static java.util.Arrays.asList;

The examples on this page assume these static imports, in addition to statically importing the asList() method.

Use these methods to construct pipeline stages and specify them in your aggregation as a list:

Bson matchStage = match(eq("some_field", "some_criteria"));
Bson sortByCountStage = sortByCount("some_field");
collection.aggregate(asList(matchStage, sortByCountStage)).forEach(doc -> System.out.println(doc));

Use the match() method to create a $match pipeline stage that matches incoming documents against the specified query filter, filtering out documents that do not match.

Tip

The filter can be an instance of any class that implements Bson, but it is convenient to build it with the Filters class.

The following example creates a pipeline stage that matches all documents where the title field is equal to "The Shawshank Redemption":

match(eq("title", "The Shawshank Redemption"));

Use the project() method to create a $project pipeline stage that projects specified document fields. Field projection in aggregation follows the same rules as field projection in queries.

Tip

Though the projection can be an instance of any class that implements Bson, it is convenient to build it with the Projections class.

The following example creates a pipeline stage that excludes the _id field but includes the title and plot fields:

project(fields(include("title", "plot"), excludeId()));

The $project stage can project computed fields as well.

The following example creates a pipeline stage that projects the rated field into a new field called rating, effectively renaming the field.

project(fields(computed("rating", "$rated"), excludeId()));

Use the sample() method to create a $sample pipeline stage to randomly select documents from input.

The following example creates a pipeline stage that randomly selects 5 documents:

sample(5);

Use the sort() method to create a $sort pipeline stage to sort by the specified criteria.

Tip

Though the sort criteria can be an instance of any class that implements Bson, it is convenient to build them with the Sorts class.

The following example creates a pipeline stage that sorts in descending order according to the value of the year field and then in ascending order according to the value of the title field:

sort(orderBy(descending("year"), ascending("title")));

Use the skip() method to create a $skip pipeline stage to skip over the specified number of documents before passing documents into the next stage.

The following example creates a pipeline stage that skips the first 5 documents:

skip(5);

Use the limit() method to create a $limit pipeline stage to limit the number of documents passed to the next stage.

The following example creates a pipeline stage that limits the number of documents to 10:

limit(10);
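The skip and limit stages are often combined to page through results, and their order in the pipeline matters. As a rough plain-Java analogy that does not use the driver, the same semantics can be sketched with java.util.stream. The class name SkipLimitSketch and the page() helper are hypothetical names chosen for this illustration.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SkipLimitSketch {
    // Mimics a $skip stage followed by a $limit stage on an in-memory list.
    static <T> List<T> page(List<T> docs, int skip, int limit) {
        return docs.stream().skip(skip).limit(limit).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> docs = IntStream.rangeClosed(1, 20).boxed().collect(Collectors.toList());
        // Skips the first 5 elements, then keeps the next 10 (elements 6 through 15).
        System.out.println(page(docs, 5, 10));
    }
}
```

Swapping the two steps would first keep 10 elements and then skip 5 of them, leaving only 5.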

Use the lookup() method to create a $lookup pipeline stage to perform joins and uncorrelated subqueries between two collections.

The following example creates a pipeline stage that performs a left outer join between the movies and comments collections:

  • It joins the _id field from movies to the movie_id field in comments

  • It outputs the results in the joined_comments field:

lookup("comments", "_id", "movie_id", "joined_comments");
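To make the left outer join behavior concrete, the following plain-Java sketch imitates it on in-memory maps. It does not use the driver, and the class LeftOuterJoinSketch and its lookup() helper are hypothetical. It only illustrates that every input document is kept and receives an array of matches, which is empty when nothing matches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class LeftOuterJoinSketch {
    // For every local document, attach the list of foreign documents whose
    // foreignField equals the local document's localField. Local documents
    // without a match are kept with an empty list (left outer join).
    static List<Map<String, Object>> lookup(List<Map<String, Object>> local,
                                            List<Map<String, Object>> foreign,
                                            String localField, String foreignField, String as) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> doc : local) {
            List<Map<String, Object>> matches = new ArrayList<>();
            for (Map<String, Object> f : foreign) {
                if (Objects.equals(doc.get(localField), f.get(foreignField))) {
                    matches.add(f);
                }
            }
            Map<String, Object> joined = new LinkedHashMap<>(doc);
            joined.put(as, matches);
            out.add(joined);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> movies = List.of(
                Map.of("_id", 1, "title", "A"),
                Map.of("_id", 2, "title", "B"));
        List<Map<String, Object>> comments = List.of(
                Map.of("movie_id", 1, "text", "great"),
                Map.of("movie_id", 1, "text", "slow"));
        lookup(movies, comments, "_id", "movie_id", "joined_comments")
                .forEach(System.out::println);
    }
}
```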

The following example creates a pipeline stage that joins two collections, orders and warehouses, by the item and whether the available quantity is enough to fulfill the ordered quantity:

List<Variable<String>> variables = asList(new Variable<>("order_item", "$item"),
        new Variable<>("order_qty", "$ordered"));
List<Bson> pipeline = asList(
        match(expr(new Document("$and",
                asList(new Document("$eq", asList("$$order_item", "$stock_item")),
                        new Document("$gte", asList("$instock", "$$order_qty")))))),
        project(fields(exclude("stock_item"), excludeId())));
Bson innerJoinLookup = lookup("warehouses", variables, pipeline, "stockdata");

Use the group() method to create a $group pipeline stage to group documents by a specified expression and output a document for each distinct grouping.

Tip

The driver includes the Accumulators class with static factory methods for each of the supported accumulators.

The following example creates a pipeline stage that groups documents by the value of the customerId field. Each group accumulates the sum and average of the values of the quantity field into the totalQuantity and averageQuantity fields.

group("$customerId", sum("totalQuantity", "$quantity"), avg("averageQuantity", "$quantity"));
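As an informal analogy for what this stage computes, the following plain-Java sketch groups in-memory records by customer and derives the sum and average of their quantities. It does not use the driver. The GroupSketch class and its Order type are hypothetical and exist only for this illustration.

```java
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class GroupSketch {
    // A hypothetical order record used only for this illustration.
    static final class Order {
        final String customerId;
        final int quantity;

        Order(String customerId, int quantity) {
            this.customerId = customerId;
            this.quantity = quantity;
        }
    }

    // Groups orders by customerId and summarizes quantity per group, which
    // yields both the sum and the average for each distinct customer.
    static Map<String, IntSummaryStatistics> groupByCustomer(List<Order> orders) {
        return orders.stream().collect(Collectors.groupingBy(
                o -> o.customerId, TreeMap::new,
                Collectors.summarizingInt(o -> o.quantity)));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(new Order("c1", 2), new Order("c1", 4), new Order("c2", 10));
        groupByCustomer(orders).forEach((id, stats) -> System.out.println(
                id + " totalQuantity=" + stats.getSum() + " averageQuantity=" + stats.getAverage()));
    }
}
```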

Use the unwind() method to create an $unwind pipeline stage to deconstruct an array field from input documents, creating an output document for each array element.

The following example creates a document for each element in the sizes array:

unwind("$sizes");

To preserve documents that have missing or null values for the array field, or where the array is empty:

unwind("$sizes", new UnwindOptions().preserveNullAndEmptyArrays(true));

To include the array index, in this example in a field called "position":

unwind("$sizes", new UnwindOptions().includeArrayIndex("position"));
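The effect of these two options can be sketched in plain Java on in-memory maps. This is a simplified illustration, not the driver's or the server's implementation: the UnwindSketch class is hypothetical, it assumes the field is absent, null, or a List, and it passes preserved documents through unchanged, which may differ from the exact output shape the server produces for those cases.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class UnwindSketch {
    // Emits one copy of each document per element of the list stored under field.
    // If preserveNullAndEmpty is true, documents whose list is missing, null, or
    // empty are passed through once instead of being dropped (simplified).
    // If indexField is non-null, each unwound copy records the element's index.
    static List<Map<String, Object>> unwind(List<Map<String, Object>> docs, String field,
                                            boolean preserveNullAndEmpty, String indexField) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> doc : docs) {
            Object value = doc.get(field);
            List<?> items = value instanceof List ? (List<?>) value : List.of();
            if (items.isEmpty()) {
                if (preserveNullAndEmpty) {
                    out.add(new LinkedHashMap<>(doc));
                }
                continue;
            }
            for (int i = 0; i < items.size(); i++) {
                Map<String, Object> copy = new LinkedHashMap<>(doc);
                copy.put(field, items.get(i));
                if (indexField != null) {
                    copy.put(indexField, i);
                }
                out.add(copy);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = List.of(
                Map.of("_id", 1, "sizes", List.of("S", "M")),
                Map.of("_id", 2, "sizes", List.of()),
                Map.of("_id", 3));
        unwind(docs, "sizes", true, "position").forEach(System.out::println);
    }
}
```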

Use the out() method to create an $out pipeline stage that writes all documents to the specified collection in the same database.

Important

The $out stage must be the last stage in any aggregation pipeline.

The following example writes the results of the pipeline to the authors collection:

out("authors");

Use the merge() method to create a $merge pipeline stage that merges all documents into the specified collection.

Important

The $merge stage must be the last stage in any aggregation pipeline.

The following example merges the pipeline output into the authors collection using the default options:

merge("authors");

The following example merges the pipeline output into the customers collection in the reporting database. The options replace an existing document if both the date and customerId fields match, and insert the document otherwise:

merge(new MongoNamespace("reporting", "customers"),
        new MergeOptions().uniqueIdentifier(asList("date", "customerId"))
                .whenMatched(MergeOptions.WhenMatched.REPLACE)
                .whenNotMatched(MergeOptions.WhenNotMatched.INSERT));
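The replace-or-insert behavior described above resembles an upsert keyed on the identifier fields. The following plain-Java sketch imitates that idea with an in-memory map. It does not use the driver, and the MergeSketch class and its merge() helper are hypothetical names for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MergeSketch {
    // Upserts each incoming document into target, keyed by the values of the
    // "on" fields: a matching key replaces the stored document, and a new key
    // inserts the document.
    static void merge(Map<List<Object>, Map<String, Object>> target,
                      List<Map<String, Object>> incoming, List<String> on) {
        for (Map<String, Object> doc : incoming) {
            List<Object> key = new ArrayList<>();
            for (String field : on) {
                key.add(doc.get(field));
            }
            target.put(key, doc);
        }
    }

    public static void main(String[] args) {
        Map<List<Object>, Map<String, Object>> customers = new LinkedHashMap<>();
        merge(customers, List.of(
                Map.of("date", "2024-01-01", "customerId", "c1", "total", 5),
                Map.of("date", "2024-01-01", "customerId", "c2", "total", 7)),
                List.of("date", "customerId"));
        // A second run with the same key replaces the stored document for c1.
        merge(customers, List.of(
                Map.of("date", "2024-01-01", "customerId", "c1", "total", 9)),
                List.of("date", "customerId"));
        customers.values().forEach(System.out::println);
    }
}
```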

Use the graphLookup() method to create a $graphLookup pipeline stage that performs a recursive search on a specified collection to match a specified field in one document to a specified field of another document.

The following example computes the social network graph for users in the contacts collection, recursively matching the value in the friends field to the name field:

graphLookup("contacts", "$friends", "friends", "name", "socialNetwork");

Using GraphLookupOptions, you can specify the depth to recurse as well as the name of the depth field, if desired. In this example, $graphLookup will recurse up to two times, and create a field called degrees with the recursion depth information for every document.

graphLookup("contacts", "$friends", "friends", "name", "socialNetwork",
        new GraphLookupOptions().maxDepth(2).depthField("degrees"));

Using GraphLookupOptions, you can specify a filter that documents must match in order for MongoDB to include them in your search. In this example, only links with "golf" in their hobbies field will be included.

graphLookup("contacts", "$friends", "friends", "name", "socialNetwork",
        new GraphLookupOptions().maxDepth(1).restrictSearchWithMatch(eq("hobbies", "golf")));
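The recursive search and its depth limit can be pictured as a breadth-first traversal. The following plain-Java sketch is a simplified model that does not use the driver. The GraphLookupSketch class is hypothetical, contacts are reduced to a map from name to friends, and depth 0 is assumed to denote the initial lookup before any recursion.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GraphLookupSketch {
    // contacts maps a person's name to the names in that person's friends field.
    // Returns every reachable name with the depth at which it was first found.
    // Depth 0 is the initial lookup; the search stops once maxDepth is exceeded.
    static Map<String, Integer> socialNetwork(Map<String, List<String>> contacts,
                                              List<String> startFriends, int maxDepth) {
        Map<String, Integer> found = new LinkedHashMap<>();
        List<String> frontier = startFriends;
        for (int depth = 0; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String name : frontier) {
                if (contacts.containsKey(name) && !found.containsKey(name)) {
                    found.put(name, depth);
                    next.addAll(contacts.get(name));
                }
            }
            frontier = next;
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, List<String>> contacts = Map.of(
                "ann", List.of("bob"),
                "bob", List.of("cat"),
                "cat", List.of("dan"),
                "dan", List.of("eve"),
                "eve", List.of());
        // Starting from ann's friends with maxDepth 2 reaches bob, cat, and dan, but not eve.
        System.out.println(socialNetwork(contacts, contacts.get("ann"), 2));
    }
}
```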

Use the sortByCount() method to create a $sortByCount pipeline stage that groups documents by a given expression and then sorts these groups by count in descending order.

Tip

The $sortByCount stage is identical to a $group stage with a $sum accumulator followed by a $sort stage.

[
{ "$group": { "_id": <expression to group on>, "count": { "$sum": 1 } } },
{ "$sort": { "count": -1 } }
]

The following example groups documents by the floor of the value of the field x and computes the count for each distinct value:

sortByCount(new Document("$floor", "$x"));
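The group-then-sort equivalence can be illustrated in plain Java with streams. This sketch does not use the driver, and the SortByCountSketch class is a hypothetical name. It groups numbers by their floor, counts each group, and orders the groups by count in descending order.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class SortByCountSketch {
    // Groups values by floor(x), counts each group, then sorts by count descending.
    // The relative order of groups with equal counts is not specified here.
    static List<Map.Entry<Long, Long>> sortByFloorCount(List<Double> xs) {
        Map<Long, Long> counts = xs.stream().collect(
                Collectors.groupingBy(x -> (long) Math.floor(x), Collectors.counting()));
        return counts.entrySet().stream()
                .sorted(Map.Entry.<Long, Long>comparingByValue().reversed())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Floors are 1, 1, 1, 2, 2, 3, so the groups have counts 3, 2, and 1.
        System.out.println(sortByFloorCount(List.of(1.2, 1.7, 1.9, 2.5, 2.1, 3.0)));
    }
}
```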

Use the replaceRoot() method to create a $replaceRoot pipeline stage that replaces each input document with the specified document.

The following example replaces each input document with the nested document in the spanish_translation field:

replaceRoot("$spanish_translation");

Use the addFields() method to create an $addFields pipeline stage that adds new fields to documents.

Tip

Use $addFields when you do not want to project field inclusion or exclusion.

The following example adds two new fields, a and b, to the input documents:

addFields(new Field("a", 1), new Field("b", 2));

Use the count() method to create a $count pipeline stage that counts the number of documents that enter the stage, and assigns that value to a specified field name. If you do not specify a field, count() defaults the field name to "count".

Tip

The $count stage is syntactic sugar for:

{ "$group":{ "_id": 0, "count": { "$sum" : 1 } } }

The following example creates a pipeline stage that outputs the count of incoming documents in a field called "total":

count("total");

Use the bucket() method to create a $bucket pipeline stage that automates the bucketing of data around predefined boundary values.

The following example creates a pipeline stage that groups incoming documents based on the value of their screenSize field, inclusive of the lower boundary and exclusive of the upper boundary.

bucket("$screenSize", asList(0, 24, 32, 50, 70, 200));

Use the BucketOptions class to specify a default bucket for values outside of the specified boundaries, and to specify additional accumulators.

The following example creates a pipeline stage that groups incoming documents based on the value of their screenSize field. It counts the number of documents that fall within each bucket, pushes the value of screenSize into a field called matches, and captures any screen sizes outside the boundaries, such as sizes of 70 or greater, in a default bucket called "monster" for monstrously large screen sizes:

bucket("$screenSize", asList(0, 24, 32, 50, 70),
        new BucketOptions().defaultBucket("monster").output(sum("count", 1), push("matches", "$screenSize")));
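The boundary rule (lower bound inclusive, upper bound exclusive, everything else in the default bucket) can be checked with a small plain-Java sketch. It does not use the driver, and the BucketSketch class and its bucketOf() helper are hypothetical names for this illustration.

```java
import java.util.List;

class BucketSketch {
    // Returns the lower boundary of the bucket that holds value, or the default
    // bucket label when value lies outside [first boundary, last boundary).
    static Object bucketOf(double value, List<Integer> boundaries, String defaultBucket) {
        for (int i = 0; i < boundaries.size() - 1; i++) {
            if (value >= boundaries.get(i) && value < boundaries.get(i + 1)) {
                return boundaries.get(i);
            }
        }
        return defaultBucket;
    }

    public static void main(String[] args) {
        List<Integer> boundaries = List.of(0, 24, 32, 50, 70);
        System.out.println(bucketOf(23.9, boundaries, "monster")); // bucket starting at 0
        System.out.println(bucketOf(24, boundaries, "monster"));   // bucket starting at 24
        System.out.println(bucketOf(70, boundaries, "monster"));   // default bucket
    }
}
```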

Use the bucketAuto() method to create a $bucketAuto pipeline stage that automatically determines the boundaries of each bucket in its attempt to distribute the documents evenly into a specified number of buckets.

The following example creates a pipeline stage that will attempt to create and evenly distribute documents into 10 buckets using the value of their price field:

bucketAuto("$price", 10);

Use the BucketAutoOptions class to specify a scheme based on preferred numbers for setting boundary values, and to specify additional accumulators.

The following example creates a pipeline stage that will attempt to create and evenly distribute documents into 10 buckets using the value of their price field, setting the bucket boundaries at powers of 2 (2, 4, 8, 16, ...). It also counts the number of documents in each bucket, and calculates their average price in a new field called avgPrice:

bucketAuto("$price", 10, new BucketAutoOptions().granularity(BucketGranularity.POWERSOF2)
        .output(sum("count", 1), avg("avgPrice", "$price")));

Use the facet() method to create a $facet pipeline stage that allows for the definition of parallel pipelines.

The following example creates a pipeline stage that executes two parallel aggregations:

  • The first aggregation distributes incoming documents into 5 groups according to their attributes.screen_size field.

  • The second aggregation counts all manufacturers and returns their count, limited to the top 5.

facet(new Facet("Screen Sizes",
                bucketAuto("$attributes.screen_size", 5, new BucketAutoOptions().output(sum("count", 1)))),
        new Facet("Manufacturer", sortByCount("$attributes.manufacturer"), limit(5)));

Use the setWindowFields() method to create a $setWindowFields pipeline stage that allows using window operators to perform operations on a specified span of documents in a collection.

Tip

Window Functions

The driver includes the WindowedComputations class with static factory methods for each of the supported window operators.

The following example creates a pipeline stage that computes the accumulated rainfall and the average temperature over the past month for each locality from more fine-grained measurements presented in the rainfall and temperature fields:

Window pastMonth = Windows.timeRange(-1, MongoTimeUnit.MONTH, Windows.Bound.CURRENT);
setWindowFields("$localityId", Sorts.ascending("measurementDateTime"),
        WindowedComputations.sum("monthlyRainfall", "$rainfall", pastMonth),
        WindowedComputations.avg("monthlyAvgTemp", "$temperature", pastMonth));
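The idea of a per-document time window can be sketched in plain Java. This is only a model of the computation and does not use the driver. The WindowSketch class and its Measurement type are hypothetical, and treating both window bounds as inclusive is an assumption of this sketch.

```java
import java.time.LocalDateTime;
import java.util.List;

class WindowSketch {
    // A hypothetical measurement used only for this illustration.
    static final class Measurement {
        final String localityId;
        final LocalDateTime at;
        final double rainfall;

        Measurement(String localityId, LocalDateTime at, double rainfall) {
            this.localityId = localityId;
            this.at = at;
            this.rainfall = rainfall;
        }
    }

    // Sums rainfall over measurements from the same locality whose timestamp
    // lies between one month before m and m itself (bounds assumed inclusive).
    static double monthlyRainfall(List<Measurement> all, Measurement m) {
        LocalDateTime lower = m.at.minusMonths(1);
        return all.stream()
                .filter(o -> o.localityId.equals(m.localityId))
                .filter(o -> !o.at.isBefore(lower) && !o.at.isAfter(m.at))
                .mapToDouble(o -> o.rainfall)
                .sum();
    }

    public static void main(String[] args) {
        Measurement jan1 = new Measurement("A", LocalDateTime.of(2024, 1, 1, 0, 0), 1.0);
        Measurement jan20 = new Measurement("A", LocalDateTime.of(2024, 1, 20, 0, 0), 2.0);
        Measurement feb25 = new Measurement("A", LocalDateTime.of(2024, 2, 25, 0, 0), 4.0);
        List<Measurement> all = List.of(jan1, jan20, feb25);
        for (Measurement m : all) {
            System.out.println(m.at + " monthlyRainfall=" + monthlyRainfall(all, m));
        }
    }
}
```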