Aggregation Expression Operations

On this page

  • Overview
  • How to Use Operations
  • Constructor Methods
  • Operations
  • Arithmetic Operations
  • Array Operations
  • Boolean Operations
  • Comparison Operations
  • Conditional Operations
  • Convenience Operations
  • Conversion Operations
  • Date Operations
  • Document Operations
  • Map Operations
  • String Operations
  • Type-Checking Operations

In this guide, you can learn how to use the MongoDB Java Driver to construct expressions for use in the aggregation pipeline. You can perform expression operations with discoverable, typesafe Java methods rather than BSON documents. Because these methods follow the fluent interface pattern, you can chain aggregation operations together to create code that is both more compact and more naturally readable.

The operations in this guide use methods from the com.mongodb.client.model.mql package. These methods provide an idiomatic way to use the Query API, the mechanism by which the driver interacts with a MongoDB deployment. To learn more about the Query API, see the Server manual documentation.

The examples in this guide assume that you include the following static imports in your code:

import static com.mongodb.client.model.Aggregates.*;
import static com.mongodb.client.model.Accumulators.*;
import static com.mongodb.client.model.Projections.*;
import static com.mongodb.client.model.Filters.*;
import static com.mongodb.client.model.mql.MqlValues.*;
import static java.util.Arrays.asList;

To access document fields in an expression, you need to reference the current document being processed by the aggregation pipeline. Use the current() method to refer to this document. To access the value of a field, you must use the appropriately typed method, such as getString() or getDate(). When you specify the type for a field, you ensure that the driver provides only those methods which are compatible with that type. The following code shows how to reference a string field called name:

current().getString("name")

To specify a value in an operation, pass it to the of() constructor method to convert it to a valid type. The following code shows how to reference a value of 1.0:

of(1.0)

To create an operation, chain a method to your field or value reference. You can build more complex operations by chaining additional methods.

The following example creates an operation to find patients in New Mexico who have visited the doctor’s office at least once. The operation performs the following actions:

  • Checks if the size of the visitDates array is greater than 0 by using the gt() method

  • Checks if the state field value is “New Mexico” by using the eq() method

The and() method links these operations so that the pipeline stage matches only documents that meet both criteria.

current()
.getArray("visitDates")
.size()
.gt(of(0))
.and(current()
.getString("state")
.eq(of("New Mexico")));

While some aggregation stages, such as group(), accept operations directly, other stages expect that you first include your operation in a method such as computed() or expr(). These methods, which take values of type TExpression, allow you to use your expressions in certain aggregations.

To complete your aggregation pipeline stage, include your expression in an aggregates builder method. The following list provides examples of how to include your expression in common aggregates builder methods:

  • match(expr(<expression>))

  • project(fields(computed("<field name>", <expression>)))

  • group(<expression>)

To learn more about these methods, see Aggregates Builders.

The examples use the asList() method to create a list of aggregation stages. This list is passed to the aggregate() method of MongoCollection.

You can use these constructor methods to define values for use in Java aggregation expressions.

Method            Description
current()         References the current document being processed by the aggregation pipeline.
currentAsMap()    References the current document being processed by the aggregation pipeline as a map value.
of()              Returns an MqlValue type corresponding to the provided primitive.
ofArray()         Returns an array of MqlValue types corresponding to the provided array of primitives.
ofEntry()         Returns an entry value.
ofMap()           Returns an empty map value.
ofNull()          Returns the null value as it exists in the Query API.

Important

When you provide a value to one of these methods, the driver treats it literally. For example, of("$x") represents the string value "$x", rather than a field named x.

Refer to any of the sections in Operations for examples using these methods.

The following sections provide information and examples for aggregation expression operations available in the driver. The operations are categorized by purpose and functionality.

Each section has a table that describes aggregation methods available in the driver and corresponding expression operators in the Query API. The method names link to API documentation and the aggregation pipeline operator names link to descriptions and examples in the Server manual documentation. While each Java method is effectively equivalent to the corresponding Query API expression, they may differ in expected parameters and implementation.

Note

The driver generates a Query API expression that may be different from the Query API expression provided in each example. However, both expressions will produce the same aggregation result.

Important

The driver does not provide methods for all aggregation pipeline operators in the Query API. If you need to use an unsupported operation in an aggregation, you must define the entire expression using the BSON Document type. To learn more about the Document type, see Documents.

You can perform an arithmetic operation on a value of type MqlInteger or MqlNumber using the methods described in this section.

Suppose you have weather data for a specific year that includes the precipitation measurement (in inches) for each day. You want to find the average precipitation, in millimeters, for each month.

The multiply() operator multiplies the precipitation field by 25.4 to convert the value to millimeters. The avg() accumulator method returns the average as the avgPrecipMM field. The group() method groups the values by month given in each document's date field.

The following code shows the pipeline for this aggregation:

var month = current().getDate("date").month(of("UTC"));
var precip = current().getInteger("precipitation");
asList(group(
month,
avg("avgPrecipMM", precip.multiply(25.4))
));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $group: {
_id: { $month: "$date" },
avgPrecipMM: {
$avg: { $multiply: ["$precipitation", 25.4] } }
} } ]
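To sanity-check what this aggregation should return, the following is a minimal plain-Java sketch of the same calculation performed in memory. It does not use the driver; the PrecipitationSketch class, the Reading type, and the sample readings are hypothetical and exist only for illustration.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Plain-Java illustration only; this is not driver code.
class PrecipitationSketch {

    // Hypothetical in-memory stand-in for one weather document.
    static final class Reading {
        final LocalDate date;
        final int precipitationInches;

        Reading(LocalDate date, int precipitationInches) {
            this.date = date;
            this.precipitationInches = precipitationInches;
        }
    }

    // Groups readings by month number (1-12) and averages the
    // precipitation after converting inches to millimeters.
    static Map<Integer, Double> avgPrecipMmByMonth(List<Reading> readings) {
        return readings.stream().collect(Collectors.groupingBy(
                r -> r.date.getMonthValue(),
                TreeMap::new,
                Collectors.averagingDouble(r -> r.precipitationInches * 25.4)));
    }

    public static void main(String[] args) {
        List<Reading> readings = List.of(
                new Reading(LocalDate.of(2023, 1, 3), 1),
                new Reading(LocalDate.of(2023, 1, 20), 3),
                new Reading(LocalDate.of(2023, 2, 7), 2));
        // Prints one average (in millimeters) per month key.
        System.out.println(avgPrecipMmByMonth(readings));
    }
}
```

The month key plays the role of the group() _id, and each map value corresponds to the avgPrecipMM field.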

You can perform an array operation on a value of type MqlArray using the methods described in this section.

Suppose you have a collection of movies, each of which contains an array of nested documents for upcoming showtimes. Each nested document contains an array that represents the total number of seats in the theater, where the first array entry is the number of premium seats and the second entry is the number of regular seats. Each nested document also contains the number of tickets that have already been bought for the showtime. A document in this collection might resemble the following:

{
"_id": ...,
"movie": "Hamlet",
"showtimes": [
{
"date": "May 14, 2023, 12:00 PM",
"seats": [ 20, 80 ],
"ticketsBought": 100
},
{
"date": "May 20, 2023, 08:00 PM",
"seats": [ 10, 40 ],
"ticketsBought": 34
}]
}

The filter() method displays only the results matching the provided predicate. In this case, the predicate uses sum() to calculate the total number of seats and compares that value to the number of ticketsBought with lt(). The project() method stores these filtered results as a new availableShowtimes array.

Tip

You must specify the type of the array that you retrieve with the getArray() method if you need to work with the values of the array as their specific type.

In this example, we specify that the showtimes array contains values of type MqlDocument so that we can extract nested fields from each array entry.

The following code shows the pipeline for this aggregation:

var showtimes = current().<MqlDocument>getArray("showtimes");
asList(project(fields(
computed("availableShowtimes", showtimes
.filter(showtime -> {
var seats = showtime.<MqlInteger>getArray("seats");
var totalSeats = seats.sum(n -> n);
var ticketsBought = showtime.getInteger("ticketsBought");
var isAvailable = ticketsBought.lt(totalSeats);
return isAvailable;
}))
)));

Note

To improve readability, the previous example assigns intermediary values to the totalSeats and isAvailable variables. If you don't pull out these intermediary values into variables, the code still produces equivalent results.

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $project: {
availableShowtimes: {
$filter: {
input: "$showtimes",
as: "showtime",
cond: { $lt: [ "$$showtime.ticketsBought", { $sum: "$$showtime.seats" } ] }
} }
} } ]
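As a cross-check, the following plain-Java sketch applies the same predicate to the two showtimes from the sample document. It does not use the driver; the ShowtimeFilterSketch class and the Showtime type are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; this is not driver code.
class ShowtimeFilterSketch {

    // Hypothetical in-memory stand-in for one nested showtime document.
    static final class Showtime {
        final String date;
        final List<Integer> seats;
        final int ticketsBought;

        Showtime(String date, List<Integer> seats, int ticketsBought) {
            this.date = date;
            this.seats = seats;
            this.ticketsBought = ticketsBought;
        }
    }

    // Keeps only showtimes where fewer tickets were bought than seats exist.
    static List<Showtime> availableShowtimes(List<Showtime> showtimes) {
        List<Showtime> available = new ArrayList<>();
        for (Showtime showtime : showtimes) {
            int totalSeats = 0;
            for (int n : showtime.seats) {
                totalSeats += n;
            }
            if (showtime.ticketsBought < totalSeats) {
                available.add(showtime);
            }
        }
        return available;
    }

    public static void main(String[] args) {
        List<Showtime> showtimes = List.of(
                new Showtime("May 14, 2023, 12:00 PM", List.of(20, 80), 100),
                new Showtime("May 20, 2023, 08:00 PM", List.of(10, 40), 34));
        for (Showtime showtime : availableShowtimes(showtimes)) {
            System.out.println(showtime.date); // prints "May 20, 2023, 08:00 PM"
        }
    }
}
```

The May 14 showtime is excluded because all 100 seats are sold, so 100 is not less than 100.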

You can perform a boolean operation on a value of type MqlBoolean using the methods described in this section.

Suppose you want to classify very low or high weather temperature readings (in degrees Fahrenheit) as extreme.

The or() operator checks to see if temperatures are extreme by comparing the temperature field to predefined values with lt() and gt(). The project() method records this result in the extremeTemp field.

The following code shows the pipeline for this aggregation:

var temperature = current().getInteger("temperature");
asList(project(fields(
computed("extremeTemp", temperature
.lt(of(10))
.or(temperature.gt(of(95))))
)));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $project: {
extremeTemp: { $or: [ { $lt: ["$temperature", 10] },
{ $gt: ["$temperature", 95] } ] }
} } ]

You can perform a comparison operation on a value of type MqlValue using the methods described in this section.

Tip

The cond() method is similar to the ternary operator in Java and you should use it for simple branches based on a boolean value. You should use the switchOn() methods for more complex comparisons such as performing pattern matching on the value type or other arbitrary checks on the value.

The following example shows a pipeline that matches all the documents where the location field has the value "California":

var location = current().getString("location");
asList(match(expr(location.eq(of("California")))));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $match: { location: { $eq: "California" } } } ]

You can perform a conditional operation using the methods described in this section.

Suppose you have a collection of customers with their membership information. Originally, customers were either members or not. Over time, membership levels were introduced and used the same field. The information stored in this field can be one of a few different types, and you want to create a standardized value indicating their membership level.

The switchOn() method checks each clause in order. If the value matches the type indicated by the clause, that clause determines the string value corresponding to the membership level. If the original value is a string, it represents the membership level and that value is used. If the data type is a boolean, it returns either Gold or Guest for the membership level. If the data type is an array, it returns the last string in the array, which represents the most recent membership level. If the member field is an unknown type, the switchOn() method provides a default value of Guest.

The following code shows the pipeline for this aggregation:

var member = current().getField("member");
asList(project(fields(
computed("membershipLevel",
member.switchOn(field -> field
.isString(s -> s)
.isBoolean(b -> b.cond(of("Gold"), of("Guest")))
.<MqlString>isArray(a -> a.last())
.defaults(d -> of("Guest"))))
)));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $project: {
membershipLevel: {
$switch: {
branches: [
{ case: { $eq: [ { $type: "$member" }, "string" ] }, then: "$member" },
{ case: { $eq: [ { $type: "$member" }, "bool" ] }, then: { $cond: {
if: "$member",
then: "Gold",
else: "Guest" } } },
{ case: { $eq: [ { $type: "$member" }, "array" ] }, then: { $last: "$member" } }
],
default: "Guest" } }
} } ]
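The clause ordering can be easier to follow as ordinary branching. The following plain-Java sketch mirrors the same type-based decisions on an in-memory value. It does not use the driver; the MembershipLevelSketch class is a hypothetical name, and the sketch assumes that any list it receives contains strings.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; this is not driver code.
class MembershipLevelSketch {

    // Checks the runtime type in the same order as the switchOn() clauses.
    static String membershipLevel(Object member) {
        if (member instanceof String) {
            return (String) member;                      // already a level
        }
        if (member instanceof Boolean) {
            return ((Boolean) member) ? "Gold" : "Guest"; // legacy member flag
        }
        if (member instanceof List) {
            List<?> levels = (List<?>) member;
            // Assumption: the list holds strings; an empty list yields null.
            return levels.isEmpty() ? null : (String) levels.get(levels.size() - 1);
        }
        return "Guest";                                  // unknown type or null
    }

    public static void main(String[] args) {
        System.out.println(membershipLevel("Silver"));
        System.out.println(membershipLevel(true));
        System.out.println(membershipLevel(Arrays.asList("Bronze", "Silver")));
        System.out.println(membershipLevel(42));
    }
}
```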

You can apply custom functions to values of type MqlValue using the methods described in this section.

To improve readability and allow for code reuse, you can move redundant code into static methods. However, it is not possible to directly chain static methods in Java. The passTo() method lets you chain values into custom static methods.

The passTo() methods have no corresponding aggregation pipeline operator.

Suppose you need to determine how a class is performing against some benchmarks. You want to find the average final grade for each class and compare it against the benchmark values.

The following custom method gradeAverage() takes an array of documents and the name of an integer field shared across those documents. It calculates the average of that field across all the documents in the provided array. The evaluate() method compares a provided value to two provided cutoff values and generates a response string based on how the value compares:

public static MqlNumber gradeAverage(MqlArray<MqlDocument> students, String fieldName) {
var sum = students.sum(student -> student.getInteger(fieldName));
var avg = sum.divide(students.size());
return avg;
}
public static MqlString evaluate(MqlNumber grade, MqlNumber cutoff1, MqlNumber cutoff2) {
var message = grade.switchOn(on -> on
.lte(cutoff1, g -> of("Needs improvement"))
.lte(cutoff2, g -> of("Meets expectations"))
.defaults(g -> of("Exceeds expectations")));
return message;
}

Tip

One advantage of using the passTo() method is that you can reuse your custom methods for other aggregations. You could use the gradeAverage() method to find the average of grades for groups of students filtered by, for example, entry year or district, not just their class. You could use the evaluate() method to evaluate, for example, an individual student's performance, or an entire school's or district's performance.

The passArrayTo() method takes all of the students and calculates the average score by using the gradeAverage() method. Then, the passNumberTo() method uses the evaluate() method to determine how the classes are performing. This example stores the result as the evaluation field using the project() method.

The following code shows the pipeline for this aggregation:

var students = current().<MqlDocument>getArray("students");
asList(project(fields(
computed("evaluation", students
.passArrayTo(s -> gradeAverage(s, "finalGrade"))
.passNumberTo(grade -> evaluate(grade, of(70), of(85))))
)));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $project: {
evaluation: { $switch: {
branches: [
{ case: { $lte: [ { $avg: "$students.finalGrade" }, 70 ] },
then: "Needs improvement"
},
{ case: { $lte: [ { $avg: "$students.finalGrade" }, 85 ] },
then: "Meets expectations"
}
],
default: "Exceeds expectations" } }
} } ]
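To see which message each cutoff produces, the following plain-Java sketch performs the same averaging and threshold checks on in-memory grades. It does not use the driver; the GradeEvaluationSketch class is a hypothetical name, and the sketch assumes a non-empty list of grades.

```java
import java.util.List;

// Plain-Java illustration only; this is not driver code.
class GradeEvaluationSketch {

    // Averages the final grades; assumes the list is not empty.
    static double gradeAverage(List<Integer> finalGrades) {
        int sum = 0;
        for (int grade : finalGrades) {
            sum += grade;
        }
        return (double) sum / finalGrades.size();
    }

    // Both cutoffs are inclusive, matching the lte() branches.
    static String evaluate(double grade, double cutoff1, double cutoff2) {
        if (grade <= cutoff1) {
            return "Needs improvement";
        }
        if (grade <= cutoff2) {
            return "Meets expectations";
        }
        return "Exceeds expectations";
    }

    public static void main(String[] args) {
        double average = gradeAverage(List.of(80, 90));
        System.out.println(evaluate(average, 70, 85)); // prints "Meets expectations"
    }
}
```

Because the comparisons are inclusive, an average of exactly 70 still falls in the first branch, and an average of exactly 85 falls in the second.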

You can perform a conversion operation to convert between certain MqlValue types using the methods described in this section.

Suppose you have a collection of student data that includes their graduation years, which are stored as strings. You want to calculate the year of their five-year reunion and store this value in a new field.

The parseInteger() method converts the graduationYear to an integer so that add() can calculate the reunion year. The addFields() method stores this result as a new reunionYear field.

The following code shows the pipeline for this aggregation:

var graduationYear = current().getString("graduationYear");
asList(addFields(
new Field("reunionYear",
graduationYear
.parseInteger()
.add(5))
));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $addFields: {
reunionYear: {
$add: [ { $toInt: "$graduationYear" }, 5 ] }
} } ]

You can perform a date operation on a value of type MqlDate using the methods described in this section.

Suppose you have data about package deliveries and need to match deliveries that occurred on any Monday in the "America/New_York" time zone.

If the deliveryDate field contains any string values representing valid dates, such as "2018-01-15T16:00:00Z" or "Jan 15, 2018, 12:00 PM EST", you can use the parseDate() method to convert the strings into date types.

The dayOfWeek() method determines the day of the week in the "America/New_York" time zone and returns it as a number from 1 (Sunday) to 7 (Saturday). The eq() method compares this value to 2, which corresponds to Monday.

The following code shows the pipeline for this aggregation:

var deliveryDate = current().getString("deliveryDate");
asList(match(expr(deliveryDate
.parseDate()
.dayOfWeek(of("America/New_York"))
.eq(of(2))
)));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $match: {
$expr: {
$eq: [ {
$dayOfWeek: {
date: { $dateFromString: { dateString: "$deliveryDate" } },
timezone: "America/New_York" }},
2
] }
} } ]
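The time zone argument matters because the same instant can fall on different days in different zones. The following plain-Java sketch uses java.time to reproduce the $dayOfWeek numbering (1 for Sunday through 7 for Saturday). It does not use the driver; the DayOfWeekSketch class is a hypothetical name, and the sketch accepts only ISO-8601 instants such as "2018-01-15T16:00:00Z".

```java
import java.time.DayOfWeek;
import java.time.Instant;
import java.time.ZoneId;

// Plain-Java illustration only; this is not driver code.
class DayOfWeekSketch {

    // java.time numbers days Monday=1 ... Sunday=7, while $dayOfWeek
    // numbers them Sunday=1 ... Saturday=7, so the value is shifted.
    static int mongoDayOfWeek(String isoInstant, String zone) {
        DayOfWeek day = Instant.parse(isoInstant)
                .atZone(ZoneId.of(zone))
                .getDayOfWeek();
        return day.getValue() % 7 + 1;
    }

    public static void main(String[] args) {
        // 03:00 UTC on Tuesday is still 22:00 on Monday in New York.
        String instant = "2018-01-16T03:00:00Z";
        System.out.println(mongoDayOfWeek(instant, "America/New_York")); // prints 2
        System.out.println(mongoDayOfWeek(instant, "UTC"));              // prints 3
    }
}
```

A delivery with this timestamp would match the Monday filter only when the "America/New_York" time zone is supplied.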

You can perform a document operation on a value of type MqlDocument using the methods described in this section.

Suppose you have a collection of legacy customer data which includes addresses as child documents under the mailing.address field. You want to find all the customers who currently live in Washington state. A document in this collection might resemble the following:

{
"_id": ...,
"customer.name": "Mary Kenneth Keller",
"mailing.address":
{
"street": "601 Mongo Drive",
"city": "Vasqueztown",
"state": "CO",
"zip": 27017
}
}

The getDocument() method retrieves the mailing.address field as a document so the nested state field can be retrieved with the getString() method. The eq() method checks if the value of the state field is "WA".

The following code shows the pipeline for this aggregation:

var address = current().getDocument("mailing.address");
asList(match(expr(address
.getString("state")
.eq(of("WA"))
)));

The following code provides an equivalent aggregation pipeline in the Query API:

[
{ $match: {
$expr: {
$eq: [{
$getField: {
input: { $getField: { input: "$$CURRENT", field: "mailing.address"}},
field: "state" }},
"WA" ]
}}}]

You can perform a map operation on a value of either type MqlMap or MqlEntry using the methods described in this section.

Tip

You should represent data as a map if the data maps arbitrary keys such as dates or item IDs to values.

The nine map methods listed for this section have no corresponding aggregation pipeline operator.

Suppose you have a collection of inventory data where each document represents an individual item you're responsible for supplying. Each document contains a field that is a map of all your warehouses and how many copies they currently have in their inventory of the item. You want to determine the total number of copies of items you have across all of your warehouses. A document in this collection might resemble the following:

{
"_id": ...,
"item": "notebook",
"warehouses": {
"Atlanta": 50,
"Chicago": 0,
"Portland": 120,
"Dallas": 6
}
}

The entries() method returns the map entries in the warehouses field as an array. The sum() method calculates the total value of items based on the values in the array retrieved with the getValue() method. This example stores the result as the new totalInventory field using the project() method.

The following code shows the pipeline for this aggregation:

var warehouses = current().getMap("warehouses");
asList(project(fields(
computed("totalInventory", warehouses
.entries()
.sum(v -> v.getValue()))
)));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $project: {
totalInventory: {
$sum: {
$map: {
input: { $objectToArray: "$warehouses" },
as: "entry",
in: "$$entry.v" } } }
} } ]
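For comparison, the following plain-Java sketch computes the same total from the warehouse counts in the sample document by iterating over the map entries. It does not use the driver; the InventoryTotalSketch class is a hypothetical name chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java illustration only; this is not driver code.
class InventoryTotalSketch {

    // Sums the value of every entry, like entries() followed by sum().
    static int totalInventory(Map<String, Integer> warehouses) {
        int total = 0;
        for (Map.Entry<String, Integer> entry : warehouses.entrySet()) {
            total += entry.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> warehouses = new LinkedHashMap<>();
        warehouses.put("Atlanta", 50);
        warehouses.put("Chicago", 0);
        warehouses.put("Portland", 120);
        warehouses.put("Dallas", 6);
        System.out.println(totalInventory(warehouses)); // prints 176
    }
}
```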

You can perform a string operation on a value of type MqlString using the methods described in this section.

Suppose you need to generate lowercase usernames for employees of a company from the employees' last names and employee IDs.

The append() method combines the lastName and employeeID fields into a single username, while the toLower() method makes the entire username lowercase. This example stores the result as a new username field using the project() method.

The following code shows the pipeline for this aggregation:

var lastName = current().getString("lastName");
var employeeID = current().getString("employeeID");
asList(project(fields(
computed("username", lastName
.append(employeeID)
.toLower())
)));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $project: {
username: {
$toLower: { $concat: ["$lastName", "$employeeID"] } }
} } ]

You can perform a type-check operation on a value of type MqlValue using the methods described in this section.

These methods do not return boolean values. Instead, you provide a default value that matches the type specified by the method. If the checked value matches the method type, the checked value is returned. Otherwise, the supplied default value is returned. If you want to program branching logic based on the data type, see switchOn().

The eight type-checking methods listed for this section have no corresponding aggregation pipeline operator.

Suppose you have a collection of rating data. An early version of the review schema allowed users to submit negative reviews without a star rating. You want to convert any of these negative reviews without a star rating to have the minimum value of 1 star.

The isNumberOr() method returns either the value of rating, or a value of 1 if rating is not a number or is null. The project() method returns this value as a new numericalRating field.

The following code shows the pipeline for this aggregation:

var rating = current().getField("rating");
asList(project(fields(
computed("numericalRating", rating
.isNumberOr(of(1)))
)));

The following code provides an equivalent aggregation pipeline in the Query API:

[ { $project: {
numericalRating: {
$cond: { if: { $isNumber: "$rating" },
then: "$rating",
else: 1
} }
} } ]
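The "value or default" behavior can be sketched in plain Java as a single type check. The following example does not use the driver; the RatingDefaultSketch class is a hypothetical name, and the sketch treats any java.lang.Number as a numeric rating.

```java
// Plain-Java illustration only; this is not driver code.
class RatingDefaultSketch {

    // Returns the rating when it is numeric; otherwise returns 1.
    // A null rating is not a Number, so it also falls back to 1.
    static Number numericalRating(Object rating) {
        return (rating instanceof Number) ? (Number) rating : Integer.valueOf(1);
    }

    public static void main(String[] args) {
        System.out.println(numericalRating(4));           // prints 4
        System.out.println(numericalRating("no rating")); // prints 1
        System.out.println(numericalRating(null));        // prints 1
    }
}
```

Unlike a boolean check, the method always returns a usable value, which is why no separate branch is needed in the caller.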
