Docs Menu
Docs Home
/ / /
Scala
/

Aggregates

On this page

  • Match
  • Project
  • Computed Fields
  • Sample
  • Sort
  • Skip
  • Limit
  • Lookup
  • Group
  • Unwind
  • Set Window Fields
  • Assembling a Pipeline

The Aggregates class provides static factory methods that build aggregation pipeline stages. Each method returns an instance of the Bson type, which can in turn be passed to the MongoCollection.aggregate() method.

You can import the methods of the Aggregates class statically, as shown in the following code:

import org.mongodb.scala.model.Aggregates._

The examples in this guide assume this static import.

The $match pipeline stage passes all documents matching the specified filter to the next stage. Though the filter can be an instance of any class that implements Bson, it’s convenient to use methods from the Filters class.

The following example creates a pipeline stage that matches all documents where the author field value is "Dave":

`match`(equal("author", "Dave"))

Note

As match is a reserved word in Scala and has to be escaped by backticks, you might prefer to use the filter() alias:

filter(equal("author", "Dave"))

The $project pipeline stage passes the projected fields of all documents to the next stage. Though the projection can be an instance of any class that implements Bson, it’s convenient to use methods from the Projections class.

The following example creates a pipeline stage that excludes the _id field but includes the title and author fields:

project(fields(include("title", "author"), excludeId()))

The $project stage can project computed fields as well.

The following example projects the qty field into a new field called quantity. In other words, it renames the field:

project(computed("quantity", "$qty"))

The $sample pipeline stage randomly select N documents from input documents. The following example uses the sample() method to randomly select 5 documents from the collection:

sample(5)

The $sort pipeline stage passes all documents to the next stage, sorted by the specified sort criteria. Though the sort criteria can be an instance of any class that implements Bson, it’s convenient to use methods from the Sorts class.

The following example creates a pipeline stage that sorts in descending order according to the value of the age field and then in ascending order according to the value of the posts field:

sort(orderBy(descending("age"), ascending("posts")))

The $skip pipeline stage skips over the specified number of documents that pass into the stage and passes the remaining documents to the next stage.

The following example skips the first 5 documents:

skip(5)

The $limit pipeline stage limits the number of documents passed to the next stage.

The following example limits the number of documents to 10:

limit(10)

The $lookup pipeline stage performs a left outer join with another collection to filter in documents from the joined collection for processing.

The following example performs a left outer join on the fromCollection collection, joining the local field to the from field and outputted in the joinedOutput field:

lookup("fromCollection", "local", "from", "joinedOutput")

The $group pipeline stage groups documents by some specified expression and outputs a document for each distinct grouping to the next stage. A group consists of an _id which specifies the expression on which to group, and zero or more accumulators which are evaluated for each grouping.

To simplify the expression of accumulators, the driver includes an Accumulators singleton object with factory methods for each of the supported accumulators.

The following example groups documents by the value of the customerId field, and for each group accumulates the sum and average of the values of the quantity field into the totalQuantity and averageQuantity fields, respectively:

group("$customerId", sum("totalQuantity", "$quantity"), avg("averageQuantity", "$quantity"))

The $unwind pipeline stage deconstructs an array field from the input documents to output a document for each element.

The following example outputs, for each document, a document for each element in the sizes array:

unwind("$sizes")

The following example also includes any documents that have missing or null values for the sizes field or where the sizes list is empty:

unwind("$sizes", UnwindOptions().preserveNullAndEmptyArrays(true))

The following example unwinds the sizes array and also outputs the array index into the position field:

unwind("$sizes", UnwindOptions().includeArrayIndex("$position"))

The $setWindowFields pipeline stage allows using window operators. This stage partitions the input documents similarly to the $group pipeline stage, optionally sorts them, computes fields in the documents by computing window functions over windows specified per function, and outputs the documents. A window is a subset of a partition.

The important difference from the $group pipeline stage is that documents belonging to the same partition or window are not folded into a single document.

The driver includes the WindowedComputations singleton object with factory methods for supported window operators.

The following example computes the accumulated rainfall and the average temperature over the past month per each locality from more fine-grained measurements presented in the rainfall and temperature fields:

val pastMonth: Window = Windows.timeRange(-1, MongoTimeUnit.MONTH, Windows.Bound.CURRENT)
setWindowFields(Some("$localityId"), Some(Sorts.ascending("measurementDateTime")),
WindowedComputations.sum("monthlyRainfall", "$rainfall", Some(pastMonth)),
WindowedComputations.avg("monthlyAvgTemp", "$temperature", Some(pastMonth)))

Pipeline operators are typically combined into a list and passed to the aggregate() method of a MongoCollection:

collection.aggregate(List(filter(equal("author", "Dave")),
group("$customerId", sum("totalQuantity", "$quantity"),
avg("averageQuantity", "$quantity")),
out("authors")))

Back

Sorts