Aggregation
On this page
Overview
In this guide, you can learn how to perform aggregation operations in the Rust driver.
Aggregation operations process data in your MongoDB collections based on specifications you can set in an aggregation pipeline. An aggregation pipeline consists of one or more stages. Each stage performs an operation based on its expression operators. After the driver executes the aggregation pipeline, it returns an aggregated result.
This guide includes the following sections:
Compare Aggregation and Find Operations describes the functionality differences between aggregation and find operations
Server Limitations describes the server limitations on memory usage for aggregation operations
Examples provides examples of aggregations for different use cases
Additional Information provides links to resources and API documentation for types and methods mentioned in this guide
Analogy
Aggregation operations function similarly to car factories with assembly lines. The assembly lines have stations with specialized tools to perform specific tasks. For example, when building a car, the assembly line begins with the frame. Then, as the car frame moves through the assembly line, each station assembles a separate part. The result is a transformed final product, the finished car.
The assembly line represents the aggregation pipeline, the individual stations represent the aggregation stages, the specialized tools represent the expression operators, and the finished product represents the aggregated result.
Compare Aggregation and Find Operations
The following table lists the different tasks you can perform with find operations, compared to what you can achieve with aggregation operations. The aggregation framework provides expanded functionality that allows you to transform and manipulate your data.
Find Operations | Aggregation Operations |
---|---|
Select certain documents to return Select which fields to return Sort the results Limit the results Count the results | Select certain documents to return Select which fields to return Sort the results Limit the results Count the results Rename fields Compute new fields Summarize data Connect and merge data sets |
Server Limitations
When performing aggregation operations, consider the following limitations:
Returned documents must not violate the BSON document size limit of 16 megabytes.
Pipeline stages have a memory limit of 100 megabytes by default. If required, you can exceed this limit by setting the allow_disk_use field in your
AggregateOptions
.The $graphLookup operator has a strict memory limit of 100 megabytes and ignores the
allow_disk_use
setting.
Examples
The examples in this section use the following sample documents. Each document represents a user profile on a book review website and contains information about their name, age, genre interests, and date that they were last active on the website:
{ "name": "Sonya Mehta", "age": 23, "genre_interests": ["fiction", "mystery", "memoir"], "last_active": { "$date": "2023-05-13T00:00:00.000Z" } }, { "name": "Selena Sun", "age": 45, "genre_interests": ["fiction", "literary", "theory"], "last_active": { "$date": "2023-05-25T00:00:00.000Z" } }, { "name": "Carter Johnson", "age": 56, "genre_interests": ["literary", "self help"], "last_active": { "$date": "2023-05-31T00:00:00.000Z" } }, { "name": "Rick Cortes", "age": 18, "genre_interests": ["sci-fi", "fantasy", "memoir"], "last_active": { "$date": "2023-07-01T00:00:00.000Z" } }, { "name": "Belinda James", "age": 76, "genre_interests": ["literary", "nonfiction"], "last_active": { "$date": "2023-06-11T00:00:00.000Z" } }, { "name": "Corey Saltz", "age": 29, "genre_interests": ["fiction", "sports", "memoir"], "last_active": { "$date": "2023-01-23T00:00:00.000Z" } }, { "name": "John Soo", "age": 16, "genre_interests": ["fiction", "sports"], "last_active": { "$date": "2023-01-03T00:00:00.000Z" } }, { "name": "Lisa Ray", "age": 39, "genre_interests": ["poetry", "art", "memoir"], "last_active": { "$date": "2023-05-30T00:00:00.000Z" } }, { "name": "Kiran Murray", "age": 20, "genre_interests": ["mystery", "fantasy", "memoir"], "last_active": { "$date": "2023-01-30T00:00:00.000Z" } }, { "name": "Beth Carson", "age": 31, "genre_interests": ["mystery", "nonfiction"], "last_active": { "$date": "2023-08-04T00:00:00.000Z" } }, { "name": "Thalia Dorn", "age": 21, "genre_interests": ["theory", "literary", "fiction"], "last_active": { "$date": "2023-08-19T00:00:00.000Z" } }, { "name": "Arthur Ray", "age": 66, "genre_interests": ["sci-fi", "fantasy", "fiction"], "last_active": { "$date": "2023-11-27T00:00:00.000Z" } }
Age Insights by Genre
The following example calculates the average, minimum, and maximum age of users interested in each genre.
The aggregation pipeline contains the following stages:
An
$unwind
stage to separate each array entry in thegenre_interests
field into a new document.A
$group
stage to group documents by the value of thegenre_interests
field. This stage finds the average, minimum, and maximum user age by using the$avg
,$min
, and$max
operators.
let age_pipeline = vec![ doc! { "$unwind": doc! { "path": "$genre_interests" } }, doc! { "$group": doc! { "_id": "$genre_interests", "avg_age": doc! { "$avg": "$age" }, "min_age": doc! { "$min": "$age" }, "max_age": doc! { "$max": "$age" } } } ]; let mut results = my_coll.aggregate(age_pipeline).await?; while let Some(result) = results.try_next().await? { println!("* {:?}", result); }
* { "_id": "memoir", "avg_age": 25.8, "min_age": 18, "max_age": 39 } * { "_id": "sci-fi", "avg_age": 42, "min_age": 18, "max_age": 66 } * { "_id": "fiction", "avg_age": 33.333333333333336, "min_age": 16, "max_age": 66 } * { "_id": "nonfiction", "avg_age": 53.5, "min_age": 31, "max_age": 76 } * { "_id": "self help", "avg_age": 56, "min_age": 56, "max_age": 56 } * { "_id": "poetry", "avg_age": 39, "min_age": 39, "max_age": 39 } * { "_id": "literary", "avg_age": 49.5, "min_age": 21, "max_age": 76 } * { "_id": "fantasy", "avg_age": 34.666666666666664, "min_age": 18, "max_age": 66 } * { "_id": "mystery", "avg_age": 24.666666666666668, "min_age": 20, "max_age": 31 } * { "_id": "theory", "avg_age": 33, "min_age": 21, "max_age": 45 } * { "_id": "art", "avg_age": 39, "min_age": 39, "max_age": 39 } * { "_id": "sports", "avg_age": 22.5, "min_age": 16, "max_age": 29 }
Group by Time Component
The following example finds how many users were last active in each month.
The aggregation pipeline contains the following stages:
$project
stage to extract the month from thelast_active
field as a number into themonth_last_active
field$group
stage to group documents by themonth_last_active
field and count the number of documents for each month$sort
stage to set an ascending sort on the month
let last_active_pipeline = vec![ doc! { "$project": { "month_last_active" : doc! { "$month" : "$last_active" } } }, doc! { "$group": doc! { "_id" : doc! {"month_last_active": "$month_last_active"} , "number" : doc! { "$sum" : 1 } } }, doc! { "$sort": { "_id.month_last_active" : 1 } } ]; let mut results = my_coll.aggregate(last_active_pipeline).await?; while let Some(result) = results.try_next().await? { println!("* {:?}", result); }
* { "_id": { "month_last_active": 1 }, "number": 3 } * { "_id": { "month_last_active": 5 }, "number": 4 } * { "_id": { "month_last_active": 6 }, "number": 1 } * { "_id": { "month_last_active": 7 }, "number": 1 } * { "_id": { "month_last_active": 8 }, "number": 2 } * { "_id": { "month_last_active": 11 }, "number": 1 }
Calculate Popular Genres
The following example finds the three most popular genres based on how often they appear in users' interests.
The aggregation pipeline contains the following stages:
$unwind
stage to separate each array entry in thegenre_interests
field into a new document$group
stage to group documents by thegenre_interests
field and count the number of documents for each genre$sort
stage to set a descending sort on the genre popularity$limit
stage to show only the first three genres
let popularity_pipeline = vec![ doc! { "$unwind" : "$genre_interests" }, doc! { "$group" : doc! { "_id" : "$genre_interests" , "number" : doc! { "$sum" : 1 } } }, doc! { "$sort" : doc! { "number" : -1 } }, doc! { "$limit": 3 } ]; let mut results = my_coll.aggregate(popularity_pipeline).await?; while let Some(result) = results.try_next().await? { println!("* {:?}", result); }
* { "_id": "fiction", "number": 6 } * { "_id": "memoir", "number": 5 } * { "_id": "literary", "number": 4 }
Additional Information
To learn more about the concepts mentioned in this guide, see the following Server manual entries:
To learn more about the behavior of the aggregate()
method, see the
Aggregation Operations section of the
Retrieve Data guide.
To learn more about sorting results within an aggregation pipeline, see the Sort Results guide.
API Documentation
To learn more about the methods and types mentioned in this guide, see the following API documentation: