Transform Your Data with Aggregation
Overview
In this guide, you can learn how to use the C++ driver to perform aggregation operations.
Aggregation operations process data in your MongoDB collections and return computed results. The MongoDB Aggregation framework, which is part of the Query API, is modeled on the concept of data processing pipelines. Documents enter a pipeline that contains one or more stages, and this pipeline transforms the documents into an aggregated result.
An aggregation operation is similar to a car factory. A car factory has an assembly line, which contains assembly stations with specialized tools to do specific jobs, like drills and welders. Raw parts enter the factory, and then the assembly line transforms and assembles them into a finished product.
The aggregation pipeline is the assembly line, aggregation stages are the assembly stations, and operator expressions are the specialized tools.
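The analogy can be sketched in plain C++ without the driver. The following is a minimal illustration only: `Doc`, `Docs`, `Stage`, `run_pipeline`, `match_stage`, and `limit_stage` are hypothetical names invented for this sketch, not part of mongocxx, and the server's real pipeline execution is considerably more involved. The sketch shows only the core idea that each stage receives the documents produced by the stage before it.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not driver types.
using Doc = std::map<std::string, std::string>;
using Docs = std::vector<Doc>;
using Stage = std::function<Docs(const Docs&)>;

// The "assembly line": runs each stage in order, feeding its output
// to the next stage.
Docs run_pipeline(const std::vector<Stage>& stages, Docs docs) {
    for (const auto& stage : stages) {
        docs = stage(docs);
    }
    return docs;
}

// An "assembly station" loosely modeled on $match: keeps only the
// documents whose `field` equals `value`.
Stage match_stage(std::string field, std::string value) {
    return [field, value](const Docs& in) {
        Docs out;
        for (const auto& doc : in) {
            auto it = doc.find(field);
            if (it != doc.end() && it->second == value) {
                out.push_back(doc);
            }
        }
        return out;
    };
}

// An "assembly station" loosely modeled on $limit: keeps at most the
// first `n` documents.
Stage limit_stage(std::size_t n) {
    return [n](const Docs& in) {
        auto count = static_cast<std::ptrdiff_t>(std::min(n, in.size()));
        Docs out(in.begin(), in.begin() + count);
        return out;
    };
}
```

Passing a vector that holds `match_stage("cuisine", "Bakery")` followed by `limit_stage(1)` to `run_pipeline` would first filter the input and then truncate the filtered result, which mirrors how stage order determines the outcome of a real pipeline.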
Aggregation Versus Find Operations
You can use find operations to perform the following actions:
Select which documents to return
Select which fields to return
Sort the results
You can use aggregation operations to perform the following actions:
Run find operations
Rename fields
Calculate fields
Summarize data
Group values
Limitations
Keep the following limitations in mind when using aggregation operations:
Returned documents cannot violate the BSON document size limit of 16 megabytes.
Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this limit by setting the allow_disk_use field of a mongocxx::options::aggregate instance to true.
Important
$graphLookup Exception
The $graphLookup stage has a strict memory limit of 100 megabytes and ignores the allow_disk_use field.
Aggregation Example
Note
The examples in this guide use the restaurants collection in the sample_restaurants database from the Atlas sample datasets. To learn how to create a free MongoDB Atlas cluster and load the sample datasets, see the Get Started with Atlas guide.
To perform an aggregation, pass a mongocxx::pipeline instance containing the aggregation stages to the collection.aggregate() method.
The following code example produces a count of the number of bakeries in each borough of New York. To do so, it uses an aggregation pipeline that contains the following stages:
$match stage to filter for documents in which the cuisine field contains the value "Bakery"
$group stage to group the matching documents by the borough field, accumulating a count of documents for each distinct value
mongocxx::pipeline stages;

stages.match(make_document(kvp("cuisine", "Bakery")))
      .group(make_document(kvp("_id", "$borough"),
                           kvp("count", make_document(kvp("$sum", 1)))));

auto cursor = collection.aggregate(stages);
for (auto&& doc : cursor) {
    std::cout << bsoncxx::to_json(doc) << std::endl;
}
{ "_id" : "Brooklyn", "count" : 173 }
{ "_id" : "Queens", "count" : 204 }
{ "_id" : "Bronx", "count" : 71 }
{ "_id" : "Staten Island", "count" : 20 }
{ "_id" : "Missing", "count" : 2 }
{ "_id" : "Manhattan", "count" : 221 }
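To make the semantics of the two stages concrete, the following driver-free sketch computes the same kind of result over an in-memory list. It is an illustration under stated assumptions, not how the server evaluates the pipeline: the `Restaurant` struct and the `count_by_borough` function are hypothetical names introduced here, and real documents carry many more fields.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a restaurant document.
struct Restaurant {
    std::string cuisine;
    std::string borough;
};

// Mirrors the pipeline's intent: filter by cuisine (like $match), then
// count the matching entries per borough (like $group with {$sum: 1}).
std::map<std::string, int> count_by_borough(
    const std::vector<Restaurant>& restaurants,
    const std::string& cuisine) {
    std::map<std::string, int> counts;
    for (const auto& restaurant : restaurants) {
        if (restaurant.cuisine != cuisine) {
            continue;  // Filtered out, as $match would do.
        }
        ++counts[restaurant.borough];  // Adds 1 to this borough's group.
    }
    return counts;
}
```

Each key in the returned map plays the role of the `_id` field in the output documents, and each value plays the role of `count`. Unlike the server's output, a `std::map` iterates in sorted key order.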
Explain an Aggregation
To view information about how MongoDB executes your operation, you can instruct the MongoDB query planner to explain it. When MongoDB explains an operation, it returns execution plans and performance statistics. An execution plan is a potential way MongoDB can complete an operation. When you instruct MongoDB to explain an operation, it returns both the plan MongoDB executed and any rejected execution plans.
To explain an aggregation operation, run the explain database command by specifying the command in a BSON document and passing it as an argument to the run_command() method.
The following example instructs MongoDB to explain the aggregation operation from the preceding Aggregation Example:
mongocxx::pipeline stages;

stages.match(make_document(kvp("cuisine", "Bakery")))
      .group(make_document(kvp("_id", "$borough"),
                           kvp("count", make_document(kvp("$sum", 1)))));

auto command = make_document(
    kvp("explain", make_document(
        kvp("aggregate", "restaurants"),
        kvp("pipeline", stages.view_array()),
        kvp("cursor", make_document()))));

auto result = db.run_command(command.view());
std::cout << bsoncxx::to_json(result) << std::endl;
{
  "explainVersion" : "2",
  "queryPlanner" : {
    "namespace" : "sample_restaurants.restaurants",
    "indexFilterSet" : false,
    "parsedQuery" : { "cuisine" : { "$eq" : "Bakery" } },
    "queryHash": "...",
    "planCacheKey" : "...",
    "optimizedPipeline" : true,
    "maxIndexedOrSolutionsReached": false,
    "maxIndexedAndSolutionsReached" : false,
    "maxScansToExplodeReached" : false,
    "winningPlan" : { ... }
  ...
}
Additional Information
MongoDB Server Manual
To view a full list of expression operators, see Aggregation Operators.
To learn about assembling an aggregation pipeline and view examples, see Aggregation Pipeline.
To learn more about creating pipeline stages, see Aggregation Stages.
To learn more about explaining MongoDB operations, see Explain Output and Query Plans.
API Documentation
For more information about executing aggregation operations with the C++ driver, see the following API documentation: