Aggregation

Overview

In this guide, you can learn how to use the C driver to perform aggregation operations.

You can use aggregation operations to process data in your MongoDB collections and return computed results. The MongoDB Aggregation framework, which is part of the Query API, is modeled on the concept of a data processing pipeline. Documents enter a pipeline that contains one or more stages, and each stage transforms the documents to output a final aggregated result.

You can think of an aggregation operation as similar to a car factory. A car factory has an assembly line, which contains assembly stations with specialized tools to do specific jobs, like drills and welders. Raw parts enter the factory, and then the assembly line transforms and assembles them into a finished product.

The aggregation pipeline is the assembly line, aggregation stages are the assembly stations, and operator expressions are the specialized tools.
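The pipeline idea can be sketched in plain C, without the driver. This is only an analogy: the restaurant_t and borough_count_t types and both functions are hypothetical, and real stages run on the server over BSON documents. The first function plays the role of a filtering stage and the second the role of a grouping stage, with the output of one feeding the other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory "document"; real documents are BSON. */
typedef struct {
    const char *cuisine;
    const char *borough;
} restaurant_t;

/* One output row of the grouping stage. */
typedef struct {
    const char *borough; /* plays the role of the group _id */
    int count;
} borough_count_t;

/* Stage 1 (filter): copies documents whose cuisine equals `cuisine`
 * into `out`, which must have room for `n` entries. Returns the number kept. */
size_t
match_cuisine (const restaurant_t *in, size_t n, const char *cuisine, restaurant_t *out)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (strcmp (in[i].cuisine, cuisine) == 0) {
            out[kept++] = in[i];
        }
    }
    return kept;
}

/* Stage 2 (group): counts documents per distinct borough, in order of
 * first appearance. Returns the number of groups written to `out`. */
size_t
group_by_borough (const restaurant_t *in, size_t n, borough_count_t *out, size_t out_cap)
{
    size_t groups = 0;
    for (size_t i = 0; i < n; i++) {
        size_t g = 0;
        while (g < groups && strcmp (out[g].borough, in[i].borough) != 0) {
            g++;
        }
        if (g == groups) {
            assert (groups < out_cap); /* caller must provide enough room */
            out[groups].borough = in[i].borough;
            out[groups].count = 0;
            groups++;
        }
        out[g].count++;
    }
    return groups;
}
```

To chain the stages, call match_cuisine() on the input array and pass its output and return value to group_by_borough(), much as documents flow from one pipeline stage to the next.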

Compare Aggregation and Find Operations

You can use find operations to perform the following actions:

Select which documents to return
Select which fields to return
Sort the results

You can use aggregation operations to perform the following actions:

Perform find operations
Rename fields
Calculate fields
Summarize data
Group values

Limitations

The following limitations apply when using aggregation operations:

Returned documents must not violate the BSON document size limit of 16 megabytes.
Pipeline stages have a memory limit of 100 megabytes by default. You can exceed this limit by setting the allowDiskUse option to true.

Important

$graphLookup exception

The $graphLookup stage has a strict memory limit of 100 megabytes and ignores the allowDiskUse option.

Aggregation Examples

The examples in this guide use the restaurants collection in the sample_restaurants database from the Atlas sample datasets. To learn how to create a free MongoDB Atlas cluster and load the sample datasets, see the MongoDB Get Started guide.

Tip

Complete Aggregation Tutorials

You can find tutorials that provide detailed explanations of common aggregation tasks in the Complete Aggregation Pipeline Tutorials section of the Server manual. Select a tutorial, and then pick C from the Select your language drop-down menu in the upper-right corner of the page.

Build and Execute an Aggregation Pipeline

To perform an aggregation on the documents in a collection, pass a bson_t structure that represents the pipeline stages to the mongoc_collection_aggregate() function.

This example outputs a count of the number of bakeries in each borough of New York City. The following code creates an aggregation pipeline that contains the following stages:

A $match stage to filter for documents in which the value of the cuisine field is "Bakery".
A $group stage to group the matching documents by the borough field, producing a count of documents for each distinct value of that field.

const bson_t *doc;
bson_t *pipeline = BCON_NEW("pipeline",
    "[", 
    "{", "$match", "{", "cuisine", BCON_UTF8("Bakery"), "}", "}",
    "{", "$group", "{", 
        "_id", BCON_UTF8("$borough"), "count", "{", "$sum", BCON_INT32(1), "}", "}",
    "}",
    "]");
mongoc_cursor_t *results =
    mongoc_collection_aggregate(collection, MONGOC_QUERY_NONE, pipeline, NULL, NULL);
bson_error_t error;
// Print each result document
while (mongoc_cursor_next(results, &doc)) {
    char *str = bson_as_canonical_extended_json(doc, NULL);
    printf("%s\n", str);
    bson_free(str);
}
// Check for errors after iterating, because the command is sent on the first call to mongoc_cursor_next()
if (mongoc_cursor_error(results, &error)) {
    fprintf(stderr, "Aggregate failed: %s\n", error.message);
}
bson_destroy(pipeline);
mongoc_cursor_destroy(results);

{ "_id" : "Queens", "count" : { "$numberInt" : "204" } }
{ "_id" : "Staten Island", "count" : { "$numberInt" : "20" } }
{ "_id" : "Missing", "count" : { "$numberInt" : "2" } }
{ "_id" : "Bronx", "count" : { "$numberInt" : "71" } }
{ "_id" : "Brooklyn", "count" : { "$numberInt" : "173" } }
{ "_id" : "Manhattan", "count" : { "$numberInt" : "221" } }

Explain an Aggregation

To view information about how MongoDB executes your operation, you can run the explain operation on your pipeline. When MongoDB explains an operation, it returns execution plans and performance statistics. An execution plan is one possible way for MongoDB to complete an operation. The explain output includes both the plan that MongoDB selected for the operation and any rejected execution plans.

The following code example runs the same aggregation shown in the preceding section, but uses the mongoc_client_command_simple() function to explain the operation details:

bson_t reply;
bson_error_t error;
bson_t *command = BCON_NEW(
    "aggregate", BCON_UTF8("restaurants"),
    "explain", BCON_BOOL(true),
    "pipeline",
    "[",
    "{", "$match", "{", "cuisine", BCON_UTF8("Bakery"), "}", "}",
    "{", "$group", "{",
         "_id", BCON_UTF8("$borough"), "count", "{", "$sum", BCON_INT32(1), "}", "}",
    "}",
    "]");
if (mongoc_client_command_simple(client, "sample_restaurants", command, NULL, &reply, &error)) {
    char *str = bson_as_canonical_extended_json(&reply, NULL);
    printf("%s\n", str);
    bson_free(str);
} else {
    fprintf(stderr, "Command failed: %s\n", error.message);
}
bson_destroy(command);
bson_destroy(&reply);

{
  "explainVersion": "2",
  "queryPlanner": {
      "namespace": "sample_restaurants.restaurants",
      "indexFilterSet": false,
      "parsedQuery": {
        "cuisine": {"$eq": "Bakery"}
      },
      "queryHash": "865F14C3",
      "planCacheKey": "0697561B",
      "optimizedPipeline": true,
      "maxIndexedOrSolutionsReached": false,
      "maxIndexedAndSolutionsReached": false,
      "maxScansToExplodeReached": false,
      "winningPlan": { ... },
      "rejectedPlans": []
      ...
  }
  ...
}

Additional Information

For a full list of aggregation stages, see Aggregation Stages in the MongoDB Server manual.

To learn about assembling an aggregation pipeline and view examples, see Aggregation Pipeline in the MongoDB Server manual.

To learn more about explaining MongoDB operations, see Explain Output and Query Plans in the MongoDB Server manual.

API Documentation

For more information about executing aggregation operations with the C driver, see the API documentation for the following functions used in this guide:

mongoc_collection_aggregate()
mongoc_client_command_simple()