Docs Menu
Docs Home
/ / /
Mongoid
/

Transform Your Data with Aggregation

On this page

  • Overview
  • Compare Aggregation and Find Operations
  • Mongoid Builders
  • Example
  • Aggregation without Builders
  • Example
  • Additional Information
  • API Documentation

In this guide, you can learn how to use Mongoid to perform aggregation operations.

Aggregation operations process data in your MongoDB collections and return computed results. The MongoDB Aggregation framework, which is part of the Query API, is modeled on the concept of data processing pipelines. Documents enter a pipeline that contains one or more stages, and this pipeline transforms the documents into an aggregated result.

Aggregation operations function similarly to car factories with assembly lines. The assembly lines have stations with specialized tools to perform specific tasks. For example, when building a car, the assembly line begins with the frame. Then, as the car frame moves through the assembly line, each station assembles a separate part. The result is a transformed final product, the finished car.

The assembly line represents the aggregation pipeline, the individual stations represent the aggregation stages, the specialized tools represent the expression operators, and the finished product represents the aggregated result.

The following table lists the different tasks you can perform with find operations, compared to what you can achieve with aggregation operations. The aggregation framework provides expanded functionality that allows you to transform and manipulate your data.

Find Operations
Aggregation Operations
Select certain documents to return
Select which fields to return
Sort the results
Limit the results
Count the results
Select certain documents to return
Select which fields to return
Sort the results
Limit the results
Count the results
Rename fields
Compute new fields
Summarize data
Connect and merge data sets

You can construct an aggregation pipeline by using Mongoid's high-level domain-specific language (DSL). The DSL supports the following aggregation pipeline operators:

Operator
Method Name

$group

group

project

unwind

To create an aggregation pipeline by using one of the preceding operators, call the corresponding method on an instance of Criteria. Calling the method adds the aggregation operation to the pipeline atrritbure of the Criteria instance. To run the aggregation pipeline, pass the pipeline attribute value to the Collection#aggregate method.

Consider a database that contains a collection with documents that are modeled by the following classes:

class Tour
include Mongoid::Document
embeds_many :participants
field :name, type: String
field :states, type: Array
end
class Participant
include Mongoid::Document
embedded_in :tour
field :name, type: String
end

In this example, the Tour model represents the name of a tour and the states it travels through, and the Participant model represents the name of a person participating in the tour.

The following example creates an aggregation pipeline that outputs the states a participant has visited by using the following aggregation operations:

  • match, which find documents in which the participants.name field value is "Serenity"

  • unwind, which deconstructs the states array field and outputs a document for each element in the array

  • group, which groups the documents by the value of their states field

  • project, which prompts the pipeline to return only the _id and states fields

criteria = Tour.where('participant.name' => 'Serenity').
unwind(:states).
group(_id: 'states', :states.add_to_set => '$states').
project(_id: 0, states: 1)
@states = Tour.collection.aggregate(criteria.pipeline).to_json
[{"states":["OR","WA","CA"]}]

You can use the Collection#aggregate method to run aggregation operations that do not have corresponding builder methods by passing in an array of aggregation operations. Using this method to perform the aggregation returns raw BSON::Document objects rather than Mongoid::Document model instances.

Consider a database that contains a collection with documents that are modeled by the following classes:

class Band
include Mongoid::Document
has_many :tours
has_many :awards
field :name, type: String
end
class Tour
include Mongoid::Document
belongs_to :band
field :year, type: Integer
end
class Award
include Mongoid::Document
belongs_to :band
field :name, type: String
end

The following example creates an aggregation pipeline to retrieve all bands that have toured since 2000 and have at least 1 award:

band_ids = Band.collection.aggregate([
{ '$lookup' => {
from: 'tours',
localField: '_id',
foreignField: 'band_id',
as: 'tours',
} },
{ '$lookup' => {
from: 'awards',
localField: '_id',
foreignField: 'band_id',
as: 'awards',
} },
{ '$match' => {
'tours.year' => {'$gte' => 2000},
'awards._id' => {'$exists' => true},
} },
{'$project' => {_id: 1}},
])
bands = Band.find(band_ids.to_a)
[
{"_id": "...", "name": "Deftones" },
{"_id": "...", "name": "Tool"},
...
]

Tip

The preceding example projects only the _id field of the output documents. It then uses the projected results to find the documents and return them as Mongoid::Document model instances. This optional step is not required to run an aggregation pipeline.

To view a full list of aggregation operators, see Aggregation Operators.

To learn about assembling an aggregation pipeline and view examples, see Aggregation Pipeline.

To learn more about creating pipeline stages, see Aggregation Stages.

To learn more about any of the methods discussed in this guide, see the following API documentation:

Back

Modify Query Results