Map/Reduce
Mongoid provides a DSL around MongoDB's map/reduce framework, for performing custom map/reduce jobs or simple aggregations.
Note
The map-reduce operation is deprecated. The aggregation framework provides better performance and usability than map-reduce operations, and should be preferred for new development.
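As a rough illustration of the preferred approach, the job used as the running example on this page (sum the likes per band name for bands with more than 100 likes) could be written as an aggregation pipeline. This is a minimal sketch, not an official migration recipe: the field names `name` and `likes` are taken from the Band examples on this page, and the result documents have the shape `{ "_id" => ..., "likes" => ... }` rather than the nested `value` hash that map/reduce produces.

```ruby
# Aggregation pipeline that sums likes per band name for bands with
# more than 100 likes. Field names follow the Band examples on this page.
pipeline = [
  # Keep only bands with more than 100 likes.
  { "$match" => { "likes" => { "$gt" => 100 } } },
  # Group by name and add up the likes in each group.
  { "$group" => { "_id" => "$name", "likes" => { "$sum" => "$likes" } } }
]

# With a connected Mongoid application, the pipeline could be run through
# the driver collection (commented out here because it needs a server):
# Band.collection.aggregate(pipeline).each { |doc| p doc }
```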
Execution
To perform a map/reduce, call map_reduce on a model class or a criteria and provide the map and reduce JavaScript functions as strings.
    map = %Q{
      function() {
        emit(this.name, { likes: this.likes });
      }
    }

    reduce = %Q{
      function(key, values) {
        var result = { likes: 0 };
        values.forEach(function(value) {
          result.likes += value.likes;
        });
        return result;
      }
    }

    Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1)
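To make clear what the two JavaScript functions compute, the following pure-Ruby sketch imitates the same steps in memory. It does not use Mongoid or MongoDB: the `bands` array is hypothetical sample data, and the lambda only mirrors the logic of the reduce function above.

```ruby
# Hypothetical in-memory stand-ins for Band documents.
bands = [
  { name: "Tool", likes: 120 },
  { name: "Tool", likes: 180 },
  { name: "Depeche Mode", likes: 250 },
  { name: "Placebo", likes: 40 }
]

# Criteria step: Band.where(:likes.gt => 100)
selected = bands.select { |band| band[:likes] > 100 }

# Map step: emit one (key, value) pair per document.
emitted = selected.map { |band| [band[:name], { likes: band[:likes] }] }

# Group the emitted values by key.
grouped = emitted.group_by(&:first)
                 .transform_values { |pairs| pairs.map(&:last) }

# Reduce step: fold all values for a key into a single value.
reduce = lambda do |_key, values|
  values.each_with_object({ likes: 0 }) do |value, result|
    result[:likes] += value[:likes]
  end
end

# The server skips reduce for keys with a single value; calling it
# anyway gives the same answer for this particular function.
results = grouped.map do |key, values|
  { "_id" => key, "value" => reduce.call(key, values) }
end
```

Here `results` holds one document per band name, in the same `_id`/`value` shape that map/reduce returns.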
Just like criteria, map/reduce calls are lazily evaluated: nothing is sent to the database until you iterate over the results or call a method on the wrapper that requires them.
    Band.map_reduce(map, reduce).out(replace: "mr-results").each do |document|
      p document
      # { "_id" => "Tool", "value" => { "likes" => 200 }}
    end
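The lazy behavior can be pictured with a small stand-in class. This is only a sketch of the idea: `LazyJob` is a hypothetical name, not Mongoid's implementation, and the block passed to it stands in for the database call.

```ruby
# Hypothetical stand-in for a lazily evaluated job; not Mongoid's class.
class LazyJob
  include Enumerable

  attr_reader :executions

  def initialize(&work)
    @work = work       # the deferred "database call"
    @executions = 0    # how many times the work has actually run
  end

  # The work runs only when results are actually needed.
  def each(&block)
    @executions += 1
    @work.call.each(&block)
  end
end

job = LazyJob.new { [{ "_id" => "Tool", "value" => { "likes" => 200 } }] }
before = job.executions # building the job has not run anything yet
docs = job.to_a         # iterating forces execution
after = job.executions
```

Building the job leaves the counter at zero; only the iteration through `to_a` runs the deferred work.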
The only thing you must provide along with a map/reduce is where to output the results, using out. If you do not provide an output location, an error is raised.
Valid options to #out are:
inline: 1
: Don't store the output in a collection.
replace: "name"
: Store in a collection with the provided name, and overwrite any documents that exist in it.
merge: "name"
: Store in a collection with the provided name, and merge the results with the existing documents.
reduce: "name"
: Store in a collection with the provided name, and reduce all existing results in that collection.
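The difference between the three collection modes is easiest to see on a toy example. The sketch below models the output collection as a plain Ruby Hash keyed by `_id`; `apply_out` and `sum_likes` are hypothetical helpers written for illustration and are not part of Mongoid or the driver.

```ruby
# Models an output collection as a Hash keyed by _id (an assumption
# made for illustration; real collections live on the server).
def apply_out(mode, existing, results, reducer)
  case mode
  when :replace
    # Discard whatever was in the collection and store the new results.
    results.dup
  when :merge
    # Keep existing documents; new results win on matching keys.
    existing.merge(results)
  when :reduce
    # On matching keys, run the reduce function over old and new values.
    existing.merge(results) { |key, old, fresh| reducer.call(key, [old, fresh]) }
  else
    raise ArgumentError, "unknown mode: #{mode}"
  end
end

sum_likes = ->(_key, values) { { likes: values.sum { |v| v[:likes] } } }

existing = { "Tool" => { likes: 100 }, "Placebo" => { likes: 10 } }
results  = { "Tool" => { likes: 200 } }

replaced = apply_out(:replace, existing, results, sum_likes)
merged   = apply_out(:merge, existing, results, sum_likes)
reduced  = apply_out(:reduce, existing, results, sum_likes)
```

With this data, `replaced` contains only Tool with 200 likes, `merged` keeps Placebo and overwrites Tool with 200, and `reduced` keeps Placebo and combines Tool to 300.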
Raw Results
Results of Map/Reduce execution can be retrieved via the execute method or its aliases raw and results:
    mr = Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1)
    mr.execute
    # => {"results"=>[{"_id"=>"Tool", "value"=>{"likes"=>200.0}}],
    #     "timeMillis"=>14,
    #     "counts"=>{"input"=>4, "emit"=>4, "reduce"=>1, "output"=>1},
    #     "ok"=>1.0,
    #     "$clusterTime"=>{"clusterTime"=>#<BSON::Timestamp:0x00005633c2c2ad20 @seconds=1590105400, @increment=1>,
    #                      "signature"=>{"hash"=><BSON::Binary:0x12240 type=generic data=0x0000000000000000...>, "keyId"=>0}},
    #     "operationTime"=>#<BSON::Timestamp:0x00005633c2c2aaf0 @seconds=1590105400, @increment=1>}
Statistics
MongoDB servers 4.2 and lower provide Map/Reduce execution statistics. As of MongoDB 4.4, Map/Reduce is implemented via the aggregation pipeline and statistics described in this section are not available.
The following methods are provided on the MapReduce object:
counts
: Number of documents read, emitted, reduced and output through the pipeline.
input, emitted, reduced, output
: Individual count methods. Note that the emitted and reduced methods are named differently from the hash keys in counts.
time
: The time, in milliseconds, that the Map/Reduce pipeline took to execute.
The following code illustrates retrieving the statistics:
    mr = Band.where(:likes.gt => 100).map_reduce(map, reduce).out(inline: 1)

    mr.counts
    # => {"input"=>4, "emit"=>4, "reduce"=>1, "output"=>1}

    mr.input
    # => 4

    mr.emitted
    # => 4

    mr.reduced
    # => 1

    mr.output
    # => 1

    mr.time
    # => 14
Note
Each statistics method invocation re-executes the Map/Reduce pipeline. Mongoid does not store the results of execution. If you need more than one statistic, consider calling the execute method once and reading the statistics from the raw results.
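One way to follow this advice is to read every statistic from a single raw result hash. The sketch below uses a hard-coded hash trimmed from the sample execute output on this page in place of a live call; the `stats` hash and its key names are an illustrative choice, not a Mongoid structure.

```ruby
# Sample raw result, trimmed from the execute output shown on this page.
# In an application it would come from a single call: raw = mr.execute
raw = {
  "results" => [{ "_id" => "Tool", "value" => { "likes" => 200.0 } }],
  "timeMillis" => 14,
  "counts" => { "input" => 4, "emit" => 4, "reduce" => 1, "output" => 1 },
  "ok" => 1.0
}

# Servers that do not report statistics omit these keys, so fall back
# to an empty hash and let the individual values be nil.
counts = raw.fetch("counts", {})

# Note the key names: "emit" and "reduce" in the raw hash correspond
# to the emitted and reduced methods.
stats = {
  input: counts["input"],
  emitted: counts["emit"],
  reduced: counts["reduce"],
  output: counts["output"],
  time: raw["timeMillis"]
}
```

This reads all five statistics from one execution instead of re-running the pipeline for each.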