Navigation

Aggregation Pipeline

Mongoid exposes MongoDB’s aggregation pipeline, which is used to construct flows of operations that process and return results. The aggregation pipeline is a superset of the deprecated map/reduce framework functionality.

Basic Usage

Querying Across Multiple Collections

The aggregation pipeline may be used for queries involving multiple referenced associations at the same time:

class Band
  include Mongoid::Document
  has_many :tours
  has_many :awards
  field :name, type: String
end

class Tour
  include Mongoid::Document
  belongs_to :band
  field :year, type: Integer
end

class Award
  include Mongoid::Document
  belongs_to :band
  field :name, type: String
end

To retrieve bands that toured since 2000 and have at least one award, one could do the following:

band_ids = Band.collection.aggregate([
  { '$lookup' => {
    from: 'tours',
    localField: '_id',
    foreignField: 'band_id',
    as: 'tours',
  } },
  { '$lookup' => {
    from: 'awards',
    localField: '_id',
    foreignField: 'band_id',
    as: 'awards',
  } },
  { '$match' => {
    'tours.year' => {'$gte' => 2000},
    'awards._id' => {'$exists' => true},
  } },
  {'$project' => {_id: 1}},
])
bands = Band.find(band_ids)

Note that the aggregation pipeline, since it is implemented by the Ruby driver for MongoDB and not Mongoid, returns raw BSON::Document objects rather than Mongoid::Document model instances. The above example projects only the _id field which is then used to load full models. An alternative is to not perform such a projection and work with raw fields, which would eliminate having to send the list of document ids to Mongoid in the second query (which could be large).

Builder DSL

Mongoid provides limited support for constructing the aggregation pipeline itself using a high-level DSL. The following aggregation pipeline operators are supported:

To construct a pipeline, call the corresponding aggregation pipeline methods on a Criteria instance. Aggregation pipeline operations are added to the pipeline attribute of the Criteria instance. To execute the pipeline, pass the pipeline attribute value to Collection#aggragegate method.

For example, given the following models:

class Tour
  include Mongoid::Document

  embeds_many :participants

  field :name, type: String
  field :states, type: Array
end

class Participant
  include Mongoid::Document

  embedded_in :tour

  field :name, type: String
end

We can find out which states a participant visited:

criteria = Tour.where('participants.name' => 'Serenity',).
  unwind(:states).
  group(_id: 'states', :states.add_to_set => '$states').
  project(_id: 0, states: 1)

pp criteria.pipeline
# => [{"$match"=>{"participants.name"=>"Serenity"}},
#     {"$unwind"=>"$states"},
#     {"$group"=>{"_id"=>"states", "states"=>{"$addToSet"=>"$states"}}},
#     {"$project"=>{"_id"=>0, "states"=>1}}]

Tour.collection.aggregate(criteria.pipeline).to_a

group

The group method adds a $group aggregation pipeline stage.

The field expressions support Mongoid symbol-operator syntax:

criteria = Tour.all.group(_id: 'states', :states.add_to_set => '$states')
criteria.pipeline
# => [{"$group"=>{"_id"=>"states", "states"=>{"$addToSet"=>"$states"}}}]

Alternatively, standard MongoDB aggregation pipeline syntax may be used:

criteria = Tour.all.group(_id: 'states', states: {'$addToSet' => '$states'})

project

The project method adds a $project aggregation pipeline stage.

The argument should be a Hash specifying the projection:

criteria = Tour.all.project(_id: 0, states: 1)
criteria.pipeline
# => [{"$project"=>{"_id"=>0, "states"=>1}}]

unwind

The unwind method adds an $unwind aggregation pipeline stage.

The argument can be a field name, specifiable as a symbol or a string, or a Hash or a BSON::Document instance:

criteria = Tour.all.unwind(:states)
criteria = Tour.all.unwind('states')
criteria.pipeline
# => [{"$unwind"=>"$states"}]

criteria = Tour.all.unwind(path: '$states')
criteria.pipeline
# => [{"$unwind"=>{:path=>"$states"}}]
←   Text Search Map/Reduce  →