- Aggregation >
- Aggregation Concepts >
- Aggregation Mechanics >
- Aggregation Pipeline and Sharded Collections
Aggregation Pipeline and Sharded Collections¶
On this page
The aggregation pipeline supports operations on sharded collections. This section describes behaviors specific to the aggregation pipeline and sharded collections.
Behavior¶
Changed in version 2.6.
When operating on a sharded collection, the aggregation pipeline is
split into two parts. The first pipeline runs on each shard, or if an
early $match
can exclude shards through the use of the
shard key in the predicate, the pipeline runs on only the relevant
shards.
The second pipeline consists of the remaining pipeline stages and runs
on the primary shard. The primary shard merges
the cursors from the other shards and runs the second pipeline on these
results. The primary shard forwards
the final results to the mongos
. In previous versions, the
second pipeline would run on the mongos
.
[1]
Optimization¶
When splitting the aggregation pipeline into two parts, the pipeline is split to ensure that the shards perform as many stages as possible with consideration for optimization.
To see how the pipeline was split, include the explain
option in the
db.collection.aggregate()
method.
Optimizations are subject to change between releases.
[1] | Until all shards upgrade to v2.6, the second pipeline runs on the
mongos if any shards are still running v2.4. |