An aggregation over previous aggregation results

Hi all!
I need to perform an aggregation over the results of a previous aggregation.
I implemented this as follows:

  1. Run the first aggregation and write the result to a new collection through $out.
  2. Run the second aggregation on the new collection.
  3. Delete the new collection.
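
The three steps above can be sketched as one helper that drops the temporary collection even if the second aggregation fails. This is only a sketch under assumptions: the function name `aggregate_via_temp` is hypothetical, and `db` is assumed to be a pymongo `Database` handle that contains the `test_document` collection.

```python
def aggregate_via_temp(db, temp_name="agg_res_temp"):
    """Hypothetical helper: chain two aggregations through a temp collection."""
    try:
        # Step 1: count documents per link and write the counts to the temp collection.
        db["test_document"].aggregate([
            {"$group": {"_id": {"link": "$link"}, "count": {"$sum": 1}}},
            {"$out": temp_name},
        ])
        # Step 2: group the temp collection by count.
        return list(db[temp_name].aggregate([
            {"$group": {"_id": "$count", "count": {"$sum": 1}}},
        ]))
    finally:
        # Step 3: remove the temp collection whether or not step 2 succeeded.
        db.drop_collection(temp_name)
```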

Is there a more elegant way to perform the same procedure without $out?
I run this operation on the secondary (slave) node of my cluster.
I am worried that it could cause replication problems after a change of leadership.

My codeblock (Python):

# prepare test data
test_data = [
   {"item": 1, "link": "AAA"},
   {"item": 2, "link": "AAA"},
   {"item": 3, "link": "BBB"},
   {"item": 4, "link": "BBB"},
   {"item": 5, "link": "CCC"}
]

# initialize collection
test_document = db.test_document
test_document.insert_many(test_data)

# aggregation 1
test_document.aggregate([
    {"$group": {"_id": {"link": "$link"}, "count": {"$sum": 1}}},
    {"$out": "agg_res_temp"}
])

_id: {"link": "AAA"}, count: 2
_id: {"link": "BBB"}, count: 2
_id: {"link": "CCC"}, count: 1

# initialize new collection
agg_res_temp = db.agg_res_temp

# aggregation 2
agg_res_temp.aggregate([{"$group": {"_id": "$count", "count": {"$sum": 1}}}])

_id: 2, count: 2
_id: 1, count: 1

Thanks a lot!

If you don’t need the temporary collection for anything else, why not do it all in a single aggregate call, like this:

test_document.aggregate([
    {"$group": {"_id": {"link": "$link"}, "count": {"$sum": 1}}},
    {"$group": {"_id": "$count", "count": {"$sum": 1}}}
])

At each step of the aggregation pipeline, you have access to the results of the previous step.
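
To see what the two chained $group stages compute, here is a small plain-Python emulation on the test data from the question. It does not talk to MongoDB; `collections.Counter` simply stands in for each $group stage, and the variable names are illustrative only.

```python
from collections import Counter

# Test data from the question.
docs = [
    {"item": 1, "link": "AAA"},
    {"item": 2, "link": "AAA"},
    {"item": 3, "link": "BBB"},
    {"item": 4, "link": "BBB"},
    {"item": 5, "link": "CCC"},
]

# Equivalent of the first $group: number of documents per link.
per_link = Counter(doc["link"] for doc in docs)

# Equivalent of the second $group: it consumes the first stage's output
# and counts how many links share each count.
per_count = Counter(per_link.values())

print(dict(per_link))   # {'AAA': 2, 'BBB': 2, 'CCC': 1}
print(dict(per_count))  # {2: 2, 1: 1}
```

The second counter never touches the original documents, only the output of the first one, which is how consecutive pipeline stages relate to each other.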

@RemiJ, thanks a lot for the tip!
This approach solves my problem.
I didn’t realize that $group could be used this way.
I will definitely use it from now on :blush:

This composability aspect is described in the "Embrace Composability For Increased Productivity" chapter of the Practical MongoDB Aggregations book.
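
In that spirit, one possible way to make the composition visible in Python is to give each stage its own name and assemble the pipeline from those pieces. This is just a sketch: the variable names `group_by_link` and `group_by_count` are made up for illustration, and the final call assumes the `test_document` collection from the question.

```python
# Each stage is an ordinary dict, so it can be named, reused, and tested on its own.
group_by_link = {"$group": {"_id": {"link": "$link"}, "count": {"$sum": 1}}}
group_by_count = {"$group": {"_id": "$count", "count": {"$sum": 1}}}

# The pipeline is a list of stages; each one receives the previous stage's output.
pipeline = [group_by_link, group_by_count]

# With a live connection this would be run as:
# results = list(test_document.aggregate(pipeline))
```

To debug, you can run the pipeline with only the first stage and inspect the intermediate result before adding the second.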