How to compute frequency for multiple fields using a single pipeline in MongoDB

It turns out you can do this in a single pipeline and you don’t need $facet at all (which basically emulates multiple pipelines). Here’s how (using sample data that matches your field counts):

db.frequency.find()
{ "_id" : ObjectId("6112ece6bba8077232b072be"), "field1" : "value1", "field2" : "v2" }
{ "_id" : ObjectId("6112ece9bba8077232b072bf"), "field1" : "value1", "field2" : "v1" }
{ "_id" : ObjectId("6112ecebbba8077232b072c0"), "field1" : "value1", "field2" : "v1" }
{ "_id" : ObjectId("6112ecebbba8077232b072c1"), "field1" : "value1", "field2" : "v2" }
{ "_id" : ObjectId("6112ecf0bba8077232b072c2"), "field1" : "value2", "field2" : "v2" }
{ "_id" : ObjectId("6112ecf3bba8077232b072c3"), "field1" : "value2", "field2" : "v1" }
{ "_id" : ObjectId("6112ecf4bba8077232b072c4"), "field1" : "value2", "field2" : "v1" }
{ "_id" : ObjectId("6112ecf8bba8077232b072c5"), "field1" : "value2", "field2" : "v1" }
{ "_id" : ObjectId("6112ed21bba8077232b072c6"), "field1" : "value1", "field2" : "v1" }
{ "_id" : ObjectId("6112ed27bba8077232b072c7"), "field1" : "value1", "field2" : "v1" }
{ "_id" : ObjectId("6112ed27bba8077232b072c8"), "field1" : "value1", "field2" : "v1" }
{ "_id" : ObjectId("6112ed28bba8077232b072c9"), "field1" : "value1", "field2" : "v1" }
{ "_id" : ObjectId("6112ed29bba8077232b072ca"), "field1" : "value1", "field2" : "v1" }
{ "_id" : ObjectId("6112ed34bba8077232b072cb"), "field1" : "value1", "field2" : "v1" }
db.frequency.aggregate([
    {$project:{_id:0}},
    {$project:{x:{$objectToArray:"$$ROOT"}}},
    {$unwind:"$x"},
    {$group:{_id:{f:"$x.k",v:"$x.v"},count:{$sum:1}}},
    {$group:{_id:"$_id.f", values:{$push:{k:"$_id.v", v:"$count"}}}},
    {$group:{_id:0, all:{$push:{k:"$_id", v:{$arrayToObject:"$values"}}}}}, 
    {$replaceWith:{$arrayToObject:"$all"}}
]).pretty()
{
"field2" : {
	"v2" : 3,
	"v1" : 11
},
"field1" : {
	"value2" : 4,
	"value1" : 10
}
}

To explain, after removing _id, I convert the object into an array of key/value pairs, unwind them, then I group by key+value to get counts for each distinct field value, then group by field names (keys) to get things summarized by each field and then group by 0 to get a single document result. Pushing things into array of "k"/"v" subdocuments allows me to convert it back to an object where each field has subdocument of values and their counts.

Asya

2 Likes