How to use $bucketAuto to divide into evenly distributed buckets

Kushagra_Kesav · March 23, 2023, 3:09am

Welcome back to the MongoDB Community forums!

$bucketAuto accepts an optional granularity parameter, which makes the boundaries of all buckets adhere to a specified preferred number series. If you don’t specify granularity, it tries to distribute the documents evenly across the requested number of buckets**.

**Note: Whether the counts come out exactly equal depends purely on the number of incoming documents.
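
For reference, here is a minimal sketch of what setting granularity could look like. It assumes a collection named things with numeric, non-negative _id values, and "R5" (one of the Renard series) is just an illustrative choice. The variable name granularPipeline is mine, and I have not included output because the exact boundaries depend on your data.

```javascript
// Sketch only: "R5" is an arbitrary pick from the supported preferred number series.
// granularity requires all groupBy values to be numeric.
const granularPipeline = [
  {
    $bucketAuto: {
      groupBy: "$_id",
      buckets: 5,
      granularity: "R5" // bucket boundaries are snapped to the R5 series
    }
  }
];

// Runs the aggregation only inside mongosh, where `db` is defined.
// Assumes the collection is called `things`.
if (typeof db !== "undefined") {
  db.things.aggregate(granularPipeline).forEach(doc => printjson(doc));
}
```

With granularity set, the boundaries are rounded to the chosen series, so the per-bucket counts are generally less even than without it.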

For example:

A collection of things has an _id numbered from 1 to 100:

{ _id: 1 }
{ _id: 2 }
...
{ _id: 100 }

If I use $bucketAuto without specifying granularity, it distributes the documents into five buckets of 20 documents each.

db.things.aggregate( [
  {
    $bucketAuto: {
      groupBy: "$_id",
      buckets: 5,
      // granularity intentionally omitted
    }
  }
] )

{ "_id" : { "min" : 1, "max" : 21 }, "count" : 20 }
{ "_id" : { "min" : 21, "max" : 41 }, "count" : 20 }
{ "_id" : { "min" : 41, "max" : 61 }, "count" : 20 }
{ "_id" : { "min" : 61, "max" : 81 }, "count" : 20 }
{ "_id" : { "min" : 81, "max" : 100 }, "count" : 20 }

But if I add just one more document to the collection, the counts are no longer equal across the buckets.

{ "_id" : { "min" : 1, "max" : 21 }, "count" : 20 }
{ "_id" : { "min" : 21, "max" : 41 }, "count" : 20 }
{ "_id" : { "min" : 41, "max" : 61 }, "count" : 20 }
{ "_id" : { "min" : 61, "max" : 81 }, "count" : 20 }
{ "_id" : { "min" : 81, "max" : 101 }, "count" : 21 }

This is simply because 101 is not evenly divisible by 5. Overall, with $bucketAuto we can specify the number of buckets, but not the number of documents each bucket will contain.
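
To make the arithmetic concrete, here is a small plain-JavaScript sketch. The helper evenSplit is hypothetical and only does the division. It does not reproduce the server's bucketing algorithm, and it says nothing about which bucket receives the leftover documents.

```javascript
// Hypothetical helper: how many documents fit evenly into each bucket,
// and how many are left over and must make some bucket larger.
function evenSplit(totalDocs, buckets) {
  return {
    perBucket: Math.floor(totalDocs / buckets),
    leftover: totalDocs % buckets
  };
}

console.log(evenSplit(100, 5)); // { perBucket: 20, leftover: 0 }
console.log(evenSplit(101, 5)); // { perBucket: 20, leftover: 1 }
```

A leftover of 0 means equal counts are possible. Any other value means at least one bucket has to hold more documents than the others, as in the second result above, where the extra document ended up in the last bucket.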

I hope this answers your question.

Best,
Kushagra