Returning an array after unwinding and calculating distribution

I’m trying to create an aggregation that will compute the distribution of values across an array of objects and return an array of computed values.

Here is the array from a sample document (stored in stats.dataHistogram):

[
  {
    "duration": 1208,
    "dataPoints": 2,
    "binMin": 0,
    "binMax": 5000
  },
  {
    "duration": 25735,
    "dataPoints": 3,
    "binMin": 5000,
    "binMax": 10000
  },
  {
    "duration": 0,
    "dataPoints": 0,
    "binMin": 10000,
    "binMax": 20000
  },
  {
    "duration": 54088,
    "dataPoints": 2,
    "binMin": 20000,
    "binMax": 28817
  }
]

I need to add up the durations of all the objects, then compute each object's share of that total (its ratio) and return a new array like so:

[
  {
    "duration": 1208,
    "dataPoints": 2,
    "binMin": 0,
    "binMax": 5000,
    "ratio": 0.014907874763979
  },
  {
    "duration": 25735,
    "dataPoints": 3,
    "binMin": 5000,
    "binMax": 10000,
    "ratio": 0.317594500870037
  },
  {
    "duration": 0,
    "dataPoints": 0,
    "binMin": 10000,
    "binMax": 20000,
    "ratio": 0
  },
  {
    "duration": 54088,
    "dataPoints": 2,
    "binMin": 20000,
    "binMax": 28817,
    "ratio": 0.667497624365983
  }
]
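As a reference for the arithmetic only, the intended result can be sketched in plain JavaScript (this is not MongoDB code, and `addRatios` is a hypothetical helper name used purely for illustration). It assumes the ratio should be 0 when the total duration is 0:

```javascript
// Plain JavaScript sketch of the intended computation (not an aggregation).
// `addRatios` is a hypothetical name; the zero-total rule is an assumption.
function addRatios(bins) {
  // Sum the durations of every bin in the array.
  const total = bins.reduce((sum, bin) => sum + bin.duration, 0);
  // Copy each bin and attach its share of the total.
  return bins.map((bin) => ({
    ...bin,
    ratio: total === 0 ? 0 : bin.duration / total,
  }));
}

const sample = [
  { duration: 1208, dataPoints: 2, binMin: 0, binMax: 5000 },
  { duration: 25735, dataPoints: 3, binMin: 5000, binMax: 10000 },
  { duration: 0, dataPoints: 0, binMin: 10000, binMax: 20000 },
  { duration: 54088, dataPoints: 2, binMin: 20000, binMax: 28817 },
];

const result = addRatios(sample);
```

The total here is 81031, so the first ratio is 1208 / 81031, which matches the expected output above.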

I am able to calculate the total duration and divide to get the ratio, but it only seems to do so for the first element of the array.

This is my aggregation so far:

[
  {$project: {
    _id: '$_id',
    username: 1,
    uuid: 1,
    data: '$stats.dataHistogram'
  }},
  {$unwind: {
    path: '$data'
  }},
  {$group: {
    _id: '$_id',
    data_bin: {$first: '$data'},
    total_duration: {$sum: '$data.duration'}
  }},
  {$project: {
    _id: '$_id',
    total_duration: 1,
    data_bin: 1,
    ratio: {
      $divide: [
        '$data_bin.duration',
        {$add: ['$total_duration', 1]}
      ]
    }
  }}
]

(I’m adding 1 to $total_duration because it can sometimes be 0, and then I get a “Cannot divide by zero” error.)

I feel like I’m super close but not sure what the next steps should be. Thanks for the help!

Hi @Hasan_Khan1 ,

Since you use the $first operator, $group keeps only the first array element for each _id.

You have two options: either use $push instead of $first and then $unwind again, dividing each element by the total, or use $map to traverse the pushed array and divide.
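A minimal sketch of the $push plus $map option, reusing the field names from the question's pipeline. The names `pushMapPipeline` and `data_bins` are illustrative, and the sketch assumes total_duration is never 0, since $divide raises an error on a zero divisor:

```javascript
// Sketch only: collects every bin with $push, then adds a ratio with $map.
// Assumes total_duration is non-zero; a zero total makes $divide fail.
const pushMapPipeline = [
  { $project: { username: 1, uuid: 1, data: "$stats.dataHistogram" } },
  { $unwind: { path: "$data" } },
  { $group: {
      _id: "$_id",
      data_bins: { $push: "$data" },               // keep all bins, not just the first
      total_duration: { $sum: "$data.duration" }
  } },
  { $project: {
      total_duration: 1,
      data_bins: {
        $map: {
          input: "$data_bins",
          as: "bin",
          // Merge the original bin with its computed ratio.
          in: {
            $mergeObjects: [
              "$$bin",
              { ratio: { $divide: ["$$bin.duration", "$total_duration"] } }
            ]
          }
        }
      }
  } }
];
```

Note that this $group stage drops username and uuid; if they are needed, they would have to be carried through with an accumulator such as $first.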

I recommend installing MongoDB Compass and using its aggregation builder. It makes this so much easier, believe me.

Please note that MongoDB 5.0 has $setWindowFields to perform window functions, which can help in this case.
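For illustration, a window-function version might look roughly like the sketch below. It requires MongoDB 5.0 or later, the name `windowPipeline` is illustrative, and it again assumes the total is never 0:

```javascript
// Sketch only: computes the per-document total with a window function.
// Assumes MongoDB 5.0+ and a non-zero total_duration.
const windowPipeline = [
  { $project: { username: 1, uuid: 1, data: "$stats.dataHistogram" } },
  { $unwind: "$data" },
  { $setWindowFields: {
      partitionBy: "$_id",                         // one partition per original document
      output: {
        total_duration: {
          $sum: "$data.duration",
          window: { documents: ["unbounded", "unbounded"] }  // whole partition
        }
      }
  } },
  { $addFields: {
      // Attach the ratio to each unwound bin.
      "data.ratio": { $divide: ["$data.duration", "$total_duration"] }
  } }
];
```

This leaves one output document per bin; getting back to a single array per _id would still need a $group with $push afterwards.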

Thanks
Pavel

Wait, why are you using $unwind at all? Aren’t these values inside of an array in a single document? If you don’t need to aggregate things across different documents, you should not be using $unwind/$group at all.

Try this, assuming the array is called data:

  {$addFields: {totalDuration: {$sum: "$data.duration"}}},
  {$addFields: {data: {$map: {
    input: "$data",
    in: {$mergeObjects: ["$$this", {ratio: {$divide: ["$$this.duration", "$totalDuration"]}}]}
  }}}}
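Because the question mentions that the total can be 0, one possible variation guards the division with $cond instead of adding 1 to the divisor, which would skew every ratio. This is only a sketch: the name `guardedPipeline` is illustrative, and returning 0 for a zero total is an assumption about the desired behavior:

```javascript
// Sketch only: same two $addFields stages, with a guard against a zero total.
// Returning 0 when totalDuration is 0 is an assumed convention.
const guardedPipeline = [
  { $addFields: { totalDuration: { $sum: "$data.duration" } } },
  { $addFields: {
      data: {
        $map: {
          input: "$data",
          in: {
            $mergeObjects: [
              "$$this",
              { ratio: {
                  $cond: [
                    { $eq: ["$totalDuration", 0] },  // nothing to divide by
                    0,
                    { $divide: ["$$this.duration", "$totalDuration"] }
                  ]
              } }
            ]
          }
        }
      }
  } }
];
```

No $unwind or $group is needed here, so the other top-level fields of each document pass through unchanged.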

Asya
