Sum in each floor by category in aggregation

I have a MongoDB collection as below picture (sample).

I want to have a sum of each category one after another. If I use ‘$group’ aggregation, I will have the sum of all categories, which is not my desired result.

Can anyone help me in this case?

I have no idea what to try.

1 Like

You mentioned you want the sum in each floor in your title yet your sample data has no field that we could interpret as a floor. It is also not clear what criteria make the A of 2021-01-01 any different from the A of 2023-01-08 but it looks you do not want them to be sum together.

Please share your sample documents in text form so that we can cut-n-paste into our system to experiment with your issue.

Also share the aggregation you tried and explain how it fails from your expected result.

Hello Steeve,
thanks for reply.
I did some changes to make the data clear.
here are my data:

[
{
User: "Edvin",
Date: "2023-01-01",
City : "Paris",
Cost: 10
},
{
User: "Edvin",
Date: "2023-01-02",
City : "Paris",
Cost: 7
},
{
User: "Edvin",
Date: "2023-01-03",
City : "Milan",
Cost: -9
},
{
User: "Edvin",
Date: "2023-01-04",
City : "Milan",
Cost: 6
},
{
User: "Edvin",
Date: "2023-01-05",
City : "Milan",
Cost: 25
},
{
User: "Edvin",
Date: "2023-01-06",
City : "Berlin",
Cost: 3
},
{
User: "Edvin",
Date: "2023-01-07",
City : "Berlin",
Cost: 17.5
},
{

User: "Edvin",
Date: "2023-01-08",
City : "Paris",
Cost: 4
},
{
User: "Edvin",
Date: "2023-01-9",
City : "Paris",
Cost: 5
},
{
User: "Edvin",
Date: "2023-01-10",
City : "Milan",
Cost: 1
},
{
User: "Edvin",
Date: "2023-01-11",
City : "Paris",
Cost: 9
},
{
User: "Edvin",
Date: "2023-01-12",
City : "Berlin",
Cost: 12
},
{
User: "Edvin",
Date: "2023-01-13",
City : "Berlin",
Cost: 1.5
},
{
User: "Edvin",
Date: "2023-01-14",
City : "Berlin",
Cost: 3.2
},
]

What i’m trying to do is to find each user Timeline including Cost.
(From the first line of my row data)Assume Edvin went Paris on 2023-01-01 and he left Paris on 2023-01-02, then he went to Milan on 2023-01-03 and left on 2023-01-05 and so on.
I want some of cost by each destination but step by step.
i.e:
1: Paris - 2023-01-02($last date), total cost: 17
2: Milan - 2023-01-05 - total cost: 22
3- Berlin - 2023-01-07 - total cost: 20.5
4- Paris - 2023-01-09 - total cost: 9
and so on

I really have no idea what to do.
If i use $group, it returns 3 cities overall.
I attached my desire result in a chart to make the question clear.

Thanks for the documents and the extra explication.

That is very interesting and I do not have an obvious solution at hand.

A possible solution to investigate is:

1 - $match the user
2 - $sort on date
3 - $group _id:null, with $push
4 - $set with a $reduce to create an array of each leg (the hard part)
5 - $set with a $reduce on the legs array that $sum the values

1 Like

@Hermann_Samimi I’m sure this is possible to do with aggregation exclusively, but assuming you’re returning a sorted set of data you can do this with JavaScript directly as follows:

var current = { City: null, Cost: null };
var groups = [];
db.aggregate([{
  $documents: [
    { User: "Edvin", Date: "2023-01-01", City: "Paris", Cost: 10 },
    { User: "Edvin", Date: "2023-01-02", City: "Paris", Cost: 7 }, 
    { User: "Edvin", Date: "2023-01-03", City: "Milan", Cost: -9 }, 
    { User: "Edvin", Date: "2023-01-04", City: "Milan", Cost: 6 }, 
    { User: "Edvin", Date: "2023-01-05", City: "Milan", Cost: 25 }, 
    { User: "Edvin", Date: "2023-01-06", City: "Berlin", Cost: 3 }, 
    { User: "Edvin", Date: "2023-01-07", City: "Berlin", Cost: 17.5 }, 
    { User: "Edvin", Date: "2023-01-08", City: "Paris", Cost: 4 }, 
    { User: "Edvin", Date: "2023-01-9", City: "Paris", Cost: 5 }, 
    { User: "Edvin", Date: "2023-01-10", City: "Milan", Cost: 1 }, 
    { User: "Edvin", Date: "2023-01-11", City: "Paris", Cost: 9 }, 
    { User: "Edvin", Date: "2023-01-12", City: "Berlin", Cost: 12 }, 
    { User: "Edvin", Date: "2023-01-13", City: "Berlin", Cost: 1.5 }, 
    { User: "Edvin", Date: "2023-01-14", City: "Berlin", Cost: 3.2 },  
]}]).forEach((d) => {  
  if (current.City != d.City) {    
    if (current.Cost != null) {
      groups.push(current);
    }
    current = { City: d.City, Cost: d.Cost }; 
  } else {
    current.Cost += d.Cost;
  }  
});
groups.push(current);
console.log(groups);

Note that $documents here is just used to avoid creating a collection to insert the data into, but you could replace that with a db.coll.find({ .... }).sort({ Date: 1 }) just as easily.

1 Like

Everything can be done with any language when you download all the data you need from the server. The challenge is to reduce bandwidth and profit from the fact that usually the server has more power. The hard part is to implement the same algorithm as the forEach().

And since I am always up for a challenge here is something that almost work and can be made to work with a few tweaks.

function match()
{
   return { "$match" : {
      "User" : "Edvin"
   } }
}

function sort()
{
   return { "$sort" : {
      "Date" : 1
   } }
}

function group()
{
   return { "$group" : {
      "_id" : null ,
      "_tmp_cost" : { "$push" : {
         "_group_city" : "$City" ,
         "_group_cost" : "$Cost"
      } }
   } }
}
function reduce()
{
   return { "$set" : { "_tmp_legs" : { "$reduce" : {
      "input" : "$_tmp_cost" ,
      "initialValue" : { "_city" : null , "_legs" : [] , "_leg_cost" : 0 } ,
      "in" : { "$cond" : {
         "if" : { "$eq" : [ "$$this._group_city" , "$$value._city" ] } ,
         "then" : {
            "_city" : "$$this._group_city" ,
            "_legs" : "$$value._legs" ,
            "_leg_cost" : { "$sum" : [ "$$value._leg_cost" , "$$this._group_cost" ] }
         } ,
         "else" : {
            "_city" : "$$this._group_city" ,
            "_legs" : {
                  "$concatArrays" : [
                     "$$value._legs" ,
                     [ { "_city" : "$$value._city" , "_leg_cost" : "$$value._leg_cost" } ]
                  ]
            } ,
            "_leg_cost" : "$$this._group_cost"
         }
      } }
   } } } } ;
}

function pipeline()
{
   return [
      match() ,
      sort() ,
      group() ,
      reduce()
   ] ;
}

db.collection.aggregate( pipeline() )

I mentioned almost work because the first element of _legs is not valid and should be remove from the final result. In addition the last leg is _tmp_legs._city with _tmp_legs._leg_cost. But these are cosmetic details that can be easily fixed with a $project or $set. You should also $unset the _tmp fields that are created.

Anyway the calculated values are the one you expect once the sample data is corrected from:

to

{
User: "Edvin",
Date: "2023-01-09",
City : "Paris",
Cost: 5
},

With a easy Saturday morning I have updated the solution with the cosmetic stages I mentioned. I also added the processing of the first and last date, a requirement I missed from the original post.

function match()
{
   return { "$match" : {
      "User" : "Edvin"
   } }
}

function sort()
{
   return { "$sort" : {
      "Date" : 1
   } }
}

function group()
{
   return { "$group" : {
      "_id" : null ,
      "_tmp_cost" : { "$push" : {
         "_group_city" : "$City" ,
         "_group_date" : "$Date" ,
         "_group_cost" : "$Cost"
      } }
   } }
}
function reduce()
{
   return { "$set" : { "_tmp_legs" : { "$reduce" : {
      "input" : "$_tmp_cost" ,
      "initialValue" : { "_city" : null , "_legs" : [] , "_cost" : 0 } ,
      "in" : { "$cond" : {
         "if" : { "$eq" : [ "$$this._group_city" , "$$value._city" ] } ,
         "then" : {
            "_city" : "$$this._group_city" ,
            "_legs" : "$$value._legs" ,
            "_first" : "$$value._first" ,
            "_last" : "$$this._group_date" ,
            "_cost" : { "$sum" : [ "$$value._cost" , "$$this._group_cost" ] }
         } ,
         "else" : {
            "_city" : "$$this._group_city" ,
            "_legs" : { "$concatArrays" : [
               "$$value._legs" ,
               [ { "_city" : "$$value._city" ,
                  "_cost" : "$$value._cost" ,
                  "_first" : "$$value._first" ,
                  "_last" : "$$value._last" ,
               } ]
            ] } ,
            "_first" : "$$this._group_date" ,
            "_last" : "$$this._group_date" ,
            "_cost" : "$$this._group_cost"
         }
      } }
   } } } } ;
}
function set_last_leg()
{
   return { "$set" : {
      "_legs" : { "$concatArrays" : [
         { "$slice" : [
             "$_tmp_legs._legs" ,
             1 ,
             { "$size" : "$_tmp_legs._legs" }
        ] } ,
         [ {
            "_city" : "$_tmp_legs._city" ,
            "_cost" : "$_tmp_legs._cost" ,
            "_first" : "$_tmp_legs._first" ,
            "_last" : "$_tmp_legs._last"
         } ]
      ] }
   } } ;
}

function cleanup()
{
   return { "$project" : {
      "_id" : 0 ,
      "_legs" : "$_legs"
   } } ;
}

function unwind()
{
   return { "$unwind" : "$_legs" }
}
function format()
{
   return { "$project" : {
      "_id" : 0 ,
      "City" : "$_legs._city" ,
      "Cost" : "$_legs._cost" ,
      "First" : "$_legs._first" ,
      "Last" : "$_legs._last" ,
   } } ;
}

function pipeline()
{
   return [
      match() ,
      sort() ,
      group() ,
      reduce() ,
      set_last_leg() ,
      cleanup() ,
      unwind() ,
      format()
   ] ;
}

And for those who have difficulties with modular code, this gives the big monolithic pipeline:

[
{ "$match" : {
      "User" : "Edvin"
   } } ,
{ "$sort" : {
      "Date" : 1
   } } ,
{ "$group" : {
      "_id" : null ,
      "_tmp_cost" : { "$push" : {
         "_group_city" : "$City" ,
         "_group_date" : "$Date" ,
         "_group_cost" : "$Cost"
      } }
   } } ,
{ "$set" : { "_tmp_legs" : { "$reduce" : {
      "input" : "$_tmp_cost" ,
      "initialValue" : { "_city" : null , "_legs" : [] , "_cost" : 0 } ,
      "in" : { "$cond" : {
         "if" : { "$eq" : [ "$$this._group_city" , "$$value._city" ] } ,
         "then" : {
            "_city" : "$$this._group_city" ,
            "_legs" : "$$value._legs" ,
            "_first" : "$$value._first" ,
            "_last" : "$$this._group_date" ,
            "_cost" : { "$sum" : [ "$$value._cost" , "$$this._group_cost" ] }
         } ,
         "else" : {
            "_city" : "$$this._group_city" ,
            "_legs" : { "$concatArrays" : [
               "$$value._legs" ,
               [ { "_city" : "$$value._city" ,
                  "_cost" : "$$value._cost" ,
                  "_first" : "$$value._first" ,
                  "_last" : "$$value._last" ,
               } ]
            ] } ,
            "_first" : "$$this._group_date" ,
            "_last" : "$$this._group_date" ,
            "_cost" : "$$this._group_cost"
         }
      } }
   } } } } ,
{ "$set" : {
      "_legs" : { "$concatArrays" : [
         { "$slice" : [ "$_tmp_legs._legs" , 1 , { "$size" : "$_tmp_legs._legs" } ] } ,
         [ {
            "_city" : "$_tmp_legs._city" ,
            "_cost" : "$_tmp_legs._cost" ,
            "_first" : "$_tmp_legs._first" ,
            "_last" : "$_tmp_legs._last"
         } ]
      ] }
   } } ,
{ "$project" : {
      "_id" : 0 ,
      "_legs" : "$_legs"
   } }  ,
{ "$unwind" : "$_legs" } ,
{ "$project" : {
      "_id" : 0 ,
      "City" : "$_legs._city" ,
      "Cost" : "$_legs._cost" ,
      "First" : "$_legs._first" ,
      "Last" : "$_legs._last" ,
   } }
]

1 Like

Dear @Hermann_Samimi, I have spent some time working on your issue. I would appreciate some feedback and closure. Thank.

1 Like

Dear @steevej , thanks for taking time on my question, really appreciate that.
I marked your answer as the solution.
In my real data set I have more fields, I’m trying to put more field regarding to your logic.

1 Like

Hi @alexbevi
thanks for taking time on my project.
i tried to run your solution, but the result was not satisfying.
here is the result:
[
{
“City”: null,
“Cost”: null
},
{
“City”: “Paris”,
“Cost”: 35
},
{
“City”: “Berlin”,
“Cost”: 37.2
},
{
“City”: “Milan”,
“Cost”: 23
}
]

here is the solution : Sum in each floor by category in aggregation - #8 by steevej

I really apreciate taking time on my project. :pray:

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.