Counting the total per type per day

Hey, there I’m trying to figure out how to get the total number of users for each user type per day. What I want is to filter all records between between two dates, then for each day I then sum the number of users for every role so the outpost looks like below

[
    {
        "date": "2023-04-05T01:43:51.661Z",
        "role_1": 7,
        "role_2":23,
        "role_3":15
    },
      {
        "date": "2023-04-06T01:43:51.661Z",
        "role_1": 2,
        "role_2":10,
        "role_3":18
    },
etc...
]

My current code is as follows

// Construct the aggregation pipeline
        let pipeline: any = [];

        if (params.startDate && params.endDate) {
            pipeline.push(
                {
                    $match: {
                        createdAt: {
                            $gte: new Date(params.startDate),
                            $lte: new Date(params.endDate)
                        }
                    }
                }
            );
        } else {
            // Condition to match either start date or end date for custom ranges where either one or both are provided
            pipeline.push(
                {
                    $match: {
                        $or: [
                            {
                                "createdAt": {
                                    $gte: new Date(params.startDate)
                                }
                            },
                            {
                                "createdAt": {
                                    $lte: new Date(params.endDate)
                                }
                            }
                        ]
                    }
                }
            );
        }

        // Include only necessary fields in the result
        pipeline.push({
            $project: {
                _id: 0, // Exclude the _id field
                createdAt: 1,
                role: 1,
            }
        });

        pipeline.push({
            $group: {
                _id: {
                    role: "$role"
                },
                count: { $sum: 1 }
            }
        });

Which gives an overall total like the following:

[
  { _id: { role: 4 }, count: 16 },
  { _id: { role: 2 }, count: 6 },
  { _id: { role: 0 }, count: 1 },
  { _id: { role: 3 }, count: 6 },
  { _id: { role: 1 }, count: 2 }
]

I’m not sure how to proceed, in order to break it down per day/date to look like above

Hello @Awakening_Quasar,

Well, It is possible to achieve your required result in an aggregation query, but I would not recommend doing this kind of grouping operation in an aggregation query when your collection has a large number of documents. Try to manipulate your result on your client/server-side application so you can avoid unnecessary load in the database server.

If you still want to do this operation in an aggregation query or just for the knowledge follow the below aggregation query.

  • first $group to group by date and role, convert the date to your required format and get the total count
  • The second $group is to group by just date and get array roles in the array in your desired format, we have prepared a key-value format so we can convert it to an object.
  • $replaceRoot to prepare the final your desired object
  {
    $group: {
      _id: {
        date: {
          $dateToString: {
            date: "$createdAt",
            format: "%Y-%m-%d",
            // timezone: "Put your timezone if you want to convert"
          }
        },
        role: "$role"
      },
      count: { $sum: 1 }
    }
  },
  {
    $group: {
      _id: "$_id.date",
      roles: {
        $push: {
          k: {
            $concat: [
              "role_",
              { $toString: "$_id.role" }
            ]
          },
          v: "$count"
        }
      }
    }
  },
  {
    $replaceRoot: {
      newRoot: {
        $mergeObjects: [
          { date: "$_id" },
          { $arrayToObject: "$roles" }
        ]
      }
    }
  }

You can use the $addFields stage instead of the $replaceRoot stage,

  {
    $addFields: {
      roles: {
        $arrayToObject: "$roles"
      }
    }
  }

There are multiple ways to achieve this result, but make sure to create an index on the createdAt and role properties to avoid the collection scan if you have a huge number of data.

Hey there, thanks for your assistance.

Can you elaborate on why you’d want to do additional manipulation? Is grouping heavily taxing on the mongo driver? I was also contemplating breaking it down into 4 requests that would simply show one user type per view but it wasnt ideal for the graph I was attempting to build.

Besides that I can work with this. I’ll just loop through and create zero values for data consistency.

Hello @Awakening_Quasar,

There are multiple factors you need to understand,

  • As the number of documents increases, so does the processing time required for grouping operations, because it is a blocking stage, that holds each document and does the operations.
  • Efficient use of indexes can improve the performance of grouping operations when grouping by indexed fields.
  • You can create a compound index on (createdAt, role) properties, at some point it is helpful in the group stage but grouping by a single indexed field typically performs faster than grouping by multiple fields or complex expressions (in your case you have to format the date).
  • The hardware resources available to the MongoDB server, such as CPU, memory, and disk I/O, can impact the performance of aggregation operations, including the group stage.

Performing grouping operations on the client-side may be feasible.

Create the required index, can check the performance of the query by explain command, is it using the proper index or not.

For your query, the first 2 stages $match and `$group can take advantage of the index, but I doubt it because there is one field that you need to format the date.

Blocking v/s Streaming Stage
In a streaming stage, MongoDB processes documents as they are received, without needing to wait for the entire input set to be collected. Include stages: $match, $project, $limit, etc.

A blocking stage is one where the entire input set of documents is processed before the stage produces any output. MongoDB collects and processes all the input documents before moving on to the next stage in the aggregation pipeline. Include stages: $group, $sort, $lookup, etc.

For more explanation, you can refer to this book’s topic,

I see, I really appreciate the detailed response since I struggle with aggregation pipelines. I can potentially skip the entire date formatting part and have the client side do that instead.

Thanks again for helping.

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.