How to get one record for every hour from startdate to enddate in the mongodb shell or in c#?

How to get one record for every hour from startdate to enddate in the mongodb shell or in c# Linq?

Hi @StackQuery_Inc, welcome to the community.
Can you provide more details on what basis you want to select that particular record among all the records related to the concerened hour?
Anyways, I will try my best to answer it in a way that is independent of the basis to select the specific record.
You will need an aggregation pipeline to achieve this, please follow the aggregation-stages mentioned below in the same order:

  1. use $match to extract all the records that exists between the startdate and enddate
  2. use $addFields to add a field that can be denoted as the hour for that particular document. Please note that you also need to attach the date to uniquely identify that hour from any other date. For e.g. 04:00 on 5th Feb is different from 04:00 on 6th Feb.
    Hence, a document having a createdAt field as:
    { createdAt: "2021-02-05T04:15:39.736Z"},
    you can create a new field as below:
    { uniqueHour: "05-02-2021-04" }
    by using $dateToString and all the documents that were created between "2021-02-05T04:00:00.000Z" and "2021-02-05T04:59:59.999Z" will have the same "uniqueHour" value.
  3. use $group to group all the document based on their "uniqueHour" value, you can choose any accumulator mentioned in this accumulator operator list like $first/$last to select the value from the document of your choice
  4. perform transformations if any

In case you have any doubts, please feel free to reach out to us.

Thanks and Regards.
Sourabh Bagrecha,
Curriculum Services Engineer

Hi Sourabh,

Thanks for the help.

I have around 600 documents for every hour.

I would like to select only one record for every hour either first or last in between startdate and endate.

Can you help with the mongo query?

Thanks,
stack dev

I got below code but it is taking forever to load, can you tell me where I am wrong @SourabhBagrecha

    db.collection.aggregate([
 	{
		$match: {
			$and: [{
				createdAt: {
					$gte: ISODate("2021-01-05T12:29:30.000+00:00")
				}
			}, {
				createdAt: {
					$lte: ISODate("2021-02-05T12:29:30.000+00:00")
				}
			}]
		}
	},
   {
		$addFields: {
			uniqueHour: {
				$dateToString: {
					format: "%Y-%m-%d-%H",
					date: "$createdAt"
				}
			}
		}
	},
	{
		$group: {
			_id: "$uniqueHour",
		     first: { $first : "$$ROOT" }
		}
	}
])

Hi @StackQuery_Inc, the query looks good to me, do you have an index for the createdAt field?
If not, please create an index for the createdAt field, and try again. Also, use .explain() to get an idea of how it’s performing.

Also, you can further optimize this pipeline, by removing the “$addFields” stage:

[
  {
    $match: {
      $and: [
        {
          createdAt: {
            $gte: ISODate("2021-01-05T12:29:30.000+00:00"),
          },
        },
        {
          createdAt: {
            $lte: ISODate("2021-02-05T12:29:30.000+00:00"),
          },
        },
      ],
    },
  },
  {
    $group: {
      _id: {
        $dateToString: {
          date: "$createdAt",
          format: "%Y-%m-%d-%H",
        },
      },
      first: { $first: "$$ROOT" },
    },
  },
];

Also, please make sure that you are using proper $limits as well, which can again drastically improve the performance by reducing the amount of data that needs to be transferred from Atlas(or any other MongoDB instance) to your app.

I hope it helps.

In case you have any doubts, please feel free to reach out to us.

Thanks and Regards.
Sourabh Bagrecha

1 Like