Time Series Data in MongoDB

Hi all,
Suppose we insert 10 objects into a MongoDB time series collection, and some of those objects share the same metaField value. I expected the number of documents created to be less than the number of objects inserted, but this is not the case.
Example:

db.createCollection(
    "weather",
    {
       timeseries: {
          timeField: "timestamp",
          metaField: "metadata",
          granularity: "hours"
       }
    }
)
db.weather.insertMany( [
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T00:00:00.000Z"),
      "temp": 12
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T04:00:00.000Z"),
      "temp": 11
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T08:00:00.000Z"),
      "temp": 11
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T12:00:00.000Z"),
      "temp": 12
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T16:00:00.000Z"),
      "temp": 16
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-18T20:00:00.000Z"),
      "temp": 15
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T00:00:00.000Z"),
      "temp": 13
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T04:00:00.000Z"),
      "temp": 12
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T08:00:00.000Z"),
      "temp": 11
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T12:00:00.000Z"),
      "temp": 12
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T16:00:00.000Z"),
      "temp": 17
   },
   {
      "metadata": { "sensorId": 5578, "type": "temperature" },
      "timestamp": ISODate("2021-05-19T20:00:00.000Z"),
      "temp": 12
   }
] )
> db.weather.countDocuments({})
12

The number of documents in the collection is 12, the same as the number of objects inserted. Why were the objects not merged into a single document?

Hey :wave: @Yogesh_Sonawane1,

Welcome to the MongoDB Community forums :sparkles:

Thanks for asking the question.

In MongoDB, a time series collection uses a bucketing pattern to store data in an optimized format. When you create a time series collection, MongoDB creates three collections in the same database, two of which are internal. In your case these are weather, system.buckets.weather, and system.views.

For example, in your case, you have created a time-series collection for weather data. The weather collection acts as a view that allows you to interact with all the documents and perform operations.

However, the actual data is stored using a bucketing pattern in the db.system.buckets.weather collection. This is the merging you described: several objects are combined into a single document, known as a bucket.

In a time series collection, buckets are created based on the metaField value and the timeField value, which automatically organizes the time series data into an optimized format.
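
As a rough illustration of that grouping idea, here is a minimal sketch in plain JavaScript. It is not MongoDB's actual implementation: the function name groupIntoBuckets is hypothetical, and it assumes a fixed 30-day window aligned to the Unix epoch, whereas the server also takes bucket size, measurement count, and other internal limits into account. The temp values are made up for the example.

```javascript
// Simplified illustration only, not the server's real bucketing logic.
const WINDOW_MS = 30 * 24 * 60 * 60 * 1000; // assumed fixed 30-day window

function groupIntoBuckets(measurements, windowMs = WINDOW_MS) {
  const buckets = new Map();
  for (const m of measurements) {
    // Two measurements share a bucket only if both their metadata
    // and their time window match.
    const windowStart = Math.floor(m.timestamp.getTime() / windowMs) * windowMs;
    // JSON.stringify is sensitive to key order; good enough for a sketch.
    const key = JSON.stringify(m.metadata) + "|" + windowStart;
    if (!buckets.has(key)) {
      buckets.set(key, {
        meta: m.metadata,
        windowStart: new Date(windowStart),
        measurements: [],
      });
    }
    buckets.get(key).measurements.push(m);
  }
  return [...buckets.values()];
}

// Twelve readings, four hours apart, all from the same sensor.
const start = Date.UTC(2021, 4, 18); // 2021-05-18T00:00:00Z
const readings = Array.from({ length: 12 }, (_, i) => ({
  metadata: { sensorId: 5578, type: "temperature" },
  timestamp: new Date(start + i * 4 * 60 * 60 * 1000),
  temp: 10 + i, // made-up values
}));

const buckets = groupIntoBuckets(readings);
console.log(buckets.length, buckets[0].measurements.length); // 1 12
```

A reading with a different sensorId would end up in a second bucket, because its metadata no longer matches.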

To see this, let’s compare the output of the countDocuments() method for the weather collection and for the db.system.buckets.weather collection:

test > db.weather.countDocuments()
12
test > db.system.buckets.weather.countDocuments()
1

This means that the 12 documents are stored in a single bucket within an internal collection.

You can also view the internal collection and its bucket documents by running the following command in the MongoDB shell:

db.system.buckets.weather.find()

We strongly advise against making any alterations/modifications to the internal collection data, as this could result in data loss.

Here is a sample bucket document for your reference. It comes from a different collection, so its field names differ from your weather example:

{
  _id: ObjectId("6405b9a0b11b6de5c9ada789"),
  control: {
    version: 1,
    min: { // holds the min value of each field as determined by BSON comparison
      timestamp: ISODate("2023-03-06T10:00:00.000Z"),
      measure: -20,
      _id: ObjectId("642d1bb1ab4b695fbb4415ce")
    },
    max: { // holds the max value of each field as determined by BSON comparison
      timestamp: ISODate("2023-03-06T10:59:47.478Z"),
      measure: 78,
      _id: ObjectId("642d1bb1ab4b695fbb4415f9")
    }
  },
  meta: { unit: 'Celsius' },
  data: {
    timestamp: {
      '0': ISODate("2023-03-06T10:59:47.478Z"),
     ..
      '11': ISODate("2023-03-06T10:16:47.478Z")
    },
    measure: {
      '0': 49,
     ..
      '11': 56
    },
    _id: {
      '0': ObjectId("642d1bb1ab4b695fbb4415ce"),
      ..
      '11': ObjectId("642d1bb1ab4b695fbb4415f9")
    }
  }
}
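
To make the column layout above more concrete, here is a small sketch of how the individual measurements could be rebuilt from such a bucket. The helper unpackBucket is hypothetical and only handles the uncompressed layout shown in the sample; the view does this for you, and newer server versions may store buckets in a compressed form that looks different. The sample data is trimmed to two positions, and "metadata" is an assumed metaField name.

```javascript
// Hypothetical helper: rebuild measurement documents from one bucket
// that uses the uncompressed, column-per-field layout shown above.
function unpackBucket(bucket, timeField, metaField) {
  // Every measurement has a time value, so the time column lists all positions.
  const positions = Object.keys(bucket.data[timeField]);
  return positions.map((pos) => {
    const doc = { [metaField]: bucket.meta };
    for (const [field, column] of Object.entries(bucket.data)) {
      // A measurement may lack a field, so its position can be absent.
      if (pos in column) doc[field] = column[pos];
    }
    return doc;
  });
}

// Trimmed-down version of the sample bucket (two positions only).
const sampleBucket = {
  meta: { unit: "Celsius" },
  data: {
    timestamp: {
      0: new Date("2023-03-06T10:59:47.478Z"),
      1: new Date("2023-03-06T10:16:47.478Z"),
    },
    measure: { 0: 49, 1: 56 },
  },
};

const docs = unpackBucket(sampleBucket, "timestamp", "metadata");
console.log(docs.length); // 2
console.log(docs[0].measure, docs[1].measure); // 49 56
```

Each rebuilt document carries the shared meta value plus its own entry from every data column, which is why the view still reports one document per inserted object.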

I hope it clarifies your doubt.

Regards,
Kushagra

Thank you so much for detailed answer. This clarified my doubt. Thanks again.
