A production-ready database schema for my IoT data

I have a schema for IoT data.
Here it is:

Devices collection: _id, name
Variables collection: _id, name, deviceId (ref)
Values collection: _id, value, timestamp, variableId (ref) (this is a time series collection)

Is this schema good enough to handle millions of data points, given an expected data rate of 1000 data points per minute?

Please suggest any possible improvements if required.

Hey @deep_jagani,

Welcome to the MongoDB Community Forums! :leaves:

The schema you described seems reasonable for modeling IoT data. However, I see that you are creating a time series collection alongside two additional regular collections. You can merge them into one, i.e., store the device and variable information as metadata in the time series collection. Here is what a document in the time series collection would then look like:

{
  time: new Date("2023-06-13T10:00:00Z"),
  value: 25.5,
  metafield: {
    deviceId: "device001",
    deviceName: "Sensor 1"
  },
  variable: "temperature"
}
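As a rough sketch of how such a collection could be set up, the snippet below builds the options object that mongosh's `db.createCollection` accepts for time series collections and folds one row from each of the three original collections into a single document. The field names `time` and `metafield` come from the sample above. The collection name `values`, the `granularity` setting, and the helper `toTimeSeriesDoc` are assumptions made for illustration only.

```javascript
// Options for a time series collection matching the sample document.
// "time" and "metafield" are the field names used in that sample;
// the collection name and granularity are assumptions.
const collectionName = "values"; // hypothetical name
const timeSeriesOptions = {
  timeseries: {
    timeField: "time",
    metaField: "metafield",
    granularity: "seconds", // assumption: tune to each sensor's reporting interval
  },
};

// In mongosh these would be passed to:
//   db.createCollection(collectionName, timeSeriesOptions);

// Hypothetical helper: fold one row from each of the three original
// collections (device, variable, value) into one time series document.
function toTimeSeriesDoc(device, variable, reading) {
  return {
    time: reading.timestamp,
    value: reading.value,
    metafield: { deviceId: device._id, deviceName: device.name },
    variable: variable.name,
  };
}

const doc = toTimeSeriesDoc(
  { _id: "device001", name: "Sensor 1" },
  { _id: "var001", name: "temperature", deviceId: "device001" },
  { value: 25.5, timestamp: new Date("2023-06-13T10:00:00Z"), variableId: "var001" }
);
console.log(JSON.stringify(doc));
```

The time and metadata field names are set when the collection is created, so it is worth settling on them before loading real data.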

Please note that your actual query performance will also depend on the queries you run. A general rule of thumb for schema design in MongoDB is to design your database so that the most common queries can be satisfied by querying a single collection, even if this means some redundancy in your data. Thus, it may be beneficial to start from the required queries, keep them as simple as possible, and let the schema design follow the query pattern.
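To make the query-first idea concrete, here is one hypothetical query, "hourly average of one variable for one device over a time range", written as an aggregation pipeline against the merged document shape. The function name `hourlyAveragePipeline`, the collection name `values`, and the query itself are assumptions, since your real access patterns may differ. The `$dateTrunc` operator needs MongoDB 5.0 or later.

```javascript
// Hypothetical query: hourly average of one variable for one device.
// Every stage reads only the single merged time series collection.
function hourlyAveragePipeline(deviceId, variable, from, to) {
  return [
    // Filter on the metadata field, the variable name, and the time range.
    {
      $match: {
        "metafield.deviceId": deviceId,
        variable: variable,
        time: { $gte: from, $lt: to },
      },
    },
    // Bucket readings by hour and average the values in each bucket.
    {
      $group: {
        _id: { $dateTrunc: { date: "$time", unit: "hour" } },
        avgValue: { $avg: "$value" },
      },
    },
    // Return the buckets in chronological order.
    { $sort: { _id: 1 } },
  ];
}

const pipeline = hourlyAveragePipeline(
  "device001",
  "temperature",
  new Date("2023-06-13T00:00:00Z"),
  new Date("2023-06-14T00:00:00Z")
);
console.log(JSON.stringify(pipeline, null, 2));

// In mongosh, assuming the collection is named "values":
//   db.values.aggregate(pipeline);
```

With the original three-collection layout, the same question would first require looking up the variable's _id before the values could be filtered.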

I would suggest using mgeneratejs to quickly generate any number of sample documents, so that the design can be tested easily. Additionally, you can create secondary indexes on time series collections based on your specific use case. Also, make sure to refer to the best practices for time series collections.
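If you just want something quick to start with, the sketch below generates one minute of test data at the stated rate of 1000 data points per minute, using plain JavaScript as a stand-in for mgeneratejs (it does not use the mgeneratejs template syntax). It also shows one possible secondary index. The generator name, the device count of 10, the value formula, and the index choice are all assumptions for testing purposes.

```javascript
// Plain-JavaScript stand-in for a sample data generator (not mgeneratejs).
// Produces `count` readings spread evenly over one minute from `start`.
function generateMinuteOfReadings(start, count, deviceCount) {
  const docs = [];
  for (let i = 0; i < count; i++) {
    const n = (i % deviceCount) + 1; // cycle through the devices
    docs.push({
      time: new Date(start.getTime() + Math.floor((i * 60000) / count)),
      value: 20 + (i % 50) / 10, // deterministic dummy reading
      metafield: {
        deviceId: "device" + String(n).padStart(3, "0"),
        deviceName: "Sensor " + n,
      },
      variable: "temperature",
    });
  }
  return docs;
}

const start = new Date("2023-06-13T10:00:00Z");
const sampleDocs = generateMinuteOfReadings(start, 1000, 10);

// One possible secondary index: metadata subfield first, then time.
const indexSpec = { "metafield.deviceId": 1, time: 1 };

console.log(sampleDocs.length, JSON.stringify(sampleDocs[0]));

// In mongosh, assuming the collection is named "values":
//   db.values.insertMany(sampleDocs);
//   db.values.createIndex(indexSpec);
```

At 1000 data points per minute, a full day of data is 1000 × 60 × 24 = 1,440,000 documents, so looping the generator over a few days of timestamps gives a realistic volume for testing queries and indexes.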

You can also read our data modeling documentation on IoT data for other tips on modeling your data and improving performance.

Hope this helps. Please feel free to reach out for anything else as well.

Regards,
Satyam
