Optimizing Data Ingestion for High-Frequency IoT Sensors in MongoDB Time Series Database

deep_jagani · November 24, 2023, 6:43am

Hello MongoDB community,

I am currently working on a project that involves handling a high data ingestion rate from over 300 IoT sensors, where data is generated every 10 seconds. The database structure comprises three collections: device, variables, and values, forming a time series database.

I’m seeking advice on the most efficient approach to design and write data to the MongoDB Time Series Database given this scenario. Specifically, I would like recommendations on:

Schema Design: How should I structure the collections (device, variables, values) to optimize for high-frequency data ingestion while maintaining query efficiency?
Indexing: What indexing strategies are recommended for optimizing write operations on a time series database with a high volume of incoming data?
Write Concerns: Considering the high data ingestion rate, what are the recommended write concern settings to balance data durability and ingestion performance?

Any insights, best practices, or experiences you can share regarding optimizing MongoDB for time series data with a large number of IoT sensors would be greatly appreciated.

Thank you in advance for your assistance!

Kushagra_Kesav · November 24, 2023, 10:39am

Hey @deep_jagani,

Thanks for reaching out to the MongoDB Community forums.

When designing a TimeSeries schema, your choices should be guided by how you intend to retrieve and work with your data, as well as how frequently data ingestion occurs.

If you prefer straightforward queries, a good option is a single time series collection at seconds granularity, with one document per timestamp that stores the IoT device name as metadata alongside its value. This approach keeps your queries simple and avoids unnecessary complexity and processing.
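
For illustration, the single-collection layout could look roughly like this in Python with PyMongo. This is only a minimal sketch: the collection name sensor_values, the field names timestamp, metadata, and value, and the choice to put the variable name into the metadata next to the device name are all assumptions, and the create_collection call needs a running MongoDB 5.0+ deployment.

```python
from datetime import datetime, timezone

# Hypothetical collection name, chosen for illustration only.
COLLECTION = "sensor_values"


def timeseries_options():
    """Options for a time series collection with seconds granularity."""
    return {
        "timeField": "timestamp",   # assumed name of the time field
        "metaField": "metadata",    # assumed name of the metadata field
        "granularity": "seconds",
    }


def make_measurement(device, variable, value, ts=None):
    """Build one document per reading; the device name goes into the metadata."""
    return {
        "timestamp": ts or datetime.now(timezone.utc),
        "metadata": {"device": device, "variable": variable},
        "value": value,
    }


def create_values_collection(db):
    """Create the collection; `db` is a pymongo Database on a live server."""
    return db.create_collection(COLLECTION, timeseries=timeseries_options())
```

With a layout like this, the separate variables and values collections from the question collapse into one collection, and a filter on metadata.device plus a time range is enough to read one sensor's history.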

MongoDB 6.3 and later automatically create a compound index on the time and metadata fields for new time series collections. If you still need to improve sort performance beyond that, you can create additional secondary indexes.
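
As a rough example of such a secondary index, the sketch below builds a compound key on a metadata subfield and the time field in descending order, which may help "latest reading per device" queries. The field names metadata.device and timestamp are assumptions, not something taken from your schema.

```python
# Same values as pymongo.ASCENDING and pymongo.DESCENDING.
ASCENDING, DESCENDING = 1, -1


def latest_first_index_keys(meta_subfield="metadata.device", time_field="timestamp"):
    """Key spec: group by device, newest measurement first."""
    return [(meta_subfield, ASCENDING), (time_field, DESCENDING)]


def create_latest_first_index(collection):
    """Create the index; `collection` is a pymongo Collection on a live server."""
    return collection.create_index(latest_first_index_keys())


def latest_reading(collection, device):
    """Fetch the newest document for one device (hypothetical helper)."""
    cursor = (
        collection.find({"metadata.device": device})
        .sort("timestamp", DESCENDING)
        .limit(1)
    )
    return next(iter(cursor), None)
```

Every extra index adds some write overhead, so it is worth adding one only for query patterns you actually run.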

Starting with MongoDB 5.0, the default write concern has been raised to majority (w: "majority"). This means a write is only acknowledged to the application once it has been committed and persisted to disk on a majority of replicas. To read more, please refer to https://www.mongodb.com/blog/post/default-majority-write-concern-providing-stronger-durability-guarantees-out-box
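
To make the trade-off concrete, here is a minimal sketch of batched inserts with an explicit write concern. It assumes one reading per sensor per interval (so 300 sensors every 10 seconds is about 30 documents per second) and a batch size of 1000, both of which are illustrative values. Lowering w to 1 acknowledges after the primary alone, which is faster but gives weaker durability than the majority default.

```python
def docs_per_second(sensors=300, interval_s=10):
    """Rough ingestion rate, assuming one reading per sensor per interval."""
    return sensors / interval_s


def chunked(docs, size=1000):
    """Split a list of documents into batches of at most `size`."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def insert_batches(collection, docs, w="majority", size=1000):
    """Insert in unordered batches; `collection` is a pymongo Collection."""
    # Imported here so the helpers above can be used without PyMongo installed.
    from pymongo.write_concern import WriteConcern

    target = collection.with_options(write_concern=WriteConcern(w=w))
    inserted = 0
    for batch in chunked(docs, size):
        # ordered=False lets the server continue past an individual failed document.
        result = target.insert_many(batch, ordered=False)
        inserted += len(result.inserted_ids)
    return inserted
```

At roughly 30 documents per second, the default majority write concern is unlikely to be the bottleneck, so batching with insert_many is probably the first thing to try before relaxing durability.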

Also, you can refer to the following resources to learn more about the Time Series data:

Best regards,
Kushagra

system · November 30, 2023, 4:56am

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.