Data modeling for OLAP-style over time trends (no IoT)

Manuel_Reil · August 14, 2020, 12:00pm

Hi guys,

We are fairly experienced with data modeling in MongoDB. However, we will have new access patterns in future that are more BI-like/OLAP-like for some of our collections.

Scenario:

we have collection where we store Risks (we are a risk management software). In this collection we store the most current approved version of a risk (hotter).
In another collection we store all the versions of all risks including the most current one (colder).
Currently we have typical OLTP queries like give me all risks given some conditions and sorting or show me some of the earlier versions of one particular risk - all good.
Additional access patterns will be more of a time series/OLAP-style like “how have the risks developed over time?”.
Example: How many risks have we had for each day for the last 3 months? Imagine a line chart with date as x axis and total number of risks as y axis.
Risks and their versions are only created by human beings and don’t follow any update patterns: some risks are modified daily or more often, others are untouched for years.

Naive approach:

We could copy all current versions of the risks into a new collection every day and by this create daily time slices of the full population. We don’t differentiate between touched and untouched risks.
We can aggregate quite well.
Would kind of work, but as a SaaS we are getting more and more customers…
I don’t like these kind of batch and potentially spike behavior.

Questions:

do you know a better, smarter, more efficient modeling idea around this challenge?
Maybe some smart additional properties on the versions collection somehow adding validity time spans?

Cheers,
Manuel

Pavel_Duchovny · August 15, 2020, 1:55pm

Hi @Manuel_Reil,

I believe you should review the following blogs:

https://www.mongodb.com/blog/post/time-series-data-and-mongodb-part-1-introduction

https://www.mongodb.com/blog/post/time-series-data-and-mongodb-part-2-schema-design-best-practices

I would also consider looking into some of the patterns like bucket, tree, document versioning :

Best
Pavel