Data modeling for OLAP-style over time trends (no IoT)

Hi guys,

We are fairly experienced with data modeling in MongoDB. However, we will have new access patterns in future that are more BI-like/OLAP-like for some of our collections.

Scenario:

  • we have collection where we store Risks (we are a risk management software). In this collection we store the most current approved version of a risk (hotter).
  • In another collection we store all the versions of all risks including the most current one (colder).
  • Currently we have typical OLTP queries like give me all risks given some conditions and sorting or show me some of the earlier versions of one particular risk - all good.
  • Additional access patterns will be more of a time series/OLAP-style like “how have the risks developed over time?”.
  • Example: How many risks have we had for each day for the last 3 months? Imagine a line chart with date as x axis and total number of risks as y axis.
  • Risks and their versions are only created by human beings and don’t follow any update patterns: some risks are modified daily or more often, others are untouched for years.

Naive approach:

  • We could copy all current versions of the risks into a new collection every day and by this create daily time slices of the full population. We don’t differentiate between touched and untouched risks.
  • We can aggregate quite well.
  • Would kind of work, but as a SaaS we are getting more and more customers…
  • I don’t like these kind of batch and potentially spike behavior.

Questions:

  • do you know a better, smarter, more efficient modeling idea around this challenge?
  • Maybe some smart additional properties on the versions collection somehow adding validity time spans?

Cheers,
Manuel

1 Like

Hi @Manuel_Reil,

I believe you should review the following blogs:

I would also consider looking into some of the patterns like bucket, tree, document versioning :

Best
Pavel

1 Like