Intersections on all interval sets

Here’s the problem I would like to solve with aggregate:

Given N interval sets of the form of [[l_1, r_1], [l_2, r_2], … [l_n, r_n]], the task is to find the intersections on all interval sets. An intersection is an interval set that lies within all of the given interval sets. If no such intersections exists then return [].

Example 1

N_1 = [[1, 4] [6, 7], [10, 16]]
N_2 = [[0, 20]]
N_3 = [[6, 10], [14, 20]]

Output:
[6, 7], [14, 16]

Reason:
Intersections on all N interval sets are [6, 7] and [14, 16] because only these two satisfies N_1, N_2 and N_3.

Example 2

N_1 = [[1, 2] [10, 18]]
N_2 = [[3, 5]]

Output:
[]

Reason:
No intersections between N_1 and N_2

Example 3

collection = [
  { set: "N_1", start: 1, end: 4 },
  { set: "N_1", start: 6, end: 10 },
  { set: "N_2", start: 0, end: 10 }
]

await db.collection.aggregate(CleverAggregate())

Output:
[[1, 4], [6,10]]

Big picutre

Each set interval represents event with type, startDate and endDate. I need to know exact date ranges when event.type = 'failure' happened, only if event.type = 'highPriority' and event.type = 'production' occurred. I need to perform such analysis on around 30M entries but I expect that the resulting array will have less than 100 elements.

Thank you for any guides or help.

I solved this with the approximated solution - it’s enough for my case:

a) Interpolate (using $range) between startDate and endDate.
b) Round interpolated values (timestamp) to some arbitrary value, so the frequency of all sets will match.
c) Group by timestamp.
d) Remove invalid groups.
e) Recreate the final intervals set by implementing something similar to 1-D kNN.

2 Likes