Hi
Consider a time series collection with two sources, defined as:
timeseries = { "timeField": "ts", "metaField": "source" }
Assume that each document, in addition to “ts” and “source”, has a “temperature” field. I am interested in historical values of “temperature”. To be exact, for a given timestamp t0, I want to get, for each source, the “temperature” of the latest document whose “ts” is less than or equal to t0. Note that the “ts” values of different sources are not necessarily synced.
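To make the requirement concrete, here is a small plain JavaScript sketch of the intended semantics, independent of MongoDB. The sample documents, the source names "A" and "B", and the helper name lastAtOrBefore are made up for illustration only.

```javascript
// Hypothetical sample measurements; the two sources report at different times.
const docs = [
  { source: "A", ts: new Date("2024-01-01T00:00:00Z"), temperature: 20.0 },
  { source: "A", ts: new Date("2024-01-01T00:10:00Z"), temperature: 21.5 },
  { source: "A", ts: new Date("2024-01-01T00:20:00Z"), temperature: 23.0 },
  { source: "B", ts: new Date("2024-01-01T00:03:00Z"), temperature: 15.0 },
  { source: "B", ts: new Date("2024-01-01T00:14:00Z"), temperature: 16.2 },
];

// Reference semantics: for each source, keep the measurement with the
// greatest ts that is still <= t0, and report its temperature.
function lastAtOrBefore(docs, t0) {
  const latest = new Map();
  for (const doc of docs) {
    if (doc.ts > t0) continue; // measured after t0, so ignore it
    const best = latest.get(doc.source);
    if (best === undefined || doc.ts > best.ts) latest.set(doc.source, doc);
  }
  const result = {};
  for (const [source, doc] of latest) result[source] = doc.temperature;
  return result;
}

const t0 = new Date("2024-01-01T00:12:00Z");
console.log(lastAtOrBefore(docs, t0)); // { A: 21.5, B: 15 }
```

With t0 at 00:12, source A resolves to its 00:10 reading and source B to its 00:03 reading, even though the two sources never share a timestamp.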
To me, that seems like a very basic requirement of a time series database, and it is probably already optimized in MongoDB’s time series framework, but I am having difficulty finding an efficient solution for it. This is what I have implemented myself, using the aggregation framework:
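db.col.aggregate([
  // Keep only measurements taken at or before t0.
  { "$match": { "ts": { "$lte": t0 } } },
  // Order by time; without an explicit sort, $last has no defined order to rely on.
  { "$sort": { "ts": 1 } },
  // Per source, take the temperature of the latest remaining document.
  { "$group": { "_id": "$source", "hist_temperature": { "$last": "$temperature" } } }
])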
Essentially, it matches the records at or before t0 and then picks the last item for each source from the grouped items. But this is obviously not efficient, since the time series, behind the scenes, is partitioned by source and sorted by time within each source. The match-by-time stage at the beginning does not follow that indexing order. Does anyone have an efficient alternative solution for this? I would appreciate your help/comments.