Hello,
I recently built a timeseries db in pymongo. It has a timeField t and a metaField metadata.
metadata is of the form {a: value_a, b: value_b}. The value pairs in metadata characterize different time series types that are stored in my db.
What I would like to do is to find an efficient query in pymongo that lets me query a timerange for all time series simultaneously and returns the result as a pd.DataFrame with MultiIndex columns that have level 0: metadata.a, level 1: metadata.b and as index the timestamp t.
My query so far looks like this:
{'$match':{
't':{'$gte':pd.Timestamp('now')-10*pd.Timedelta('1h'),'$lte':pd.Timestamp('now')},
}},
{'$project':{
'_id':0,
}},
}
The query just returns all data in the desired range. The resulting dataframe will have a column ‘metadata’ which contains dicts {a: value_a, b: value_b}, which is not the form that I would like to have. I tried using $unwind, but it did not work, and I don’t really know what operators to use here. If it’s too complicated to return the df with a column multiindex, it could also simply have the metadata dict unpacked in the df columns, i.e. have keys a and b in the columns of the pd.df.
Thanks!
Best, JZ