Pymongo timeseries db metadata to multiindex columns

Hello,
I recently built a timeseries db in pymongo. It has a timeField t and a metaField metadata.
metadata is of the form {a: value_a, b: value_b}. The value pairs in metadata characterize different time series types that are stored in my db.
What I would like to do is to find an efficient query in pymongo that lets me query a timerange for all time series simultaneously and returns the result as a pd.DataFrame with MultiIndex columns that have level 0: metadata.a, level 1: metadata.b and as index the timestamp t.

My query so far looks like this:

{'$match':{
        't':{'$gte':pd.Timestamp('now')-10*pd.Timedelta('1h'),'$lte':pd.Timestamp('now')},
    }},
    {'$project':{
        '_id':0,
    }},
}

The query just returns all data in the desired range. The resulting dataframe will have a column ‘metadata’ which contains dicts {a: value_a, b: value_b}, which is not the form that I would like to have. I tried using $unwind, but it did not work, and I don’t really know what operators to use here. If it’s too complicated to return the df with a column multiindex, it could also simply have the metadata dict unpacked in the df columns, i.e. have keys a and b in the columns of the pd.df.

Thanks!

Best, JZ

Hi @jayzee ,

Can you please share a document sample for your timeseries data and share what you defined as timeField and what is metaField?

Additional question are you using PyMongoArrow new driver to turn the data set into Pandas?

Finally, please share the output you want to see in a raw Panda/Arrow table when querying the sample data.

The above will help me assist you better.

Ty
Pavel