Hello,
thanks for the reply. Looks like I am dealing with an attribute pattern.
So, I will probably arrange my documents as follows:
{'t': time, 'metadata': {…}, 'data': [{'a': 1}, {'b': 2}]}
{'t': time, 'metadata': {…}, 'data': [{'b': 2}, {'c': 3}]}
Then there would be no NaN entries stored for missing values (c or a in this case), right?
In the end, the way I would like to query my distributions in the time series collection is the following:
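To make the layout concrete, here is a minimal sketch of how I picture reading such documents back into flat rows. The sample timestamps are made up, the metadata is left empty as a placeholder, and `flatten` is just a hypothetical helper name, not anything from a library.

```python
from datetime import datetime

# Hypothetical sample documents in the layout above.
# The metadata contents are left out here (placeholder empty dicts).
docs = [
    {"t": datetime(2024, 1, 1, 0, 0), "metadata": {}, "data": [{"a": 1}, {"b": 2}]},
    {"t": datetime(2024, 1, 1, 0, 1), "metadata": {}, "data": [{"b": 2}, {"c": 3}]},
]

def flatten(doc):
    """Merge the single-key entries of 'data' into one flat row, keeping the timestamp."""
    row = {"t": doc["t"]}
    for entry in doc["data"]:
        row.update(entry)
    return row

rows = [flatten(d) for d in docs]
```

Each row only carries the keys that were actually stored, so nothing is saved for a missing c or a. As far as I understand, if rows with different key sets are later put into one pd.DataFrame, pandas would still fill the absent columns with NaN, so the NaN-free result only holds for rows that share the same keys.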
query: data with t from t1 to t2, metadata in {metadata1, metadata2, …}
metadata will contain two fields, 'number' and 'type'.
number will select the corresponding time series, while type will select the type of documents (with specific keys) that are stored in data. For a given value of type, the documents in data will have identical keys, so they can then be easily collected into a pd.DataFrame.
Is that about right?
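For reference, this is roughly the filter I have in mind, as a sketch only. `build_filter` is a hypothetical helper, the sample values are made up, and I am assuming that both ends of the time range are inclusive and that each metadata entry is matched on its 'number' and 'type' fields through dotted paths combined with `$or`.

```python
from datetime import datetime

def build_filter(t1, t2, metadata_list):
    """Build a MongoDB filter: t within [t1, t2] and metadata equal to any listed pair."""
    if not metadata_list:
        # MongoDB rejects an empty $or array, so fail early here.
        raise ValueError("metadata_list must not be empty")
    return {
        # Assumption: the range is inclusive on both ends.
        "t": {"$gte": t1, "$lte": t2},
        # Match on the individual fields rather than on the whole embedded document.
        "$or": [
            {"metadata.number": m["number"], "metadata.type": m["type"]}
            for m in metadata_list
        ],
    }

# Made-up example values for illustration.
query = build_filter(
    datetime(2024, 1, 1),
    datetime(2024, 1, 2),
    [{"number": 1, "type": "x"}, {"number": 2, "type": "x"}],
)
# With pymongo, this dict would be passed to collection.find(query).
```

I went with dotted field paths instead of comparing the whole metadata document, since an exact match on an embedded document also depends on field order.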