Hi! I’m new to mongoDB and would love advice on appropriate schema design for our use case. Please excuse me if this post is inappropriate or off-topic.
I am an electrophysiologist, and our lab collects high frequency, voltage time-series data sampled from biological systems. Essentially, we run ‘experimental sessions’, where the data collected from an experimental session is voltage measurements sampled at either 250, 500, or 1000 Hz, along with other quantitative. Each experimental session can last up to a few hours (collected maybe a few gigabytes of data), and has descriptions the experimental setup (e.g. the experimenter, device used for collection, biological system measured from, etc…).
I am considering creating a time-series collection, where each document contains fields that describe the experimental setup, and also contains a data field. The data field will contain subdocuments, where each subdocument is a datapoint from that experimental session.
For Example:
{
‘session#’: 1
metadata: {
‘experimenter’: ‘Tony’
‘sample_rate’: ‘500’
‘system#’: 12
}
data: {
{
timeStamp: UnixTimeStamp1,
‘voltage’: 1e-3,
‘PowerBand1’: 1e-5},
{
timeStamp: UnixTimeStamp2,
‘voltage’: 1.5e-3,
‘PowerBand1’: 2.3e-5}
}
}
I’ll often query on ‘session#’, metadata like ‘system#’, or obtaining voltage when ‘PowerBand1’==SomeValue. Is this a good use for the time series collection?
Alternatively, I could create a normal collection, where each document is a session, and input the data as a parquet file into a ‘data’ field for each document. Is there a best way to proceed? Any advice is welcome, thank you!