I am trying to model data that is both geo-spatial and time-variant. The time-variant part lends itself to a time-series collection. The geo-spatial part is less clear, because the data is actually a grid of sample values (at least 1000x1000) that maps onto a region. The queries on the collection essentially aggregate the samples both over a polygon (intersect) and over time. I am unsure of the best way to model the grid of samples. The sample values per cell could correlate heavily over time (e.g. weather data). I see two options:
- Store each individual grid cell value as a document in the time-series collection. With a grid of at least 1000x1000, that is a million or more documents per time step, and querying a subset of them with a geo-spatial intersect of a polygon across time could be costly. Each ‘cell’ could be stored as a four-sided polygon or as a single point (its centre). I would guess a single point is more efficient when performing an intersect.
- Store the whole grid at each time step as a single document. This heavily reduces the number of documents, but means you lose geo-spatial querying. The geo-spatial ‘filter’ to apply to the grid would have to be worked out outside of MongoDB, which means pulling the entire grid of values to the client before filtering.
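
To make the two options concrete, here is a rough sketch in Python of what I mean. Everything in it is an assumption for illustration: the field names (`ts`, `meta`, `loc`, `value`), the grid origin and cell size, and the helper names are hypothetical, and the point-in-polygon test is a simple planar ray cast rather than the spherical geometry MongoDB would use. It only builds plain dicts and lists, so it does not need a database connection.

```python
from datetime import datetime, timezone

# Hypothetical grid geometry: (LON0, LAT0) is the centre of cell (0, 0),
# and STEP is the cell size in degrees.
LON0, LAT0, STEP = 10.0, 50.0, 0.5


def cell_centre(row, col):
    """Return the [lon, lat] centre of a grid cell."""
    return [LON0 + col * STEP, LAT0 + row * STEP]


def cell_document(ts, row, col, value):
    """Option 1: one document per cell, located by its centre point."""
    return {
        "ts": ts,
        "meta": {"row": row, "col": col},
        "loc": {"type": "Point", "coordinates": cell_centre(row, col)},
        "value": value,
    }


def polygon_time_pipeline(ring, start, end):
    """Option 1: aggregation pipeline averaging cells inside a polygon and time range.

    `ring` is a closed list of [lon, lat] pairs (first point repeated at the end).
    """
    polygon = {"type": "Polygon", "coordinates": [ring]}
    return [
        {"$match": {
            "ts": {"$gte": start, "$lt": end},
            "loc": {"$geoWithin": {"$geometry": polygon}},
        }},
        {"$group": {"_id": None, "mean": {"$avg": "$value"}}},
    ]


def point_in_polygon(x, y, ring):
    """Planar ray-casting test (an approximation, not spherical geometry)."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        # Count edges that a ray going right from (x, y) crosses.
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / (yj - yi) + xi:
            inside = not inside
        j = i
    return inside


def polygon_mask(ring, rows, cols):
    """Option 2: boolean mask of cells whose centre lies inside the polygon."""
    return [
        [point_in_polygon(*cell_centre(r, c), ring) for c in range(cols)]
        for r in range(rows)
    ]


def masked_mean(values, mask):
    """Option 2: average the grid values selected by the mask, client-side."""
    selected = [
        v
        for value_row, mask_row in zip(values, mask)
        for v, keep in zip(value_row, mask_row)
        if keep
    ]
    return sum(selected) / len(selected) if selected else None
```

With pymongo, the pipeline from option 1 would be passed to `collection.aggregate(...)`. For option 2 the mask only depends on the polygon and the grid geometry, so it could be computed once and reused for every time step, but each grid document still has to be fetched in full.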
Any thoughts/guidance is appreciated.