My team has a collection of ~45 million documents, each containing a timestamp and a geojson field.
The main use case is querying across both the timestamp and the geojson.
The team had a few solutions:
Using a compound index for the two fields - performance wasn’t on par with product expectations - on some cases search time was over 10 seconds.
Using the bucket pattern (bucket per time range) for reducing search time, but because the query is across two fields we couldn’t properly use the geojson index.
Bucketing by collection, i.e. a collection for every time range. This enables the use of the geojson index while still using buckets. I feel like that’s not a standard solution and requires more implementation at the application layer. This solution however, has the best performance - sub 1 second search time
I would like to hear from you your opinions about the solutions and whether there are some modeling patterns that I am missing