Strategy for storing billions of documents that need to be queryable

I need to store up to billions of rather small documents containing information like this:

{ "deviceId": 1234, "timestamp": "2020-04-03T15:00Z", "value": 13.534233}

There may be additional fields, but basically that's about it.

Documents need to be queryable by timestamp, deviceId etc.

Typically there are 10^3 to 10^6 documents for a specific deviceId.

Problem is: over time we will end up with billions of documents. At some point it may become possible to drop old data, but MongoDB must be able to handle billions of documents and serve queries in reasonable time. At this time MongoDB needs to run locally. Scaling, vertically or horizontally, could be an option.

Are there any suitable strategies for that? Can I somehow split collections?

Currently we use Elasticsearch with up to 1000 indices containing all the documents.

Hi @Kay_Zander,
In my opinion, the Bucket Pattern seems to fit your case very well. Grouping many readings for the same deviceId into a single bucket document dramatically reduces the total number of documents and index entries, which is exactly the pressure point when you are heading toward billions of tiny measurements.
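To make the idea concrete, here is a minimal sketch in plain Python of how per-measurement documents like yours could be grouped into bucket documents before insertion. The field names `startTime`, `endTime`, `count`, and `measurements`, and the bucket size of 200, are illustrative choices, not a prescribed schema:

```python
from collections import defaultdict

def bucket_measurements(measurements, bucket_size=200):
    """Group per-device measurements into bucket documents.

    Instead of one MongoDB document per reading, each bucket holds up
    to `bucket_size` readings for a single deviceId, cutting the total
    document and index-entry count by roughly that factor.
    """
    by_device = defaultdict(list)
    for m in sorted(measurements, key=lambda m: (m["deviceId"], m["timestamp"])):
        by_device[m["deviceId"]].append(m)

    buckets = []
    for device_id, readings in by_device.items():
        # Split each device's time-ordered readings into fixed-size chunks.
        for i in range(0, len(readings), bucket_size):
            chunk = readings[i:i + bucket_size]
            buckets.append({
                "deviceId": device_id,
                "startTime": chunk[0]["timestamp"],   # illustrative field names
                "endTime": chunk[-1]["timestamp"],
                "count": len(chunk),
                "measurements": [
                    {"timestamp": r["timestamp"], "value": r["value"]}
                    for r in chunk
                ],
            })
    return buckets
```

In a live system you would typically not pre-batch like this but append to the current bucket with an upsert (`$push` into `measurements`, `$inc` on `count`, rolling to a new bucket once `count` reaches the limit), and keep a compound index such as `{ deviceId: 1, startTime: 1 }` so range queries by device and time stay fast. This is only a sketch of the pattern, not a drop-in implementation.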
