We are investigating whether we can use MongoDB as a cache where we fetch 100 keys per request.
We have 1,000 customers and up to 100,000 keys per customer, with values of about 5 KB of JSON each. We expect the total data size to be around 200 GB.
Each request fetches 100 keys together, all belonging to the same customer.
If we use a hashed _id as the shard key, each request will need the mongos router to gather and merge data from multiple shards. Is that OK?
Is the mongos router efficient when I use an $in clause with multiple _ids that live on different physical shards?
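For concreteness, here is a minimal mongosh sketch of the layout I am asking about (database, collection, and key names are made up; this needs a running sharded cluster, so it is illustrative only):

```
// Shard on a hashed _id: documents spread evenly across shards.
sh.shardCollection("cachedb.cache", { _id: "hashed" })

// A 100-key fetch. With a hashed _id shard key, mongos cannot target
// a single shard for this $in, so it scatter-gathers across shards
// and merges the results.
db.cache.find({ _id: { $in: ["cust42:key1", "cust42:key2" /* ...98 more */] } })
```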
Is there a sharding pattern that makes this access more efficient?
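One pattern I have seen suggested (a sketch, assuming documents carry explicit customerId and key fields rather than a combined _id) is a compound ranged shard key, so one customer's keys are colocated and the query can be targeted at a single shard:

```
// Compound ranged shard key: colocates a customer's keys, while still
// allowing chunks to split on `key` within a single customer.
sh.shardCollection("cachedb.cache", { customerId: 1, key: 1 })

// Including customerId in the filter lets mongos route the whole
// 100-key fetch to one shard instead of broadcasting it.
db.cache.find({ customerId: "cust42", key: { $in: ["key1", "key2" /* ... */] } })
```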
If it were Redis, I could use customerId as a cluster hash tag so I could use MGET to fetch multiple keys together.
If it were Cassandra, I could make (customer_id, key) the primary key, with the key portion as the clustering (sort) key, to ensure queries go to the same node for efficient retrieval.
I am new to MongoDB. Say I shard by customerId here: I am wondering whether that is optimal, since it can lead to large (jumbo) chunks, which I read somewhere in the docs is bad.
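On the jumbo-chunk worry, some back-of-envelope arithmetic (assuming a 128 MB chunk size; the exact default depends on the MongoDB version) suggests customerId alone is indeed risky, while a compound key is not:

```javascript
// Rough sizing check. Assumptions: 5 KB docs, up to 100,000 keys per
// customer, 128 MB chunk size.
const docSizeKB = 5;
const maxKeysPerCustomer = 100_000;
const chunkSizeMB = 128;

// Shard key {customerId: 1}: all docs for one customer share a single
// shard-key value, so they can never be split across chunks.
const maxCustomerMB = (maxKeysPerCustomer * docSizeKB) / 1024; // ~488 MB

// A chunk holding one whole large customer exceeds the chunk size and
// cannot split: a jumbo chunk the balancer cannot move.
const jumboRisk = maxCustomerMB > chunkSizeMB;

// Shard key {customerId: 1, key: 1}: chunks can split on `key` within
// a customer, so the same data fits in several normal chunks.
const chunksNeeded = Math.ceil(maxCustomerMB / chunkSizeMB);

console.log(maxCustomerMB.toFixed(0), jumboRisk, chunksNeeded);
```

So (under these assumptions) the "large chunks" warning applies to sharding on customerId alone, not to a compound key that includes the per-customer key.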