We have a few large collections and are devising strategies to partition them into smaller collections for better CRUD performance, especially because we generally access the data for a particular logical group (customer, etc.) together.
We are considering chunking the data by a logical partitioning key to get the performance benefit without client code needing to be aware of the underlying chunking. Would chunking give the desired benefits even with a single server (one shard to start with, adding more as the data grows)? The benefits we are hoping for are physically partitioned data, better locality of access, relatively uniform chunk sizes, and smaller indexes per chunk.
Hi @Parikshit_Samant, welcome to the community!
I believe your database could perform worse on a sharded cluster with only one shard than on a standard replica set. Sharding adds load because the cluster has to track the shard key ranges (chunks) of your data and split them when required. A sharded cluster also needs additional components to work, namely config servers and mongos routers, which increases the cost of your deployment. With only one shard, you would get no performance gain while the cost would go up.
As for partitioning your data, you can get a similar effect with your indexes: a partial index indexes only the documents that match a filter expression.
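To make the partial index idea concrete, here is a minimal sketch using PyMongo's `create_index` with the `partialFilterExpression` option. The field names (`customerId`, `createdAt`, `status`), the index name, and the "active" filter are assumptions for illustration only, so substitute whatever fits your schema.

```python
from typing import Any

# Hypothetical field names and filter; replace with ones from your own schema.
INDEX_KEYS = [("customerId", 1), ("createdAt", -1)]
PARTIAL_FILTER = {"status": "active"}


def create_active_docs_index(collection: Any) -> str:
    """Create a partial index that covers only documents matching PARTIAL_FILTER.

    `collection` is expected to be a pymongo Collection (or any object
    exposing a compatible create_index method). Returns the index name.
    """
    return collection.create_index(
        INDEX_KEYS,
        name="customer_active_createdAt",  # hypothetical index name
        partialFilterExpression=PARTIAL_FILTER,
    )


def active_docs_query(customer_id: str) -> dict:
    """Build a query the partial index can serve.

    The query has to include the filter condition (status == "active");
    otherwise the planner cannot use the partial index.
    """
    return {"customerId": customer_id, **PARTIAL_FILTER}
```

With a real deployment you would pass in something like `MongoClient(uri)["mydb"]["orders"]` (database and collection names are placeholders) and then run `collection.find(active_docs_query("c1"))`. The index stays small because documents that do not match the filter are never added to it.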