Some Questions regarding sharding

I have a few questions regarding sharding in MongoDB:

  1. My understanding is that sharding implies the data is spread across multiple machines, while partitioning does not necessarily imply that. Is this understanding correct in MongoDB? Also, is replication simply an extra copy of the data on a different node?

  2. If sharding does imply that a DB is stored across multiple machines: right now I have a 3-node cluster (1 master + 2 slaves). Should I reconfigure (or reinstall) it to form a cluster in which all nodes can read and write, not just the master?

  3. Is a 3-node cluster enough for sharding with reasonable performance, or do I need more nodes in the cluster?

  4. Typically, how large does a DB need to be to make reasonable use of sharding? What I mean is that if a DB is too small, the disadvantages of sharding (e.g., the query overhead or the extra replication it requires) might outweigh the advantages, so sharding would not be a good choice. Is there any magic number? For example, how many rows per shard key value would make sharding reasonable (below which it would be considered too small)? Or how large a DB (how many GB or TB) could make full use of sharding?

I appreciate your opinion,
Best !

Yes for the spread across multiple machines, but I do not understand the "while partitioning does not necessarily" part of your question.

Yes, replication means the same data is stored on different nodes, so that if one crashes you do not lose data.
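To make the difference between the two concrete, here is a toy sketch in plain Python. It is only an illustration: the function names are made up, and the md5-modulo placement is a stand-in for a hashed shard key, not how MongoDB actually routes documents (it assigns chunks of the shard key range to shards).

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Pick a shard by hashing the key (toy stand-in, not MongoDB's routing)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def place(keys, num_shards, members_per_shard):
    """Return {shard: [keys_on_member_0, keys_on_member_1, ...]}.

    Sharding: each key lands on exactly one shard.
    Replication: every member of that shard holds a full copy of the shard's keys.
    """
    shards = {s: [] for s in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return {s: [list(ks) for _ in range(members_per_shard)] for s, ks in shards.items()}


layout = place([f"user{i}" for i in range(10)], num_shards=2, members_per_shard=3)
for shard, members in layout.items():
    print(f"shard {shard}: {len(members)} copies of {members[0]}")
```

Each key appears on one shard only, but three times within that shard, once per replica set member.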

First, the terminology is primary and secondaries rather than master and slaves. A well-configured sharded cluster requires at least 9 machines: 3 for the first shard's replica set, 3 for the second shard's replica set, and 3 for the config server replica set. You may run mongos on any of the 9 machines, but for optimal performance you would likely need a 10th machine.
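The arithmetic behind that count, as a small sketch. The function and parameter names are hypothetical, and the defaults simply encode the 3-member replica sets described above.

```python
def machines_needed(shards: int,
                    members_per_replica_set: int = 3,
                    config_servers: int = 3,
                    dedicated_mongos: int = 0) -> int:
    """Count machines when every process gets its own host (assumed layout)."""
    return shards * members_per_replica_set + config_servers + dedicated_mongos


# Two shards, three members each, plus three config servers.
print(machines_needed(2))                      # 2 * 3 + 3 = 9
# Same cluster with mongos on its own host.
print(machines_needed(2, dedicated_mongos=1))  # 10
```

Every additional shard adds another full replica set, so a third shard would bring the total to 12 before counting a mongos host.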

The primary of each replica set handles the writes for its own data. Writing to secondaries is not possible. You may read from secondaries, but secondaries do as much writing as the primary, since they apply every replicated write. So you may read stale data, and your secondaries might now be overloaded and unable to keep up with the primary's writes.
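If you do decide to read from secondaries, that is something the client opts into through the connection string. Here is a minimal sketch that only builds the URI and does not connect to anything. The host names and the replica set name rs0 are placeholders, and the helper itself is made up for illustration; readPreference, maxStalenessSeconds and replicaSet are standard connection string options, and maxStalenessSeconds must be at least 90.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="primary",
                    max_staleness_seconds=None):
    """Build a MongoDB connection string for a replica set (hypothetical helper)."""
    options = {"replicaSet": replica_set, "readPreference": read_preference}
    if max_staleness_seconds is not None:
        # A staleness bound only makes sense when secondaries may serve reads.
        if read_preference == "primary":
            raise ValueError("maxStalenessSeconds cannot be combined with primary reads")
        if max_staleness_seconds < 90:
            raise ValueError("maxStalenessSeconds must be at least 90")
        options["maxStalenessSeconds"] = max_staleness_seconds
    return f"mongodb://{','.join(hosts)}/?{urlencode(options)}"


# Placeholder hosts; replace them with your own replica set members.
hosts = ["node1.example.net:27017", "node2.example.net:27017", "node3.example.net:27017"]
uri = replica_set_uri(hosts, "rs0", "secondaryPreferred", max_staleness_seconds=120)
print(uri)
```

The staleness bound limits how far behind a secondary may be and still serve reads. It does not remove the replication load on the secondaries.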

You may start small with 3 nodes, but you won't see any benefit. Actually, it is probably slower because you have processes fighting over the same resources. I would do it only if I were 110% sure I would need to shard. That way your configuration is tested, but it would be sub-optimal.

Yep, that is the main point.

No magic number. That is why I prefer not to shard until I really need it. I bite the bullet at that time.