Hello @Maciej_Pakulski1, welcome to the MongoDB Community forum!
Here are some clarifications.
How I understand it, is that for every single database, mongo will choose one of the shards to be the primary and basically keep the data there.
For existing databases, when you deploy a new sharded cluster with shards that were previously used as replica sets, all existing databases continue to reside on their original replica sets.
This means your existing data is on a replica set, and in the sharded cluster that replica set becomes a shard. The databases that already exist on the replica set stay where they are, on that same shard.
For new databases, that is, databases created after the cluster is set up, the database may reside on any shard in the cluster. The mongos selects the primary shard when creating a new database by picking the shard in the cluster that has the least amount of data.
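To make the "least amount of data" rule concrete, here is a minimal sketch in Python. It is only a toy model of the selection described above, not the actual mongos code; the function name `pick_primary_shard`, the shard names, and the sizes are all made up for illustration.

```python
def pick_primary_shard(shard_data_sizes):
    """Toy model: return the name of the shard that holds the least data.

    shard_data_sizes maps a (hypothetical) shard name to its data size.
    """
    if not shard_data_sizes:
        raise ValueError("cluster has no shards")
    # min() iterates over the shard names and compares them by size;
    # on a tie, the shard listed first wins.
    return min(shard_data_sizes, key=shard_data_sizes.get)


# Hypothetical two-shard cluster: shardA was an existing, well-filled replica set.
sizes_gb = {"shardA": 120.0, "shardB": 35.5}
print(pick_primary_shard(sizes_gb))  # shardB
```

In this model a freshly added, nearly empty shard would tend to receive the next new databases until its data size catches up.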
After the sharded cluster is created, new databases can end up on either of the two shards (in your case).
Note that you can change the primary shard for a database using the
However, will mongo also try to rebalance the databases between 2 shards, so that data are spread more or less evenly?
Collection data is distributed only when you shard a collection. A sharded cluster can hold both sharded and unsharded collections, and only the sharded collections are distributed among the shards; an unsharded collection stays on its database's primary shard. How evenly the data is distributed is mainly determined by the shard key (and the number of shards).
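As a rough illustration of why the shard key matters, the sketch below spreads documents over two shards by hashing a key value. This is a simplified model I am assuming for the example: MongoDB actually splits a sharded collection into ranges of shard key values and moves those between shards, so the helper names `shard_for` and `distribution` and the modulo placement are purely hypothetical.

```python
import hashlib


def shard_for(key_value, num_shards):
    """Toy placement: hash the shard key value and map it to a shard index."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribution(key_values, num_shards=2):
    """Count how many documents land on each shard in this toy model."""
    counts = {shard: 0 for shard in range(num_shards)}
    for value in key_values:
        counts[shard_for(value, num_shards)] += 1
    return counts


# Shard key with many distinct values (e.g. a user id): documents spread out.
many_values = distribution(range(1000))

# Shard key with a single value (e.g. every document has country "PL"):
# all documents land on the same shard, no matter how many shards exist.
one_value = distribution(["PL"] * 1000)

print(many_values)
print(one_value)
```

The second case shows the point made above: with a poorly chosen shard key, adding shards does not help, because all documents share the same key value.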