Hi Olivier,
Apologies for the delay in responding to this wonderful post. There’s a lot here and no single answer, so I’d like to take a high-level swing at it; separately, I recommend you go deeper with a MongoDB expert in the future.
Re (1)
To answer your question about the benefits and drawbacks of multi-region vs. cluster per region: first of all, there are two types of multi-region clusters. The first, which I’ll refer to as vanilla, has a preferred region that writes go to, plus as many secondary regions as you like that can serve reads (and provide high availability). The second is Atlas’s “Global Clusters”, which leverage an opinionated version of geo-zoned sharding: you embed a client location attribute into your schema, and this causes that client’s data to be written to (homed in) the part of the global cluster nearest to that location, enabling low-latency in-region writes and data placement based on sovereignty needs. There isn’t quite magic here: you’re responsible in this model for getting the client’s country code embedded in the document, but then the cluster does the work to ensure that the document goes to the right part of the cluster.
In the Global Cluster paradigm you can scale by adding more geo-zones over time. For example, maybe you eventually get enough customers in India to justify creating an India zone: when you do this, the data will be moved to that new shard infrastructure with no downtime. Alternatively, you can scale within a zone by adding shards, so the US zone might have 3 shards, the EU zone 2 shards, and the APAC zone 1 shard.
A key advantage of this model is that you can use the same connection string everywhere and the driver finds the nearest part of the cluster to interact with. You can even move data between zones thanks to MongoDB’s ability to transactionally change the value of a shard key field, so customer foo mapped to US could be updated to FR and their data will be moved from the US part to the EU part of the global cluster. Another advantage of this global connection string is the ability to do global aggregation queries. Finally, you can optionally enable in-region read latency globally by having a replica from every zone in every other zone.
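To make the location-attribute point concrete, here is a minimal Python sketch of the application's side of the bargain: stamping each document with a country code. The zone layout, the `zone_for` helper, and the `customerId` field are hypothetical names for illustration; the field name `location` reflects my understanding of what Atlas Global Clusters expect, so please verify it against the Atlas docs. The lookup function only models the routing that the cluster itself performs; you would not write it in a real application.

```python
# Hypothetical zone layout: ISO 3166-1 alpha-2 country codes per zone.
# In a real Global Cluster this mapping lives in the cluster configuration,
# not in application code; it is modeled here only to illustrate routing.
ZONES = {
    "US": {"US", "CA", "MX"},
    "EU": {"FR", "DE", "ES", "IT"},
    "APAC": {"JP", "AU", "SG"},
}
DEFAULT_ZONE = "US"  # assumption: unmapped countries fall back to one zone


def zone_for(country_code: str) -> str:
    """Return the zone that would home a document with this country code."""
    code = country_code.upper()
    for zone, countries in ZONES.items():
        if code in countries:
            return zone
    return DEFAULT_ZONE


def make_customer_doc(customer_id: str, country_code: str, **fields) -> dict:
    """Build a document carrying the location attribute.

    This is the part the application is responsible for: getting the
    client's country code embedded in every document it writes.
    """
    return {"location": country_code.upper(), "customerId": customer_id, **fields}


doc = make_customer_doc("foo", "us", plan="pro")

# Re-homing customer foo from US to FR: in a real cluster this would be an
# update of the location field; here we only model the resulting document.
rehomed = {**doc, "location": "FR"}
```

With this layout, `zone_for(doc["location"])` gives the US zone and `zone_for(rehomed["location"])` gives the EU zone, which mirrors the "customer foo moved from US to FR" example.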
Now, some folks prefer to manage their customer-to-cluster affinity at the application tier and in turn use a different cluster in each geo. Each of those clusters can still be spread across regions within the larger geo: say US East 1, US East 2, and US West 2 for US data, versus EU-West 1, EU-Central 1, and EU-North 1 for European data. An advantage of this model is that each cluster is truly isolated and can be scaled independently, both vertically and horizontally (scale-out with sharding). A disadvantage is that you need to manage which customer maps to which cluster based on your own logic, and in turn deal with all of the connection strings (not necessarily a big deal by any means). While you don’t have the same out-of-the-box ability to easily do global aggregations across all the clusters, you can optionally approximate this with Atlas Data Lake’s query federation, which can provide a unified virtual collection that spans your many collections for read-only analytics purposes.
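As a rough illustration of what "managing the affinity at the application tier" can look like, here is a minimal Python sketch. The connection strings, the `Tenant` type, and `cluster_uri_for` are all hypothetical; a real application would likely keep the mapping in a config store and cache one driver client per cluster.

```python
from dataclasses import dataclass

# Hypothetical connection strings, one per regional cluster.
CLUSTER_URIS = {
    "US": "mongodb+srv://us-cluster.example.net",
    "EU": "mongodb+srv://eu-cluster.example.net",
}


@dataclass(frozen=True)
class Tenant:
    tenant_id: str
    home_geo: str  # the geo this tenant's data is homed in


def cluster_uri_for(tenant: Tenant) -> str:
    """Pick the connection string for the cluster that homes this tenant."""
    try:
        return CLUSTER_URIS[tenant.home_geo]
    except KeyError:
        # Fail loudly rather than silently writing to the wrong geo.
        raise LookupError(
            f"no cluster configured for geo {tenant.home_geo!r}"
        ) from None
```

The application would then hand the returned URI to its driver client. Raising on an unknown geo, instead of falling back to a default cluster, is a deliberate choice here because a silent fallback could place data in the wrong jurisdiction.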
Re (2)
As a general rule, the idea of using a distinct database per geo does not work very well with Atlas Global Clusters, which instead operate on the idea that you declare which collections are global (rather than declaring which databases live in which zones), with the shard key space based on a location attribute. If you use database per tenant, this might still make sense IF you want each tenant to be globally distributed, BUT if each tenant is homed to one geo then my suggestion would probably be to use a cluster per geo for said tenants.
Re (3)
In general with sharding, if you use database per tenant (there are pros and cons to this approach), then only sharded collections will ever span more than one shard; unsharded collections go to the default shard of their database, and that default shard is more or less round-robined amongst databases in a sharded cluster. So my answer here is based on the assumption that you’re using sharding in a vanilla cluster that has a single preferred write region, rather than an Atlas Global Cluster, which really operates differently and has this opinionated model of global collections with a location attribute embedded in the schema.
Re (4)
Yes, any sharded cluster can be communicated with for both reads and writes using a single connection string. AND as I mentioned above, using Atlas Data Lake’s query federation you can even approximate something similar, for reads specifically, across multiple distinct clusters.
Re (5)
As a general rule, if a single database grows, one option is to shard the collections in that database across the shards; alternatively, that database could be moved into a separate cluster entirely to isolate it from the rest of the workload (sometimes this makes sense when you have a really noisy neighbor). This brings up a whole separate topic: how you coordinate, at scale, the movement of data/tenants between clusters, whether that means moving data from one region to another when using multiple clusters across regions, or peeling the onion and moving a heavy tenant out into its own dedicated cluster. There are a variety of strategies for doing this, ranging from piping mongodump to mongorestore (assuming you can quiesce writes for a period of time) to using mongomirror with namespace targeting to synchronize changes, allowing for a quick cutover (mongomirror supports replica sets specifically and not sharded clusters). We are working on innovative enhancements in this problem space over time and would be happy to get in touch with you.
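For the simplest of those strategies, piping mongodump to mongorestore, a minimal Python sketch might look like the following. It assumes writes to the tenant’s database are already quiesced, that both tools are on the PATH, and that the URIs do not themselves name a database; the function names and the one-database-per-tenant layout are hypothetical. The flags used (`--uri`, `--db`, `--archive`, `--nsInclude`) are ones I believe the database tools support, but please check them against your installed version.

```python
import subprocess


def dump_restore_commands(source_uri: str, target_uri: str, tenant_db: str):
    """Build the argv lists for a `mongodump | mongorestore` pipeline."""
    # --archive with no file name streams the dump to stdout / from stdin.
    dump = ["mongodump", f"--uri={source_uri}", f"--db={tenant_db}", "--archive"]
    restore = [
        "mongorestore",
        f"--uri={target_uri}",
        "--archive",
        f"--nsInclude={tenant_db}.*",  # restore only this tenant's namespaces
    ]
    return dump, restore


def copy_tenant(source_uri: str, target_uri: str, tenant_db: str) -> None:
    """Stream one tenant database from the source cluster to the target."""
    dump_cmd, restore_cmd = dump_restore_commands(source_uri, target_uri, tenant_db)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    restore = subprocess.Popen(restore_cmd, stdin=dump.stdout)
    dump.stdout.close()  # let mongodump see a broken pipe if restore exits early
    restore_rc = restore.wait()
    dump_rc = dump.wait()
    if dump_rc != 0 or restore_rc != 0:
        raise RuntimeError(
            f"copy failed: mongodump={dump_rc}, mongorestore={restore_rc}"
        )
```

Streaming through a pipe avoids staging the dump on disk, but it gives no resume point if the copy fails midway, so this only suits tenants small enough to re-copy from scratch.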
Regarding your further details: noting that you want to avoid sharding an individual tenant’s collections but want to keep their collections together for performance reasons, I suspect these are meaty tenants that would probably make sense to bring out into their own dedicated clusters. BUT I do think that having a MongoDB expert really look at your overall problem space, goals, data model, schemas, and macro lifecycle/scale/global strategy could offer much more fine-tuned help than what I’m able to provide on this forum.
I think a key question to ask is whether you want your own customers (your tenants) to be able to do local writes from anywhere in the world, or whether each of your tenants needs only one write region (even if other regions are used for HA and lower-latency reads).
Cheers
-Andrew