Multitenancy, multi-region, sharding and zones

Hello, I am quite new to MongoDB architecture and I am considering MongoDB Atlas for our primary transactional database. Our application is a B2B multitenant SaaS application that is currently in a single region but will expand to multiple regions in the years to come. I was wondering how the concepts of multitenancy, multi-region, sharding, and zones all intersect, and how they can help answer my questions below.

  1. What are the benefits/drawbacks of a multi-region cluster vs cluster per region?
  2. Can I start with an unsharded cluster with the “tenant per database” multitenancy model and evolve this into a multi-master-like sharded cluster by assigning tenant databases to specific shards? Do I need zones to do this, if at all?
  3. If sharding is enabled, is it generally possible to ensure that each tenant database stays confined to a single shard?
  4. If using sharding, with or without zones, is it possible to communicate with the cluster from a single URL/domain for all reads and writes?
  5. If a single tenant database begins to outgrow its current shared shard, is it possible to move it to a dedicated shard? What does this involve from both the client (microservice running in a container) and server (MongoDB Atlas) sides?

Further details:
• I do not expect the need to shard individual tenant collections, but I expect that collectively the individual tenant collections from their respective tenant databases will eventually surpass the storage limits of a single node.
• I want to avoid sharding a single tenant’s collection in order to maintain data locality and ensure fast reads.
• I want to eventually support multiple write nodes for high availability and throughput without requiring multiple database clusters, which would introduce added complexity on the client side.

Hi Olivier,

Apologies for the delay in responding to this wonderful post. There's a lot here and no one answer, so I'd like to take a high-level swing, but then separately I'd recommend you go deeper with a MongoDB expert in the future.

Re (1)
To answer your question on the benefits/drawbacks of multi-region vs cluster per region: first of all, there are two types of multi-region clusters. The first I'll refer to as vanilla: there is a preferred region that writes go to, plus as many secondary regions as you like that can serve reads (and provide high availability). Separately there are Atlas's "Global Clusters", which leverage an opinionated version of geo-zoned sharding whereby you embed a client location attribute into your schema, and that client's data is then written to/homed in the part of the global cluster nearest to that location, enabling low-latency in-region writes and data placement based on sovereignty needs. There isn't quite magic here, by which I mean you're responsible in this model for getting the client's country code embedded in the document, but the cluster then does the work to ensure that document goes to the right part of the cluster.

In the Global Cluster paradigm you can scale by adding more geo-zones over time, e.g. maybe you eventually get enough customers in India to justify creating an India zone: when you do this, the data will be moved to that new shard infrastructure with no downtime. Alternatively you can scale within a zone by adding shards: maybe the US zone has 3 shards, the EU zone 2 shards, and the APAC zone 1 shard, for example. A key advantage of this model is that you can use the same connection string everywhere and the driver finds the nearest part of the cluster to interact with. You can even move data between zones thanks to MongoDB's ability to transactionally change the value of a shard key field: customer foo, mapped to US, could be updated to FR and their data would be moved from the US to the EU part of the global cluster. Another advantage of this global connection string is the ability to do global aggregation queries. Finally, you can optionally enable in-region latency for reads everywhere by keeping a replica from every zone in every other zone.
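To make that zone move concrete, here's a minimal sketch using PyMongo. The URI, database, collection, and field names are my own placeholders, and it assumes a global collection sharded on `{location: 1, customerId: 1}` (changing a shard key field's value requires MongoDB 4.2+ and must run in a transaction or as a retryable write):

```python
# A minimal sketch, assuming a collection sharded on {location: 1, customerId: 1}.
# The URI, db, and field names below are placeholders, not Atlas-generated names.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://global-cluster.example.mongodb.net")
customers = client["app"]["customers"]

# Shard key value changes must run in a transaction (or as a retryable write),
# and the filter must include the full shard key.
with client.start_session() as session:
    with session.start_transaction():
        customers.update_one(
            {"location": "US", "customerId": "foo"},
            {"$set": {"location": "FR"}},  # the balancer then migrates the data to the EU zone
            session=session,
        )
```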

Now, some folks prefer to manage their customer-to-cluster affinity at the application tier and use a different cluster in each geo (each of which can in turn still be spread across regions within that larger geo, say US East 1, US East 2, and US West 2 versus EU-West 1, EU-Central 1, and EU-North 1 for US and European data respectively). An advantage of this model is that each cluster is truly isolated and can be independently scaled, both vertically and horizontally via sharding. A disadvantage is that you need to manage which customer maps to which cluster with your own logic, and in turn deal with all of the connection strings (not necessarily a big deal by any means). While you don't have the same out-of-the-box ability to easily do global aggregations across all the clusters, you can approximate it with Atlas Data Lake's query federation, which can provide a unified virtual collection spanning your many collections for read-only analytics purposes.
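Here's a rough sketch of what that app-tier affinity might look like with PyMongo, assuming a database-per-tenant model; the URIs, tenant IDs, and mapping table are all hypothetical (in practice the mapping would likely live in a config service or control-plane database):

```python
# A rough sketch of app-tier tenant-to-cluster affinity; all names are hypothetical.
from pymongo import MongoClient

CLUSTER_URIS = {
    "us": "mongodb+srv://us-cluster.example.mongodb.net",
    "eu": "mongodb+srv://eu-cluster.example.mongodb.net",
}
TENANT_HOME_GEO = {"acme": "us", "globex": "eu"}  # your own affinity logic

_clients = {geo: MongoClient(uri) for geo, uri in CLUSTER_URIS.items()}

def tenant_db(tenant_id: str):
    """Route a tenant to its home-geo cluster and return its database handle."""
    return _clients[TENANT_HOME_GEO[tenant_id]][f"tenant_{tenant_id}"]

# Usage: all reads/writes for "acme" go to the US cluster.
tenant_db("acme")["orders"].insert_one({"sku": "abc", "qty": 1})
```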

Re (2)
As a general rule, the idea of using a distinct database per geo does not work very well with Atlas Global Clusters, which instead operate on the idea that you declare which collections are global (in other words, rather than declaring which databases live in which zones), with the shard key space based on a location attribute. If you use database per tenant, this can still make sense IF you want each tenant to be globally distributed, BUT if each tenant is homed to one geo, then my suggestion would probably be to use a cluster per geo for those tenants.
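For intuition, here's a sketch of the underlying mechanics expressed with plain zone sharding commands; Atlas Global Clusters manage all of this for you, and the shard names, zone names, and namespace below are assumptions on my part:

```python
# A sketch of geo-zoned sharding with a location-prefixed shard key.
# Shard names, zone names, and the namespace are assumed for illustration.
from pymongo import MongoClient
from bson.min_key import MinKey
from bson.max_key import MaxKey

admin = MongoClient("mongodb://mongos.example.net:27017").admin  # placeholder mongos

admin.command("enableSharding", "app")
admin.command("shardCollection", "app.customers",
              key={"location": 1, "customerId": 1})  # location attribute leads the key

# Tag a shard as belonging to the US zone, then pin the US key range to that zone.
admin.command({"addShardToZone": "shard-us-0", "zone": "US"})
admin.command({
    "updateZoneKeyRange": "app.customers",
    "min": {"location": "US", "customerId": MinKey()},
    "max": {"location": "US", "customerId": MaxKey()},
    "zone": "US",
})
```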

Re (3)
In general with sharding, if you use database per tenant (there are pros and cons to this approach), then only sharded collections will ever span more than one shard: unsharded collections live on the database's primary shard, and primary shards are more or less round-robined among databases in a sharded cluster. My answer here is based on the assumption that you're using sharding in a vanilla cluster with a single preferred write region, rather than an Atlas Global Cluster, which really operates differently with its opinionated global collections and location attribute embedded in the schema.
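So yes: as long as a tenant database's collections stay unsharded, the whole database is confined to its primary shard, and you can relocate it as a unit. A minimal sketch, with placeholder URI, database, and shard names:

```python
# A minimal sketch, assuming database-per-tenant: unsharded collections sit on
# the database's primary shard, and movePrimary relocates them as a unit.
from pymongo import MongoClient

client = MongoClient("mongodb://mongos.example.net:27017")  # placeholder mongos

# Inspect which shard currently holds tenant_acme's unsharded collections.
print(client["config"]["databases"].find_one({"_id": "tenant_acme"}))  # "primary" field

# Move the tenant database to another shard (assumed shard name).
client.admin.command("movePrimary", "tenant_acme", to="shard-dedicated-0")
```

Note that from the client's perspective nothing changes here: it's the same cluster and the same connection string before and after the move.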

Re (4)
Yes, any sharded cluster can be communicated with for both reads and writes using a single connection string. AND, as I mentioned above, using Atlas Data Lake's query federation you can even approximate something similar for reads specifically across multiple distinct clusters.
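A minimal sketch with a placeholder hostname; one SRV string addresses the whole sharded cluster, and the driver routes everything through mongos behind the scenes:

```python
# One connection string for all reads and writes against a sharded cluster.
from pymongo import MongoClient

client = MongoClient("mongodb+srv://cluster0.example.mongodb.net")  # placeholder URI
coll = client["app"]["customers"]

coll.insert_one({"customerId": "foo", "location": "US"})  # routed by shard key
print(coll.find_one({"location": "US", "customerId": "foo"}))  # targeted read
```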

Re (5)
As a general rule, if a single database grows, one option is to shard the collections in that database across the shards; alternatively, that database could be moved into a separate cluster entirely to isolate it from the rest of the workload (this sometimes makes sense when you have a really noisy neighbor). Now, this brings up a whole separate topic around how you coordinate, at scale, the movement of data/tenants between clusters, whether you need to move data from one region to another when using multiple clusters across regions, or you need to peel the onion and move a heavy tenant out into their own dedicated cluster. There are a variety of strategies for doing this, ranging from piping mongodump to mongorestore (assuming you can quiesce writes for a period of time) to using mongomirror with namespace targeting to synchronize changes and allow a quick cutover (mongomirror supports replica sets specifically, not sharded clusters). We are working on innovative enhancements to this problem space over time and would be happy to get in touch with you.
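Here's a rough sketch of the dump/restore option, assuming writes to the tenant are quiesced first; both URIs and the database name are placeholders:

```python
# Pipe mongodump to mongorestore to copy one tenant database between clusters.
# Assumes writes are quiesced for the duration; URIs and db name are placeholders.
import subprocess

dump = subprocess.Popen(
    ["mongodump",
     "--uri=mongodb+srv://shared-cluster.example.mongodb.net/tenant_acme",
     "--archive"],           # stream an archive of just tenant_acme to stdout
    stdout=subprocess.PIPE,
)
subprocess.run(
    ["mongorestore",
     "--uri=mongodb+srv://dedicated-cluster.example.mongodb.net",
     "--archive"],           # read the archive from stdin
    stdin=dump.stdout,
    check=True,
)
dump.stdout.close()
dump.wait()
```

On the client side, moving a tenant to a different cluster does mean your microservice has to switch that tenant's connection string at cutover time, which is exactly the app-tier affinity bookkeeping described in (1).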

Regarding your further details: noting that you want to avoid sharding an individual tenant's collections and instead keep each tenant's collections together for performance reasons, I suspect these are meaty tenants specifically, and it would probably make sense to break them out into their own dedicated clusters. BUT I do think having a MongoDB expert really look at your overall problem space, goals, data model, schemas, and macro lifecycle/scale/global strategy could likely offer much more fine-tuned help than what I'm able to provide on this forum.

I think a key question to ask is whether you want your own customers (your tenants) to be able to do local writes globally, or whether each tenant needs only one write region (even if other regions are used for HA and lower-latency reads).

Cheers
-Andrew

