My team is considering a migration to MongoDB. We have three regional data centres and our requirements include that that all data be replicated across regions and read/writes happen locally. After reviewing docs it seems the preferred way to do this is to shard data regionally, with each region containing the primary for its own shard as well as a secondary for each other region.
My concern is what happens if some user’s data becomes separated across shards? Say they are travelling and connect to a different data center. We would still want them to be able to access their data, and access and data they wrote to that region once they returned home. It seems if we include shard key for the user we will miss their data on other shards. Would we need to query only by a user’s UUID and rely on scatter gather queries, despite that not being scalable?
Please confirm if my understanding of your use case is correct, you have 3 data centres globally and want the data to be replicated overtime but initially want read/writes to happen locally?
If you can use Atlas, you might want to check out “Atlas Global Clusters” where you can define single or multi-region Zones, where each zone supports write and read operations from geographically local shards. You can also configure zones that support global low-latency secondary reads.