Assignment of Shards and Zones

Hi,

I have a collection that contains more than 800 million documents. The documents contain details about multiple countries, so I am planning to shard on the country code, e.g. US, UK, IN.

Also, I want to create zones such as USZone and UKZone and assign each shard to the corresponding country zone. I have the following questions regarding the behaviour of zones and shards:

  1. Since I want to create shards based on the country code and I don’t want multiple countries present in a single shard, I thought of using hashed sharding. Please let me know if you think a hashing strategy will not work.

  2. The Zones documentation talks about lower and upper boundaries when assigning a shard to a zone. In my case there is no lower or upper boundary, because I am sharding on a 2-letter country code. How do I assign a shard to a zone when there is no range involved?

  3. After assigning the US shard to the US zone, suppose that shard goes down some time later. Will MongoDB move incoming data (with country code US) to a different shard/zone, or will it throw an error? If MongoDB moves the data to a different shard/zone, will the moved data go back to the US shard once it comes back online?

Thanks in advance.

Hi @Allwyn_Jesu ,

The zone sharding feature is for tagging specific shards to cover specific shard key values and ranges.

Since you can control which country, or list of countries, is assigned to each shard, I do not quite understand the need for hashed sharding for distribution. Perhaps you could add a hashed field to a compound shard key so that data is distributed across a set of shards within a zone (e.g. { country: 1, userId: "hashed" }).

I suggest reading the following guide to understand it better:

In terms of lower and upper bounds, you can assign a single value for the field for both lower and upper. However, if you have a compound shard key, for example { country: 1, userid: 1 }, then you can say that all users for a specific country are assigned to zone X:

sh.addTagRange( 
  "chat.messages",
  { "country" : "UK", "userid" : MinKey },
  { "country" : "UK", "userid" : MaxKey }, 
  "EU"
)
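For a shard key on country alone, a zone range still needs two bounds. The following is only a minimal sketch, assuming the documented rule that zone ranges include the lower bound and exclude the upper bound, and assuming two-letter uppercase ASCII country codes compared with simple binary ordering. The helper names `countryZoneRange` and `inRange` are hypothetical and not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): builds a half-open range
// [code, next) that contains exactly one two-letter country code.
function countryZoneRange(code) {
  // Bump the last character, e.g. "UK" -> "UL", to get an exclusive upper bound.
  const last = code.charCodeAt(code.length - 1);
  const upper = code.slice(0, -1) + String.fromCharCode(last + 1);
  return { min: { country: code }, max: { country: upper } };
}

// Mirrors the zone range rule: lower bound inclusive, upper bound exclusive.
function inRange(value, range) {
  return value >= range.min.country && value < range.max.country;
}

const ukRange = countryZoneRange("UK");
console.log(inRange("UK", ukRange), inRange("US", ukRange)); // true false
```

The resulting `min` and `max` documents could then be passed as the two bounds of a zone range for a collection sharded on { country: 1 }.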
  3. After assigning the US shard to the US zone, suppose that shard goes down some time later. Will MongoDB move incoming data (with country code US) to a different shard/zone, or will it throw an error? If MongoDB moves the data to a different shard/zone, will the moved data go back to the US shard once it comes back online?

If a shard is down, any data on that shard is unavailable for both reads and writes. Its data will not be moved, because the shard is unavailable (sharding is built for scaling, not for high availability). Replica sets provide the HA mechanics that guard against a shard failure, and you can distribute secondaries across other regions to ensure fault tolerance.

Best regards,
Pavel

Hi @Pavel_Duchovny,

Thanks for the quick reply.

The documents in the collection do not contain a user id. They contain fields such as content, published_at, country, and a few more.

I am sharding on the country field and, as you mentioned, I don’t need hashed sharding. If I go with range sharding, will MongoDB create one shard per country, or is it possible that one shard might contain documents from multiple countries?

sh.shardCollection("articles.news", { country: 1 })

Since I don’t have a compound shard key, I am assuming that adding a shard to the zone will be enough.

sh.addShardToZone("shard0000", "UK")

or do I need to create a Zone range as well?

sh.updateZoneKeyRange("articles.news", { country: "UK" }, "UK")

You have to create a zone range to associate a collection and a specific shard key value with a zone.

MongoDB does not create shards for you. You need to configure shards and associate the values to them in zone sharding.

There will always be a range that falls on some shard. So if you have 2 shards and you associate the UK with shard 1, all other countries will go to shard 2, and so on.
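The association described above could be sketched in the shell roughly as follows. This is an illustration under stated assumptions, not a definitive setup: it assumes four existing shards, one zone per country, and a collection already sharded on { country: 1 }. Only "shard0000" appears in this thread; the other shard names, the `plan` array, and the `rangesOverlap` helper are hypothetical. Each upper bound is the country code with its last letter bumped by one, since zone ranges exclude their upper bound.

```javascript
// Assumed layout: one shard and one zone per country (shard names are hypothetical).
const ns = "articles.news";
const plan = [
  { zone: "UK", shard: "shard0000", min: "UK", max: "UL" },
  { zone: "US", shard: "shard0001", min: "US", max: "UT" },
  { zone: "IN", shard: "shard0002", min: "IN", max: "IO" },
  { zone: "SG", shard: "shard0003", min: "SG", max: "SH" },
];

// Sanity check: after sorting by lower bound, no range may start
// before the previous range ends (zone ranges must not overlap).
function rangesOverlap(entries) {
  const sorted = [...entries].sort((a, b) =>
    a.min < b.min ? -1 : a.min > b.min ? 1 : 0
  );
  return sorted.some((e, i) => i > 0 && e.min < sorted[i - 1].max);
}

// `sh` exists only inside the MongoDB shell; elsewhere this block is skipped.
if (typeof sh !== "undefined" && !rangesOverlap(plan)) {
  for (const { zone, shard, min, max } of plan) {
    sh.addShardToZone(shard, zone); // tag the shard with its zone
    sh.updateZoneKeyRange(ns, { country: min }, { country: max }, zone);
  }
}
```

With a layout like this, every country in the list has its own range and its own zone, so no country is left to fall on an arbitrary shard.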

Please let me know if you have any additional questions.

Best regards,
Pavel

Could you please explain how to associate all UK documents with Shard1? Say that in my collection each document has one of the following countries: UK, US, IN, or SG. If I create four shards using range sharding, with the country field as the shard key, will each shard contain documents from only one country? If not, how can I make sure each shard contains documents related to only one country?

Also, is this the right way to create a zone range when the shard key is based on only one field i.e. country?

sh.updateZoneKeyRange("articles.news", { country: "UK" }, "UK")
