Difference between _id and sharding key?

I am using MongoDB Atlas as a software developer. Do I need to consider sharding from a developer's perspective?

What is the difference between _id and sharding key?

I am reading the link below. Is there a resource on sharding that has examples and is easy for developers to read? Please provide links if possible.

Hey @Ping_Pong ,

Please note that we recommend sharding once an individual collection grows beyond 2 TB. You can refer to the Performance Best Practices blog for more information.

_id is the primary key in MongoDB, and by default its value is an ObjectId. An ObjectId is a 12-byte value (usually displayed as a 24-character hexadecimal string) that is likely to be globally unique and guaranteed to be unique per collection when auto-generated. At 12 bytes, it is smaller than a universally unique identifier (UUID), which is typically 128 bits (16 bytes). Beginning in MongoDB 3.4, an ObjectId consists of the following values:

  • 4-byte value representing the seconds since the Unix epoch,
  • 5-byte random value, and
  • 3-byte counter, starting with a random value.

The odds of two ObjectIds generated in the same second being the same would be about 1 in 18,446,744,073,709,551,616. We arrive at this value because there are eight bits in a byte and eight random bytes in an ObjectId (the 5-byte random value plus the 3-byte counter with its random starting value), making the denominator of the odds ratio 2^(8*8) = 2^64, or about 1.8447 x 10^19.

As such, duplicate values are possible but highly unlikely.

For more information on MongoDB’s ObjectId data type, I recommend the following blog posts, which cover this topic in greater detail:
  • “Quick Start: BSON Data Types - ObjectId”
  • “Generating Globally Unique Identifiers for Use with MongoDB”
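
To make the 12-byte layout above concrete, here is a minimal sketch in plain Python (standard library only, not the official driver) that splits an ObjectId's hexadecimal string into its three fields and prints the 2^(8*8) denominator. The function name `split_object_id` and the sample ObjectId value are made up for illustration.

```python
from datetime import datetime, timezone


def split_object_id(hex_id: str) -> dict:
    """Split a 24-character hex ObjectId string into its three fields."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    # Bytes 0-3: seconds since the Unix epoch (big-endian).
    seconds = int.from_bytes(raw[0:4], "big")
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "random": raw[4:9].hex(),                     # bytes 4-8: random value
        "counter": int.from_bytes(raw[9:12], "big"),  # bytes 9-11: counter
    }


# Hypothetical ObjectId value, made up for this example.
parts = split_object_id("5f2b1c3da1b2c3d4e5000001")
print(parts["timestamp"].isoformat())
print(parts["random"], parts["counter"])

# Eight bytes of eight bits each give the denominator quoted above.
print(2 ** (8 * 8))
```

Because the leading four bytes are a timestamp, ObjectIds generated later generally compare greater than earlier ones, which matters when _id is considered as a shard key.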

A shard key can be a single field or a combination of multiple fields in the collection. It must be indexed and does not need to be unique, but it should have high cardinality. You can read more about it in the shard key documentation.
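
As a rough illustration of what "high cardinality" means, the sketch below counts the distinct values a candidate shard key takes across a few sample documents. It is plain Python rather than a MongoDB API; the helper `key_cardinality`, the collection contents, and the field names are all hypothetical.

```python
def key_cardinality(docs, fields):
    """Count the distinct value combinations of `fields` across `docs`."""
    return len({tuple(doc.get(f) for f in fields) for doc in docs})


# Hypothetical sample documents; the field names are made up for illustration.
orders = [
    {"_id": 1, "country": "US", "customer_id": "c1"},
    {"_id": 2, "country": "US", "customer_id": "c2"},
    {"_id": 3, "country": "DE", "customer_id": "c3"},
    {"_id": 4, "country": "US", "customer_id": "c1"},
]

# A single low-cardinality field offers few distinct values to split on.
print(key_cardinality(orders, ["country"]))

# Combining it with a second field raises the cardinality.
print(key_cardinality(orders, ["country", "customer_id"]))
```

On a real collection, the same comparison would be run against a representative sample of the data before committing to a key.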

You can choose _id as the shard key, but there are complications. Because ObjectId values increase over time, the most recent data all goes to a single shard, while older data resides on the other shards. You can read the blog on selecting a shard key to choose the best shard key for your sharded cluster.
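
The effect can be seen in a toy simulation. The sketch below is not MongoDB's actual routing: `ranged_shard` and `hashed_shard` are hypothetical helpers, integers stand in for ever-increasing _id values, and MD5 modulo the shard count is only a stand-in for a hash-based placement. It assumes three shards with fixed range boundaries.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 3  # assumed cluster size for this toy model


def ranged_shard(key: int, boundaries: list) -> int:
    """Toy ranged routing: shard i holds keys below boundaries[i]; the last shard holds the rest."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)


def hashed_shard(key: int) -> int:
    """Toy hashed routing: hash the key, then take it modulo the shard count."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


# Older data split the key space at 1000 and 2000; new keys keep increasing.
boundaries = [1000, 2000]
new_keys = range(5000, 5300)

ranged = Counter(ranged_shard(k, boundaries) for k in new_keys)
hashed = Counter(hashed_shard(k) for k in new_keys)

print("ranged:", dict(ranged))                  # every new key lands on the last shard
print("hashed:", dict(sorted(hashed.items())))  # new keys are spread across shards
```

In the ranged case all 300 new keys fall past the last boundary, so one shard takes every insert. A real cluster splits and rebalances ranges over time, so this only illustrates where new writes land at a given moment.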

Thanks,
Darshan

@DarshanJayarama

Thanks. When MongoDB Atlas is used, do I still need to carry out the sharding-related steps myself, for example reshaping collections? In other words, is the same work required for sharding at both the infrastructure and the code level, whether the deployment is on-prem or in the cloud?

Correct, the user-space experience for sharding in MongoDB Atlas is consistent with self-managed MongoDB: you still choose a shard key. Of course Atlas makes it much easier to scale up and out elastically with your needs, all declaratively.

Cheers
-Andrew


Thanks @Andrew_Davidson
