Does ObjectId guarantee absolute insertion order with just one replica set and no sharding throughout the database's lifetime?

Masoud_Naghizade · March 3, 2021, 1:31pm

According to the docs, an ObjectId consists of 12 bytes. What the docs do not explain is what the 5 random bytes are, and whether they change within the same server instance.

So if those 5 random bytes do not change within the same MongoDB server instance, all documents inserted by that running instance will have the same 5 random bytes.

Now, since the last 3 bytes are incremental, can we conclude that the _id field reflects absolute insertion order, provided we only ever have one replica set with no sharding throughout the lifetime of the database, and the system time is never changed by mistake?

Stennie_X · March 21, 2021, 11:24am

Hi @Masoud_Naghizade,

By default new ObjectIDs are generated by the client/driver (although they can also be generated on the server if the client does not provide an _id). In most cases the driver generates the ObjectID (for _id) and adds this to the document representation before sending the server request.

Example from PyMongo documentation on Inserting a Document:

When a document is inserted a special key, "_id" , is automatically added if the document doesn’t already contain an "_id" key. The value of "_id" must be unique across the collection.

The “automatically added” mention is referring to the driver adding the _id to your document. This allows the driver to provide the _id for a document without waiting for a round-trip to the server to fetch a generated _id. You can also use that _id to prepare multiple related documents in a transaction.

ObjectIDs are generally monotonically increasing because of the leading timestamp prefix, but do not strictly reflect insertion order. The granularity of the timestamp is in seconds, so multiple ObjectIDs generated within the same second do not have predictable ordering. The client can also generate ObjectIDs well before the request is sent to the server, or be subject to clock skew for the timestamp component.
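To make the second-level granularity concrete, here is a minimal standard-library sketch that splits an ObjectId hex string into its parts. It assumes the documented 12-byte layout (4-byte big-endian timestamp in seconds, 5-byte random value, 3-byte counter). The function name `split_object_id` and the sample hex string are made up for illustration and are not part of any driver API.

```python
import struct
from datetime import datetime, timezone


def split_object_id(hex_id: str) -> dict:
    """Split a 24-character ObjectId hex string into its three fields."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes (24 hex characters)")
    # Bytes 0-3: seconds since the Unix epoch, big-endian.
    (seconds,) = struct.unpack(">I", raw[0:4])
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        # Bytes 4-8: the 5-byte random value.
        "random": raw[4:9].hex(),
        # Bytes 9-11: the 3-byte counter, big-endian.
        "counter": int.from_bytes(raw[9:12], "big"),
    }


# Hypothetical ObjectId, written as timestamp + random + counter for readability.
parts = split_object_id("603f8a1c" "a1b2c3d4e5" "00002a")
print(parts["timestamp"].isoformat(), parts["random"], parts["counter"])
```

Because the timestamp field only has one-second resolution, any two ObjectIDs created in the same second share their first 4 bytes, and their relative order is decided by the remaining bytes rather than by the time of insertion.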

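The 5 random bytes do not change on every call: per the ObjectId specification, the value is generated once per process, so ObjectIDs generated by the same client process share it, while a different process (or a restarted one) gets a new value. This provides some differentiation for ObjectIDs that may be generated concurrently on different application servers. The trailing 3 bytes are an incrementing counter that is initialized to a random value.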

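To your main question: no, the _id field does not guarantee absolute insertion order, even with a single replica set and no sharding.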

Since ObjectIDs are typically generated on the client, the deployment topology does not factor into the ordering of ObjectIDs. Even if the ObjectIDs are generated on the server, multiple ObjectIDs generated in the same second will not have a predictable ordering.
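A small simulation may help illustrate why topology does not matter. This sketch builds ObjectId-shaped byte strings by hand using only the standard library. It assumes the documented layout (timestamp, per-process random value, counter) and that ObjectIDs sort by comparing their bytes. The helper `make_object_id`, the two client random values, and the counter values are invented for the example.

```python
import struct


def make_object_id(seconds: int, process_random: bytes, counter: int) -> bytes:
    """Assemble a 12-byte ObjectId-like value from its three fields."""
    assert len(process_random) == 5
    return (
        struct.pack(">I", seconds)
        + process_random
        + (counter % 0x1000000).to_bytes(3, "big")
    )


same_second = 1614776860  # all three ids are generated within one second

# Two hypothetical application processes, each with its own random value.
client_a = bytes.fromhex("f0f0f0f0f0")
client_b = bytes.fromhex("0a0a0a0a0a")

# The order in which the server actually receives the inserts.
arrival_order = [
    ("first", make_object_id(same_second, client_a, 100)),
    ("second", make_object_id(same_second, client_b, 7)),
    ("third", make_object_id(same_second, client_a, 101)),
]

# Sorting by _id compares the raw bytes, not the arrival order.
sorted_by_id = [label for label, oid in sorted(arrival_order, key=lambda p: p[1])]
print(sorted_by_id)
```

In this run the document that arrived second sorts first, because within the same second the ordering falls through to the random bytes of whichever client generated the id.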

For a use case requiring strict insertion order, you could use a capped collection, which guarantees that results queried in natural order reflect insertion order. However, capped collections come with many restrictions in order to achieve this guarantee. They are FIFO (First-In First-Out) collections with a maximum size, and they do not allow direct removal of documents or updates that change a document's size.

If capped collections aren’t a suitable solution, you will have to find an alternative approach to ensure generation of unique monotonically increasing identifiers that reflect insertion order and suit your use case.

For an overview of common approaches and their associated benefits & drawbacks, see: Generating Globally Unique Identifiers for Use with MongoDB.

Regards,
Stennie
