BSON Types

On this page

  • Binary Data
  • ObjectId
  • String
  • Timestamps
  • Date

BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The BSON specification is located at bsonspec.org.

Each BSON type has both integer and string identifiers as listed in the following table:

| Type | Number | Alias | Notes |
| --- | --- | --- | --- |
| Double | 1 | "double" | |
| String | 2 | "string" | |
| Object | 3 | "object" | |
| Array | 4 | "array" | |
| Binary data | 5 | "binData" | |
| Undefined | 6 | "undefined" | Deprecated. |
| ObjectId | 7 | "objectId" | |
| Boolean | 8 | "bool" | |
| Date | 9 | "date" | |
| Null | 10 | "null" | |
| Regular Expression | 11 | "regex" | |
| DBPointer | 12 | "dbPointer" | Deprecated. |
| JavaScript | 13 | "javascript" | |
| Symbol | 14 | "symbol" | Deprecated. |
| 32-bit integer | 16 | "int" | |
| Timestamp | 17 | "timestamp" | |
| 64-bit integer | 18 | "long" | |
| Decimal128 | 19 | "decimal" | |
| Min key | -1 | "minKey" | |
| Max key | 127 | "maxKey" | |

To determine a field's type, see Type Checking.
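
As a minimal sketch of how the two identifiers relate, the following plain JavaScript maps a few type numbers from the table to their aliases and builds a $type query filter from either form. The names BSON_TYPE_ALIASES and typeFilter are hypothetical and exist only for this illustration. The resulting filter document is the kind of value you could pass to a find() call in mongosh.

```javascript
// Number-to-alias pairs copied from the table above (a subset, for brevity).
const BSON_TYPE_ALIASES = new Map([
  [1, "double"],
  [2, "string"],
  [7, "objectId"],
  [9, "date"],
  [16, "int"],
  [18, "long"],
]);

// Hypothetical helper: build a { field: { $type: ... } } filter from either
// a type number or its string alias, rejecting values not in the map.
function typeFilter(field, type) {
  const known =
    typeof type === "number"
      ? BSON_TYPE_ALIASES.has(type)
      : [...BSON_TYPE_ALIASES.values()].includes(type);
  if (!known) throw new RangeError(`Unknown BSON type: ${type}`);
  return { [field]: { $type: type } };
}

console.log(JSON.stringify(typeFilter("price", 1)));        // {"price":{"$type":1}}
console.log(JSON.stringify(typeFilter("price", "double"))); // {"price":{"$type":"double"}}
```

Both filters select the same documents, because the number and the alias identify the same BSON type.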

If you convert BSON to JSON, see the Extended JSON reference.

The following sections describe special considerations for particular BSON types.

Binary Data

A BSON binary binData value is a byte array. A binData value has a subtype that indicates how to interpret the binary data. The following table shows the subtypes.

| Number | Subtype |
| --- | --- |
| 0 | Generic binary subtype |
| 1 | Function data |
| 2 | Binary (old) |
| 3 | UUID (old) |
| 4 | UUID |
| 5 | MD5 |
| 6 | Encrypted BSON value |
| 7 | Compressed time series data (new in version 5.2) |
| 128 | Custom data |
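
The sketch below shows how a subtype travels alongside the bytes. It assumes the binary layout given in the BSON specification at bsonspec.org (a little-endian int32 length, one subtype byte, then the payload) and uses a hypothetical encodeBinData helper written for Node.js. It is an illustration only, not driver code, and it ignores the different inner layout of the old binary subtype 2.

```javascript
// Hypothetical helper: serialize the value portion of a binData element.
// Assumed layout (from bsonspec.org): int32 length (little-endian), subtype, bytes.
function encodeBinData(subtype, payload) {
  if (!Number.isInteger(subtype) || subtype < 0 || subtype > 255) {
    throw new RangeError("subtype must fit in one byte");
  }
  const out = Buffer.alloc(4 + 1 + payload.length);
  out.writeInt32LE(payload.length, 0); // payload length, excluding the subtype byte
  out.writeUInt8(subtype, 4);          // subtype number from the table above
  Buffer.from(payload).copy(out, 5);   // raw bytes follow the subtype
  return out;
}

// Subtype 0 (generic binary) wrapping three bytes.
const encoded = encodeBinData(0, Uint8Array.of(0xde, 0xad, 0xbe));
console.log(encoded.toString("hex")); // 0300000000deadbe
```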

ObjectId

ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values are 12 bytes in length, consisting of:

  • A 4-byte timestamp, representing the ObjectId's creation, measured in seconds since the Unix epoch.

  • A 5-byte random value generated once per process. This random value is unique to the machine and process.

  • A 3-byte incrementing counter, initialized to a random value.

For timestamp and counter values, the most significant bytes appear first in the byte sequence (big-endian). This is unlike other BSON values, where the least significant bytes appear first (little-endian).

If an integer value is used to create an ObjectId, the integer replaces the timestamp.

In MongoDB, each document stored in a collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.

This also applies to documents inserted through update operations with upsert: true.

MongoDB clients should add an _id field with a unique ObjectId. Using ObjectIds for the _id field provides the following additional benefits:

  • in mongosh, you can access the creation time of the ObjectId, using the ObjectId.getTimestamp() method.

  • sorting on an _id field that stores ObjectId values is roughly equivalent to sorting by creation time.

    Important

    While ObjectId values should increase over time, they are not necessarily monotonic. This is because they:

    • Only contain one second of temporal resolution, so ObjectId values created within the same second do not have a guaranteed ordering, and

    • Are generated by clients, which may have differing system clocks.
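
The following sketch illustrates why ordering within one second is not guaranteed. It assembles ObjectId-like hex strings from explicit parts with a hypothetical objectIdHex helper. The timestamps, random values, and counters are made-up inputs chosen for the demonstration, not values produced by any driver.

```javascript
// Hypothetical helper: assemble ObjectId-like hex from explicit parts
// (4-byte seconds, 5-byte random value, 3-byte counter).
function objectIdHex(seconds, randomHex, counter) {
  return (
    seconds.toString(16).padStart(8, "0") +
    randomHex.padStart(10, "0") +
    (counter & 0xffffff).toString(16).padStart(6, "0")
  );
}

// Listed in true creation order. Clients A and B have different random values,
// and B's id is created later within the same second as A's first id.
const created = [
  { label: "A first", hex: objectIdHex(100, "ffffffffff", 5) },
  { label: "B second", hex: objectIdHex(100, "0000000001", 9) },
  { label: "A third", hex: objectIdHex(101, "ffffffffff", 6) },
];

// Equal-length lowercase hex strings compare the same way as the raw bytes.
const sorted = [...created].sort((x, y) =>
  x.hex < y.hex ? -1 : x.hex > y.hex ? 1 : 0
);
console.log(sorted.map((d) => d.label).join(", ")); // B second, A first, A third
```

The id from the later second still sorts last, but the two ids that share a second are ordered by their random bytes rather than by creation time.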

Use the ObjectId() methods to set and retrieve ObjectId values.

Starting in MongoDB 5.0, mongosh replaces the legacy mongo shell. The ObjectId() methods work differently in mongosh than in the legacy mongo shell. For more information on the legacy methods, see Legacy mongo Shell.

String

BSON strings are UTF-8. In general, drivers for each programming language convert from the language's string format to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in BSON strings with ease. [1] In addition, MongoDB $regex queries support UTF-8 in the regex string.

[1] For strings that use UTF-8 character sets, sort() produces a reasonably correct order. However, because sort() internally uses the C++ strcmp API, the sort order may handle some characters incorrectly.
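
To see what a byte-wise comparison does to non-ASCII and mixed-case text, the sketch below sorts a few strings by their UTF-8 bytes. It only approximates a strcmp-style comparison in plain JavaScript, and the compareUtf8Bytes helper is hypothetical. It does not call MongoDB.

```javascript
// Compare two strings by their UTF-8 bytes, roughly what a strcmp-style
// comparison does (an approximation for illustration only).
const encoder = new TextEncoder();
function compareUtf8Bytes(a, b) {
  const x = encoder.encode(a);
  const y = encoder.encode(b);
  const n = Math.min(x.length, y.length);
  for (let i = 0; i < n; i++) {
    if (x[i] !== y[i]) return x[i] - y[i];
  }
  return x.length - y.length; // a shorter prefix sorts first
}

// "\u00e9" is the precomposed character é, encoded in UTF-8 as 0xC3 0xA9.
const words = ["zebra", "\u00e9clair", "apple", "Zulu"];
console.log([...words].sort(compareUtf8Bytes).join(", "));
// Zulu, apple, zebra, éclair
```

Uppercase "Z" sorts before lowercase "a", and "é" sorts after "z", which is unlikely to match what a reader of either English or French would expect.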

Timestamps

BSON has a special timestamp type for internal MongoDB use that is not associated with the regular Date type. This internal timestamp type is a 64-bit value where:

  • the most significant 32 bits are a time_t value (seconds since the Unix epoch)

  • the least significant 32 bits are an incrementing ordinal for operations within a given second.

While the BSON format is little-endian, and therefore stores the least significant bits first, the mongod instance always compares the time_t value before the ordinal value on all platforms, regardless of endianness.
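
As a minimal sketch of this layout, the following JavaScript packs a seconds value and an ordinal into one 64-bit number using BigInt. The packTimestamp and unpackTimestamp helpers are hypothetical, and both inputs are assumed to be non-negative integers that fit in 32 bits. Because the seconds occupy the high bits, comparing the packed values compares time first and ordinal second.

```javascript
// Hypothetical helpers: pack and unpack the 64-bit timestamp value.
function packTimestamp(seconds, ordinal) {
  // Seconds go in the most significant 32 bits, the ordinal in the low 32 bits.
  return (BigInt(seconds) << 32n) | BigInt(ordinal);
}

function unpackTimestamp(value) {
  return {
    seconds: Number(value >> 32n),
    ordinal: Number(value & 0xffffffffn),
  };
}

const first = packTimestamp(1412180887, 1);
const later = packTimestamp(1412180887, 2);      // same second, next ordinal
const nextSecond = packTimestamp(1412180888, 0); // next second, ordinal reset

console.log(first < later && later < nextSecond); // true
console.log(unpackTimestamp(first).seconds);      // 1412180887
console.log(unpackTimestamp(first).ordinal);      // 1
```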

Within a single mongod instance, timestamp values are always unique.

In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON timestamp value.

Note

The BSON timestamp type is for internal MongoDB use. For most cases, in application development, you will want to use the BSON date type. See Date for more information.

When inserting a document that contains top-level fields with empty timestamp values, MongoDB replaces the empty timestamp values with the current timestamp value, with one exception: if the _id field itself contains an empty timestamp value, it is always inserted as is and not replaced.

Example

Insert a document with an empty timestamp value:

db.test.insertOne( { ts: new Timestamp() } );

Running db.test.find() would then return a document which resembles the following:

{ "_id" : ObjectId("542c2b97bac0595474108b48"), "ts" : Timestamp(1412180887, 1) }

The server has replaced the empty timestamp value for ts with the timestamp value at time of insert.

Date

BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This results in a representable date range of about 290 million years into the past and future.
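
The range figure can be checked with a short calculation. The sketch below uses a hypothetical signedRangeInYears helper and assumes an average year of 365.25 days. It also shows a negative millisecond count mapping to a date before 1970. Note that the JavaScript Date object itself supports a narrower range (about ±8.64e15 milliseconds) than the 64-bit BSON value.

```javascript
// Hypothetical helper: how many whole years fit in a signed integer of
// `bits` bits that counts milliseconds (assuming 365.25-day years).
function signedRangeInYears(bits) {
  const maxMillis = 2n ** BigInt(bits - 1) - 1n;
  const millisPerYear = (1000n * 60n * 60n * 24n * 36525n) / 100n;
  return maxMillis / millisPerYear; // BigInt division truncates
}

console.log(signedRangeInYears(64)); // 292271023n

// A negative millisecond count is a moment before the Unix epoch.
console.log(new Date(-1000).toISOString()); // 1969-12-31T23:59:59.000Z
```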

The official BSON specification refers to the BSON Date type as the UTC datetime.

The BSON Date type is signed. [2] Negative values represent dates before 1970.

Example

Construct a Date using the new Date() constructor in mongosh:

var mydate1 = new Date()

Example

Construct a Date using the ISODate() constructor in mongosh:

var mydate2 = ISODate()

Example

Return the Date value as string:

mydate1.toString()

Example

Return the month portion of the Date value; months are zero-indexed, so that January is month 0:

mydate1.getMonth()

[2] Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date fields. Because indexes are not recreated when upgrading, re-index if you created an index on Date values with an earlier version and dates before 1970 are relevant to your application.
