BSON Documents¶
MongoDB is a document-based database system; as a result, all records in MongoDB are documents. Documents are the default representation of most user-accessible data structures in the database. Documents provide structure for data in the following MongoDB contexts:
- the records stored in collections
- the query selectors that determine which records to select for read, update, and delete operations
- the update actions that specify the particular field updates to perform during an update operation
- the specification of indexes for a collection
- arguments to several MongoDB methods and operators, including:
  - sort order for the sort() method.
  - index specification for the hint() method.
- the output of a number of MongoDB commands and operations, including:
  - the output of the collStats command, and
  - the output of the serverStatus command.
Structure¶
Documents in MongoDB are BSON objects with support for the full range of BSON types; however, BSON documents are conceptually similar to JSON objects: each document is a set of field and value pairs.
Because they support the full range of BSON types, MongoDB documents may contain field and value pairs where the value can be another document, an array, or an array of documents, as well as the basic types such as Double, String, and Date. See also BSON Type Considerations.
Consider a document that contains values of varying types, with the following fields:
- _id that holds an ObjectId.
- name that holds a subdocument that contains the fields first and last.
- birth and death, which both have Date types.
- contribs that holds an array of strings.
- views that holds a value of NumberLong type.
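The fields listed above could be sketched as follows. This is a plain JavaScript approximation with illustrative values, not the page's original example: the mongo shell wrappers ObjectId() and NumberLong() exist only in the shell, so a hex string and a BigInt stand in for them here (both stand-ins are assumptions made for this sketch).

```javascript
// Sketch of a document with values of varying types (all values are illustrative).
// In the mongo shell, _id would be written as ObjectId("...") and views as
// NumberLong(...); the stand-ins below only keep this runnable outside the shell.
const mydoc = {
  _id: "5099803df3f4948bd2f98391",           // stand-in for an ObjectId (24 hex characters)
  name: { first: "Alan", last: "Turing" },   // subdocument with two string fields
  birth: new Date("1912-06-23T00:00:00Z"),   // Date
  death: new Date("1954-06-07T00:00:00Z"),   // Date
  contribs: ["Turing machine", "Turing test", "Turingery"], // array of strings
  views: 1250000n                            // stand-in for a NumberLong (64-bit integer)
};

console.log(Object.keys(mydoc));
```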
All field names are strings in BSON documents. Be aware that there are some restrictions on field names for BSON documents: field names cannot contain null characters, dots (.), or dollar signs ($).
Note
BSON documents may have more than one field with the same name; however, most MongoDB interfaces represent MongoDB documents with a structure (e.g. a hash table) that does not support duplicate field names. If you need to manipulate documents that have more than one field with the same name, see your driver’s documentation for more information.
Some documents created by internal MongoDB processes may have duplicate fields, but no MongoDB process will ever add duplicate keys to an existing user document.
Type Operators¶
To determine the type of fields, the mongo shell provides the following operators:
- instanceof returns a boolean to test if a value has a specific type.
- typeof returns the type of a field.
Example
Consider the following uses of instanceof and typeof:
- Using instanceof to test whether the _id field is of type ObjectId returns true.
- Using typeof on the _id field returns the more generic object type rather than the ObjectId type.
Dot Notation¶
MongoDB uses dot notation to access the elements of an array and the fields of a subdocument.
To access an element of an array by its zero-based index position, concatenate the array name, a dot (.), and the zero-based index position, as in '<array>.<index>'.
To access a field of a subdocument, concatenate the subdocument name, a dot (.), and the field name, as in '<subdocument>.<field>'.
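As a rough illustration of how a dotted path maps onto nested data, the sketch below resolves a path against a plain JavaScript object. The helper name getByPath and the sample document are assumptions made for this example; MongoDB's own path resolution is done by the server and covers more cases (for instance, arrays of subdocuments) than this sketch does.

```javascript
// Resolve a dot-notation path such as "name.last" or "contribs.1"
// against a plain object. Illustrative only, not MongoDB's implementation.
function getByPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value === null || value === undefined ? undefined : value[key]),
    doc
  );
}

// Illustrative document.
const bio = {
  name: { first: "Alan", last: "Turing" },
  contribs: ["Turing machine", "Turing test", "Turingery"]
};

console.log(getByPath(bio, "contribs.1")); // "Turing test" (zero-based index 1)
console.log(getByPath(bio, "name.last"));  // "Turing"
console.log(getByPath(bio, "name.middle")); // undefined (no such field)
```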
See also
- Subdocuments for dot notation examples with subdocuments.
- Arrays for dot notation examples with arrays.
Document Types in MongoDB¶
Record Documents¶
Most documents in MongoDB collections store data from users’ applications.
These documents have the following attributes:
- The maximum BSON document size is 16 megabytes.
  The maximum document size helps ensure that a single document cannot use an excessive amount of RAM or, during transmission, an excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides the GridFS API. See mongofiles and the documentation for your driver for more information about GridFS.
- Documents have the following restrictions on field names:
  - The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array.
  - The field names cannot start with the $ character.
  - The field names cannot contain the . character.
Note
Most MongoDB driver clients will include the _id field and generate an ObjectId before sending the insert operation to MongoDB; however, if the client sends a document without an _id field, the mongod will add the _id field and generate the ObjectId.
Consider a document that specifies a record in a collection and contains the following fields:
- _id, which must hold a unique value and is immutable.
- name that holds another document. This subdocument contains the fields first and last, which both hold strings.
- birth and death that both have date types.
- contribs that holds an array of strings.
- awards that holds an array of documents.
Consider the following behavior and constraints of the _id field in MongoDB documents:
- In documents, the _id field is always indexed for regular collections.
- The _id field may contain values of any BSON data type other than an array.
Consider the following options for the value of an _id field:
- Use an ObjectId. See the ObjectId documentation.
  Although it is common to assign ObjectId values to _id fields, if your objects have a natural unique identifier, consider using that for the value of _id to save space and to avoid an additional index.
- Generate a sequence number for the documents in your collection in your application and use this value for the _id value. See the Create an Auto-Incrementing Sequence Field tutorial for an implementation pattern.
- Generate a UUID in your application code. For more efficient storage of the UUID values in the collection and in the _id index, store the UUID as a value of the BSON BinData type.
  Index keys that are of the BinData type are more efficiently stored in the index if:
  - the binary subtype value is in the range of 0-7 or 128-135, and
  - the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32.
- Use your driver’s BSON UUID facility to generate UUIDs. Be aware that driver implementations may implement UUID serialization and deserialization logic differently, which may not be fully compatible with other drivers. See your driver documentation for information concerning UUID interoperability.
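To see why a UUID fits the efficient BinData lengths listed above, the following sketch converts a UUID string to its raw bytes. It assumes a Node.js environment (for the global Buffer) and uses an arbitrary example UUID. It only demonstrates the byte length; for actual storage, the driver's own UUID or BinData facility should be used.

```javascript
// Illustrative UUID string (any valid UUID would do).
const uuidString = "123e4567-e89b-12d3-a456-426614174000";

// Strip the dashes and decode the 32 hex digits into 16 raw bytes.
const uuidBytes = Buffer.from(uuidString.replace(/-/g, ""), "hex");

// Byte-array lengths listed above as efficiently stored index keys.
const efficientLengths = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 32];

console.log(uuidString.length);                            // 36 characters as a string
console.log(uuidBytes.length);                             // 16 bytes as binary data
console.log(efficientLengths.includes(uuidBytes.length));  // true
```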
Query Specification Documents¶
Query documents specify the conditions that determine which records to select for read, update, and delete operations. You can use <field>:<value> expressions to specify the equality condition and query operator expressions to specify additional conditions.
When passed as an argument to methods such as the find() method, the remove() method, or the update() method, the query document selects documents for MongoDB to return, remove, or update.
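A few query documents might look like the following sketch. The values and the collection name bios in the comments are hypothetical; the objects themselves are ordinary JavaScript object literals, which is how query documents are written in the mongo shell.

```javascript
// Equality condition in <field>:<value> form.
const byId = { _id: 1 };

// Query operator expression: _id greater than 3.
const idGreaterThan3 = { _id: { $gt: 3 } };

// Equality condition on a subdocument field, using dot notation.
const byLastName = { "name.last": "Backus" };

// In the mongo shell these would be passed to methods, for example
// (bios is a hypothetical collection name):
//   db.bios.find(byId)
//   db.bios.remove(idGreaterThan3)
//   db.bios.update(byLastName, <update document>)
console.log(JSON.stringify(idGreaterThan3)); // {"_id":{"$gt":3}}
```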
See also
- Query Document and Read for more examples on selecting documents for reads.
- Update for more examples on selecting documents for updates.
- Delete for more examples on selecting documents for deletes.
Update Specification Documents¶
Update documents specify the data modifications to perform during an update() operation to modify existing records in a collection. You can use update operators to specify the exact actions to perform on the document fields.
Consider an update document that uses the $set and $push operators. When passed as an argument to the update() method, the update actions document:
- Modifies the field name whose value is another document. Specifically, the $set operator updates the middle field in the name subdocument. The document uses dot notation to access a field in a subdocument.
- Adds an element to the field awards whose value is an array. Specifically, the $push operator adds another document as an element to the field awards.
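The two actions described above can be sketched as an update document plus a toy applier that mimics their effect on a plain object. The values are illustrative, and applyUpdate is a hypothetical helper written for this example: it handles only $set and $push and is not how the server implements updates.

```javascript
// Update document with illustrative values.
const updateDoc = {
  $set: { "name.middle": "Warner" },
  $push: { awards: { award: "IBM Fellow", year: 1963, by: "IBM" } }
};

// Toy applier (NOT MongoDB's implementation): supports only $set and $push.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // deep copy of plain data
  for (const [path, value] of Object.entries(update.$set || {})) {
    const keys = path.split(".");
    const last = keys.pop();
    let target = result;
    for (const key of keys) {
      if (target[key] === undefined) target[key] = {}; // create missing subdocuments
      target = target[key];
    }
    target[last] = value;
  }
  for (const [field, value] of Object.entries(update.$push || {})) {
    if (result[field] === undefined) result[field] = []; // create the array if absent
    result[field].push(value);
  }
  return result;
}

const before = { _id: 1, name: { first: "John", last: "Backus" }, awards: [] };
const after = applyUpdate(before, updateDoc);

console.log(after.name.middle);    // "Warner"
console.log(after.awards.length);  // 1
console.log(before.awards.length); // 0 (the input is left untouched)
```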
See also
- update operators page for the available update operators and syntax.
- update for more examples on update documents.
For additional examples of updates that involve array elements, including where the elements are documents, see the $ positional operator.
Index Specification Documents¶
Index specification documents describe the fields to index during index creation. See indexes for an overview of indexes. [1]
Index documents contain field and value pairs in the form { field: value }, where:
- field is the field in the documents to index.
- value is either 1 for ascending or -1 for descending.
An index document can, for example, specify a compound index on the _id field and the last field contained in the subdocument name field. Such a document uses dot notation to access a field in a subdocument.
When passed as an argument to the ensureIndex() method, the index document specifies the index to create.
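An index document covering _id and the last field of the name subdocument could be written as in the sketch below. The collection name bios in the comment is hypothetical.

```javascript
// Index on _id (ascending) and on the last field of the name subdocument (ascending).
const indexSpec = { _id: 1, "name.last": 1 };

// In the mongo shell this would be passed to ensureIndex(), for example
// (bios is a hypothetical collection name):
//   db.bios.ensureIndex(indexSpec)

// The order of the fields in the document is the order of the index keys.
console.log(Object.keys(indexSpec)); // [ '_id', 'name.last' ]
```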
[1] | Indexes optimize a number of key read and write operations. |
Sort Order Specification Documents¶
Sort order documents specify the order of documents that a query returns. Pass sort order specification documents as an argument to the sort() method. See the sort() page for more information on sorting.
The sort order documents contain field and value pairs in the form { field: value }, where:
- field is the field by which to sort documents.
- value is either 1 for ascending or -1 for descending.
A sort order document can use fields from a subdocument. For example, with a subdocument name, a sort order document can sort first by the last field ascending, then by the first field, also ascending.
When passed as an argument to the sort() method, the sort order document sorts the results of the find() method.
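The effect of such a sort order document can be approximated in plain JavaScript with a comparator built from the document. The helper names and sample data are assumptions for this sketch; the server's own sorting follows BSON comparison rules that this example does not reproduce.

```javascript
// Sort by name.last ascending, then by name.first ascending.
const sortOrder = { "name.last": 1, "name.first": 1 };

// Resolve a dot-notation path against a plain object (illustrative helper).
function getByPath(doc, path) {
  return path.split(".").reduce((v, k) => (v == null ? undefined : v[k]), doc);
}

// Build a comparator from a sort order document (illustrative helper).
function compareBy(order) {
  return (a, b) => {
    for (const [path, direction] of Object.entries(order)) {
      const x = getByPath(a, path);
      const y = getByPath(b, path);
      if (x < y) return -direction;
      if (x > y) return direction;
    }
    return 0; // equal on every sort field
  };
}

const people = [
  { name: { first: "Grace", last: "Hopper" } },
  { name: { first: "John", last: "Backus" } },
  { name: { first: "Ada", last: "Hopper" } }
];

const sorted = [...people].sort(compareBy(sortOrder));
console.log(sorted.map((p) => p.name.first + " " + p.name.last));
// [ 'John Backus', 'Ada Hopper', 'Grace Hopper' ]
```

In the mongo shell, the same sort order document would simply be passed to sort() on the result of find().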
BSON Type Considerations¶
The following BSON types require special consideration:
ObjectId¶
ObjectIds are small, likely unique, fast to generate, and ordered. These values consist of 12 bytes, where the first 4 bytes are a timestamp that reflects the ObjectId’s creation. Refer to the ObjectId documentation for more information.
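As a minimal sketch of the layout described above, the creation time can be read from the first 4 bytes of an ObjectId's 24-character hex form. This assumes the timestamp is a big-endian count of seconds since the Unix epoch; the helper name and the sample value are illustrative.

```javascript
// Read the creation time from the first 4 bytes (8 hex characters) of an
// ObjectId's hex string. Assumes seconds since the Unix epoch, big-endian.
function objectIdTimestamp(hex) {
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Illustrative ObjectId value.
const created = objectIdTimestamp("5099803df3f4948bd2f98391");
console.log(created.toISOString());
```

Because the timestamp occupies the leading bytes, ObjectIds generated later compare greater than earlier ones at one-second granularity, which is what makes them roughly ordered by creation time.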
String¶
BSON strings are UTF-8. In general, drivers for each programming language convert from the language’s string format to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in BSON strings with ease. [2] In addition, MongoDB $regex queries support UTF-8 in the regex string.
[2] | Given strings using UTF-8 character sets, using sort() on strings will be reasonably correct; however, because internally sort() uses the C++ strcmp API, the sort order may handle some characters incorrectly. |
Timestamps¶
BSON has a special timestamp type for internal MongoDB use that is not associated with the regular Date type. Timestamp values are 64-bit values where:
- the first 32 bits are a time_t value (seconds since the Unix epoch)
- the second 32 bits are an incrementing ordinal for operations within a given second.
Within a single mongod instance, timestamp values are always unique.
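The 64-bit layout can be sketched with BigInt arithmetic. The sketch treats "first 32 bits" as the high-order half of the logical 64-bit value, which is an assumption about the logical value rather than a statement about byte order on the wire; the function names are hypothetical.

```javascript
// Pack seconds (high 32 bits) and ordinal (low 32 bits) into one 64-bit value.
function packTimestamp(seconds, ordinal) {
  return (BigInt(seconds) << 32n) | BigInt(ordinal);
}

// Split a packed 64-bit value back into its two 32-bit parts.
function unpackTimestamp(value) {
  return {
    seconds: Number(value >> 32n),
    ordinal: Number(value & 0xffffffffn)
  };
}

const packed = packTimestamp(1352237117, 7);
console.log(unpackTimestamp(packed)); // { seconds: 1352237117, ordinal: 7 }

// Values order by time first, then by ordinal within the same second.
console.log(packTimestamp(10, 5) < packTimestamp(11, 0)); // true
console.log(packTimestamp(10, 5) < packTimestamp(10, 6)); // true
```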
In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON timestamp value.
Note
The BSON Timestamp type is for internal MongoDB use. For most cases, in application development, you will want to use the BSON date type. See Date for more information.
If you create a BSON Timestamp using the empty constructor (e.g. new Timestamp()), MongoDB will only generate a timestamp if you use the constructor in the first field of the document. [3] Otherwise, MongoDB will generate an empty timestamp value (i.e. Timestamp(0, 0)).
Changed in version 2.1: the mongo shell displays the Timestamp value with the Timestamp() wrapper, as in Timestamp(0, 0).
Prior to version 2.1, the mongo shell displayed the Timestamp value as a document.
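[3] | If the first field in the document is _id, then you can generate a timestamp in the second field of the document. In that case, MongoDB will generate a Timestamp value even though the Timestamp() constructor is not in the first field of the document. |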
Date¶
BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). The official BSON specification refers to the BSON Date type as the UTC datetime.
Changed in version 2.0: BSON Date type is signed. [4] Negative values represent dates before 1970.
Consider the following examples of BSON Date operations in the mongo shell:
- Construct a Date using the new Date() constructor.
- Construct a Date using the ISODate() constructor.
- Return the Date value as a string.
- Return the month portion of the Date value; months are zero-indexed, so that January is month 0.
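These operations can be sketched with the standard JavaScript Date object, which the mongo shell also exposes. ISODate() is a shell-only helper, so it appears only in a comment, and the UTC accessors are used so the output does not depend on the local time zone. The date value is illustrative.

```javascript
// Construct a Date with the standard constructor.
// In the mongo shell, ISODate("1912-06-23T00:00:00Z") is another way to construct it.
const birth = new Date("1912-06-23T00:00:00Z");

// Return the Date value as a string.
console.log(birth.toISOString()); // "1912-06-23T00:00:00.000Z"

// Return the month portion; months are zero-indexed, so June is 5.
console.log(birth.getUTCMonth()); // 5

// Dates before 1970 are negative millisecond counts from the Unix epoch.
console.log(birth.getTime() < 0); // true
```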
[4] | Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date fields. Because indexes are not recreated when upgrading, please re-index if you created an index on Date values with an earlier version, and dates before 1970 are relevant to your application. |