JSON and BSON are close cousins, as their nearly identical names imply, but you wouldn’t know it by looking at them side by side. JSON, or JavaScript Object Notation, is the wildly popular standard for data interchange on the web, on which BSON (Binary JSON) is based. We’ll take a look at each, and hopefully shed some light on the JSON vs. BSON mystery: what’s the difference, and why does it matter?


What is JSON?

JSON, or JavaScript Object Notation, is a human-readable data interchange format, specified in the early 2000s. Even though JSON is based on a subset of the JavaScript programming language standard, it’s completely language-independent.

JSON objects are associative containers, wherein a string key is mapped to a value (which can be a number, string, boolean, array, null, or even another object). Almost every programming language has an implementation of this abstract data structure: objects in JavaScript, dictionaries in Python and C#, maps in Java and C++, and so on. JSON objects are easy for humans to understand and for machines to parse and generate:

{
  "_id": 1,
  "name": { "first" : "John", "last" : "Backus" },
  "contribs": [ "Fortran", "ALGOL", "Backus-Naur Form", "FP" ],
  "awards": [
    {
      "award": "W.W. McDowell Award",
      "year": 1967,
      "by": "IEEE Computer Society"
    }, {
      "award": "Draper Prize",
      "year": 1993,
      "by": "National Academy of Engineering"
    }
  ]
}
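As a rough illustration of how a JSON object maps onto a native structure, the sketch below parses a copy of the document above with Python's standard `json` module. Python is an arbitrary choice for the example, and the variable names are illustrative only.

```python
import json

# A copy of the example document, as JSON text.
text = """
{
  "_id": 1,
  "name": { "first": "John", "last": "Backus" },
  "contribs": [ "Fortran", "ALGOL", "Backus-Naur Form", "FP" ],
  "awards": [
    { "award": "W.W. McDowell Award", "year": 1967, "by": "IEEE Computer Society" },
    { "award": "Draper Prize", "year": 1993, "by": "National Academy of Engineering" }
  ]
}
"""

doc = json.loads(text)                        # JSON object -> Python dict
last_name = doc["name"]["last"]               # nested object -> nested dict
first_award_year = doc["awards"][0]["year"]   # JSON array -> Python list

# Serializing and parsing again yields an equal structure.
round_tripped = json.loads(json.dumps(doc))
```

The same text could be parsed in any other language with a JSON library, which is what makes the format language-independent.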

As JavaScript became the leading language for web development, JSON began to take on a life of its own. Because it is both human- and machine-readable, and comparatively simple to support in other languages, JSON quickly moved beyond the web page and into software everywhere.

Today, JSON shows up in many different contexts:

  • APIs
  • Configuration files
  • Log messages
  • Database storage

The MongoDB-JSON connection

MongoDB was designed from its inception to be a database focused on delivering a great development experience. JSON’s ubiquity made it the obvious choice for representing data structures in MongoDB’s document data model.

However, there are several issues that make JSON less than ideal for usage inside of a database.

  1. JSON only supports a limited number of basic data types. Most notably, JSON lacks support for dates and binary data.

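  2. JSON objects and properties don’t have a fixed length, and the text carries no length information, so a parser has to scan a value in full to find where it ends. This makes traversal slower.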
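The first limitation is easy to reproduce. The sketch below, which assumes Python's standard `json` module as a stand-in for any strict JSON encoder, shows that basic types serialize while a date or raw bytes do not. The helper `try_dump` is a hypothetical name introduced for this example.

```python
import json
from datetime import datetime, timezone

def try_dump(value):
    """Return the JSON text for value, or the error message if it cannot be encoded."""
    try:
        return json.dumps(value)
    except TypeError as exc:
        return f"TypeError: {exc}"

# Numbers, strings, booleans, arrays, and null are all fine.
ok = try_dump({"year": 1967, "tags": ["a", "b"], "active": True, "note": None})

# Dates and binary data have no JSON representation, so encoding fails.
date_result = try_dump({"created": datetime(2024, 5, 2, tzinfo=timezone.utc)})
bytes_result = try_dump({"payload": b"\x00\x01"})
```

In practice, applications work around this by agreeing on a string convention for such values, which every reader then has to know about.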

In order to make MongoDB JSON-first, but still high performance and general purpose, BSON was invented to bridge the gap: a binary representation to store data in JSON format, optimized for speed, space, and efficiency. In approach, it is similar to other binary interchange formats such as Protocol Buffers or Thrift.

What is BSON?

BSON stands for “Binary JSON,” and that’s exactly what it was invented to be. BSON’s binary structure encodes type and length information, which allows it to be traversed much more quickly compared to JSON.

BSON adds some non-JSON-native data types, like dates and binary data, without which MongoDB would have been missing some valuable support.
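To see why type and length information helps traversal, here is a minimal sketch of a field lookup over raw BSON bytes. It assumes the layout described in the BSON specification (a type byte, a NUL-terminated field name, then the value, with strings and embedded documents carrying a 4-byte little-endian size). The sample bytes are assembled by hand for this example, and `find_field` and `value_size` are hypothetical helpers, not part of any MongoDB driver.

```python
import struct

# The array ["awesome", 5.05, 1986] as a BSON array value (38 bytes).
ARRAY = (
    b"\x26\x00\x00\x00"
    b"\x02\x30\x00\x08\x00\x00\x00awesome\x00"
    b"\x01\x31\x00\x33\x33\x33\x33\x33\x33\x14\x40"
    b"\x10\x32\x00\xc2\x07\x00\x00"
    b"\x00"
)

# Document {"BSON": [...], "n": 1986}: size prefix, two elements, terminator.
body = b"\x04BSON\x00" + ARRAY + b"\x10n\x00" + struct.pack("<i", 1986)
data = struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"

def value_size(type_byte, data, pos):
    """Number of bytes occupied by the value that starts at pos."""
    if type_byte in (0x01, 0x12):        # double, int64: fixed 8 bytes
        return 8
    if type_byte == 0x10:                # int32: fixed 4 bytes
        return 4
    (n,) = struct.unpack_from("<i", data, pos)
    if type_byte == 0x02:                # string: 4-byte length, then that many bytes
        return 4 + n
    if type_byte in (0x03, 0x04):        # document/array: size covers the whole value
        return n
    raise ValueError(f"type 0x{type_byte:02x} not handled in this sketch")

def find_field(data, wanted):
    """Return (type_byte, value_offset) of a top-level field, or None."""
    (total,) = struct.unpack_from("<i", data, 0)
    if total != len(data):
        raise ValueError("size prefix does not match buffer length")
    pos = 4
    while data[pos] != 0x00:             # 0x00 marks the end of the document
        type_byte = data[pos]
        name_end = data.index(b"\x00", pos + 1)
        name = data[pos + 1:name_end].decode("utf-8")
        value_pos = name_end + 1
        if name == wanted:
            return type_byte, value_pos
        # Jump over the value without decoding it.
        pos = value_pos + value_size(type_byte, data, value_pos)
    return None

type_byte, offset = find_field(data, "n")
(n_value,) = struct.unpack_from("<i", data, offset)
```

To reach `n`, the lookup reads the array's 4-byte size and jumps over all 38 bytes without decoding a single element. A JSON parser would have to scan every character of the array to find its closing bracket.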

The following are some example JSON objects and their corresponding BSON representations.

{"hello": "world"} →
\x16\x00\x00\x00           // total document size
\x02                       // 0x02 = type String
hello\x00                  // field name
\x06\x00\x00\x00world\x00  // field value
\x00                       // 0x00 = type EOO ('end of object')
 
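{"BSON": ["awesome", 5.05, 1986]} →
\x31\x00\x00\x00                               // total document size (49 bytes)
 \x04BSON\x00                                  // 0x04 = type Array, field name "BSON"
 \x26\x00\x00\x00                              // size of the array document (38 bytes)
 \x02\x30\x00\x08\x00\x00\x00awesome\x00       // 0x02 = String, key "0", length 8, value
 \x01\x31\x00\x33\x33\x33\x33\x33\x33\x14\x40  // 0x01 = Double, key "1", 5.05 (little-endian)
 \x10\x32\x00\xc2\x07\x00\x00                  // 0x10 = 32-bit integer, key "2", 1986
 \x00                                          // end of array
 \x00                                          // end of document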

You can learn more about the BSON grammar in the BSON specification.
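The two byte dumps above can be reproduced with a few lines of code. The following is a minimal sketch of an encoder, assuming only the handful of types used in the examples (plus booleans and embedded documents) and treating every Python `int` as a 32-bit integer. It is an illustration of the format, not a substitute for a driver's BSON library, and the function names are made up for this example.

```python
import struct

def encode_document(doc):
    """Encode a dict as BSON: int32 size, elements, then a 0x00 terminator."""
    body = b"".join(encode_element(name, value) for name, value in doc.items())
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"

def encode_element(name, value):
    """Encode one field as type byte + NUL-terminated name + value."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, bool):              # check bool first: it is a subclass of int
        return b"\x08" + key + (b"\x01" if value else b"\x00")
    if isinstance(value, float):             # 0x01 = 64-bit double
        return b"\x01" + key + struct.pack("<d", value)
    if isinstance(value, int):               # 0x10 = int32 (larger values raise struct.error)
        return b"\x10" + key + struct.pack("<i", value)
    if isinstance(value, str):               # 0x02 = length-prefixed UTF-8 string
        raw = value.encode("utf-8") + b"\x00"
        return b"\x02" + key + struct.pack("<i", len(raw)) + raw
    if isinstance(value, list):              # 0x04 = array, stored as a document keyed "0", "1", ...
        as_doc = {str(i): item for i, item in enumerate(value)}
        return b"\x04" + key + encode_document(as_doc)
    if isinstance(value, dict):              # 0x03 = embedded document
        return b"\x03" + key + encode_document(value)
    raise TypeError(f"type {type(value).__name__} not handled in this sketch")

hello = encode_document({"hello": "world"})
awesome = encode_document({"BSON": ["awesome", 5.05, 1986]})
```

Note that an array is encoded as an ordinary document whose keys are the element positions, which is why `\x30`, `\x31`, and `\x32` (the characters "0", "1", and "2") appear in the second dump.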

Does MongoDB use BSON or JSON?

MongoDB stores data in BSON format both internally and over the network, but that doesn’t mean you can’t think of MongoDB as a JSON database. Anything you can represent in JSON can be natively stored in MongoDB, and retrieved just as easily in JSON.

When using the MongoDB driver for your favorite programming language, you work with the native data structures for that language. The driver will take care of converting the data to BSON and back when querying the database.

Unlike systems that store JSON as string-encoded values, or binary-encoded blobs, MongoDB uses BSON to offer powerful indexing and querying features on top of the web’s most popular data format.

For example, MongoDB allows developers to query and manipulate objects by specific keys inside the JSON/BSON document, even in nested documents many layers deep into a record, and create high-performance indexes on those same keys and values.
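The idea of addressing a key inside a nested document can be sketched in plain Python. The example below imitates dot-separated paths such as `name.last` and builds a toy lookup table on one of them. Both `get_path` and `build_index` are hypothetical helpers written for this illustration, the second sample document is made up, and a real MongoDB index is a far more capable on-disk structure than the dictionary used here.

```python
def get_path(doc, path):
    """Follow a dot-separated path through nested dicts and lists."""
    current = doc
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]   # numeric segment selects an array element
        else:
            current = current[part]
    return current

def build_index(docs, path):
    """Map each value found at path to the _ids of the documents holding it."""
    index = {}
    for d in docs:
        try:
            key = get_path(d, path)
        except (KeyError, IndexError, ValueError, TypeError):
            continue                       # documents without the field are skipped
        index.setdefault(key, []).append(d["_id"])
    return index

docs = [
    {"_id": 1, "name": {"first": "John", "last": "Backus"},
     "awards": [{"award": "W.W. McDowell Award", "year": 1967}]},
    {"_id": 2, "name": {"first": "Ada", "last": "Lovelace"}},  # made-up second record
]

last_name = get_path(docs[0], "name.last")
first_year = get_path(docs[0], "awards.0.year")
by_last_name = build_index(docs, "name.last")
```

With the lookup table in place, finding every document whose `name.last` is a given value is a single dictionary access rather than a scan of all documents.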

There are a few caveats to keep in mind when moving between BSON and native data structures. First, BSON documents may contain Date or Binary objects that are not natively representable in pure JSON. Second, each programming language has its own object semantics. BSON documents keep their fields in a defined order, for instance, while the JSON specification treats an object as an unordered collection, and the closest native structure in each language makes its own guarantees (Python dictionaries, the nearest analogue to JavaScript objects, only guarantee insertion order from Python 3.7 onward). Differences in numeric and string data types can also come into play. Third, BSON supports a variety of numeric types that are not native to JSON, and many languages represent these differently.
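One common way to carry BSON-only types through plain JSON is to wrap them in tagged objects. The sketch below is loosely modeled on the `$date` and `$binary` wrappers of MongoDB Extended JSON, simplified for illustration. The exact output of a real driver may differ, and `to_extended` is a hypothetical helper name.

```python
import base64
import json
from datetime import datetime, timezone

def to_extended(value):
    """Fallback used by json.dumps for values it cannot encode itself."""
    if isinstance(value, datetime):
        # Assumes a timezone-aware datetime; rendered as ISO-8601 in UTC.
        stamp = value.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
        return {"$date": stamp}
    if isinstance(value, (bytes, bytearray)):
        encoded = base64.b64encode(value).decode("ascii")
        return {"$binary": {"base64": encoded, "subType": "00"}}
    raise TypeError(f"type {type(value).__name__} not handled in this sketch")

doc = {
    "created": datetime(2024, 5, 2, 12, 0, tzinfo=timezone.utc),
    "payload": b"\x00\x01\x02",
}
text = json.dumps(doc, default=to_extended)
```

The result is valid JSON, but a reader only recovers a date or binary value if it knows the wrapper convention, which is the kind of detail the driver documentation spells out.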

Check your driver documentation to make sure you understand how to best access MongoDB BSON-backed data in your language.

JSON vs BSON

| | JSON | BSON |
|---|---|---|
| Encoding | UTF-8 String | Binary |
| Data Support | String, Boolean, Number, Array, Object, null | String, Boolean, Number (Integer, Float, Long, Decimal128...), Array, Object, null, Date, BinData |
| Readability | Human and Machine | Machine Only |

JSON and BSON are indeed close cousins by design. BSON is designed as a binary representation of JSON data, with specific extensions for broader applications, and optimized for data storage and traversal. Just like JSON, BSON supports embedding objects and arrays.

One particular way in which BSON differs from JSON is in its support for some more advanced types of data. JSON does not, for instance, differentiate between integers (whole numbers) and floating-point numbers (which can carry a fractional part, to a precision that depends on their width).

Most server-side programming languages have more sophisticated numeric types (commonly integers of several widths, single-precision floating point aka “float”, and double-precision floating point aka “double”), each with its own optimal usage for efficient mathematical operations. BSON follows suit, distinguishing 32-bit integers, 64-bit integers, doubles, and Decimal128 values.
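A short sketch shows why the distinction matters. It uses Python's standard `json` and `struct` modules; the `struct` format codes stand in for the fixed-width numeric types a binary format can store, and the variable names are illustrative.

```python
import json
import struct

# JSON has a single "number" type; Python's parser guesses int or float from the syntax.
parsed = json.loads('{"a": 1, "b": 1.0, "c": 9007199254740993}')
types = {key: type(value).__name__ for key, value in parsed.items()}

# A 64-bit double cannot represent every integer above 2**53, so a consumer
# that reads all JSON numbers as doubles silently changes this value.
big = 9007199254740993                      # 2**53 + 1
as_double = float(big)
exact_after_double = int(as_double) == big

# Fixed-width binary encodings make the choice explicit.
int32_size = len(struct.pack("<i", 1986))   # 32-bit integer
int64_size = len(struct.pack("<q", big))    # 64-bit integer holds big exactly
double_size = len(struct.pack("<d", 5.05))  # 64-bit double

try:
    struct.pack("<i", 2**31)                # one past the largest int32
    fits_int32 = True
except struct.error:
    fits_int32 = False
```

Because each stored value is tagged with its type, a reader does not have to guess whether `1` was meant as an integer or a double.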

Schema flexibility and data governance

One of the big attractions for developers using databases with JSON and BSON data models is the dynamic and flexible schema they provide when compared to the rigid, tabular data models used by relational databases.

Firstly, MongoDB documents are polymorphic — fields can vary from document to document within a single collection (analogous to tables in a relational database). This flexibility makes it easier to model data of any structure and adapt the model as requirements change.

Secondly, there is no need to declare the structure of documents to the database – documents are self-describing. Developers can start writing code and persist objects as they are created.

Thirdly, if a new field needs to be added to a document, it can be created without affecting all other documents in the collection, without updating a central system catalog, and without taking the database offline. When you need to make changes to the data model, the document database continues to store the updated objects without the need to perform costly ALTER TABLE operations — or worse, having to redesign the schema from scratch.

Through these advantages, the flexibility of the document data model is well suited to the demands of modern application development practices.

While a flexible schema is a powerful feature, there are situations where you might want more control over the data structure and content of your documents. Most document databases push enforcement of these controls back to the developer to implement in application code. However, more advanced document databases provide schema validation, using approaches such as JSON Schema, an IETF draft specification that MongoDB has adopted.
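As a rough sketch of what such validation involves, the example below describes a document shape using the `bsonType`, `required`, and `properties` keywords that MongoDB's `$jsonSchema` operator accepts, and checks documents against it with a toy validator. The `validate` function is a hypothetical stand-in covering only those three keywords. In MongoDB the database itself performs the check, with much broader keyword support.

```python
# Python types standing in for a few BSON type names (an assumption of this sketch).
BSON_TYPES = {"string": str, "int": int, "double": float,
              "bool": bool, "object": dict, "array": list}

def matches(value, bson_type):
    """True if value is acceptable for the named type."""
    if bson_type == "int" and isinstance(value, bool):
        return False                       # bool is a subclass of int in Python
    return isinstance(value, BSON_TYPES[bson_type])

def validate(doc, schema):
    """Return a list of problems; an empty list means the document passes."""
    errors = []
    for field in schema.get("required", []):
        if field not in doc:
            errors.append(f"missing required field: {field}")
    for field, rule in schema.get("properties", {}).items():
        if field in doc and not matches(doc[field], rule["bsonType"]):
            errors.append(f"{field} is not of type {rule['bsonType']}")
    return errors

schema = {
    "bsonType": "object",
    "required": ["award", "year"],
    "properties": {
        "award": {"bsonType": "string"},
        "year": {"bsonType": "int"},
    },
}

# Extra fields such as "by" are still allowed, so documents stay polymorphic.
good = validate({"award": "Draper Prize", "year": 1993, "by": "NAE"}, schema)
missing = validate({"award": "Draper Prize"}, schema)
wrong_type = validate({"award": "Draper Prize", "year": "1993"}, schema)
```

The schema constrains only the fields it names, so teams can lock down the parts of a document that matter while leaving the rest flexible.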