Documents map to the objects in your code, so they are much more natural to work with. There is no need to decompose data across tables, run expensive JOINs, or integrate a separate ORM layer. Data that is accessed together is stored together, so you have less code to write and your users get higher performance.
A document’s schema is dynamic and self-describing, so you don’t need to first pre-define it in the database. Fields can vary from document to document and you modify the structure at any time, avoiding disruptive schema migrations. Some document databases offer JSON Schema so you can optionally enforce rules governing document structures.
Lightweight, language-independent, and human readable, JSON has become an established standard for data interchange and storage. Documents are a superset of all other data models so you can structure data any way your application needs – rich objects, key-value pairs, tables, geospatial and time-series data, and the nodes and edges of a graph. You can work with documents using a single query language, giving you a consistent development experience however you’ve chosen to model your data.
An important difference between document databases is the expressivity of the query language and richness of indexing. The MongoDB Query Language is comprehensive and expressive. Ad hoc queries, indexing, and real time aggregations provide powerful ways to access, transform, and analyze your data. With ACID transactions you maintain the same guarantees you’re used to in SQL databases, whether manipulating data in a single document, or across multiple documents living in multiple shards.
Unlike monolithic, scale-up relational databases, document databases are distributed systems at their core. Documents are independent units which makes it easier to distribute them across multiple servers while preserving data locality. Replication with self-healing recovery keeps your applications highly available while giving you the ability to isolate different workloads from one another in a single cluster. Native sharding provides elastic and application-transparent horizontal scale-out to accommodate your workload’s growth, along with geographic data distribution for data sovereignty.
With document databases empowering developers to build faster, most relational databases have added support for JSON. However, simply adding a JSON data type does not bring the benefits of a native document database. Why? Because the relational approach detracts from developer productivity, rather than improve it. These are some of the things developers have to deal with.
Working with documents means using custom, vendor-specific SQL functions which will not be familiar to most developers, and which don’t work with your favorite SQL tools. Add low-level JDBC/ODBC drivers and ORMs and you face complex development processes resulting in low productivity.
Presenting JSON data as simple strings and numbers rather than the rich data types supported by native document databases such as MongoDB makes computing, comparing, and sorting data complex and error prone.
Relational databases offer little to validate the schema of documents, so you have no way to apply quality controls against your JSON data. And you still need to define a schema for your regular tabular data, with all the overhead that comes when you need to alter your tables as your application’s features evolve.
Most relational databases do not maintain statistics on JSON data, preventing the query planner from optimizing queries against documents, and you from tuning your queries.
Traditional relational databases offer no way for you to partition (“shard”) the database across multiple instances to scale as workloads grow. Instead you have to implement sharding yourself in the application layer, or rely on expensive scale-up systems.