Quick Comparison Table
|Feature||DynamoDB||MongoDB|
|Deployment Options||As a service on Amazon Web Services||Anywhere, from a developer's laptop to an on-prem data center or any public cloud; also available as a managed service with MongoDB Atlas|
|Querying||Key-value queries||Query and analyze data in multiple ways: by single keys, ranges, faceted search, graph traversals, and geospatial queries through to complex aggregations|
|Monitoring||Limited visibility into real-time database behavior||MongoDB Atlas tracks 100+ metrics that could impact performance|
|Backup||No standardized backup service for DynamoDB||MongoDB Atlas includes continuous, queryable backups with point-in-time recovery|
|Throughput Limits||Up to 10k ops/sec. Additional throughput can be requested with a web form.||None|
|Pricing||Throughput-based pricing. A wide range of inputs may affect price. See Pricing & Commercial Considerations.||MongoDB Atlas pricing is based on RAM, I/O, and storage.|
What is DynamoDB?
DynamoDB is a proprietary NoSQL database service built by Amazon and offered as part of the Amazon Web Services (AWS) portfolio.
The name comes from Dynamo, a highly available key-value store developed in response to holiday outages on the Amazon e-commerce platform in 2004. Initially, however, few teams within Amazon adopted Dynamo due to its high operational complexity and the trade-offs that needed to be made between performance, reliability, query flexibility, and data consistency.
Around the same time, Amazon found that its developers enjoyed using SimpleDB, its primary NoSQL database service at the time, which allowed users to offload database administration work. But SimpleDB, which is no longer being updated by Amazon, had severe limitations when it came to scale: its strict storage limit of 10 GB and the limited number of operations it could support per second made it viable only for small workloads.
DynamoDB, which was launched as a database service on AWS in 2012, was built to address the limitations of both SimpleDB and Dynamo.
What is MongoDB?
MongoDB is an open source NoSQL database built by MongoDB, Inc. The company was established in 2007 by former executives and engineers from DoubleClick, which Google acquired and now uses as the backbone of its advertising products. The founders originally focused on building a platform as a service using entirely open source components, but when they struggled to find an existing database that could meet their requirements for building a service in the cloud, they began work on their own database system. After realizing the potential of the database software on its own, the team shifted their focus to what is now MongoDB. The company released MongoDB as an open source project in 2009.
MongoDB was designed with developer productivity, continuous availability, geo-distribution and scalability in mind, and includes out-of-the-box replication and auto-sharding.
MongoDB stores data in flexible, JSON-like documents, meaning fields can vary from document to document and data structure can be changed over time. This model maps to the objects in application code, making data easy to work with for developers. Related information is typically stored together for fast query access through the MongoDB query language. MongoDB uses dynamic schemas, allowing users to create records without first defining the structure, such as the fields or the types of their values. Users can change the structure of records (which we call documents) simply by adding new fields or deleting existing ones. This flexible data model makes it easy for developers to represent hierarchical relationships and other more complex structures. Documents in a collection need not have an identical set of fields and denormalization of data is common.
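As a small illustration of the flexible document model described above, the sketch below uses plain Python dictionaries to stand in for documents in a single collection. The collection contents and field names are hypothetical, and no database is involved; the point is only that two documents can carry different fields and that the structure can change by adding a field.

```python
import json

# Hypothetical documents standing in for one collection.
# Field names are illustrative and do not come from the article.
products = [
    {"_id": 1, "name": "Laptop", "price": 999.0,
     "specs": {"ram_gb": 16, "cpu": "8-core"}},      # nested sub-document
    {"_id": 2, "name": "T-shirt", "price": 15.0,
     "sizes": ["S", "M", "L"]},                      # array field, no "specs"
]

# Changing the structure only means adding a field where it is needed.
products[1]["colour"] = "blue"


def field_names(docs):
    """Return the sorted top-level field names of each document."""
    return [sorted(doc.keys()) for doc in docs]


if __name__ == "__main__":
    for doc in products:
        print(json.dumps(doc))
```

Related data (here the hypothetical `specs` sub-document) sits inside the document that owns it, which is what makes denormalization common in this model.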
In the summer of 2016, MongoDB Atlas, a fully managed service for MongoDB, was announced. MongoDB Atlas lets users offload operational tasks and includes built-in best practices for running the database.
Terminology and Concepts
Many concepts in DynamoDB have close analogs in MongoDB. The table below outlines some of the common concepts across DynamoDB and MongoDB.
|DynamoDB||MongoDB|
|Secondary Index||Secondary Index|
MongoDB is an open source database that can be run anywhere – from a developer’s laptop to an on-prem data center to within any of the public cloud platforms. As mentioned above, MongoDB is also available as a fully managed service with MongoDB Atlas; this model is most similar to how DynamoDB is delivered.
In contrast, DynamoDB is a proprietary database only available on Amazon Web Services. While a downloadable version of the database is available for prototyping on a local machine, the database can only be run in production in AWS. As such, organizations looking into DynamoDB should consider the implications of building on a data layer inextricably tied to a single cloud vendor.
Comparethemarket.com, the UK’s leading price comparison service, recently completed a transition from on-prem deployments with Microsoft SQL Server to AWS and MongoDB. When asked why they hadn’t selected DynamoDB, a company representative was quoted as saying "DynamoDB was eschewed to help avoid AWS vendor lock-in."
MongoDB stores data in a JSON-like format called BSON, which allows the database to support a wide spectrum of data types, including dates, 64-bit integers, and Decimal128. MongoDB documents can be up to 16 MB in size; with GridFS, even larger assets can be natively stored within the database.
Unlike some NoSQL databases that push enforcement of data quality controls back into the application code, MongoDB provides built-in document validation. Users can enforce checks on document structure, data types, data ranges and the presence of mandatory fields. As a result, DBAs can apply data governance standards, while developers maintain the benefits of a flexible document model.
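To make the validation checks above concrete, here is a minimal sketch of a validator written with the `$jsonSchema` operator available in recent MongoDB versions. The collection name `users` and the fields `email` and `age` are assumptions made for illustration. The block only builds the validator as a Python dictionary; applying it requires a driver and a running server, as noted in the comment.

```python
# A $jsonSchema validator expressed as a plain Python dict.
# The fields "email" and "age" are hypothetical examples.
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        # Presence of mandatory fields
        "required": ["email", "age"],
        "properties": {
            # Data type check
            "email": {
                "bsonType": "string",
                "description": "must be a string and is required",
            },
            # Data type and range check
            "age": {
                "bsonType": "int",
                "minimum": 0,
                "maximum": 150,
            },
        },
    }
}

# With the PyMongo driver and a running server, this could be applied as:
#   db.create_collection("users", validator=user_validator)
```

Fields not listed under `properties` remain unconstrained, so documents can still vary outside the governed fields.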
DynamoDB is a key-value store with added support for JSON to provide document-like data structures that better match objects in application code. An item or record cannot exceed 400 KB. Compared to MongoDB, DynamoDB also has limited support for different data types. For example, it supports only one numeric type and does not support dates. As a result, developers must preserve data types on the client, which adds application complexity and reduces data re-use across different applications. DynamoDB does not have native data validation capabilities.
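The need to preserve data types on the client can be sketched as follows. Since the store has no date type, one common convention (an assumption here, not something the article prescribes) is to write dates as ISO 8601 strings and convert them back when reading. The item layout and the helper names `encode_date` and `decode_date` are hypothetical.

```python
from datetime import datetime, timezone


def encode_date(dt: datetime) -> str:
    """Serialize a timezone-aware datetime as an ISO 8601 string in UTC."""
    return dt.astimezone(timezone.utc).isoformat()


def decode_date(value: str) -> datetime:
    """Parse a string written by encode_date back into a datetime."""
    return datetime.fromisoformat(value)


# Hypothetical item: the date is stored as a string attribute.
created = datetime(2017, 3, 1, 12, 30, tzinfo=timezone.utc)
item = {"order_id": "A-100", "created_at": encode_date(created)}

# Every application that reads the item must know to apply the same decoding.
restored = decode_date(item["created_at"])
```

Every reader of the table has to share this convention, which is the re-use cost the paragraph above refers to.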
Queries and Indexes
MongoDB's query language enables developers to build applications that can query and analyze their data in multiple ways – by single keys, ranges, faceted search, graph traversals, and geospatial queries through to complex aggregations, returning responses in milliseconds. Complex queries are executed natively in the database without having to use additional analytics frameworks or tools. This helps users avoid the latency that comes from syncing data between operational and analytical engines.
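As an example of a query executed in the database rather than in a separate analytics tool, the sketch below defines a small aggregation pipeline using the `$match`, `$group`, `$sort`, and `$limit` stages. The `orders` collection and its fields are hypothetical. Because running the pipeline needs a server, the block also includes a pure-Python function that computes the same totals on in-memory data, purely to show what the stages do.

```python
from collections import defaultdict

# Aggregation pipeline as it could be passed to a driver.
# Field names (status, customer_id, amount) are hypothetical.
pipeline = [
    {"$match": {"status": "shipped"}},                       # filter
    {"$group": {"_id": "$customer_id",
                "total": {"$sum": "$amount"}}},              # sum per customer
    {"$sort": {"total": -1}},                                # largest first
    {"$limit": 2},                                           # keep the top two
]
# With PyMongo and a running server:
#   results = list(db.orders.aggregate(pipeline))


def top_customers(orders, limit=2):
    """Pure-Python reference for what the pipeline above computes."""
    totals = defaultdict(float)
    for order in orders:
        if order["status"] == "shipped":
            totals[order["customer_id"]] += order["amount"]
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [{"_id": cid, "total": total} for cid, total in ranked[:limit]]


sample_orders = [
    {"customer_id": "c1", "status": "shipped", "amount": 40.0},
    {"customer_id": "c2", "status": "shipped", "amount": 75.0},
    {"customer_id": "c1", "status": "shipped", "amount": 50.0},
    {"customer_id": "c3", "status": "pending", "amount": 500.0},
]
```

The ordering of customers with equal totals is not specified by the pipeline, so the reference function should be read as illustrative rather than an exact specification.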
MongoDB ensures fast access to data by any field with full support for secondary indexes. Indexes can be applied to any field in a document, down to individual values in arrays.
Supported indexing strategies such as compound, unique, array, partial, TTL, geospatial, sparse, hash, and text ensure optimal performance for multiple query patterns, data types, and application requirements. Indexes are strongly consistent with the underlying data.
DynamoDB supports key-value queries. For queries requiring aggregations, graph traversals, or search, data must be copied into additional AWS technologies, such as Elastic MapReduce or Redshift, increasing latency, cost, and complexity. The database supports two types of indexes: Global secondary indexes (GSIs) and local secondary indexes (LSIs). Users can define up to 5 GSIs and 5 LSIs per table. Indexes can be defined as hash or hash-range indexes; more advanced indexing strategies are not supported.
GSIs, which are eventually consistent with the underlying data, do not support ad-hoc queries and usage requires knowledge of data access patterns in advance. LSIs, which can be queried to return strongly consistent data, must be defined when the table is created. They cannot be added to existing tables and they cannot be removed without dropping the table.
DynamoDB indexes are sized and provisioned separately from the underlying tables, which may result in unforeseen issues at runtime. The DynamoDB documentation explains, "Because some or all writes to a DynamoDB table result in writes to related GSIs, it is possible that a GSI’s provisioned throughput can be exhausted. In such a scenario, subsequent writes to the table will be throttled. This can occur even if the table has available write capacity units."
MongoDB is strongly consistent by default, as all reads and writes go to the primary in a MongoDB replica set. If desired, consistency requirements for read operations can be relaxed. Through secondary consistency controls, read queries can be routed only to secondary replicas that fall within acceptable consistency limits with the primary server.
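One way to express the relaxed read behavior described above is through the `readPreference` and `maxStalenessSeconds` connection string options. The sketch below only assembles a connection string; the host names, the replica set name `rs0`, and the helper `build_uri` are hypothetical. The 90-second floor reflects the documented minimum for `maxStalenessSeconds`.

```python
from urllib.parse import urlencode

MIN_STALENESS_SECONDS = 90  # documented minimum for maxStalenessSeconds


def build_uri(hosts, replica_set, max_staleness_seconds=None):
    """Build a MongoDB connection string.

    With no staleness bound, reads go to the primary (the default).
    With a bound, reads may go to secondaries that are no more than
    that many seconds behind the primary.
    """
    options = {"replicaSet": replica_set}
    if max_staleness_seconds is None:
        options["readPreference"] = "primary"
    else:
        if max_staleness_seconds < MIN_STALENESS_SECONDS:
            raise ValueError("maxStalenessSeconds must be at least 90")
        options["readPreference"] = "secondaryPreferred"
        options["maxStalenessSeconds"] = max_staleness_seconds
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(options)


hosts = ["db1.example.com:27017", "db2.example.com:27017"]
strict_uri = build_uri(hosts, "rs0")
relaxed_uri = build_uri(hosts, "rs0", max_staleness_seconds=120)
```

Keeping the default and opting in to staleness per workload means strongly consistent reads stay the norm and relaxed reads are an explicit choice.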
DynamoDB is eventually consistent by default. Users can configure read operations to return only strongly consistent data, but this doubles the cost of the read (see Pricing and Commercial Considerations) and adds latency. There is also no way to guarantee read consistency when querying against DynamoDB’s global secondary indexes (GSIs); any operation performed against a GSI will be eventually consistent, returning potentially stale or deleted data, and therefore increasing application complexity.
MongoDB Atlas allows users to deploy, manage, and scale their MongoDB deployments using built-in operational and security best practices, such as end-to-end encryption, network isolation, role-based access control, VPC peering, and more. With this “MongoDB as a service,” database deployments are guaranteed to be available and durable with distributed and auto-healing replica set members and continuous backups with point-in-time recovery to protect against data corruption. MongoDB Atlas is fully elastic with zero-downtime configuration changes that can all be triggered by the user. Atlas also grants organizations deep insights into how their databases are performing with a comprehensive monitoring dashboard, a real-time performance panel, and customizable alerting.
For organizations that would prefer to run MongoDB on their own infrastructure, MongoDB, Inc. offers advanced operational tooling to handle the automation of the entire database lifecycle, comprehensive monitoring (tracking 100+ metrics that could impact performance), and continuous backup. Product packages like MongoDB Enterprise Advanced bundle operational tooling and visualization and performance optimization platforms with end-to-end security controls for applications managing sensitive data.
Finally, MongoDB’s deployment flexibility allows single clusters to span racks, data centers and continents. With replica sets supporting up to 50 members and zone sharding across regions, administrators can provision clusters that support active/active data center deployments, with write local/read global access patterns and data locality.
Offered only as a managed service on AWS, DynamoDB abstracts away its underlying partitioning and replication schemes. And while provisioning is simple, other key operational tasks are lacking when compared to MongoDB:
No standardized backup service for DynamoDB; users must use DynamoDB Streams or the AWS Data Pipeline to back up data to S3, or use custom, self-developed utilities. These processes often require a full backup and restore and do not allow continuous or incremental backups. Also worth noting is that DynamoDB tables support the configuration of only 2 streams per table; if one is used to push data to an analytics engine and another is used for multi-region replication, then there is no capacity left for streaming data to backups.
Only 15 database metrics are reported by Amazon CloudWatch, which limits visibility into real-time database behavior.
Limited toolset to allow developers and/or DBAs to optimize performance by visualizing schema or graphically profiling query performance.
Limited security with no native support for encryption of data in-flight or at rest; users must create their own SSL certificates through the DynamoDB SDK. Beyond field-level access controls, DynamoDB is missing key functionality needed to power applications subject to regulatory compliance.
Throughput rates are capped at 10k ops/sec. Any request to expand capacity must be submitted through a web form and is granted only once an AWS representative responds. Users can increase capacity multiple times per day, but resources can take up to several hours to become available. As a result, DynamoDB may not provision capacity quickly enough to handle sudden spikes in load. In addition, only four downsizing events are supported per day.
Pricing & Commercial Considerations
In this section we will again compare DynamoDB with its closest analog from MongoDB, Inc., MongoDB Atlas.
DynamoDB's pricing model is based on throughput. Users pay for a certain capacity on a given table and AWS automatically throttles any reads or writes that exceed that capacity.
This sounds simple in theory, but the reality is that correctly provisioning throughput and estimating pricing is far more nuanced.
Below is a list of all the factors that could impact the cost of running DynamoDB:
Size of the data set per month
Size of each item
Number of reads per second (pricing is based on “read capacity units”, which are equivalent to reading a 4 KB object) and whether those reads need to be strongly consistent or eventually consistent (the former is twice as expensive)
If accessing a JSON object, the entire document must be retrieved, even if the application needs to read only a single element
Number of writes per second (pricing is based on “write capacity units”, which are the equivalent of writing a 1 KB object)
Size and throughput requirements for each index created against the table
Data transferred by DynamoDB Streams per month
Data transfers both in and out of the database per month
Cross-regional data transfers, EC2 instances, and SQS queues needed for cross-regional deployments
The use of additional AWS services to address what is missing from DynamoDB’s limited key value query model
Use of on-demand or reserved instances
Number of metrics pushed into CloudWatch for monitoring
Number of events pushed into CloudTrail for database auditing
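The read and write capacity factors in the list above can be turned into a rough calculation. The sketch below follows the unit sizes stated in the list (4 KB per read unit, 1 KB per write unit, eventually consistent reads at half the cost). Rounding each item up to a whole number of units and treating 1 KB as 1,024 bytes are assumptions made for the example, and the function names are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read capacity unit covers a 4 KB read
WRITE_UNIT_BYTES = 1 * 1024  # one write capacity unit covers a 1 KB write


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """Estimate read capacity units for a steady read rate.

    Assumes each item is rounded up to whole 4 KB units and that
    eventually consistent reads cost half as much.
    """
    units_per_read = math.ceil(item_size_bytes / READ_UNIT_BYTES)
    total = units_per_read * reads_per_second
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_bytes, writes_per_second):
    """Estimate write capacity units, rounding each item up to whole 1 KB units."""
    return math.ceil(item_size_bytes / WRITE_UNIT_BYTES) * writes_per_second


# Example: 6 KB items, 100 reads/sec and 50 writes/sec
item_size = 6 * 1024
strong_reads = read_capacity_units(item_size, 100)                             # 200
eventual_reads = read_capacity_units(item_size, 100, strongly_consistent=False)  # 100
writes = write_capacity_units(item_size, 50)                                   # 300
```

Because whole items are read, a 6 KB item consumes two read units even if the application needs a single attribute, and each index adds its own capacity on top of these figures.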
Key things to point out from the list above are that indexes affect pricing and consistent reads are twice as expensive. Another important fact to keep in mind with DynamoDB is that throughput pricing actually dictates the number of partitions (individual partitions are limited to 10 GB with 3,000 read capacity units or 1,000 write capacity units), not total throughput. Since users don’t have any control over partitioning, if any individual partition is saturated, one would have to double capacity to split partitions rather than scale linearly. Very careful design of the data model is essential to ensure that provisioned throughput can be realized.
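The partition limits quoted above (10 GB, 3,000 read capacity units, or 1,000 write capacity units per partition) can be used for a back-of-the-envelope estimate. The sketch below is an approximation under stated assumptions: it takes the larger of the size-based and throughput-based partition counts and assumes provisioned throughput is divided evenly across partitions. The exact allocation is internal to the service, and the function names are hypothetical.

```python
import math

PARTITION_SIZE_GB = 10
PARTITION_RCU = 3000
PARTITION_WCU = 1000


def estimate_partitions(size_gb, rcu, wcu):
    """Estimate the partition count from table size and provisioned throughput."""
    by_size = math.ceil(size_gb / PARTITION_SIZE_GB)
    by_throughput = math.ceil(rcu / PARTITION_RCU + wcu / PARTITION_WCU)
    return max(by_size, by_throughput, 1)


def per_partition_throughput(size_gb, rcu, wcu):
    """Return the (read, write) capacity each partition receives if split evenly."""
    partitions = estimate_partitions(size_gb, rcu, wcu)
    return rcu / partitions, wcu / partitions


# Example: a 50 GB table provisioned with 3,000 RCU and 1,000 WCU
partitions = estimate_partitions(50, 3000, 1000)            # 5, driven by size
rcu_each, wcu_each = per_partition_throughput(50, 3000, 1000)  # 600.0, 200.0
```

In this example a single heavily read key is limited to roughly 600 read units even though 3,000 were purchased, which is why the data model has to spread load evenly across partition keys.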
Compared to DynamoDB, pricing for MongoDB Atlas is relatively straightforward:
Select the instance size with enough RAM to accommodate the portion of your data (including indexes) that clients access most often
Determine your IOPS requirement
Add storage as necessary
Adjust your capacity on demand
When to use DynamoDB vs. MongoDB
DynamoDB may work for organizations that are:
Looking for a database to support relatively simple key-value workloads
Heavily invested in AWS with no plans to change their deployment environment in the future
For organizations that need their database to support a wider range of use cases with more deployment flexibility and no platform lock-in, MongoDB would likely be a better fit. Biotechnology giant Thermo Fisher migrated from DynamoDB to MongoDB for their Instrument Connect IoT app, citing that while both databases were easy to deploy, MongoDB Atlas allowed for richer queries and much simpler schema evolution.
Fully managed MongoDB: Spin up a free cluster in minutes
MongoDB Atlas is the easiest way to deploy, manage, and scale MongoDB in the cloud. Get started with the free tier, which includes 512 MB of storage for learning and prototyping your application.