|Deployment Options||As a service on Amazon Web Services|
|Querying||Key-value queries||Query & analyze data in multiple ways — by single keys, ranges, faceted search, graph traversals, and geospatial queries through to complex aggregations|
|Monitoring||Limited visibility into real-time database behavior||MongoDB Atlas database as a service tracks 100+ metrics that could impact performance|
|Backup||On-demand and continuous backups are available at different costs. Backups and restores are an additional charge.||MongoDB Atlas includes continuous, queryable backups with point-in-time recovery|
|Pricing||Throughput-based pricing. A wide range of inputs may affect price. See Pricing and Commercial Considerations.||MongoDB Atlas pricing is based on RAM, I/O, and storage.|
Updated December 2018
DynamoDB is a proprietary NoSQL database service built by Amazon and offered as part of the Amazon Web Services (AWS) portfolio.
The name comes from Dynamo, a highly available key-value store developed in response to holiday outages on the Amazon e-commerce platform in 2004. Initially, however, few teams within Amazon adopted Dynamo due to its high operational complexity and the trade-offs that needed to be made between performance, reliability, query flexibility, and data consistency.
Around the same time, Amazon found that its developers enjoyed using SimpleDB, its primary NoSQL database service at the time, which allowed users to offload database administration work. But SimpleDB, which is no longer being updated by Amazon, had severe limitations when it came to scale; its strict storage limit of 10 GB and the limited number of operations it could support per second made it viable only for small workloads.
DynamoDB, which was launched as a database service on AWS in 2012, was built to address the limitations of both SimpleDB and Dynamo.
MongoDB is a NoSQL database built by MongoDB, Inc. The company was established in 2007 by former executives and engineers from DoubleClick, which Google acquired and now uses as the backbone of its advertising products. The founders originally focused on building a platform as a service using entirely open source components, but when they struggled to find an existing database that could meet their requirements for building a service in the cloud, they began work on their own database system. After realizing the potential of the database software on its own, the team shifted their focus to what is now MongoDB. The company released MongoDB in 2009.
MongoDB was designed to create a technology foundation that enables development teams through:
The document data model – presenting them the best way to work with data.
A distributed systems design – allowing them to intelligently put data where they want it.
A unified experience that gives them the freedom to run anywhere – allowing them to future-proof their work and eliminate vendor lock-in.
MongoDB stores data in flexible, JSON-like documents, meaning fields can vary from document to document and data structure can be changed over time. This model maps to the objects in application code, making data easy to work with for developers. Related information is typically stored together for fast query access through the MongoDB query language. MongoDB uses dynamic schemas, allowing users to create records without first defining the structure, such as the fields or the types of their values. Users can change the structure of records (which we call documents) simply by adding new fields or deleting existing ones. This flexible data model makes it easy for developers to represent hierarchical relationships and other more complex structures. Documents in a collection need not have an identical set of fields and denormalization of data is common.
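As a rough illustration of the flexible document model described above, the sketch below uses plain Python dictionaries to stand in for two documents in the same collection. The collection, field names, and values are hypothetical and chosen only for illustration.

```python
import json

# Two hypothetical documents from the same "products" collection.
# They share some fields but not all, which a dynamic schema allows.
book = {
    "_id": 1,
    "name": "Distributed Systems Primer",
    "price": 39.99,
    "authors": ["A. Writer", "B. Editor"],        # array field
}
laptop = {
    "_id": 2,
    "name": "Ultrabook 13",
    "price": 1299.00,
    "specs": {"ram_gb": 16, "storage_gb": 512},   # embedded sub-document
}

# Changing the structure later is just adding or removing a field.
book["in_stock"] = True

# Fields common to both documents; everything else varies per document.
shared_fields = set(book) & set(laptop)
print(sorted(shared_fields))
print(json.dumps(laptop, indent=2))
```

The embedded `specs` sub-document shows how related information can be kept together in one record rather than split across tables.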
In summer of 2016, MongoDB Atlas, a fully managed cloud database, was announced. MongoDB Atlas allows users to offload operational tasks and features built-in best practices for running the database.
Many concepts in DynamoDB have close analogs in MongoDB. The table below outlines some of the common concepts across DynamoDB and MongoDB.
|Secondary Index||Secondary Index|
MongoDB can be run anywhere – from a developer’s laptop to an on-prem data center to within any of the public cloud platforms. As mentioned above, MongoDB is also available as a fully managed cloud database with MongoDB Atlas; this model is most similar to how DynamoDB is delivered.
In contrast, DynamoDB is a proprietary database only available on Amazon Web Services. While a downloadable version of the database is available for prototyping on a local machine, the database can only be run in production in AWS. As such, organizations looking into DynamoDB should consider the implications of building on a data layer that is locked in to a single cloud vendor.
Comparethemarket.com, the UK’s leading price comparison service, completed a transition from on-prem deployments with Microsoft SQL Server to AWS and MongoDB. When asked why they hadn’t selected DynamoDB, a company representative was quoted as saying "DynamoDB was eschewed to help avoid AWS vendor lock-in."
MongoDB stores data in a JSON-like format called BSON, which allows the database to support a wide spectrum of data types including dates, timestamps, 64-bit integers, and Decimal128. MongoDB documents can be up to 16 MB in size; with GridFS, even larger assets can be natively stored within the database.
Unlike some NoSQL databases that push enforcement of data quality controls back into the application code, MongoDB provides built-in schema validation. Users can enforce checks on document structure, data types, data ranges and the presence of mandatory fields. As a result, DBAs can apply data governance standards, while developers maintain the benefits of a flexible document model.
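A validation rule of the kind described above is typically written as a `$jsonSchema` document. The sketch below builds one as a Python dictionary for a hypothetical "orders" collection; the field names, allowed values, and range are assumptions made for illustration, not rules taken from this article.

```python
import json

# A validator document in MongoDB's $jsonSchema form for a hypothetical
# "orders" collection. Field names and ranges are illustrative only.
order_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        # Presence of mandatory fields.
        "required": ["customer_id", "status", "total"],
        "properties": {
            # Data type checks.
            "customer_id": {"bsonType": "string"},
            "placed_at": {"bsonType": "date"},
            # Restrict a field to a fixed set of values.
            "status": {"enum": ["pending", "shipped", "cancelled"]},
            # Data range check: totals may not be negative.
            "total": {"bsonType": "decimal", "minimum": 0},
        },
    }
}

print(json.dumps(order_validator, indent=2))
```

With a driver such as PyMongo, a document like this would usually be supplied as the `validator` option when the collection is created. Fields not listed under `properties` remain free-form, which is how validation and a flexible schema can coexist.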
DynamoDB is a key-value store with added support for JSON to provide document-like data structures that better match with objects in application code. An item or record cannot exceed 400KB. Compared to MongoDB, DynamoDB has limited support for different data types. For example, it supports only one numeric type and does not support dates. As a result, developers must preserve data types on the client, which adds application complexity and reduces data re-use across different applications. DynamoDB does not have native data validation capabilities.
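To make the point about preserving types on the client concrete, here is a minimal sketch that encodes a record into DynamoDB's attribute-value shape and decodes it again. The helper names and the order fields are hypothetical; storing the date as an ISO 8601 string is one common convention, not the only option.

```python
from datetime import datetime, timezone
from decimal import Decimal


def to_attribute_values(order_id: str, total: Decimal, placed_at: datetime) -> dict:
    """Encode values in DynamoDB's attribute-value JSON shape.

    Numbers travel as strings under the single "N" type, and the date is
    stored as an ISO 8601 string under "S" because there is no date type.
    """
    return {
        "order_id": {"S": order_id},
        "total": {"N": str(total)},
        "placed_at": {"S": placed_at.isoformat()},
    }


def from_attribute_values(item: dict) -> tuple:
    """Restore the application-side types; the client has to know them."""
    return (
        item["order_id"]["S"],
        Decimal(item["total"]["N"]),
        datetime.fromisoformat(item["placed_at"]["S"]),
    )


placed = datetime(2018, 12, 1, 9, 30, tzinfo=timezone.utc)
item = to_attribute_values("o-1001", Decimal("19.99"), placed)
restored = from_attribute_values(item)
print(item)
print(restored)
```

Every application that reads this table needs the same decoding logic, since nothing in the stored item says that `placed_at` is a date rather than an arbitrary string.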
MongoDB's query language enables developers to build applications that can query and analyze their data in multiple ways – by single keys, ranges, faceted search, graph traversals, and geospatial queries through to complex aggregations, returning responses in milliseconds. Complex queries are executed natively in the database without having to use additional analytics frameworks or tools. This helps users avoid the latency that comes from syncing data between operational and analytical engines.
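As an illustration of an aggregation executed in the database, the sketch below writes a small pipeline as the list of stage documents a driver would send, then reproduces the same computation in plain Python over sample data so the effect of each stage is visible. The field names and sample orders are hypothetical.

```python
from collections import defaultdict

# An aggregation pipeline expressed as the list of stage documents a
# MongoDB driver would send. Field names are hypothetical.
pipeline = [
    {"$match": {"status": "complete"}},
    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]

# The same computation in plain Python over sample data, to show what the
# three stages compute. The database would run this server-side.
orders = [
    {"customer": "ann", "status": "complete", "amount": 40},
    {"customer": "bob", "status": "complete", "amount": 25},
    {"customer": "ann", "status": "pending", "amount": 99},
    {"customer": "ann", "status": "complete", "amount": 10},
]

totals = defaultdict(int)
for order in orders:
    if order["status"] == "complete":                  # $match
        totals[order["customer"]] += order["amount"]   # $group with $sum

result = sorted(
    ({"_id": customer, "total": total} for customer, total in totals.items()),
    key=lambda row: row["total"],
    reverse=True,                                      # $sort, descending
)
print(result)
```

Running the pipeline inside the database means only the grouped totals cross the network, not every matching order.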
MongoDB ensures fast access to data by any field with full support for secondary indexes. Indexes can be applied to any field in a document, down to individual values in arrays.
MongoDB 4.0 and higher supports multi-document transactions, making it the only database to combine the ACID guarantees of traditional relational databases; the speed, flexibility, and power of the document model; and the intelligent distributed systems design to scale-out and place data where you need it.
Multi-document transactions feel just like the transactions developers are familiar with from relational databases – multi-statement, similar syntax, and easy to add to any application. Through snapshot isolation, transactions provide a globally consistent view of data and enforce all-or-nothing execution. MongoDB allows reads and writes against the same documents and fields within the transaction. For example, users can check the status of an item before updating it. MongoDB best practices advise up to 1,000 operations in a single transaction. The addition of multi-document transactions makes it even easier for developers to address more use-cases with MongoDB.
Supported indexing strategies such as compound, unique, array, partial, TTL, geospatial, sparse, hash, and text ensure optimal performance for multiple query patterns, data types, and application requirements. Indexes are strongly consistent with the underlying data.
DynamoDB supports key-value queries. For queries requiring aggregations, graph traversals, or search, data must be copied into additional AWS technologies, such as Elastic MapReduce or Redshift, increasing latency, cost, and complexity. The database supports two types of indexes: Global secondary indexes (GSIs) and local secondary indexes (LSIs). Users can define up to 20 GSIs per table. Indexes can be defined as hash or hash-range indexes; more advanced indexing strategies are not supported.
GSIs, which are eventually consistent with the underlying data, do not support ad-hoc queries and usage requires knowledge of data access patterns in advance. GSIs can also not index any element below the top level record structure – so you cannot index sub-documents or arrays. LSIs can be queried to return strongly consistent data, but must be defined when the table is created. They cannot be added to existing tables and they cannot be removed without dropping the table.
DynamoDB indexes are sized and provisioned separately from the underlying tables, which may result in unforeseen issues at runtime. The DynamoDB documentation explains, "Because some or all writes to a DynamoDB table result in writes to related GSIs, it is possible that a GSI’s provisioned throughput can be exhausted. In such a scenario, subsequent writes to the table will be throttled. This can occur even if the table has available write capacity units."
DynamoDB also supports multi-record ACID transactions. Unlike MongoDB transactions, each DynamoDB transaction is limited to just 10 write operations; the same item also cannot be targeted with multiple operations as a part of the same transaction. As a result, complex business logic may require multiple, independent transactions, which would add more code and overhead to the database, while also resulting in the possibility of more conflicts and transaction failures. Only base data in a DynamoDB table is transactional. Secondary indexes, backups and streams are updated “eventually”. This can lead to “silent data loss”. Subsequent queries against indexes can return data that has not yet been updated from the base tables, breaking transactional semantics. Similarly, data restored from backups may not be transactionally consistent with the original table.
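The effect of the two transaction limits described above can be sketched with a small planner that splits a list of writes into batches of at most 10, never placing the same item twice in one batch. The function name and the `(item_key, operation)` representation are hypothetical and exist only for this illustration.

```python
MAX_WRITES_PER_TRANSACTION = 10  # limit quoted in the text above


def plan_transactions(writes):
    """Group (item_key, operation) pairs into ordered batches.

    A new batch starts whenever the current one is full or already
    contains the item being written, so each batch respects both limits.
    """
    batches, current, keys = [], [], set()
    for key, operation in writes:
        if len(current) == MAX_WRITES_PER_TRANSACTION or key in keys:
            batches.append(current)
            current, keys = [], set()
        current.append((key, operation))
        keys.add(key)
    if current:
        batches.append(current)
    return batches


# Twelve puts on distinct items, then a second operation on the last item.
writes = [(f"item-{i}", "put") for i in range(12)] + [("item-11", "update")]
batches = plan_transactions(writes)
print([len(batch) for batch in batches])
```

Each batch would be submitted as its own transaction, so atomicity holds within a batch but not across batches. That gap is the extra application logic the paragraph above refers to.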
Thermo Fisher Scientific, one of the leading companies in the world in the genetic testing and precision laboratory equipment markets, migrated from DynamoDB to MongoDB Atlas for an IoT application that allows researchers to monitor tests and equipment from their mobile devices. When asked to compare the two databases, the Senior Software Architect explained that, “MongoDB offers a more powerful query language for richer queries to be run and allows for much simpler schema evolution.”
MongoDB is strongly consistent by default, as all reads and writes go to the primary in a MongoDB replica set, scaled across multiple partitions (shards). If desired, consistency requirements for read operations can be relaxed. Through secondary consistency controls, read queries can be routed only to secondary replicas that fall within acceptable consistency limits with the primary server.
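One way such relaxed reads are commonly configured is through connection string options. The sketch below assembles a MongoDB URI that prefers secondaries but bounds how stale they may be; the host names, replica set name, and the 120-second bound are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# Hypothetical replica set members.
hosts = "db0.example.com:27017,db1.example.com:27017,db2.example.com:27017"

options = {
    "replicaSet": "rs0",                     # hypothetical replica set name
    "readPreference": "secondaryPreferred",  # route reads to secondaries when possible
    "maxStalenessSeconds": 120,              # skip secondaries lagging more than this
}

uri = f"mongodb://{hosts}/?{urlencode(options)}"
print(uri)
```

Leaving out the two read options keeps the default behavior, in which reads go to the primary.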
DynamoDB is eventually consistent by default. Users can configure read operations to return only strongly consistent data, but this doubles the cost of the read (see Pricing and Commercial Considerations) and adds latency. There is also no way to guarantee read consistency when querying against DynamoDB’s global secondary indexes (GSIs); any operation performed against a GSI will be eventually consistent, returning potentially stale or deleted data, and therefore increasing application complexity.
MongoDB Atlas allows users to deploy, manage, and scale their MongoDB deployments using built in operational and security best practices, such as end-to-end encryption, network isolation, role-based access control, VPC peering, and more. Atlas deployments are guaranteed to be available and durable with distributed and auto-healing replica set members and continuous backups with point in time recovery to protect against data corruption. MongoDB Atlas is fully elastic with zero downtime configuration changes that can all be triggered by the user. Atlas also grants organizations deep insights into how their databases are performing with a comprehensive monitoring dashboard, a real-time performance panel, and customizable alerting.
For organizations that would prefer to run MongoDB on their own infrastructure, MongoDB, Inc. offers advanced operational tooling to handle the automation of the entire database lifecycle, comprehensive monitoring (tracking 100+ metrics that could impact performance), and continuous backup. Product packages like MongoDB Enterprise Advanced bundle operational tooling and visualization and performance optimization platforms with end-to-end security controls for applications managing sensitive data.
Finally, MongoDB’s deployment flexibility allows single clusters to span racks, data centers and continents. With replica sets supporting up to 50 members and geo-aware sharding across regions, administrators can provision clusters that support global deployments, with write local/read global access patterns and data locality. Using Atlas Global Clusters, developers can deploy fully managed “write anywhere” active-active clusters, allowing data to be localized to any region. With each region mastering its own data, the risks of data loss and eventual consistency imposed by the multi-master approach used by DynamoDB are eliminated, and customers can meet the data sovereignty demands of new privacy regulations.
Offered only as a managed service on AWS, DynamoDB abstracts away its underlying partitioning and replication schemes. And while provisioning is simple, other key operational tasks are lacking when compared to MongoDB:
Fewer than 20 database metrics are reported by AWS CloudWatch, which limits visibility into real-time database behavior
Limited toolset to allow developers and/or DBAs to optimize performance by visualizing schema or graphically profiling query performance
DynamoDB supports cross-region replication with multi-master global tables; however, these add further application complexity and cost, with eventual consistency, risks of data loss due to write conflicts between regions, and no automatic client failover
In this section we will again compare DynamoDB with its closest analog from MongoDB, Inc., MongoDB Atlas.
DynamoDB's pricing model is based on throughput. Users pay for a certain capacity on a given table and AWS automatically throttles any reads or writes that exceed that capacity.
This sounds simple in theory, but the reality is that correctly provisioning throughput and estimating pricing is far more nuanced.
Below is a list of all the factors that could impact the cost of running DynamoDB:
Size of the data set per month
Size of each item
Number of reads per second (pricing is based on “read capacity units”, which are equivalent to reading a 4KB object) and whether those reads need to be strongly consistent or eventually consistent (the former is twice as expensive)
If accessing a JSON object, the entire document must be retrieved, even if the application needs to read only a single element
Number of writes per second (pricing is based on “write capacity units”, which are the equivalent of writing a 1KB object)
Size and throughput requirements for each index created against the table
Data transferred by DynamoDB Streams per month
Data transfers both in and out of the database per month
Cross-regional data transfers, EC2 instances, and SQS queues needed for cross-regional deployments
The use of additional AWS services to address what is missing from DynamoDB’s limited key value query model
Use of on-demand or reserved instances
Number of metrics pushed into CloudWatch for monitoring
Number of events pushed into CloudTrail for database auditing
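The read and write capacity unit definitions in the list above can be turned into a quick estimate. The sketch below is a simplified calculation based only on the figures quoted there (4 KB per read unit, 1 KB per write unit, strongly consistent reads at twice the cost); the function names and the sample workload are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read capacity unit covers up to 4 KB
WRITE_UNIT_BYTES = 1024      # one write capacity unit covers up to 1 KB


def read_capacity_units(item_bytes, reads_per_second, strongly_consistent):
    """Read capacity units needed for a steady read rate."""
    units_per_read = math.ceil(item_bytes / READ_UNIT_BYTES)
    if strongly_consistent:
        return units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return math.ceil(units_per_read * reads_per_second / 2)


def write_capacity_units(item_bytes, writes_per_second):
    """Write capacity units needed for a steady write rate."""
    return math.ceil(item_bytes / WRITE_UNIT_BYTES) * writes_per_second


# Hypothetical workload: 6 KB items, 100 reads/s and 50 writes/s.
item_bytes = 6 * 1024
strong_reads = read_capacity_units(item_bytes, 100, strongly_consistent=True)
eventual_reads = read_capacity_units(item_bytes, 100, strongly_consistent=False)
writes = write_capacity_units(item_bytes, 50)
print(strong_reads, eventual_reads, writes)
```

Because sizes round up to the next unit boundary, a 6 KB item is billed as 8 KB on reads and 6 KB on writes, so item size affects cost in steps rather than smoothly.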
Key things to point out from the list above are that indexes affect pricing and that strongly consistent reads are twice as expensive. Another important fact to keep in mind with DynamoDB is that throughput pricing actually dictates the number of partitions, not total throughput. Since users don’t have precise control over partitioning, if any individual partition is saturated, one may have to dramatically increase capacity to split partitions rather than scale linearly. Very careful design of the data model is essential to ensure that provisioned throughput can be realized. AWS has introduced the concept of Adaptive Capacity, which automatically increases the available resources for a single partition when it becomes saturated; however, it is not without limitations. For example, total read and write volume to a single partition cannot exceed 3,000 read capacity units and 1,000 write capacity units per second. The required throughput increase cannot exceed the total provisioned capacity for the table. Adaptive capacity doesn’t grant more resources so much as borrow them from less-utilized partitions. And finally, DynamoDB may take up to 30 minutes to provision additional capacity.
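To see why a saturated partition can force a large increase in table-level capacity, consider the simplified model below. It assumes provisioned read throughput is divided evenly across partitions and ignores Adaptive Capacity entirely; both are assumptions made for illustration, and the function names are hypothetical. The 3,000 read capacity unit ceiling is the per-partition figure quoted above.

```python
MAX_READ_UNITS_PER_PARTITION = 3000  # per-partition ceiling quoted in the text


def per_partition_capacity(provisioned_units, partitions):
    """Read units each partition receives under an even split (assumed)."""
    return provisioned_units / partitions


def table_capacity_for_hot_partition(hot_partition_units, partitions):
    """Table-wide read units needed so one partition gets the target rate."""
    if hot_partition_units > MAX_READ_UNITS_PER_PARTITION:
        raise ValueError("a single partition cannot exceed 3,000 read units")
    return hot_partition_units * partitions


# Hypothetical table: 10,000 read units spread over 10 partitions.
share = per_partition_capacity(10_000, 10)
# One hot partition needs 2,500 read units; the others are nearly idle.
needed = table_capacity_for_hot_partition(2_500, 10)
print(share, needed)
```

Under these assumptions, lifting one partition from 1,000 to 2,500 read units means provisioning 25,000 units for the whole table, even though total traffic grew by only 1,500 units.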
For customers frustrated with capacity planning exercises for DynamoDB, AWS recently introduced DynamoDB On-Demand, which will allow the platform to automatically provision additional resources based on workload demand. On-demand is suitable for low-volume workloads with short spikes in demand. However, it can get expensive quickly: when the database’s utilization rate exceeds 14% of the equivalent provisioned capacity, DynamoDB On-Demand becomes more expensive than provisioning throughput.
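A break-even figure like the 14% above can be derived by comparing what one provisioned capacity unit costs per hour with what the same unit would cost if every request it could serve were billed on demand. The prices in the sketch below are assumed, illustrative write prices and are not taken from this article; current AWS pricing should be checked before relying on the result.

```python
SECONDS_PER_HOUR = 3600


def break_even_utilization(provisioned_price_per_unit_hour,
                           on_demand_price_per_request):
    """Utilization above which on-demand costs more than provisioned.

    One provisioned unit can serve one request per second, so at 100%
    utilization it serves 3,600 requests in an hour.
    """
    full_hour_on_demand_cost = on_demand_price_per_request * SECONDS_PER_HOUR
    return provisioned_price_per_unit_hour / full_hour_on_demand_cost


# Assumed prices for illustration only (not from this article):
provisioned_write_unit_hour = 0.00065        # USD per write capacity unit per hour
on_demand_write_request = 1.25 / 1_000_000   # USD per write request

ratio = break_even_utilization(provisioned_write_unit_hour,
                               on_demand_write_request)
print(f"{ratio:.1%}")
```

With these assumed inputs the break-even lands at roughly 14%, in line with the figure quoted above; different prices or regions would shift it.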
Compared to DynamoDB, pricing for MongoDB Atlas is relatively straightforward:
Select the instance size with enough RAM to accommodate the portion of your data (including indexes) that clients access most often
Determine your IOPS requirement
Add storage as necessary
Adjust your capacity on demand
DynamoDB may work for organizations that are:
Looking for a database to support relatively simple key-value workloads
Heavily invested in AWS with no plans to change their deployment environment in the future
For organizations that need their database to support a wider range of use cases with more deployment flexibility and no platform lock-in, MongoDB would likely be a better fit. Biotechnology giant Thermo Fisher migrated from DynamoDB to MongoDB for their Instrument Connect IoT app, citing that while both databases were easy to deploy, MongoDB Atlas allowed for richer queries and much simpler schema evolution.