Evaluating Amazon DocumentDB Performance

Amazon DocumentDB performance was compared against MongoDB Atlas, the fully managed MongoDB service, using two benchmarks: YCSB and TPC-C. The tests were run twice: once against a DocumentDB cluster of three r5.large instances and their M40 equivalents in MongoDB Atlas, and once using r5.xlarge and M50 configurations respectively. This approach produced configurations with nearly identical costs. To normalize the results, all writes in these tests were performed with w:majority, even those that would ordinarily use w:1 on Atlas.
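As a rough illustration of the write-concern normalization, the sketch below forces `w=majority` onto a MongoDB connection string before a benchmark client uses it. `w` is a standard connection string option; the helper name, host names, and credentials are placeholders invented for this example and are not part of the published test harness.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_majority_writes(uri: str) -> str:
    """Return `uri` with the write concern option forced to w=majority."""
    parts = urlsplit(uri)
    # Drop any existing `w` option so a run cannot silently fall back to w=1.
    options = [(k, v) for k, v in parse_qsl(parts.query) if k.lower() != "w"]
    options.append(("w", "majority"))
    return urlunsplit(parts._replace(query=urlencode(options)))


# Hypothetical endpoints; substitute the real cluster addresses.
atlas_uri = with_majority_writes(
    "mongodb+srv://user:pass@atlas.example.net/ycsb?w=1"
)
docdb_uri = with_majority_writes(
    "mongodb://user:pass@docdb.example.com:27017/ycsb?tls=true"
)
```

Applying the same helper to both endpoints keeps the durability setting identical across the two services, which is the point of the normalization.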

Both benchmarks showed very significant performance advantages for MongoDB Atlas. DocumentDB's performance was close only for workloads that were 95% read and 5% write, but quickly dropped off even there once throughput increased.

YCSB

YCSB is a "lowest common denominator" type of benchmark and uses only primary key queries. We loaded benchmark data and then ran two YCSB workloads, one with a 95/5 ratio of reads to writes and the other with a 50/50 ratio. The results for the two hardware configurations were extremely similar; the graph for the M40/r5.large configuration is shown below.

[Figure: YCSB results for the M40/r5.large configuration]

MongoDB Atlas outperformed DocumentDB on all workloads tested except the 95% read / 5% write workload at low thread counts. As the number of threads rose, Atlas’ performance outpaced DocumentDB's even in this case.
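The two read/write mixes described above can be expressed as YCSB core workload properties. The sketch below is illustrative only: `readproportion`, `updateproportion`, `insertproportion`, `scanproportion`, `recordcount`, and `operationcount` are standard YCSB property names, but treating "writes" as updates and the record and operation counts used here are assumptions, since the exact settings are not given in the text.

```python
def ycsb_mix(read_pct: int, record_count: int, operation_count: int) -> dict:
    """Build YCSB core workload properties for a read/update mix."""
    if not 0 <= read_pct <= 100:
        raise ValueError("read_pct must be between 0 and 100")
    return {
        "recordcount": record_count,
        "operationcount": operation_count,
        "readproportion": read_pct / 100,
        # Assumption: the benchmark's "writes" are modeled as YCSB updates.
        "updateproportion": (100 - read_pct) / 100,
        "insertproportion": 0,
        "scanproportion": 0,
    }


def to_properties(props: dict) -> str:
    """Render the properties in the key=value form YCSB workload files use."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


# Placeholder sizes; the real record and operation counts are not stated.
read_heavy = ycsb_mix(95, record_count=1_000_000, operation_count=1_000_000)
balanced = ycsb_mix(50, record_count=1_000_000, operation_count=1_000_000)
print(to_properties(read_heavy))
```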

TPC-C

TPC Benchmark C (TPC-C) is an industry-standard online transaction processing (OLTP) benchmark. It involves a mix of five concurrent transaction types of differing complexity, either executed online or queued for deferred execution. Its results are therefore highly relevant to anyone intending to rely on database transaction support in their application. While TPC-C assumes a traditional tabular database layout, MongoDB’s adaptation makes every attempt to stay consistent with the spirit of the original benchmark specification and to comply with all specification requirements where possible. All source code and validation scripts are published on GitHub so that readers can recreate and verify our results. The same TPC-C benchmark tests were used for both MongoDB Atlas and DocumentDB so that the results are comparable.

In the TPC-C benchmark, MongoDB Atlas throughput was nearly 60% greater than that of DocumentDB on an equivalent hardware configuration.

[Figure: TPC-C results]

Much as with YCSB, performance was close as long as the number of concurrent threads remained low, although even then with an advantage to MongoDB Atlas. Beyond four threads, the gap opens up and becomes extremely significant, with Atlas’ performance continuing to improve with higher numbers of threads for equivalent hardware configurations.
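The thread-scaling comparison described above amounts to running the same operation at increasing levels of concurrency and recording throughput at each level. The following is a minimal sketch of such a sweep using only the Python standard library; it is not the published harness, and `fake_transaction` is a stand-in for a real database transaction.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_throughput(operation, threads: int, ops_per_thread: int) -> float:
    """Run `operation` from `threads` workers and return operations per second."""

    def worker(_worker_id):
        for _ in range(ops_per_thread):
            operation()

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(worker, range(threads)))
    elapsed = time.perf_counter() - start
    return threads * ops_per_thread / elapsed


def sweep(operation, thread_counts=(1, 2, 4, 8, 16), ops_per_thread=100) -> dict:
    """Measure throughput at each thread count, keyed by thread count."""
    return {n: measure_throughput(operation, n, ops_per_thread) for n in thread_counts}


def fake_transaction():
    """Stand-in for one database transaction (assumption: about 1 ms each)."""
    time.sleep(0.001)


results = sweep(fake_transaction, thread_counts=(1, 4), ops_per_thread=20)
for thread_count, ops_per_second in results.items():
    print(f"{thread_count} threads: {ops_per_second:.0f} ops/s")
```

Plotting the resulting dictionary for each service against the same thread counts gives the kind of curve the comparison refers to.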

In addition to the raw throughput numbers above, the p99 latency results show that DocumentDB's latency is 20% higher than that of the same workload on an equivalent configuration of MongoDB Atlas.

[Figure: TPC-C p99 latency]
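For readers who want to reproduce this kind of comparison from their own measurements, the sketch below computes a p99 latency with the nearest-rank method and the relative difference between two values. The percentile method is an assumption (the text does not say which one was used), and the sample values are made up for illustration.

```python
import math


def percentile(samples, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def relative_increase(candidate: float, baseline: float) -> float:
    """Percentage by which `candidate` exceeds `baseline`."""
    return (candidate - baseline) * 100 / baseline


# Made-up latency samples in milliseconds, purely for illustration.
baseline_ms = [float(i) for i in range(1, 101)]
candidate_ms = [value * 1.2 for value in baseline_ms]

baseline_p99 = percentile(baseline_ms, 99)
candidate_p99 = percentile(candidate_ms, 99)
print(f"p99 is {relative_increase(candidate_p99, baseline_p99):.1f}% higher")
```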

The testing harness used to obtain these results is publicly available, so you can run the same tests in your own environment.

Overall, the results show that DocumentDB performs well with extremely simple find() statements, either for single documents or for ranges, using only primary keys. It begins to suffer when writes are introduced into the mix, falls behind badly on write-heavy loads, and has serious difficulties when anything beyond rudimentary query language operations is used.

It is also worth noting that each time the MongoDB team has run these benchmarks against DocumentDB, significant code changes have been required to accommodate limitations in DocumentDB’s emulation of the MongoDB API. These failures are important because they are representative of the sorts of issues that developers will encounter in the real world when trying to use DocumentDB. Particularly notable is the lack of support for $expr, for any expressive $lookup, and for $toLong and indeed any other type conversion, such as $toDecimal. Even worse, error messages are ambiguous or generic, leaving it to individual developers to figure out which expression failed, which lengthens debugging times and delays rollouts.
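To give a sense of what such an accommodation can look like, the sketch below contrasts an aggregation stage that relies on `$toLong` with a client-side fallback that performs the conversion in application code. This is a hypothetical example, not the change the MongoDB team actually made; the `quantity` field and the helper name are invented for illustration.

```python
# Server-side form: an aggregation stage that relies on the $toLong operator.
SERVER_SIDE_PIPELINE = [
    {"$addFields": {"quantity": {"$toLong": "$quantity"}}},
]


def convert_quantities(documents, field="quantity"):
    """Client-side fallback: coerce `field` to an integer in application code.

    Used when the server rejects the conversion operator, at the cost of
    shipping unconverted documents to the client first.
    """
    converted = []
    for doc in documents:
        updated = dict(doc)  # copy so the caller's documents are not mutated
        if updated.get(field) is not None:
            updated[field] = int(updated[field])
        converted.append(updated)
    return converted


# Hypothetical documents as they might come back from a plain find().
raw = [{"_id": 1, "quantity": "42"}, {"_id": 2, "quantity": 7.0}, {"_id": 3}]
print(convert_quantities(raw))
```

Moving the conversion to the client keeps the query portable, but it also moves work out of the database, which is one reason such rewrites are more than cosmetic.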

Finally, the data load times for both benchmarks are also worth highlighting, as they showcase the differences in completion times that users can expect for comparable configurations of DocumentDB as opposed to MongoDB Atlas.

[Figure: Latency]