Amazon DocumentDB performance was compared against MongoDB Atlas, the fully managed MongoDB service, using two benchmarks: YCSB and Socialite. The DocumentDB cluster used three r4.4xlarge instances, and the Atlas cluster used three M60 instances. This produced configurations with nearly identical costs. All writes in these tests were performed with w:majority, even those which would ordinarily use w:1 on Atlas, to normalize the test results.
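As a rough illustration of the write-concern normalization described above, the sketch below forces `w=majority` onto a MongoDB-style connection string. The helper name `with_majority_writes` and the example host are hypothetical and not part of the test harness. It assumes the URI already has a `/` before any `?` options.

```python
from urllib.parse import parse_qsl, urlencode


def with_majority_writes(uri: str) -> str:
    """Return the URI with any existing `w` option replaced by w=majority.

    Hypothetical helper for illustration only; not taken from the harness.
    """
    base, _, query = uri.partition("?")
    # Drop any existing write-concern option (option names are case-insensitive).
    options = [(k, v) for k, v in parse_qsl(query) if k.lower() != "w"]
    options.append(("w", "majority"))
    return base + "?" + urlencode(options)


if __name__ == "__main__":
    # Example host is a placeholder, not a real cluster.
    print(with_majority_writes("mongodb://host.example:27017/?replicaSet=rs0&w=1"))
```

Applying the same write concern to both clusters keeps the durability guarantee of each acknowledged write comparable across the two systems.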
YCSB is a "lowest common denominator" benchmark that uses only primary key queries. Three YCSB workloads were run, each on two data sets. One data set was small enough to fit entirely in RAM, while the other was much larger than RAM. Based on our knowledge of how customers use MongoDB, all data sets used 2.5 KB documents containing 25 fields.
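To make the document shape concrete, here is a minimal sketch of a generator for a YCSB-style record. It assumes the 2.5 KB is split evenly into 25 string fields of 100 bytes each, and that fields are named `field0` through `field24`. The text above gives only the total size and field count, so the per-field length and the names are assumptions.

```python
import random
import string

FIELD_COUNT = 25     # from the text: 25 fields per document
FIELD_LENGTH = 100   # assumption: 2,500 bytes spread evenly over 25 fields


def make_document(key: str, rng: random.Random) -> dict:
    """Build one record with FIELD_COUNT random ASCII string fields."""
    doc = {"_id": key}
    for i in range(FIELD_COUNT):
        doc[f"field{i}"] = "".join(rng.choices(string.ascii_letters, k=FIELD_LENGTH))
    return doc


def payload_bytes(doc: dict) -> int:
    """Total size of the field values, excluding the key and any encoding overhead."""
    return sum(len(value.encode("utf-8")) for name, value in doc.items() if name != "_id")


if __name__ == "__main__":
    sample = make_document("user1", random.Random(0))
    print(len(sample) - 1, "fields,", payload_bytes(sample), "payload bytes")
```

The stored size will be somewhat larger than the payload because of field names and BSON overhead.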
MongoDB Atlas outperformed DocumentDB on all workloads tested except the 95% reads / 5% writes workloads. During this testing, we found that DocumentDB crashed frequently during YCSB's load phase when we tried to run it on data sets containing more than 200 million documents. We were unable to determine the root cause of these crashes, but we measured failover times between two and four minutes.
Socialite is a benchmark MongoDB engineers developed years ago to test performance as part of standard regression testing. Its workload simulates a social networking application, so it uses a more real-world access pattern that includes complex querying. Unlike YCSB, it can only be run against the MongoDB API, and until now has not been used to compare MongoDB to other databases.
Socialite exposed the serious difficulties DocumentDB has with sophisticated queries. In multiple scenarios, DocumentDB's query optimizer ignored indexes and used collection scans, leading to very weak performance.
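One way to check whether a query fell back to a collection scan is to inspect its explain output. The sketch below assumes the classic MongoDB explain shape, where `queryPlanner.winningPlan` is a tree of stages linked by `inputStage` or `inputStages`. DocumentDB's explain output may differ in detail. The sample plans are hypothetical and were not captured from the benchmark.

```python
def plan_stages(plan: dict):
    """Yield every stage name found in an explain plan tree."""
    if "stage" in plan:
        yield plan["stage"]
    children = []
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    children.extend(plan.get("inputStages", []))
    for child in children:
        yield from plan_stages(child)


def uses_collection_scan(explain_output: dict) -> bool:
    """True if the winning plan contains a COLLSCAN stage anywhere in its tree."""
    winning_plan = explain_output["queryPlanner"]["winningPlan"]
    return "COLLSCAN" in plan_stages(winning_plan)


# Hypothetical explain outputs, for illustration only.
SCAN_PLAN = {"queryPlanner": {"winningPlan": {
    "stage": "SORT", "inputStage": {"stage": "COLLSCAN"}}}}
INDEX_PLAN = {"queryPlanner": {"winningPlan": {
    "stage": "FETCH", "inputStage": {"stage": "IXSCAN", "indexName": "author_1"}}}}

if __name__ == "__main__":
    print(uses_collection_scan(SCAN_PLAN), uses_collection_scan(INDEX_PLAN))
```

A plan that reports `COLLSCAN` for a query with a suitable index available is the symptom described above.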
The testing harness used to obtain these results is publicly available, so you can run the tests in your own environment.
Overall, the results show that DocumentDB performs well with extremely simple find() statements, either for single documents or for ranges, using only primary keys. It begins to suffer, however, when writes are introduced into the mix, falls behind badly on write-heavy loads, and has serious difficulties when anything beyond rudimentary query language operations is used.