May 7, 2013 by Jared Rosoff Comments
One of the largest factors affecting the performance of MongoDB is the choice of storage configuration. As data sets exceed the size of memory, the random IOPS rate of your storage will begin to dominate database performance. How you split your logs, journal and data files across drives will impact performance and the maintainability of your database. Even choice of filesystem and read-ahead settings can have a major impact. A large number of performance issues we encounter in the field are related to misconfigured or under-provisioned storage. Storage configuration is often more important than instance size in determining the expected performance of a MongoDB server.
That’s why we’re excited to announce the availability of MongoDB with bundled storage configurations on the Amazon Web Services (AWS) Marketplace. Working closely with the Marketplace and EBS teams, we’ve made available a new set of MongoDB AMI’s that not only include the world’s leading document database software installed and configured according to our best practices, but also include high performance storage configurations that leverage Amazon’s Provisioned IOPS storage volumes, including Amazon’s new 4000 IOP pIOPS drives. These options take a lot of the guess work out of running MongoDB on EC2 and help ensure a great out-of-the-box experience without needing to do any additional setup yourself.
These configurations offer radically improved performance for MongoDB, even on datasets much larger than RAM. If you want to take MongoDB for a spin, or set up your first production cluster, we recommend starting with these images. We plan to keep extending this set of configurations to give you more choices to address different workloads and use cases.
The MongoDB with Bundled Storage AMI is available today in 3 configurations:
The choice of configuration will depend on how much storage capacity you want to put behind your MongoDB instance. For comparison, we have found that ephemeral storage and regular (non-pIOPS) EBS volumes can reliably deliver about 100 IOPS on a sustained basis. That means that these configurations can deliver up to 10x-40x higher out-of-memory throughput than non-pIOPS based setups.
There’s no charge from 10gen for using these AMI’s. You pay only the EC2 usage charges for the instances and disk volumes used by your setup. Take them for a test-drive and please let us know what you think.
Here’s what you get when you use these instances:
When you launch the AMI on an EC2 instance, there will be three EBS volumes attached. One each for Data, Journal and Logs. By separating these onto separate volumes, we help decrease contention for disk access during high load scenarios and avoid head-of-line blocking that can occur.
The data volume is provisioned at 200GB or 400GB, with IOPS rates at 1000, 2000 and 4000. For write-heavy workloads, this helps ensure that the background flush can get synced quickly to disk. For read-heavy workloads, the IOPS rate of the drive determines the rate at which a random document, or b-tree bucket can be loaded from disk into memory.
The journal gets its own 25GB drive provisioned at 250 IOPS. While 25GB is large for the journal, we wanted to make sure we had enough IOPS to handle the journal load and to provide sufficient capacity for reading the journal during a recovery. In order to maintain the 10:1 ratio of size to IOPS imposed by EBS, we made it a little bigger than needed. Separating the journal onto a separate volume ensures that a journal flush is never queued behind the big IO’s that happen when data files are synced.
The log volumes are provisioned at 10GB, 15GB and 20GB sizes with 100, 150, and 200 IOPS. This gives you plenty of room for storage of logs as well as predictable storage performance for collection of log data.
We’ve pre-configured EXT4 filesystem, sensible mount options, read-aheads and ulimit settings into the AMI. pIOPS EBS volumes are rated for 16KB IO’s, so using read-aheads higher than this size actually lead to decreased throughput. We’ve set this up out of the box.
We started with Amazon’s latest and greatest Linux AMI, and then added in 10gen’s RPM repo. No more adding a repo to get access to the latest software version. We’ve also pre-installed MongoDB, the 10gen MMS Agent and various useful software utilities like sysstat (which contains the useful iostat utility) and munin-node (which MMS can use to access host statistics).
The MMS agent is deactivated by default, but can be activated simply by adding your MMS account ID and then starting the agent.
A significant percentage of MongoDB applications are currently deployed in the cloud. We expect this percentage to continue to grow as enterprises discover the cost and agility benefits of running their applications on clouds like AWS. As such, it's critically important that MongoDB run exceptionally well on Amazon, and with the addition of pIOPS to the MongoDB AMI's on Marketplace, MongoDB performance in the cloud just got a big boost. We look forward to continuing to work closely with Amazon to facilitate MongoDB performance improvements on AWS.