MongoDB Case Study: Craigslist

Craigslist is a popular classifieds and job posting community that serves 570 cities in 50 countries. With 1.5 million new classified ads posted every day, Craigslist must archive billions of records in many different formats and must be able to query and report on these archives at runtime. Historically, Craigslist stored its information in a MySQL cluster, but the lack of flexibility and the management costs became barriers to continued use. In 2011, Craigslist migrated over two billion documents to MongoDB for its scalability and flexible schema.

For much of the history of Craigslist, MySQL was the only option for data storage, including the archive. The original Craigslist archive application copied existing data from the live database to the archive system. But using a relational database limited flexibility and caused lengthy delays, because every change to the live database schema had to be propagated to the archive system. While changes were being made to billions of rows in the MySQL cluster, Craigslist could not move data to the archive. Archive-ready data would pile up in the production database, and performance on the live database deteriorated. To prevent further impediments to the company's growth and its ability to serve its customers, the team began looking for alternatives.

After evaluating several NoSQL options, Craigslist settled on MongoDB. One compelling reason was MongoDB's built-in scalability. Another was its document model: each post and its metadata can be stored as a single document, so as the schema changes on the live database, MongoDB can accommodate these changes without costly schema migrations. In addition, MongoDB's support for auto-sharding and high availability eased operational pain points for Craigslist. MongoDB enabled Craigslist to scale horizontally across commodity hardware without having to write and maintain complex, custom sharding code.
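
To make the idea of storing each post and its metadata as a single document more concrete, here is a minimal sketch in plain Python. It is not Craigslist's actual schema or code: the field names, values, and the `find` helper are all hypothetical, and a plain list stands in for a MongoDB collection so the example runs without a database. It only illustrates how an older and a newer record with different fields can sit side by side without a schema migration.

```python
from datetime import datetime, timezone

# Hypothetical archived posting written before the live schema changed.
older_post = {
    "_id": 1001,
    "category": "jobs",
    "title": "Systems administrator",
    "posted_at": datetime(2010, 5, 1, tzinfo=timezone.utc),
    "body": "Full text of the ad",
}

# Hypothetical posting archived after the live schema gained new fields.
# No change to the older record is needed for the two to coexist.
newer_post = {
    "_id": 2002,
    "category": "for sale",
    "title": "Road bike",
    "posted_at": datetime(2011, 3, 15, tzinfo=timezone.utc),
    "body": "Full text of the ad",
    "price": 250,
    "images": ["bike-front.jpg", "bike-side.jpg"],
}

# A plain list stands in for a collection in this sketch.
archive = [older_post, newer_post]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [
        doc
        for doc in collection
        if all(field in doc and doc[field] == value
               for field, value in criteria.items())
    ]


def fields_seen(collection):
    """Return the sorted set of field names used across all documents."""
    return sorted({field for doc in collection for field in doc})


jobs = find(archive, category="jobs")
priced = find(archive, price=250)
```

With a real deployment and a driver such as PyMongo, dictionaries shaped like these could be passed to a collection's `insert_one` method; the point of the sketch is only that documents in the same archive do not need identical fields.
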
Using auto-sharding, Craigslist's initial MongoDB deployment was designed to hold over 5 billion documents and 10TB of data. MongoDB concepts and features are similar, in many respects, to those of relational databases, so Craigslist’s developers found the transition seamless. Lead developer Jeremy Zawodny, the author of High Performance MySQL, describes the transition: "...Coming from a relational background, specifically a MySQL background, a lot of the concepts carry over…. It makes it very easy to get started."

To learn more, please visit the Craigslist case study on 10gen.com.
