Databases and Predictability of Performance



A subject that perhaps doesn’t get enough attention is whether the performance of a database is predictable. What we are asking is: are there ever any surprises or gotchas in the time it takes for a database operation to execute?  For traditional database management systems, the answer is yes.

For example, statistical query optimizers can be unpredictable: if the statistics for a table change in production, the query plan may change.  This could result in a big change in performance – perhaps better, perhaps worse – but it certainly wasn’t an expected change.  Query plans and performance profiles that were never tested in QA may go into effect.
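To make the plan-flip scenario above concrete, here is a minimal sketch of a cost-based plan choice. The cost model, the factor of 4, and the function name `choose_plan` are all assumptions made up for illustration; no real optimizer is this simple.

```python
def choose_plan(table_rows, matching_rows):
    """Pick the cheaper plan under a toy cost model (hypothetical costs)."""
    full_scan_cost = table_rows            # read every row once
    index_scan_cost = 4 * matching_rows    # assume each index hit costs ~4 row reads
    return "index_scan" if index_scan_cost < full_scan_cost else "full_scan"


# Statistics as seen in QA: the predicate matches very few rows.
plan_in_qa = choose_plan(table_rows=1_000_000, matching_rows=1_000)

# Statistics after the data shifts in production: the same predicate
# now matches a large share of the table, so the chosen plan changes.
plan_in_production = choose_plan(table_rows=1_000_000, matching_rows=400_000)

print(plan_in_qa, plan_in_production)
```

The query text never changed between the two calls; only the statistics did, which is why the second plan was never exercised in QA.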

Another potential issue is locking.  A lock held by one transaction may force another operation, one that is normally very fast, to wait until the lock is released.
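The effect is easy to reproduce outside a database. The sketch below uses a plain Python `threading.Lock` as a stand-in for a database lock; the names `long_transaction` and `fast_read`, and the 0.2 second hold time, are hypothetical.

```python
import threading
import time

row_lock = threading.Lock()        # stand-in for a lock on a row or table
lock_taken = threading.Event()     # signals that the slow transaction holds the lock


def long_transaction(hold_seconds):
    """Hold the lock for a while, as a slow write transaction might."""
    with row_lock:
        lock_taken.set()
        time.sleep(hold_seconds)


def fast_read():
    """A normally instant operation that needs the same lock; returns elapsed seconds."""
    start = time.perf_counter()
    with row_lock:
        pass
    return time.perf_counter() - start


uncontended = fast_read()          # nobody holds the lock: near zero

writer = threading.Thread(target=long_transaction, args=(0.2,))
writer.start()
lock_taken.wait()                  # make sure the writer really holds the lock
contended = fast_read()            # must wait for the writer to finish
writer.join()

print(f"uncontended: {uncontended:.4f}s, contended: {contended:.4f}s")
```

The read itself does the same work in both cases; its latency depends entirely on what else happens to be running.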

If a system is simple enough, it is predictable.  memcached is very predictable in performance: perhaps that is one reason it is so widely used.  Yet we also need more sophisticated tools, and as they become more advanced, predictability becomes harder to achieve.  A goal of the MongoDB project is to be reasonably predictable in performance.  Note this is a goal: the database is far from perfect in this regard today, but we think it certainly moves things in the right direction.

For example, the MongoDB query optimizer uses concurrent query plan evaluation to ensure good worst-case performance on queries, at a slight expense to average query time.  Further, the lockless design eliminates unpredictability from locking.  Other areas of the system could still use improvement, particularly concurrent query execution.  That said, predictability is considered an important area for the project and should only get better over time.
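The trade-off behind concurrent plan evaluation can be sketched with a toy model. This is not MongoDB's implementation: the plan names, their step counts, and the helpers `run_plan` and `race_plans` are assumptions invented for illustration. Each candidate plan is advanced one step at a time in round-robin order, and the first plan to finish wins.

```python
def run_plan(steps_needed):
    """A hypothetical plan modeled as a generator: one yield per unit of work."""
    for _ in range(steps_needed):
        yield


def race_plans(plan_costs):
    """Advance every candidate one step per round; return (winner, total work done)."""
    candidates = {name: run_plan(cost) for name, cost in plan_costs.items()}
    total_steps = 0
    while True:
        for name, plan in candidates.items():
            try:
                next(plan)
                total_steps += 1
            except StopIteration:
                return name, total_steps   # first plan to finish wins the race


# Made-up costs: one good index, one mediocre index, one full table scan.
plan_costs = {"index_a": 50, "index_b": 5_000, "table_scan": 100_000}
winner, total_steps = race_plans(plan_costs)

print(winner, total_steps)
```

In this model the total work is bounded by roughly the number of candidates times the cost of the best plan, so it is somewhat more than running the best plan alone, but it can never approach the cost of committing to the worst plan by mistake.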