MongoDB for Collaborative Science

Dan Gunter

Computer Scientist at LBNL

Shreyas Cholia

Computer Systems Engineer at NERSC/LBNL

May 10, 2013

Scientific data sets are messy (loose data structures, evolving schemas) and large. MongoDB is becoming increasingly popular in scientific computing for precisely these reasons. We discuss the advantages of using MongoDB in scientific computing and describe how we built the scientific computing infrastructure for the Materials Project on MongoDB. We also discuss "warts" in the MongoDB implementation that affect our choices of how and when to use it.