Storing Large Objects and Files in MongoDB



Large objects, or “files”, are easily stored in MongoDB.  It is no problem to store 100MB videos in the database.  For example, MusicNation uses MongoDB to store its videos.

This has a number of advantages over files stored in a file system.  Unlike a file system, the database will have no problem dealing with millions of objects.  Additionally, we get the power of the database when dealing with this data: we can do advanced queries to find a file, using indexes; we can also do neat things like replication of the entire file set.

MongoDB stores objects in a binary format called BSON.  BinData is a BSON data type for a binary byte array.  However, a single MongoDB object has a maximum size: 4MB when this post was first written, and 16MB in later releases.  To deal with this, files are “chunked” into multiple objects that each fit comfortably under the limit.  This has the added advantage of letting us efficiently retrieve a specific range of the given file, since only the chunks covering that range need to be read.
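To make the chunking idea concrete, here is a minimal sketch in plain Python with no database involved.  The chunk size, the function names, and the "n" and "data" field names are illustrative assumptions, not MongoDB's actual storage code; the point is only to show how a range read touches just the chunks that overlap it.

```python
CHUNK_SIZE = 4  # bytes; deliberately tiny for illustration, real chunks are far larger


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split data into numbered chunk objects of at most chunk_size bytes each."""
    return [
        {"n": index, "data": data[offset:offset + chunk_size]}
        for index, offset in enumerate(range(0, len(data), chunk_size))
    ]


def read_range(chunks: list, start: int, end: int, chunk_size: int = CHUNK_SIZE) -> bytes:
    """Return bytes [start, end) using only the chunks that overlap that range."""
    if start >= end:
        return b""
    first = start // chunk_size        # index of the chunk holding byte `start`
    last = (end - 1) // chunk_size     # index of the chunk holding byte `end - 1`
    joined = b"".join(chunk["data"] for chunk in chunks[first:last + 1])
    skip = start - first * chunk_size  # bytes to drop from the front of the first chunk
    return joined[skip:skip + (end - start)]
```

For example, with 10 bytes of data and 4-byte chunks, read_range(chunks, 5, 7) reads only the second chunk and never looks at the first or the third.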

While we could write our own chunking code, a standard format for this chunking is predefined, called GridFS.  GridFS support is included in many MongoDB drivers and also in the mongofiles command line utility.
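As a rough illustration of driver support, the sketch below uses the gridfs module that ships with PyMongo, the Python driver.  The function name, the connection URI, and the "gridfs_demo" database name are assumptions made for this example, and it expects a MongoDB server to be reachable at that URI.

```python
def store_and_fetch(payload: bytes, filename: str,
                    uri: str = "mongodb://localhost:27017") -> bytes:
    """Store payload in GridFS under filename, then read it back."""
    # Imported here so the sketch can be loaded even if PyMongo is not installed.
    import gridfs
    from pymongo import MongoClient

    client = MongoClient(uri)  # assumed local server; adjust the URI as needed
    try:
        db = client["gridfs_demo"]  # hypothetical database name
        fs = gridfs.GridFS(db)      # uses the default "fs" collections
        # put() splits the payload into chunks and returns the new file's id.
        file_id = fs.put(payload, filename=filename)
        # get() returns a file-like object; read() reassembles the chunks.
        return fs.get(file_id).read()
    finally:
        client.close()
```

Calling store_and_fetch(b"hello", "hello.txt") against a running server should hand back the same bytes that were stored; the chunking and reassembly happen inside the driver.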

A good way to do a quick test of this facility is to try out the mongofiles utility.  See the MongoDB documentation for more information on GridFS.

More Information

This post was updated in December 2014 to include additional resources and updated links.