What is the Right Data Model?



There is certainly plenty of activity in the nonrelational (“NOSQL”) database space right now.  We know that the data model for these projects is not relational.  But what is it instead?  What is the right model?

There are many possibilities, the most popular of which are:

Key/Value. A pure key/value store holds opaque blobs, each stored and retrieved by its key.

Tabular. Some projects use a Google BigTable-like data model, which we call “tabular” here; one can also think of it as “multidimensional tabular”.

Document-Oriented. Typical of these are JSON-style data stores.
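
To make the three models concrete, here is a minimal Python sketch of one and the same record in each shape. The record, its field names, and the "info:" and "tags:" column names are invented for illustration; plain dicts stand in for the stores, and no particular product's API is implied.

```python
import json

# A hypothetical user record (all names here are made up for illustration).
user = {"name": "Ada", "email": "ada@example.com", "tags": ["admin", "dev"]}

# Key/value: the store sees only a key and an opaque blob of bytes.
kv_store = {"user:1": json.dumps(user).encode("utf-8")}

# Tabular (BigTable-like): row key -> column name -> value.
# Nested data has to be spread across column names.
tabular_store = {
    "user:1": {
        "info:name": "Ada",
        "info:email": "ada@example.com",
        "tags:admin": "1",
        "tags:dev": "1",
    }
}

# Document-oriented: the store keeps the structured document itself.
document_store = {"user:1": user}

print(type(kv_store["user:1"]).__name__)
print(tabular_store["user:1"]["info:email"])
print(document_store["user:1"]["tags"][0])
```

Only in the last form does the store see the record with its nesting intact.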

We think this is a very important topic.  What is the right data model?  Should there be standardization?

Below are some thoughts on the approaches above.  Of course, as MongoDB committers, we are biased – you know which one we’re going to like.

Key/value has the advantage of being simple.  It is easy to make such systems fast and scalable.  The downside is that the model is too simple to handle some real-world problems easily.  We’d like to see something more general purpose.
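
As a rough illustration of "too simple", consider looking records up by something other than the key. In the sketch below a dict of byte blobs stands in for a pure key/value store; the `get` and `find_by_city` helpers and the sample data are hypothetical, not any real product's API.

```python
import json

# Hypothetical in-memory key/value store: keys map to opaque byte blobs.
store = {
    "user:1": json.dumps({"name": "Ada", "city": "London"}).encode("utf-8"),
    "user:2": json.dumps({"name": "Linus", "city": "Helsinki"}).encode("utf-8"),
    "user:3": json.dumps({"name": "Alan", "city": "London"}).encode("utf-8"),
}


def get(key):
    """Lookup by key: the one operation a pure key/value store offers."""
    return json.loads(store[key])


def find_by_city(city):
    """Lookup by a field: the client must fetch and decode every blob."""
    matches = []
    for key, blob in store.items():
        record = json.loads(blob)
        if record["city"] == city:
            matches.append(key)
    return sorted(matches)


print(get("user:2")["name"])
print(find_by_city("London"))
```

The lookup by key is trivial, but the lookup by city has to scan the whole store on the client side, because the store itself knows nothing about what is inside a blob.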

The tabular space brings more flexibility.  But why are we sticking to tables?  Shouldn’t we do something closer to the data model of our programming languages?  Tabular jettisons the theoretical underpinnings of relational algebra, yet we still have significant mapping work from program objects to “tables”.  If I were going to work with tables, I’d really like to have full relational power.
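
The mapping work can be sketched in a few lines of Python. The `flatten` helper, the dot-separated column names, and the sample order are our own assumptions for illustration; real tabular stores have their own naming schemes.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into one level of column -> value."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        # A scalar: it becomes the value of the column named by the prefix.
        return {prefix: obj}
    columns = {}
    for key, value in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        columns.update(flatten(value, name))
    return columns


# A hypothetical program object with nesting and a list.
order = {
    "id": 7,
    "customer": {"name": "Ada", "city": "London"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}],
}

row = flatten(order)
for column, value in row.items():
    print(column, "=", value)
```

Reading the object back requires the inverse mapping as well, and that is code the application has to carry around.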

We really like the document-oriented approach.  The programming languages we use today, not to mention web services, map very nicely to, say, JSON.  A JSON store gives us an object-like representation, yet it is not tied too tightly to any single language; tying a database to one language would seem wrong.
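
Here is a small Python sketch of that mapping, using only the standard library. The `Post` class and its fields are hypothetical; the point is that the object turns into a JSON document and back with almost no glue code, and the document itself could be read from any other language.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Post:
    """A hypothetical program object, invented for this example."""
    title: str
    author: str
    tags: list = field(default_factory=list)


post = Post(title="What is the Right Data Model?", author="example",
            tags=["nosql", "json"])

# Object -> JSON document: the nested list survives as-is.
document = json.dumps(asdict(post), sort_keys=True)

# JSON document -> object: the keys line up with the constructor arguments.
restored = Post(**json.loads(document))

print(document)
print(restored == post)
```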

We would love to hear what others think.

See also: the BSON blog post