3 Things to Know When You Switch from SQL to MongoDB
Now that we understand the terminology and why MongoDB is worth the effort of changing your mindset, let's talk about three key ways you need to change it.
Your first instinct might be to convert your existing columns and rows to fields and documents and stick with your old ways of modeling data. We've found that people who try to use MongoDB in the same way that they use a relational database struggle and sometimes fail.
We don't want that to happen to you.
Let's discuss three key ways to change your mindset as you move from SQL to MongoDB.
For those of us with SQL backgrounds, this is going to feel uncomfortable and probably a little odd at first. I promise it will be ok. Embrace document diversity. It gives us so much flexibility and power to model our data.
Let's take a look at an example that builds on the Polymorphic Pattern. Let's say we decided to keep a list of each user's social media followers inside of each User document. Lauren and Leslie don't have very many followers, so we could easily list their followers in their documents. For example, Lauren's document might look something like this:
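Here is a minimal sketch of such a document, written as a Python dict. Every field name and value in it is an assumption made for illustration; only the idea of an embedded list of followers comes from the text.

```python
# Hypothetical User document for Lauren.
# The field names ("_id", "first_name", "followers") are illustrative only.
lauren = {
    "_id": 1,
    "first_name": "Lauren",
    # The followers live inside the user document itself.
    "followers": ["Leslie", "Ron", "Ann"],
}

follower_count = len(lauren["followers"])
```

Reading Lauren's followers is then a single document read, with no second collection to consult.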
This approach would likely work for most of our users. However, since Ron built a chair that appeared in the very popular Bloosh Magazine, Ron has millions of followers. If we try to list all of his followers in his User document, it may exceed the 16 MB document size limit. The question arises: do we want to optimize our document model for the typical use case where a user has a few hundred followers or the outlier use case where a user has millions of followers?
We can begin modeling Ron's document just like Lauren's and include a list of followers. When we begin to approach the document size limit, we can add a new has_extras field to Ron's document. (The field can be named anything we'd like.)
Then we can create a new document where we will store the rest of Ron's followers.
If Ron continues to gain more followers, we could create another overflow document for him.
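The overflow idea can be sketched in plain Python. This is only an illustration under stated assumptions: the tiny follower cap, the `split_followers` helper, and the `user_id` and `is_overflow` fields are all hypothetical. Only `has_extras` comes from the text above.

```python
from typing import Any

# Hypothetical cap on embedded followers. A real threshold would be chosen
# so that a document stays comfortably under the document size limit.
MAX_FOLLOWERS_PER_DOC = 3


def split_followers(user_id: str, followers: list[str],
                    limit: int = MAX_FOLLOWERS_PER_DOC) -> list[dict[str, Any]]:
    """Return the main user document, followed by any overflow documents."""
    main: dict[str, Any] = {"_id": user_id, "followers": followers[:limit]}
    docs = [main]
    rest = followers[limit:]
    if rest:
        # Flag the outlier so the application knows to look for more documents.
        main["has_extras"] = True
    for start in range(0, len(rest), limit):
        docs.append({
            "_id": f"{user_id}_overflow_{start // limit + 1}",
            "user_id": user_id,      # hypothetical back-reference to the user
            "is_overflow": True,     # hypothetical marker field
            "followers": rest[start:start + limit],
        })
    return docs


# Typical user: everything fits in one document, no flag needed.
lauren_docs = split_followers("lauren", ["Leslie", "Ann"])

# Outlier: eight followers with a cap of three spill into two overflow documents.
ron_docs = split_followers("ron", [f"follower_{i}" for i in range(8)])
```

Typical users pay no cost for the pattern, since their documents never gain the extra field. Only the outlier's reads need the additional lookup.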
The great thing about the Outlier Pattern is that we are optimizing for the typical use case but we have the flexibility to handle outliers.
So, embrace document diversity. Resist the urge to force all of your documents to have identical structures just because it's what you've always done.
When relational databases became popular, disk space was extremely expensive. Financially, it made sense to normalize data and save disk space. Take a look at the chart below that shows the cost per megabyte over time.
The cost has drastically gone down. Our phones, tablets, laptops, and flash drives have more storage capacity today than they did even five to ten years ago for a fraction of the cost. When was the last time you deleted a photo? I can't remember when I did. I keep even the really horribly unflattering photos. And I currently back up all of my photos on two external hard drives and multiple cloud services. Storage is so cheap.
Storage has become so cheap that we've seen a shift in the cost of software development. Thirty to forty years ago storage was a huge cost in software development and developers were relatively cheap. Today, the costs have flipped: storage is a small cost of software development and developers are expensive.
Instead of optimizing for storage, we need to optimize for developers' time and productivity.
As a developer, I like this shift. I want to be able to focus on implementing business logic and iterate quickly. Those are the things that matter to the business and move developers' careers forward. I don't want to be dragged down by data storage specifics.
So, optimize your data model for developer productivity and query performance. Resist the urge to normalize your data for the sake of normalizing your data.
Data that is accessed together should be stored together. If you end up repeating data in your database, that's ok—especially if you won't be updating the data very often.
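To make the contrast concrete, here is a small sketch that uses plain Python dicts to stand in for tables and collections. All names and values are hypothetical. The point is only that the normalized shape needs a join-like second lookup, while the embedded shape returns everything in one read.

```python
# Normalized shape: two "tables" that must be joined at read time.
users = {1: {"first_name": "Leslie", "location_id": 10}}
locations = {10: {"city": "Pawnee", "state": "Indiana"}}


def read_normalized(user_id):
    """Two lookups: the user row, then the location row it points to."""
    user = users[user_id]
    location = locations[user["location_id"]]
    return {"first_name": user["first_name"], "location": location}


# Document shape: the location is embedded in the user document.
user_documents = {
    1: {"first_name": "Leslie",
        "location": {"city": "Pawnee", "state": "Indiana"}},
}


def read_embedded(user_id):
    """One lookup returns everything the application needs."""
    return user_documents[user_id]


same_result = read_normalized(1) == read_embedded(1)
```

If many users share a location, the embedded shape repeats it in each document. That is the trade-off described above, and it is usually acceptable when the repeated data rarely changes.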
But here's the thing. Relying on transactions is a bad design smell.
Why? This builds on our first two points in this section.
First, not all documents need to have the same fields. Perhaps you're breaking up data between multiple collections because it's not all of identical structure. If that's the only reason you've broken the data up, you can probably put it back together in a single collection.
Second, data that is accessed together should be stored together. If you're following this principle, you won't need to use transactions. Some use cases call for transactions. Most do not. If you find yourself frequently using transactions, take a look at your data model and consider if you need to restructure it.
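As a rough sketch of why storing related data together reduces the need for transactions: a write to a single MongoDB document is atomic, so if a follower list and its counter live in the same document, one update can change both. The helper below only builds the filter and update documents. The `follower_count` field and the function name are assumptions for illustration.

```python
def add_follower_update(user_id, follower):
    """Build the filter and update documents for one single-document write."""
    filter_doc = {"_id": user_id}
    update_doc = {
        "$push": {"followers": follower},  # append to the embedded array
        "$inc": {"follower_count": 1},     # hypothetical counter kept in step
    }
    return filter_doc, update_doc


filter_doc, update_doc = add_follower_update("lauren", "Leslie")

# With PyMongo, these would be passed to Collection.update_one(filter_doc, update_doc).
# Both changes touch the same document, so no multi-document transaction is involved.
```

Had the counter lived in a separate collection, keeping it consistent with the list would require either a transaction or application-level repair logic.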
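Today we discussed the three things you need to know as you move from SQL to MongoDB: embrace document diversity, store together the data that is accessed together, and be wary of relying on transactions.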
In summary, don't be like Ron. (I mean, don't be like him in this particular case, because Ron is amazing.)
Change your mindset and get the full value of MongoDB.