How to Scale MongoDB


Your app just went viral—congratulations! Each day brings new users, new possibilities, and new burdens for your servers to handle. For many applications, the ability to scale infrastructure, especially database infrastructure, can be a significant growing pain.

As such, scaling quickly and effectively is a key capability of any production-ready database. In this article, we’ll take a look at the scaling options of MongoDB, when to use them, and how to configure them in MongoDB Atlas.

What is MongoDB?

MongoDB is a modern document database. In contrast to relational databases such as MySQL and Oracle, non-relational databases such as MongoDB were designed for modern architectures, such as the cloud, where you need to scale out, reach a global audience, and meet data sovereignty requirements within a single cluster. Non-relational databases excel at handling large datasets and typically require less upfront schema design than relational databases.

What is Scaling in MongoDB?

As your application grows, each piece of the application must scale along with the size of your user base and your data needs. Historically, database scaling has been a major pain point for large applications or applications with above average throughput, and options have been either limited in number or costly to implement.

In contrast, MongoDB has a full range of scaling options available, and they are built into MongoDB Atlas, MongoDB’s database-as-a-service offering. Let’s look at a few different ways MongoDB can scale.

Vertical Versus Horizontal

Vertical Scaling

Vertical scaling, also known as scale-up, refers to increasing the processing power of a single server or cluster. Both relational and non-relational databases can scale up, but eventually there will be a limit in terms of maximum processing power and throughput. Scaling up also gets expensive: hardware costs do not grow linearly, so each step up at the high end costs more for the capacity it adds.

Horizontal Scaling

Horizontal scaling, also known as scale-out, refers to bringing on additional nodes to share the load. This is difficult with relational databases because related data is hard to spread across nodes. With non-relational databases, it is simpler, since collections are self-contained and not coupled relationally. They can be distributed across nodes more easily, as queries do not have to “join” them together across nodes.

Scaling MongoDB horizontally is achieved through sharding (preferred) and replica sets.

Sharding Versus Replica Sets


Sharding

As mentioned above, sharding scales horizontally by spreading data across multiple nodes, with each node holding a subset of the overall data. This is especially effective for increasing throughput in write-heavy use cases, as each write only affects the one node that manages the relevant partition of data.

While Atlas sets up sharding for you, it is still up to us to choose the shard key, which MongoDB uses to partition the data across shards without overlap. Data can be distributed by either ranged or hashed sharding, or placement can be customized using zoned sharding. For more information on these options, see this post on sharding from the official MongoDB blog.
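To make the difference between ranged and hashed sharding more concrete, here is a small toy model in plain Python. It is only a sketch: the shard count, the range boundaries, and the use of MD5 with a modulo are assumptions made for illustration and are not how MongoDB implements hashed shard keys internally.

```python
import hashlib
from collections import Counter
from typing import List

NUM_SHARDS = 4  # hypothetical cluster size


def ranged_shard(key: int, boundaries: List[int]) -> int:
    """Return the index of the first range whose upper bound exceeds the key."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)  # the last shard takes everything above the final bound


def hashed_shard(key: int, num_shards: int = NUM_SHARDS) -> int:
    """Hash the key first, then map the hash onto a shard (toy scheme)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Monotonically increasing keys, such as an order counter.
new_keys = range(3000, 4000)
boundaries = [1000, 2000, 3000]  # shard 0: < 1000, ..., shard 3: >= 3000

ranged_counts = Counter(ranged_shard(k, boundaries) for k in new_keys)
hashed_counts = Counter(hashed_shard(k) for k in new_keys)

print("ranged:", dict(ranged_counts))  # every new key lands on shard 3
print("hashed:", dict(sorted(hashed_counts.items())))  # spread over all shards
```

With ever-increasing keys, the ranged scheme sends all new writes to a single shard, while the hashed scheme spreads them out. The trade-off is that hashing scatters neighboring keys, so range queries on the shard key can no longer be served by one shard.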

Over time, datasets typically do not grow uniformly, and some shards will grow faster than others. As your workloads evolve and datasets grow, data needs to be rebalanced to keep load evenly distributed across the cluster. This is addressed through shard balancing, which MongoDB handles automatically with the sharded cluster balancer.
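The idea behind balancing can be sketched in a few lines. The function below is a toy, not the real balancer: it counts chunks per shard and uses a made-up threshold, whereas the actual migration policy in MongoDB is more involved.

```python
from typing import Dict, List, Tuple


def rebalance(chunks_per_shard: Dict[str, int], threshold: int = 2) -> List[Tuple[str, str]]:
    """Move one chunk at a time from the fullest shard to the emptiest one.

    Stops once the gap between the two is smaller than `threshold`
    (a hypothetical setting for this sketch).
    """
    migrations = []
    while True:
        fullest = max(chunks_per_shard, key=chunks_per_shard.get)
        emptiest = min(chunks_per_shard, key=chunks_per_shard.get)
        if chunks_per_shard[fullest] - chunks_per_shard[emptiest] < threshold:
            return migrations
        chunks_per_shard[fullest] -= 1
        chunks_per_shard[emptiest] += 1
        migrations.append((fullest, emptiest))


cluster = {"shard0": 12, "shard1": 3, "shard2": 3}
moves = rebalance(cluster)
print(len(moves), cluster)  # 6 {'shard0': 6, 'shard1': 6, 'shard2': 6}
```

Each migration moves data between nodes, which is why a shard key that keeps growth even in the first place is cheaper than relying on the balancer to fix things afterwards.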

Replica Sets

Replica sets seem similar to sharding, but they differ in that each member holds a full copy of the dataset rather than a subset. Replication provides high availability, redundancy and failover handling, and fewer bottlenecks on read operations. However, replica sets do not help applications with large volumes of writes, as each update must still be propagated to every replica set member.
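A toy model shows why replication helps reads but not writes. The class below is purely illustrative: the class name, the round-robin read policy, and the synchronous copy loop are all assumptions of this sketch, and real MongoDB replication works differently under the hood.

```python
from itertools import cycle
from typing import Dict, List, Tuple


class ToyReplicaSet:
    """Every member holds a full copy of the data (illustrative only)."""

    def __init__(self, member_names: List[str]) -> None:
        self.members: Dict[str, Dict[str, str]] = {name: {} for name in member_names}
        self.copies_written = 0
        self._readers = cycle(member_names)

    def write(self, key: str, value: str) -> None:
        # One logical write turns into one copy per member.
        for data in self.members.values():
            data[key] = value
            self.copies_written += 1

    def read(self, key: str) -> Tuple[str, str]:
        # Reads can be spread across members because each has all the data.
        member = next(self._readers)
        return member, self.members[member][key]


rs = ToyReplicaSet(["node-a", "node-b", "node-c"])
rs.write("user:1", "Ada")
served_by = [rs.read("user:1")[0] for _ in range(3)]
print(rs.copies_written, served_by)  # 3 ['node-a', 'node-b', 'node-c']
```

Adding a fourth member would add read capacity, but it would also raise the cost of each write to four copies, which is the write-side limitation described above.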

Scaling Proactively Versus Scaling Reactively

Proactive scaling refers to scaling your database in advance of foreseen load or high-traffic events. This could be based upon a regular pattern (e.g., day of the week or certain times of the year), or it could be done before specific events, such as launching a marketing campaign.

In contrast, reactive scaling refers to scaling in response to metrics. These could be warning signs such as slow transactions and query response times, or even error messages coming from your database monitoring. In the worst-case scenario, the trigger is an outage due to excessive load.

Naturally, proactive scaling is preferable when possible.
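The two approaches differ mainly in what triggers them: a calendar or a metric. A minimal sketch of both triggers, where the event calendar, the metric names, and the 200 ms limit are hypothetical values chosen for illustration:

```python
from datetime import date
from typing import Optional

# Hypothetical calendar of known high-traffic windows (the proactive input).
PLANNED_EVENTS = [
    (date(2024, 11, 29), date(2024, 12, 2), "holiday sale"),
]


def proactive_reason(today: date) -> Optional[str]:
    """Return the planned event that calls for extra capacity today, if any."""
    for start, end, name in PLANNED_EVENTS:
        if start <= today <= end:
            return name
    return None


def reactive_reason(p95_query_ms: float, error_count: int,
                    max_query_ms: float = 200.0) -> Optional[str]:
    """Return the warning sign that calls for extra capacity, if any."""
    if error_count > 0:
        return "database errors"
    if p95_query_ms > max_query_ms:
        return "slow queries"
    return None


print(proactive_reason(date(2024, 11, 30)))  # holiday sale
print(reactive_reason(p95_query_ms=350.0, error_count=0))  # slow queries
```

The proactive check fires before any user notices a problem, while the reactive check can only fire once the metric has already degraded.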

Now, let’s take a look at how to scale MongoDB to meet the needs of your application.

MongoDB Atlas Scaling

As a service offering, MongoDB Atlas makes scaling as easy as setting the right configuration. Both horizontal and vertical scaling are supported.

Vertical scaling is as simple as configuring a cluster tier. Note that even within a tier, further scaling is possible (including auto-scaling from the M10 tier upwards), which we'll look at later.

Horizontal scaling comes through the deployment of a sharded cluster.

Deploying a Sharded Cluster

In MongoDB, a sharded cluster consists of shards, query routers (mongos), and config servers that hold the cluster metadata. While setting this up manually would require some infrastructure setup and configuration, Atlas makes this quite simple. Just toggle the option on for your MongoDB cluster and select the number of shards.

(Please note: This is only available for M30 clusters and up.)

By default, each shard and the config servers are deployed as replica sets of mongod processes. This provides high availability, redundancy, and increased read and write performance by combining both types of horizontal scaling. The routers, or mongos, direct queries and write operations to the shards that hold the relevant data.
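The router's job can be illustrated with a toy lookup. The chunk map and the function below are hypothetical and greatly simplified, but they show the key behavior: a query that includes the shard key can be sent to a single shard, while one without it has to be sent to all of them.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical chunk map: shard name -> [lower, upper) range of the shard key.
CHUNK_RANGES: Dict[str, Tuple[int, int]] = {
    "shard0": (0, 1000),
    "shard1": (1000, 2000),
    "shard2": (2000, 3000),
}


def route(shard_key_value: Optional[int]) -> List[str]:
    """Return the shards a toy router would contact for a query."""
    if shard_key_value is None:
        # No shard key in the query: every shard has to be asked.
        return list(CHUNK_RANGES)
    return [
        shard
        for shard, (lower, upper) in CHUNK_RANGES.items()
        if lower <= shard_key_value < upper
    ]


print(route(1500))  # ['shard1']
print(route(None))  # ['shard0', 'shard1', 'shard2']
```

This is one reason the choice of shard key matters so much: queries that filter on it stay cheap as the cluster grows, while queries that do not get more expensive with every shard added.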

Don’t forget, a shard key needs to be configured, and there are a few different options available. For more information, see the MongoDB documentation on shard keys.

MongoDB Auto-Scaling

MongoDB Atlas has cluster auto-scaling, which scales vertically based on cluster usage. It is enabled as part of configuring the cluster tier.

Both cluster tier/CPU power and storage amount can be auto-scaled. This gives you automated and reactive vertical scaling both up and down, without having to worry about setting up new servers, transferring data, or even downtime in between. If necessary, the cluster can also be paused, effectively scaling the whole cluster to 0 except for storage.
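The decision an auto-scaler makes can be sketched as a small function. Everything numeric here is an assumption for illustration: the CPU thresholds and the minimum and maximum bounds are made-up values, not the criteria Atlas actually applies.

```python
TIERS = ["M10", "M20", "M30", "M40", "M50"]  # ordered smallest to largest


def next_tier(current: str, cpu_percent: float,
              min_tier: str = "M10", max_tier: str = "M40",
              scale_up_above: float = 75.0,
              scale_down_below: float = 50.0) -> str:
    """Step one tier up or down based on usage, staying within the bounds."""
    index = TIERS.index(current)
    if cpu_percent > scale_up_above:
        index = min(index + 1, TIERS.index(max_tier))
    elif cpu_percent < scale_down_below:
        index = max(index - 1, TIERS.index(min_tier))
    return TIERS[index]


print(next_tier("M20", cpu_percent=90.0))  # M30
print(next_tier("M40", cpu_percent=90.0))  # M40 (already at the upper bound)
print(next_tier("M20", cpu_percent=20.0))  # M10
```

The gap between the two thresholds keeps the cluster from flapping between tiers when usage hovers around a single cutoff, and the upper bound keeps costs predictable.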


In this article, we reviewed different types of scaling as well as how to implement each of these in MongoDB Atlas. For more information about MongoDB and case studies, check out MongoDB at Scale.

FAQs on Scaling MongoDB

Is MongoDB good for large data?

Yes! As a modern, non-relational database, MongoDB is designed to efficiently handle large datasets through both horizontal and vertical scaling.

Why is MongoDB scalable?

As a NoSQL database, MongoDB is scalable as its data is not coupled relationally. Data is stored as JSON-like documents which are self-contained. This allows those documents to be easily distributed across multiple nodes through horizontal scaling.

What is horizontal scaling in MongoDB?

Horizontal scaling, also known as scale-out, refers to bringing on additional nodes to share the load. Scaling MongoDB horizontally is achieved primarily through sharding.

What is vertical scaling in MongoDB?

Vertical scaling refers to increasing the processing power of a single server or cluster through adding CPU(s), RAM, and I/O. It is simple and effective for small to mid-range use cases, but can become expensive on the higher end.

Is horizontal scaling cheaper?

Yes, in some cases. Lower-tier hardware is more of a commodity, and a horizontally scaled cluster can keep adding nodes to handle extremely large datasets. In some smaller applications, vertical scaling is cheaper due to the overhead of setting up shards or replica sets.

Is horizontal or vertical scaling better?

It depends on your use case! For most applications, you want the ability to do both, as that gives flexibility in meeting the throughput needs of your application. If the needs of your application can be met with a single instance, vertical scaling tends to be the simpler, more straightforward option.

For workloads that are more than a single instance can handle, horizontal scaling becomes necessary. Horizontal scaling also supports low latency in globally distributed applications and helps with complying with data-sovereignty requirements. From an administrative and maintenance perspective, horizontal scaling tends to be the more difficult task, so having it provided by your database platform can be a huge time-saver.

Ready to get started?

Launch a new cluster or migrate to MongoDB Atlas with zero downtime.