Do you have an application with a growing user base, or do you have an application that you anticipate will grow in the future? If so, then the load on your database is most likely growing as your application saves larger amounts of data. Whether it’s the number of connections needed, the amount of data to be stored, or the increased processing power, any database will eventually hit a limit on what it can handle.
This article will explore:
Scalability is the ability to expand or contract the capacity of system resources in order to support the changing usage of your application. This can refer both to increasing and decreasing usage of the application.
Increased usage of your application brings three main challenges to your database server:
The CPU and/or memory becomes overloaded, and the database server either cannot respond to all the request throughput or do so in a reasonable amount of time.
Your database server runs out of storage, and thus cannot store all the data.
Your network interface is overloaded, so it cannot support all the network traffic received.
The first action you might take to address the need for increased capacity is application and database optimization. Examples include optimizing the application code, caching, and appropriately indexing your query patterns . These optimizations increase the efficiency of your application and should bring some relief. However, there comes a point when system resource limits are reached. At this point, you will want to consider scaling your database vertically, horizontally, or both.
Let’s discuss what these terms mean in more detail.
Vertical scaling refers to increasing the processing power of a single server or cluster. Both relational and non-relational databases can scale up, but eventually, there will be a limit in terms of maximum processing power and throughput. Additionally, there are increased costs with high-performance hardware, as costs do not scale linearly.
The main benefit of vertical scaling is that nothing changes about your database infrastructure other than the hardware specifications of the machine running the database.
As such, it’s transparent to the application. The only difference is that you have more CPUs, memory, and/or storage space.
Vertical scaling is a good option to try first if massive storage and processing are not required.
The downside of scaling up is that servers with more storage and processing power can be a lot more expensive.
There is also a physical limit on the amount of CPUs, memory, network interfaces, and hard-drives that can be used on a single machine. For those scaling up using a cloud platform provider, you will eventually reach the highest tier of machine available.
If scaling vertically requires a migration between hardwares, it could result in downtime or service disruption.
When cost and/or machine limitations become an issue, be sure to consider horizontal scaling.
Horizontal scaling, also known as scale-out, refers to bringing on additional nodes to share the load. This is difficult with relational databases due to the difficulty in spreading out related data across nodes. With non-relational databases, this is made simpler since collections are self-contained and not coupled relationally. This allows them to be distributed across nodes more simply, as queries do not have to “join” them together across nodes.
Scaling MongoDB horizontally is achieved through sharding (preferred).
MongoDB horizontal scaling (sharding) tries to be as transparent as possible, but may require application architecture and code changes. How you store and query the data can significantly affect your application performance.
Database systems that are scaled horizontally are also more complicated to manage and maintain, leading to more work for you and your team. This is where MongoDB Atlas can help with its out-of-the-box sharding.
Now that we have an understanding of vertical and horizontal scaling, let’s dive deeper into horizontal scaling and consider some implementation strategies.
There is a variety of scaling techniques which depend on the database system and what components are used. However, they all use the concept of a node, which is an individual machine storing some or all of the data. A group of nodes that work together is called a cluster.
There are two commonly used horizontal database scaling techniques: replication and horizontal partitioning (or sharding). MongoDB is a modern, document-based database which supports both of these.
Replication refers to creating copies of a database or database node. Replication adds fault-tolerance to a system. Each node in a cluster contains a copy of the data. If one of the nodes goes down, the cluster is still able to serve client requests because the other nodes in the cluster can respond to the requests.
Replication is also a form of scaling because client requests can be spread across all the nodes in the cluster instead of overwhelming a single node. This increases the capacity of the system to handle more database read requests.
In MongoDB, a set of replicated nodes is called a replica set. One of the nodes in a replica set is the primary node, and the other nodes are secondary nodes. Read requests are distributed between each of the nodes. However, only the primary node can be written to, and updates made to the primary node are then replicated to the other nodes.
Replication increases neither the total storage capacity of the system nor its ability to handle write requests. For these, we will need to look to partitioning.
Partitioning distributes data across multiple nodes in a cluster. Each replica set (known in MongoDB as a shard) in a cluster only stores a portion of the data based on a collection sharding key (sharding strategy), which determines the distribution of the data. This makes it possible to scale the storage capacity of the cluster virtually without limit. Since each node is only responsible for processing the data it stores, overall processing capacity for both reads and writes is increased as well.
However, partitioning is a more complex scaling strategy than replication. Because each node only stores part of the data, for each request, the database queries need to determine which node or nodes contain the relevant data. In MongoDB, the client application connects to a sharded cluster through a router which directs the requests to the appropriate nodes.
If the data is stored across multiple nodes, the reads and writes could be done in parallel. For large volume data reads, performance is improved because each node can read its section of data in parallel with the other nodes.
There is an overhead to reading from multiple nodes. The data from all the nodes still needs to be transferred over the network and then combined into a query result set. For small data reads, the network latency could be a significant portion of the overall query time. For those scenarios, it’s more efficient to query using targeted operations.
MongoDB has the ability to store both sharded and unsharded collections in a sharded cluster. This allows the application to take full advantage of the cluster for large data sets while using a primary shard for small data sets.
In order to take advantage of both scalability and fault tolerance, you need to combine partitioning and replication. You can configure multiple groups of nodes (replica sets in MongoDB) with replication and then run a sharded cluster on top of them. Each node in a replica set will hold a copy of the shard data. So, if a server goes down, the replica set can still respond to queries for the shard it holds. It would look something like this:
A partitioned and replicated configuration like this is a best practice and the default configuration for sharded clusters on MongoDB Atlas, MongoDB’s Database-as-a-Service offering.
Confused about your data infrastructure? Atlas also includes data autoscaling, which can help simplify your scaling setup.
For more information on scaling MongoDB specifically, check out How to Scale MongoDB.
Yes. MongoDB allows you to scale your clusters vertically by adding more resources to the cluster, or horizontally by partitioning the data via sharding.
Databases are scaled either vertically (by adding more resources to existing machines) or horizontally (by adding more machines, distributing data, and processing across those machines).
Scaling in DBMS is the ability to expand the capacity of a database system in order to support larger amounts or requests and/or store more data without sacrificing performance.
Scalability issues are problems caused by a system’s inability to support growing demand on resources such as storage/memory, processing, and network bandwidth. These are usually manifested as a system’s degraded performance, errors, or unresponsiveness.
Scalability is usually tested through load testing, e.g., simulating a real-life large number of requests and ensuring that the system can support those requests and adapt to differing amounts of load.
Yes. However, relational databases are not as easily scaled as more modern, non-relational databases. MongoDB, for example, is built from the ground up to scale massively and has high availability.
NoSQL databases are usually built by design for a distributed database environment, allowing them to take advantage of more availability and partition networking built-in solutions, which sometimes comes as a tradeoff for consistency.
Most relational database management systems (RDBMS), such as SQL Server and Oracle, choose consistency over availability. These systems often focus on storing business transaction information, and so consistency is critical to their operation.
On the other hand, most non-relational and NoSQL databases choose availability over consistency because their main focus is the ability to support large amounts of users and data volume even when some of the database nodes go down. This assists the support of scalability using the “partition of data” approach.
Generally, you start by scaling vertically by adding more storage, CPUs, and memory. You could also enable replication and serve some of the read requests from different nodes in the cluster. However, this may require that the application be aware of the different nodes.
Horizontal scaling involves adding additional servers and partitioning the system dataset and load over those servers. Vertical scaling involves expanding the resources used by a single server/replica set.
Neither is better. You should choose the type of scaling that meets your use case. Vertical scaling is generally simpler but more limited. Horizontal scaling is more complex but supports larger loads in terms of number of requests served as well as total data storage.