Overview
This tutorial shows you how to set up a sharded cluster in MongoDB Atlas and
build a Spring Boot application that uses Spring Data MongoDB to work with a
sharded users collection.
Sharding
Sharding distributes a dataset across multiple machines. This is useful for applications with large datasets or high read and write throughput. MongoDB supports two scaling approaches: vertical scaling and horizontal scaling.
Vertical scaling increases the capacity of a single server by upgrading its CPU, RAM, or storage. This approach has limits based on hardware capabilities and can become expensive.
Horizontal scaling divides your dataset and workload across multiple servers. Each server handles a subset of the overall workload, which can provide better efficiency than a single powerful server. MongoDB Atlas simplifies the management of sharded clusters.
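As a rough illustration of how a dataset can be divided by key range, the following plain Java sketch routes email values to shards. It is a toy model only: the RangeRouter class, the shard names, and the range boundaries are hypothetical, and a real cluster manages its own ranges and routing.

```java
import java.util.List;
import java.util.TreeMap;

// Toy model of range-based partitioning; not MongoDB's actual implementation.
final class RangeRouter {
    // Maps the inclusive lower bound of each key range to a hypothetical shard name.
    private final TreeMap<String, String> lowerBoundToShard = new TreeMap<>();

    RangeRouter() {
        lowerBoundToShard.put("", "shard0");   // keys that sort before "h"
        lowerBoundToShard.put("h", "shard1");  // keys from "h" up to, but not including, "p"
        lowerBoundToShard.put("p", "shard2");  // keys from "p" upward
    }

    // Returns the shard whose range contains the given key.
    String shardFor(String key) {
        // The empty string is the smallest key, so floorEntry always finds a range.
        return lowerBoundToShard.floorEntry(key).getValue();
    }

    public static void main(String[] args) {
        RangeRouter router = new RangeRouter();
        for (String email : List.of("alice@example.com", "maria@example.com", "zoe@example.com")) {
            System.out.println(email + " -> " + router.shardFor(email));
        }
    }
}
```

Each lookup touches only the shard that owns the key's range, which is why each server handles a subset of the overall workload.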
To learn more about sharding, see Sharding in the MongoDB Server manual.
Tutorial
This tutorial shows you how to perform the following actions:
Verify the prerequisites
Set up a sharded cluster in MongoDB Atlas
Configure sharding for a collection
Implement the Spring Boot application
Configure the Spring Boot application
Verify the prerequisites
Before you begin, verify that you have the following:
A MongoDB Atlas account with an M30 or higher cluster. Sharding is available only on the M30 tier or higher.
A Spring Boot project with Spring Data MongoDB and Spring Web dependencies. You can create a project by using Spring Initializr.
Set up a sharded cluster in MongoDB Atlas
To create a sharded cluster, perform the following actions:
In the cluster settings of your M30 or higher cluster, navigate to Additional Settings.
Select Sharding and toggle it on.
Set the number of shards to deploy. For production applications, use more than one shard. You can deploy between 1 and 70 shards. To learn more, see Deploy a Sharded Cluster in the Atlas documentation.
Once your sharded cluster is created, select Load Sample Dataset to load the sample data.
Configure sharding for a collection
Spring Data MongoDB does not automatically configure sharding
for collections. You must perform these operations manually by
using mongosh.
To connect to your cluster and configure sharding, run the following command in your terminal:
mongosh "mongodb+srv://<username>:<password>@<cluster-url>/admin"
Replace the <username>, <password>, and
<cluster-url> placeholders with your MongoDB Atlas
credentials and connection string.
To shard the users collection by the email field, run
the following command:
sh.shardCollection("sample_mflix.users", { email: 1 })
To verify that sharding is enabled and the collection is sharded, run the following command:
sh.status()
Implement the Spring Boot application
To implement the Spring Boot application, you must define the entity, repository, service, and controller layers.
To define an entity class with sharding, use the @Sharded annotation to specify the shard key fields. The following code shows an example User entity with the email field as the shard key:

import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;
import org.springframework.data.mongodb.core.mapping.Sharded;

// Maps to the sharded users collection; email is the shard key
@Document("users")
@Sharded(shardKey = { "email" })
public class User {
    @Id
    private String id;
    private String name;
    private String email;
    private String password;

    // Getters and Setters
}

The @Sharded annotation helps Spring Data MongoDB optimize operations in a sharded environment. It ensures that replaceOne queries include the shard key during upserts.

To create a repository for the User entity, define an interface that extends MongoRepository. The following code shows an example UserRepository interface:

import org.springframework.data.mongodb.repository.MongoRepository;
import com.mongodb.sharded.model.User;

public interface UserRepository extends MongoRepository<User, String> {
}

To handle business logic, create a service class that interacts with the repository. The following code shows an example UserService class:

import java.util.List;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import com.mongodb.sharded.model.User;
import com.mongodb.sharded.repository.UserRepository;

@Service
public class UserService {
    @Autowired
    private UserRepository userRepository;

    // Returns every document in the users collection
    public List<User> getAllUsers() {
        return userRepository.findAll();
    }

    // Inserts the user, or replaces the stored document if its id already exists
    public User saveUser(User user) {
        return userRepository.save(user);
    }
}

To expose REST endpoints for the User entity, create a controller class. The following code shows an example UserController class:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.*;
import java.util.List;
// The service package name is assumed to follow the model and repository packages
import com.mongodb.sharded.model.User;
import com.mongodb.sharded.service.UserService;

@RestController
@RequestMapping("/users") // Assumed base path; adjust it to match your API
public class UserController {
    @Autowired
    private UserService userService;

    @GetMapping
    public List<User> getAllUsers() {
        return userService.getAllUsers();
    }

    @PostMapping
    public User createUser(@RequestBody User user) {
        return userService.saveUser(user);
    }
}
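To make the effect of the @Sharded annotation on upserts more concrete, the following plain Java sketch models the filter that a replaceOne operation might use with and without shard key information. It has no Spring dependency, the ReplaceFilterSketch class and its replaceFilter method are hypothetical names, and it is a simplified illustration rather than Spring Data MongoDB's own code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of a replaceOne filter; not Spring Data MongoDB's implementation.
final class ReplaceFilterSketch {
    // Builds the filter for replacing one document.
    // Pass no shardKeyFields to model an entity without shard key information.
    static Map<String, Object> replaceFilter(Map<String, Object> document, String... shardKeyFields) {
        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("_id", document.get("_id"));
        for (String field : shardKeyFields) {
            // Copy each shard key value from the document into the filter.
            filter.put(field, document.get(field));
        }
        return filter;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("_id", "u1");
        user.put("name", "Alice");
        user.put("email", "alice@example.com");

        System.out.println(replaceFilter(user));           // {_id=u1}
        System.out.println(replaceFilter(user, "email"));  // {_id=u1, email=alice@example.com}
    }
}
```

With the shard key in the filter, the operation carries the value needed to identify the shard that owns the document.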
Configure the Spring Boot application
To connect your Spring Boot application to MongoDB Atlas, add
the MongoDB connection URI to your application.properties
or application.yml file. The following code shows an
example configuration in application.properties format:
spring.data.mongodb.uri=mongodb+srv://<username>:<password>@<cluster-url>/sample_mflix?retryWrites=true&w=majority
Replace the <username>, <password>, and
<cluster-url> placeholders with your MongoDB Atlas
credentials and connection string. The database name in the URI, sample_mflix, is the database that contains the sharded users collection.
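If your username or password contains characters such as @, :, or /, they must be percent-encoded before you place them in the connection URI. The following sketch shows one way to build the URI in plain Java. The AtlasUriBuilder class, the sample credentials, and the cluster host name are hypothetical, and the database name is passed in as an argument.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that assembles a mongodb+srv connection URI.
final class AtlasUriBuilder {
    // Percent-encodes a credential for use in the URI.
    static String encode(String value) {
        // URLEncoder targets form encoding and writes spaces as '+', so convert them to %20.
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    static String buildUri(String username, String password, String clusterUrl, String database) {
        return "mongodb+srv://" + encode(username) + ":" + encode(password)
                + "@" + clusterUrl + "/" + database + "?retryWrites=true&w=majority";
    }

    public static void main(String[] args) {
        // Sample values for illustration only.
        System.out.println(buildUri("appUser", "p@ss:word/1",
                "cluster0.example.mongodb.net", "sample_mflix"));
    }
}
```

The resulting string can be used as the value of the spring.data.mongodb.uri property.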
Shard Key Selection
When you choose a shard key, verify that it distributes data evenly across shards. The choice of a shard key is critical to the performance and scalability of your sharded cluster. To learn more about selecting a shard key, see Choose a Shard Key in the MongoDB Server manual.
An ideal shard key has the following characteristics:
High cardinality: The key has many unique values to distribute data evenly across shards. The key does not need to be entirely unique.
Even distribution: The key distributes documents evenly across all shards to avoid hotspots where one shard handles more data or requests than others.
Support for common queries: Choose a key that aligns with your most common query patterns to reduce query scatter and improve performance.
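The first two characteristics can be estimated from a sample of field values before you commit to a key. The following sketch is one possible approach in plain Java. The ShardKeyStats class, its method names, and the sample data are hypothetical, and both methods assume a non-empty sample.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for comparing candidate shard key fields on sample data.
final class ShardKeyStats {
    // Fraction of values that are distinct; 1.0 means every value is unique.
    static double distinctRatio(List<String> values) {
        return (double) values.stream().distinct().count() / values.size();
    }

    // Share of the sample held by the single most common value.
    static double largestValueShare(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1, Integer::sum);
        }
        int max = counts.values().stream().mapToInt(Integer::intValue).max().orElse(0);
        return (double) max / values.size();
    }

    public static void main(String[] args) {
        List<String> emails = List.of("a@example.com", "b@example.com", "c@example.com", "d@example.com");
        List<String> countries = List.of("US", "US", "US", "DE");

        System.out.println(distinctRatio(emails) + " " + largestValueShare(emails));       // 1.0 0.25
        System.out.println(distinctRatio(countries) + " " + largestValueShare(countries)); // 0.5 0.75
    }
}
```

In this sample, the email values have a higher distinct ratio and a smaller largest share than the country values, so email spreads documents more evenly.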
For the users collection in the sample_mflix database,
the email field works well as a shard key if:
Emails are unique and well-distributed.
Queries frequently filter or sort by email.
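The second condition matters because a query that includes the shard key can be sent to a single shard, while a query without it must be sent to every shard. The following toy sketch in plain Java illustrates the difference. The QueryTargeting class, the shard names, and the placement rule are hypothetical and stand in for the cluster's real routing.

```java
import java.util.List;
import java.util.Map;

// Toy model of query targeting; not MongoDB's actual routing logic.
final class QueryTargeting {
    static final String SHARD_KEY = "email";
    static final List<String> SHARDS = List.of("shard0", "shard1", "shard2");

    // Hypothetical placement rule based on where the key value sorts.
    static String shardFor(String keyValue) {
        if (keyValue.compareTo("h") < 0) {
            return "shard0";
        }
        return keyValue.compareTo("p") < 0 ? "shard1" : "shard2";
    }

    // Returns the shards that a query with the given equality filter must contact.
    static List<String> targets(Map<String, String> filter) {
        String keyValue = filter.get(SHARD_KEY);
        // Without the shard key, the query cannot be narrowed to one shard.
        return keyValue == null ? SHARDS : List.of(shardFor(keyValue));
    }

    public static void main(String[] args) {
        System.out.println(targets(Map.of("email", "alice@example.com"))); // [shard0]
        System.out.println(targets(Map.of("name", "Alice")));              // [shard0, shard1, shard2]
    }
}
```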
Additional Resources
To learn more about sharding in MongoDB, see Sharding in the MongoDB Server manual.