Tiered Storage Models in MongoDB: Optimizing Latency and Cost
By Rohit Nijhawan, Senior Consulting Engineer at MongoDB with André Spiegel and Chad Tindel
For a user-facing application, speed and uptime are critical to success. There are a number of ways you can tune your application and hardware setup to provide the best experience for your customers – the trick is doing so at optimal cost. Here we provide an example for improving performance and lowering costs with MongoDB using Tiered Storage, a method of prioritizing data storage based on latency requirements.
In this example, we will be segmenting data by date: recent data is more frequently accessed and should exhibit lower latency than less recent data. However, the idea applies to other ways of segmenting data, such as location, user, source, size, or other criteria. This approach takes advantage of a powerful feature in MongoDB called tag-aware sharding that has been available since MongoDB 2.2.
Example Application: Insurance Claims
In many applications, low-latency access to data becomes less important as data ages. For example, an insurance company might prioritize access to claims from the last 12 months. Users should be able to view recent claims quickly, but once claims are more than a year old they tend to be accessed much less frequently, and the latency requirements tend to become less demanding.
By creating tiers of storage with different performance and cost profiles, the insurance company can provide a better experience for users while optimizing their costs. Older claims can be stored in a storage tier with more cost-effective hardware such as commodity hard drives. More recent data can be stored in a high-performance storage tier that provides lower latency such as SSD. Because the majority of the claims are more than a year old, storing older data in the lower-cost tier can provide significant cost advantages. The insurance company can optimize their hardware spread across the two tiers, providing a great user experience at an optimized cost point.
The requirements for this application can be summarized as:
The trailing 12 months of claims should reside on the faster storage tier
Claims more than a year old should move to the slower storage tier
Over time new claims arrive, and older claims need to move from the faster tier to the slower tier
For simplicity, throughout this overview we’ll refer to the two classes of claims data as “current” and “tier-2” data.
Building Your Own Process: An Operational Headache
One approach to these requirements is to use periodic batch jobs: select the data, load it into the archive, and erase it from the faster storage. However, this approach is inherently complex, as the list below and the sketch that follows it illustrate:
The move process must be carefully coded to fail gracefully. In the event that a load fails, you don’t want to delete the original data!
If the data to be moved is large, you may wish to throttle the operations.
If moves succeed partially, you have to retry the unfinished data.
Unless you plan on halting your application during the move (generally unacceptable), your application needs custom code to find the data before, during, and after the move.
Your application needs to understand the physical location of the data, which unnecessarily couples your code to the partitioning logic.
Furthermore, introducing another custom component to your operations requires additional maintenance and monitoring.
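To make these pitfalls concrete, here is a rough sketch of what such a hand-rolled job might look like in the mongo shell. The “claims” and “claims_archive” collections are hypothetical, and the error handling is deliberately minimal:
// Hypothetical archival job: copy documents older than ~12 months to an
// archive collection, then delete them from the primary collection.
var cutoff = new Date(new Date().getTime() - 365 * 24 * 60 * 60 * 1000);
db.claims.find({ date: { $lt: cutoff } }).forEach(function (doc) {
    try {
        db.claims_archive.insert(doc);      // copy first ...
        db.claims.remove({ _id: doc._id }); // ... delete only after the copy succeeds
    } catch (e) {
        // A partial failure leaves the document in place to be retried later, but
        // you still have to handle duplicates in the archive, throttling, and
        // queries that now span two collections.
        print("failed to move " + doc._id + ": " + e);
    }
});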
It’s an operational headache that many teams are forced to endure, but there is a simpler way: have MongoDB handle the migration of documents from the recent-storage machines to the tier-2 storage machines, transparently. As it turns out, you can easily implement this approach with a feature called Tag-Aware Sharding.
The MongoDB Way: Tag-aware Sharding
MongoDB provides a feature called sharding to scale systems horizontally across multiple machines. Sharding is transparent to your application - whether you have 1 or 100 shards, your application code is the same. For a comprehensive description of sharding, please see the Sharding Guide.
A key component of sharding is a process called the balancer. As collections grow, the balancer operates in the background to carefully move documents between shards. Normally the balancer works to achieve a uniform distribution of documents across shards. However, with tag-aware sharding we can create policies that affect where documents are stored. This feature can be applied in many use cases. One example is to keep user data in data centers that are near the user. In our application, we can use this feature to keep current data on our fast servers, and tier 2 data on cheaper, slower servers.
Here’s how it works:
Shards are assigned tags. A tag is an alphanumeric alias like “London-DC”.
Unique shard key ranges are ‘pinned’ to tags.
During normal balancing operations, chunks migrate only to shards whose tag is associated with a key range that contains the chunk’s key range.*
*There are a few subtleties regarding what happens when a chunk’s key range overlaps more than one tag range; please read the documentation carefully regarding this particular case.
This means that we can assign the “tier-2” tag to shards running on slower servers and the “current” tag to shards running on fast servers, and the balancer will handle migrating the data between tiers automatically. What’s great is that we can keep all the data in one database, so our application code doesn’t need to change as data moves between storage tiers.
Determining the shard key
When you query a sharded collection, the query router will do its best to only inspect the shards holding your data, but it can only do this if you provide the shard key as part of your query. (See Sharded Cluster Query Routing for more information.)
So we need to make sure that we look up documents by the shard key. We also know that time is the basis for determining the location of documents in our two storage tiers. Accordingly, the shard key must contain an explicit timestamp. In our example, we’ll be using Enron’s email dataset, and we’ll set the top-level “date” field as the shard key. Here’s a sample document:
{
    "_id" : ObjectId("4f16fc97d1e2d32371003f87"),
    "body" : "i say el tiempo\n\n\nTo: Timothy Blanchard/HOU/EES@EES, Bryan Hull/HOU/ECT@ECT, Luis \nMena/NA/Enron@Enron, Lisa Gillette/HOU/ECT@ECT, Susan M Scott/HOU/ECT@ECT, \nShanna Husser/HOU/EES@EES, Eric Bass/HOU/ECT@ECT,mmmarcantel@equiva.com\ncc: \nSubject: \n\ndoes everyone want to meet at tortucas on kirby south of 59 or el tiempo(no \ntequilla shots) or cabos downtown tonight. let's meet around 6-6:30.\n\n",
    "date" : ISODate("2001-03-01T09:55:00Z"),
    "filename" : "1107.",
    "headers" : {
        "Content-Transfer-Encoding" : "7bit",
        "Content-Type" : "text/plain; charset=us-ascii",
        "From" : "eric.bass@enron.com",
        "Message-ID" : "",
        "Mime-Version" : "1.0",
        "Subject" : "Re:",
        "To" : [
            "matthew.lenhart@enron.com"
        ],
        "X-FileName" : "ebass.nsf",
        "X-Folder" : "\\Eric_Bass_Jun2001\\Notes Folders\\Sent",
        "X-From" : "Eric Bass",
        "X-Origin" : "Bass-E",
        "X-To" : "Matthew Lenhart",
        "X-bcc" : "",
        "X-cc" : ""
    },
    "mailbox" : "bass-e",
    "subFolder" : "sent"
}
Because the time is stored in the most significant digits of the date, messages from any given day will numerically precede messages from subsequent days.
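For example, once the collection is sharded on “date”, a query that includes a date range can be routed to only the shard(s) that own that range, while a query without the shard key has to be broadcast to every shard. The sender filter below is just an illustrative predicate:
// Includes the shard key, so mongos can target only the relevant shard(s):
db.messages.find({
    date: { $gte: ISODate("2001-09-01"), $lt: ISODate("2001-10-01") },
    "headers.From": "eric.bass@enron.com"
})

// Omits the shard key, so mongos must send the query to all shards:
db.messages.find({ "headers.From": "eric.bass@enron.com" })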
Implementation
Here are the steps to set up this system:
Set up an empty, sharded MongoDB cluster
Create a target database to host the sharded collection
Assign tags to different shards corresponding to the storage tiers
Assign tag ranges to the shards
Load data into the MongoDB Cluster
Set up the cluster
The first thing you will want to do is set up your sharded cluster. You can see more information on how to set this up here.
In this case we will have a database called “enron” and a collection called “messages” which holds part of the Enron email corpus. In this example, we’ve set up a cluster with three shards. The first, shard0000, is optimized for low-latency access to data. The other two, shard0001 and shard0002, use more cost-effective hardware for data that is older than the identified cutoff date.
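If you are building the cluster from scratch, adding the three shards through a mongos might look roughly like this (a minimal sketch; in a production deployment each shard would normally be a replica set rather than a single mongod):
// Run against a mongos. One shard on fast storage, two on cheaper storage.
sh.addShard("Server1:27017")   // becomes shard0000 - SSD-backed, low latency
sh.addShard("Server2:27017")   // becomes shard0001 - commodity hard drives
sh.addShard("Server3:27017")   // becomes shard0002 - commodity hard drives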
Here’s our sharded cluster. These are empty machines with no data:
sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "version" : 3,
    "minCompatibleVersion" : 3,
    "currentVersion" : 4,
    "clusterId" : ObjectId("53616554992f1bbce576f9fd")
  }
  shards:
    { "_id" : "shard0000", "host" : "Server1:27017" }
    { "_id" : "shard0001", "host" : "Server2:27017" }
    { "_id" : "shard0002", "host" : "Server3:27017" }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
Adding the tags
We can “tag” each of these shards to associate them with documents that belong in our “current” tier or in our “tier-2” tier. In the absence of tag ranges, the balancer simply tries to keep the number of chunks on each shard equal, without regard to any other data in the documents. Before we add the data to our collection, let’s tag shard0000 as “current” and the other two as “tier-2”:
sh.addShardTag('shard0000', 'current')
sh.addShardTag('shard0001', 'tier-2')
sh.addShardTag('shard0002', 'tier-2')
Now we can verify our tags by calling sh.status():
sh.status()
--- Sharding Status ---
  ...
    "clusterId" : ObjectId("53616554992f1bbce576f9fd")
  }
  shards:
    { "_id" : "shard0000", "host" : "Server1:27017", "tags" : [ "current" ] }
    { "_id" : "shard0001", "host" : "Server2:27017", "tags" : [ "tier-2" ] }
    { "_id" : "shard0002", "host" : "Server3:27017", "tags" : [ "tier-2" ] }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
Next, we need to set up a database and collection for the Enron emails. We’ll set up a new database ‘enron’ with a collection called ‘messages’ and enable sharding on that collection:
use enron
switched to db enron
db.createCollection("messages")
{ "ok" : 1 }
sh.enableSharding("enron")
{ "ok" : 1 }
Since we’re going to shard the collection, we’ll need to set up a shard key. We will use the ‘date’ field as our shard key since this is the field that will define how the documents are distributed across shards:
db.messages.ensureIndex({date:1})
sh.shardCollection('enron.messages',{date:1})
{ "collectionsharded" : "enron.messages", "ok" : 1 }
Defining the cutoff date between tiers
The cutoff point between “current” data and “tier-2” data is a point in time that we will update periodically to keep the most recent documents in our “current” shard. We will start with a cutoff of July 1, 2001, expressed as ISODate(“2001-07-01”), and use it as the boundary between the two tag ranges. Going forward, when we add documents to the “messages” collection, any documents dated on or after July 1, 2001 will end up on the “current” shard, and older documents will end up on the “tier-2” shards.
//Add the tags
sh.addTagRange('enron.messages',{date:MinKey},{date:ISODate("2001-07-01")},'tier-2')
sh.addTagRange('enron.messages',{date:ISODate("2001-07-01")},{date:MaxKey},'current')
It’s important that the two ranges meet at exactly the same point in time, with no gap and no overlap. The lower bound of a tag range is inclusive, and the upper bound is exclusive. This means a document that has a date of exactly ISODate(“2001-07-01”) will go on the “current” shard, not the “tier-2” shards.
Below you will see the new tag ranges at the bottom of the sh.status() output:
sh.status()
--- Sharding Status ---
  ...
    "clusterId" : ObjectId("53616554992f1bbce576f9fd")
  }
  shards:
    { "_id" : "shard0000", "host" : "Server1:27017", "tags" : [ "current" ] }
    { "_id" : "shard0001", "host" : "Server2:27017", "tags" : [ "tier-2" ] }
    { "_id" : "shard0002", "host" : "Server3:27017", "tags" : [ "tier-2" ] }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
{ "_id" : "enron", "partitioned" : true, "primary" : "shard0000" }
enron.messages
shard key: { "date" : 1 }
chunks:
shard0001 1
shard0000 1
{ "date" : { "$minKey" : 1 } } -->> { "date" : ISODate("2001-07-01T00:00:00Z") } on : shard0001 Timestamp(2, 0)
{ "date" : ISODate("2001-07-01T00:00:00Z") } -->> { "date" : { "$maxKey" : 1 } } on : shard0000 Timestamp(2, 1)
tag: tier-2 { "date" : { "$minKey" : 1 } } -->> { "date" : ISODate("2001-07-01T00:00:00Z") }
tag: current { "date" : ISODate("2001-07-01T00:00:00Z") } -->> { "date" : { "$maxKey" : 1 } }
As a final check, look in the config database for the tag range definitions.
var configdb = db.getSiblingDB("config")
configdb.tags.find().pretty()
{
    "_id" : {
        "ns" : "enron.messages",
        "min" : {
            "date" : { "$minKey" : 1 }
        }
    },
    "ns" : "enron.messages",
    "min" : {
        "date" : { "$minKey" : 1 }
    },
    "max" : {
        "date" : ISODate("2001-07-01T00:00:00Z")
    },
    "tag" : "tier-2"
}
{
    "_id" : {
        "ns" : "enron.messages",
        "min" : {
            "date" : ISODate("2001-07-01T00:00:00Z")
        }
    },
    "ns" : "enron.messages",
    "min" : {
        "date" : ISODate("2001-07-01T00:00:00Z")
    },
    "max" : {
        "date" : { "$maxKey" : 1 }
    },
    "tag" : "current"
}
Now that all the shards and tag ranges are defined, we are ready to load the message data into the cluster. The documents will follow the tag ranges and land on the correct machines.
mongorestore -d enron -c messages enron/messages.bson
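As a quick sanity check after the load, you can count how many documents fall on either side of the cutoff; the exact numbers will depend on the portion of the corpus you restore:
use enron
// Documents that belong on the "tier-2" shards (older than the cutoff):
db.messages.count({ date: { $lt: ISODate("2001-07-01") } })
// Documents that belong on the "current" shard (cutoff date or newer):
db.messages.count({ date: { $gte: ISODate("2001-07-01") } })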
Now, let’s check the sharding status to see where the documents reside:
sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "version" : 3,
    "minCompatibleVersion" : 3,
    "currentVersion" : 4,
    "clusterId" : ObjectId("53616554992f1bbce576f9fd")
  }
  shards:
    { "_id" : "shard0000", "host" : "Server1:27017", "tags" : [ "current" ] }
    { "_id" : "shard0001", "host" : "Server2:27017", "tags" : [ "tier-2" ] }
    { "_id" : "shard0002", "host" : "Server3:27017", "tags" : [ "tier-2" ] }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
    { "_id" : "enron", "partitioned" : true, "primary" : "shard0000" }
      enron.messages
        shard key: { "date" : 1 }
        chunks:
          shard0002  6
          shard0001  7
          shard0000  3
        { "date" : { "$minKey" : 1 } } -->> { "date" : ISODate("2001-02-01T07:48:00Z") } on : shard0002 Timestamp(5, 0)
        { "date" : ISODate("2001-02-01T07:48:00Z") } -->> { "date" : ISODate("2001-02-09T11:24:00Z") } on : shard0002 Timestamp(6, 0)
        { "date" : ISODate("2001-02-09T11:24:00Z") } -->> { "date" : ISODate("2001-02-23T08:09:00Z") } on : shard0002 Timestamp(7, 0)
        { "date" : ISODate("2001-02-23T08:09:00Z") } -->> { "date" : ISODate("2001-03-06T09:11:00Z") } on : shard0002 Timestamp(8, 0)
        { "date" : ISODate("2001-03-06T09:11:00Z") } -->> { "date" : ISODate("2001-03-19T19:21:00Z") } on : shard0002 Timestamp(9, 0)
        { "date" : ISODate("2001-03-19T19:21:00Z") } -->> { "date" : ISODate("2001-03-28T15:04:00Z") } on : shard0002 Timestamp(10, 0)
        { "date" : ISODate("2001-03-28T15:04:00Z") } -->> { "date" : ISODate("2001-04-10T16:06:49Z") } on : shard0001 Timestamp(10, 1)
        { "date" : ISODate("2001-04-10T16:06:49Z") } -->> { "date" : ISODate("2001-04-23T17:00:35Z") } on : shard0001 Timestamp(3, 10)
        { "date" : ISODate("2001-04-23T17:00:35Z") } -->> { "date" : ISODate("2001-05-06T23:20:00Z") } on : shard0001 Timestamp(3, 11)
        { "date" : ISODate("2001-05-06T23:20:00Z") } -->> { "date" : ISODate("2001-05-11T15:19:00Z") } on : shard0001 Timestamp(4, 1)
        { "date" : ISODate("2001-05-11T15:19:00Z") } -->> { "date" : ISODate("2001-05-23T10:06:00Z") } on : shard0001 Timestamp(4, 2)
        { "date" : ISODate("2001-05-23T10:06:00Z") } -->> { "date" : ISODate("2001-06-05T16:24:00Z") } on : shard0001 Timestamp(3, 14)
        { "date" : ISODate("2001-06-05T16:24:00Z") } -->> { "date" : ISODate("2001-07-01T00:00:00Z") } on : shard0001 Timestamp(3, 15)
        { "date" : ISODate("2001-07-01T00:00:00Z") } -->> { "date" : ISODate("2001-07-27T09:08:00Z") } on : shard0000 Timestamp(3, 2)
        { "date" : ISODate("2001-07-27T09:08:00Z") } -->> { "date" : ISODate("2001-09-01T03:05:08Z") } on : shard0000 Timestamp(3, 3)
        { "date" : ISODate("2001-09-01T03:05:08Z") } -->> { "date" : { "$maxKey" : 1 } } on : shard0000 Timestamp(4, 0)
        tag: tier-2  { "date" : { "$minKey" : 1 } } -->> { "date" : ISODate("2001-07-01T00:00:00Z") }
        tag: current  { "date" : ISODate("2001-07-01T00:00:00Z") } -->> { "date" : { "$maxKey" : 1 } }
That’s it! The balancer automatically moves documents to comply with the tag ranges. In this example, all documents with a date older than ISODate(“2001-07-01T00:00:00Z”) ended up on the “tier-2” shards, while newer documents stayed on the “current” shard.
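If you want to confirm the physical placement in more detail than sh.status() provides, collection statistics gathered through mongos include a per-shard breakdown (the exact field layout can vary between versions):
// Print how many documents of enron.messages each shard currently holds.
var stats = db.messages.stats();
for (var shard in stats.shards) {
    print(shard + ": " + stats.shards[shard].count + " documents");
}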
The tag ranges must be updated on a regular basis to keep the cutoff point at the correct interval of time in the past (1 year, in our case). To perform this change, both ranges need to be updated, and the balancer should temporarily be disabled so that there is never a moment where the ranges overlap or leave a gap. Stopping the balancer temporarily is a safe operation - it will not affect the application or the experience of users.
If you wanted to move the cutoff forward another month, to August 1, 2001, you just need to follow these steps:
Stop the balancer
sh.setBalancerState(false)
Create a chunk split at August 1
sh.splitAt('enron.messages', {"date" : ISODate("2001-08-01")})
Move the cutoff date to ISODate(“2001-08-01T00:00:00Z”)
var configdb=db.getSiblingDB("config");
configdb.tags.update({tag:"tier-2"},{$set:{'max.date':ISODate("2001-08-01")}})
configdb.tags.update({tag:"current"},{$set:{'min.date':ISODate("2001-08-01")}})
Re-start the balancer
sh.setBalancerState(true)
Verify the sharding status
sh.status()
--- Sharding Status ---
  sharding version: {
    "_id" : 1,
    "version" : 3,
    "minCompatibleVersion" : 3,
    "currentVersion" : 4,
    "clusterId" : ObjectId("5314f1487abd6cb2803696d6")
  }
  shards:
    { "_id" : "shard0000", "host" : "Server1:27017", "tags" : [ "current" ] }
{ "_id" : "shard0001", "host" : "Server1:27017", "tags": [ "tier-2" ] }
{ "_id" : "shard0002", "host" : "Server1:27017", "tags": [ "tier-2" ] }
  databases:
    { "_id" : "admin", "partitioned" : false, "primary" : "config" }
    { "_id" : "test", "partitioned" : false, "primary" : "shard0000" }
    { "_id" : "enron", "partitioned" : true, "primary" : "shard0000" }
      enron.messages
        shard key: { "date" : 1 }
        chunks:
          shard0001  7
          shard0002  6
          shard0000  2
        { "date" : { "$minKey" : 1 } } -->> { "date" : ISODate("2001-02-16T10:55:00Z") } on : shard0001 Timestamp(2, 0)
        { "date" : ISODate("2001-02-16T10:55:00Z") } -->> { "date" : ISODate("2001-03-06T10:04:00Z") } on : shard0002 Timestamp(3, 0)
        { "date" : ISODate("2001-03-06T10:04:00Z") } -->> { "date" : ISODate("2001-03-20T22:12:00Z") } on : shard0001 Timestamp(4, 0)
        { "date" : ISODate("2001-03-20T22:12:00Z") } -->> { "date" : ISODate("2001-04-03T13:37:14Z") } on : shard0002 Timestamp(5, 0)
        { "date" : ISODate("2001-04-03T13:37:14Z") } -->> { "date" : ISODate("2001-04-17T13:50:16Z") } on : shard0001 Timestamp(6, 0)
        { "date" : ISODate("2001-04-17T13:50:16Z") } -->> { "date" : ISODate("2001-04-27T16:29:00Z") } on : shard0002 Timestamp(7, 0)
        { "date" : ISODate("2001-04-27T16:29:00Z") } -->> { "date" : ISODate("2001-05-09T12:27:00Z") } on : shard0001 Timestamp(8, 0)
        { "date" : ISODate("2001-05-09T12:27:00Z") } -->> { "date" : ISODate("2001-05-18T07:46:00Z") } on : shard0002 Timestamp(9, 0)
        { "date" : ISODate("2001-05-18T07:46:00Z") } -->> { "date" : ISODate("2001-05-30T19:21:00Z") } on : shard0001 Timestamp(10, 0)
        { "date" : ISODate("2001-05-30T19:21:00Z") } -->> { "date" : ISODate("2001-06-08T12:49:56Z") } on : shard0002 Timestamp(11, 0)
        { "date" : ISODate("2001-06-08T12:49:56Z") } -->> { "date" : ISODate("2001-07-01T00:00:00Z") } on : shard0001 Timestamp(12, 0)
        { "date" : ISODate("2001-07-01T00:00:00Z") } -->> { "date" : ISODate("2001-07-12T20:25:00Z") } on : shard0002 Timestamp(13, 0)
        { "date" : ISODate("2001-07-12T20:25:00Z") } -->> { "date" : ISODate("2001-08-01T00:00:00Z") } on : shard0001 Timestamp(14, 0)
        { "date" : ISODate("2001-08-01T00:00:00Z") } -->> { "date" : ISODate("2001-08-20T19:34:00Z") } on : shard0000 Timestamp(14, 1)
        { "date" : ISODate("2001-08-20T19:34:00Z") } -->> { "date" : { "$maxKey" : 1 } } on : shard0000 Timestamp(1, 12)
        tag: tier-2  { "date" : { "$minKey" : 1 } } -->> { "date" : ISODate("2001-08-01T00:00:00Z") }
        tag: current  { "date" : ISODate("2001-08-01T00:00:00Z") } -->> { "date" : { "$maxKey" : 1 } }
By moving the cutoff to August 1, we have migrated all the documents dated on or after July 1 but before August 1 from the “current” shard to the “tier-2” shards. The good news is that we were able to perform this operation without changing our application code and with no database downtime. We can also see that it would be simple to schedule this process to run automatically through an external process.
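One way such an external process might look is a small script, run on a schedule against a mongos, that bundles the steps above into a single function (a sketch only, reusing the tag names and collection from this example):
// Advance the tier cutoff to a new date.
function rollCutoff(newCutoff) {
    sh.setBalancerState(false);                          // 1. pause the balancer
    sh.splitAt("enron.messages", { date: newCutoff });   // 2. split a chunk at the new cutoff
    var configdb = db.getSiblingDB("config");            // 3. shift both tag ranges
    configdb.tags.update({ tag: "tier-2" },  { $set: { "max.date": newCutoff } });
    configdb.tags.update({ tag: "current" }, { $set: { "min.date": newCutoff } });
    sh.setBalancerState(true);                           // 4. resume balancing
}

// For example, to age out one more month of data:
rollCutoff(ISODate("2001-09-01"));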
From Operational Headache to Simplicity
The end result is one collection spread across three shards and two different storage systems. This solution allows you to lower your storage costs without adding complexity to the architecture of your system. Instead of a complex setup with different databases on different machines we have one database to query, and instead of a data migration we update some simple rules to control the location of data in the system.
May 14, 2014