Navigation
This version of the documentation is archived and no longer supported.

Merge Chunks in a Sharded Cluster

On this page

Overview

The mergeChunks command allows you to collapse empty chunks into neighboring chunks on the same shard. A chunk is empty if it has no documents associated with its shard key range.

Important

Empty chunks can make the balancer assess the cluster as properly balanced when it is not.

Empty chunks can occur under various circumstances, including:

  • If a pre-split creates too many chunks, the distribution of data to chunks may be uneven.
  • If you delete many documents from a sharded collection, some chunks may no longer contain data.

This tutorial explains how to identify chunks available to merge, and how to merge those chunks with neighboring chunks.

Procedure

Note

Examples in this procedure use a users collection in the test database, using the username filed as a shard key.

Identify Chunk Ranges

In the mongo shell, identify the chunk ranges with the following operation:

sh.status()

The output of the sh.status() will resemble the following:

--- Sharding Status ---
sharding version: {
     "_id" : 1,
     "version" : 4,
     "minCompatibleVersion" : 4,
     "currentVersion" : 5,
     "clusterId" : ObjectId("5260032c901f6712dcd8f400")
}
shards:
     {  "_id" : "shard0000",  "host" : "localhost:30000" }
     {  "_id" : "shard0001",  "host" : "localhost:30001" }
  databases:
     {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
     {  "_id" : "test",  "partitioned" : true,  "primary" : "shard0001" }
             test.users
                     shard key: { "username" : 1 }
                     chunks:
                             shard0000       7
                             shard0001       7
                     { "username" : { "$minKey" : 1 } } -->> { "username" : "user16643" } on : shard0000 Timestamp(2, 0)
                     { "username" : "user16643" } -->> { "username" : "user2329" } on : shard0000 Timestamp(3, 0)
                     { "username" : "user2329" } -->> { "username" : "user29937" } on : shard0000 Timestamp(4, 0)
                     { "username" : "user29937" } -->> { "username" : "user36583" } on : shard0000 Timestamp(5, 0)
                     { "username" : "user36583" } -->> { "username" : "user43229" } on : shard0000 Timestamp(6, 0)
                     { "username" : "user43229" } -->> { "username" : "user49877" } on : shard0000 Timestamp(7, 0)
                     { "username" : "user49877" } -->> { "username" : "user56522" } on : shard0000 Timestamp(8, 0)
                     { "username" : "user56522" } -->> { "username" : "user63169" } on : shard0001 Timestamp(8, 1)
                     { "username" : "user63169" } -->> { "username" : "user69816" } on : shard0001 Timestamp(1, 8)
                     { "username" : "user69816" } -->> { "username" : "user76462" } on : shard0001 Timestamp(1, 9)
                     { "username" : "user76462" } -->> { "username" : "user83108" } on : shard0001 Timestamp(1, 10)
                     { "username" : "user83108" } -->> { "username" : "user89756" } on : shard0001 Timestamp(1, 11)
                     { "username" : "user89756" } -->> { "username" : "user96401" } on : shard0001 Timestamp(1, 12)
                     { "username" : "user96401" } -->> { "username" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 13)

The chunk ranges appear after the chunk counts for each sharded collection, as in the following excerpts:

Chunk counts:

chunks:
        shard0000       7
        shard0001       7

Chunk range:

{ "username" : "user36583" } -->> { "username" : "user43229" } on : shard0000 Timestamp(6, 0)

Verify a Chunk is Empty

The mergeChunks command requires at least one empty input chunk. In the mongo shell, check the amount of data in a chunk using an operation that resembles:

db.runCommand({
   "dataSize": "test.users",
   "keyPattern": { username: 1 },
   "min": { "username": "user36583" },
   "max": { "username": "user43229" }
})

If the input chunk to dataSize is empty, dataSize produces output similar to:

{ "size" : 0, "numObjects" : 0, "millis" : 0, "ok" : 1 }

Merge Chunks

Merge two contiguous chunks on the same shard, where at least one of the contains no data, with an operation that resembles the following:

db.runCommand( { mergeChunks: "test.users",
                 bounds: [ { "username": "user68982" },
                           { "username": "user95197" } ]
             } )

On success, mergeChunks produces the following output:

{ "ok" : 1 }

On any failure condition, mergeChunks returns a document where the value of the ok field is 0.

View Merged Chunks Ranges

After merging all empty chunks, confirm the new chunk, as follows:

sh.status()

The output of sh.status() should resemble:

--- Sharding Status ---
sharding version: {
     "_id" : 1,
     "version" : 4,
     "minCompatibleVersion" : 4,
     "currentVersion" : 5,
     "clusterId" : ObjectId("5260032c901f6712dcd8f400")
}
shards:
     {  "_id" : "shard0000",  "host" : "localhost:30000" }
     {  "_id" : "shard0001",  "host" : "localhost:30001" }
  databases:
     {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
     {  "_id" : "test",  "partitioned" : true,  "primary" : "shard0001" }
             test.users
                     shard key: { "username" : 1 }
                     chunks:
                             shard0000       2
                             shard0001       2
                     { "username" : { "$minKey" : 1 } } -->> { "username" : "user16643" } on : shard0000 Timestamp(2, 0)
                     { "username" : "user16643" } -->> { "username" : "user56522" } on : shard0000 Timestamp(3, 0)
                     { "username" : "user56522" } -->> { "username" : "user96401" } on : shard0001 Timestamp(8, 1)
                     { "username" : "user96401" } -->> { "username" : { "$maxKey" : 1 } } on : shard0001 Timestamp(1, 13)