How to speed up the chunk movement process

Hello,

We have a 3-node, 5-shard cluster: 3 shards are tagged as the HOT data set and 2 shards as the WARM data set. We expect to insert around 3 million documents daily and ~120 million monthly. To separate HOT from WARM, I have assigned tags with date ranges. For example: current to 2021-06-01 is the HOT data set, and 2021-06-01 to min is the WARM data set.

Our strategy is to update the tags every 15 days and roll data over from HOT to WARM. For example, after a tag update the date ranges look like this: current to 2021-06-15 is the HOT data set, and 2021-06-15 to min is the WARM data set. Each rollover involves about 45 million documents, on top of the regular inserts coming from the application.

We have tested this scenario under heavy load and noticed that chunk movement from HOT to WARM is very slow. We have a buffer of 15 days for the data to move from HOT to WARM. Based on the current chunk movement progress, we are concerned that we will not be able to move all the data before the next tag update (2021-06-30).
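To put the numbers above in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the 45 million documents and the 15-day buffer mentioned above; the function name is just for illustration, and it assumes migrations could run evenly around the clock, which is optimistic for a cluster under insert load.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_docs_per_second(docs_to_move: int, window_days: int) -> float:
    """Average rate needed to move docs_to_move documents within window_days."""
    return docs_to_move / (window_days * SECONDS_PER_DAY)


rollover_docs = 45_000_000  # documents per rollover, from the numbers above
window_days = 15            # buffer before the next tag update

rate = required_docs_per_second(rollover_docs, window_days)
print(f"{rate:.1f} documents/second needed on average")
```

Any sustained migration rate below this average means the rollover cannot finish before the next tag update.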

After some analysis, we came up with a process to move chunks manually: as soon as the tags are updated, a script identifies all the chunks that need to move from HOT to WARM and moves them. This is a metadata update, and the actual data move happens later. This process is also very slow.
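For reference, the selection step of such a script could look roughly like the sketch below. This is not our exact script. The shard names, the namespace `mydb.events`, the shard key field `createdAt`, and the chunk dictionaries are all made up for illustration, and `datetime.min` stands in for MongoDB's MinKey so the sketch runs without a driver. It only builds `moveChunk` command documents and does not send them anywhere.

```python
from datetime import datetime

# Stand-in for MongoDB's MinKey (assumption, keeps the sketch driver-free).
MIN_KEY = datetime.min


def chunks_to_move(chunks, boundary, hot_shards):
    """Return chunks that lie entirely below the boundary but still sit on a HOT shard."""
    # A chunk's max bound is exclusive, so max <= boundary means fully WARM.
    return [c for c in chunks if c["max"] <= boundary and c["shard"] in hot_shards]


def build_move_commands(namespace, chunks, warm_shards):
    """Build moveChunk command documents, spreading chunks round-robin over WARM shards."""
    commands = []
    for i, chunk in enumerate(chunks):
        commands.append({
            "moveChunk": namespace,
            # "createdAt" is a hypothetical shard key field.
            "bounds": [{"createdAt": chunk["min"]}, {"createdAt": chunk["max"]}],
            "to": warm_shards[i % len(warm_shards)],
            "forceJumbo": True,
        })
    return commands


# Hypothetical chunk layout right after the tag boundary moved to 2021-06-15.
chunks = [
    {"min": MIN_KEY, "max": datetime(2021, 6, 1), "shard": "warm1"},
    {"min": datetime(2021, 6, 1), "max": datetime(2021, 6, 8), "shard": "hot1"},
    {"min": datetime(2021, 6, 8), "max": datetime(2021, 6, 15), "shard": "hot2"},
    {"min": datetime(2021, 6, 15), "max": datetime(2021, 6, 22), "shard": "hot3"},
]
boundary = datetime(2021, 6, 15)

pending = chunks_to_move(chunks, boundary, {"hot1", "hot2", "hot3"})
commands = build_move_commands("mydb.events", pending, ["warm1", "warm2"])
for cmd in commands:
    print(cmd["bounds"], "->", cmd["to"])
```

In a real run, the chunk list would come from the cluster's chunk metadata, and each command document would be issued against the `admin` database through a mongos.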

My question: is there any way we can speed up the chunk movement process? I am currently using forceJumbo: true, even though I don’t have any jumbo chunks. Have you run into scenarios like this? Is there a better way to move the data quickly? Please let me know.

Thanks in advance,
Bhasha :slight_smile: