Troubleshoot or Stop Long-Running Shard Migration

Just an update here: we stepped down the primary of our config server replica set, and that seems to have gotten things unstuck. We can now call sh.stopBalancer() and sh.startBalancer() successfully, and the migration's timestamp has been updated.
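For reference, the sequence we ran looked roughly like this (the step-down from a shell connected to the config server primary, the balancer commands from a shell connected to a mongos; the 60-second step-down window is just an illustrative value):

```javascript
// On the config server replica set PRIMARY:
rs.stepDown(60)        // step down and refuse re-election for ~60 seconds

// On a mongos:
sh.stopBalancer()      // disable the balancer and wait for any round to finish
sh.startBalancer()     // re-enable the balancer
sh.status()            // confirm the new balancer/migration state
```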

  shards:
        {  "_id" : "rs1",  "host" : "rs1/shard1a:27018,shard1b:27018,shard1c:27018" }
        {  "_id" : "rs2",  "host" : "rs2/shard2a:27018,shard2b:27018,shard2c:27018",  "state" : 1 }
  active mongoses:
        "3.6.23" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Collections with active migrations:
                mydatabase.mycollection started at Fri Dec 09 2022 10:47:11 GMT-0800 (PST)
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "mydatabase",  "primary" : "rs1",  "partitioned" : true }
                mydatabase.mycollection
                        shard key: { "userId" : "hashed" }
                        unique: false
                        balancing: true
                        chunks:
                                rs1	44215
                                rs2	21
                        too many chunks to print, use verbose if you want to force print

That said, even after sh.startBalancer() completes successfully, sh.isBalancerRunning() still returns false.

We're giving it a bit of time to see whether it starts migrating chunks, but we're wondering if this points to a deeper issue.
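One thing we're keeping in mind: sh.isBalancerRunning() reports whether a balancing round is in progress at that instant, not whether the balancer is enabled, so a false right after sh.startBalancer() isn't necessarily wrong. A few commands we can use to check (run from the shell against a mongos; this is a sketch of our checks, not a definitive diagnosis):

```javascript
sh.getBalancerState()     // true if the balancer is enabled at all
sh.isBalancerRunning()    // true only while a balancing round is actively in progress

// Inspect the balancer lock and any queued/active migrations directly:
use config
db.locks.find({ _id: "balancer" }).pretty()
db.migrations.find().pretty()
```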
