I have a 3-member replica set where each node is a VPS running pretty much only MongoDB. On a normal day the CPU hovers around the 50% mark, and disk I/O stays under 10 blocks/s.
At a very distinct point in time last night, the CPU on my primary tripled to 150% and has stayed over 100% all day. That’s bad enough, but simultaneously the disk I/O went through the roof, hitting nearly 8,000 blocks per second and staying there. In the nearly 24 hours since the event, it has averaged over 6,000 blocks per second.
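For reference, those figures come from standard host monitoring; something like the following should reproduce them on the box (the 5-second interval is just my choice, not anything special):

```
# sample system-wide disk activity every 5 seconds;
# the 'bi'/'bo' columns are blocks read in / written out per second
vmstat 5

# mongod's own per-second counters for comparison
mongostat 5
```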
The mystery event doesn’t coincide with even the slightest increase in web traffic via the front-end servers, nor with any increased throughput of data from other servers to the database server.
I continue to evaluate whether some user behaviour caused (or is still causing) this, but can anyone suggest a way to analyse why these resources remain so elevated?
I have examined `mongotop` and the `db.currentOp()` console helper to no avail. Although I do see some slow queries, they don’t really explain why the disk I/O is so astronomical.
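In case it matters, this is roughly how I’ve been running them; the filter values below are just what I happened to pick:

```
# per-collection time spent reading/writing, sampled every 5 seconds
mongotop 5
```

and from the mongo shell:

```
// show only active operations that have been running longer than a second
db.currentOp({ "active": true, "secs_running": { "$gt": 1 } })
```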