Degrading performance over time

I’m testing MongoDB performance through an in-house platform and seeing how various metrics are affected, both when doing “unreasonably much” work and under the workload we actually expect. I’m looking at:

  • CPU usage, memory usage, HD R/W, mongostat metrics

The work being done for this particular case is a large number of single-document inserts into a MongoDB 5 time series collection, on the order of 5,000 to 20,000 per second.
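For reference, the load generator does roughly the following (a minimal sketch; the document fields and the `sink` callback are assumptions on my part, with `sink` standing in for something like `collection.insert_one` on the time series collection):

```python
import itertools
import time
from datetime import datetime, timezone

def generate_inserts(sink, rate_per_sec, duration_sec):
    """Issue single-document writes at roughly rate_per_sec for duration_sec.

    `sink` stands in for a real write call (e.g. collection.insert_one);
    the field names below are illustrative, not the actual schema.
    """
    interval = 1.0 / rate_per_sec
    deadline = time.monotonic() + duration_sec
    for seq in itertools.count():
        if time.monotonic() >= deadline:
            break
        sink({
            "ts": datetime.now(timezone.utc),   # the collection's timeField
            "meta": {"source": "loadgen"},      # metaField (assumed)
            "value": seq,
        })
        time.sleep(interval)

# Usage: collect the documents in a list instead of writing to MongoDB.
docs = []
generate_inserts(docs.append, rate_per_sec=1000, duration_sec=0.05)
print(len(docs) > 0)  # some documents were produced
```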

If I overwork it, things quickly go sour and I see 100% CPU on all cores. Even here I’m not quite sure what it is doing that makes CPU the bottleneck. Anyway, if I instead do a “reasonable to heavy” amount of work, I can see it starting out fine:

  1. I start out seeing about half of memory used, medium-to-high CPU utilization across all cores, and HD W in the 50–80 MB/s range. Dirty in mongostat is rising, and requests seem to be running at ~10,000 per second. htop shows lots of mongod processes, with one doing most of the CPU work.
  2. After a while things go sour: CPU usage hits 100% on every core. Dirty has grown to around 5% (perhaps slightly below) in mongostat. One mongod process is doing 500% CPU (of 8 cores). HD W still looks fine, and net_in has remained stable at about 2.5m per second. Memory usage is perhaps up slightly, but not by much.
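For what it’s worth, the `dirty` column in mongostat is the WiredTiger dirty cache as a percentage of the configured cache size, and the same number can be derived from `db.serverStatus()`. A minimal sketch of that calculation (the field paths are from the `wiredTiger.cache` section of serverStatus; the sample numbers are made up):

```python
def dirty_cache_percent(server_status):
    """Compute mongostat's 'dirty' percentage from a db.serverStatus() document.

    dirty = tracked dirty bytes in the WiredTiger cache, as a
    percentage of the configured maximum cache size.
    """
    cache = server_status["wiredTiger"]["cache"]
    dirty_bytes = cache["tracked dirty bytes in the cache"]
    max_bytes = cache["maximum bytes configured"]
    return 100.0 * dirty_bytes / max_bytes

# Example with made-up numbers: a 1 GiB cache holding ~50 MiB of dirty data.
sample = {"wiredTiger": {"cache": {
    "tracked dirty bytes in the cache": 50 * 1024 * 1024,
    "maximum bytes configured": 1024 * 1024 * 1024,
}}}
print(round(dirty_cache_percent(sample), 2))  # → 4.88
```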

Why is it the CPU usage that goes through the roof (I would have expected memory or HD W to be the bottleneck)? What can I do to mitigate this? The machine I’m testing on doesn’t represent the production hardware, but I would still like to see good results.
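One thing I’m considering trying (an assumption on my part, not something I have verified on this workload) is batching the single inserts into larger bulk writes to cut per-operation overhead. A minimal sketch of a batching buffer, where `flush` stands in for something like `collection.insert_many(batch, ordered=False)`:

```python
class InsertBatcher:
    """Buffer single-document writes and flush them in batches.

    `flush` is a callback standing in for a real bulk write
    (e.g. collection.insert_many); using a callback keeps this
    sketch runnable without a server.
    """

    def __init__(self, flush, max_batch=1000):
        self.flush = flush
        self.max_batch = max_batch
        self.buffer = []

    def insert(self, doc):
        self.buffer.append(doc)
        if len(self.buffer) >= self.max_batch:
            self.drain()

    def drain(self):
        """Flush whatever is buffered, including a partial tail batch."""
        if self.buffer:
            self.flush(self.buffer)
            self.buffer = []

# Usage: collect flushed batches in a list instead of writing to MongoDB.
batches = []
batcher = InsertBatcher(batches.append, max_batch=3)
for i in range(7):
    batcher.insert({"value": i})
batcher.drain()  # flush the remaining partial batch
print([len(batch) for batch in batches])  # → [3, 3, 1]
```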

Please also let me know if there is some other metric I should be watching.