mongod consumes a lot of system CPU

My environment:

  1. 1 primary + 2 secondaries + 1 arbiter; version 4.0.8

  2. Primary node host: 80-core CPU, 252 GB memory. Secondary nodes: 24 cores, 252 GB memory. All use RAID 10 SAS disks with an ext4 filesystem.

  3. IO scheduler and read-ahead have been configured according to the official website; NUMA is disabled and transparent_hugepage is disabled.

  4. Database size is 135 GB, of which the oplog is 80 GB, so the real data is only 40 GB.

  5. The number of daily connections is 3200, and there has been no business adjustment or launch recently.

  6. Machine uptime: 1456 days

My question:
1. Suddenly, on the primary node, system CPU went to 95% while user CPU was 5%. mongod.log shows a lot of operations timing out; we verified that all of them are normal business operations.
2. While this was happening, the front-end applications reported that they were unable to write data.
3. top -Hp on the mongod process showed 140 active threads, all of them conn threads and none of them mongod system threads, each using about 95% CPU.
4. The database was restarted repeatedly and primary/secondary switchovers were performed, but the failure did not go away. After 2 hours, for unknown reasons, the active sessions dropped on their own to about 5, system CPU dropped below 3%, and things returned to normal.

I want to know what caused the massive consumption of system CPU.
Thanks

First, you should not have such a replica set configuration. Arbiters exist to make the number of voting nodes odd; yours is actually making the number of voting nodes even! You should remove the arbiter and just run the primary and two secondaries.
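To make the voting arithmetic concrete, here is a minimal Python sketch that counts voting members in a dict shaped like the "members" array returned by rs.conf(). The hostnames and the set name "rs0" are hypothetical, and the helper names are my own for illustration; they are not part of any MongoDB driver.

```python
def count_voting_members(config):
    """Return how many members of a replica set config carry a vote."""
    # A member votes unless its "votes" field is explicitly 0 (the default is 1).
    # Arbiters hold no data but still vote.
    return sum(1 for m in config["members"] if m.get("votes", 1) > 0)


def has_odd_voting_count(config):
    """True when the number of voting members is odd."""
    return count_voting_members(config) % 2 == 1


# Hypothetical config mirroring the topology in the question:
# three data-bearing nodes plus one arbiter.
current = {
    "_id": "rs0",
    "members": [
        {"_id": 0, "host": "db1.example.net:27017"},
        {"_id": 1, "host": "db2.example.net:27017"},
        {"_id": 2, "host": "db3.example.net:27017"},
        {"_id": 3, "host": "arb1.example.net:27017", "arbiterOnly": True},
    ],
}

# The same set with the arbiter removed.
without_arbiter = {
    "_id": "rs0",
    "members": [m for m in current["members"] if not m.get("arbiterOnly", False)],
}

print(count_voting_members(current), has_odd_voting_count(current))
print(count_voting_members(without_arbiter), has_odd_voting_count(without_arbiter))
```

With four voters a majority is three, so the set still tolerates only one failure, the same as with three voters. The arbiter adds nothing here. In the mongo shell the arbiter itself would be removed with rs.remove() run on the primary.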

As for what caused the spike in CPU, without the mongod logs from the time performance was impacted it's impossible to tell. I would first look at the mongod logs, then at any system logs that you might be tracking (were there issues with disk? were other processes impacted?).

In particular, in mongod logs, I would look to see what operations may have started around the time issues first surfaced, or what operations ended when the problems stopped.
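As a rough illustration of that kind of log review, the Python sketch below pulls slow operations out of a chosen time window. It assumes the 4.0-style plain-text log layout (timestamp, severity, component, [context], message, with slow operations ending in a duration such as "5230ms"). The sample lines, the window, and the 1000 ms threshold are all made up for the example; they are not taken from the poster's logs.

```python
import re
from datetime import datetime

# Assumed line layout: "<timestamp> <severity> <component> [<context>] <message>"
LINE_RE = re.compile(r"^(\S+)\s+(\w)\s+(\S+)\s+\[([^\]]+)\]\s+(.*)$")
# Slow-operation lines end with the elapsed time in milliseconds.
DURATION_RE = re.compile(r"(\d+)ms$")
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def slow_ops_in_window(lines, start, end, min_ms=1000):
    """Return (timestamp, context, millis, message) for slow ops logged in [start, end]."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line.rstrip())
        if not m:
            continue  # not in the assumed layout
        ts_text, _severity, _component, context, message = m.groups()
        d = DURATION_RE.search(message)
        if not d:
            continue  # no trailing duration, so not a slow-op line
        try:
            ts = datetime.strptime(ts_text, TS_FORMAT)
        except ValueError:
            continue  # timestamp in some other format
        millis = int(d.group(1))
        if start <= ts <= end and millis >= min_ms:
            hits.append((ts, context, millis, message))
    return hits


# Invented sample lines, for illustration only.
sample = [
    '2019-05-20T10:15:30.123+0800 I COMMAND  [conn101] command app.orders command: insert { insert: "orders" } protocol:op_msg 5230ms',
    '2019-05-20T10:15:31.000+0800 I NETWORK  [listener] connection accepted from 10.0.0.5:40000 #102 (3200 connections now open)',
    '2019-05-20T13:00:00.500+0800 I COMMAND  [conn101] command app.orders command: find { find: "orders" } protocol:op_msg 1500ms',
]

window_start = datetime.strptime("2019-05-20T10:00:00.000+0800", TS_FORMAT)
window_end = datetime.strptime("2019-05-20T12:00:00.000+0800", TS_FORMAT)

hits = slow_ops_in_window(sample, window_start, window_end)
for ts, context, millis, message in hits:
    print(ts.isoformat(), context, millis)
```

Running it once over a window around the start of the incident and once around the time it cleared would give a short list of candidate operations to inspect by hand.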

Asya