Troubleshooting 100% CPU usage in production

Keywords: Java Linux

In production we often find that CPU usage is too high or that a process has run out of memory. On Linux, a few standard commands are enough to see what is consuming the resources and to narrow the problem down to specific code.

Find the process with the highest CPU usage

 [log@task-a-shprod-1 ~]$ top
top - 12:00:19 up 20 days, 19:46,  1 user,  load average: 2.42, 1.71, 2.40
Tasks:  98 total,   2 running,  96 sleeping,   0 stopped,   0 zombie
%Cpu(s): 53.1 us, 16.9 sy,  0.0 ni, 27.8 id,  0.0 wa,  0.0 hi,  2.3 si,  0.0 st
KiB Mem : 16267724 total,   353600 free,  8349840 used,  7564284 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  7557728 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
15980 root      20   0 8375964 3.148g  14824 S 175.7 20.3   3329:10 java
  348 root      20   0   62560  20068  19612 R  85.3  0.1  13592:20 systemd-journal
 8978 root      20   0 7898304 1.205g  14804 S  22.3  7.8  56:33.51 java
10214 root      20   0 8065148 1.695g  14800 S   1.3 10.9  15:43.18 java
 1038 root      10 -10  128800  12248   9300 S   1.0  0.1 294:46.14 AliYunDun
 9605 root      20   0 7970496 1.689g  14784 S   1.0 10.9   4:29.24 java
    3 root      20   0       0      0      0 S   0.3  0.0  12:25.10 ksoftirqd/0
    9 root      20   0       0      0      0 S   0.3  0.0  43:15.84 rcu_sched
   13 root      20   0       0      0      0 S   0.3  0.0  14:10.83 ksoftirqd/1
   18 root      20   0       0      0      0 S   0.3  0.0  13:32.39 ksoftirqd/2
   23 root      20   0       0      0      0 S   0.3  0.0  16:09.67 ksoftirqd/3
 1044 root      20   0  263504  41520   5936 S   0.3  0.3  40:29.56 ilogtail
    1 root      20   0   43384   3788   2496 S   0.0  0.0   0:25.62 systemd

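As you can see, the java process with PID 15980 uses the most CPU, at 175.7%. top reports this figure relative to a single core, so a value above 100% means the process is keeping more than one core busy.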
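
The same check can be scripted instead of read off the interactive screen. The following is a minimal sketch, assuming procps-ng top with its default column layout (%CPU in column 9, COMMAND in column 12); the function name busiest_process is made up for this example.

```shell
# Print "PID %CPU COMMAND" for the first row of the last sample
# found in `top -b` output read from stdin.
# Assumption: default top columns, sorted by %CPU (the default sort).
busiest_process() {
    awk '
        $1 == "PID"      { want = 1; next }   # header row: the next row is the top entry
        want && NF >= 12 { row = $1 " " $9 " " $12; want = 0 }
        END              { if (row != "") print row }
    '
}
```

A possible invocation is `top -b -n 2 -d 1 | busiest_process`. It takes two samples one second apart and keeps the last one, since the first sample of a batch run may not reflect the current load.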

Find the threads that consume the most CPU in that process

 [log@task-a-shprod-1 ~]$ top -Hp 15980
top - 12:01:25 up 20 days, 19:48,  1 user,  load average: 4.98, 2.55, 2.64
Threads:  58 total,   2 running,  56 sleeping,   0 stopped,   0 zombie
%Cpu(s): 65.4 us, 15.2 sy,  0.0 ni, 17.2 id,  0.1 wa,  0.0 hi,  2.1 si,  0.0 st
KiB Mem : 16267724 total,   322392 free,  8380124 used,  7565208 buff/cache
KiB Swap:        0 total,        0 free,        0 used.  7527436 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
16131 root      20   0 8375964 3.223g  14824 S 44.9 20.8 651:07.49 java
16130 root      20   0 8375964 3.223g  14824 S 35.9 20.8 628:32.23 java
16132 root      20   0 8375964 3.223g  14824 R 30.9 20.8 569:00.13 java
16133 root      20   0 8375964 3.223g  14824 S 25.9 20.8 638:04.25 java
16129 root      20   0 8375964 3.223g  14824 R 12.0 20.8 678:13.62 java
15982 root      20   0 8375964 3.223g  14824 S  0.7 20.8  12:06.16 java
15983 root      20   0 8375964 3.223g  14824 S  0.7 20.8  12:07.24 java
16149 root      20   0 8375964 3.223g  14824 S  0.7 20.8  25:09.56 java
15984 root      20   0 8375964 3.223g  14824 S  0.3 20.8  12:07.52 java
15985 root      20   0 8375964 3.223g  14824 S  0.3 20.8  12:04.10 java
15987 root      20   0 8375964 3.223g  14824 S  0.3 20.8   5:59.05 java
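
In this listing several threads are busy at once, not just one, so it can help to collect every thread above some CPU threshold. A minimal sketch, again assuming the default top column layout; the function name hot_threads and the default threshold of 10% are arbitrary choices for illustration.

```shell
# Print "TID %CPU" for every thread at or above a CPU threshold.
# Reads `top -H -b` output from stdin; $1 is the threshold (default 10).
# Assumption: default top columns, with %CPU in column 9.
hot_threads() {
    awk -v min="${1:-10}" '
        $1 == "PID" { rows = ""; in_table = 1; next }  # header: start a fresh sample
        NF == 0     { in_table = 0; next }             # blank line ends the table
        in_table && $9 + 0 >= min + 0 { rows = rows $1 " " $9 "\n" }
        END { printf "%s", rows }
    '
}
```

For the process above this could be run as `top -H -b -n 1 -p 15980 | hot_threads 10`. With the values shown in the listing, that would select threads 16131, 16130, 16132, 16133 and 16129.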

Convert the ID of the most CPU-consuming thread to hexadecimal

[log@task-a-shprod-1 ~]$ printf "%x\n" 16131
3f03
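
The conversion is needed because top shows thread IDs in decimal, while jstack prints the native thread ID in hexadecimal as nid=0x... A small sketch that produces the search string directly; the helper name tid_to_nid is hypothetical.

```shell
# Convert decimal thread IDs (as shown by top -H) to the
# "nid=0x..." form that appears in jstack output, one per line.
tid_to_nid() {
    for tid in "$@"; do
        printf 'nid=0x%x\n' "$tid"
    done
}
```

For example, `tid_to_nid 16131` gives the string to search for in the thread dump of process 15980.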

Locate the code the thread is running

[log@task-a-shprod-1 ~]$ jstack 15980 | grep -A 30 'nid=0x3f03'

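In the jstack output the thread is identified by its native ID, nid=0x3f03. The stack trace printed after that line shows what the thread is executing, which is where to look for the problem code. Note that jstack generally has to be run as the same user that owns the JVM process (root in the listing above).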
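
A fixed number of context lines can cut a long stack short or spill into the next thread. As an alternative, here is a minimal sketch that prints exactly one thread block. It assumes the dump separates threads with blank lines and prints native IDs in hexadecimal as nid=0x...; the function name thread_stack is made up for this example.

```shell
# Print the stack of one thread from a jstack dump read on stdin.
# $1 is the decimal thread ID as shown by top -H.
# Assumption: thread blocks are separated by blank lines and the
# header line of each block contains "nid=0x<hex> ".
thread_stack() {
    awk -v nid="$(printf 'nid=0x%x' "$1")" '
        index($0, nid " ") { show = 1 }   # header line of the wanted thread
        show && NF == 0    { show = 0 }   # blank line ends that thread block
        show               { print }
    '
}
```

With the IDs from this article it could be used as `jstack 15980 | thread_stack 16131`.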

Posted by DannyM on Mon, 30 Sep 2019 03:56:50 -0700