Linux CPU Performance Analysis and Monitoring: vmstat and top

Keywords: Linux Python network

Summary of Linux performance monitoring tools:
- iostat: disk performance monitoring
- vmstat: virtual memory performance monitoring and CPU monitoring (process context switching, CPU utilization)
- top: system load, CPU usage, and a detailed report for each process (CPU usage, memory usage, etc.)
- free: memory usage
- ps is not a performance monitoring tool, but you can combine it with the commands above to find the processes that consume the most system resources.

This article focuses on vmstat. Other tools, such as top, are not explained in detail. Once you can read a vmstat report, a top report is read in much the same way. If a field in the top report is unfamiliar, see man top.

I. Usage of vmstat

The vmstat command is mainly used to view virtual memory statistics, but it also reports on other system resources, such as the CPU.

vmstat [interval] [count]

vmstat options

-a  Displays active and inactive memory
-f  Displays the number of forks since system boot
-m  Displays slabinfo
-s  Displays a static table of memory-related statistics

First, run the command with no arguments. The meaning of each field is explained below from its output, which helps with the analysis later on.

[root@master ~]# vmstat 
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 431340  44840 211744    0    0     5     2  149    9  2  4 95  0  0    

[root@minion ~]# vmstat -a
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free  inact active   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 757496  64916  83772    0    0    85     7   56   42  1  3 96  0  0

Each field has the following meaning.

procs

- r: number of runnable processes (running or waiting for CPU time)
- b: number of processes in uninterruptible sleep

memory

- swpd: amount of swap space in use
- free: amount of free physical memory
- buff: amount of memory used as buffers
- cache: amount of memory used as cache
- inact: amount of inactive memory (-a option)
- active: amount of active memory (-a option)

swap

- si: amount of memory swapped in from disk per second
- so: amount of memory swapped out to disk per second

io

- bi: blocks received from block devices per second
- bo: blocks sent to block devices per second. If this value stays above zero for a long time, memory may be under pressure because the cache is not absorbing the I/O (direct I/O is another possible cause, but it is usually rare).

system

- in: interrupts per second, including clock interrupts
- cs: context switches per second

cpu

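- us: percentage of CPU time spent in user processes
- sy: percentage of CPU time spent in the kernel (system)
- id: percentage of CPU time spent idle
- wa: percentage of CPU time spent waiting for I/O (a high value suggests a disk performance problem)
- st: steal time, the percentage of time taken from this virtual machine by the hypervisor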
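
The field layout described above is regular enough to read from a script. The following is a minimal Python 3 sketch, not part of vmstat itself, that turns plain vmstat output into one dictionary per sample row. The function name parse_vmstat is an assumption made for this illustration, and the sample text is the default report shown earlier.

```python
def parse_vmstat(report):
    """Turn plain vmstat output into a list of {field: int} dicts."""
    samples = []
    fields = None
    for line in report.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] == "procs":
            continue  # blank line or the group banner (procs, memory, ...)
        if tokens[0] == "r":
            fields = tokens  # header row: r b swpd free ...
            continue
        if fields and len(tokens) == len(fields):
            samples.append(dict(zip(fields, map(int, tokens))))
    return samples


# Sample copied from the default vmstat report earlier in this article.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 431340  44840 211744    0    0     5     2  149    9  2  4 95  0  0
"""

row = parse_vmstat(SAMPLE)[0]
print(row["r"], row["free"], row["cs"], row["id"])
```

Because the header row supplies the keys, the same function should also handle the -a layout, where inact and active replace buff and cache.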

II. CPU

2.1 Monitoring Indicators

- CPU utilization. As a rule of thumb, user-space processes should account for at most 65% to 70% of CPU time and the kernel (system) for at most 30% to 35%, leaving 0% to 5% idle. If these ratios are exceeded, system performance drops and the load average rises, as the tests below show.
- Process context switching. Context switching should be judged together with CPU utilization. If CPU utilization is low, a high number of context switches is acceptable. Context switches consume CPU time themselves, so frequent switching raises CPU utilization.
- Run queue length. The number of threads/processes waiting to run should not exceed three per CPU core. For a 4-core machine, the run queue should not exceed 12.
- Load average. The load average per CPU core should be kept around 0.7 and preferably should not exceed 1.
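
The run queue and load average rules of thumb above can be written down as a small check. This is only a sketch in Python 3 under the thresholds quoted in this section. The function name check_cpu_health and its parameters are hypothetical, chosen for illustration.

```python
def check_cpu_health(run_queue, load_1min, cores,
                     max_queue_per_core=3, load_warn=0.7, load_max=1.0):
    """Return a list of warnings based on the rules of thumb in this section."""
    warnings = []

    # Run queue: no more than three runnable threads/processes per core.
    queue_limit = max_queue_per_core * cores
    if run_queue > queue_limit:
        warnings.append("run queue %d exceeds limit %d" % (run_queue, queue_limit))

    # Load average: keep it around 0.7 per core, and preferably below 1.
    load_per_core = load_1min / cores
    if load_per_core > load_max:
        warnings.append("load per core %.2f exceeds %.1f" % (load_per_core, load_max))
    elif load_per_core > load_warn:
        warnings.append("load per core %.2f is above %.1f" % (load_per_core, load_warn))

    return warnings


# A 4-core machine with 12 runnable tasks is exactly at the queue limit.
print(check_cpu_health(run_queue=12, load_1min=2.0, cores=4))
# A 2-core machine with 20 runnable tasks and a load of 15.01 fails both checks.
print(check_cpu_health(run_queue=20, load_1min=15.01, cores=2))
```

On a live system the inputs could come from the r column of vmstat, os.getloadavg()[0], and os.cpu_count().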

In practice I use top and vmstat together. top shows the overall picture as well as the resources consumed by each task, while vmstat shows the number of tasks in the run queue and the rate of process context switching.

The following CPU-intensive program uses multithreading (20 threads) to increment a global variable in a loop.
The program is as follows:

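#!/usr/bin/python
# Python 2 script (uses xrange and the print statement).

import threading

# Shared counter, incremented by every thread without a lock,
# so the final value is not guaranteed to be exact.
count = 0

class Test(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)

    def run(self):
        global count
        # Busy loop: pure CPU work, no I/O.
        for i in xrange(1, 100000000):
            count += 1
        print count

if __name__ == '__main__':
    threads = []
    # Start 20 worker threads, matching the description above.
    for i in range(20):
        thread = Test()
        threads.append(thread)
        thread.start()
    # Wait for all threads to finish before printing the total.
    for thread in threads:
        thread.join()
    print count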

2.2 Before the program runs

Before running the program, I checked the top and vmstat reports.

vmstat Report

[root@master ~]# vmstat 2 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 432696  43672 211724    0    0     6     2  148  715  2  4 94  0  0    
 0  0      0 432688  43672 211724    0    0     0     0   20   17  0  0 100  0  0   
 0  0      0 432688  43672 211724    0    0     0     0   16   17  0  0 100  0  0   
 0  0      0 432432  43672 211724    0    0     0     0   54   43  0  1 99  0  0    
 0  0      0 432400  43672 211724    0    0     0     0   38   37  0  0 100  0  0   
 0  0      0 432376  43672 211724    0    0     0     0   88   65  0  1 99  0  0    
 0  0      0 432120  43672 211724    0    0     0     0   49   35  0  1 99  0  0    
 0  0      0 432152  43672 211724    0    0     0     0   31   28  0  0 99  0  0    
 0  0      0 432152  43672 211724    0    0     0     0   29   26  0  0 100  0  0   
 0  0      0 432152  43672 211724    0    0     0     0   15   16  0  0 100  0  0

top Report

[root@master ~]# top
top - 12:11:37 up  7:07,  3 users,  load average: 0.00, 0.00, 0.02
Tasks: 114 total,   1 running, 113 sleeping,   0 stopped,   0 zombie
Cpu0  :  1.8%us,  3.3%sy,  0.0%ni, 94.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  1.3%us,  3.8%sy,  0.0%ni, 94.8%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   1004412k total,   573856k used,   430556k free,    45160k buffers
Swap:  2047992k total,        0k used,  2047992k free,   211748k cached

Both reports show that the system is healthy: every indicator is low.

2.3 While the program runs

vmstat Report

[root@master ~]# vmstat 2 10
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
21  0      0 423072  43640 211724    0    0     6     2  142  506  2  4 95  0  0    
20  0      0 423064  43640 211724    0    0     0     0 2085 78125 26 73  1  0  0   
20  0      0 423064  43640 211724    0    0     0     0 2038 79752 24 74  1  0  0   
20  0      0 423064  43640 211724    0    0     0     0 2057 78022 25 74  1  0  0   
20  0      0 423064  43640 211724    0    0     0     0 2045 85145 25 73  2  0  0   
20  0      0 423032  43640 211724    0    0     0    12 2002 68602 25 73  2  0  0   
20  0      0 422908  43640 211724    0    0     0     0 2065 79101 25 73  1  0  0   
20  0      0 422908  43640 211724    0    0     0     0 2048 78424 26 73  1  0  0   
10  0      0 422940  43640 211724    0    0     0     0 2039 69779 22 76  2  0  0   
21  0      0 422940  43640 211724    0    0     0     0 2050 81712 26 73  2  0  0

top Report

[root@master ~]# top
top - 10:55:10 up  5:51,  3 users,  load average: 15.01, 11.58, 6.21
Tasks: 115 total,   1 running, 114 sleeping,   0 stopped,   0 zombie
Cpu0  : 27.8%us, 72.2%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 22.2%us, 77.8%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1004412k total,   581836k used,   422576k free,    43656k buffers
Swap:  2047992k total,        0k used,  2047992k free,   211724k cached

The situation at runtime looks very bad. I focused on these values:
top report analysis
- load average: 15.01, 11.58, 6.21
- %sy: above 70% on both CPUs
- %id: 0

vmstat report analysis
- r: the number of threads/processes in the run queue stays at around 20
- cs: about 80,000 context switches per second
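
To check the "around 20" and "about 80,000" figures, the relevant columns of the runtime vmstat report can be averaged. The sketch below is Python 3 and copies the r, cs, us and sy columns from the ten rows above. It skips the first row because the first vmstat report gives averages since boot rather than a live sample. The helper name column_means is made up for this example.

```python
# (r, cs, us, sy) for each of the ten rows in the runtime vmstat report.
ROWS = [
    (21, 506, 2, 4),       # first row: averages since boot, skipped below
    (20, 78125, 26, 73),
    (20, 79752, 24, 74),
    (20, 78022, 25, 74),
    (20, 85145, 25, 73),
    (20, 68602, 25, 73),
    (20, 79101, 25, 73),
    (20, 78424, 26, 73),
    (10, 69779, 22, 76),
    (21, 81712, 26, 73),
]


def column_means(rows):
    """Average each column over the live samples (all rows but the first)."""
    live = rows[1:]
    return tuple(sum(column) / len(live) for column in zip(*live))


mean_r, mean_cs, mean_us, mean_sy = column_means(ROWS)
print("r=%.1f cs=%.0f us=%.1f%% sy=%.1f%%" % (mean_r, mean_cs, mean_us, mean_sy))
```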

This analysis shows that system performance is very poor. Why does the system (kernel) take up so much CPU time? I believe the cause is the frequent context switching while the program runs. Processes and threads are scheduled by the process scheduler in the kernel, so frequent switching means the scheduler itself occupies the CPU very often, which drives up system CPU time. System calls also consume part of that time. As a result, the kernel takes a large share of the CPU while user-space processes get a relatively small share. The system exists to serve its users; if it consumes most of the resources itself, little is left for user work.
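
The context switch rate discussed here can also be measured without vmstat. On Linux, /proc/stat contains a ctxt line holding the cumulative number of context switches since boot, so two readings taken a known interval apart give a per-second rate. The following Python 3 sketch shows the arithmetic. The function names are assumptions for illustration, and the two sample strings are invented values, not measurements from the test machine.

```python
def read_ctxt(stat_text):
    """Extract the cumulative context-switch count from /proc/stat content."""
    for line in stat_text.splitlines():
        if line.startswith("ctxt "):
            return int(line.split()[1])
    raise ValueError("no ctxt line found")


def switches_per_second(before, after, interval_seconds):
    """Rate of context switches between two cumulative readings."""
    return (after - before) / interval_seconds


# Invented /proc/stat excerpts, two seconds apart, for illustration only.
FIRST = "cpu  100 0 200 300\nctxt 1000000\nbtime 1554200000\n"
SECOND = "cpu  150 0 350 302\nctxt 1160000\nbtime 1554200000\n"

rate = switches_per_second(read_ctxt(FIRST), read_ctxt(SECOND), 2.0)
print("context switches per second: %.0f" % rate)
```

On a real system the two strings would come from reading /proc/stat, sleeping for the interval (for example with time.sleep(2)), and reading the file again.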

Supplementary top field meanings:

- %hi: hardware IRQ time. If this value is unevenly distributed across CPUs, interrupt balancing has not been set up. For the setup method, see my previous article, "Multiqueue Network Card Interrupt Balancing".
- %si: soft IRQ time, i.e. software interrupts
- PR: priority
- NI: nice value
- VIRT: virtual image size (kb)
- RES: resident size, the non-swapped physical memory used (kb)
- SHR: shared memory size (kb)
- %CPU: the task's share of CPU time since the last update. If top refreshes every three seconds, this value is the CPU utilization over those three seconds.
- %MEM: the share of available physical memory currently used by the task
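
The %CPU description above amounts to a simple ratio: CPU time the task consumed during the refresh interval, divided by the length of that interval. The Python 3 sketch below works through that arithmetic. It is a simplified model, not top's actual implementation, and it ignores how top scales the value on multi-core machines. The function name cpu_percent is hypothetical.

```python
def cpu_percent(cpu_seconds_before, cpu_seconds_after, interval_seconds):
    """CPU time used during the interval as a percentage of the interval."""
    used = cpu_seconds_after - cpu_seconds_before
    return used / interval_seconds * 100.0


# A task whose accumulated CPU time grows from 10.0 s to 11.5 s during a
# three-second refresh interval used half of one CPU over that interval.
print("%.1f%%" % cpu_percent(10.0, 11.5, 3.0))
```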

Posted by aldernon on Tue, 02 Apr 2019 10:45:30 -0700
