cgroup
cgroup introduction
A cgroup (control group) is a kernel mechanism for controlling how a specific group of processes uses resources. A cgroup binds a set of processes to one or more subsystems.
Why cgroup
cgroups let you group processes uniformly, then monitor the processes and manage their resources per group.
subsystem
cgroup is the Linux mechanism for managing processes in groups. From user space, cgroups organize all processes in the system into independent trees. Each tree contains every process in the system, each node of a tree is a process group, and each tree is associated with zero or more subsystems. The tree itself only groups processes.
A subsystem is a kernel module. Once it is associated with a cgroup tree, it acts on each node (process group) of that tree. A subsystem is often called a resource controller, because it is mainly used to schedule or limit the resources of each process group. That name is not entirely accurate, because sometimes processes are grouped only to monitor them and observe their status, as with the perf_event subsystem. So far, Linux supports 12 subsystems, covering tasks such as limiting CPU time, limiting memory usage, accounting CPU usage, and freezing and resuming a group of processes. They are introduced one by one later.
On CentOS, install the following packages:
libcgroup libcgroup-tools
View subsystem
$ lssubsys -a
cpuset
cpu,cpuacct
memory
devices
freezer
net_cls,net_prio
blkio
perf_event
hugetlb
pids
- cpu: limits the CPU utilization of the processes in a cgroup.
- cpuacct: reports CPU usage statistics for the processes in a cgroup.
- cpuset: assigns dedicated CPU nodes or memory nodes to the processes in a cgroup.
- memory: limits the memory usage of the processes.
- blkio: limits the block device I/O of the processes.
- devices: controls which devices the processes can access.
- net_cls: tags the network packets of the processes in a cgroup so that the tc module can then control those packets.
- net_prio: sets the priority of network traffic.
- freezer: suspends or resumes the processes in a cgroup.
- ns: lets processes in different cgroups use different namespaces.
- hugetlb: limits the use of HugeTLB, the huge page file system.
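The kernel also exposes the subsystem list in /proc/cgroups, so it can be read without libcgroup-tools. The following is a minimal sketch; the function name `list_subsystems` is made up for illustration, and it assumes the usual four-column layout of /proc/cgroups (name, hierarchy ID, number of cgroups, enabled flag).

```shell
#!/bin/sh
# Print the names of the subsystems that are enabled in the running kernel.
# /proc/cgroups starts with a header line beginning with '#', followed by
# one row per subsystem: name, hierarchy ID, number of cgroups, enabled.
list_subsystems() {
    # $1: file to read (hypothetical parameter, defaults to /proc/cgroups)
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}
```

Calling `list_subsystems` with no argument reads the live /proc/cgroups; passing a file name makes it easy to try the function on a saved copy.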
hierarchy
A hierarchy is a tree made up of a collection of cgroups.
A hierarchy can be understood as a cgroup tree. Each node of the tree is a process group, and each tree is associated with zero or more subsystems.
The kernel uses the cgroup structure to represent the resource limits that a control group applies for one or several cgroups subsystems. cgroup structures can be organized into a tree, and each such tree is called a cgroups hierarchy.
A cgroups hierarchy can have one or several cgroups subsystems attached, and the hierarchy then restricts the resources managed by its attached subsystems. Each cgroups subsystem can be attached to only one hierarchy.
After a node (cgroup structure) in a cgroups hierarchy is created, processes can be added to the node's task list, and all processes in that list are subject to the resource limits of the node. A process can also be added to nodes in different cgroups hierarchies, because different hierarchies can be responsible for different system resources. So processes and cgroup structures have a many-to-many relationship.
In the figure, the P at the bottom represents a process. A pointer in each process descriptor points to an auxiliary data structure, css_set (a set of cgroup subsystem states). A process that points to a css_set is added to that css_set's process list. A process can belong to only one css_set, while a css_set can contain multiple processes. Processes belonging to the same css_set are subject to the resource limits associated with that css_set.
The "M × N Linkage" in the figure above shows that css_sets and cgroups nodes can be associated many-to-many through an auxiliary data structure. However, the cgroups implementation does not allow a css_set to be associated with multiple nodes in the same cgroups hierarchy at the same time, because cgroups does not allow multiple limit configurations for the same resource.
When a css_set is associated with nodes in multiple cgroups hierarchies, several resources are controlled for the processes under that css_set. When a cgroups node is associated with multiple css_sets, the process lists of all those css_sets are subject to the same resource limit.
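The rule that a process sits in at most one node per hierarchy can be observed from user space. Each line of /proc/<pid>/cgroup has the form hierarchy-ID:subsystems:path, so no hierarchy ID should appear twice. The following is a small sketch of that check; the function name `one_cgroup_per_hierarchy` is hypothetical.

```shell
#!/bin/sh
# Succeed (exit status 0) when every hierarchy ID in a /proc/<pid>/cgroup
# style listing appears only once, i.e. the process belongs to exactly one
# cgroup in each hierarchy. Fail (exit status 1) if an ID repeats.
one_cgroup_per_hierarchy() {
    # $1: file to check (hypothetical parameter, defaults to /proc/self/cgroup)
    awk -F: 'seen[$1]++ { dup = 1 } END { exit (dup ? 1 : 0) }' \
        "${1:-/proc/self/cgroup}"
}
```

On a running system, `one_cgroup_per_hierarchy /proc/672/cgroup && echo ok` would be expected to print ok for any process.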
cgroupfs
All cgroup operations go through the cgroup virtual file system in the kernel. Using cgroups is simple: just mount this file system. It is usually mounted under the /sys/fs/cgroup directory, although any other directory works too.
Mount a cgroup tree associated with all subsystems at /sys/fs/cgroup:
mount -t cgroup xxx /sys/fs/cgroup
Mount a cgroup tree associated with the cpuset subsystem at /sys/fs/cgroup/cpuset:
mkdir /sys/fs/cgroup/cpuset
mount -t cgroup -o cpuset xxx /sys/fs/cgroup/cpuset
Mount a cgroup tree that is not associated with any subsystem. This is how systemd uses it:
mkdir /sys/fs/cgroup/systemd
mount -t cgroup -o none,name=systemd xxx /sys/fs/cgroup/systemd
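The mount variants above differ only in the -o option. As a rough sketch, the helper below prints the command instead of running it, so the variants can be compared without root privileges. The name `cgroup_mount_cmd` and its parameters are made up for illustration, and the device field is written as cgroup where the examples above use the placeholder xxx (the kernel ignores that field).

```shell
#!/bin/sh
# Print (do not run) the mount command for a cgroup v1 tree.
# $1: comma-separated subsystems, or "" for a tree with no subsystem
# $2: tree name, only used when $1 is empty
# $3: mount point
cgroup_mount_cmd() {
    if [ -n "$1" ]; then
        opts=$1                  # tree bound to the listed subsystems
    else
        opts="none,name=$2"      # named tree with no subsystem attached
    fi
    printf 'mount -t cgroup -o %s cgroup %s\n' "$opts" "$3"
}
```

For example, `cgroup_mount_cmd "" systemd /sys/fs/cgroup/systemd` prints the same command as the systemd example, apart from the device field.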
systemd has already associated the subsystems with cgroup trees and mounted them for us:
$ mount -t cgroup
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
Note:
- The first time you mount a cgroup tree associated with the specified subsystem, a new cgroup tree is created. When you mount it again with the same parameters, the existing cgroup tree is reused, that is, both mount points show the same content.
- When you mount a cgroup tree, you can specify multiple subsystems to associate with it, but a subsystem can only be associated with one cgroup tree. Once you associate them and create a child cgroup on the tree, the subsystems and the cgroup tree become a whole and can't be recombined.
- You can create any number of cgroup trees that are not associated with any subsystem. The name is the only identifier of such a tree. When name specifies a new name, a new cgroup tree is created. However, if a cgroup tree with the same name already exists in the kernel, the existing cgroup tree is mounted.
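Because a subsystem belongs to exactly one tree, a script can look up where that tree is mounted. The following is a minimal sketch that scans /proc/mounts (device, mount point, type, options per line); the function name `cgroup_mountpoint` is hypothetical.

```shell
#!/bin/sh
# Print the mount point of the cgroup v1 tree that carries a subsystem.
# $1: subsystem name, e.g. cpu
# $2: mounts file (hypothetical parameter, defaults to /proc/mounts)
cgroup_mountpoint() {
    awk -v want="$1" '
        $3 == "cgroup" {
            # The fourth field holds the mount options, which include the
            # names of the subsystems bound to this tree.
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++)
                if (opt[i] == want) { print $2; exit }
        }' "${2:-/proc/mounts}"
}
```

With the mounts listed above, `cgroup_mountpoint cpu` would be expected to print /sys/fs/cgroup/cpu,cpuacct. The function prints nothing when the subsystem is not mounted.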
View cgroup of process
$ cat /proc/672/cgroup
11:blkio:/system.slice/crond.service
10:cpuacct,cpu:/system.slice/crond.service
9:cpuset:/
8:freezer:/
7:pids:/system.slice/crond.service
6:devices:/system.slice/crond.service
5:net_prio,net_cls:/
4:memory:/system.slice/crond.service
3:perf_event:/
2:hugetlb:/
1:name=systemd:/system.slice/crond.service
Each line contains three fields separated by colons:
- The ID of the cgroup tree, which matches the ID in the /proc/cgroups file.
- The subsystems bound to the cgroup tree, separated by commas. Here, name=systemd means that no subsystem is bound and the tree is simply named systemd.
- The path of the process in the cgroup tree, that is, the cgroup the process belongs to, relative to the mount point.
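The three fields can be split with the shell's own `read`. The following is a small sketch, assuming the line format described above; the function name `show_cgroups` is made up for illustration.

```shell
#!/bin/sh
# Print "ID<TAB>subsystems<TAB>path" for each line of a
# /proc/<pid>/cgroup file. Only the first two colons act as separators,
# so a path that itself contains ':' stays intact in the last field.
show_cgroups() {
    # $1: file to read (hypothetical parameter, defaults to /proc/self/cgroup)
    while IFS=: read -r id subsys path; do
        # An empty second field means no subsystem is listed for this tree.
        printf '%s\t%s\t%s\n' "$id" "${subsys:-(none)}" "$path"
    done < "${1:-/proc/self/cgroup}"
}
```

For the process shown above, `show_cgroups /proc/672/cgroup` would list one row per cgroup tree.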
practice
View mount point
$ mount -t cgroup
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_prio,net_cls)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
Create isolation group
cd /sys/fs/cgroup/cpu
mkdir cpu_test
After creation, the new directory contains a set of files:
$ ls cpu_test
cgroup.clone_children  cpuacct.stat          cpu.cfs_period_us  cpu.rt_runtime_us  notify_on_release
cgroup.event_control   cpuacct.usage         cpu.cfs_quota_us   cpu.shares         tasks
cgroup.procs           cpuacct.usage_percpu  cpu.rt_period_us   cpu.stat
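cpu.cfs_quota_us defaults to -1, which means no limit. Setting it to 20000 limits the group to 20% of one CPU, because the quota is measured against cpu.cfs_period_us, which defaults to 100000 microseconds.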
echo 20000 > /sys/fs/cgroup/cpu/cpu_test/cpu.cfs_quota_us
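The percentage comes from dividing the quota by the period. The following is a minimal sketch of that arithmetic in both directions; the function names `cfs_percent` and `cfs_quota_for` are hypothetical, and shell integer arithmetic truncates any fractional part.

```shell
#!/bin/sh
# CPU limit as a percentage of one CPU: quota * 100 / period.
# $1: value of cpu.cfs_quota_us   $2: value of cpu.cfs_period_us
cfs_percent() {
    if [ "$1" -lt 0 ]; then
        echo unlimited           # a negative quota (-1) means no limit
    else
        echo $(( $1 * 100 / $2 ))
    fi
}

# Quota in microseconds that yields a given percentage of one CPU.
# $1: percent   $2: value of cpu.cfs_period_us
cfs_quota_for() {
    echo $(( $1 * $2 / 100 ))
}
```

With the values used here, `cfs_percent 20000 100000` prints 20 and `cfs_quota_for 20 100000` prints 20000. A percentage above 100 corresponds to more than one CPU.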
Find the PID of the process and write it to the tasks file of the group:
/sys/fs/cgroup/cpu/cpu_test/tasks
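Adding a process is a plain write of its PID to that file. The following is a small sketch with a basic writability check; the function name `add_task` is made up for illustration, and writing to the real tasks file requires root privileges.

```shell
#!/bin/sh
# Move a process into a cgroup by writing its PID to the tasks file.
# $1: cgroup directory, e.g. /sys/fs/cgroup/cpu/cpu_test
# $2: PID to add
add_task() {
    if [ ! -w "$1/tasks" ]; then
        echo "cannot write $1/tasks" >&2
        return 1
    fi
    echo "$2" > "$1/tasks"
}
```

For example, `add_task /sys/fs/cgroup/cpu/cpu_test 672` would place process 672 under the 20% limit. In cgroup v1 the tasks file takes individual thread IDs, while writing the PID to cgroup.procs moves all threads of the process.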
cgroup plug-in of Netdata
Read configuration
Read the configuration file.
The main setting is the data collection interval.