1. What is a Namespace?
A namespace is a Linux kernel feature that isolates resources such as process IDs, host names, user IDs, mount points, networking, and inter-process communication between groups of processes on the same host. Docker uses namespaces to isolate the resources of each container, so that processes inside a container can only access the resources in their own namespaces.
First, we create a container:
[root@master ~]# docker run -it busybox /bin/sh
/ #
Running the ps command inside the container now reveals something interesting:
/ # ps
PID   USER     TIME  COMMAND
    1 root      0:00 /bin/sh
    9 root      0:00 ps
As you can see, the /bin/sh that we started through Docker is process No. 1 inside the container (PID=1), and only two processes are running in the container. This means that /bin/sh and the ps we just executed have been isolated by Docker in a world separate from the host.
Normally, whenever we run /bin/sh on the host, the operating system assigns it a process number, such as PID=1000. This number uniquely identifies the process, much like an employee's badge number: /bin/sh is roughly the 1000th employee of the company, and the first employee is naturally the boss who oversees everything.
Now we run /bin/sh in a container through Docker. When employee No. 1000 joins, Docker puts a blindfold on him so that he never sees the 999 employees ahead of him, and he mistakenly believes he is employee No. 1 of the company.
This mechanism alters the view of the process space that isolated applications get, so that they see only renumbered process IDs, such as PID=1. In the host's operating system, however, they are still the original process 1000.
This technology is the Namespace mechanism in Linux.
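The two views of one process can be inspected directly. The sketch below assumes a kernel of 4.1 or later, where /proc/<pid>/status carries an NSpid line with one PID per nested PID namespace; the function name show_nspid is illustrative and not part of the original walkthrough.

```shell
#!/bin/sh
# Print the NSpid line of a process. When the process lives in a nested
# PID namespace, the line holds several numbers: the PID as seen from the
# namespace of the reader first, the PID in the innermost namespace last.
# Defaults to the current shell.
show_nspid() {
    pid="${1:-$$}"
    grep '^NSpid:' "/proc/$pid/status"
}

show_nspid
```

Run on the host against the PID of a container's main process, this should list the host PID followed by 1. On older kernels the line is simply absent and nothing is printed.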
Eight types of namespaces are provided in the Linux 5.6 kernel:
Namespace | Isolates | Kernel version |
---|---|---|
Mount (mnt) | Mount points | 2.4.19 |
Process ID (pid) | Process IDs | 2.6.24 |
Network (net) | Network devices, port numbers, etc. | 2.6.24 |
Interprocess Communication (ipc) | System V IPC and POSIX message queues | 2.6.19 |
UTS (uts) | Host names and domain names | 2.6.19 |
User (user) | Users and user groups | 3.8 |
Control group (cgroup) | Cgroup root directory | 4.6 |
Time (time) | Boot-time and monotonic clocks | 5.6 |
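To check which of these namespace types the running kernel actually exposes, one option is to list the links under /proc/self/ns. This is a minimal sketch; the function name list_namespaces is made up for illustration.

```shell
#!/bin/sh
# List the namespace types the kernel exposes for this process,
# together with the namespace identifier each link points to.
list_namespaces() {
    for link in /proc/self/ns/*; do
        printf '%-18s %s\n' "$(basename "$link")" "$(readlink "$link")"
    done
}

list_namespaces
```

Older kernels show fewer entries (for example, no cgroup or time entry), which matches the version column in the table above.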
2. What does each Namespace do?
(1) Mount Namespace
A Mount Namespace gives different processes different views of the mount points. With a Mount Namespace, a container sees only its own mount information, and mount operations inside the container do not affect the host's mounts.
We use the following command to create a bash process and create a new Mount Namespace:
[root@master ~]# unshare --mount --fork /bin/bash
[root@master ~]#
After executing the above command, we have created a new Mount Namespace on the host, and the current command line window is now running inside it. Next, I use an example to verify that mounting a directory in an independent Mount Namespace does not affect the host's mounts.
First, create a directory under /tmp.
[root@master ~]# mkdir /tmp/tmpfs
After creating the directory, mount a tmpfs filesystem on it with the mount command:
[root@master ~]# mount -t tmpfs -o size=20m tmpfs /tmp/tmpfs/
Then use the df command to view the mounted directory information:
[root@master ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root   26G  4.6G   22G  18% /
devtmpfs                 1.9G     0  1.9G   0% /dev
tmpfs                    1.9G     0  1.9G   0% /dev/shm
tmpfs                    1.9G     0  1.9G   0% /sys/fs/cgroup
tmpfs                    1.9G   20M  1.9G   2% /run
tmpfs                    378M     0  378M   0% /run/user/0
/dev/sda1               1014M  183M  832M  18% /boot
tmpfs                     20M     0   20M   0% /tmp/tmpfs
You can see that /tmp/tmpfs has been mounted correctly. To verify that it is not mounted on the host, open a new command line window and run the df command to view the host's mount information:
[root@master ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
devtmpfs                 1.9G     0  1.9G   0% /dev
tmpfs                    1.9G     0  1.9G   0% /dev/shm
tmpfs                    1.9G   20M  1.9G   2% /run
tmpfs                    1.9G     0  1.9G   0% /sys/fs/cgroup
/dev/mapper/centos-root   26G  4.6G   22G  18% /
/dev/sda1               1014M  183M  832M  18% /boot
tmpfs                    378M     0  378M   0% /run/user/0
From the above output, you can see that /tmp/tmpfs is not mounted on the host, so the mount operation in our independent Mount Namespace does not affect the host.
To verify this further, check the Namespace information of the current process in the current command line window:
[root@master ~]# ls -l /proc/self/ns/
total 0
lrwxrwxrwx 1 root root 0 Dec 2 18:39 ipc -> ipc:[4026531839]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 mnt -> mnt:[4026532476]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 net -> net:[4026531956]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 uts -> uts:[4026531838]
Then, in the newly opened command line window, use the same command to view the Namespace information on the host:
[root@master ~]# ls -l /proc/self/ns/
total 0
lrwxrwxrwx 1 root root 0 Dec 2 18:39 ipc -> ipc:[4026531839]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 mnt -> mnt:[4026531840]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 net -> net:[4026531956]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 pid -> pid:[4026531836]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 user -> user:[4026531837]
lrwxrwxrwx 1 root root 0 Dec 2 18:39 uts -> uts:[4026531838]
Comparing the two outputs, we can see that only the Mount Namespace ID differs; the IDs of all other namespaces are the same.
From these results, we can conclude that the unshare command can create a new Mount Namespace, and that mounts made in the new Mount Namespace are isolated from the outside.
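The manual comparison of the two ls listings can also be scripted. The following is a small sketch, not part of the original experiment: the helper names ns_id and same_ns are hypothetical, and reading the namespace links of a process owned by another user generally requires root.

```shell
#!/bin/sh
# Print the namespace identifier (e.g. "mnt:[4026531840]") of a process.
# $1 = namespace type (mnt, pid, uts, ...), $2 = PID
ns_id() {
    readlink "/proc/$2/ns/$1"
}

# Succeed if the two processes share the namespace of the given type.
# $1 = namespace type, $2 and $3 = PIDs
same_ns() {
    [ "$(ns_id "$1" "$2")" = "$(ns_id "$1" "$3")" ]
}

# A background child of this shell inherits all of its namespaces,
# so it serves as a simple second process to compare against.
sleep 5 &
child=$!
if same_ns mnt "$$" "$child"; then
    echo "mnt namespace shared with child $child"
else
    echo "mnt namespace differs from child $child"
fi
kill "$child"
```

To reproduce the comparison from the text, pass the PID of the bash started by unshare and the PID of a shell in the other window.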
(2) PID Namespace
PID Namespace isolates process IDs. Processes in different PID Namespaces can have the same PID. With PID Namespace, the main process of each container can be process 1 inside the container while having a different PID on the host. For example, a process with PID 122 on the host can appear as PID 1 inside the container.
We use the following command to create a bash process and create a new PID Namespace:
[root@master ~]# unshare --pid --fork --mount-proc /bin/bash
[root@master ~]#
After executing the above command, we have created a new PID Namespace on the host, and the current command line window is now running inside it. In this window, use the ps aux command to view the process information:
[root@master ~]# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.0 115684  2144 pts/0    S    18:47   0:00 /bin/bash
root        13  0.0  0.0 155452  1848 pts/0    R+   18:49   0:00 ps aux
The output shows that bash is process 1 in the current Namespace, and that none of the other processes on the host are visible.
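The --mount-proc option matters here because ps reads its data from /proc: without a fresh /proc mounted inside the new PID Namespace, ps would still list the host's processes. A rough way to see how many processes the current /proc exposes is sketched below; the function name visible_pids is illustrative only.

```shell
#!/bin/sh
# Count the processes visible through /proc by counting its
# purely numeric entries (one directory per visible PID).
visible_pids() {
    ls /proc | grep -c '^[0-9][0-9]*$'
}

echo "processes visible in /proc: $(visible_pids)"
```

Inside the bash started with --pid --fork --mount-proc the count should be very small, while on the host it covers every running process.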
(3) UTS Namespace
UTS Namespace is mainly used to isolate host names. It allows each UTS Namespace to have an independent host name.
For example, our host name is master. Using UTS Namespace, the host name in the container can be docker or any other user-defined host name.
Similarly, we verify the function of UTS Namespace through an example. First, we use the unshare command to create a UTS Namespace:
[root@master ~]# hostname    # view the current hostname
master
[root@master ~]# unshare --uts --fork /bin/bash
After the UTS Namespace is created, the current command line window is already in an independent UTS Namespace. Now use the hostname command to set a new host name and then read it back:
[root@master ~]# hostname -b docker
[root@master ~]# hostname
docker
Then we open a new command line window and use the same command to view the host's hostname:
[root@master ~]# hostname
master
You can see that the name of the host is still master and has not been modified. Thus, it can be verified that the UTS Namespace can be used to isolate host names.
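The same check can be run as one script without a second window. This sketch assumes that unprivileged user namespaces are enabled on the system, so that unshare --user -r can grant root inside the new namespaces; the host name ns-demo is an arbitrary placeholder.

```shell
#!/bin/sh
# Remember the host name before touching anything.
before=$(uname -n)

# Create a new user namespace (-r maps the caller to root inside it)
# plus a new UTS namespace, change the host name there and print it.
# If the namespaces cannot be created, report that instead of failing.
unshare --user -r --uts sh -c 'hostname ns-demo && echo "inside:  $(uname -n)"' \
    || echo "could not create the namespaces on this system"

# Back outside, the host name must be what it was before.
after=$(uname -n)
echo "outside: $after"
```

Whether or not the namespaces could be created, the value printed for the outside should equal the original host name.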
(4) IPC Namespace
IPC Namespace is mainly used to isolate inter-process communication. For example, when PID Namespace and IPC Namespace are used together, processes in the same IPC Namespace can communicate with each other through these mechanisms, but processes in different IPC Namespaces cannot.
We use the unshare command to create an IPC Namespace:
[root@master ~]# unshare --ipc --fork /bin/bash
[root@docker ~]#
Next, we need two commands to verify IPC Namespace.
- ipcs -q: lists the System V message queues
- ipcmk -Q: creates a System V message queue
First, use the ipcs -q command to list the message queues in the current IPC Namespace:
[root@docker ~]# ipcs -q
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
The output shows that there are currently no message queues. Next, use the ipcmk -Q command to create one:
[root@docker ~]# ipcmk -Q
Message queue id: 0
Run the ipcs -q command again to list the message queues in the current IPC Namespace:
[root@docker ~]# ipcs -q
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
0x4a19cc47 0          root       644        0            0
You can see that the message queue was created successfully. Then open a new command line window and use the ipcs -q command to view the host's message queues:
[root@master ~]# ipcs -q
------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages
This experiment shows that a message queue created in a separate IPC Namespace cannot be seen on the host. That is, IPC Namespace isolates message queues.
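For scripting the same comparison, the number of queues can be read from /proc instead of parsing ipcs output. This sketch assumes the kernel exposes /proc/sysvipc/msg (System V IPC enabled) and that the file reflects the IPC Namespace of the reading process; the function name queue_count is hypothetical.

```shell
#!/bin/sh
# Count the System V message queues visible to this process.
# The first line of /proc/sysvipc/msg is a header; every further
# line describes one message queue.
queue_count() {
    tail -n +2 /proc/sysvipc/msg | wc -l
}

echo "message queues visible here: $(queue_count)"
```

Run in both windows after ipcmk -Q, the count inside the new IPC Namespace should be one higher than on the host, assuming no other queues are created in between.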
(5) User Namespace
User Namespace is mainly used to isolate users and user groups. A typical use case is mapping a process that runs as a non-root user on the host to the root user in a separate User Namespace. With User Namespace, a process can have root permission inside the container while remaining an ordinary user on the host.
A User Namespace can be created without root permission. Let's create one as an ordinary user:
[root@docker ~]# su - test
Last login: Thu Dec 2 19:11:29 CST 2021 on pts/0
[test@docker ~]$ unshare --user -r /bin/bash
[root@docker ~]#
By default, CentOS 7 sets the maximum number of user namespaces that may be created to 0. If the above command fails (unshare reports unshare: unshare failed: Invalid argument), run the following command as root to raise the limit, and then try to create the User Namespace again:
echo 65535 > /proc/sys/user/max_user_namespaces
Then execute the id command to view the current user information:
[root@docker ~]# id
uid=0(root) gid=0(root) groups=0(root)
[root@docker ~]#
From the above output, we can see that we are already the root user in the new User Namespace. Next, we run the reboot command, which only the host's root user may execute, in the current command line window to check our permissions:
[root@docker ~]# reboot
Failed to open /dev/initctl: Permission denied
Failed to talk to init daemon.
You can see that although we are the root user in the newly created User Namespace, we do not have permission to execute the reboot command. This shows that the root permission of the host cannot be obtained in the isolated User Namespace, that is, the User Namespace realizes the isolation of users and user groups.
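The mapping behind this behaviour can be read from /proc/self/uid_map. The sketch below assumes that unprivileged user namespaces are enabled; it prints the mapping on the host and then inside a new User Namespace created with the same unshare --user -r options used above.

```shell
#!/bin/sh
# Each line of uid_map has three fields:
#   <first ID inside the namespace> <first ID outside> <length of the range>
echo "host view:"
cat /proc/self/uid_map

# Inside a new user namespace created with -r, UID 0 maps to the
# unprivileged UID of the caller, for a range of exactly one ID.
echo "inside a new user namespace:"
unshare --user -r cat /proc/self/uid_map \
    || echo "could not create a user namespace on this system"
```

Root inside the namespace is therefore still the ordinary user as far as the host is concerned, which is consistent with reboot being refused.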
(6) Net Namespace
Net Namespace is used to isolate network devices, IP addresses, ports, and other network information. It allows each process to have its own independent IP addresses, ports, and network interfaces. For example, if the host IP address is 192.168.209.1, an independent IP address such as 172.16.4.1 can be set in the container.
We use the ip a command to view the network information on the host:
[root@docker ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:30:6d:8c brd ff:ff:ff:ff:ff:ff
    inet 192.168.209.148/24 brd 192.168.209.255 scope global noprefixroute dynamic ens32
       valid_lft 1760sec preferred_lft 1760sec
    inet6 fe80::8081:c385:2b72:fe59/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:d7:5e:07:6e brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:d7ff:fe5e:76e/64 scope link
       valid_lft forever preferred_lft forever
We create a Net Namespace using the following command:
[root@docker ~]# unshare --net --fork /bin/bash
[root@docker ~]# ip a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
[root@docker ~]#
It can be seen that the host has network devices such as lo, ens32, and docker0, while the new Net Namespace contains only a lo device that is down and shares none of the host's network devices.
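When ip is not at hand, the interfaces of the current Net Namespace can also be read from /proc/net/dev, which reflects the network namespace of the reading process. A minimal sketch follows; the function name list_ifaces is made up for illustration.

```shell
#!/bin/sh
# List the interface names of the current network namespace.
# The first two lines of /proc/net/dev are headers; every further
# line starts with "<name>:" followed by traffic counters.
list_ifaces() {
    awk -F: 'NR > 2 { gsub(/ /, "", $1); print $1 }' /proc/net/dev
}

list_ifaces
```

Inside the bash started with unshare --net this should print only lo. The loopback device there starts out down and can be enabled with ip link set lo up.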
3. Why does Docker need a Namespace?
When Docker creates a new container, it creates new namespaces of the types described above (a separate User Namespace only when user namespace remapping is enabled on the Docker daemon) and then adds the container's processes to them, so that the processes in the Docker container can only see the system resources in their own namespaces.
It is precisely because Docker uses these Linux Namespace technologies that Docker container isolation is possible. It can be said that without Namespace, there would be no Docker containers.
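To observe this on a running container, one approach is to look up the host PID of the container's main process with docker inspect and read its namespace links. This is a sketch under assumptions: mycontainer is a placeholder container name, ns_of_pid is a hypothetical helper, and reading another process's /proc/<pid>/ns usually requires root.

```shell
#!/bin/sh
# Print every namespace link of a process, one identifier per line.
ns_of_pid() {
    for link in "/proc/$1/ns/"*; do
        printf '%s\n' "$(readlink "$link")"
    done
}

# "mycontainer" is a placeholder; pass a real container name as $1.
container="${1:-mycontainer}"

# .State.Pid is the PID of the container's main process on the host.
if command -v docker >/dev/null 2>&1 &&
   pid=$(docker inspect --format '{{.State.Pid}}' "$container" 2>/dev/null); then
    echo "namespaces of container process $pid:"
    ns_of_pid "$pid"
fi

echo "namespaces of this shell:"
ns_of_pid "$$"
```

Identifiers that differ between the two listings mark the namespaces Docker created for the container; identical ones are shared with the host.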