Linux boot process and troubleshooting

Keywords: Linux

1. Boot process of a Linux system

1.1. Power-on self-test (BIOS)

  • Power-on self-test (BIOS): After the server is powered on, the CPU, memory, graphics card, keyboard and other devices are checked according to the settings in the motherboard BIOS. Once the checks pass, control is handed over according to the configured boot order, in most cases to the local hard disk.
    Summary: Find the first device that can boot the system, such as a hard disk or a CD-ROM drive.
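
Whether a running host was started by legacy BIOS (the path described here) or by UEFI firmware can be checked from the shell. A minimal sketch, assuming a Linux host with sysfs mounted; the function name firmware_mode is made up for illustration:

```shell
# Report which firmware interface booted the running system.
# The kernel creates /sys/firmware/efi only when it was started via UEFI.
firmware_mode() {
    if [ -d /sys/firmware/efi ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

firmware_mode
```

On a UEFI host the MBR-based steps in the following sections do not apply in the same way, so this check is worth running before following them.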

1.2. MBR boot

  • MBR boot: When the system boots from a local hard disk, control first passes to the MBR (master boot record) in the first sector of the disk. Depending on what the MBR contains, control then goes either to the partition that holds the operating system's boot files, or directly to a boot menu (such as GRUB) whose first-stage code is stored in the MBR.
    Summary: Run the GRUB boot loader code stored in the MBR sector.
    Use hexdump -C -n 512 /dev/sda to view the first 512 bytes of the disk:
[root@localhost ~]# hexdump -C -n 512 /dev/sda
00000000  eb 63 90 10 8e d0 bc 00  b0 b8 00 00 8e d8 8e c0  |.c..............|
00000010  fb be 00 7c bf 00 06 b9  00 02 f3 a4 ea 21 06 00  |...|.........!..|
00000020  00 be be 07 38 04 75 0b  83 c6 10 81 fe fe 07 75  |....8.u........u|
00000030  f3 eb 16 b4 02 b0 01 bb  00 7c b2 80 8a 74 01 8b  |.........|...t..|
00000040  4c 02 cd 13 ea 00 7c 00  00 eb fe 00 00 00 00 00  |L.....|.........|
00000050  00 00 00 00 00 00 00 00  00 00 00 80 01 00 00 00  |................|
00000060  00 00 00 00 ff fa 90 90  f6 c2 80 74 05 f6 c2 70  |...........t...p|
00000070  74 02 b2 80 ea 79 7c 00  00 31 c0 8e d8 8e d0 bc  |t....y|..1......|
00000080  00 20 fb a0 64 7c 3c ff  74 02 88 c2 52 be 05 7c  |. ..d|<.t...R..||
00000090  b4 41 bb aa 55 cd 13 5a  52 72 3d 81 fb 55 aa 75  |.A..U..ZRr=..U.u|
000000a0  37 83 e1 01 74 32 31 c0  89 44 04 40 88 44 ff 89  |7...t21..D.@.D..|
000000b0  44 02 c7 04 10 00 66 8b  1e 5c 7c 66 89 5c 08 66  |D.....f..\|f.\.f|
000000c0  8b 1e 60 7c 66 89 5c 0c  c7 44 06 00 70 b4 42 cd  |..`|f.\..D..p.B.|
000000d0  13 72 05 bb 00 70 eb 76  b4 08 cd 13 73 0d 5a 84  |.r...p.v....s.Z.|
000000e0  d2 0f 83 de 00 be 85 7d  e9 82 00 66 0f b6 c6 88  |.......}...f....|
000000f0  64 ff 40 66 89 44 04 0f  b6 d1 c1 e2 02 88 e8 88  |d.@f.D..........|
00000100  f4 40 89 44 08 0f b6 c2  c0 e8 02 66 89 04 66 a1  |.@.D.......f..f.|
00000110  60 7c 66 09 c0 75 4e 66  a1 5c 7c 66 31 d2 66 f7  |`|f..uNf.\|f1.f.|
00000120  34 88 d1 31 d2 66 f7 74  04 3b 44 08 7d 37 fe c1  |4..1.f.t.;D.}7..|
00000130  88 c5 30 c0 c1 e8 02 08  c1 88 d0 5a 88 c6 bb 00  |..0........Z....|
00000140  70 8e c3 31 db b8 01 02  cd 13 72 1e 8c c3 60 1e  |p..1......r...`.|
00000150  b9 00 01 8e db 31 f6 bf  00 80 8e c6 fc f3 a5 1f  |.....1..........|
00000160  61 ff 26 5a 7c be 80 7d  eb 03 be 8f 7d e8 34 00  |a.&Z|..}....}.4.|
00000170  be 94 7d e8 2e 00 cd 18  eb fe 47 52 55 42 20 00  |..}.......GRUB .|
00000180  47 65 6f 6d 00 48 61 72  64 20 44 69 73 6b 00 52  |Geom.Hard Disk.R|
00000190  65 61 64 00 20 45 72 72  6f 72 0d 0a 00 bb 01 00  |ead. Error......|
000001a0  b4 0e cd 10 ac 3c 00 75  f4 c3 00 00 00 00 00 00  |.....<.u........|
000001b0  00 00 00 00 00 00 00 00  bf 32 0a 00 00 00 80 20  |.........2..... |
000001c0  21 00 83 aa 28 82 00 08  00 00 00 00 20 00 00 aa  |!...(....... ...|
000001d0  29 82 8e fe ff ff 00 08  20 00 00 f8 5f 0c 00 00  |)....... ..._...|
000001e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200
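
The dump above follows the classic MBR layout: a 446-byte boot code area (the "GRUB" string is visible at offset 0x17a), a 64-byte partition table starting at offset 0x1be, and the two signature bytes 55 aa at offset 0x1fe. As a small sketch, the signature check can be tried on a scratch file instead of a real disk; the function name has_mbr_signature and the temporary image are illustrative assumptions:

```shell
# Succeeds when the 512-byte sector at the start of $1 ends with 55 aa.
has_mbr_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "55aa" ]
}

# Build a scratch "disk": one zeroed sector with the signature written at offset 510.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null   # octal for 0x55 0xaa

if has_mbr_signature "$img"; then
    echo "boot signature present"
else
    echo "boot signature missing"
fi
```

On a real host the same function could be pointed at /dev/sda as root. A missing signature usually means the firmware will refuse to boot from that disk.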

1.3. GRUB menu

  • GRUB menu: On Linux, GRUB (GRand Unified Bootloader) is the most widely used multi-boot loader. Once control passes to GRUB, it shows the boot menu, loads the Linux kernel file for the selected (or default) entry, and hands control to the kernel.
    CentOS 7 uses GRUB2 as its boot loader.
    Summary: GRUB reads its configuration file /boot/grub2/grub.cfg to obtain the kernel and initramfs image settings and their paths.
[root@localhost ~]# cd /boot/
[root@localhost boot]# ls
config-3.10.0-693.el7.x86_64  grub2                                                    initramfs-3.10.0-693.el7.x86_64kdump.img  System.map-3.10.0-693.el7.x86_64
efi                           initramfs-0-rescue-8690cbde6528435b819458e5644933a5.img  initrd-plymouth.img                       vmlinuz-0-rescue-8690cbde6528435b819458e5644933a5
grub                          initramfs-3.10.0-693.el7.x86_64.img                      symvers-3.10.0-693.el7.x86_64.gz          vmlinuz-3.10.0-693.el7.x86_64
[root@localhost boot]# cd grub2
[root@localhost grub2]# ls
device.map  fonts  grub.cfg  grubenv  i386-pc  locale
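
To see which boot entries GRUB2 will offer, the menuentry lines can be pulled out of grub.cfg. A rough sketch: the function name list_menu_entries is an assumption, and a small sample file stands in for the real /boot/grub2/grub.cfg so the snippet can be run anywhere. It assumes titles are wrapped in single quotes, as grub2-mkconfig writes them:

```shell
# Print the title of every top-level menuentry in a GRUB2 configuration file.
list_menu_entries() {
    awk -F"'" '/^menuentry /{print $2}' "$1"
}

# Sample file imitating the structure of grub.cfg (contents are illustrative).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
menuentry 'CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)' --class centos {
        linux16 /vmlinuz-3.10.0-693.el7.x86_64 root=/dev/mapper/centos-root ro
        initrd16 /initramfs-3.10.0-693.el7.x86_64.img
}
menuentry 'CentOS Linux (0-rescue) 7 (Core)' --class centos {
        linux16 /vmlinuz-0-rescue root=/dev/mapper/centos-root ro
}
EOF

list_menu_entries "$cfg"
```

On a real CentOS 7 host the call would be list_menu_entries /boot/grub2/grub.cfg, run as root.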

1.4. Loading the Linux kernel

  • Load the Linux kernel: The Linux kernel is a precompiled binary that allocates and schedules hardware resources among system programs. Once it takes over, the kernel has full control of the running Linux system.
    On CentOS 7 the kernel file is stored under /boot and named after its version, for example '/boot/vmlinuz-3.10.0-693.el7.x86_64' on the host listed above.
    Summary: Load the kernel and the initramfs image into memory.
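
The running kernel's version, and therefore the name of its image under /boot, can be read back after boot. A minimal sketch; kernel_image_path is a hypothetical helper, and the path it prints follows the CentOS naming convention, which other distributions may not share:

```shell
# Print the path where the running kernel's image is expected under /boot.
kernel_image_path() {
    echo "/boot/vmlinuz-$(uname -r)"
}

echo "running kernel: $(uname -r)"
echo "expected image: $(kernel_image_path)"

# The parameters the boot loader passed to the kernel are exposed in /proc/cmdline.
if [ -r /proc/cmdline ]; then
    cat /proc/cmdline
fi
```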

1.5. init process initialization

  • init process initialization: To continue the boot, the Linux kernel loads the '/sbin/init' program into memory and runs it (a running program is called a process). The init process initializes the rest of the system and finally waits for a user to log in.
    Summary: The kernel loads hardware drivers, then loads the init program into memory and runs it.
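
Which program actually ended up as the first process can be checked on a running system. A small sketch, assuming /proc is mounted; pid1_name is an illustrative helper name:

```shell
# Name of the program running as PID 1 (init on SysV systems, systemd on CentOS 7).
pid1_name() {
    cat /proc/1/comm
}

echo "PID 1 is: $(pid1_name)"

# /sbin/init is often a symbolic link; resolve it if it exists.
if [ -e /sbin/init ]; then
    readlink -f /sbin/init
fi
```

Inside a container PID 1 is usually the container's entry program rather than init or systemd, so the output there will differ.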

2. System Initialization Processes (init and systemd)

2.1. init process

  • The Linux kernel loads and runs the /sbin/init program
  • The init process is the first process in the system and the parent (ancestor) of all other processes
  • The PID (process ID) of the init process is always 1

    init runlevel   systemd target      Explanation
    0               poweroff.target     Shutdown; the host powers off when switched to this level
    1               rescue.target       Single-user mode, traditionally with no password check at login; mostly used for system maintenance
    2               multi-user.target   User-defined/site-specific runlevel; by default equivalent to 3
    3               multi-user.target   Full multi-user mode with a text console; most servers run at this level
    4               multi-user.target   User-defined/site-specific runlevel; by default equivalent to 3
    5               graphical.target    Multi-user mode with a GUI, providing a graphical desktop environment
    6               reboot.target       Reboot; the host restarts when switched to this level
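
The runlevel-to-target mapping can be written down as a small lookup, with runlevel 0 corresponding to poweroff.target. This is only a sketch of the correspondence; runlevel_to_target is a made-up helper, not a systemd command:

```shell
# Map a SysV runlevel number to the systemd target that replaces it.
runlevel_to_target() {
    case "$1" in
        0)     echo "poweroff.target" ;;
        1)     echo "rescue.target" ;;
        2|3|4) echo "multi-user.target" ;;
        5)     echo "graphical.target" ;;
        6)     echo "reboot.target" ;;
        *)     return 1 ;;               # not a valid runlevel
    esac
}

for lvl in 0 1 2 3 4 5 6; do
    echo "$lvl -> $(runlevel_to_target "$lvl")"
done

# On a systemd host the current default and a switch would look like:
#   systemctl get-default
#   systemctl isolate multi-user.target
```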

2.2. systemd process

  1. systemd is an init system for Linux
  2. On CentOS 7 the first process started is /lib/systemd/systemd
  3. On CentOS 7 systemd replaces the traditional SysVinit startup
[root@localhost ~]# pstree
systemd─┬─ModemManager───2*[{ModemManager}]
       ├─NetworkManager───2*[{NetworkManager}]
       ├─VGAuthService
       ├─abrt-dbus───3*[{abrt-dbus}]
       ├─2*[abrt-watch-log]
       ├─abrtd
       ├─accounts-daemon───2*[{accounts-daemon}]
       ├─alsactl
       ├─at-spi-bus-laun─┬─dbus-daemon───{dbus-daemon}
       │                 └─3*[{at-spi-bus-laun}]
       ├─at-spi2-registr───2*[{at-spi2-registr}]
       ├─atd
       ├─auditd─┬─audispd─┬─sedispatch
       │        │         └─{audispd}
       │        └─{auditd}
       ├─avahi-daemon───avahi-daemon
       ├─bluetoothd
       ├─chronyd
       ├─colord───2*[{colord}]
       ├─crond
       ├─cupsd
       ├─2*[dbus-daemon───{dbus-daemon}]
       ├─dbus-launch
       ├─dnsmasq───dnsmasq
       ├─firewalld───{firewalld}
       ├─gdm─┬─X───5*[{X}]
       │     ├─gdm-session-wor─┬─gnome-session-b─┬─gnome-settings-───4*[{gnome-settings-}]
       │     │                 │                 ├─gnome-shell─┬─ibus-daemon─┬─ibus-dconf───3*[{ibus-dconf}]
       │     │                 │                 │             │             ├─ibus-engine-sim───2*[{ibus-engine-sim}]
       │     │                 │                 │             │             └─2*[{ibus-daemon}]
       │     │                 │                 │             └─14*[{gnome-shell}]
       │     │                 │                 └─3*[{gnome-session-b}]
       │     │                 └─2*[{gdm-session-wor}]
       │     └─3*[{gdm}]
       ├─gssproxy───5*[{gssproxy}]
       ├─ibus-x11───2*[{ibus-x11}]
       ├─irqbalance
       ├─ksmtuned───sleep
       ├─libvirtd───15*[{libvirtd}]
       ├─lsmd
       ├─lvmetad
       ├─master─┬─pickup
       │        └─qmgr
       ├─packagekitd───2*[{packagekitd}]
       ├─polkitd───5*[{polkitd}]
       ├─pulseaudio───2*[{pulseaudio}]
       ├─rngd
       ├─rsyslogd───2*[{rsyslogd}]
       ├─rtkit-daemon───2*[{rtkit-daemon}]
       ├─smartd
       ├─sshd───sshd───bash───pstree
       ├─systemd-journal
       ├─systemd-logind
       ├─systemd-udevd
       ├─tuned───4*[{tuned}]
       ├─upowerd───2*[{upowerd}]
       ├─vmtoolsd───{vmtoolsd}
       ├─wpa_supplicant
       └─xdg-permission-───2*[{xdg-permission-}]

systemd unit types:

Unit type   Extension    Description
Service     .service     Describes a system service
Socket      .socket      Describes a socket used for inter-process communication
Device      .device      Describes a device file recognized by the kernel
Mount       .mount       Describes a file system mount point
Automount   .automount   Describes an automatic mount point for a file system
Swap        .swap        Describes a swap device or swap file
Timer       .timer       Describes a timer
Path        .path        Describes a file or directory in the file system
Snapshot    .snapshot    Saves the state of a set of systemd units
Scope       .scope       Describes externally created processes registered through systemd's bus interfaces
Slice       .slice       Describes a hierarchically organized group of units that manage system processes
Target      .target      Describes a group of units
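
Since the unit type is simply the extension of the unit name, it can be read off with plain shell parameter expansion. A tiny sketch; unit_type is an illustrative helper and the unit names are only examples:

```shell
# Derive the unit type from a unit name by taking the text after the last dot.
unit_type() {
    echo "${1##*.}"
}

for unit in sshd.service dbus.socket boot.mount multi-user.target; do
    echo "$unit is a $(unit_type "$unit") unit"
done

# On a systemd host, units of one type can be listed with, for example:
#   systemctl list-units --type=service
```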

3. Troubleshooting boot failures (repairing a damaged MBR sector)

Causes of failure
  • Damage caused by viruses, trojans and the like
  • Incorrect partitioning operations or faulty disk reads and writes

Symptoms
  • The boot loader cannot be found and startup is interrupted
  • The operating system cannot be loaded; the screen stays black after power-on

Approach
  • Make a backup in advance
  • Boot into rescue mode from the installation disc
  • Restore from the backup file

Procedure:
1. Put SELinux into permissive mode so that it does not interfere with the exercise

[root@localhost ~]# setenforce 0

2. Add a second hard disk and back up the MBR sector to it

dd if=/dev/sda of=/data/why.bak bs=512 count=1 copies the MBR of /dev/sda into the file /data/why.bak, which lives on /dev/sdb1 mounted at /data. Here if= is the input to read, of= is the output to write, bs= is the block size in bytes, and count= is the number of blocks to copy.
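
Before the backup can be taken, the newly attached disk has to show up, which the transcript that follows does by rescanning three SCSI hosts one at a time. A loop covers however many hosts exist. This is a sketch only: rescan_scsi_hosts is a hypothetical helper, and the demo runs against a mock directory so nothing is written to the real sysfs:

```shell
# Ask every SCSI host under the given sysfs directory to rescan its bus.
# "- - -" is a wildcard for channel, target and LUN.
rescan_scsi_hosts() {
    base=${1:-/sys/class/scsi_host}
    for scan in "$base"/host*/scan; do
        [ -w "$scan" ] || continue      # skip when not writable or no hosts exist
        echo "- - -" > "$scan"
        echo "rescanned ${scan%/scan}"
    done
}

# Demo against a mock directory tree that imitates /sys/class/scsi_host.
mock=$(mktemp -d)
mkdir -p "$mock/host0" "$mock/host1"
: > "$mock/host0/scan"
: > "$mock/host1/scan"
rescan_scsi_hosts "$mock"

# On a real host, as root:
#   rescan_scsi_hosts
```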

[root@localhost ~]# echo "- - -"> /sys/class/scsi_host/host0/scan 
[root@localhost ~]# echo "- - -"> /sys/class/scsi_host/host1/scan 
[root@localhost ~]# echo "- - -"> /sys/class/scsi_host/host2/scan 
[root@localhost ~]# lsblk 
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0  120G  0 disk 
├─sda1            8:1    0    8G  0 part /boot
└─sda2            8:2    0   84G  0 part 
  ├─centos-root 253:0    0   80G  0 lvm  /
  └─centos-swap 253:1    0    4G  0 lvm  [SWAP]
sdb               8:16   0   20G  0 disk 
sr0              11:0    1  4.2G  0 rom  /run/media/root/CentOS 7 x86_64
[root@localhost ~]# fdisk /dev/sdb
Welcome to fdisk (util-linux 2.23.2).

Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Device does not contain a recognized partition table
Building a new DOS disklabel with disk identifier 0xe57aa7f7.

Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): p
Partition number (1-4, default 1): 1
First sector (2048-41943039, default 2048): 
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-41943039, default 41943039): 
Using default value 41943039
Partition 1 of type Linux and of size 20 GiB is set

Command (m for help): p

Disk /dev/sdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0xe57aa7f7

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1            2048    41943039    20970496   83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
[root@localhost ~]# mkdir /data
[root@localhost ~]# mkfs.xfs /dev/sdb1
meta-data=/dev/sdb1              isize=512    agcount=4, agsize=1310656 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=5242624, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[root@localhost ~]# mount /dev/sdb1 /data/
[root@localhost ~]# df -TH
Filesystem              Type      Size  Used Avail Use% Mounted on
/dev/mapper/centos-root xfs        86G  3.9G   82G    5% /
devtmpfs                devtmpfs  940M     0  940M    0% /dev
tmpfs                   tmpfs     956M     0  956M    0% /dev/shm
tmpfs                   tmpfs     956M  9.5M  947M    1% /run
tmpfs                   tmpfs     956M     0  956M    0% /sys/fs/cgroup
/dev/sda1               xfs       8.6G  187M  8.4G    3% /boot
tmpfs                   tmpfs     192M   21k  192M    1% /run/user/0
/dev/sr0                iso9660   4.6G  4.6G     0  100% /run/media/root/CentOS 7 x86_64
/dev/sdb1               xfs        22G   34M   22G    1% /data
[root@localhost ~]# dd if=/dev/sda of=/data/why.bak bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000130418 s, 3.9 MB/s
[root@localhost ~]# ll
total 8
-rw-------. 1 root root 2066 Sep 10 16:59 anaconda-ks.cfg
-rw-r--r--. 1 root root 2114 Sep 10 17:15 initial-setup-ks.cfg
drwxr-xr-x. 2 root root    6 Sep 10 17:15 Public
drwxr-xr-x. 2 root root    6 Sep 10 17:15 Templates
drwxr-xr-x. 2 root root    6 Sep 10 17:15 Videos
drwxr-xr-x. 2 root root    6 Sep 10 17:15 Pictures
drwxr-xr-x. 2 root root    6 Sep 10 17:15 Documents
drwxr-xr-x. 2 root root    6 Sep 10 17:15 Downloads
drwxr-xr-x. 2 root root    6 Sep 10 17:15 Music
drwxr-xr-x. 2 root root    6 Sep 10 17:15 Desktop
[root@localhost ~]# cd /data/
[root@localhost data]# ll
total 4
-rw-r--r--. 1 root root 512 Sep 11 04:26 why.bak
[root@localhost data]# hexdump -C -n 512 /dev/sda
00000000  eb 63 90 10 8e d0 bc 00  b0 b8 00 00 8e d8 8e c0  |.c..............|
00000010  fb be 00 7c bf 00 06 b9  00 02 f3 a4 ea 21 06 00  |...|.........!..|
00000020  00 be be 07 38 04 75 0b  83 c6 10 81 fe fe 07 75  |....8.u........u|
00000030  f3 eb 16 b4 02 b0 01 bb  00 7c b2 80 8a 74 01 8b  |.........|...t..|
00000040  4c 02 cd 13 ea 00 7c 00  00 eb fe 00 00 00 00 00  |L.....|.........|
00000050  00 00 00 00 00 00 00 00  00 00 00 80 01 00 00 00  |................|
00000060  00 00 00 00 ff fa 90 90  f6 c2 80 74 05 f6 c2 70  |...........t...p|
00000070  74 02 b2 80 ea 79 7c 00  00 31 c0 8e d8 8e d0 bc  |t....y|..1......|
00000080  00 20 fb a0 64 7c 3c ff  74 02 88 c2 52 be 05 7c  |. ..d|<.t...R..||
00000090  b4 41 bb aa 55 cd 13 5a  52 72 3d 81 fb 55 aa 75  |.A..U..ZRr=..U.u|
000000a0  37 83 e1 01 74 32 31 c0  89 44 04 40 88 44 ff 89  |7...t21..D.@.D..|
000000b0  44 02 c7 04 10 00 66 8b  1e 5c 7c 66 89 5c 08 66  |D.....f..\|f.\.f|
000000c0  8b 1e 60 7c 66 89 5c 0c  c7 44 06 00 70 b4 42 cd  |..`|f.\..D..p.B.|
000000d0  13 72 05 bb 00 70 eb 76  b4 08 cd 13 73 0d 5a 84  |.r...p.v....s.Z.|
000000e0  d2 0f 83 de 00 be 85 7d  e9 82 00 66 0f b6 c6 88  |.......}...f....|
000000f0  64 ff 40 66 89 44 04 0f  b6 d1 c1 e2 02 88 e8 88  |d.@f.D..........|
00000100  f4 40 89 44 08 0f b6 c2  c0 e8 02 66 89 04 66 a1  |.@.D.......f..f.|
00000110  60 7c 66 09 c0 75 4e 66  a1 5c 7c 66 31 d2 66 f7  |`|f..uNf.\|f1.f.|
00000120  34 88 d1 31 d2 66 f7 74  04 3b 44 08 7d 37 fe c1  |4..1.f.t.;D.}7..|
00000130  88 c5 30 c0 c1 e8 02 08  c1 88 d0 5a 88 c6 bb 00  |..0........Z....|
00000140  70 8e c3 31 db b8 01 02  cd 13 72 1e 8c c3 60 1e  |p..1......r...`.|
00000150  b9 00 01 8e db 31 f6 bf  00 80 8e c6 fc f3 a5 1f  |.....1..........|
00000160  61 ff 26 5a 7c be 80 7d  eb 03 be 8f 7d e8 34 00  |a.&Z|..}....}.4.|
00000170  be 94 7d e8 2e 00 cd 18  eb fe 47 52 55 42 20 00  |..}.......GRUB .|
00000180  47 65 6f 6d 00 48 61 72  64 20 44 69 73 6b 00 52  |Geom.Hard Disk.R|
00000190  65 61 64 00 20 45 72 72  6f 72 0d 0a 00 bb 01 00  |ead. Error......|
000001a0  b4 0e cd 10 ac 3c 00 75  f4 c3 00 00 00 00 00 00  |.....<.u........|
000001b0  00 00 00 00 00 00 00 00  b6 62 09 00 00 00 80 20  |.........b..... |
000001c0  21 00 83 fe ff ff 00 08  00 00 00 00 00 01 00 fe  |!...............|
000001d0  ff ff 8e fe ff ff 00 08  00 01 00 40 80 0a 00 00  |...........@....|
000001e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200
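
Reading hexdumps by eye works, but a backup can also be checked mechanically against the sector it was taken from. A sketch on a scratch file rather than /dev/sda; the helper names backup_mbr and verify_mbr_backup are assumptions for illustration:

```shell
# Copy the first 512-byte sector of $1 into the file $2.
backup_mbr() {
    dd if="$1" of="$2" bs=512 count=1 2>/dev/null
}

# Succeed when the backup file $2 is identical to the first sector of $1.
verify_mbr_backup() {
    head -c 512 "$1" | cmp -s - "$2"
}

# A scratch "disk" of 4 sectors filled with random data stands in for /dev/sda.
disk=$(mktemp)
bak=$(mktemp)
dd if=/dev/urandom of="$disk" bs=512 count=4 2>/dev/null

backup_mbr "$disk" "$bak"
size=$(wc -c < "$bak" | tr -d ' ')
if verify_mbr_backup "$disk" "$bak"; then
    echo "backup matches the first sector, size $size bytes"
fi
```

On the host from the transcript the equivalent check would be verify_mbr_backup /dev/sda /data/why.bak, run as root.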

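3. Simulate a damaged MBR boot sector, then reboot and repair it
dd if=/dev/zero of=/dev/sda bs=512 count=1 overwrites the MBR with zeros; /dev/zero is a special file that supplies an endless stream of zero bytes.
init 6 (or reboot) restarts the host.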

[root@localhost data]# dd if=/dev/zero of=/dev/sda bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000183102 s, 2.8 MB/s
[root@localhost data]# hexdump -C -n 512 /dev/sda
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000200
[root@localhost data]# init 6

4. Boot from the installation disc into rescue mode and restore the MBR sector from the backup file
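
The screenshots for this step are not reproduced here, so the exact commands used are not shown. In a CentOS 7 rescue shell the restore would typically look like the commented commands at the top of this sketch (the mount point /backupdir is a hypothetical name). The runnable part rehearses the same damage-and-restore cycle on a scratch file; restore_mbr is an illustrative helper, not a standard tool:

```shell
# In the rescue shell (as root), the restore would be along these lines:
#   mkdir /backupdir
#   mount /dev/sdb1 /backupdir
#   dd if=/backupdir/why.bak of=/dev/sda bs=512 count=1
#   exit        # leave the rescue shell; the host then reboots

# Write the 512-byte backup $1 over the first sector of $2.
# conv=notrunc keeps the rest of a regular file intact.
restore_mbr() {
    dd if="$1" of="$2" bs=512 count=1 conv=notrunc 2>/dev/null
}

disk=$(mktemp)
bak=$(mktemp)
dd if=/dev/urandom of="$disk" bs=512 count=4 2>/dev/null            # scratch "disk"
dd if="$disk" of="$bak" bs=512 count=1 2>/dev/null                  # back up sector 0
dd if=/dev/zero of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null  # simulate the damage

restore_mbr "$bak" "$disk"
if head -c 512 "$disk" | cmp -s - "$bak"; then
    echo "first sector restored"
fi
```

Restoring all 512 bytes brings back both the boot code and the partition table, which is what this scenario needs because the whole sector was zeroed.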





Repair complete

4. Troubleshooting boot failures (repairing GRUB boot failures)

Causes of failure
  • The GRUB boot loader in the MBR is damaged
  • The grub.cfg file is missing or the boot configuration is wrong

Symptoms
  • The boot process stalls
  • The "grub>" prompt is displayed

Approach
  • Enter the boot commands manually at the prompt (not recommended)
  • Enter rescue mode and rewrite grub.cfg or restore it from a backup
  • Reinstall the GRUB program into the MBR sector

Here we use the third approach and reinstall GRUB into the MBR sector.
1. Put SELinux into permissive mode so that it does not interfere with the exercise

[root@localhost ~]# setenforce 0

2. Simulate the failure by moving grub.cfg out of /boot/grub2, then reboot

[root@localhost ~]# cd /boot/
[root@localhost boot]# ls
config-3.10.0-693.el7.x86_64  initramfs-0-rescue-ef205c1dc172400d8664e7eb13f72ce4.img  symvers-3.10.0-693.el7.x86_64.gz
efi                           initramfs-3.10.0-693.el7.x86_64.img                      System.map-3.10.0-693.el7.x86_64
grub                          initramfs-3.10.0-693.el7.x86_64kdump.img                 vmlinuz-0-rescue-ef205c1dc172400d8664e7eb13f72ce4
grub2                         initrd-plymouth.img                                      vmlinuz-3.10.0-693.el7.x86_64
[root@localhost boot]# cd grub2
[root@localhost grub2]# ls
device.map  fonts  grub.cfg  grubenv  i386-pc  locale
[root@localhost grub2]# mv grub.cfg /opt/
[root@localhost grub2]# ls
device.map  fonts  grubenv  i386-pc  locale
[root@localhost grub2]# reboot

3. Enter rescue mode from the installation disc image and switch to the installed system's root environment
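
The screenshots for this step are not reproduced here. As a hedged sketch, the commands usually typed in a CentOS 7 rescue shell for this repair are printed, not executed, by the helper below. It assumes the installed system is mounted at /mnt/sysimage, as the CentOS 7 rescue environment normally does, and grub_repair_plan is a made-up name:

```shell
# Print, without executing, the commands typically used in the rescue shell
# to reinstall GRUB2 on disk $1 and regenerate its configuration file.
grub_repair_plan() {
    disk=${1:-/dev/sda}
    cat <<EOF
chroot /mnt/sysimage
grub2-install $disk
grub2-mkconfig -o /boot/grub2/grub.cfg
exit
reboot
EOF
}

grub_repair_plan /dev/sda
```

Because the simulated failure only moved grub.cfg to /opt, moving that file back into /boot/grub2 from inside the chroot would presumably also restore booting; regenerating it with grub2-mkconfig covers the case where no copy survives.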




Repair succeeded

5. Troubleshooting boot failures (forgotten root password)

Causes of failure
  • The root user's password has been forgotten

Symptoms
  • Administrative operations that require root privileges cannot be performed
  • If no other account is available, it is impossible to log in to the operating system at all

Approach
  • Enter rescue mode and reset the password
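
The screenshots for this step are not reproduced here. A cautious sketch of the reset, assuming the rescue environment has mounted the installed system at /mnt/sysimage; reset_root_password is a hypothetical wrapper, and the .autorelabel step is an assumption that only matters when SELinux is enabled on the installed system:

```shell
# Reset root's password on the installed system mounted at $1.
# Refuses to act unless it runs as root and the system root is mounted there.
reset_root_password() {
    sysroot=${1:-/mnt/sysimage}
    if [ "$(id -u)" -ne 0 ] || [ ! -f "$sysroot/etc/shadow" ]; then
        echo "not in a rescue environment ($sysroot not usable); nothing changed" >&2
        return 1
    fi
    chroot "$sysroot" passwd root      # prompts for the new password
    touch "$sysroot/.autorelabel"      # ask SELinux to relabel files on next boot
}

reset_root_password /mnt/sysimage || echo "skipped"
```

Outside a rescue environment the function only prints a message and changes nothing, so it is safe to try on a normal host.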





Posted by rsassine on Mon, 13 Sep 2021 09:24:11 -0700