This article is based on the Go 1.17 source code. If you spot any problems, please point them out.
How the operating system runs a Go program
1. Compile
Go source code must first be compiled into an executable with go build; on Linux the result is an executable in ELF format. The build goes through three stages to produce the final executable: compiler, assembler, and linker.
- 1. Compiler: the Go compiler translates .go source code into Plan 9 assembly (.s). The compiler's entry point is the main function in compile/internal/gc/main.go;
- 2. Assembler: the Go assembler converts the .s assembly produced by the compiler into machine code and writes it to a .o object file. The src/cmd/internal/obj package implements the Go assembler;
- 3. Linker: the linker combines the .o object files produced by the assembler into the final executable. The src/cmd/link/internal/ld package implements the linker;
2. Run
Once the executable has been produced by the steps above, the operating system loads and runs the binary in the following stages:
- 1. Read the executable program into memory from disk;
- 2. Create process and main thread;
- 3. Allocate stack space for the main thread;
- 4. Copy the parameters entered by the user on the command line to the stack of the main thread;
- 5. Put the main thread on the operating system's run queue, where it waits to be scheduled;
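The arguments copied in step 4 are what Go code later sees as os.Args. A minimal sketch, assuming only the standard library; the helper name argcArgv is made up for illustration and is not part of the runtime:

```go
package main

import (
	"fmt"
	"os"
)

// argcArgv is a hypothetical helper that reports the command-line
// arguments in C terms: a count (argc) and a vector (argv).
func argcArgv() (argc int, argv []string) {
	return len(os.Args), os.Args
}

func main() {
	argc, argv := argcArgv()
	fmt.Println("argc:", argc)
	for i, a := range argv {
		fmt.Printf("argv[%d] = %s\n", i, a)
	}
}
```

Running the binary with a few extra arguments should list the program name first, followed by each argument in order.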
Golang program startup process analysis
1. Tracing the startup process with gdb
Here a simple Go program is single-stepped to trace the startup process:
main.go
package main

import "fmt"

func main() {
	fmt.Println("hello world")
}
Compile the program and debug it with gdb. First set a breakpoint at the program's entry point, then single-step to see which code runs during startup.
$ go build -gcflags "-N -l" -o main main.go
$ gdb ./main
(gdb) info files
Symbols from "/home/gosoon/main".
Local exec file:
	`/home/gosoon/main', file type elf64-x86-64.
	Entry point: 0x465860
	0x0000000000401000 - 0x0000000000497893 is .text
	0x0000000000498000 - 0x00000000004dbb65 is .rodata
	0x00000000004dbd00 - 0x00000000004dc42c is .typelink
	0x00000000004dc440 - 0x00000000004dc490 is .itablink
	0x00000000004dc490 - 0x00000000004dc490 is .gosymtab
	0x00000000004dc4a0 - 0x0000000000534b90 is .gopclntab
	0x0000000000535000 - 0x0000000000535020 is .go.buildinfo
	0x0000000000535020 - 0x00000000005432e4 is .noptrdata
	0x0000000000543300 - 0x000000000054aa70 is .data
	0x000000000054aa80 - 0x00000000005781f0 is .bss
	0x0000000000578200 - 0x000000000057d510 is .noptrbss
	0x0000000000400f9c - 0x0000000000401000 is .note.go.buildid
(gdb) b *0x465860
Breakpoint 1 at 0x465860: file /home/gosoon/golang/go/src/runtime/rt0_linux_amd64.s, line 8.
(gdb) r
Starting program: /home/gaofeilei/./main

Breakpoint 1, _rt0_amd64_linux () at /home/gaofeilei/golang/go/src/runtime/rt0_linux_amd64.s:8
8	JMP	_rt0_amd64(SB)
(gdb) n
_rt0_amd64 () at /home/gaofeilei/golang/go/src/runtime/asm_amd64.s:15
15	MOVQ	0(SP), DI	// argc
(gdb) n
16	LEAQ	8(SP), SI	// argv
(gdb) n
17	JMP	runtime·rt0_go(SB)
(gdb) n
runtime.rt0_go () at /home/gaofeilei/golang/go/src/runtime/asm_amd64.s:91
91	MOVQ	DI, AX		// argc
......
231	CALL	runtime·mstart(SB)
(gdb) n
hello world
[Inferior 1 (process 39563) exited normally]
Single-stepping shows that the program's entry point is line 8 of runtime/rt0_linux_amd64.s. Execution eventually reaches the CALL runtime·mstart(SB) instruction, prints "hello world", and then the program exits.
The functions called during startup are, in order:
rt0_linux_amd64.s --> _rt0_amd64 --> rt0_go --> runtime·settls --> runtime·check --> runtime·args --> runtime·osinit --> runtime·schedinit --> runtime·newproc --> runtime·mstart
2. Go startup process analysis
The gdb session in the previous section showed that a Go program executes a series of assembly instructions during startup. This section walks through what each of those instructions does, which is the basis for understanding what a Go program does when it starts.
src/runtime/rt0_linux_amd64.s
#include "textflag.h"

TEXT _rt0_amd64_linux(SB),NOSPLIT,$-8
	JMP	_rt0_amd64(SB)

TEXT _rt0_amd64_linux_lib(SB),NOSPLIT,$0
	JMP	_rt0_amd64_lib(SB)
The first instruction executed, at line 8, is JMP _rt0_amd64. The _rt0_amd64 function is the common entry point on the amd64 platform and is defined in src/runtime/asm_amd64.s.
TEXT _rt0_amd64(SB),NOSPLIT,$-8
	// Process the argc and argv parameters. argc is the number of command-line
	// arguments; argv stores all command-line arguments
	MOVQ	0(SP), DI	// argc
	// argv is a pointer type
	LEAQ	8(SP), SI	// argv
	JMP	runtime·rt0_go(SB)
The _rt0_amd64 function saves argc and argv in the DI and SI registers and jumps to rt0_go. The main jobs of rt0_go are:
- 1. Copy the argc and argv parameters onto the main thread's stack;
- 2. Initialize the global variable g0, allocate about 64K of stack space for g0 on the main thread's stack, and set the stackguard0, stackguard1 and stack fields of g0;
- 3. Execute the CPUID instruction to detect CPU information;
- 4. Execute the nocpuinfo code block to determine whether cgo needs to be initialized;
- 5. Execute the needtls code block to initialize TLS and m0;
- 6. Execute the ok code block: first bind m0 and g0, then call runtime·args to process the command-line arguments and environment variables, call runtime·osinit to initialize the CPU count, call runtime·schedinit to initialize the scheduler, call runtime·newproc to create the first goroutine (which will run runtime.main), and call runtime·mstart to start the main thread. The main thread then runs that first goroutine, which executes the main function, and mstart does not return until the process exits;
TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0
	// Handle the command-line arguments
	MOVQ	DI, AX		// AX = argc
	MOVQ	SI, BX		// BX = argv
	// Grow the stack by 39 bytes (4*8+7). The runtime source annotates this line
	// "2args 2auto"; the author is not yet clear on why exactly 39 bytes are used
	SUBQ	$(4*8+7), SP
	ANDQ	$~15, SP	// Align SP to 16 bytes
	MOVQ	AX, 16(SP)	// argc is stored at SP+16
	MOVQ	BX, 24(SP)	// argv is stored at SP+24

	// Start initializing g0. runtime·g0 is a global variable defined in src/runtime/proc.go.
	// Global variables live in the data area of the process address space. A later section
	// shows how to inspect code, data and global variables in the ELF binary.
	// g0's stack is carved out of the process stack and occupies about 64K.
	MOVQ	$runtime·g0(SB), DI		// DI = &g0
	LEAQ	(-64*1024+104)(SP), BX		// BX = SP - 64*1024 + 104
	// Initialize the stackguard0, stackguard1 and stack fields of g0
	MOVQ	BX, g_stackguard0(DI)		// g0.stackguard0 = SP - 64*1024 + 104
	MOVQ	BX, g_stackguard1(DI)		// g0.stackguard1 = SP - 64*1024 + 104
	MOVQ	BX, (g_stack+stack_lo)(DI)	// g0.stack.lo = SP - 64*1024 + 104
	MOVQ	SP, (g_stack+stack_hi)(DI)	// g0.stack.hi = SP
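The stack arithmetic in the block above is easy to replay in Go. This is only a sketch of the address calculation, not runtime code: the function g0Bounds and the starting SP value are made up for illustration.

```go
package main

import "fmt"

// g0Bounds replays the arithmetic rt0_go applies to the incoming stack
// pointer and returns the resulting g0.stack.lo and g0.stack.hi.
func g0Bounds(sp uint64) (lo, hi uint64) {
	sp -= 4*8 + 7 // SUBQ $(4*8+7), SP
	sp &^= 15     // ANDQ $~15, SP (round down to a 16-byte boundary)
	hi = sp       // g0.stack.hi = SP
	lo = sp - 64*1024 + 104
	return lo, hi
}

func main() {
	// 0x7ffd00001000 is an arbitrary example value for the initial SP.
	lo, hi := g0Bounds(0x7ffd00001000)
	fmt.Printf("g0.stack.lo = %#x\n", lo)
	fmt.Printf("g0.stack.hi = %#x\n", hi)
	fmt.Printf("size        = %d bytes\n", hi-lo)
}
```

Whatever the starting SP, the distance between stack.hi and stack.lo is 64*1024 - 104 = 65432 bytes, which is the "about 64K" mentioned in the comments.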
After executing the above instructions, the process memory space layout is as follows:
Next come the instructions that obtain CPU information and handle cgo initialization. This code can be skipped on a first reading.
	// Execute the CPUID instruction to obtain CPU information and probe the CPU and its instruction set
	MOVL	$0, AX
	CPUID
	MOVL	AX, SI
	CMPL	AX, $0
	JE	nocpuinfo

	// Figure out how to serialize RDTSC.
	// On Intel processors LFENCE is enough. AMD requires MFENCE.
	// Don't know about the rest, so let's do MFENCE.
	CMPL	BX, $0x756E6547  // "Genu"
	JNE	notintel
	CMPL	DX, $0x49656E69  // "ineI"
	JNE	notintel
	CMPL	CX, $0x6C65746E  // "ntel"
	JNE	notintel
	MOVB	$1, runtime·isIntel(SB)
	MOVB	$1, runtime·lfenceBeforeRdtsc(SB)
notintel:

	// Load EAX=1 cpuid flags
	MOVL	$1, AX
	CPUID
	MOVL	AX, runtime·processorVersionInfo(SB)

nocpuinfo:
	// cgo initialization. _cgo_init is a global variable
	MOVQ	_cgo_init(SB), AX
	// Check whether AX is 0
	TESTQ	AX, AX
	// If it is, jump to needtls
	JZ	needtls
	// arg 1: g0, already in DI
	MOVQ	$setg_gcc<>(SB), SI // arg 2: setg_gcc
	CALL	AX

	// If cgo is enabled, some fields of g0 are updated
	MOVQ	$runtime·g0(SB), CX
	MOVQ	(g_stack+stack_lo)(CX), AX
	ADDQ	$const__StackGuard, AX
	MOVQ	AX, g_stackguard0(CX)
	MOVQ	AX, g_stackguard1(CX)
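The three magic constants compared against BX, DX and CX are simply the ASCII bytes of the CPUID vendor string stored in little-endian order. A small sketch that decodes them; vendorString is a name invented for this example:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// vendorString reassembles the CPUID leaf-0 vendor string from the values
// returned in BX, DX and CX (in that order), each holding four ASCII bytes
// in little-endian order.
func vendorString(bx, dx, cx uint32) string {
	var b [12]byte
	binary.LittleEndian.PutUint32(b[0:4], bx)
	binary.LittleEndian.PutUint32(b[4:8], dx)
	binary.LittleEndian.PutUint32(b[8:12], cx)
	return string(b[:])
}

func main() {
	// The constants from the assembly above.
	fmt.Println(vendorString(0x756E6547, 0x49656E69, 0x6C65746E)) // prints "GenuineIntel"
}
```

This is why the checks appear in the order BX, DX, CX rather than BX, CX, DX: that is the order in which CPUID lays out the twelve characters.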
Next, the needtls code block runs and initializes TLS and m0. TLS stands for thread-local storage. While a Go program runs, every m is associated with a worker thread. How does a worker thread know which m it is associated with? This is where thread-local storage comes in. Thread-local storage gives each thread its own private copy of a global variable, so every worker thread can use the same global variable name and still reach a different m structure. As analyzed later, just before a worker thread is created and enters the scheduling loop, the thread-local storage mechanism is used to give that thread a private global variable pointing to its own m structure.
The code analyzed below frequently calls the getg function. getg reads the currently running g from thread-local storage; at this stage of startup, that is the g0 associated with m0.
The TLS address is written into m0, and m0 is bound to g0, so g0 can be obtained directly from TLS.
	// Next, initialize TLS (thread-local storage), make m0 a thread-private variable, and bind m0 to the main thread
needtls:
	LEAQ	runtime·m0+m_tls(SB), DI	// DI = &m0.tls, the address of m0's tls member
	// Call runtime·settls to set up thread-local storage; its argument is passed in DI.
	// settls makes the address of m0.tls[1] the thread's TLS address.
	// runtime·settls is in runtime/sys_linux_amd64.s#599
	CALL	runtime·settls(SB)

	// Verify that thread-local storage works and that the value really lands in m0.tls.
	// If something is wrong, abort the program.
	// get_tls is a macro defined in runtime/go_tls.h
	get_tls(BX)				// BX = TLS address, i.e. BX = &m0.tls[1]
	MOVQ	$0x123, g(BX)			// Store 0x123 through TLS, i.e. m0.tls[0] = 0x123
	MOVQ	runtime·m0+m_tls(SB), AX	// AX = m0.tls[0]
	CMPQ	AX, $0x123
	JEQ	2(PC)				// If equal, skip the next instruction and continue at the ok code block
	CALL	runtime·abort(SB)		// Abort execution with an INT instruction
Execution continues with the ok code block. Its main logic is:
- Bind m0 and g0 to each other;
- Call runtime·osinit to initialize the number of CPUs. The scheduler needs to know how many CPU cores the system has when it is initialized;
- Call runtime·schedinit to initialize the m0 and p objects and to set the maxmcount member of the global variable sched to 10000, which limits the program to at most 10000 operating system threads;
- Call runtime·newproc to create the goroutine for the main function;
- Call runtime·mstart to start the main thread and execute the main function;
	// First save the address of g0 in TLS, i.e. m0.tls[0] = &g0, then bind m0 and g0,
	// i.e. m0.g0 = g0, g0.m = m0
ok:
	get_tls(BX)			// BX = TLS address
	LEAQ	runtime·g0(SB), CX	// CX = &g0
	MOVQ	CX, g(BX)		// m0.tls[0] = &g0
	LEAQ	runtime·m0(SB), AX	// AX = &m0

	MOVQ	CX, m_g0(AX)		// m0.g0 = g0
	MOVQ	AX, g_m(CX)		// g0.m = m0

	CLD				// convention is D is always left cleared
	// check verifies the sizes of various types and that type conversions behave as expected.
	// It is located in runtime/runtime1.go#137
	CALL	runtime·check(SB)

	// Move argc and argv to SP+0 and SP+8 so that they become
	// the arguments of the runtime·args function
	MOVL	16(SP), AX
	MOVL	AX, 0(SP)
	MOVQ	24(SP), AX
	MOVQ	AX, 8(SP)
	// args reads the command-line arguments and environment variables from the stack.
	// It is located in runtime/runtime1.go#61
	CALL	runtime·args(SB)
	// osinit initializes the number of CPUs. It is located in runtime/os_linux.go#301
	CALL	runtime·osinit(SB)
	// schedinit initializes the scheduler. It is located in runtime/proc.go#654
	CALL	runtime·schedinit(SB)

	// Create the first goroutine, which will execute runtime.main.
	// Take the address of runtime.main and call newproc to create the g
	MOVQ	$runtime·mainPC(SB), AX
	PUSHQ	AX			// runtime.main is pushed as the second parameter of newproc
	PUSHQ	$0			// The first parameter of newproc: the size of the arguments runtime.main needs. runtime.main takes no arguments, so this is 0
	// newproc creates a new goroutine and puts it on the run queue. The goroutine will
	// execute runtime.main. newproc is located in runtime/proc.go#4250
	CALL	runtime·newproc(SB)
	// Pop the two values pushed above
	POPQ	AX
	POPQ	AX

	// mstart makes the main thread enter the scheduling loop, which then runs the goroutine
	// just created. mstart never returns. It is located in runtime/proc.go#1328
	CALL	runtime·mstart(SB)

	CALL	runtime·abort(SB)	// mstart should never return
	RET

	// Prevent dead-code elimination of debugCallV2, which is
	// intended to be called by debuggers.
	MOVQ	$runtime·debugCallV2<ABIInternal>(SB), AX
	RET
At this time, the process memory space layout is as follows:
Viewing the structure of the ELF binary
The readelf command shows the structure of an ELF binary, including the contents of its code and data sections. Global variables are stored in the data section and functions in the code section.
$ readelf -s main | grep runtime.g0
  1765: 000000000054b3a0   376 OBJECT  GLOBAL DEFAULT   11 runtime.g0
// _cgo_init is a global variable
$ readelf -s main | grep -i _cgo_init
  2159: 000000000054aa88     8 OBJECT  GLOBAL DEFAULT   11 _cgo_init
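The same lookup can be sketched in Go with the standard debug/elf package. The function findSymbol is a name invented for this example. Note that binaries built without a symbol table (for example with -ldflags "-s") have nothing to search, in which case Symbols returns an error.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// findSymbol looks up a symbol by name in the symbol table of an ELF file.
func findSymbol(path, name string) (elf.Symbol, error) {
	f, err := elf.Open(path)
	if err != nil {
		return elf.Symbol{}, err
	}
	defer f.Close()

	syms, err := f.Symbols() // returns an error if there is no symbol table
	if err != nil {
		return elf.Symbol{}, err
	}
	for _, s := range syms {
		if s.Name == name {
			return s, nil
		}
	}
	return elf.Symbol{}, fmt.Errorf("%s: symbol %q not found", path, name)
}

func main() {
	// Defaults: inspect this program itself and look for runtime.g0.
	path, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	name := "runtime.g0"
	if len(os.Args) > 2 {
		path, name = os.Args[1], os.Args[2]
	}
	s, err := findSymbol(path, name)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("%016x %6d section=%d %s\n", s.Value, s.Size, s.Section, s.Name)
}
```

Passing the path of the main binary built earlier together with runtime.g0 should report the same address, size and section index that readelf shows.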
Summary
This article walked through the key code in the startup process of a Go program. Most of that code is written in Plan 9 assembly, which can look daunting if you have not worked on low-level code before. The author does not fully understand some of the details, in particular some hard-coded numbers and some operating system and hardware conventions. If you are interested, feel free to get in touch to discuss them. Follow-up articles will analyze several major components of the Go runtime.
References:
https://loulan.me/post/golang-boot/
https://mp.weixin.qq.com/s/W9D4Sl-6jYfcpczzdPfByQ
https://programmerall.com/article/6411655977/
https://ld246.com/article/1547651846124
https://zboya.github.io/post/go_scheduler/#mstartfn