Delve into the principles of computer composition. ELF and static linking: why can't the same program run under both Linux and Windows?

Keywords: Linux

In the past three sections, we have used some simple code to see how our programs become machine instructions, how conditional jumps like if...else are executed, how loops like for/while are executed, and how calls between functions work.

Since our programs are eventually turned into machine code and executed, why can the same program, on the same computer, run under Linux but not under Windows? Conversely, programs built for Windows cannot be executed on Linux. The CPU has not been replaced, so shouldn't it recognize the same instructions?

If you have the same question, let's answer it together in this section.

Compiling, linking, and loading: disassembling program execution

In Section 5, we said that C code is compiled into assembly code by the compiler, and that the assembler then translates the assembly code into machine code the CPU understands, so the CPU can execute it. You should be familiar with this process by now, but that description simplifies it a great deal. Next, let's look at how a C program really becomes an executable program.

You may have noticed that in the past few sections, the files generated by gcc and the assembly listings produced by objdump had a few small problems. To see them, let's split the earlier add function example into two files, add_lib.c and link_example.c.

// add_lib.c
int add(int a, int b)
{
    return a+b;
}

// link_example.c
#include <stdio.h>
int main()
{
    int a = 10;
    int b = 5;
    int c = add(a, b);
    printf("c = %d\n", c);
}

We compile these two files through gcc, and then look at their assembly code through the objdump command.

$ gcc -g -c add_lib.c link_example.c
$ objdump -d -M intel -S add_lib.o
$ objdump -d -M intel -S link_example.o
add_lib.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <add>:
   0:   55                      push   rbp
   1:   48 89 e5                mov    rbp,rsp
   4:   89 7d fc                mov    DWORD PTR [rbp-0x4],edi
   7:   89 75 f8                mov    DWORD PTR [rbp-0x8],esi
   a:   8b 55 fc                mov    edx,DWORD PTR [rbp-0x4]
   d:   8b 45 f8                mov    eax,DWORD PTR [rbp-0x8]
  10:   01 d0                   add    eax,edx
  12:   5d                      pop    rbp
  13:   c3                      ret    
link_example.o:     file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
   0:   55                      push   rbp
   1:   48 89 e5                mov    rbp,rsp
   4:   48 83 ec 10             sub    rsp,0x10
   8:   c7 45 fc 0a 00 00 00    mov    DWORD PTR [rbp-0x4],0xa
   f:   c7 45 f8 05 00 00 00    mov    DWORD PTR [rbp-0x8],0x5
  16:   8b 55 f8                mov    edx,DWORD PTR [rbp-0x8]
  19:   8b 45 fc                mov    eax,DWORD PTR [rbp-0x4]
  1c:   89 d6                   mov    esi,edx
  1e:   89 c7                   mov    edi,eax
  20:   b8 00 00 00 00          mov    eax,0x0
  25:   e8 00 00 00 00          call   2a <main+0x2a>
  2a:   89 45 f4                mov    DWORD PTR [rbp-0xc],eax
  2d:   8b 45 f4                mov    eax,DWORD PTR [rbp-0xc]
  30:   89 c6                   mov    esi,eax
  32:   48 8d 3d 00 00 00 00    lea    rdi,[rip+0x0]        # 39 <main+0x39>
  39:   b8 00 00 00 00          mov    eax,0x0
  3e:   e8 00 00 00 00          call   43 <main+0x43>
  43:   b8 00 00 00 00          mov    eax,0x0
  48:   c9                      leave  
  49:   c3                      ret    

Now that the code has been "compiled" into instructions, we might as well try running ./link_example.o.

Unfortunately, the file does not have execute permission, so we get a Permission denied error. Even if we give link_example.o execute permission with the chmod command, running ./link_example.o still only produces the error cannot execute binary file: Exec format error.

If we take a closer look at the objdump output for the two files, we find that the addresses in both start from 0. If the addresses are the same, how does the program know which file to jump into when it calls a function through the call instruction?

Both the run error and the duplicate addresses in the objdump output have the same cause: add_lib.o and link_example.o are not executable programs but object files. Only when the linker links the object files together with the libraries they call do we get an executable file.

With gcc's -o option, we can generate the corresponding executable file. Running it gives us the result of this simple addition function call.

$ gcc -o link_example add_lib.o link_example.o
$ ./link_example
c = 15

In fact, the process "C code → assembly code → machine code" consists of two parts when it is carried out on our computer.

The first part consists of three stages: compile, assemble and link. After these three stages are completed, we have an executable file.

In the second part, the loader loads the executable file into memory. The CPU then reads instructions and data from memory, and the program really starts to execute.

ELF format and linking: understanding the linking process

The loader finally turns the program into instructions and data in memory, so the executable file we generate contains more than just instructions. Let's dump the contents of the executable file with the objdump command.

link_example:     file format elf64-x86-64
Disassembly of section .init:
Disassembly of section .plt:
Disassembly of section
Disassembly of section .text:
00000000000006b0 <add>:
 6b0:   55                      push   rbp
 6b1:   48 89 e5                mov    rbp,rsp
 6b4:   89 7d fc                mov    DWORD PTR [rbp-0x4],edi
 6b7:   89 75 f8                mov    DWORD PTR [rbp-0x8],esi
 6ba:   8b 55 fc                mov    edx,DWORD PTR [rbp-0x4]
 6bd:   8b 45 f8                mov    eax,DWORD PTR [rbp-0x8]
 6c0:   01 d0                   add    eax,edx
 6c2:   5d                      pop    rbp
 6c3:   c3                      ret    
00000000000006c4 <main>:
 6c4:   55                      push   rbp
 6c5:   48 89 e5                mov    rbp,rsp
 6c8:   48 83 ec 10             sub    rsp,0x10
 6cc:   c7 45 fc 0a 00 00 00    mov    DWORD PTR [rbp-0x4],0xa
 6d3:   c7 45 f8 05 00 00 00    mov    DWORD PTR [rbp-0x8],0x5
 6da:   8b 55 f8                mov    edx,DWORD PTR [rbp-0x8]
 6dd:   8b 45 fc                mov    eax,DWORD PTR [rbp-0x4]
 6e0:   89 d6                   mov    esi,edx
 6e2:   89 c7                   mov    edi,eax
 6e4:   b8 00 00 00 00          mov    eax,0x0
 6e9:   e8 c2 ff ff ff          call   6b0 <add>
 6ee:   89 45 f4                mov    DWORD PTR [rbp-0xc],eax
 6f1:   8b 45 f4                mov    eax,DWORD PTR [rbp-0xc]
 6f4:   89 c6                   mov    esi,eax
 6f6:   48 8d 3d 97 00 00 00    lea    rdi,[rip+0x97]        # 794 <_IO_stdin_used+0x4>
 6fd:   b8 00 00 00 00          mov    eax,0x0
 702:   e8 59 fe ff ff          call   560 <printf@plt>
 707:   b8 00 00 00 00          mov    eax,0x0
 70c:   c9                      leave  
 70d:   c3                      ret    
 70e:   66 90                   xchg   ax,ax
Disassembly of section .fini:

You will find that the dump of the executable looks similar to the earlier object code, but is much longer. That is because under Linux, executable files and object files both use a file format called ELF (Executable and Linkable Format). It stores not only the compiled machine instructions but also a lot of other data.

For example, in all the objdump output we have seen so far, you can see the corresponding function names, such as add and main. Even the names of the globally accessible variables you define are stored in this ELF file. These names and their corresponding addresses are kept in a part of the ELF file called the symbol table. The symbol table works like an address book, associating names with addresses.

Let's focus first on the parts related to our add and main functions. You will find that the call to add in main no longer jumps to the address of the next instruction, but to the entry address of the add function. The credit for this goes to the ELF format and the linker.

The ELF file format stores its information in a series of sections. ELF has a basic file header that describes the basic attributes of the file, such as whether it is an executable file, the target CPU, the operating system, and so on. In addition to these basic attributes, most programs also have the following sections:

1. The first is the .text section, also known as the code section, which holds the program's code and instructions;

2. Next is the .data section, also known as the data section, which holds the initialized data defined in the program;

3. Then there is the .rel.text section (named .rela.text on x86-64), also known as the relocation table. The relocation table records the jump addresses in the current file that are not yet known. In link_example.o above, for instance, main calls add and printf, but before linking we do not know where to jump to, so this information is stored in the relocation table;

4. Finally, the .symtab section is the symbol table. The symbol table is the address book of the function names defined in the current file and their corresponding addresses.

The linker scans all the input object files and collects the information in their symbol tables into one global symbol table. Then, following the relocation tables, it patches every instruction whose jump target was unknown with the address recorded in the symbol table. Finally, it merges the corresponding sections of all the object files into the final executable code. This is why the function call addresses in the executable file are correct.

After the linker has turned the program into an executable file, the loader's job becomes much easier. The loader no longer has to worry about fixing up jump addresses. It only needs to parse the ELF file and load the corresponding instructions and data into memory for the CPU to execute.

Summary and extension

At this point, I believe you have guessed why the same program can be executed under Linux but not under Windows. One very important reason is that the two operating systems use different executable file formats.

Today we focused on the ELF file format used by Linux, while Windows executables use a file format called PE (Portable Executable). The loader under Linux can only parse the ELF format, not the PE format.

If we had a loader that could parse the PE format, we might be able to run Windows programs under Linux. Does such a program really exist? Yes. The well-known open source project Wine lets us run Windows programs directly under Linux through a loader compatible with the PE format. Microsoft Windows now also provides WSL, the Windows Subsystem for Linux, which can parse and load ELF files.

When we write real programs, we do not put all the code into one file to compile and execute; we can split it into different libraries. Through static linking, different files can divide the work among themselves and then be combined into a single executable program.

To support static linking, an ELF file contains not only the instructions the program needs to execute, but also the relocation table and symbol table required for linking.


You can use readelf to read the symbol table of today's demo program and see what information it contains, and then use objdump to read the program's relocation table and see what is in it.

Posted by Thuy on Tue, 23 Nov 2021 02:35:13 -0800