Learning C again (V. Static linking and link control)

I originally meant to cover static linking in the previous part, but that article grew too long, so it was postponed to this one.

5.1 static library

5.1.1 what is a static library

After the previous parts, you should already have a rough idea of what a static library is.

A static library is an ar archive whose members are object files, and each member is an ELF relocatable file. Running readelf on the archive prints the ELF header of every member:

root@ubuntu:~/c_test/05# readelf -h libc.a 

File: libc.a(init-first.o)
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          1568 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         12
  Section header string table index: 9

File: libc.a(libc-start.o)
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          6544 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         15
  Section header string table index: 12
  
  ...

The rest of the output is omitted. A static library is simply many .o files packed together. The .o files could be shipped separately, but that is inconvenient to manage: if my program needs one function, I would have to find the .o file that defines it and link it by hand.

That is tedious, so the .o files are usually packaged into a single archive. When we call a function from the static library, the linker finds the member that implements it by itself, as described later.

5.1.2 ar command

Since we are talking about static libraries, let's introduce a new command: ar.

A general introduction to the Linux ar command is easy to find in any tutorial. Here I only emphasize three options that we use often:

-r  Insert files into the archive, replacing members with the same name
-t  List the members of the archive
-x  Extract members from the archive

Now let's use -t to see which files are in libc.a:

root@ubuntu:~/c_test/05# ar -t libc.a | wc -l
1579

There are too many to list (otherwise I would be accused of padding the word count), so I just counted them: 1579 .o files in total.

Why are there so many .o files? Because the library is organized essentially as one function per .o file.

Why?

Because the linker works at the granularity of whole .o files. If all functions lived in a single .o file, every section of that file would be merged into the output even if only one function were used, and the linked program would be larger than necessary. Splitting the library so that each function has its own .o file means unused functions are not linked in.

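Extract the members of the static library:

ar -x libc.a

This extracts every member into the current directory. Individual members can be named after the archive (for example, ar -x libc.a init-first.o), but the ar used here does not seem to offer a way to choose the output directory, which is a bit of a flaw.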

5.1.3 making static library

Making a static library is also fairly simple. I used the .o file compiled in the previous section:

root@ubuntu:~/c_test/05# ar -r fun2.a fun2.o
ar: creating fun2.a
root@ubuntu:~/c_test/05#

Simple, right? You can list the members inside:

root@ubuntu:~/c_test/05# ar -t fun2.a
fun2.o
root@ubuntu:~/c_test/05# 

There is only one member, so the full output is shown. Straightforward enough.

5.1.4 the linker uses a static library to resolve references

After all that preamble, we finally reach the focus of this article.

Let's see how to use the static library we made above.

root@ubuntu:~/c_test/05# gcc -static  hello_world.c fun2.a

Alternatively:

root@ubuntu:~/c_test/05# gcc -static  hello_world.c -L. -lfun2

This form asks the linker to search for a library named fun2. For the search to succeed, the archive file must be named with a lib prefix, that is libfun2.a, so the fun2.a created above has to be renamed or rebuilt under that name.

The first form names the archive file directly, so no name matching is needed.

The above uses just one static library, which is the simple case. When there are many static libraries that call each other, things get troublesome. Here is the key point:

How the Linux linker resolves external references: in the symbol resolution phase, the linker scans the relocatable object files and archives in the order they appear on the command line, from left to right.

During the scan, the linker maintains a set E of relocatable object files that will be merged into the executable, a set U of unresolved symbols (referenced but not yet defined), and a set D of symbols defined by the files seen so far.

Initially E, U and D are all empty, and the linker processes the input files one at a time:

  • If the file is an object file, the linker adds it to E and updates U and D with its references and definitions.
  • If the file is an archive, the linker tries to match the symbols in U against the symbols defined by the archive's members. If a member m defines a symbol in U, m is added to E, and U and D are updated. This is repeated over the members until U and D stop changing. Any member that is not in E at that point is discarded.
  • When the linker has finished scanning all inputs, it reports an undefined-symbol error if U is not empty. If U is empty, it merges and relocates the files in E and writes the executable.

Because of this, problems appear when many libraries are involved. Pay attention to the order of libraries on the command line: a file that references a symbol must come before the library that defines it.

For example, foo.c calls functions in libx.a, libx.a calls functions in liby.a, and liby.a in turn calls functions in libx.a (a mutual dependency). What should we do? Repeat libx.a on the command line so that it is scanned again after liby.a:

root@ubuntu:~/c_test/05# gcc foo.c libx.a liby.a libx.a

So we need to pay attention to the order.

5.2 link process control

We have covered the linking process at length, but one part remains: the linking process can be controlled. All the links we did earlier used the default link script. This section takes a closer look.

5.2.1 link control script

The linker provides several ways to control the linking process.

  1. Pass options to the linker on the command line, such as -o and -e.
  2. Store link directives in the object file; compilers often pass instructions to the linker this way.
  3. Use a link control script. (Those who started out in embedded development will have used link scripts.)

When we ran ld earlier, we never specified a link script; a default one was used. We can view it with this command:

root@ubuntu:/usr/lib/ldscripts# ld -verbose

The default ld link scripts are stored in /usr/lib/ldscripts/, but I don't know which one is actually used.

root@ubuntu:/usr/lib/ldscripts# find . | xargs grep "i386:x86-64"
grep: .: Is a directory
./elf_x86_64.xd:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xu:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xsc:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xr:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xc:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xs:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xw:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xbn:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xdc:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.x:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xdw:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xsw:OUTPUT_ARCH(i386:x86-64)
./elf_x86_64.xn:OUTPUT_ARCH(i386:x86-64)

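Many scripts match, so the search alone does not tell us which one is used. The header comment printed by ld -verbose ("Script for -z combreloc" in the output in the next section) describes the variant in effect.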

To specify our own link script, use the -T option:

root@ubuntu:/usr/lib/ldscripts# ld -T link.script

5.2.2 introduction to LD link script syntax

The book "Programmer's Self-Cultivation: Linking, Loading and Libraries" implements a minimal program with a custom link script; take a look if you need it. I won't repeat that here and will only go through the syntax briefly, because ld link script syntax is rarely needed in day-to-day work.

root@ubuntu:/usr/lib/ldscripts# ld -verbose
==================================================
/* Script for -z combreloc: combine and sort reloc sections */
/* Copyright (C) 2014-2015 Free Software Foundation, Inc.
   Copying and distribution of this script, with or without modification,
   are permitted in any medium without royalty provided the copyright
   notice and this notice are preserved.  */
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64",
	      "elf64-x86-64")		
OUTPUT_ARCH(i386:x86-64)    /* Output architecture */
ENTRY(_start)				/* Important: specifies the entry point of the program */
SEARCH_DIR("=/usr/local/lib/x86_64-linux-gnu"); SEARCH_DIR("=/lib/x86_64-linux-gnu"); SEARCH_DIR("=/usr/lib/x86_64-linux-gnu"); SEARCH_DIR("=/usr/local/lib64"); SEARCH_DIR("=/lib64"); SEARCH_DIR("=/usr/lib64"); SEARCH_DIR("=/usr/local/lib"); SEARCH_DIR("=/lib"); SEARCH_DIR("=/usr/lib"); SEARCH_DIR("=/usr/x86_64-linux-gnu/lib64"); SEARCH_DIR("=/usr/x86_64-linux-gnu/lib");
/* SEARCH_DIR adds a directory to the paths where ld looks for libraries; it is equivalent to -Lpath */
SECTIONS	/* Definitions of the output sections; the section names below should look familiar */
{
  /* Read-only sections, merged into text segment: */
  /* PROVIDE defines a symbol in the link script that program code can reference */
  PROVIDE (__executable_start = SEGMENT_START("text-segment", 0x400000)); . = SEGMENT_START("text-segment", 0x400000) + SIZEOF_HEADERS;  /* Sets the start address of the program; SIZEOF_HEADERS leaves room for the ELF and program headers */
  .interp         : { *(.interp) }		/* The * wildcard matches every input file, so the .interp sections of all files are collected here */
  .note.gnu.build-id : { *(.note.gnu.build-id) }
  .hash           : { *(.hash) }
  .gnu.hash       : { *(.gnu.hash) }
  .dynsym         : { *(.dynsym) }
  .dynstr         : { *(.dynstr) }
  .gnu.version    : { *(.gnu.version) }
  .gnu.version_d  : { *(.gnu.version_d) }
  .gnu.version_r  : { *(.gnu.version_r) }
  .rela.dyn       :
    {
      *(.rela.init)
      *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*)
      *(.rela.fini)
      *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*)
      *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*)
      *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*)
      *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*)
      *(.rela.ctors)
      *(.rela.dtors)
      *(.rela.got)
      *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*)
      *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*)
      *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*)
      *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*)
      *(.rela.ifunc)
    }
  .rela.plt       :
    {
      *(.rela.plt)
      PROVIDE_HIDDEN (__rela_iplt_start = .);  /* Like PROVIDE, but the symbol is hidden: usable while linking, not exported from the output */
      *(.rela.iplt)
      PROVIDE_HIDDEN (__rela_iplt_end = .);
    }
  .init           :
  {
    KEEP (*(SORT_NONE(.init)))
    /* When --gc-sections is given on the link command line, the linker may drop sections it considers unused. KEEP() forces it to retain the listed sections. */
  }
  .plt            : { *(.plt) *(.iplt) }
  .plt.got        : { *(.plt.got) }
  .plt.bnd        : { *(.plt.bnd) }
  .text           :
  {
    *(.text.unlikely .text.*_unlikely .text.unlikely.*)
    *(.text.exit .text.exit.*)
    *(.text.startup .text.startup.*)
    *(.text.hot .text.hot.*)
    *(.text .stub .text.* .gnu.linkonce.t.*)
    /* .gnu.warning sections are handled specially by elf32.em.  */
    *(.gnu.warning)
  }
  .fini           :
  {
    KEEP (*(SORT_NONE(.fini)))
  }
  PROVIDE (__etext = .);
  PROVIDE (_etext = .);
  PROVIDE (etext = .);
  .rodata         : { *(.rodata .rodata.* .gnu.linkonce.r.*) }
  .rodata1        : { *(.rodata1) }
  .eh_frame_hdr : { *(.eh_frame_hdr) *(.eh_frame_entry .eh_frame_entry.*) }
  .eh_frame       : ONLY_IF_RO { KEEP (*(.eh_frame)) *(.eh_frame.*) }
  .gcc_except_table   : ONLY_IF_RO { *(.gcc_except_table
  .gcc_except_table.*) }
  .gnu_extab   : ONLY_IF_RO { *(.gnu_extab*) }
  /* These sections are generated by the Sun/Oracle C++ compiler.  */
  .exception_ranges   : ONLY_IF_RO { *(.exception_ranges
  .exception_ranges*) }
  /* Adjust the address for the data segment.  We want to adjust up to
     the same address within the page on the next page up.  */
  . = DATA_SEGMENT_ALIGN (CONSTANT (MAXPAGESIZE), CONSTANT (COMMONPAGESIZE));
  /* Exception handling  */
  .eh_frame       : ONLY_IF_RW { KEEP (*(.eh_frame)) *(.eh_frame.*) }
  .gnu_extab      : ONLY_IF_RW { *(.gnu_extab) }
  .gcc_except_table   : ONLY_IF_RW { *(.gcc_except_table .gcc_except_table.*) }
  .exception_ranges   : ONLY_IF_RW { *(.exception_ranges .exception_ranges*) }
  /* Thread Local Storage sections  */
  .tdata	  : { *(.tdata .tdata.* .gnu.linkonce.td.*) }
  .tbss		  : { *(.tbss .tbss.* .gnu.linkonce.tb.*) *(.tcommon) }
  .preinit_array     :
  {
    PROVIDE_HIDDEN (__preinit_array_start = .);
    KEEP (*(.preinit_array))
    PROVIDE_HIDDEN (__preinit_array_end = .);
  }
  .init_array     :
  {
    PROVIDE_HIDDEN (__init_array_start = .);
    KEEP (*(SORT_BY_INIT_PRIORITY(.init_array.*) SORT_BY_INIT_PRIORITY(.ctors.*)))
    KEEP (*(.init_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .ctors))
    PROVIDE_HIDDEN (__init_array_end = .);
  }
  .fini_array     :
  {
    PROVIDE_HIDDEN (__fini_array_start = .);
    KEEP (*(SORT_BY_INIT_PRIORITY(.fini_array.*) SORT_BY_INIT_PRIORITY(.dtors.*)))
    KEEP (*(.fini_array EXCLUDE_FILE (*crtbegin.o *crtbegin?.o *crtend.o *crtend?.o ) .dtors))
    PROVIDE_HIDDEN (__fini_array_end = .);
  }
  .ctors          :
  {
    /* gcc uses crtbegin.o to find the start of
       the constructors, so we make sure it is
       first.  Because this is a wildcard, it
       doesn't matter if the user does not
       actually link against crtbegin.o; the
       linker won't look for a file to match a
       wildcard.  The wildcard also means that it
       doesn't matter which directory crtbegin.o
       is in.  */
    KEEP (*crtbegin.o(.ctors))
    KEEP (*crtbegin?.o(.ctors))
    /* We don't want to include the .ctor section from
       the crtend.o file until after the sorted ctors.
       The .ctor section from the crtend file contains the
       end of ctors marker and it must be last */
    KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .ctors))
    KEEP (*(SORT(.ctors.*)))
    KEEP (*(.ctors))
  }
  .dtors          :
  {
    KEEP (*crtbegin.o(.dtors))
    KEEP (*crtbegin?.o(.dtors))
    KEEP (*(EXCLUDE_FILE (*crtend.o *crtend?.o ) .dtors))
    KEEP (*(SORT(.dtors.*)))
    KEEP (*(.dtors))
  }
  .jcr            : { KEEP (*(.jcr)) }
  .data.rel.ro : { *(.data.rel.ro.local* .gnu.linkonce.d.rel.ro.local.*) *(.data.rel.ro .data.rel.ro.* .gnu.linkonce.d.rel.ro.*) }
  .dynamic        : { *(.dynamic) }
  .got            : { *(.got) *(.igot) }
  . = DATA_SEGMENT_RELRO_END (SIZEOF (.got.plt) >= 24 ? 24 : 0, .);
  .got.plt        : { *(.got.plt)  *(.igot.plt) }
  .data           :
  {
    *(.data .data.* .gnu.linkonce.d.*)
    SORT(CONSTRUCTORS)
  }
  .data1          : { *(.data1) }
  _edata = .; PROVIDE (edata = .);
  . = .;
  __bss_start = .;
  .bss            :
  {
   *(.dynbss)
   *(.bss .bss.* .gnu.linkonce.b.*)
   *(COMMON)
   /* Align here to ensure that the .bss section occupies space up to
      _end.  Align after .bss to ensure correct alignment even if the
      .bss section disappears because there are no input sections.
      FIXME: Why do we need it? When there is no .bss section, we don't
      pad the .data section.  */
   . = ALIGN(. != 0 ? 64 / 8 : 1);
  }
  .lbss   :
  {
    *(.dynlbss)
    *(.lbss .lbss.* .gnu.linkonce.lb.*)
    *(LARGE_COMMON)
  }
  . = ALIGN(64 / 8);
  . = SEGMENT_START("ldata-segment", .);
  .lrodata   ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) :
  {
    *(.lrodata .lrodata.* .gnu.linkonce.lr.*)
  }
  .ldata   ALIGN(CONSTANT (MAXPAGESIZE)) + (. & (CONSTANT (MAXPAGESIZE) - 1)) :
  {
    *(.ldata .ldata.* .gnu.linkonce.l.*)
    . = ALIGN(. != 0 ? 64 / 8 : 1);
  }
  . = ALIGN(64 / 8);
  _end = .; PROVIDE (end = .);
  . = DATA_SEGMENT_END (.);
  /* Stabs debugging sections.  */
  .stab          0 : { *(.stab) }
  .stabstr       0 : { *(.stabstr) }
  .stab.excl     0 : { *(.stab.excl) }
  .stab.exclstr  0 : { *(.stab.exclstr) }
  .stab.index    0 : { *(.stab.index) }
  .stab.indexstr 0 : { *(.stab.indexstr) }
  .comment       0 : { *(.comment) }
  /* DWARF debug sections.
     Symbols in the DWARF debugging sections are relative to the beginning
     of the section so we begin them at 0.  */
  /* DWARF 1 */
  .debug          0 : { *(.debug) }
  .line           0 : { *(.line) }
  /* GNU DWARF 1 extensions */
  .debug_srcinfo  0 : { *(.debug_srcinfo) }
  .debug_sfnames  0 : { *(.debug_sfnames) }
  /* DWARF 1.1 and DWARF 2 */
  .debug_aranges  0 : { *(.debug_aranges) }
  .debug_pubnames 0 : { *(.debug_pubnames) }
  /* DWARF 2 */
  .debug_info     0 : { *(.debug_info .gnu.linkonce.wi.*) }
  .debug_abbrev   0 : { *(.debug_abbrev) }
  .debug_line     0 : { *(.debug_line .debug_line.* .debug_line_end ) }
  .debug_frame    0 : { *(.debug_frame) }
  .debug_str      0 : { *(.debug_str) }
  .debug_loc      0 : { *(.debug_loc) }
  .debug_macinfo  0 : { *(.debug_macinfo) }
  /* SGI/MIPS DWARF 2 extensions */
  .debug_weaknames 0 : { *(.debug_weaknames) }
  .debug_funcnames 0 : { *(.debug_funcnames) }
  .debug_typenames 0 : { *(.debug_typenames) }
  .debug_varnames  0 : { *(.debug_varnames) }
  /* DWARF 3 */
  .debug_pubtypes 0 : { *(.debug_pubtypes) }
  .debug_ranges   0 : { *(.debug_ranges) }
  /* DWARF Extension.  */
  .debug_macro    0 : { *(.debug_macro) }
  .gnu.attributes 0 : { KEEP (*(.gnu.attributes)) }
  /DISCARD/ : { *(.note.GNU-stack) *(.gnu_debuglink) *(.gnu.lto_*) }
  /* Input sections assigned to /DISCARD/ do not appear in the output file, which is what DISCARD means */
}

Here is a link to a very good explanation of ld link scripts.

[Detailed explanation of lds link scripts under Linux (repost)](https://www.cnblogs.com/li-hao/p/4107964.html)

This part was much easier than the previous one, with fewer things to learn, but the next one will be another tough battle. Keep going.

Reference material:

Programmer's Self-Cultivation: Linking, Loading and Libraries

Computer Systems: A Programmer's Perspective (published in Chinese as "Deep Understanding of Computer Systems")

Posted by Emir on Sat, 04 Dec 2021 10:57:35 -0800