Memory layout of ucore

Keywords: C Linker

Starting from lab2, ucore enables segment- and page-based address translation, which makes its memory layout different from that of lab1. I intend to discuss this topic in two articles, one for lab1 and one for lab2.

Memory layout of ucore during lab1

How the ucore kernel is loaded into memory

After system initialization and hardware detection, the BIOS loads the first sector of the disk, called the boot sector, and executes the code in it. This code is the bootloader: it turns on 32-bit protected mode, loads the kernel image into memory, and then jumps to the first instruction of the kernel.

qemu simulates the hardware environment; it loads and executes the contents of ucore.img, the hard disk image file. This process is similar to a real computer booting an operating system from disk. ucore.img consists of the bootloader and the kernel, which I also describe in another article, as shown in the figure below.

➜ readelf -S bootblock.o
// There are 9 section headers, starting at offset 0x1390:
// Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00007c00 000074 000184 00 WAX  0   0  4
  [ 2] .eh_frame         PROGBITS        00007d84 0001f8 000068 00   A  0   0  4
  [ 3] .stab             PROGBITS        00000000 000260 000798 0c      4   0  4
  ...

As the Addr column shows, the bootloader is loaded at physical address 0x7c00, shown in orange in the first image. After the bootloader turns on protected mode, the address space becomes 32-bit; nothing else changes. The bootloader then starts loading the kernel. Where the kernel is loaded depends on the following code.

// prototype
/* *
 * readseg - read @count bytes at @offset from kernel into virtual address @va,
 * might copy more than asked.
 * */
static void readseg(uintptr_t va, uint32_t count, uint32_t offset);

// Actual call
// kernel is in ELF format, read elf header first
// 0x10000 for ELFHDR and 512 for SECTSIZE
readseg((uintptr_t)ELFHDR, SECTSIZE * 8, 0);
readseg(ph->p_va & 0xFFFFFF, ph->p_memsz, ph->p_offset);
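
The "might copy more than asked" remark in the prototype comment comes from the disk being read in whole sectors. The sketch below only reproduces that arithmetic on the host. The helper plan_readseg and the struct read_plan are hypothetical names, and it assumes the kernel image starts at disk sector 1, right after the boot sector; check bootmain.c for the real implementation.

```c
#include <stdint.h>

#define SECTSIZE 512u

/* Hypothetical summary of one readseg() call: which sectors are read
 * and where the first one lands in memory. */
struct read_plan {
    uint32_t first_va;      /* address where the first whole sector lands */
    uint32_t first_sector;  /* first disk sector to read */
    uint32_t sectors;       /* number of whole sectors transferred */
};

/* Assumption: the kernel starts at disk sector 1 and every read
 * transfers a whole sector. */
static struct read_plan plan_readseg(uint32_t va, uint32_t count, uint32_t offset)
{
    struct read_plan p;
    uint32_t end_va = va + count;

    p.first_va = va - offset % SECTSIZE;      /* round down to a sector boundary */
    p.first_sector = offset / SECTSIZE + 1;   /* byte offset -> sector number */
    p.sectors = (end_va - p.first_va + SECTSIZE - 1) / SECTSIZE;
    return p;
}
```

For the ELF header read above, plan_readseg(0x10000, SECTSIZE * 8, 0) covers exactly 8 sectors starting at sector 1. A request that is not sector-aligned, such as 1000 bytes at offset 700, starts 188 bytes below va and reads 3 whole sectors, which is more than was asked for.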

Note that readseg's va parameter is nominally a virtual address. At this point, however, only segmentation is active and every segment base is 0, so the linear address equals the physical address and va is effectively a physical address. The kernel's ELF header is therefore loaded at 0x10000, and the code segment at physical address 0x100000.
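
To make the "segment base is 0" argument concrete, here is a minimal sketch of segment translation. The function name segment_to_linear is hypothetical, and the sketch assumes paging is off, so the linear address is used directly as the physical address.

```c
#include <stdint.h>

/* Segment translation: linear address = segment base + offset.
 * Assumption: paging is disabled, so linear == physical. */
static uint32_t segment_to_linear(uint32_t base, uint32_t offset)
{
    return base + offset;
}
```

With a base of 0, segment_to_linear(0, 0x10000) is 0x10000 and segment_to_linear(0, 0x100000) is 0x100000, so the address passed to readseg is also the physical address.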

➜ readelf -S kernel
// There are 11 section headers, starting at offset 0x122c0:
// Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00100000 001000 0035d7 00  AX  0   0  1
  [ 2] .rodata           PROGBITS        001035e0 0045e0 00090c 00   A  0   0 32
  [ 3] .stab             PROGBITS        00103eec 004eec 007a95 0c   A  4   0  4
  [ 4] .stabstr          STRTAB          0010b981 00c981 00206a 00   A  0   0  1

Note that the Addr column holds virtual addresses, not physical ones. This is why the call to readseg applies & 0xFFFFFF: the mask adjusts where each kernel segment lands in physical memory. As I said above, at this stage a virtual address is used directly as a physical address. Without the mask, each segment would be loaded at the physical address given by ph->p_va, which becomes a problem when physical memory is limited: a machine with only 2 GB of RAM cannot load a program at the 3 GB mark. Here the Addr of .text is 0x100000 (1 MB), which is small enough for any current machine, so the mask changes nothing. In lab2, however, the Addr of .text in the kernel becomes 0xC0100000 (just above 3 GB), and the problem described above appears.
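
The effect of the mask is easy to check by hand. A small sketch, where load_address is a hypothetical helper that does nothing more than repeat the & 0xFFFFFF from the bootloader call:

```c
#include <stdint.h>

/* Keep only the low 24 bits of a link-time virtual address, as the
 * bootloader's `ph->p_va & 0xFFFFFF` does. */
static uint32_t load_address(uint32_t p_va)
{
    return p_va & 0xFFFFFFu;
}
```

load_address(0x00100000) stays 0x00100000, while load_address(0xC0100000) also yields 0x00100000, so both the lab1 and the lab2 kernel end up at the 1 MB mark in physical memory.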

kernel.ld analysis

That is almost everything I wanted to cover. One more point: why does the .text section of the kernel start at 0x100000? Let's look at the kernel linker script, kernel.ld.

/* Simple linker script for the JOS kernel.
   See the GNU ld 'info' manual ("info ld") to learn the syntax. */

OUTPUT_FORMAT("elf32-i386", "elf32-i386", "elf32-i386")
OUTPUT_ARCH(i386)
ENTRY(kern_init) /*Execution entry*/

SECTIONS {
    /* Load the kernel at the following address; "." is the current address */
    . = 0x100000;
    
    /* Input sections that make up the .text section */
    .text : {
        *(.text .stub .text.* .gnu.linkonce.t.*)
    }

    PROVIDE(etext = .);    /* Define the 'etext' symbol whose value is the address here */
    
    /* Same as .text */
    .rodata : {
        *(.rodata .rodata.* .gnu.linkonce.r.*)
    }

    /* Contains some debug information */
    .stab : {
        PROVIDE(__STAB_BEGIN__ = .);
        *(.stab);
        PROVIDE(__STAB_END__ = .);
        BYTE(0)        /* Force the linker to allocate space
                   for this section */
    }

    .stabstr : {
        PROVIDE(__STABSTR_BEGIN__ = .);
        *(.stabstr);
        PROVIDE(__STABSTR_END__ = .);
        BYTE(0)        /* Force the linker to allocate space
                   for this section */
    }

    /* Adjust the address boundary to 4k (the size of one page) and load the data segment into the next page */
    . = ALIGN(0x1000);

    /* Data segment */
    .data : {
        *(.data)
    }
    
    /* Same as etext */
    PROVIDE(edata = .);
    
    .bss : {
        *(.bss)
    }

    /* Same as etext */
    PROVIDE(end = .);
    
    /* Sections to discard */
    /DISCARD/ : {
        *(.eh_frame .note.GNU-stack)
    }
}
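
The ALIGN(0x1000) step in the script rounds the current address up to the next 4 KB boundary. A minimal sketch of that rounding, using the .stabstr address and size from the readelf output above as input; align_up is a hypothetical helper, not part of ucore:

```c
#include <stdint.h>

/* Round addr up to the next multiple of align.
 * align must be a power of two, as 0x1000 is. */
static uint32_t align_up(uint32_t addr, uint32_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

.stabstr starts at 0x10b981 and is 0x206a bytes long, so it ends at 0x10d9eb; align_up(0x10d9eb, 0x1000) gives 0x10e000. Assuming nothing else is placed in between, that is where the next section would begin.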

Line 10 of kernel.ld shows that the kernel is loaded at memory address 0x100000. The script also contains many PROVIDE() commands, each of which is equivalent to declaring a symbol whose value is the current address. Other ucore source files can declare these symbols with extern and then use them.
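
As an illustration of how such symbols are typically used, a kernel can zero its .bss area on startup, since edata marks where it begins and end marks where it stops. The sketch below takes the two addresses as parameters so that it can be tried on an ordinary buffer; clear_bss is a hypothetical helper, and whether ucore does exactly this is an assumption to verify against its source.

```c
#include <stddef.h>
#include <string.h>

/* In the kernel, the linker-provided symbols would be declared as
 *     extern char edata[], end[];
 * and passed in as clear_bss(edata, end). */
static size_t clear_bss(char *bss_start, char *bss_end)
{
    size_t len = (size_t)(bss_end - bss_start);

    memset(bss_start, 0, len);  /* zero everything in [bss_start, bss_end) */
    return len;                 /* number of bytes cleared */
}
```

The symbols are declared as arrays rather than as plain variables because only their addresses are meaningful; nothing useful is stored at them.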

Posted by kevinfwb on Tue, 05 Nov 2019 18:18:50 -0800