Hacker-level article: playing new tricks with the in-memory operations of dynamic libraries!

Author: Daoge, a veteran with 10+ years of embedded development experience, focusing on C/C++, embedded systems and Linux.

Contents

  • theory and practice
  • start
    • New dynamic library
    • Problems faced
    • How do you do it?
  • ELF
    • summary
    • ELF file header
    • SHT(section header table)
    • PHT(program header table)
    • Linking view and execution view
    • .dynamic section
    • Dynamic linker (linker)
    • track
  • Memory
    • Base address
    • Memory access
    • Instruction cache
    • verification
    • Using xhook
  • FAQ
    • Can ELF information be read directly from the file?
    • What is the exact method of calculating the address?
    • What impact do the compilation options used by the target ELF have on hooking?
    • Why does hooking occasionally cause a segmentation fault? How should it be handled?
    • Can calls between functions inside the same ELF be hooked?

Other people's experience, our ladder!

Hello, I'm Daoge. Today's topic is the in-memory manipulation of dynamic libraries.

In an article reposted last week, we introduced a technique for "swapping out" the functions that a dynamic library calls, in order to achieve some special purposes.

The technique comes from xHook, an open source project from iQIYI. The GitHub address is: https://github.com/iqiyi/xHook .

The official documentation describes it in terms of Android. Because Android is built on Linux underneath, the hook technique introduced here also applies to other Linux environments.

In this article, we follow the original author step by step to find the target (the address of the called function) and then quietly replace it with the address of another function.

The article is fairly long, but the material is well worth half a day, or even a few days, of study.

It may not improve your programming skills immediately, but it is first-rate material for building up your fundamentals!

As I study, I will add my own notes and understanding in an orange font at some important places. If my understanding is wrong, you are welcome to point it out and discuss it.

To make reading easier, I have also colored the key words in the original text.

theory and practice

On the subject of dynamic libraries, one of the better books on the market is probably Self-Cultivation of Programmers: Linking, Loading and Libraries.

My copy is from the 29th printing, in June 2019, which shows how enduring this book is!

If you have read it, you may feel that its content is too theoretical. Even if you understand the principles, how do you put them into practice? In other words, what can we actually do with this knowledge?

iQIYI's xHook is a perfect application of this theory!

Self-Cultivation of Programmers: Linking, Loading and Libraries is a rare gem. If you are really interested in dynamic libraries, I suggest buying a paper copy to support the authors!

start

New dynamic library

We have a new dynamic library: libtest.so.

Header file test.h

#ifndef TEST_H
#define TEST_H 1

#ifdef __cplusplus
extern "C" {
#endif

void say_hello();

#ifdef __cplusplus
}
#endif

#endif

Source file test.c

#include <stdlib.h>
#include <stdio.h>

void say_hello()
{
    char *buf = malloc(1024);
    if(NULL != buf)
    {
        snprintf(buf, 1024, "%s", "hello\n");
        printf("%s", buf);
    }
}

The say_hello function prints the six characters hello\n (including the trailing \n) to the terminal.

We need a test program: main.

Source file main.c

#include <test.h>

int main()
{
    say_hello();
    return 0;
}

Compile them to produce libtest.so and main respectively, then run:

caikelun@debian:~$ adb push ./libtest.so ./main /data/local/tmp
caikelun@debian:~$ adb shell "chmod +x /data/local/tmp/main"
caikelun@debian:~$ adb shell "export LD_LIBRARY_PATH=/data/local/tmp; /data/local/tmp/main"
hello
caikelun@debian:~$

Great! Although the code of libtest.so looks silly, it does work correctly. What is there to complain about?

So we start using it in the new app!

Unfortunately, as you may have noticed, libtest.so has a serious memory leak: every call to say_hello leaks 1024 bytes of memory.

After the new app was released, the crash rate began to climb, and all kinds of strange crash reports and trouble reports came pouring in.

Problems faced

Fortunately, we fixed libtest.so. But what about next time? We face two problems:

  1. When test coverage is insufficient, how do we promptly detect and accurately locate this kind of problem in an app that is already in production?
  2. If libtest.so were a system library on certain device models, or a closed source third-party library, how could we fix it? And what if we wanted to monitor its behavior?

How do you do it?

If we can hook the function calls made inside a dynamic library (replace, intercept, eavesdrop on, or whatever description you consider correct), we can do many of the things we want.

For example, by hooking malloc, calloc, realloc and free, we can count how much memory each dynamic library allocates and which blocks are still held without being released.

Can this really be done? The answer is: hooking our own process is entirely possible.

Hooking other processes requires root permission (without root, you cannot modify another process's memory space or inject code into it).

Fortunately, we only need to hook ourselves.

Daoge's notes:

If the process you are hooking is not your own, that really is virus territory!

Process-level isolation is normally enforced by the operating system.

ELF

Daoge's notes:

For a detailed introduction to ELF, see an article I wrote earlier: "ELF files, the cornerstone of compiling and linking on Linux: peeling back the layers and exploring them byte by byte".

That article is very detailed; like peeling an onion, it analyzes the structure of an ELF file layer by layer.

It also uses diagrams to map the binary content of an ELF file to the corresponding structure members one by one, which makes it more intuitive.

summary

ELF (Executable and Linkable Format) is an industry-standard binary format, mainly used for executable files, dynamic libraries, object files and core dump files.

When source code is compiled and linked with the Google NDK, the resulting dynamic library or executable file is in ELF format.

Use readelf to view the basic information of an ELF file, and objdump to view its disassembly.

For an overview of ELF format, please refer to here, and for a complete definition, please refer to here.

The most important parts are ELF file header, SHT (section header table), and PHT (program header table).

ELF file header

An ELF file begins with a fixed-format, fixed-length file header (52 bytes on 32-bit architectures, 64 bytes on 64-bit architectures). The header starts with the magic number 0x7F 0x45 0x4C 0x46 (the last three bytes are the visible characters E, L and F).

ELF header information of libtest.so:

caikelun@debian:~$ arm-linux-androideabi-readelf -h ./libtest.so
 
ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12744 (bytes into file)
  Flags:                             0x5000200, Version5 EABI, soft-float ABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         8
  Size of section headers:           40 (bytes)
  Number of section headers:         25
  Section header string table index: 24

The ELF file header records the starting offset, entry size and entry count of both the SHT and the PHT in the current ELF file.

For example, the SHT of libtest.so starts at offset 12744 and consists of 25 entries of 40 bytes each;

the PHT starts at offset 52 and consists of 8 entries of 32 bytes each.
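
To make these header fields concrete, here is a minimal sketch of reading them in C. It assumes a Linux-style <elf.h>, a 32-bit ELF image already read into a buffer, and a host with the same byte order as the file; the helper names read_elf32_header, sht_size and pht_size are made up for illustration.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: check the magic number and copy out the ELF32 header.
 * Returns 0 on success, -1 if buf is too short or is not an ELF file. */
static int read_elf32_header(const unsigned char *buf, size_t len, Elf32_Ehdr *out)
{
    assert(buf != NULL && out != NULL);
    if (len < sizeof(Elf32_Ehdr))
        return -1;
    if (memcmp(buf, ELFMAG, SELFMAG) != 0)   /* 0x7F 'E' 'L' 'F' */
        return -1;
    memcpy(out, buf, sizeof(*out));          /* memcpy avoids unaligned reads */
    return 0;
}

/* Total size in bytes of the section header table (entry size * entry count). */
static size_t sht_size(const Elf32_Ehdr *eh)
{
    return (size_t)eh->e_shentsize * eh->e_shnum;
}

/* Total size in bytes of the program header table. */
static size_t pht_size(const Elf32_Ehdr *eh)
{
    return (size_t)eh->e_phentsize * eh->e_phnum;
}
```

With the values reported for libtest.so, sht_size gives 40 * 25 = 1000 bytes starting at e_shoff = 12744, and pht_size gives 32 * 8 = 256 bytes starting at e_phoff = 52.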

SHT(section header table)

ELF organizes and manages its information in units of sections.

ELF uses the SHT to record the basic information of every section.

This mainly includes the section's type, its offset in the file, its size, its relative virtual address once loaded into memory, its byte alignment in memory, and so on.

SHT of libtest.so:

caikelun@debian:~$ arm-linux-androideabi-readelf -S ./libtest.so
 
There are 25 section headers, starting at offset 0x31c8:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .note.android.ide NOTE            00000134 000134 000098 00   A  0   0  4
  [ 2] .note.gnu.build-i NOTE            000001cc 0001cc 000024 00   A  0   0  4
  [ 3] .dynsym           DYNSYM          000001f0 0001f0 0003a0 10   A  4   1  4
  [ 4] .dynstr           STRTAB          00000590 000590 0004b1 00   A  0   0  1
  [ 5] .hash             HASH            00000a44 000a44 000184 04   A  3   0  4
  [ 6] .gnu.version      VERSYM          00000bc8 000bc8 000074 02   A  3   0  2
  [ 7] .gnu.version_d    VERDEF          00000c3c 000c3c 00001c 00   A  4   1  4
  [ 8] .gnu.version_r    VERNEED         00000c58 000c58 000020 00   A  4   1  4
  [ 9] .rel.dyn          REL             00000c78 000c78 000040 08   A  3   0  4
  [10] .rel.plt          REL             00000cb8 000cb8 0000f0 08  AI  3  18  4
  [11] .plt              PROGBITS        00000da8 000da8 00017c 00  AX  0   0  4
  [12] .text             PROGBITS        00000f24 000f24 0015a4 00  AX  0   0  4
  [13] .ARM.extab        PROGBITS        000024c8 0024c8 00003c 00   A  0   0  4
  [14] .ARM.exidx        ARM_EXIDX       00002504 002504 000100 08  AL 12   0  4
  [15] .fini_array       FINI_ARRAY      00003e3c 002e3c 000008 04  WA  0   0  4
  [16] .init_array       INIT_ARRAY      00003e44 002e44 000004 04  WA  0   0  1
  [17] .dynamic          DYNAMIC         00003e48 002e48 000118 08  WA  4   0  4
  [18] .got              PROGBITS        00003f60 002f60 0000a0 00  WA  0   0  4
  [19] .data             PROGBITS        00004000 003000 000004 00  WA  0   0  4
  [20] .bss              NOBITS          00004004 003004 000000 00  WA  0   0  1
  [21] .comment          PROGBITS        00000000 003004 000065 01  MS  0   0  1
  [22] .note.gnu.gold-ve NOTE            00000000 00306c 00001c 00      0   0  4
  [23] .ARM.attributes   ARM_ATTRIBUTES  00000000 003088 00003b 00      0   0  1
  [24] .shstrtab         STRTAB          00000000 0030c3 000102 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  y (noread), p (processor specific)

The sections most relevant to hooking are:

  • .dynstr: stores the string constants used by dynamic linking (for example, symbol names).
  • .dynsym: stores symbol information (type, starting address, size, index of the symbol name in .dynstr, and so on). Functions are symbols too.
  • .text: the machine instructions generated by compiling the program code.
  • .dynamic: information used by the dynamic linker; it records the external dependencies of the current ELF and the starting positions of other important sections.
  • .got: Global Offset Table. Records the entry addresses of external calls. When the dynamic linker performs relocation, it fills in the real absolute addresses of the external calls here.
  • .plt: Procedure Linkage Table. The trampoline for external calls, mainly used to support relocation of external calls in lazy binding mode. (On Android, currently only the MIPS architecture supports lazy binding.)
  • .rel.plt: relocation information for direct calls to external functions.
  • .rel.dyn: relocation information other than .rel.plt (for example, calling an external function through a global function pointer).

Daoge's notes:

In an ELF file, the .dynamic section is very important!

When a dynamic library is loaded into memory, the dynamic linker reads this section to learn, for example:

  • which other shared objects it depends on;
  • the location of the dynamic symbol table (.dynsym);
  • the location of the dynamic relocation tables;
  • the location of the initialization code;
  • ...

Use the command readelf -d xxx.so to view the contents of .dynamic in a dynamic library.

In addition, the .got and .plt sections mainly exist to support position-independent code.

If you look up -fPIC, any explanation will certainly cover these two concepts.

Roughly speaking, a dynamic library on Linux turns the address-dependent parts of its code segment into "position-independent" ones by "adding a layer of indirection".

As a result, once the code segment of a dynamic library is loaded into physical memory, it can be shared by multiple processes; each process simply maps that physical memory into its own virtual address space.

The "address-dependent" parts are placed in the .got (references to variables) and the .plt (references to functions).

PHT(program header table)

When an ELF is loaded into memory, the unit is the segment. A segment contains one or more sections.

ELF uses the PHT to record the basic information of every segment.

This mainly includes the segment's type, its offset in the file, its size, its relative virtual address once loaded into memory, its byte alignment in memory, and so on.

PHT of libtest.so:

caikelun@debian:~$ arm-linux-androideabi-readelf -l ./libtest.so 

Elf file type is DYN (Shared object file)
Entry point 0x0
There are 8 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x00000034 0x00000034 0x00100 0x00100 R   0x4
  LOAD           0x000000 0x00000000 0x00000000 0x02604 0x02604 R E 0x1000
  LOAD           0x002e3c 0x00003e3c 0x00003e3c 0x001c8 0x001c8 RW  0x1000
  DYNAMIC        0x002e48 0x00003e48 0x00003e48 0x00118 0x00118 RW  0x4
  NOTE           0x000134 0x00000134 0x00000134 0x000bc 0x000bc R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10
  EXIDX          0x002504 0x00002504 0x00002504 0x00100 0x00100 R   0x4
  GNU_RELRO      0x002e3c 0x00003e3c 0x00003e3c 0x001c4 0x001c4 RW  0x4

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .note.android.ident .note.gnu.build-id .dynsym .dynstr .hash .gnu.version .gnu.version_d .gnu.version_r .rel.dyn .rel.plt .plt .text .ARM.extab .ARM.exidx 
   02     .fini_array .init_array .dynamic .got .data 
   03     .dynamic 
   04     .note.android.ident .note.gnu.build-id 
   05     
   06     .ARM.exidx 
   07     .fini_array .init_array .dynamic .got

All segments of type PT_LOAD are mapped (with mmap) into memory by the dynamic linker.
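
As a rough illustration of how the PHT is consulted, the sketch below looks up which segment of a given type covers a relative virtual address. It assumes a Linux-style <elf.h>; find_segment is a hypothetical helper, not part of any library.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the first segment of the given
 * type whose in-memory range [p_vaddr, p_vaddr + p_memsz) contains vaddr,
 * or -1 if there is none. */
static int find_segment(const Elf32_Phdr *ph, size_t n,
                        Elf32_Word type, Elf32_Addr vaddr)
{
    assert(ph != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type == type &&
            vaddr >= ph[i].p_vaddr &&
            vaddr - ph[i].p_vaddr < ph[i].p_memsz)
            return (int)i;
    }
    return -1;
}
```

Applied to the table above, .text (0xf24) falls in the first LOAD segment (R E), while .got (0x3f60) falls in the second LOAD segment (RW) and also inside GNU_RELRO. A region covered by GNU_RELRO is typically made read-only by the linker once relocation has finished, which is worth keeping in mind for the write-permission problem discussed later.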

Linking View and Execution View

Linking view: how the data is organized, in sections, before the ELF is loaded into memory for execution.

Execution view: how the data is organized, in segments, after the ELF is loaded into memory.

The hook operation we care about is a dynamic, in-memory operation. So we mainly care about the execution view, that is, how the data in an ELF is organized and stored after it has been loaded into memory.

.dynamic section

This is a very important and special section; it contains, among other things, the in-memory locations of the other sections in the ELF.

In the execution view, there is always a segment of type PT_DYNAMIC, which contains the contents of the .dynamic section.

Both the hook operation and the dynamic linker's own linking work start from the PT_DYNAMIC segment to find the in-memory location of the .dynamic section, and from there read the information of the other sections.

The .dynamic section of libtest.so:

caikelun@debian:~$ arm-linux-androideabi-readelf -d ./libtest.so 

Dynamic section at offset 0x2e48 contains 30 entries:
  Tag        Type                         Name/Value
 0x00000003 (PLTGOT)                     0x3f7c
 0x00000002 (PLTRELSZ)                   240 (bytes)
 0x00000017 (JMPREL)                     0xcb8
 0x00000014 (PLTREL)                     REL
 0x00000011 (REL)                        0xc78
 0x00000012 (RELSZ)                      64 (bytes)
 0x00000013 (RELENT)                     8 (bytes)
 0x6ffffffa (RELCOUNT)                   3
 0x00000006 (SYMTAB)                     0x1f0
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000005 (STRTAB)                     0x590
 0x0000000a (STRSZ)                      1201 (bytes)
 0x00000004 (HASH)                       0xa44
 0x00000001 (NEEDED)                     Shared library: [libc.so]
 0x00000001 (NEEDED)                     Shared library: [libm.so]
 0x00000001 (NEEDED)                     Shared library: [libstdc++.so]
 0x00000001 (NEEDED)                     Shared library: [libdl.so]
 0x0000000e (SONAME)                     Library soname: [libtest.so]
 0x0000001a (FINI_ARRAY)                 0x3e3c
 0x0000001c (FINI_ARRAYSZ)               8 (bytes)
 0x00000019 (INIT_ARRAY)                 0x3e44
 0x0000001b (INIT_ARRAYSZ)               4 (bytes)
 0x0000001e (FLAGS)                      BIND_NOW
 0x6ffffffb (FLAGS_1)                    Flags: NOW
 0x6ffffff0 (VERSYM)                     0xbc8
 0x6ffffffc (VERDEF)                     0xc3c
 0x6ffffffd (VERDEFNUM)                  1
 0x6ffffffe (VERNEED)                    0xc58
 0x6fffffff (VERNEEDNUM)                 1
 0x00000000 (NULL)                       0x0
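
Reading this table programmatically is a simple scan, since the array ends with a DT_NULL entry. The following is a minimal sketch assuming a Linux-style <elf.h> and a 32-bit ELF; dyn_lookup is a hypothetical helper name.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: scan a .dynamic array (terminated by DT_NULL) for a tag.
 * Returns 1 and stores the entry's value in *val if found, 0 otherwise. */
static int dyn_lookup(const Elf32_Dyn *dyn, Elf32_Sword tag, Elf32_Word *val)
{
    assert(dyn != NULL && val != NULL);
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == tag) {
            *val = dyn->d_un.d_val;
            return 1;
        }
    }
    return 0;
}
```

For libtest.so, looking up DT_JMPREL would yield 0xcb8 (the relative address of .rel.plt), and DT_PLTRELSZ would yield 240 bytes, that is, 240 / 8 = 30 relocation entries.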

Dynamic linker

The dynamic linker program in Android is linker. The source code is here.

The general steps of dynamic linking (for example, in dlopen):

  1. Check the list of already loaded ELFs. (If libtest.so has already been loaded, it is not loaded again; its reference count is simply incremented and the call returns.)
  2. Read the list of external dependencies of libtest.so from its .dynamic section, remove the ELFs that are already loaded from this list, and obtain the complete list of ELFs to load this time (including libtest.so itself).
  3. Load the ELFs in the list one by one. The loading steps are:
     (1) Use mmap (in MAP_PRIVATE mode) to reserve a block of memory large enough for the subsequent mapping of the ELF.
     (2) Read the ELF's PHT and use mmap to map all segments of type PT_LOAD into memory in turn.
     (3) Read each entry from the .dynamic segment, mainly the relative virtual addresses of the sections, then compute and save the absolute virtual address of each section.
     (4) Perform the relocation, which is the most critical step. Relocation information may exist in one or more of the following sections: .rel.plt, .rela.plt, .rel.dyn, .rela.dyn, .rel.android, .rela.android. The dynamic linker has to process the relocation requests in these .relxxx sections one by one. Using the information of the ELFs already loaded, it looks up the address of each required symbol (such as malloc for libtest.so). Once found, it writes the address into the target address specified in .relxxx. These "target addresses" are generally located in .got or .data.
     (5) Increment the ELF's reference count.
  4. Call the constructors of the ELFs in the list one by one. The addresses of these constructors were read earlier from the .dynamic segment (types DT_INIT and DT_INIT_ARRAY). The constructors are called layer by layer in dependency order: the constructors of the ELFs being depended on run first, and libtest.so's own constructor runs last. (An ELF can also define its own destructors, which are called automatically when the ELF is unloaded.)
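
The relocation step (4) can be sketched roughly as follows. This is only an illustration of the idea for 32-bit ARM jump slots, not the actual Android linker code; apply_jump_slots, resolve_fn and demo_resolve are hypothetical names, and the address returned by demo_resolve is made up.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: returns the absolute address of a symbol, or 0. */
typedef uint32_t (*resolve_fn)(const char *name);

/* Stand-in resolver for illustration only; a real linker searches the symbol
 * tables of every loaded ELF. The address is made up. */
static uint32_t demo_resolve(const char *name)
{
    return strcmp(name, "malloc") == 0 ? 0xb6e51234u : 0;
}

/* For every R_ARM_JUMP_SLOT entry, write the resolved symbol address into
 * image + r_offset, where image is the ELF's load base.
 * Returns the number of slots written. */
static size_t apply_jump_slots(unsigned char *image,
                               const Elf32_Rel *rel, size_t count,
                               const Elf32_Sym *symtab, const char *strtab,
                               resolve_fn resolve)
{
    size_t written = 0;
    assert(image && rel && symtab && strtab && resolve);
    for (size_t i = 0; i < count; i++) {
        if (ELF32_R_TYPE(rel[i].r_info) != R_ARM_JUMP_SLOT)
            continue;
        const Elf32_Sym *sym = &symtab[ELF32_R_SYM(rel[i].r_info)];
        uint32_t addr = resolve(strtab + sym->st_name);
        if (addr == 0)
            continue;                    /* unresolved: leave the slot alone */
        memcpy(image + rel[i].r_offset, &addr, sizeof(addr));
        written++;
    }
    return written;
}
```

The key point is the last write: whatever address ends up in the slot at image + r_offset is what callers will jump to.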

Wait a minute! We seem to have found something! Take another look at the relocation step.

Could we simply take the "target address" from these .relxxx sections and write a new function address into it, thereby completing the hook? Maybe.

track

This is easy to verify with static analysis. Take the armeabi-v7a build of libtest.so as an example.

First, let's look for the assembly code of the say_hello function.

caikelun@debian:~/$ arm-linux-androideabi-readelf -s ./libtest.so

Symbol table '.dynsym' contains 58 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00000000     0 FUNC    GLOBAL DEFAULT  UND __cxa_finalize@LIBC (2)
     2: 00000000     0 FUNC    GLOBAL DEFAULT  UND snprintf@LIBC (2)
     3: 00000000     0 FUNC    GLOBAL DEFAULT  UND malloc@LIBC (2)
     4: 00000000     0 FUNC    GLOBAL DEFAULT  UND __cxa_atexit@LIBC (2)
     5: 00000000     0 FUNC    GLOBAL DEFAULT  UND printf@LIBC (2)
     6: 00000f61    60 FUNC    GLOBAL DEFAULT   12 say_hello
...............
...............

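Found it! say_hello is at address f61, and its instructions occupy 60 bytes (the Size column is decimal). The lowest bit of the address is set because say_hello is a Thumb function, so the code itself starts at f60.

Use objdump to view the disassembly of say_hello.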

caikelun@debian:~$ arm-linux-androideabi-objdump -D ./libtest.so
...............
...............
00000f60 <say_hello@@Base>:
     f60:   b5b0        push    {r4, r5, r7, lr}
     f62:   af02        add r7, sp, #8
     f64:   f44f 6080   mov.w   r0, #1024   ; 0x400
     f68:   f7ff ef34   blx dd4 <malloc@plt>
     f6c:   4604        mov r4, r0
     f6e:   b16c        cbz r4, f8c <say_hello@@Base+0x2c>
     f70:   a507        add r5, pc, #28 ; (adr r5, f90 <say_hello@@Base+0x30>)
     f72:   a308        add r3, pc, #32 ; (adr r3, f94 <say_hello@@Base+0x34>)
     f74:   4620        mov r0, r4
     f76:   f44f 6180   mov.w   r1, #1024   ; 0x400
     f7a:   462a        mov r2, r5
     f7c:   f7ff ef30   blx de0 <snprintf@plt>
     f80:   4628        mov r0, r5
     f82:   4621        mov r1, r4
     f84:   e8bd 40b0   ldmia.w sp!, {r4, r5, r7, lr}
     f88:   f001 ba96   b.w 24b8 <_Unwind_GetTextRelBase@@Base+0x8>
     f8c:   bdb0        pop {r4, r5, r7, pc}
     f8e:   bf00        nop
     f90:   7325        strb    r5, [r4, #12]
     f92:   0000        movs    r0, r0
     f94:   6568        str r0, [r5, #84]   ; 0x54
     f96:   6c6c        ldr r4, [r5, #68]   ; 0x44
     f98:   0a6f        lsrs    r7, r5, #9
     f9a:   0000        movs    r0, r0
...............
...............

The call to malloc corresponds to the instruction blx dd4, which jumps to address dd4.

Let's see what is at that address:

caikelun@debian:~$ arm-linux-androideabi-objdump -D ./libtest.so
...............
...............
00000dd4 <malloc@plt>:
 dd4:   e28fc600    add ip, pc, #0, 12
 dd8:   e28cca03    add ip, ip, #12288  ; 0x3000
 ddc:   e5bcf1b4    ldr pc, [ip, #436]! ; 0x1b4
...............
...............

Sure enough, it jumps into .plt. After a few address calculations, it finally jumps to the address stored at 3f90, which is a function pointer.

A brief explanation: because of the ARM processor's three-stage pipeline, the pc value read by the first instruction is the address of the currently executing instruction + 8.

So: dd4 + 8 + 3000 + 1b4 = 3f90.
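
The same arithmetic written out as code, as a small sketch. plt_got_slot is a hypothetical helper that mirrors the three instructions of the stub, taking the immediates shown in the disassembly as parameters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the three-instruction ARM PLT stub:
 *   add ip, pc, #imm_a
 *   add ip, ip, #imm_b
 *   ldr pc, [ip, #imm_c]!
 * In ARM state, reading pc yields the current instruction's address + 8.
 * Returns the relative address of the .got slot the stub loads from. */
static uint32_t plt_got_slot(uint32_t stub_addr,
                             uint32_t imm_a, uint32_t imm_b, uint32_t imm_c)
{
    assert((stub_addr & 3u) == 0);        /* ARM-state stubs are word aligned */
    uint32_t ip = stub_addr + 8 + imm_a;  /* add ip, pc, #imm_a */
    ip += imm_b;                          /* add ip, ip, #imm_b */
    return ip + imm_c;                    /* address whose content goes to pc */
}
```

For malloc@plt, plt_got_slot(0xdd4, 0, 0x3000, 0x1b4) evaluates to 0x3f90.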

Where is address 3f90?

caikelun@debian:~$ arm-linux-androideabi-objdump -D ./libtest.so
...............
...............
00003f60 <.got>:
    ...
    3f70:   00002604    andeq   r2, r0, r4, lsl #12
    3f74:   00002504    andeq   r2, r0, r4, lsl #10
    ...
    3f88:   00000da8    andeq   r0, r0, r8, lsr #27
    3f8c:   00000da8    andeq   r0, r0, r8, lsr #27
    3f90:   00000da8    andeq   r0, r0, r8, lsr #27
...............
...............

Sure enough, it is in .got.

While we are at it, let's also look at .rel.plt:

caikelun@debian:~$ arm-linux-androideabi-readelf -r ./libtest.so

Relocation section '.rel.plt' at offset 0xcb8 contains 30 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00003f88  00000416 R_ARM_JUMP_SLOT   00000000   __cxa_atexit@LIBC
00003f8c  00000116 R_ARM_JUMP_SLOT   00000000   __cxa_finalize@LIBC
00003f90  00000316 R_ARM_JUMP_SLOT   00000000   malloc@LIBC
...............
...............

It's no coincidence that malloc's address is actually stored in 3f90!

Daoge's notes:

The .rel.plt section holds a relocation table, that is, it records which function addresses need to be relocated.

Once the linker has loaded all the dependent shared objects into memory, it merges the symbols of every shared object into a global symbol table.

It then checks .rel.plt in each shared object to see whether any addresses need to be relocated.

If so, it looks up the symbol's memory address in the global symbol table and writes it into the corresponding slot in .got.

What are you waiting for? Change the code quickly. Our main.c should read as follows:

#include <stdlib.h>
#include <stdio.h>
#include <test.h>

void *my_malloc(size_t size)
{
    printf("%zu bytes memory are allocated by libtest.so\n", size);
    return malloc(size);
}

int main()
{
    void **p = (void **)0x3f90;
    *p = (void *)my_malloc; // do hook
    
    say_hello();
    return 0;
}

Compile and run:

caikelun@debian:~$ adb push ./main /data/local/tmp
caikelun@debian:~$ adb shell "chmod +x /data/local/tmp/main"
caikelun@debian:~$ adb shell "export LD_LIBRARY_PATH=/data/local/tmp; /data/local/tmp/main"
Segmentation fault
caikelun@debian:~$

The idea is correct, but it still fails because this code has three problems:

  1. 3f90 is a relative memory address; it has to be converted into an absolute address.
  2. The absolute address corresponding to 3f90 may not be writable. Assigning to it directly causes a segmentation fault.
  3. Even if the new function address is written successfully, my_malloc may still not be executed, because the processor has an instruction cache.

We need to solve these problems.

Memory

Base address

In a process's memory space, the load addresses of the various ELFs are random. The load address, that is, the base address, can only be obtained at run time.

Daoge's notes:

When we inspect a dynamic library, the entry address we see is 0x0000_0000.

When the dynamic library is loaded into memory, its load address is not fixed, because it depends on the loading order.

There is another way to look at it: for a given process, the order in which all of its dependent dynamic libraries are loaded is deterministic.

So the load address of each dynamic library is also fixed, and in theory the relocated code segment could be saved after the first relocation.

Then, the next time the process starts, no relocation would be needed, which speeds up program startup.

We need to know the base address of the ELF in order to convert relative addresses into absolute addresses.

That's right: if you are familiar with Linux development, you will know that we can call dl_iterate_phdr directly. See here for the detailed definition.

Daoge's notes:

dl_iterate_phdr is a really useful function. Through a callback, it gives you the load address and other information of every dynamic library.

Without it, much of this information has to be obtained from /proc/xxx/maps, which is slow because a lot of string data has to be processed.
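
For reference, a minimal sketch of looking up a base address with dl_iterate_phdr. It assumes a glibc-style <link.h> (which needs _GNU_SOURCE); struct base_query, on_object and find_base are hypothetical names introduced for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical search state passed through dl_iterate_phdr's data pointer. */
struct base_query {
    const char *needle;   /* substring of the library path, e.g. "libtest.so" */
    uintptr_t   base;     /* filled in on a match */
    int         seen;     /* number of objects visited so far */
};

/* Called once per loaded ELF object; a nonzero return stops the iteration. */
static int on_object(struct dl_phdr_info *info, size_t size, void *data)
{
    struct base_query *q = data;
    (void)size;
    q->seen++;
    if (info->dlpi_name != NULL && strstr(info->dlpi_name, q->needle) != NULL) {
        q->base = (uintptr_t)info->dlpi_addr;
        return 1;
    }
    return 0;
}

/* Returns the load base of the first object whose path contains needle,
 * or 0 if none matches. If seen is not NULL, it receives the visit count. */
static uintptr_t find_base(const char *needle, int *seen)
{
    struct base_query q = { needle, 0, 0 };
    assert(needle != NULL);
    dl_iterate_phdr(on_object, &q);
    if (seen != NULL)
        *seen = q.seen;
    return q.base;
}
```

In the scenario of this article, find_base("libtest.so", NULL) would be expected to return the base address of libtest.so, provided the platform offers dl_iterate_phdr at all.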

Well, wait a minute. Years of Android development experience tells us that we'd better take another look at the linker.h header file in NDK:

#if defined(__arm__)

#if __ANDROID_API__ >= 21
int dl_iterate_phdr(int (*__callback)(struct dl_phdr_info*, size_t, void*), void* __data) __INTRODUCED_IN(21);
#endif /* __ANDROID_API__ >= 21 */

#else
int dl_iterate_phdr(int (*__callback)(struct dl_phdr_info*, size_t, void*), void* __data);
#endif

What? On ARM, Android versions below 5.0 do not support dl_iterate_phdr!

Our app has to support every Android version from 4.0 up.

And ARM of all architectures, how could we not support it?! How is anyone supposed to write code like this?!

Fortunately, there is another option: we can parse /proc/self/maps:

root@android:/ # ps | grep main
ps | grep main
shell     7884  7882  2616   1016  hrtimer_na b6e83824 S /data/local/tmp/main

root@android:/ # cat /proc/7884/maps
cat /proc/7884/maps

address           perms offset  dev   inode       pathname
---------------------------------------------------------------------
...........
...........
b6e42000-b6eb5000 r-xp 00000000 b3:17 57457      /system/lib/libc.so
b6eb5000-b6eb9000 r--p 00072000 b3:17 57457      /system/lib/libc.so
b6eb9000-b6ebc000 rw-p 00076000 b3:17 57457      /system/lib/libc.so
b6ec6000-b6ec9000 r-xp 00000000 b3:19 753708     /data/local/tmp/libtest.so
b6ec9000-b6eca000 r--p 00002000 b3:19 753708     /data/local/tmp/libtest.so
b6eca000-b6ecb000 rw-p 00003000 b3:19 753708     /data/local/tmp/libtest.so
b6f03000-b6f20000 r-xp 00000000 b3:17 32860      /system/bin/linker
b6f20000-b6f21000 r--p 0001c000 b3:17 32860      /system/bin/linker
b6f21000-b6f23000 rw-p 0001d000 b3:17 32860      /system/bin/linker
b6f25000-b6f26000 r-xp 00000000 b3:19 753707     /data/local/tmp/main
b6f26000-b6f27000 r--p 00000000 b3:19 753707     /data/local/tmp/main
becd5000-becf6000 rw-p 00000000 00:00 0          [stack]
ffff0000-ffff1000 r-xp 00000000 00:00 0          [vectors]
...........
...........

maps returns the mmap mapping information of the specified process's memory space, including the various dynamic libraries, executable files (such as linker), the stack, the heap, and even font files.

The detailed description of maps format is shown here.

Our libtest.so has three lines in maps.

The start address b6ec6000 of the first line, the one with offset 0, is the base address we are looking for in most cases.
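
Picking out that line can be sketched as below. maps_line_base is a hypothetical helper that parses a single maps line; it assumes the usual "start-end perms offset dev inode pathname" layout shown above.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one line of /proc/<pid>/maps. Returns 1 and sets
 * *base when the line maps file offset 0 of a path containing name. */
static int maps_line_base(const char *line, const char *name, uintptr_t *base)
{
    uintptr_t     start, end;
    unsigned long offset;
    char          perms[5];

    assert(line != NULL && name != NULL && base != NULL);
    if (strstr(line, name) == NULL)
        return 0;                        /* some other mapping */
    if (sscanf(line, "%" SCNxPTR "-%" SCNxPTR " %4s %lx",
               &start, &end, perms, &offset) != 4)
        return 0;                        /* malformed line */
    if (offset != 0)
        return 0;                        /* not the start of the file */
    *base = start;
    return 1;
}
```

Fed the three libtest.so lines above, only the first one (offset 00000000) is accepted, giving a base of 0xb6ec6000.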

Memory access

The information returned by maps already includes the access permissions.

To perform the hook, we need write permission, which can be granted with mprotect:

#include <sys/mman.h>

int mprotect(void *addr, size_t len, int prot);

Note: memory access permissions can only be changed in units of pages.

A detailed description of mprotect is provided here.
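
Because of the page granularity, the address has to be rounded down to a page boundary first. A small sketch, with page_start and make_page_writable as hypothetical helpers and the page size taken from sysconf rather than hard-coded:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr down to the start of its page (page_size must be a power of two). */
static uintptr_t page_start(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Hypothetical helper: make the single page containing addr readable and
 * writable. Returns 0 on success, -1 on failure (errno is set by mprotect). */
static int make_page_writable(void *addr)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    uintptr_t start = page_start((uintptr_t)addr, (uintptr_t)page_size);
    return mprotect((void *)start, (size_t)page_size, PROT_READ | PROT_WRITE);
}
```

In our example, base address b6ec6000 plus 3f90 gives b6ec9f90. With 4096-byte pages, that lies in the page starting at b6ec9000, which maps shows as r--p, so write permission really does have to be added before patching.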

Instruction cache

Note that the section type of both .got and .data is PROGBITS, the same type used for executable code, so the processor may have cached this data.

After modifying the memory, we need to clear the processor's instruction cache so that it reads this part of memory again.

This is done by calling __builtin___clear_cache:

void __builtin___clear_cache (char *begin, char *end);

Note: as with mprotect, we clear the instruction cache in units of pages. See here for a detailed description of __builtin___clear_cache.

verification

We modify main.c to read:

#include <inttypes.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <test.h>

#define PAGE_START(addr) ((addr) & PAGE_MASK)
#define PAGE_END(addr)   (PAGE_START(addr) + PAGE_SIZE)

void *my_malloc(size_t size)
{
    printf("%zu bytes memory are allocated by libtest.so\n", size);
    return malloc(size);
}

void hook()
{
    char       line[512];
    FILE      *fp;
    uintptr_t  base_addr = 0;
    uintptr_t  addr;

    //find base address of libtest.so
    if(NULL == (fp = fopen("/proc/self/maps", "r"))) return;
    while(fgets(line, sizeof(line), fp))
    {
        if(NULL != strstr(line, "libtest.so") &&
           sscanf(line, "%"PRIxPTR"-%*lx %*4s 00000000", &base_addr) == 1)
            break;
    }
    fclose(fp);
    if(0 == base_addr) return;

    //the absolute address
    addr = base_addr + 0x3f90;
    
    //add write permission
    mprotect((void *)PAGE_START(addr), PAGE_SIZE, PROT_READ | PROT_WRITE);

    //replace the function address
    *(void **)addr = my_malloc;

    //clear instruction cache
    __builtin___clear_cache((void *)PAGE_START(addr), (void *)PAGE_END(addr));
}

int main()
{
    hook();
    
    say_hello();
    return 0;
}

Recompile and run:

caikelun@debian:~$ adb push ./main /data/local/tmp
caikelun@debian:~$ adb shell "chmod +x /data/local/tmp/main"
caikelun@debian:~$ adb shell "export LD_LIBRARY_PATH=/data/local/tmp; /data/local/tmp/main"
1024 bytes memory are allocated by libtest.so
hello
caikelun@debian:~$

Yes, it worked!

We didn't modify the libtest.so code, or even recompile it. We only modified the main program.

The source codes of libtest.so and main are placed on github and can be obtained from here.

(depending on the compiler you use or the version of the compiler, in the generated libtest.so, the address corresponding to malloc may no longer be 0x3f90. In this case, you need to confirm with readelf and then modify it in main.c.)

Using xhook

Of course, we have an open source tool library called xhook.

Using xhook, you can more gracefully complete the hook operation of libtest.so without worrying about the compatibility problems caused by hard coding 0x3f90.

#include <stdlib.h>
#include <stdio.h>
#include <test.h>
#include <xhook.h>

void *my_malloc(size_t size)
{
    printf("%zu bytes memory are allocated by libtest.so\n", size);
    return malloc(size);
}

int main()
{
    xhook_register(".*/libtest\\.so$", "malloc", my_malloc, NULL);
    xhook_refresh(0);
    
    say_hello();
    return 0;
}

xhook supports armeabi, armeabi-v7a and arm64-v8a.

It supports Android 4.0 and above (API level >= 14).

It has been verified by product level stability and compatibility. You can get xhook here.

Summarize the process of executing PLT hook in xhook:

  1. Read maps and get the start address of ELF.
  2. Verify ELF header information.
  3. In the PHT, find the segment whose type is PT_LOAD and whose offset is 0. Calculate the ELF base address.
  4. In the PHT, find the segment whose type is PT_DYNAMIC, obtain the .dynamic section from it, and obtain the memory addresses of the other sections from the .dynamic section.
  5. In the .dynstr section, find the index value corresponding to the symbol to be hooked.
  6. Traverse all .relxxx sections (relocation sections) to find the items that match the symbol index and symbol type. For each such relocation item, perform the hook operation. The hook process is as follows:

     (1) Read maps and confirm the memory access permission of the current hook address.
     (2) If the permission is not both readable and writable, use mprotect to make it readable and writable.
     (3) If the caller requires it, save the current value at the hook address so that it can be returned.
     (4) Replace the value at the hook address with the new value. (This executes the hook.)
     (5) If the memory access permission was changed with mprotect earlier, restore the previous permission now.
     (6) Clear the processor instruction cache for the memory page where the hook address is located.

FAQ

Can ELF information be read directly from the file?

Yes.

Moreover, for format parsing, reading the file is the most reliable way, because in principle many sections do not need to stay in memory while the ELF is running and could be discarded after loading, which would save a small amount of memory.

In practice, however, the dynamic linkers and loaders of the various platforms do not do this. They presumably consider that the added complexity is not worth the gain.

So we can read all kinds of ELF information from memory, and reading the file would only add a performance cost.

In addition, APP may not have access to ELF files of some system libraries.

What is the exact method of calculating the base address?

As you may have noticed, when we introduced how to obtain the base address of libtest.so earlier, we said "in most cases" in order to simplify the concept and make the coding easier.

For hook, the accurate base address calculation process is:

  1. Find the row with offset 0 and pathname as the target ELF in maps. Save the start address of this line as p0.
  2. In the PHT of the ELF, find the first segment whose type is PT_LOAD and whose offset is 0. Save the relative virtual memory address (p_vaddr) of this segment as p1.
  3. p0 - p1 is the current base address of the ELF.

For most ELF files, the p_vaddr of the first PT_LOAD segment is 0.

In addition, we look for the line with offset 0 in maps because we want to verify the ELF file header in memory before executing the hook, to make sure that we are operating on a valid ELF. The ELF file header can only appear in the mmap region with offset 0.

You can search for "load_bias" in the source code of the Android linker to find many detailed comments. You can also refer to the program logic in linker that assigns the load_bias_ variable.
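The base address calculation above can be sketched as a small function over the program header table. The function name `elf_base` is hypothetical, and the sketch assumes a 64-bit ELF (Elf64_Phdr from <elf.h>); a 32-bit target would use Elf32_Phdr instead.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: p0 is the start address of the maps line with
 * offset 0, phdr/phnum describe the program header table of the ELF.
 * Returns p0 - p1, where p1 is the p_vaddr of the first PT_LOAD segment
 * with file offset 0, or 0 if no such segment exists. */
static uintptr_t elf_base(uintptr_t p0, const Elf64_Phdr *phdr, size_t phnum)
{
    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_LOAD && phdr[i].p_offset == 0)
            return p0 - (uintptr_t)phdr[i].p_vaddr;
    }
    return 0;
}
```

When p_vaddr is 0, which is the common case, the result is simply p0, which is why the shortcut used earlier in the article works for most libraries.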

What impact does the compilation options used by the target ELF have on hook?

There will be some impact.

There are three situations for calling external functions:

  1. Direct call. It can be hooked regardless of the compilation options. The address of the external function is always stored in .got.
  2. Call through a global function pointer. It can be hooked regardless of the compilation options. The address of the external function is always stored in .data.
  3. Call through a local function pointer. With -O2 (the default), the call is optimized into a direct call (the same as case 1). With -O0, a pointer to the external function that was assigned to the local variable before the hook was executed cannot be hooked through PLT, while one that is assigned after the hook was executed can be.
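For reference, the three situations listed above correspond to C code of roughly the following shape. The function and variable names are made up for this illustration. How each call is actually compiled depends on the compiler and its options, so the disassembly should be checked rather than relying on the source.

```c
#include <stdlib.h>

/* Case 2: a global function pointer initialized with the address of an
 * external function. */
void *(*global_malloc)(size_t) = malloc;

/* Case 1: direct call to the external function. */
void *alloc_direct(size_t n)
{
    return malloc(n);
}

/* Case 2: call through the global function pointer. */
void *alloc_global(size_t n)
{
    return global_malloc(n);
}

/* Case 3: call through a local function pointer. */
void *alloc_local(size_t n)
{
    void *(*local_malloc)(size_t) = malloc;
    return local_malloc(n);
}
```

Compiling these functions into a shared library with -O0 and with -O2 and comparing the output of objdump -d is a quick way to see the differences discussed here.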

In general, production-level ELF files are rarely compiled with -O0, so there is no need to worry much about this.

However, if you want your ELF to be as hard as possible for others to PLT hook, you can try compiling it with -O0, assigning the pointers of the external functions to local function pointer variables as early as possible, and then always calling the external functions through these local function pointers.

In short, reading the C/C++ source code is of little use for understanding this problem. You need to look at the disassembly of the ELF generated with different compilation options and compare the differences, in order to know under what conditions, and for what reason, a call cannot be PLT hooked.

What is the reason for the occasional segment error encountered in hook? How should it be handled?

We sometimes encounter such problems:

  1. After reading /proc/self/maps, we find that a memory region is readable. When we read the contents of the region to verify the ELF file header, a segment error (sig: SIGSEGV, code: SEGV_ACCERR) occurs.
  2. We use mprotect() to make a memory region writable, and mprotect() reports success. We then read /proc/self/maps again and confirm that the region is indeed writable. When we perform the write (replacing the function pointer, that is, executing the hook), a segment error (sig: SIGSEGV, code: SEGV_ACCERR) occurs.
  3. The ELF file header is read and verified successfully. When we go on to read the PHT or the .dynamic section using the relative addresses in the ELF header, a segment error (sig: SIGSEGV, code: SEGV_ACCERR or SEGV_MAPERR) occurs.

Possible reasons are:

  1. The memory space of the process is shared by multiple threads. While we are executing the hook, other threads (or even linker) may be executing dlclose(), or using mprotect() to change the access permissions of the same memory region.
  2. Android ROMs from different manufacturers, models and versions may have undocumented behaviors. For example, in some cases there are write protection or read protection mechanisms for certain memory regions, and these mechanisms are not reflected in the contents of /proc/self/maps.

Problem analysis:

  1. A segment error while reading memory is actually harmless.
  2. During hook execution, there is only one place where we write data directly to a calculated memory address: the key line that replaces the function pointer. As long as there is no logic error elsewhere, a failed write here will not damage other memory regions.
  3. When an app process starts on the Android platform, the loader has already registered a signal handler on our behalf, so that when the app crashes it can communicate with the system's debuggerd daemon. debuggerd uses ptrace to debug the crashed process, collects the crash-site information it needs, records it in a tombstone file, and then the app terminates itself.
  4. The system will accurately send the segment error signal to the "thread with segment error".
  5. We hope to have a secret and controllable way to avoid APP crash caused by segment errors.

First clarify a point:

Don't look at segment errors only from the perspective of application-layer development. A segment error is not a monster, but a normal way for the kernel to communicate with user processes.

When a user process accesses a virtual memory address that it has no permission for, or that is not mapped, the kernel sends a SIGSEGV signal to the user process to notify it. That's all.

As long as the location of the segment error is controllable, we can handle it in the user process.

Solution:

  1. Before the hook logic enters a dangerous area (code that reads or writes a directly calculated memory address), set a global flag, and reset the flag after leaving the dangerous area.
  2. Register our own signal handler that catches only segment errors. In the signal handler, check the flag to decide whether the current thread is inside a dangerous area. If it is, use siglongjmp to jump out of the signal handler, straight to the "next line of code outside the dangerous area" that was set up in advance. If it is not, restore the signal handler that the loader registered for us earlier, and then simply return. The system will then deliver the segment error signal to our thread again. Because the previous signal handler has been restored, the signal now enters the default system signal handler and follows the normal logic.
  3. We call this mechanism SFP (segment fault protection) for short.
  4. Note: SFP needs a switch so that we can turn it on and off at any time. During app development and debugging, SFP should always be off, so that segment errors caused by coding mistakes are not hidden and can be fixed. After the official release, SFP should be turned on to make sure that the app does not crash. (Of course, it is also possible to turn SFP off for a sample of users, in order to observe and analyze crashes caused by the hook mechanism itself.)

For the specific code, refer to the implementation in xhook and search for siglongjmp and sigsetjmp in the source code.

Can calls between ELF internal functions hook?

The hook method introduced here is PLT hook, which cannot hook calls between functions inside the same ELF.

Note from Daoge:

External functions are recorded in the .plt section, so you can find the relocation address step by step from the sections and then modify it.

For internal functions, such as a function decorated with the static keyword, the compiler may directly "hard code" the address of the function where it is referenced when compiling.

That's why: if a function is only used inside a file, it's best to add the static keyword.

One reason is safety: it prevents name clashes with symbols in other files. Another reason is faster startup, because no relocation is needed!

inline hook can do this. You need to know the symbol name or address of the internal function you want to hook, and then you can hook.

There are many open source and non open source inline hook implementations, such as:

substrate: http://www.cydiasubstrate.com/
frida: https://www.frida.re/

While the inline hook scheme is powerful, it may bring the following problems:

  1. Because the machine instructions (assembly code) in the ELF have to be parsed and modified directly, there may be different compatibility and stability problems across processor architectures, processor instruction sets, compiler optimization options and operating system versions.
  2. When a problem occurs, it may be difficult to analyze and locate. Some well-known inline hook schemes are closed source.
  3. The implementation is relatively complex and difficult.
  4. There are relatively many unknown pitfalls. You can search for them on Google yourself.

It is suggested that if PLT hook is enough, you don't have to try inline hook.

------ End ------

The article comes from: https://my.oschina.net/nomagic/blog/1806011 .

I have reprinted this article in a private letter, but I haven't received a reply. Since the article is well written, let's share it with you first.

In case of infringement, please delete the text by private letter. Thank you!

Posted by danaman on Tue, 02 Nov 2021 00:19:23 -0700