Preface
This topic sits close to the low-level boundary of the C language and takes some time to learn. Once we understand it, though, we can see past the surface of a C function and really understand how a function call works. My own knowledge is limited, so if anything below is wrong, corrections are welcome.
Prerequisites
- A basic understanding of functions (here, "function" means a user-defined function by default)
- The address space of a C program
- The basic registers
- Some assembly language
Concept of function
Functions should be familiar to everyone, so I won't elaborate here.
Let's just take a quick look at the general form:
ret_type fun_name(para1, * )
{
    statement; // statement item
}

ret_type: return type
fun_name: function name
para1: function parameters
C program address space (important: memorize this)
We often say that "a global variable lives for the whole program", "a variable declared static lives longer", and, most importantly, "a temporary variable is destroyed when its function returns". But we need to know why.
In C, every variable we create has its own storage class, which decides where it is placed. Just as cars are parked in a parking lot rather than inside an office building, every kind of data has its own region.
Let's look at some code that prints one address from each region to verify this:
#include <stdio.h>
#include <stdlib.h>

int g_val1 = 10;
int g_val2 = 10;
int g_val3;
int g_val4;

int main()
{
    const char* str = "abcdef";
    printf("code: %p\n", (void*)main);              // code region
    printf("read only : %p\n", (void*)str);         // string literal
    printf("init g_val1 : %p\n", (void*)&g_val1);   // initialized globals
    printf("init g_val2 : %p\n", (void*)&g_val2);
    printf("uninit g_val3 : %p\n", (void*)&g_val3); // uninitialized globals
    printf("uninit g_val4 : %p\n", (void*)&g_val4);

    char* p1 = (char*)malloc(sizeof(char) * 10);
    char* p2 = (char*)malloc(sizeof(char) * 10);
    printf("heap addr : %p\n", (void*)p1);          // heap
    printf("heap addr : %p\n", (void*)p2);

    printf("stack addr : %p\n", (void*)&str);       // locals on the stack
    printf("stack addr : %p\n", (void*)&p1);
    printf("stack addr : %p\n", (void*)&p2);

    free(p1);
    free(p2);
    return 0;
}
The output shows that the local variables str, p1 and p2 are stored on the stack, and that stack space is allocated toward lower addresses. The exact addresses differ between platforms and from run to run.
Related registers
Function calls are closely tied to the registers in the CPU. Here is some basic background:
- eax: general-purpose register; holds temporary data and is often used for the return value
- ebx: general-purpose register; holds temporary data
- ebp: base pointer register; points to the bottom of the current stack frame
- esp: stack pointer register; points to the top of the stack
- eip: instruction pointer register; holds the address of the next instruction to be executed
Related assembly language
- mov: data transfer instruction
- push: pushes data onto the stack; esp (the stack-top register) changes as well
- pop: pops data off the stack into the specified location; esp changes as well
- sub: subtraction instruction
- add: addition instruction
- call: function call; (1) pushes the return address, (2) transfers control to the target function
- jmp: transfers control to the target address by modifying eip
- ret: pops the return address off the stack into eip, similar to a pop eip instruction
After so much background, this may feel dry and unrelated to function stack frames. Don't worry; now we start the main content.
Function stack frame
To make this easier to follow, we will picture the stack space in diagrams, so there will be a number of drawings.
We know that main is also a function and is itself called, so main gets a stack frame as well.
Sample code
#include <stdio.h>

int MyAdd(int a, int b)
{
    int c = a + b;
    return c;
}

int main()
{
    int x = 0xA;
    int y = 0xB;
    int z = 0;
    z = MyAdd(x, y);   // pass the locals x and y as arguments
    printf("z = %d\n", z);
    return 0;
}
Switch to the disassembly view in the debugger and open the registers window.
I'll copy the assembly code here, and we'll analyze it step by step.
int main()
{
00821E40  push        ebp
00821E41  mov         ebp,esp
00821E43  sub         esp,0E4h
00821E49  push        ebx
00821E4A  push        esi
00821E4B  push        edi
00821E4C  lea         edi,[ebp-24h]
00821E4F  mov         ecx,9
00821E54  mov         eax,0CCCCCCCCh
00821E59  rep stos    dword ptr es:[edi]
00821E5B  mov         ecx,82C003h
00821E60  call        0082130C
    int x = 0xA;
00821E65  mov         dword ptr [ebp-8],0Ah
    int y = 0xB;
00821E6C  mov         dword ptr [ebp-14h],0Bh
    int z = 0;
00821E73  mov         dword ptr [ebp-20h],0
    z = MyAdd(x, y);
00821E7A  mov         eax,dword ptr [ebp-14h]
00821E7D  push        eax
00821E7E  mov         ecx,dword ptr [ebp-8]
00821E81  push        ecx
00821E82  call        008211E5
00821E87  add         esp,8
00821E8A  mov         dword ptr [ebp-20h],eax
    printf("z = %d\n", z);
00821E8D  mov         eax,dword ptr [ebp-20h]
00821E90  push        eax
00821E91  push        827BCCh
00821E96  call        008213A2
00821E9B  add         esp,8
    return 0;
00821E9E  xor         eax,eax
}
00821EA0  pop         edi
00821EA1  pop         esi
00821EA2  pop         ebx
00821EA3  add         esp,0E4h
00821EA9  cmp         ebp,esp
00821EAB  call        00821235
00821EB0  mov         esp,ebp
00821EB2  pop         ebp
00821EB3  ret
- ebp points to the bottom of the current stack frame
- esp points to the top of the stack
- eip points to the next instruction to be executed, which has not been executed yet
Step 1
int x = 0xA;
00821E65 mov dword ptr [ebp-8],0Ah //Use the 4 bytes at ebp-8 as x and store its value there
int y = 0xB;
00821E6C mov dword ptr [ebp-14h],0Bh //Use the 4 bytes at ebp-14h as y and store its value there
int z = 0;
00821E73 mov dword ptr [ebp-20h],0 //Use the 4 bytes at ebp-20h as z and store its value there
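Notice that x, y and z are not stored contiguously: they sit 0Ch bytes apart even though each int needs only 4 bytes. This is a characteristic of Visual Studio debug builds, which leave padding around local variables. The padding is most likely there to help detect out-of-bounds writes rather than to hide addresses, and other compilers or release builds may lay the variables out differently.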
Step 2
00821E7A mov eax,dword ptr [ebp-14h]
Load the value stored at ebp-14h (that is, y) into eax.
eax is a general-purpose register that holds temporary data and is often used for return values.
00821E7D push eax
The push instruction puts the value of eax onto the stack. At the same time the top of the stack moves: esp decreases by 4 bytes, because a 32-bit register (here holding the int y) is pushed.
Stack top after push
00821E7E mov ecx,dword ptr [ebp-8]
Load the value stored at ebp-8 (that is, x) into ecx.
00821E81 push ecx
As above, the value of ecx is pushed onto the stack, and the top of the stack moves down by another 4 bytes.
Conclusion
- The temporary copies of the arguments (the formal parameters) are created before the function is called
- The parameters are instantiated from right to left
- The formal parameters occupy adjacent stack space
Calling the function
Let's first look at what the call instruction does:
- Pushes the return address (most important)
- Transfers control to the target function
What is pushed, and why?
What is pushed? The address of the instruction that follows the call.
Why push it? Because once the called function finishes, execution has to return to the caller and continue from that point.
00821E82 call 008211E5
After the return address has been pushed, control is transferred by changing eip to the address of the target function.
Before jmp
After jmp
Now we have finally entered the MyAdd() function.
Let's draw the stack frame at this point.
Since space is limited, we will stop here; the next article covers what happens inside MyAdd().