Function stack frame explanation

Keywords: C

Preface

This topic sits close to the boundary of the C language and takes some time to learn. Once we understand it, though, we can look past the surface of a C function and see how a call actually works. My own knowledge is limited, so if anything below is wrong, please point it out.

Prerequisites

  • A basic understanding of functions (here, "function" means a user-defined function unless stated otherwise)
  • The layout of a C program's address space
  • The basic registers
  • Some assembly language

Concept of function

Functions should be familiar to everyone, so I won't elaborate here.
Here is just the general shape:

ret_type fun_name(para1, ...)
{
    statement;  // one or more statements
}

ret_type  Return type
fun_name  Function name
para1     Function parameters  

C program address space (worth memorizing)

We often say that "a global variable lives for the whole run of the program", that "a variable declared static lives longer", and, most importantly, that "a temporary (local) variable is destroyed when its function returns". But we should also know why.
In C, every variable we create has a storage class that determines which region of memory it lives in. Just as cars are not parked inside tall buildings, each kind of object has its own place.

Let's verify this with some code that prints an address from each region:

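#include <stdio.h>
#include <stdlib.h>

int g_val1 = 10;   // initialized globals
int g_val2 = 10;
int g_val3;        // uninitialized globals
int g_val4;

int main(void)
{
    const char* str = "abcdef";

    // Printing a function address with %p is not strictly portable,
    // but it works on common desktop compilers.
    printf("code: %p\n", (void*)main);

    printf("read only : %p\n", (void*)str);

    printf("init g_val1 : %p\n", (void*)&g_val1);
    printf("init g_val2 : %p\n", (void*)&g_val2);
    printf("uninit g_val3 : %p\n", (void*)&g_val3);
    printf("uninit g_val4 : %p\n", (void*)&g_val4);

    char* p1 = (char*)malloc(sizeof(char) * 10);
    char* p2 = (char*)malloc(sizeof(char) * 10);

    printf("heap addr : %p\n", (void*)p1);
    printf("heap addr : %p\n", (void*)p2);

    // The pointer variables themselves are locals, so they live on the stack
    printf("stack addr : %p\n", (void*)&str);
    printf("stack addr : %p\n", (void*)&p1);
    printf("stack addr : %p\n", (void*)&p2);

    free(p1);
    free(p2);
    return 0;
}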


The output shows that local variables are stored on the stack, and that the stack grows toward lower addresses.

Related registers

Function calls are closely tied to the CPU's registers. Here are the basics:

  • eax: general-purpose register that holds temporary data; often used for return values
  • ebx: general-purpose register that holds temporary data
  • ebp: base pointer; points to the bottom (base) of the current stack frame
  • esp: stack pointer; points to the top of the stack
  • eip: instruction pointer; holds the address of the next instruction to execute

Related assembly language

  • mov: data transfer instruction
  • push: pushes data onto the stack; esp (the stack-top register) changes accordingly
  • pop: pops data off the stack into the specified location; esp changes accordingly
  • sub: subtraction instruction
  • add: addition instruction
  • call: function call; (1) pushes the return address, (2) transfers control to the target function
  • jmp: transfers control to the target by modifying eip
  • ret: pops the return address off the stack into eip, similar to a "pop eip" instruction

After so much background, you may be bored and wonder what any of it has to do with function stack frames. Don't worry. Now we get to the main content.

Function stack frame

To make things easier to follow, we will picture the stack space as a diagram and rely on drawings throughout.

We know that the main function is also a function and can be called, so the main function will also form a stack frame.

Sample code

#include <stdio.h>

int MyAdd(int a, int b)
{
	int c = a + b;
	return c;
}

int main()
{
	int x = 0xA;
	int y = 0xB;
	int z = 0;

	z = MyAdd(x, y);  // pass main's locals x and y as the arguments
	printf("z = %d\n", z);
	return 0;
}

Switch to the disassembly view and open the registers window.

I'll copy the assembly code here, and we'll analyze it step by step.

int main()
{
00821E40  push        ebp  
00821E41  mov         ebp,esp  
00821E43  sub         esp,0E4h  
00821E49  push        ebx  
00821E4A  push        esi  
00821E4B  push        edi  
00821E4C  lea         edi,[ebp-24h]  
00821E4F  mov         ecx,9  
00821E54  mov         eax,0CCCCCCCCh  
00821E59  rep stos    dword ptr es:[edi]  
00821E5B  mov         ecx,82C003h  
00821E60  call        0082130C  
	int x = 0xA;
00821E65  mov         dword ptr [ebp-8],0Ah  
	int y = 0xB;
00821E6C  mov         dword ptr [ebp-14h],0Bh  
	int z = 0;
00821E73  mov         dword ptr [ebp-20h],0  

	z = MyAdd(x, y);
00821E7A  mov         eax,dword ptr [ebp-14h]  
00821E7D  push        eax  
00821E7E  mov         ecx,dword ptr [ebp-8]  
00821E81  push        ecx  
00821E82  call        008211E5  
00821E87  add         esp,8  
00821E8A  mov         dword ptr [ebp-20h],eax  
	printf("z = %d\n", z);
00821E8D  mov         eax,dword ptr [ebp-20h]  
00821E90  push        eax  
00821E91  push        827BCCh  
00821E96  call        008213A2  
00821E9B  add         esp,8  
	return 0;
00821E9E  xor         eax,eax  
}
00821EA0  pop         edi  
00821EA1  pop         esi  
00821EA2  pop         ebx  
00821EA3  add         esp,0E4h  
00821EA9  cmp         ebp,esp  
00821EAB  call        00821235  
00821EB0  mov         esp,ebp  
00821EB2  pop         ebp  
00821EB3  ret  

  • ebp points to the bottom of the stack frame
  • esp points to the top of the stack
  • eip points to the next instruction to execute, which has not been executed yet

Step 1

int x = 0xA;
00821E65  mov         dword ptr [ebp-8],0Ah
                      // store the value of x (0Ah) in the slot at ebp-8

int y = 0xB;
00821E6C  mov         dword ptr [ebp-14h],0Bh
                      // store the value of y (0Bh) in the slot at ebp-14h

int z = 0;
00821E73  mov         dword ptr [ebp-20h],0
                      // store the value of z (0) in the slot at ebp-20h
                      // (the space itself was reserved earlier by sub esp,0E4h)

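Notice that x, y and z are not adjacent in memory: their slots are 12 bytes apart. This is a Visual Studio debug-build behavior. The compiler leaves padding around each local (filled with 0CCCCCCCCh by the rep stos in the listing above), most likely so that its runtime checks can detect writes that run past a variable.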

Step 2

00821E7A  mov         eax,dword ptr [ebp-14h]  

Load the value stored at ebp-14h (that is, y) into eax.

eax is a general-purpose register that holds temporary data and is often used for return values.

00821E7D  push        eax  

The push instruction puts the value of eax on the stack. At the same time the top of the stack moves: esp decreases by 4 bytes, which is both the size of a 32-bit push and the size of the int y.

(Figure: the stack top after the push)

00821E7E  mov         ecx,dword ptr [ebp-8]  

Load the value stored at ebp-8 (that is, x) into ecx.

00821E81  push        ecx  

As above, the value of ecx is pushed onto the stack, and the top of the stack moves down by another 4 bytes.

Conclusion

  • The temporary copies of the arguments (the formal parameters) are created before the function is called
  • The parameters are created from right to left: y is pushed first, then x
  • The parameters occupy adjacent stack slots

Calling the function

First, what the call instruction does:

  • Pushes the return address (most important)
  • Transfers control to the target function

What is pushed? The address of the instruction that follows the call (here 00821E87).
Why push it? Because when the called function finishes, execution has to return to that point in the caller.

00821E82  call        008211E5



The jmp instruction then changes eip to the address of the target function.

(Figures: before the jmp, after the jmp)

Now we have finally entered the MyAdd() function.
Let's draw the stack frame as it stands so far.

Space is limited, so we'll stop here and cover what happens inside the MyAdd() function in the next article.

Posted by YoussefSiblini on Fri, 05 Nov 2021 12:09:24 -0700