Understanding the function call stack in detail at the instruction level

Keywords: C++

Stack space

The stack grows from high addresses toward low addresses, while the heap grows from low addresses toward high addresses.

A stack is a data structure in which items are added and removed according to a fixed rule: last in, first out (LIFO). In x86 and other instruction sets, the instructions for pushing onto and popping off the stack are:

PUSH: pushes the operand onto the top of the stack.

POP: removes the value at the top of the stack and stores it in the operand.
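The last-in, first-out behaviour of PUSH and POP can be sketched with a small model. This is only an illustration, not how the CPU is implemented: TinyStack, its memory vector and its esp field are hypothetical names, and esp is a byte offset into the vector rather than a real address.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a 32-bit stack: 'memory' stands in for RAM and
// 'esp' is a byte offset that moves down on push and up on pop.
struct TinyStack {
    std::vector<std::uint32_t> memory;
    std::uint32_t esp; // byte offset of the current top of the stack

    explicit TinyStack(std::uint32_t size_bytes)
        : memory(size_bytes / 4, 0), esp(size_bytes) {}

    void push(std::uint32_t value) {
        esp -= 4;                // the stack grows toward lower addresses
        memory[esp / 4] = value; // store the value at the new top
    }

    std::uint32_t pop() {
        std::uint32_t value = memory[esp / 4]; // read the current top
        esp += 4;                              // the stack shrinks again
        return value;
    }
};
```

Pushing 1 and then 2 onto a TinyStack and popping twice yields 2 first and 1 second, and esp ends at its starting value.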

While a function executes, its local variables and temporaries are stored in the region between ESP and EBP, which is the function's stack frame; its parameters sit just above EBP, where the caller pushed them. When the function finishes, its stack frame is popped and the previously saved context is restored, which keeps the stack balanced. The CPU must also know where to continue executing after the call returns (this is the address the pc register, EIP on x86, points to).

ESP and EBP

(1) ESP: the extended stack pointer register. It holds a pointer that always points to the top of the topmost stack frame on the system stack.
(2) EBP: the extended base pointer register. It holds a pointer that always points to the bottom of the topmost stack frame on the system stack.

From these definitions, under normal circumstances ESP varies: it decreases as items are pushed (the stack grows toward low addresses, so the value of the stack-top register keeps getting smaller). EBP stays fixed during the execution of a function and changes only when a function is called or returns.

In other words, ESP marks the top of the stack and changes whenever the stack changes:

pop ebp; ESP increases by 4, so the stack shrinks by 4 bytes (EBP is 32 bits wide)

push ebp; ESP decreases by 4, so the stack grows by 4 bytes

add esp, 0Ch; ESP increases by 12, so the stack shrinks by 12 bytes

sub esp, 0Ch; ESP decreases by 12, so the stack grows by 12 bytes
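The effect of these four instructions on ESP can be written down as plain arithmetic. The helper names below are made up for illustration, and a 32-bit operand size (4 bytes per push or pop) is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Each helper returns the value ESP would hold after the instruction,
// assuming 32-bit operands. The names are illustrative only.
constexpr std::uint32_t after_push(std::uint32_t esp) { return esp - 4; } // push reg
constexpr std::uint32_t after_pop(std::uint32_t esp) { return esp + 4; }  // pop reg
constexpr std::uint32_t after_sub(std::uint32_t esp, std::uint32_t n) { return esp - n; } // sub esp, n
constexpr std::uint32_t after_add(std::uint32_t esp, std::uint32_t n) { return esp + n; } // add esp, n
```

A smaller ESP means a larger stack, so push and sub grow the stack while pop and add shrink it.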

The EBP register serves a different purpose: it lets parameters and local variables on the stack be located by a fixed address plus an offset. That fixed address is stored in EBP. However, its value changes when another function is called, and it must be restored once that function returns. It is therefore saved on the stack at function entry and restored at function exit.

Example

#include <iostream>

int sum(int a, int b)
{
  int temp = 0;
  temp = a + b;
  return temp;
}

int main()
{
  int a = 10;
  int b = 20;

  int ret = sum(a, b);

  return 0;
}

Set a breakpoint, start debugging, and view the disassembly:

    10: int main()
    11: {
00F81860 55                   push        ebp  
00F81861 8B EC                mov         ebp,esp  
00F81863 81 EC E4 00 00 00    sub         esp,0E4h  
00F81869 53                   push        ebx  
00F8186A 56                   push        esi  
00F8186B 57                   push        edi  
00F8186C 8D BD 1C FF FF FF    lea         edi,[ebp-0E4h]  
00F81872 B9 39 00 00 00       mov         ecx,39h  
00F81877 B8 CC CC CC CC       mov         eax,0CCCCCCCCh  
00F8187C F3 AB                rep stos    dword ptr es:[edi]  
00F8187E B9 27 D0 F8 00       mov         ecx,offset _4A8A7142_c++test@cpp (0F8D027h)  
00F81883 E8 99 F9 FF FF       call        @__CheckForDebuggerJustMyCode@4 (0F81221h)  
    12:   int a = 10;
00F81888 C7 45 F8 0A 00 00 00 mov         dword ptr [a],0Ah  
    13:   int b = 20;
00F8188F C7 45 EC 14 00 00 00 mov         dword ptr [b],14h  
    14: 
    15:   int ret = sum(a, b);
00F81896 8B 45 EC             mov         eax,dword ptr [b]  
00F81899 50                   push        eax  
00F8189A 8B 4D F8             mov         ecx,dword ptr [a]  
00F8189D 51                   push        ecx  
00F8189E E8 E9 F7 FF FF       call        sum (0F8108Ch)  
00F818A3 83 C4 08             add         esp,8  
00F818A6 89 45 E0             mov         dword ptr [ret],eax  
    16: 
    17:   return 0;
00F818A9 33 C0                xor         eax,eax  
    18: }
00F818AB 5F                   pop         edi  
00F818AC 5E                   pop         esi  
00F818AD 5B                   pop         ebx  
00F818AE 81 C4 E4 00 00 00    add         esp,0E4h  
00F818B4 3B EC                cmp         ebp,esp  
00F818B6 E8 70 F9 FF FF       call        __RTC_CheckEsp (0F8122Bh)  
00F818BB 8B E5                mov         esp,ebp  
00F818BD 5D                   pop         ebp  
00F818BE C3                   ret  

At the entry and exit of the main function, the opening brace "{" performs the push operations (the function prologue) and the closing brace "}" performs the pop operations (the function epilogue).

Literally, the first two instructions push ebp onto the stack and then set ebp equal to esp.

Why? Because ebp serves as a fixed base for addressing only for a limited time. It stays fixed during the execution of one function and changes whenever another function is called, so it has to be restored when that function returns.

At function entry, the caller's ebp is saved on the stack so that it can be restored after execution. Next, space must be allocated for the function's local variables and for any temporary variables it may use:

  sub esp, 0E4h; the value subtracted depends on the function

Then, depending on the situation, some specific registers (EBX, ESI and EDI) may be saved as well.

From this point on the value of EBP stays fixed, and local variables and temporary storage can be found at EBP plus an offset.

After the function body has executed, the following operations are performed before control returns to the caller:

The saved register values are restored first, and then esp (whose value from just after the prologue's push is held in the fixed ebp) and ebp are restored. This process is called restoring the context, and ret then returns to the calling function.
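The prologue and epilogue described above can be traced with a small simulation. FrameDemo is a hypothetical model, not compiler output: it keeps 1 KB of pretend stack memory, uses byte offsets instead of real addresses, and leaves out the saving of EBX, ESI and EDI to stay short.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mini-machine: 1 KB of stack memory plus the two registers.
struct FrameDemo {
    std::vector<std::uint32_t> mem = std::vector<std::uint32_t>(256, 0);
    std::uint32_t esp = 1024; // top of the stack (byte offset into mem)
    std::uint32_t ebp = 1024; // base of the caller's frame

    void push(std::uint32_t v) { esp -= 4; mem[esp / 4] = v; }
    std::uint32_t pop() { std::uint32_t v = mem[esp / 4]; esp += 4; return v; }

    void prologue(std::uint32_t local_bytes) {
        push(ebp);          // push ebp      : save the caller's base pointer
        ebp = esp;          // mov ebp, esp  : fix the new frame base
        esp -= local_bytes; // sub esp, N    : reserve space for locals
    }

    void epilogue() {
        esp = ebp;   // mov esp, ebp : discard the locals
        ebp = pop(); // pop ebp      : restore the caller's base pointer
    }
};
```

Calling prologue(0xE4) followed by epilogue() leaves both esp and ebp exactly where they started, which is the stack balance the text refers to.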

In the main function

int a = 10; executes a single MOV instruction: mov         dword ptr [a],0Ah

Similarly, int b = 20; becomes mov         dword ptr [b],14h

Next is int ret = sum(a, b):

00F81896 8B 45 EC             mov         eax,dword ptr [b]  
00F81899 50                   push        eax     # push the value of b onto the stack
00F8189A 8B 4D F8             mov         ecx,dword ptr [a]  
00F8189D 51                   push        ecx     # push the value of a onto the stack
00F8189E E8 E9 F7 FF FF       call        sum (0F8108Ch)   # execute the call
00F818A3 83 C4 08             add         esp,8  
00F818A6 89 45 E0             mov         dword ptr [ret],eax 

Order in which call parameters are pushed: parameters are pushed onto the stack from right to left.

Therefore, the code above does the following:

First the value of b is pushed onto the stack, then the value of a.

Then call sum (0F8108Ch) is executed:

The call instruction first pushes the address of the instruction on the next line (the return address, 00F818A3 in the listing above) onto the stack, and then jumps to sum.

Next, execution enters the called function, sum:

Step 1: push the calling function's (main's) stack base pointer ebp onto the stack.

Step 2: set ebp, the new stack base, to the current esp (the old stack top).

Step 3: move esp to the new stack top, which opens up the stack frame of the function; its size here is 0CCh.

Then int temp = 0; is executed: mov         dword ptr [temp],0

temp = a + b; Since b and then a were pushed before the call, the value of a is found at ebp+8 and the value of b at ebp+12 (ebp+0 holds the saved ebp and ebp+4 holds the return address). The result of the addition is then assigned to temp.
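The offsets used to reach parameters and locals follow directly from what was pushed before ebp was fixed. The helpers below are an illustrative sketch with made-up names, assuming a 32-bit frame, 4-byte parameters pushed right to left, and a compiler that keeps ebp as the frame base.

```cpp
#include <cassert>
#include <cstdint>

// Layout assumed here (32-bit, 4-byte parameters pushed right to left):
//   [ebp + 0]   saved ebp of the caller
//   [ebp + 4]   return address pushed by call
//   [ebp + 8]   first parameter, [ebp + 12] second parameter, ...
constexpr std::uint32_t parameter_address(std::uint32_t ebp, std::uint32_t index) {
    return ebp + 8 + 4 * index;
}

// Locals sit below ebp; 'offset' is the positive distance picked by the compiler.
constexpr std::uint32_t local_address(std::uint32_t ebp, std::uint32_t offset) {
    return ebp - offset;
}
```

With ebp equal to 0x1000, a (index 0) is at 0x1008 and b (index 1) is at 0x100C, which matches ebp+8 and ebp+12 in the text.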

Then return temp; is executed: mov         eax,dword ptr [temp]

 

Then comes the closing brace "}" of the function:

(1) mov esp,ebp   Unwinds the stack frame by pointing the stack-top pointer at the bottom of the frame.

(2) pop ebp   Pops the top of the stack into ebp, which restores main's stack base to ebp.

(3) ret   Pops the top of the stack into the pc register (EIP), so the address of the instruction following call sum, pushed earlier, is loaded into the pc register and execution continues there.

After the call, execution returns to the main function.
The pc register tells the program which instruction to run after leaving sum.

Next:

add         esp,8   # reclaims the stack space used by the parameters a and b

mov         dword ptr [ret],eax   # at the end of sum, temp was copied into eax; here eax is assigned to ret
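The whole sequence for sum(a, b) can be replayed on a toy machine. Machine, simulated_sum and simulated_call_sum are hypothetical names, addresses are byte offsets into a 1 KB vector, and the slot chosen for temp (ebp-8) is an assumption, since the disassembly of sum is not shown. The return address 00F818A3 and the frame size 0CCh are taken from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy machine with 1 KB of stack memory.
struct Machine {
    std::vector<std::uint32_t> mem = std::vector<std::uint32_t>(256, 0);
    std::uint32_t esp = 1024, ebp = 1024, eax = 0, eip = 0;

    std::uint32_t& at(std::uint32_t addr) { return mem[addr / 4]; }
    void push(std::uint32_t v) { esp -= 4; at(esp) = v; }
    std::uint32_t pop() { std::uint32_t v = at(esp); esp += 4; return v; }
};

// The callee, following the steps described for sum.
void simulated_sum(Machine& m) {
    m.push(m.ebp);                              // push ebp
    m.ebp = m.esp;                              // mov ebp, esp
    m.esp -= 0xCC;                              // sub esp, 0CCh
    const std::uint32_t temp = m.ebp - 8;       // assumed slot for temp
    m.at(temp) = 0;                             // int temp = 0
    m.at(temp) = m.at(m.ebp + 8) + m.at(m.ebp + 12); // temp = a + b
    m.eax = m.at(temp);                         // return temp (result in eax)
    m.esp = m.ebp;                              // mov esp, ebp
    m.ebp = m.pop();                            // pop ebp
    m.eip = m.pop();                            // ret: return address into eip
}

// The caller side, as in main.
std::uint32_t simulated_call_sum(Machine& m, std::uint32_t a, std::uint32_t b) {
    m.push(b);          // rightmost parameter first
    m.push(a);          // then the left one
    m.push(0x00F818A3); // call: push the return address
    simulated_sum(m);   // ...and jump to sum
    m.esp += 8;         // add esp, 8: the caller removes the parameters
    return m.eax;       // mov dword ptr [ret], eax
}
```

After simulated_call_sum(m, 10, 20) the result is 30 and esp and ebp are back at their initial values, so the stack is balanced.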

 

  Finally, return 0 is executed and the program ends.

Visual Studio stack size

The default stack size in VC++ is 1 MB.

Stack overflow

  There are two common causes of stack overflow:
    1. The function call depth is too great. On every call, the function's parameters, local variables and other information are pushed onto the stack.
    2. A local variable (for example a large fixed-size array) is too big.
    The first case is not very common. Recursion can often be replaced by other methods, so as long as the calls are not unbounded there should be no problem; a depth of a few dozen levels is generally fine. To check whether this is the cause, set a breakpoint in the function that triggers the overflow, run the program until it stops there, and press Alt+7 to open the Call Stack window, which shows the chain of function calls.
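As a sketch of replacing recursion with another method, the two functions below compute the same sum. The names and the example task are made up for illustration; the recursive version uses one stack frame per level, while the loop uses a constant amount of stack regardless of n.

```cpp
#include <cassert>
#include <cstdint>

// Recursive version: the call depth equals n, so every level adds a frame.
std::uint64_t sum_to_recursive(std::uint64_t n) {
    return n == 0 ? 0 : n + sum_to_recursive(n - 1);
}

// Iterative version: a single frame, no matter how large n is.
std::uint64_t sum_to_iterative(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

For small n both give the same result; for very large n only the iterative version is safe from exhausting the stack.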

    The second case is more common: a function defines a local variable that is, for example, a class object containing a large array.

That is, if the function is written like this:
    void test_stack_overflow()
    {
      // 2 MB allocated on the heap; only the pointer itself lives on the stack
      char* chdata = new char[2*1024*1024];
      delete []chdata;
    }
   no stack overflow occurs, but it fails if it is written like this:
    void test_stack_overflow()
    {
      // 2 MB array placed directly in the stack frame
      char chdata[2*1024*1024];
    }
   In most cases this causes a stack overflow, because the 2 MB array exceeds the default 1 MB stack.


     Generally speaking, there are two solutions:
    1. Increase the stack size (for example with the MSVC linker option /STACK)
    2. Use heap memory instead
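For the second solution, one possible approach is to let std::vector own the buffer, so the 2 MB live on the heap and are released automatically. The function name fill_large_buffer is made up for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The vector object itself is small and sits on the stack; its 2 MB of
// storage is allocated on the heap and freed when the vector goes out of scope.
std::size_t fill_large_buffer() {
    constexpr std::size_t kSize = 2 * 1024 * 1024; // 2 MB
    std::vector<char> chdata(kSize, 'x');
    return chdata.size();
}
```

Compared with new[] and delete[], this avoids a leak if the function returns early or throws.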

Posted by Typer999 on Sat, 06 Nov 2021 14:14:59 -0700