Exploring inline under the clang compiler
Today I read an article introducing inline functions, which covered what they do and where they apply. I have been reading about the clang compiler recently, so I spent some time exploring how it handles them.
That article introduces the inline function as follows (https://www.cnblogs.com/spock12345/p/11551147.html): an inline function does not transfer control when it is called; instead, the function body is embedded at every call site at compile time. It suits functions that are simple, small, and frequently used.
Preparation
Code
The sample code is shown below. I named the file a.c.
#include <stdio.h>

int inlinecalc(int a, int b, int fac);
int inline_loopFunc();
void inline_recCallFunc(int count);

inline int inline_loopFunc(){
    //Realization
    int count = 1000;
    for (int i = 0; i < count; ++i)
    {
        printf("%d\n", i);
    }
    return 0;
}

inline void inline_recCallFunc(int count){
    if (count > 0)
    {
        inline_recCallFunc(count-1);
    }
    printf("%d\n", count);
}

int recCallFunc(){
    inline_recCallFunc(10);
    return 0;
}

int loopFunc(){
    inline_loopFunc();
    return 0;
}

inline int inlinecalc(int a, int b, int fac) {
    if (fac > 2)
    {
        return a * 2 + b;
    }
    return a + b;
}

int calc(int a, int b, int fac) {
    int tmp = inlinecalc(a, b, fac);
    return tmp * 1.2 / fac;
}

int main(int argc, char const *argv[])
{
    int res = inlinecalc(111, 222, argc);
    printf("result = %d\n", res);
    loopFunc();
    recCallFunc();
    return 0;
}
Optimization level of clang
First, some background. The clang compiler has several optimization levels, and the assembly code generated at each level differs. In Xcode the level is chosen with the Optimization Level build setting; with the clang command-line tool the corresponding options are -O0, -O1, -O2, and so on, which are used later. Because clang optimizes, I also adjusted the complexity of the sample code during preparation, so that the different optimization levels would not all produce the same assembly and make the analysis inaccurate.
Analysis
Next comes the analysis. The clang command-line tool can compile the source file to assembly code:
clang a.c -O0 -S -o -
- -O0 is the optimization level, which has been described above
- -S means generate assembly code
- -o - writes the result to standard output (the console)
O0/O1 level
    .globl _main                    ## -- Begin function main
    .p2align 4, 0x90
_main:                              ## @main
    .cfi_startproc
## %bb.0:
    pushq %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq %rsp, %rbp
    .cfi_def_cfa_register %rbp
    movl %edi, %eax
    movl $111, %edi
    movl $222, %esi
    movl %eax, %edx
    callq _inlinecalc
    movl %eax, %ecx
    leaq L_.str.1(%rip), %rdi
    xorl %eax, %eax
    movl %ecx, %esi
    callq _printf
    callq _loopFunc
    callq _recCallFunc
    xorl %eax, %eax
    popq %rbp
    retq
    .cfi_endproc
                                    ## -- End function
There are few optimizations at this level. The three inline functions in the source code are compiled as ordinary function calls and are not embedded at the call site.
O2 level
    .globl _main                    ## -- Begin function main
    .p2align 4, 0x90
_main:                              ## @main
    .cfi_startproc
## %bb.0:
    pushq %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq %rsp, %rbp
    .cfi_def_cfa_register %rbp
    cmpl $2, %edi
    setg %cl
    movl $111, %esi
    shll %cl, %esi
    addl $222, %esi
    leaq L_.str.1(%rip), %rdi
    xorl %eax, %eax
    callq _printf
    callq _loopFunc
    movl $10, %edi
    callq _inline_recCallFunc
    xorl %eax, %eax
    popq %rbp
    retq
    .cfi_endproc
                                    ## -- End function
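Two changes can be seen at the O2 level:
- The call to _inlinecalc is gone; the function body is embedded at the call site. The corresponding code runs from cmpl $2, %edi up to callq _printf.
- The call to _recCallFunc is replaced by movl $10, %edi followed by callq _inline_recCallFunc, so the wrapper is embedded in main while the recursive function itself is still called.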
O3 level
    .globl _main                    ## -- Begin function main
    .p2align 4, 0x90
_main:                              ## @main
    .cfi_startproc
## %bb.0:
    pushq %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq %rsp, %rbp
    .cfi_def_cfa_register %rbp
    pushq %r14
    pushq %rbx
    .cfi_offset %rbx, -32
    .cfi_offset %r14, -24
    cmpl $2, %edi
    setg %cl
    movl $111, %esi
    shll %cl, %esi
    addl $222, %esi
    leaq L_.str.1(%rip), %rdi
    xorl %eax, %eax
    callq _printf
    leaq L_.str(%rip), %r14
    xorl %ebx, %ebx
    .p2align 4, 0x90
LBB6_1:                             ## =>This Inner Loop Header: Depth=1
    xorl %eax, %eax
    movq %r14, %rdi
    movl %ebx, %esi
    callq _printf
    incl %ebx
    cmpl $1000, %ebx                ## imm = 0x3E8
    jne LBB6_1
## %bb.2:
    movl $10, %edi
    callq _inline_recCallFunc
    xorl %eax, %eax
    popq %rbx
    popq %r14
    popq %rbp
    retq
    .cfi_endproc
                                    ## -- End function
At this level the inline function containing the for loop is also embedded at the call site, so inline takes effect there too. In my own testing, an inline function containing a switch statement was also embedded at the call site; the only function that was not inlined is the recursive one.