Exploring inline functions under the clang compiler

Keywords: Mobile xcode

Today I read an article introducing inline functions, which covered what they do and where they are useful. I had recently been reading about the clang compiler, so I spent some time exploring how clang actually treats them.

The article in question is Inline function (https://www.cnblogs.com/spock12345/p/11551147.html). According to it, calling an inline function does not transfer control; instead, the compiler embeds the function body at every call site at compile time. Inlining suits functions that are simple, small, and frequently called.

Preparation

Code

The sample code is as follows. I named the file a.c.

#include <stdio.h>

int inlinecalc(int a, int b, int fac);
int inline_loopFunc();
void inline_recCallFunc(int count);

inline int inline_loopFunc(){
	// Implementation
	int count = 1000;
	for (int i = 0; i < count; ++i)
	{
		printf("%d\n", i);
	}
	return 0;
}

inline void inline_recCallFunc(int count){
	if (count > 0) {
		inline_recCallFunc(count-1);
	}
	printf("%d\n", count);
}

int recCallFunc(){
	inline_recCallFunc(10);
	return 0;
}

int loopFunc(){
	inline_loopFunc();
	return 0;
}

inline int inlinecalc(int a, int b, int fac) {
	if (fac > 2)
	{
		return a * 2 + b;
	}
	return a + b;
}

int calc(int a, int b, int fac) {
	int tmp = inlinecalc(a, b, fac);
	return tmp * 1.2 / fac;
}

int main(int argc, char const *argv[])
{
	int res = inlinecalc(111, 222, argc);
	printf("result = %d\n", res);

	loopFunc();

	recCallFunc();

	return 0;
}

Optimization levels of clang

Some background first. Clang offers several optimization levels, and the assembly code generated under each level differs. In Xcode the level is chosen through the Optimization Level build setting; on the command line the corresponding options are -O0, -O1, -O2, and so on, which are used below. Because clang optimizes, I deliberately made the sample code somewhat more complex so that different optimization levels would not all produce the same assembly, which would make the analysis inconclusive.

Analysis

Next comes the analysis. The clang command-line tool can translate a source file into assembly code.

clang a.c -O0 -S -o -
  • -O0 is the optimization level, described above
  • -S tells clang to generate assembly code
  • -o - writes the output to the console (standard output)

O0/O1 level

	.globl	_main                   ## -- Begin function main
	.p2align	4, 0x90
_main:                                  ## @main
	.cfi_startproc
## %bb.0:
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	movl	%edi, %eax
	movl	$111, %edi
	movl	$222, %esi
	movl	%eax, %edx
	callq	_inlinecalc
	movl	%eax, %ecx
	leaq	L_.str.1(%rip), %rdi
	xorl	%eax, %eax
	movl	%ecx, %esi
	callq	_printf
	callq	_loopFunc
	callq	_recCallFunc
	xorl	%eax, %eax
	popq	%rbp
	retq
	.cfi_endproc
                                        ## -- End function

Few optimizations are applied at these levels. The three inline functions in the source are compiled as ordinary function calls and are not embedded at the call site.

O2 level

	.globl	_main                   ## -- Begin function main
	.p2align	4, 0x90
_main:                                  ## @main
	.cfi_startproc
## %bb.0:
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	cmpl	$2, %edi
	setg	%cl
	movl	$111, %esi
	shll	%cl, %esi
	addl	$222, %esi
	leaq	L_.str.1(%rip), %rdi
	xorl	%eax, %eax
	callq	_printf
	callq	_loopFunc
	movl	$10, %edi
	callq	_inline_recCallFunc
	xorl	%eax, %eax
	popq	%rbp
	retq
	.cfi_endproc
                                        ## -- End function

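Two changes can be seen at the O2 level:

  • The call to _inlinecalc is gone; its body is embedded at the call site. The corresponding code runs from cmpl $2, %edi to addl $222, %esi, just before callq _printf
  • main still calls _loopFunc, but the call to _recCallFunc is replaced by movl $10, %edi followed by callq _inline_recCallFunc: the wrapper was inlined, while the recursive function itself remains an ordinary call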

O3 level

	.globl	_main                   ## -- Begin function main
	.p2align	4, 0x90
_main:                                  ## @main
	.cfi_startproc
## %bb.0:
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset %rbp, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register %rbp
	pushq	%r14
	pushq	%rbx
	.cfi_offset %rbx, -32
	.cfi_offset %r14, -24
	cmpl	$2, %edi
	setg	%cl
	movl	$111, %esi
	shll	%cl, %esi
	addl	$222, %esi
	leaq	L_.str.1(%rip), %rdi
	xorl	%eax, %eax
	callq	_printf
	leaq	L_.str(%rip), %r14
	xorl	%ebx, %ebx
	.p2align	4, 0x90
LBB6_1:                                 ## =>This Inner Loop Header: Depth=1
	xorl	%eax, %eax
	movq	%r14, %rdi
	movl	%ebx, %esi
	callq	_printf
	incl	%ebx
	cmpl	$1000, %ebx             ## imm = 0x3E8
	jne	LBB6_1
## %bb.2:
	movl	$10, %edi
	callq	_inline_recCallFunc
	xorl	%eax, %eax
	popq	%rbx
	popq	%r14
	popq	%rbp
	retq
	.cfi_endproc
                                        ## -- End function

At this level the inline function containing the for loop is also embedded at the call site, so inline takes effect there as well. In addition, in my own tests an inline function containing a switch statement was also embedded at the call site. The only case where inlining did not take effect was the recursive function.

Posted by nsarisk on Sat, 09 Nov 2019 02:50:38 -0800