C++ Can Compilers Inline a Function Pointer

C++ can compilers inline a function pointer?

Sure thing.

The compiler knows that the value of the function parameter is exactly the value that was passed in, and it knows the definition of the function template, so it simply expands the template body inline and calls the target function directly.

I can't think of a condition where a compiler wouldn't inline a one-line function call: it's just replacing one function call with another, so nothing can be lost.


Given this code:

#include <iostream>

template <typename Function>
void functionProxy(Function function)
{
    function();
}

struct Functor
{
    void operator()() const
    {
        std::cout << "functor!" << std::endl;
    }
};

void function()
{
    std::cout << "function!" << std::endl;
}

//#define MANUALLY_INLINE

#ifdef MANUALLY_INLINE
void test()
{
    Functor()();

    function();

    [](){ std::cout << "lambda!" << std::endl; }();
}
#else
void test()
{
    functionProxy(Functor());

    functionProxy(function);

    functionProxy([](){ std::cout << "lambda!" << std::endl; });
}
#endif

int main()
{
    test();
}

With MANUALLY_INLINE defined, we get this:

test:
00401000 mov eax,dword ptr [__imp_std::endl (402044h)]
00401005 mov ecx,dword ptr [__imp_std::cout (402058h)]
0040100B push eax
0040100C push offset string "functor!" (402114h)
00401011 push ecx
00401012 call std::operator<<<std::char_traits<char> > (401110h)
00401017 add esp,8
0040101A mov ecx,eax
0040101C call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (40204Ch)]
00401022 mov edx,dword ptr [__imp_std::endl (402044h)]
00401028 mov eax,dword ptr [__imp_std::cout (402058h)]
0040102D push edx
0040102E push offset string "function!" (402120h)
00401033 push eax
00401034 call std::operator<<<std::char_traits<char> > (401110h)
00401039 add esp,8
0040103C mov ecx,eax
0040103E call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (40204Ch)]
00401044 mov ecx,dword ptr [__imp_std::endl (402044h)]
0040104A mov edx,dword ptr [__imp_std::cout (402058h)]
00401050 push ecx
00401051 push offset string "lambda!" (40212Ch)
00401056 push edx
00401057 call std::operator<<<std::char_traits<char> > (401110h)
0040105C add esp,8
0040105F mov ecx,eax
00401061 call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (40204Ch)]
00401067 ret

And without, this:

test:
00401000 mov eax,dword ptr [__imp_std::endl (402044h)]
00401005 mov ecx,dword ptr [__imp_std::cout (402058h)]
0040100B push eax
0040100C push offset string "functor!" (402114h)
00401011 push ecx
00401012 call std::operator<<<std::char_traits<char> > (401110h)
00401017 add esp,8
0040101A mov ecx,eax
0040101C call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (40204Ch)]
00401022 mov edx,dword ptr [__imp_std::endl (402044h)]
00401028 mov eax,dword ptr [__imp_std::cout (402058h)]
0040102D push edx
0040102E push offset string "function!" (402120h)
00401033 push eax
00401034 call std::operator<<<std::char_traits<char> > (401110h)
00401039 add esp,8
0040103C mov ecx,eax
0040103E call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (40204Ch)]
00401044 mov ecx,dword ptr [__imp_std::endl (402044h)]
0040104A mov edx,dword ptr [__imp_std::cout (402058h)]
00401050 push ecx
00401051 push offset string "lambda!" (40212Ch)
00401056 push edx
00401057 call std::operator<<<std::char_traits<char> > (401110h)
0040105C add esp,8
0040105F mov ecx,eax
00401061 call dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (40204Ch)]
00401067 ret

The same. (Compiled with MSVC 2010, vanilla Release.)

Why are pointers to inline functions allowed?

1) Why are pointers to inline functions allowed in C++?

Because inline functions are functions just like any other, and pointing to them is one of the things that you can do with functions. Inline functions just aren't special in this regard.

I have read that the code of an inline function just gets copied to the call site, and that there are no compile-time memory allocations for inline functions.

You (and perhaps the material you've read) have mixed two related and similarly named concepts.

An inline function is defined in all translation units that use it, while a non-inline function is defined in one translation unit only, as required by the one definition rule (ODR). That is what an inline declaration of a function means: it relaxes the ODR, but also adds the requirement that the function be defined in all translation units that use it (which would not have been possible if the ODR weren't relaxed).

Inline expansion (or inlining) is an optimization, where a function call is avoided by copying the called function into the frame of the caller. A function call can be expanded inline, whether the function has been declared inline or not. And a function that has been declared inline is not necessarily expanded inline.

However, a function cannot be expanded inline in a translation unit where it is not defined (unless link-time optimization performs the expansion). So by allowing, and requiring, the function to be defined in every TU that invokes it, the inline declaration also makes inline expansion possible in each of those TUs. But the optimization is not guaranteed.

2) Shouldn't it print a different address for n each time func() is called?

Inline expansion does cause the local variables to be located in the frame of the caller, yes. But their location will differ regardless of expansion if the calls originate from separate frames.

There is typically a regular non-expanded version generated of any function that has been expanded inline. If the address of a function is taken, it will point to that non-expanded function. If the compiler can prove that all calls to a function are inlined, the compiler might choose to not provide the non-expanded version at all. This requires that the function has internal linkage, and taking the address of the function typically makes such proof very difficult, or impossible.

Will GCC inline a function that takes a pointer?

GCC is quite smart. Consider this code fragment:

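#include <stdio.h>

/* static: under C99 inline semantics a plain inline definition emits no
   external symbol, so an unoptimized build could fail to link. */
static __inline__ void inc(int *val)
{
    ++ *val;
}

int main()
{
    int val;

    scanf("%d", &val);

    inc(&val);

    printf("%d\n", val);

    return 0;
}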

After a gcc -S -O3 test.c you'll get the following relevant asm:

...
call __isoc99_scanf
movl 12(%rsp), %esi
movl $.LC1, %edi
xorl %eax, %eax
addl $1, %esi
movl %esi, 12(%rsp)
call printf
...

You don't need to be an asm expert to see that the inc() call has been converted to an increment instruction (addl $1, %esi).

C: Pointer to inline function

inline is one of the misnomers of the C standard. Its main purpose is to let you put the definition of a function in a header file without running into "multiple definition" problems at link time.

The official way in C99 and C11 to achieve what you want is to put the inline definition in the header file, without static. Since you also need the symbol to be emitted, you have to tell the compiler which compilation unit should contain it. Such an instantiation is done by adding a declaration of the function to that .c file, with the inline keyword omitted.

Most naturally you could use the .c file where you actually need the symbol.

Visual C++ ~ Not inlining simple const function pointer calls

The problem isn't with inlining, which the compiler does at every opportunity. The problem is that Visual C++ doesn't seem to realize that the pointer variable is actually a compile-time constant.

Test-case:

// function_pointer_resolution.cpp : Defines the entry point for the console application.
//

extern void show_int( int );

extern "C" typedef int binary_int_func( int, int );

extern "C" binary_int_func sum;
extern "C" binary_int_func* const sum_ptr = sum;

inline int call( binary_int_func* binary, int a, int b ) { return (*binary)(a, b); }

template< binary_int_func* binary >
inline int callt( int a, int b ) { return (*binary)(a, b); }

int main( void )
{
    show_int( sum(1, 2) );
    show_int( call(&sum, 3, 4) );
    show_int( callt<&sum>(5, 6) );
    show_int( (*sum_ptr)(1, 7) );
    show_int( call(sum_ptr, 3, 8) );
    // show_int( callt<sum_ptr>(5, 9) );
    return 0;
}

// sum.cpp
extern "C" int sum( int x, int y )
{
    return x + y;
}

// show_int.cpp
#include <iostream>

void show_int( int n )
{
    std::cout << n << std::endl;
}

The functions are separated into multiple compilation units to give better control over inlining. Specifically, I don't want show_int inlined, since it makes the assembly code messy.

The first whiff of trouble is that valid code (the commented line) is rejected by Visual C++. G++ has no problem with it, but Visual C++ complains "expected compile-time constant expression". This is actually a good predictor of all future behavior.

With optimization enabled and normal compilation semantics (no cross-module inlining), the compiler generates:

_main   PROC                        ; COMDAT

; 18 : show_int( sum(1, 2) );

push 2
push 1
call _sum
push eax
call ?show_int@@YAXH@Z ; show_int

; 19 : show_int( call(&sum, 3, 4) );

push 4
push 3
call _sum
push eax
call ?show_int@@YAXH@Z ; show_int

; 20 : show_int( callt<&sum>(5, 6) );

push 6
push 5
call _sum
push eax
call ?show_int@@YAXH@Z ; show_int

; 21 : show_int( (*sum_ptr)(1, 7) );

push 7
push 1
call DWORD PTR _sum_ptr
push eax
call ?show_int@@YAXH@Z ; show_int

; 22 : show_int( call(sum_ptr, 3, 8) );

push 8
push 3
call DWORD PTR _sum_ptr
push eax
call ?show_int@@YAXH@Z ; show_int
add esp, 60 ; 0000003cH

; 23 : //show_int( callt<sum_ptr>(5, 9) );
; 24 : return 0;

xor eax, eax

; 25 : }

ret 0
_main ENDP

There's already a huge difference between using sum_ptr and not using it. Statements that use sum_ptr generate an indirect function call (call DWORD PTR _sum_ptr), while all other statements generate a direct function call (call _sum), even when the source code used a function pointer.

If we now enable inlining by compiling function_pointer_resolution.cpp and sum.cpp with /GL and linking with /LTCG, we find that the compiler inlines all direct calls. Indirect calls stay as-is.

_main   PROC                        ; COMDAT

; 18 : show_int( sum(1, 2) );

push 3
call ?show_int@@YAXH@Z ; show_int

; 19 : show_int( call(&sum, 3, 4) );

push 7
call ?show_int@@YAXH@Z ; show_int

; 20 : show_int( callt<&sum>(5, 6) );

push 11 ; 0000000bH
call ?show_int@@YAXH@Z ; show_int

; 21 : show_int( (*sum_ptr)(1, 7) );

push 7
push 1
call DWORD PTR _sum_ptr
push eax
call ?show_int@@YAXH@Z ; show_int

; 22 : show_int( call(sum_ptr, 3, 8) );

push 8
push 3
call DWORD PTR _sum_ptr
push eax
call ?show_int@@YAXH@Z ; show_int
add esp, 36 ; 00000024H

; 23 : //show_int( callt<sum_ptr>(5, 9) );
; 24 : return 0;

xor eax, eax

; 25 : }

ret 0
_main ENDP

Bottom line: Yes, the compiler does inline calls made through a compile-time constant function pointer, as long as that function pointer is not read from a variable. This use of a function pointer got optimized:

call(&sum, 3, 4);

but this did not:

(*sum_ptr)(1, 7);

All tests run with Visual C++ 2010 Service Pack 1, compiling for x86, hosted on x64.

Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86

Function pointer runs faster than inline function. Why?

I figured it out. It was somehow related to the timing code being inside the loop. When I moved it outside, as follows,

#include <iostream>
#include <chrono>
inline short toBigEndian(short i)
{
    return (i<<8)|(i>>8);
}

short (*toBigEndianPtr)(short i)=toBigEndian;
int main()
{
    int total=0;
    auto begin=std::chrono::high_resolution_clock::now();
    for(int i=0;i<100000000;i++)
    {
        short a=toBigEndianPtr((short)i);
        total+=a;
    }
    auto end=std::chrono::high_resolution_clock::now();
    std::cout<<std::chrono::duration_cast<std::chrono::duration<double>>(end-begin).count()<<", "<<total<<std::endl;
    return 0;
}

the results are just as they should be: 0.08 seconds for the inline call and 0.20 seconds for the call through the pointer. Sorry for bothering you guys.

c++ Function pointer inlining

GNU's g++ 4.5 inlines it for me starting at optimization level -O1:

main:
subq $8, %rsp
movl $6, %edx
movl $.LC0, %esi
movl $_ZSt4cout, %edi
call _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_E
movl $0, %eax
addq $8, %rsp
ret

where .LC0 is the .string "Hello\n".

For comparison, with no optimization (g++ -O0), it did not inline:

main:
pushq %rbp
movq %rsp, %rbp
subq $16, %rsp
movq $_ZL5Printv, -8(%rbp)
movq -8(%rbp), %rax
call *%rax
movl $0, %eax
leave
ret

