How to Force a Function Not to Be Inlined

Is it possible to force a function not to be inlined?

In Visual Studio 2010, __declspec(noinline) tells the compiler to never inline a particular member function, for instance:

class X {
    __declspec(noinline) int member_func() {
        return 0;
    }
};

edit: Additionally, when compiling with /clr, functions with security attributes never get inlined (again, this is specific to VS 2010).

I don't think it will prove at all useful for debugging, though.

How can I tell gcc not to inline a function?

You want the gcc-specific noinline attribute.

From the GCC documentation:

This function attribute prevents a function from being considered for inlining. If the function does not have side-effects, there are optimizations other than inlining that cause function calls to be optimized away, although the function call is live. To keep such calls from being optimized away, put asm (""); in the called function, to serve as a special side-effect.

Use it like this:

void __attribute__ ((noinline)) foo()
{
    ...
}
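If the same source has to build with both Visual C++ and GCC, the two spellings above can be hidden behind one macro. This is only a minimal sketch: the macro name MY_NOINLINE and the function add_one are made up for illustration, and the asm("") line follows the GCC documentation's advice for functions without side effects.

```cpp
#include <cassert>

// MY_NOINLINE is a hypothetical name; it selects the vendor-specific spelling.
#if defined(_MSC_VER)
#  define MY_NOINLINE __declspec(noinline)
#elif defined(__GNUC__)            // GCC, and Clang (which also defines __GNUC__)
#  define MY_NOINLINE __attribute__((noinline))
#else
#  define MY_NOINLINE              // unknown compiler: no guarantee at all
#endif

MY_NOINLINE int add_one(int x)
{
#if defined(__GNUC__) && !defined(_MSC_VER)
    asm("");   // empty asm acts as a side effect so calls are not optimized away
#endif
    return x + 1;
}

// Call sites look the same as for any other function.
inline void add_one_usage()
{
    assert(add_one(41) == 42);
}
```

Whether the call really survives is best confirmed by looking at the generated assembly, since other optimizations can still interfere.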

When can I not inline a function?

inline is a keyword of C++, but inlining is a generic process performed by a compiler backend, usually after instruction sequences are already generated.

A C compiler will also inline functions, and a C++ compiler will inline functions that aren't inline. A C++ compiler can also fail to inline an inline function for any arbitrary reason. The keyword actually exists to specify that a function may have multiple, identical definitions in different translation units (source files).

Static variables have no special bearing on whether something can be inlined. Perhaps some compilers have difficulty linking the resulting structure of global variable references, but that's more of a bug than a rule of thumb.

Recursive functions can be inlined, too. The recursive call should be translated to a branch. The branch could then be targeted by loop unrolling.
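To make the recursion point concrete, here is a rough sketch of a tail-recursive function next to the loop an optimizer may effectively reduce it to. The function names are invented for this example, and no compiler is obliged to perform this transformation.

```cpp
#include <cassert>

// Tail-recursive form: the recursive call is the last thing the function does.
unsigned sum_to_recursive(unsigned n, unsigned acc = 0)
{
    if (n == 0)
        return acc;
    return sum_to_recursive(n - 1, acc + n);
}

// Roughly what the optimizer may produce: the call becomes a branch back
// to the top, and the resulting loop can then be unrolled.
unsigned sum_to_loop(unsigned n)
{
    unsigned acc = 0;
    while (n != 0) {
        acc += n;
        --n;
    }
    return acc;
}

// Both forms compute the same result.
inline void sum_to_usage()
{
    assert(sum_to_recursive(10) == sum_to_loop(10));
}
```

Once the function is a plain loop, nothing about it is special from the inliner's point of view.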

A function that compiles to more than a kilobyte of code will usually not be inlined. But a compiler may provide #pragma directives or platform-specific attributes to force inlining in such a case.
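As an illustration of such platform-specific attributes, the sketch below wraps Visual C++'s __forceinline and GCC's always_inline in one macro. The names MY_FORCEINLINE and clamp_to_byte are hypothetical, and even with these attributes the compiler can still refuse in some situations.

```cpp
#include <cassert>

// MY_FORCEINLINE is a made-up name for this sketch.
#if defined(_MSC_VER)
#  define MY_FORCEINLINE __forceinline
#elif defined(__GNUC__)            // GCC, and Clang (which also defines __GNUC__)
#  define MY_FORCEINLINE inline __attribute__((always_inline))
#else
#  define MY_FORCEINLINE inline    // fallback: an ordinary inline function
#endif

// A small function where a strong inlining request is plausible.
MY_FORCEINLINE int clamp_to_byte(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// The attribute changes nothing at the call site.
inline void clamp_usage()
{
    assert(clamp_to_byte(300) == 255);
}
```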

The biggest factor that stops a function from being inlined is that its source isn't available to the compiler at code-generation time. Link-time optimization opens up the possibility of inlining functions that are extern and not inline, but a function supplied by a DLL is certainly off limits. Then again, you could still run it through a JIT-style execution engine, which could inline (splice together) any fragments it likes.

C++ small function not inlining

PS

The always_inline attribute probably does not mean what you think it means. Normally g++ does not inline anything when optimizations are turned off (presumably because this makes debugging easier). Adding always_inline makes the compiler inline the function even when not optimizing (probably not what you want), but it does not turn a function that cannot be inlined into one that can or will be.

see: https://gcc.gnu.org/onlinedocs/gcc/Inline.html

Given your comments you have the following:

File A.h

void F2();

File B.h

void F1();
void F3() __attribute__((always_inline));

File A.cpp

#include "A.h"
#include "B.h"

void F2() {
    F3();
}

File B.cpp

#include "B.h"
#include "A.h"

void F1() {
    F2();
}

void F3() {}

In the future, this is the kind of minimal, complete example you should submit: it has all the type information and is enough to rebuild your situation.

The code you provided is not compilable, and unwinding your English description into compilable code takes a lot of cognitive load.

If your compiler is set up for it, F3() can be inlined into A.cpp, but that will not always be the case. For that kind of optimization, either the translation unit must have access to the definition of F3(), or the toolchain must be able to optimize across translation units.

You can simplify this by moving the body of F3() into the header file. Then it will be available for inlining directly to the translation unit.

File A.h

void F2();

File B.h

void F1();
void F3() __attribute__((always_inline)); // I would not add this.
// Let the compiler not inline in debug mode.
inline void F3() {}

File A.cpp

#include "A.h"
#include "B.h"

void F2() {
    F3();
}

File B.cpp

#include "B.h"
#include "A.h"

void F1() {
    F2();
}

Can a very short function become inlined even if it was not explicitly defined as inline?

inline is non-binding with regard to whether or not the compiler will actually inline a function. Requesting inlining was the keyword's original purpose, but it has since been realized that whether a function is worth inlining depends as much on the call site as on the function itself, and that the decision is best left to the compiler.

From https://en.cppreference.com/w/cpp/language/inline :

Since this meaning of the keyword inline is non-binding, compilers are free to use inline substitution for any function that's not marked inline, and are free to generate function calls to any function marked inline. Those optimization choices do not change the rules regarding multiple definitions and shared statics listed above.

Edit : Since you asked for C as well, from https://en.cppreference.com/w/c/language/inline :

The intent of the inline specifier is to serve as a hint for the compiler to perform optimizations, such as function inlining, which require the definition of a function to be visible at the call site. The compilers can (and usually do) ignore presence or absence of the inline specifier for the purpose of optimization.
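The rule that does remain binding in C++ is the one about multiple definitions and shared statics. A minimal sketch, assuming the definition below sits in a header (call it counter.h, a hypothetical name) that several .cpp files include:

```cpp
#include <cassert>

// Because the function is inline, every translation unit that includes the
// header may contain this definition, and the program still has exactly one
// next_id and exactly one static counter, shared by all callers.
inline int next_id()
{
    static int id = 0;   // a single object for the whole program
    return ++id;
}

// Consecutive calls see the same counter, whether or not the compiler
// chose to expand the call in place.
inline void next_id_usage()
{
    int first = next_id();
    int second = next_id();
    assert(second == first + 1);
}
```

Without the inline keyword, the same definition in a header included from two source files would typically fail at link time with a multiple-definition error.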

Can I selectively (force) inline a function?

You cannot force inlining in standard C++; compiler-specific attributes only strengthen the request. Also, function calls are pretty cheap on modern CPUs compared to the cost of the work done. If your functions are large enough to need breaking down, the additional time taken by the call will be essentially nothing.

Failing that, you could ... try ... to use a macro.
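The hesitation about macros is justified. A macro is expanded textually, so its argument is evaluated once per appearance in the expansion, whereas a function evaluates it once. The sketch below uses made-up names (SQUARE_MACRO, square_fn, next_value, g_calls) to show the difference.

```cpp
#include <cassert>

static int g_calls = 0;   // counts how often next_value() runs

inline int next_value()
{
    return ++g_calls;
}

// Textual expansion: the argument expression appears twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Ordinary function: the argument is evaluated exactly once.
inline int square_fn(int x)
{
    return x * x;
}

inline void square_usage()
{
    g_calls = 0;
    (void)SQUARE_MACRO(next_value());   // expands to two calls
    assert(g_calls == 2);

    g_calls = 0;
    (void)square_fn(next_value());      // one call
    assert(g_calls == 1);
}
```

The macro does guarantee that no call instruction is emitted for the squaring itself, but it silently changes the meaning of any argument with side effects.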

Visual C++ ~ Not inlining simple const function pointer calls

The problem isn't with inlining, which the compiler does at every opportunity. The problem is that Visual C++ doesn't seem to realize that the pointer variable is actually a compile-time constant.

Test-case:

// function_pointer_resolution.cpp : Defines the entry point for the console application.
//

extern void show_int( int );

extern "C" typedef int binary_int_func( int, int );

extern "C" binary_int_func sum;
extern "C" binary_int_func* const sum_ptr = sum;

inline int call( binary_int_func* binary, int a, int b ) { return (*binary)(a, b); }

template< binary_int_func* binary >
inline int callt( int a, int b ) { return (*binary)(a, b); }

int main( void )
{
    show_int( sum(1, 2) );
    show_int( call(&sum, 3, 4) );
    show_int( callt<&sum>(5, 6) );
    show_int( (*sum_ptr)(1, 7) );
    show_int( call(sum_ptr, 3, 8) );
//  show_int( callt<sum_ptr>(5, 9) );
    return 0;
}

// sum.cpp
extern "C" int sum( int x, int y )
{
    return x + y;
}

// show_int.cpp
#include <iostream>

void show_int( int n )
{
    std::cout << n << std::endl;
}

The functions are separated into multiple compilation units to give better control over inlining. Specifically, I don't want show_int inlined, since it makes the assembly code messy.

The first whiff of trouble is that valid code (the commented line) is rejected by Visual C++. G++ has no problem with it, but Visual C++ complains "expected compile-time constant expression". This is actually a good predictor of all future behavior.

With optimization enabled and normal compilation semantics (no cross-module inlining), the compiler generates:

_main   PROC                        ; COMDAT

; 18 : show_int( sum(1, 2) );

push 2
push 1
call _sum
push eax
call ?show_int@@YAXH@Z ; show_int

; 19 : show_int( call(&sum, 3, 4) );

push 4
push 3
call _sum
push eax
call ?show_int@@YAXH@Z ; show_int

; 20 : show_int( callt<&sum>(5, 6) );

push 6
push 5
call _sum
push eax
call ?show_int@@YAXH@Z ; show_int

; 21 : show_int( (*sum_ptr)(1, 7) );

push 7
push 1
call DWORD PTR _sum_ptr
push eax
call ?show_int@@YAXH@Z ; show_int

; 22 : show_int( call(sum_ptr, 3, 8) );

push 8
push 3
call DWORD PTR _sum_ptr
push eax
call ?show_int@@YAXH@Z ; show_int
add esp, 60 ; 0000003cH

; 23 : //show_int( callt<sum_ptr>(5, 9) );
; 24 : return 0;

xor eax, eax

; 25 : }

ret 0
_main ENDP

There's already a huge difference between using sum_ptr and not using sum_ptr. Statements using sum_ptr generate an indirect function call (call DWORD PTR _sum_ptr), while all other statements generate a direct function call (call _sum), even when the source code used a function pointer.

If we now enable inlining by compiling function_pointer_resolution.cpp and sum.cpp with /GL and linking with /LTCG, we find that the compiler inlines all direct calls. Indirect calls stay as-is.

_main   PROC                        ; COMDAT

; 18 : show_int( sum(1, 2) );

push 3
call ?show_int@@YAXH@Z ; show_int

; 19 : show_int( call(&sum, 3, 4) );

push 7
call ?show_int@@YAXH@Z ; show_int

; 20 : show_int( callt<&sum>(5, 6) );

push 11 ; 0000000bH
call ?show_int@@YAXH@Z ; show_int

; 21 : show_int( (*sum_ptr)(1, 7) );

push 7
push 1
call DWORD PTR _sum_ptr
push eax
call ?show_int@@YAXH@Z ; show_int

; 22 : show_int( call(sum_ptr, 3, 8) );

push 8
push 3
call DWORD PTR _sum_ptr
push eax
call ?show_int@@YAXH@Z ; show_int
add esp, 36 ; 00000024H

; 23 : //show_int( callt<sum_ptr>(5, 9) );
; 24 : return 0;

xor eax, eax

; 25 : }

ret 0
_main ENDP

Bottom line: yes, the compiler does inline calls made through a compile-time constant function pointer, as long as that function pointer is not read from a variable. This use of a function pointer got optimized:

call(&sum, 3, 4);

but this did not:

(*sum_ptr)(1, 7);

All tests run with Visual C++ 2010 Service Pack 1, compiling for x86, hosted on x64.

Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86


