Why Do We Use Std::Function in C++ Rather Than the Original C Function Pointer

Why do we use std::function in C++ rather than the original C function pointer?

std::function can hold more than function pointers, namely functors.

#include <functional>

void foo(double){}

struct foo_functor{
  void operator()(float) const{}
};

int main(){
  std::function<void(int)> f1(foo), f2((foo_functor()));
  f1(5);
  f2(6);
}

As the example shows, you also don't need the exact same signature, as long as they are compatible (i.e., the parameter type of std::function can be passed to the contained function / functor).

Should I use std::function or a function pointer in C++?

In short, use std::function unless you have a reason not to.

Function pointers have the disadvantage of not being able to capture context. For example, you cannot pass a lambda that captures variables as a callback (a lambda that captures nothing works, because it converts to a function pointer). Calling a non-static member function of an object is thus also not possible, since the object (the this pointer) needs to be captured.⁽¹⁾
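The following is a minimal sketch of that limitation. The names call_raw, call_any, Scaler and capture_demo are made up for illustration and do not come from the original answer:

```cpp
#include <cassert>
#include <functional>

// Hypothetical callback types for illustration.
using RawCallback = int (*)(int);
using AnyCallback = std::function<int(int)>;

int call_raw(RawCallback cb, int x) { return cb(x); }
int call_any(const AnyCallback& cb, int x) { return cb(x); }

struct Scaler {
    int factor;
    int scale(int x) const { return x * factor; }
};

int capture_demo() {
    int offset = 10;

    // A captureless lambda converts implicitly to a function pointer.
    int a = call_raw([](int x) { return x + 1; }, 1);
    assert(a == 2);

    // A capturing lambda does not convert, so this line would not compile:
    // call_raw([offset](int x) { return x + offset; }, 1);
    int b = call_any([offset](int x) { return x + offset; }, 1);
    assert(b == 11);

    // Calling a non-static member function requires capturing the object.
    Scaler s{3};
    int c = call_any([&s](int x) { return s.scale(x); }, 4);
    assert(c == 12);

    return a + b + c;
}
```

Uncommenting the call_raw line with the capturing lambda should produce a compile error, which is the point of the comparison.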

std::function (since C++11) is primarily meant for storing a callable (merely passing one around doesn't require storing it). Hence, if you want to store the callback, for example in a member variable, it's probably your best choice. Even if you don't store it, it's a good "first choice", although it introduces some (very small) overhead on each call (so in a very performance-critical situation it might be a problem, but in most it should not). It is very "universal": if you care a lot about consistent and readable code and don't want to think about every choice you make (i.e. want to keep it simple), use std::function for every function you pass around.
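As a rough sketch of the "store it in a member variable" case, assuming a hypothetical Button class (not part of the original answer):

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical class that stores a callback for later use.
class Button {
public:
    void set_on_click(std::function<void(int)> handler) {
        on_click_ = std::move(handler);
    }

    void click(int x) {
        // Calling an empty std::function throws std::bad_function_call,
        // so check before invoking.
        if (on_click_) on_click_(x);
    }

private:
    std::function<void(int)> on_click_;
};

int button_demo() {
    int total = 0;
    Button b;
    b.click(1); // no handler registered yet: nothing happens
    b.set_on_click([&total](int x) { total += x; });
    b.click(2);
    b.click(3);
    assert(total == 5);
    return total;
}
```

The member type stays the same no matter which kind of callable the caller supplies, which is what makes std::function convenient for storage.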

Think about a third option: If you're about to implement a small function which then reports something via the provided callback function, consider a template parameter, which can then be any callable object, i.e. a function pointer, a functor, a lambda, a std::function, ... The drawback here is that your (outer) function becomes a template and hence needs to be implemented in the header. On the other hand, you get the advantage that the call to the callback can be inlined, as the client code of your (outer) function "sees" the call to the callback with the exact type information available.

Example for the version with the template parameter (write & instead of && for pre-C++11):

template <typename CallbackFunction>
void myFunction(..., CallbackFunction && callback) {
    ...
    callback(...);
    ...
}
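A concrete, runnable variant of this pattern might look as follows. The helper for_each_value and the other names are assumptions chosen for illustration:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: calls `callback` once per element.
template <typename CallbackFunction>
void for_each_value(const std::vector<int>& values, CallbackFunction&& callback) {
    for (int v : values) {
        callback(v); // the concrete callable type is known here
    }
}

static int g_calls = 0;
void count_call(int) { ++g_calls; }

int template_demo() {
    g_calls = 0;
    const std::vector<int> values{1, 2, 3};
    int sum = 0;

    for_each_value(values, [&sum](int v) { sum += v; }); // capturing lambda
    for_each_value(values, count_call);                   // plain function

    std::function<void(int)> f = [&sum](int v) { sum += v; };
    for_each_value(values, f);                            // std::function

    assert(g_calls == 3);
    return sum;
}
```

All three kinds of callable go through the same template, and each call site gets its own instantiation.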

As you can see in the following table, all of them have their advantages and disadvantages:

	function ptr	std::function	template param
can capture context variables	no¹	yes	yes
no call overhead (see comments)	yes	no	yes
can be inlined (see comments)	no	no	yes
can be stored in a class member	yes	yes	no²
can be implemented outside of header	yes	yes	no
supported without C++11 standard	yes	no³	yes
nicely readable (my opinion)	no	yes	(yes)

std::function vs function pointer

std::function is more generic: you can store in it any callable object with a compatible signature (function pointer, member function pointer, object with operator()), and you can construct a std::function from the result of std::bind.

A function pointer can only point to functions with exactly the right signature, but it might be slightly faster and might generate slightly smaller code.

Why does the implementation of std::any use a function pointer + function op codes, instead of a pointer to a virtual table + virtual calls?

Consider a typical use case of a std::any: You pass it around in your code, move it dozens of times, store it in a data structure and fetch it again later. In particular, you'll likely return it from functions a lot.

As it is now, the pointer to the single "do everything" function is stored right next to the data in the any. Given that it's a fairly small type (16 bytes on GCC x86-64), any fits into a pair of registers. Now, if you return an any from a function, the pointer to the "do everything" function of the any is already in a register or on the stack! You can just jump directly to it without having to fetch anything from memory. Most likely, you didn't even have to touch memory at all: You know what type is in the any at the point you construct it, so the function pointer value is just a constant that's loaded into the appropriate register. Later, you use the value of that register as your jump target. This means there's no chance for misprediction of the jump because there is nothing to predict, the value is right there for the CPU to consume.

In other words: The reason that you get the jump target for free with this implementation is that the CPU must have already touched the any in some way to obtain it in the first place, meaning that it already knows the jump target and can jump to it with no additional delay.

That means there really is no indirection to speak of with the current implementation if the any is already "hot", which it will be most of the time, especially if it's used as a return value.

On the other hand, if you use a table of function pointers somewhere in a read-only section (and let the any instance point to that instead), you'll have to go to memory (or cache) every single time you want to move or access it. The size of an any is still 16 bytes in this case but fetching values from memory is much, much slower than accessing a value in a register, especially if it's not in a cache. In a lot of cases, moving an any is as simple as copying its 16 bytes from one location to another, followed by zeroing out the original instance. This is pretty much free on any modern CPU. However, if you go the pointer table route, you'll have to fetch from memory every time, wait for the reads to complete, and then do the indirect call. Now consider that you'll often have to do a sequence of calls on the any (i.e. move, then destruct) and this will quickly add up. The problem is that you don't just get the address of the function you want to jump to for free every time you touch the any, the CPU has to fetch it explicitly. Indirect jumps to a value read from memory are quite expensive since the CPU can only retire the jump operation once the entire memory operation has finished. That doesn't just include fetching a value (which is potentially quite fast because of caches) but also address generation, store forwarding buffer lookup, TLB lookup, access validation, and potentially even page table walks. So even if the jump address is computed quickly, the jump won't retire for quite a long while. In general, "indirect-jump-to-address-from-memory" operations are among the worst things that can happen to a CPU's pipeline.

TL;DR: As it is now, returning an any doesn't stall the CPU's pipeline (the jump target is already available in a register so the jump can retire pretty much immediately). With a table-based solution, returning an any will stall the pipeline twice: Once to fetch the address of the move function, then another time to fetch the destructor. This delays retirement of the jump quite a bit since it'll have to wait not only for the memory value but also for the TLB and access permission checks.

Code memory accesses, on the other hand, aren't affected by this, since hot code is typically served in already-decoded form (from the µOp cache). Fetching and executing a few conditional branches in that switch statement is therefore quite fast (and even more so when the branch predictor gets things right, which it almost always does).
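To make the "single function pointer plus op codes" idea concrete, here is a deliberately simplified sketch. It is not the libstdc++ implementation: the names TinyAny, Op and manager are invented, the value always lives on the heap (no small-buffer optimisation), and only move and destroy are supported.

```cpp
#include <cassert>
#include <utility>

// Operation codes understood by the single "do everything" function.
enum class Op { Destroy, Move };

class TinyAny {
public:
    TinyAny() = default;
    TinyAny(const TinyAny&) = delete;
    TinyAny& operator=(const TinyAny&) = delete;

    TinyAny(TinyAny&& other) noexcept {
        // Dispatch through the function pointer stored next to the data.
        if (other.manage_) other.manage_(Op::Move, other, this);
    }

    ~TinyAny() { reset(); }

    template <typename T>
    static TinyAny make(T value) {
        TinyAny a;
        a.ptr_ = new T(std::move(value));
        a.manage_ = &manager<T>; // a constant known at construction time
        return a;
    }

    bool has_value() const { return manage_ != nullptr; }

    void reset() {
        if (manage_) manage_(Op::Destroy, *this, nullptr);
    }

    // Returns nullptr when the stored type is not T.
    template <typename T>
    T* get() {
        return manage_ == &manager<T> ? static_cast<T*>(ptr_) : nullptr;
    }

private:
    using Manager = void (*)(Op, TinyAny&, TinyAny*);

    // One function per stored type; a switch selects the operation.
    template <typename T>
    static void manager(Op op, TinyAny& self, TinyAny* dest) {
        switch (op) {
        case Op::Destroy:
            delete static_cast<T*>(self.ptr_);
            break;
        case Op::Move:
            dest->ptr_ = self.ptr_;
            dest->manage_ = self.manage_;
            break;
        }
        // In both cases the source ends up empty.
        self.ptr_ = nullptr;
        self.manage_ = nullptr;
    }

    Manager manage_ = nullptr; // function pointer stored inside the object
    void* ptr_ = nullptr;
};

int tiny_any_demo() {
    TinyAny a = TinyAny::make<int>(42);
    TinyAny b = std::move(a); // Op::Move via the stored function pointer
    assert(!a.has_value());
    assert(b.get<double>() == nullptr); // wrong type requested
    return *b.get<int>();
}
```

A vtable-based design would instead store a pointer to a table of functions, adding one more memory load before each indirect call. That extra load is the cost the answer above describes.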

Difference between std::function and a standard function pointer?

A function pointer is the address of an actual function defined in C++. A std::function is a wrapper that can hold any type of callable object (objects that can be used like functions).

#include <functional>
#include <iostream>

struct FooFunctor
{
    void operator()(int i) {
        std::cout << i;
    }
};

// Since `FooFunctor` defines `operator()`, it can be used as a function
FooFunctor func;
std::function<void (int)> f(func);

Here, std::function allows you to abstract away exactly what kind of callable object it is you are dealing with — you don't know it's FooFunctor, you just know that it returns void and has one int parameter.

A real-world example where this abstraction is useful is when you are using C++ together with another scripting language. You might want to design an interface that can deal with both functions defined in C++, as well as functions defined in the scripting language, in a generic way.

Edit: Binding

Alongside std::function, you will also find std::bind. These two are very powerful tools when used together.

#include <functional>
using namespace std::placeholders; // provides _1, _2, ...

void func(int a, int b) {
    // Do something important
}

// Consider the case when you want one of the parameters of `func` to be fixed.
// You can use `std::bind` to set a fixed value for a parameter; `bind` will
// return a function-like object that you can place inside of `std::function`.

std::function<void (int)> f = std::bind(func, _1, 5);

In that example, the function object returned by bind takes the first parameter, _1, and passes it to func as the a parameter, and sets b to be the constant 5.
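A small runnable sketch of the same idea, with a function that returns a value so the effect of the binding is visible. The names subtract and bind_demo are made up for illustration, and a lambda is shown next to bind as an equivalent alternative:

```cpp
#include <cassert>
#include <functional>

int subtract(int a, int b) { return a - b; }

int bind_demo() {
    using namespace std::placeholders; // brings _1, _2, ... into scope

    // Fix b = 5; the single remaining argument is forwarded as a.
    std::function<int(int)> minus_five = std::bind(subtract, _1, 5);

    // Fix a = 5 instead; the remaining argument becomes b.
    std::function<int(int)> five_minus = std::bind(subtract, 5, _1);

    // A lambda expressing the same thing as the first bind.
    std::function<int(int)> minus_five_lambda = [](int a) { return subtract(a, 5); };

    assert(minus_five(8) == 3);
    assert(five_minus(8) == -3);
    assert(minus_five_lambda(8) == minus_five(8));

    return minus_five(8) * 10 + five_minus(8);
}
```

Whether to use bind or a lambda is largely a matter of taste; in many newer code bases the lambda form is preferred because the argument mapping is spelled out.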

std::function to C-style function pointer

You can use lambda functions with C-style function pointers, just not using std::function.

Lambdas that don't have any capture are convertible to a function pointer:

using callback_t = int(*)(int arg, void* user_param);
void set_c_callback(callback_t, void* user_param);

// ...

void foo() {
    set_c_callback([](int a, void* data) {
        // code
        return 0; // callback_t expects an int return value
    }, nullptr);
}

There is also a way to use lambdas with captures, using std::any:

// The storage can be inside a class instead of a global
std::any lambda_storage;

template<typename T>
void foo(T lambda) {
    lambda_storage = lambda;
    set_c_callback([](int n, void* user_data) {
        // any_cast on a pointer yields T* (null if the stored type is not T)
        auto& lambda = *std::any_cast<T>(static_cast<std::any*>(user_data));
        lambda(n);
        return 0; // callback_t expects an int return value
    }, &lambda_storage);
}

// ...

foo([k = 1](int n) {
    std::cout << n + k << std::endl;
});
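A related sketch that skips std::any and passes a pointer to a std::function through the user-data parameter instead. The C API here (set_c_callback, fire_c_callback) is a hypothetical stand-in defined in the example itself, not a real library:

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for a C API that stores a callback plus user data.
using callback_t = int (*)(int arg, void* user_param);

static callback_t g_callback = nullptr;
static void* g_user_param = nullptr;

void set_c_callback(callback_t cb, void* user_param) {
    g_callback = cb;
    g_user_param = user_param;
}

int fire_c_callback(int arg) {
    return g_callback ? g_callback(arg, g_user_param) : -1;
}

// Captureless trampoline: recovers the std::function from the user data.
int trampoline(int arg, void* user_param) {
    auto* fn = static_cast<std::function<int(int)>*>(user_param);
    return (*fn)(arg);
}

int c_callback_demo() {
    int k = 1;
    // The std::function must outlive the registration.
    std::function<int(int)> handler = [k](int n) { return n + k; };

    set_c_callback(&trampoline, &handler);
    int result = fire_c_callback(41);
    set_c_callback(nullptr, nullptr); // unregister before `handler` dies

    assert(result == 42);
    return result;
}
```

The important constraint in either variant is lifetime: whatever the void* points to has to stay alive for as long as the C side may invoke the callback.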

Advantages of using std::function

std::function implements a technique called type erasure, which lets you store any callable entity in it, be it a functor, a member function, or a free function. You can even store a pointer to a data member, which seems counterintuitive!

Here is one example, which cannot be done without std::function (or type-erasure):

std::vector<std::function<void()>>  funs;
for(int i = 0; i < 10; ++i)
  funs.push_back([i] { std::cout << i << std::endl; });

for(auto const & f: funs)
   f();

Here is another example, which stores a pointer to a data member:

#include <functional>
#include <iostream>
#include <string>
#include <vector>

struct person
{
    std::string name;
};

int main()
{
    std::vector<person> people {
      {"Bjarne"},
      {"Alex"}
    };

    std::function<std::string(person const&)> name (&person::name);

    for(auto const & p : people)
       std::cout << name(p) << std::endl;
}

Output:

Bjarne
Alex
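The same mechanism also covers pointers to member functions. A short sketch, with a hypothetical Person type and greet function that are not in the original answer:

```cpp
#include <cassert>
#include <functional>
#include <string>

struct Person {
    std::string name;
    std::string greet(const std::string& greeting) const {
        return greeting + ", " + name;
    }
};

std::string member_demo() {
    Person p{"Bjarne"};

    // Pointer to data member: invoked with the object, yields the member.
    std::function<std::string(const Person&)> get_name = &Person::name;

    // Pointer to member function: the object becomes the first argument.
    std::function<std::string(const Person&, const std::string&)> greet =
        &Person::greet;

    return greet(p, "Hello") + " (" + get_name(p) + ")";
}
```

In both cases std::function applies the standard INVOKE rules, so the object is simply passed as the first argument.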

Hope that helps.

Why must function pointers be used?

TL;DR

"Function" and "pointer to function" are, for practical purposes, the same.

There is the concept of a pointer, and the syntax of its usage; it's not clear what you are asking about.

Concept

A pointer to a function may be different from the function itself (the difference is not useful in C++ - see below) in that a function may occupy much space - its code can be arbitrarily complex. Manipulating (e.g. copying or searching/modifying) the code of a function is rarely useful, so C/C++ don't support it at all. If you want to modify the code of a function, cast a pointer to char*, applying all the necessary precautions (I have never done it).

So if you are writing C, all you need is pointers to functions.

However...

Syntax

If you have a pointer p to a function, how do you want to call the function?

(*p)(18); // call the function with parameter 18
p(18); // the same, but looks better!

There is the slightly cleaner syntax not involving the * sign. To support it, the authors of C/C++ invented the concept of "decay" - when your code mentions "a function", the compiler silently "corrects" it to mean "a pointer to a function" instead (in almost all circumstances; excuse me for not detailing further). This is very similar to the "decay" of an array to a pointer mentioned by vsoftco.

So in your example

void runprint(int function(int x), int x) {
        cout << function(x) << endl;
}

the "function" type is actually a "pointer to function" type. Indeed, if you try to "overload":

void runprint(int (*function)(int x), int x) {
        cout << function(x) << endl;
}

the compiler will complain about two identical functions with identical set of parameters.
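This adjustment can be checked at compile time. The sketch below uses made-up names (add, apply, TakesFunction, TakesPointer) and declares the function once with each spelling to show that both refer to the same function:

```cpp
#include <cassert>
#include <type_traits>

int add(int x) { return x + 1; }

// A parameter of function type is adjusted to a pointer to function,
// so these two function types are identical.
using TakesFunction = void(int fn(int));
using TakesPointer = void(int (*fn)(int));
static_assert(std::is_same<TakesFunction, TakesPointer>::value,
              "function-type parameters are adjusted to pointer types");

// Declaration with one spelling, definition with the other: same function.
int apply(int function(int), int x);
int apply(int (*function)(int), int x) { return function(x); }

// An expression naming a function decays to a pointer as well.
static_assert(std::is_same<std::decay<decltype(add)>::type, int (*)(int)>::value,
              "a function decays to a pointer to function");

int decay_demo() {
    // Passing `add` or `&add` is equivalent, and so is calling
    // through `function(x)` or `(*function)(x)`.
    assert(apply(add, 1) == apply(&add, 1));
    return apply(add, 1) + apply(&add, 1) + (*add)(1);
}
```

Providing a second definition with the other spelling would be a redefinition error, which is the complaint mentioned above.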

Also, when passing a function as an argument

runprint(add, 1);

it doesn't matter whether you take its address explicitly:

runprint(&add, 1); // does exactly the same

P.S. When declaring a function that receives a callback, I have mostly seen (and used) the explicitly written pointer. It has only now occurred to me that it's inconsistent to rely on function-to-pointer decay when calling the callback, but not when declaring my code. So if the question is

why does everyone declare callbacks using a pointer-to-function syntax, when a function syntax would be sufficient?

I'd answer "a matter of habit".

Is there a use case for std::function that is not covered by function pointers, or is it just syntactic sugar?

std::function<> gives you the possibility of encapsulating any type of callable object, which is something function pointers cannot do (although it is true that non-capturing lambdas can be converted to function pointers).

To give you an idea of the kind of flexibility it allows you to achieve:

#include <functional>
#include <iostream>
#include <vector>

// A functor... (could even have state!)
struct X
{
    void operator () () { std::cout << "Functor!" << std::endl; }
};

// A regular function...
void bar()
{
    std::cout << "Function" << std::endl;
}

// A regular function with one argument that will be bound...
void foo(int x)
{
    std::cout << "Bound Function " << x << "!" << std::endl;
}

int main()
{
    // Heterogeneous collection of callable objects
    std::vector<std::function<void()>> functions;

    // Fill in the container...
    functions.push_back(X());
    functions.push_back(bar);
    functions.push_back(std::bind(foo, 42));

    // And add a lambda defined in-place as well...
    functions.push_back([] () { std::cout << "Lambda!" << std::endl; });

    // Now call them all!
    for (auto& f : functions)
    {
        f(); // Same interface for all kinds of callable object...
    }
}

Among other things, this allows you to implement the Command pattern.

Can std::function have a function pointer as the type?

No. The class template std::function is defined only for a function type template argument. The template is undefined for all other argument types, including function pointers.

I also recommend against aliasing pointer types in general (there are exceptions to this rule of thumb though). If you avoid it, then you can re-use the alias just fine:

typedef void callback();
typedef std::function<callback> callback2;
// or preferably using the modern syntax
using callback  = void();
using callback2 = std::function<callback>;

void worker (callback* c);
void worker2(callback2 c);
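If a pointer alias already exists and cannot be changed, one possible workaround is to strip the pointer with std::remove_pointer. This is a sketch with hypothetical alias names (callback_ptr, callback_fn, callback_obj):

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

// Suppose this (hypothetical) pointer alias is fixed by existing code.
using callback_ptr = void (*)(int&);

// std::function<callback_ptr> would not compile, so recover the
// underlying function type first.
using callback_fn = std::remove_pointer<callback_ptr>::type;
using callback_obj = std::function<callback_fn>;

static_assert(std::is_same<callback_fn, void(int&)>::value,
              "remove_pointer yields the function type");

void increment(int& x) { ++x; }

int function_type_demo() {
    int value = 0;

    callback_ptr p = increment;
    callback_obj f = p; // a function pointer is itself a callable object

    p(value);                    // value == 1
    f(value);                    // value == 2
    f = [](int& x) { x += 10; }; // rebind to a lambda
    f(value);                    // value == 12

    assert(value == 12);
    return value;
}
```

The template argument of std::function stays a function type throughout; the pointer only appears as a value stored inside it.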
