C++11 Lambda Implementation and Memory Model

My current understanding is that a lambda with no captured closure is exactly like a C callback. However, when the environment is captured either by value or by reference, an anonymous object is created on the stack.

No; it is always a C++ object with an unknown type, created on the stack. A capture-less lambda can be converted into a function pointer (though whether it is suitable for C calling conventions is implementation dependent), but that doesn't mean it is a function pointer.
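The distinction between "is a function pointer" and "converts to a function pointer" can be checked at compile time. The following is a minimal sketch; `callback_t`, `invoke_callback`, and `demo_conversion` are hypothetical names invented for the illustration, and nothing here says anything about C calling conventions.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical C-style API that accepts only a plain function pointer.
using callback_t = int (*)(int);

int invoke_callback(callback_t cb, int arg) {
    return cb(arg);
}

int demo_conversion() {
    auto twice = [](int x) { return 2 * x; };  // a closure object, not a pointer

    // The closure type is a class type; it merely converts to a pointer.
    static_assert(std::is_class<decltype(twice)>::value,
                  "a lambda is an object of class type");
    static_assert(!std::is_pointer<decltype(twice)>::value,
                  "a lambda is not itself a function pointer");

    callback_t fp = twice;  // implicit conversion, only valid without captures
    int result = invoke_callback(fp, 21);
    assert(result == 42);
    return result;
}
```

Adding any capture to `twice` makes the `callback_t fp = twice;` line fail to compile, because the conversion exists only for capture-less lambdas.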

When a value-closure must be returned from a function, one wraps it in std::function. What happens to the closure memory in this case?

A lambda isn't anything special in C++11. It's an object like any other object. A lambda expression results in a temporary, which can be used to initialize a variable on the stack:

auto lamb = []() {return 5;};

lamb is a stack object. It has a constructor and destructor. And it will follow all of the C++ rules for that. The type of lamb will contain the values/references that are captured; they will be members of that object, just like any other object members of any other type.

You can give it to a std::function:

auto func_lamb = std::function<int()>(lamb);

In this case, it will get a copy of the value of lamb. If lamb had captured anything by value, there would be two copies of those values; one in lamb, and one in func_lamb.

When the current scope ends, func_lamb will be destroyed, followed by lamb, as per the rules of cleaning up stack variables.
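That the std::function holds its own copy can be observed with a mutable lambda, since each copy then carries its own state. This is a small sketch; `demo_independent_copies` and the variable `start` are names made up for the example.

```cpp
#include <cassert>
#include <functional>

int demo_independent_copies() {
    int start = 0;
    // 'start' is copied into the closure; 'mutable' lets the call operator modify it.
    auto lamb = [start]() mutable { return ++start; };

    std::function<int()> func_lamb(lamb);  // copies lamb, including its 'start'

    int a = func_lamb();  // 1, modifies the copy held by func_lamb
    int b = func_lamb();  // 2
    int c = lamb();       // 1, the original closure was never touched
    assert(a == 1 && b == 2 && c == 1);
    return a * 100 + b * 10 + c;
}
```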

You could just as easily allocate one on the heap:

auto func_lamb_ptr = new std::function<int()>(lamb);

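Exactly where the memory for the contents of a std::function goes is implementation-dependent. The type erasure employed by std::function may require a memory allocation, although implementations typically store small callables (such as function pointers) in an internal buffer and skip the allocation. This is why std::function's constructors could take an allocator in C++11; those overloads were removed in C++17.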

Is it freed whenever the std::function is freed, i.e., is it reference-counted like a std::shared_ptr?

std::function stores a copy of its contents. Like virtually every standard library C++ type, function uses value semantics. Thus, it is copyable; when it is copied, the new function object is completely separate. It is also moveable, so any internal allocations can be transferred appropriately without needing more allocating and copying.

Thus there is no need for reference counting.

Everything else you state is correct, assuming that "memory allocation" equates to "bad to use in real-time code".

How are C++11 lambdas represented and passed?

Disclaimer: my answer is somewhat simplified compared to the reality (I put some details aside) but the big picture is here. Also, the Standard does not fully specify how lambdas or std::function must be implemented internally (the implementation has some freedom) so, like any discussion on implementation details, your compiler may or may not do it exactly this way.

But again, this is a subject quite similar to VTables: the Standard doesn't mandate much but any sensible compiler is still quite likely to do it this way, so I believe it is worth digging into it a little. :)

Lambdas

The most straightforward way to implement a lambda is as a kind of unnamed struct:

auto lambda = [](Args...) -> Return { /*...*/ };

// roughly equivalent to:
struct {
    Return operator ()(Args...) { /*...*/ }
}
lambda; // instance of the unnamed struct

Just like any other class, when you pass its instances around you never have to copy the code, just the actual data (here, none at all).

Objects captured by value are copied into the struct:

Value v;
auto lambda = [=](Args...) -> Return { /*... use v, captured by value...*/ };

// roughly equivalent to:
struct Temporary { // note: we can't make it an unnamed struct any more since we need
                   // a constructor, but that's just a syntax quirk

    Value v; // note: the member itself is not const; the call operator is
    Temporary(Value v_) : v(v_) {}
    Return operator ()(Args...) const { /*... use v, captured by value...*/ } // const unless the lambda is mutable
}
lambda(v); // instance of the struct

Again, passing it around only means that you pass the data (v) not the code itself.
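To make the equivalence concrete, here is a compilable sketch that puts a by-value lambda next to a hand-written functor. `AddCaptured` and `demo_value_capture` are hypothetical names, and `int` stands in for the `Value`, `Args`, and `Return` placeholders used above.

```cpp
#include <cassert>

// Hand-written counterpart of a lambda that captures 'v' by value.
struct AddCaptured {
    int v;  // the captured copy
    explicit AddCaptured(int v_) : v(v_) {}
    // const call operator, like a lambda that is not declared mutable
    int operator()(int x) const { return x + v; }
};

int demo_value_capture() {
    int v = 10;
    auto lambda = [=](int x) { return x + v; };  // v is copied right here
    AddCaptured functor(v);                      // same idea, spelled out

    v = 1000;  // neither closure sees this later change

    assert(lambda(1) == 11);
    assert(functor(1) == 11);
    return lambda(1) + functor(1);
}
```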

Likewise, objects captured by reference are stored as references in the struct:

Value v;
auto lambda = [&](Args...) -> Return { /*... use v, captured by reference...*/ };

// roughly equivalent to:
struct Temporary {
    Value& v; // note: capture by reference is non-const
    Temporary(Value& v_) : v(v_) {}
    Return operator ()(Args...) { /*... use v, captured by reference...*/ }
}
lambda(v); // instance of the struct
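The reference case can be sketched the same way. `IncrementCaptured` and `demo_reference_capture` are names invented for this example; the point is that copying the closure copies the reference, not the referenced object.

```cpp
#include <cassert>

// Hand-written counterpart of a lambda that captures 'v' by reference.
struct IncrementCaptured {
    int& v;
    explicit IncrementCaptured(int& v_) : v(v_) {}
    // A const call operator can still modify the object the reference points to.
    void operator()() const { ++v; }
};

int demo_reference_capture() {
    int v = 0;
    auto lambda = [&]() { ++v; };
    IncrementCaptured functor(v);

    lambda();   // v == 1
    functor();  // v == 2

    auto copy = lambda;  // copies the reference; both closures refer to the same v
    copy();              // v == 3

    assert(v == 3);
    return v;
}
```

Because the closure only holds a reference, it must not outlive `v`.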

That's pretty much all there is to lambdas themselves (except for the few implementation details I omitted, which are not relevant to understanding how they work).

`std::function`

std::function is a generic wrapper around any kind of functor (lambdas, standalone/static/member functions, functor classes like the ones I showed, ...).

The internals of std::function are pretty complicated because they must support all those cases. Depending on the exact type of functor this requires at least the following data (give or take implementation details):

A pointer to a standalone/static function.

Or,

A pointer to a copy (see note below) of the functor (dynamically allocated to allow any type of functor, as you rightly noted).
A pointer to the member function to be called.
A pointer to an allocator that is able to both copy the functor and itself (since any type of functor can be used, the pointer-to-functor should be void* and thus there has to be such a mechanism, probably using polymorphism, a.k.a. base class + virtual methods, the derived class being generated locally in the template<class Functor> function(Functor) constructors).

Since it doesn't know beforehand which kind of functor it will have to store (and this is made obvious by the fact that std::function can be reassigned) then it has to cope with all possible cases and make the decision at runtime.
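The runtime decision is easy to see from the outside: one std::function variable can hold a function pointer, then a lambda, then a functor class. This is a short sketch in which `add_one`, `Scale`, and `demo_reassignment` are hypothetical names.

```cpp
#include <cassert>
#include <functional>

int add_one(int x) { return x + 1; }

// A plain functor class with some state.
struct Scale {
    int factor;
    int operator()(int x) const { return x * factor; }
};

int demo_reassignment() {
    std::function<int(int)> f = add_one;  // holds a function pointer
    int a = f(1);                         // 2

    int offset = 10;
    f = [offset](int x) { return x + offset; };  // now holds a closure
    int b = f(1);                                // 11

    f = Scale{3};  // now holds a functor object
    int c = f(1);  // 3

    f = nullptr;   // now holds nothing
    assert(!f);

    assert(a == 2 && b == 11 && c == 3);
    return a + b + c;
}
```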

Note: I don't know where the Standard mandates it but this is definitely a new copy, the underlying functor is not shared:

int v = 0;
std::function<void()> f = [=]() mutable { std::cout << v++ << std::endl; };
std::function<void()> g = f;

f(); // 0
f(); // 1
g(); // 0
g(); // 1

So, when you pass a std::function around it involves at least those four pointers (and indeed, on 64-bit GCC 4.7, sizeof(std::function<void()>) is 32, which is four 64-bit pointers) and optionally a dynamically allocated copy of the functor (which, as I already said, contains only the captured objects; you don't copy the code).
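The size is implementation-specific, so the simplest way to know the figure for a given toolchain is to print it. The helper below, `report_function_size`, is a hypothetical name; no particular output is claimed because the numbers differ between standard libraries.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iostream>

std::size_t report_function_size() {
    const std::size_t pointer_size = sizeof(void*);
    const std::size_t function_size = sizeof(std::function<void()>);

    std::cout << "sizeof(void*)                 = " << pointer_size << '\n'
              << "sizeof(std::function<void()>) = " << function_size << '\n';

    // Whatever the layout, the wrapper must at least be able to hold a pointer.
    assert(function_size >= pointer_size);
    return function_size;
}
```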

Answer to the question

what is the cost of passing a lambda to a function like this? (context of the question: by value)

Well, as you can see it depends mainly on your functor (either a hand-made struct functor or a lambda) and the variables it contains. The overhead compared to directly passing a struct functor by value is quite negligible, but it is of course much higher than passing a struct functor by reference.

Should I have to mark each function object passed with const& so that a copy is not made?

I'm afraid this is very hard to answer in a generic way. Sometimes you'll want to pass by const reference, sometimes by value, sometimes by rvalue reference so that you can move it. It really depends on the semantics of your code.

The rules concerning which one you should choose are a totally different topic IMO; just remember that they are the same as for any other object.

Anyway, you now have all the keys to make an informed decision (again, depending on your code and its semantics).
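As an illustration of the options, here is a sketch of three common ways to accept a callable. All names (`call_direct`, `call_by_cref`, `Handler`, `demo_passing`) are invented for the example, and which form fits depends on the semantics of the surrounding code.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Template parameter: no type erasure, any callable is accepted as-is.
template <class F>
int call_direct(F&& f, int x) {
    return std::forward<F>(f)(x);
}

// const reference: binds to an existing std::function without copying it.
// Passing a lambda still creates a temporary std::function first.
int call_by_cref(const std::function<int(int)>& f, int x) {
    return f(x);
}

// By value plus move: a reasonable fit when the callee keeps the callable.
class Handler {
public:
    explicit Handler(std::function<int(int)> f) : f_(std::move(f)) {}
    int run(int x) const { return f_(x); }
private:
    std::function<int(int)> f_;
};

int demo_passing() {
    int bias = 5;
    auto add_bias = [bias](int x) { return x + bias; };

    int a = call_direct(add_bias, 1);   // 6, called on the closure itself
    int b = call_by_cref(add_bias, 2);  // 7, through a temporary std::function
    Handler h(add_bias);                // the closure is copied into the handler
    int c = h.run(3);                   // 8

    assert(a == 6 && b == 7 && c == 8);
    return a + b + c;
}
```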

Lambda functions with = capture and memory usage

[=] will cause only the variables that are actually used in the lambda to be captured by it.

In your case val will have a copy of a and z. Assuming there is no padding (which there shouldn't be), then sizeof(val) == 2*sizeof(long double).
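This can be checked with sizeof. The sketch below reuses the names `a`, `z`, and `val` from the question; `unused` and `demo_capture_size` are hypothetical additions. Only a lower bound is asserted, since the exact size is left to the implementation.

```cpp
#include <cassert>
#include <cstddef>

std::size_t demo_capture_size() {
    long double a = 1.0L;
    long double z = 2.0L;
    long double unused = 3.0L;  // in scope, but never mentioned in the lambda
    (void)unused;

    // [=] captures only what the body uses: a and z.
    auto val = [=]() { return a + z; };

    // The closure must hold at least the two captured long doubles.
    assert(sizeof(val) >= 2 * sizeof(long double));
    assert(val() == 3.0L);
    return sizeof(val);
}
```

On common implementations the returned value is exactly `2 * sizeof(long double)`, which shows that `unused` was not captured.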

Memory layout of a C++ Lambda

The standard does not require lambda closures to have a particular layout. See [expr.prim.lambda.closure]:

The type of a lambda-expression (which is also the type of the closure object) is a unique, unnamed non-union class type, called the closure type, whose properties are described below.

...

The closure type is not an aggregate type. An implementation may define the closure type differently from what is described below provided this does not alter the observable behavior of the program other than by changing:

the size and/or alignment of the closure type,

whether the closure type is trivially copyable, or

whether the closure type is a standard-layout class.

An implementation shall not add members of rvalue reference type to the closure type.

However, to conform to the platform ABI and keep object files interoperable, compilers targeting the same ABI probably have to lay out and name-mangle lambda objects in exactly the same way.

In C++11 lambda syntax, heap-allocated closures?

In f1 you're getting undefined behavior for the reason you say; the lambda contains a reference to a local variable, and after the function returns the reference is no longer valid. To get around this you don't have to allocate on the heap; you simply have to capture by value and declare the lambda mutable so that the captured copy can be modified:

int k = 121;
return std::function<int(void)>([=]() mutable {return k++;});

You will have to be careful about using this lambda though, because different copies of it will be modifying their own copy of the captured variable. Often algorithms expect that using a copy of a functor is equivalent to using the original. I think there's only one algorithm that actually makes allowances for a stateful function object, std::for_each, where it returns another copy of the function object it uses so you can access whatever modifications occurred.
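The std::for_each behaviour mentioned above can be sketched as follows. `Summer` and `demo_for_each_state` are hypothetical names; the algorithm works on its own copy of the function object and hands that copy back.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A stateful function object that accumulates what it is called with.
struct Summer {
    int total = 0;
    void operator()(int x) { total += x; }
};

int demo_for_each_state() {
    std::vector<int> values{1, 2, 3, 4};

    Summer original;
    // for_each takes the functor by value and returns the copy it used.
    Summer used = std::for_each(values.begin(), values.end(), original);

    assert(original.total == 0);  // the caller's object was never modified
    assert(used.total == 10);     // the returned copy carries the state
    return used.total;
}
```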

In f3 nothing is maintaining a copy of the shared pointer, so the memory is being freed and accessing it gives undefined behavior. You can fix this by explicitly capturing the shared pointer by value and still capture the pointed-to int by reference.

std::shared_ptr<int> o = std::shared_ptr<int>(new int(k));
int &p = *o;
return std::function<int(void)>([&p,o]{return p++;});

f4 is again undefined behavior because you're again capturing a reference to a local variable, o. You should simply capture by value but then still create your int &p inside the lambda in order to get the syntax you want.

std::shared_ptr<int> o = std::shared_ptr<int>(new int(k));
return std::function<int(void)>([o]() -> int {int &p = *o; return p++;});

Note that when you add the second statement, C++11 no longer allows you to omit the return type. (clang, and I assume gcc, have an extension that allows return type deduction even with multiple statements, but you should get a warning at least. C++14 lifted this restriction.)
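Put together, the shared_ptr approach can be sketched as a complete function. `make_shared_counter` and `demo_shared_counter` are hypothetical names, and std::make_shared is used here in place of the `new int(k)` form. Unlike the mutable by-value version, copies of the returned std::function share one counter.

```cpp
#include <cassert>
#include <functional>
#include <memory>

std::function<int()> make_shared_counter(int k) {
    auto o = std::make_shared<int>(k);
    // Capturing o by value keeps the int alive as long as any copy of the closure exists.
    return [o]() -> int {
        int& p = *o;
        return p++;
    };
}

int demo_shared_counter() {
    std::function<int()> first = make_shared_counter(121);
    std::function<int()> second = first;  // copies the shared_ptr, not the int

    int a = first();   // 121
    int b = second();  // 122, same underlying counter
    int c = first();   // 123

    assert(a == 121 && b == 122 && c == 123);
    return c;
}
```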

free memory of c++ lambda when execute finished

You can capture a lambda inside a lambda (using a C++14 init-capture):

void foo()
{
    std::thread t(
        [func = [](std::string response) {
            printf("recv data: %s", response.c_str());
        }](){
        sleep(2);   // simulate a HTTP request.
        std::string ret = "http result";
        func(ret);
    });
    t.detach();
    // foo has returned by now, but is the `func` lambda still in memory?
}

or if it's supposed to be shared, you can use shared ownership semantics via shared_ptr and then capture it into the lambda by value in order to increase its reference count:

void foo()
{
    auto lambda = [](std::string response){
                printf("recv data: %s", response.c_str());
            };
    
    std::shared_ptr<decltype(lambda)> func{
        std::make_shared<decltype(lambda)>(std::move(lambda))
    };

    std::thread t([func]{
        sleep(2);   // simulate a HTTP request.
        std::string ret = "http result";
        (*func)(ret);
    });
    t.detach();
}

Or, for non-capturing lambdas, one can simply convert the lambda to a function pointer and not worry about its lifetime:

void foo()
{
    auto func_{
        [](std::string response){
            printf("recv data: %s", response.c_str());
        }
    };
    
    std::thread t([func=+func_]{ //note the + to turn lambda into function pointer
        sleep(2);   // simulate a HTTP request.
        std::string ret = "http result";
        (*func)(ret);
    });
    t.detach();
}
