Do C++11 Lambdas Capture Variables They Don't Use

Do C++11 lambdas capture variables they don't use?

Each variable expressly named in the capture list is captured. The default capture will only capture variables that are both (a) not expressly named in the capture list and (b) used in the body of the lambda expression. If a variable is not expressly named and you don't use the variable in the lambda expression, then the variable is not captured. In your example, my_huge_vector is not captured.

Per C++11 §5.1.2[expr.prim.lambda]/11:

If a lambda-expression has an associated capture-default and its compound-statement odr-uses this or a variable with automatic storage duration and the odr-used entity is not explicitly captured, then the odr-used entity is said to be implicitly captured.

Your lambda expression has an associated capture default: by default, you capture variables by value using the [=].

A variable is implicitly captured if and only if it is used (in the One Definition Rule sense of the term "used"). Since you don't use my_huge_vector at all in the body (the "compound statement") of the lambda expression, it is not implicitly captured.

To continue with §5.1.2/14:

An entity is captured by copy if
it is implicitly captured and the capture-default is = or if
it is explicitly captured with a capture that does not include an &.

Since your my_huge_vector is not implicitly captured and it is not explicitly captured, it is not captured at all, by copy or by reference.
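The rule above can be observed with a small sketch. CopyCounter is a hypothetical stand-in for my_huge_vector (it is not from the question) that counts its own copy constructions, and the function names are illustrative only:

```cpp
#include <cassert>

// Hypothetical stand-in for my_huge_vector: counts copy constructions.
struct CopyCounter {
    static int copies;
    CopyCounter() {}
    CopyCounter(const CopyCounter&) { ++copies; }
};
int CopyCounter::copies = 0;

// [=] with a body that never mentions `big`: `big` is not captured.
int copies_with_default_capture() {
    CopyCounter::copies = 0;
    CopyCounter big;
    int small = 1;
    auto f = [=]() { return small + 1; };  // only `small` is odr-used
    assert(f() == 2);
    (void)big;  // silence unused-variable warnings
    return CopyCounter::copies;
}

// Naming `big` in the capture list captures it even though the body ignores it.
int copies_with_explicit_capture() {
    CopyCounter::copies = 0;
    CopyCounter big;
    auto f = [big]() { return 0; };  // some compilers warn that the capture is unused
    assert(f() == 0);
    return CopyCounter::copies;
}
```

The first function should report no copies, while the second reports at least one, because the closure object stores its own copy of big.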

How much does a C++11 lambda capture actually capture?

According to http://en.cppreference.com/w/cpp/language/lambda, the capture list (the part in the square brackets) is:

a comma-separated list of zero or more captures, optionally beginning
with a capture-default. Capture list can be passed as follows [...]:
[a,&b] where a is captured by value and b is captured by reference.
[this] captures the this pointer by value
[&] captures all automatic variables odr-used in the body of
the lambda by reference
[=] captures all automatic variables odr-used
in the body of the lambda by value
[] captures nothing

This means that only the automatic (scope-lifetime) variables used in the body of the lambda will be captured.

I can't see why capturing everything with [&] would be more expensive than individual captures, but one advantage of listing out the captures explicitly is that there's no chance of capturing something you didn't expect.

On the other hand, capturing with [=] could prove expensive, since it copies every variable the lambda uses. Perhaps that's what your coworker was referring to.

In C++11, when are a lambda expression's bound variables supposed to be captured-by-value?

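With a lambda expression, variables captured by value are copied when the lambda expression is evaluated (that is, where the closure object is created), not when the lambda is later called.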

This sample will make it very clear: https://ideone.com/Ly38P

 #include <functional>

 std::function<int()> dowork()
 {
      int answer = 42;
      auto lambda = [answer] () { return answer; };

      // can do what we want
      answer = 666;
      return lambda;
 }

 int main()
 {
      auto ll = dowork();
      return ll(); // 42
 }

It is clear that the capture must be happening before the invocation, since the variables being captured don't even exist (not in scope, neither in lifetime) anymore at a later time.

Why is lambda capture not working in C++?

auto kitten = [=] () {return g+1;};

This lambda doesn't capture anything at all. It's nearly the same as just

int kitten() { return g+1; }

Only local variables can be captured, and there are no local variables visible in the scope of the kitten definition. Note that [=] or [&] don't mean "capture everything", they mean "capture anything necessary", and a global variable is never necessary (or possible) to capture in a lambda, since the meaning of that variable name is always the same no matter when the lambda body is evaluated.
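A minimal sketch of this point, assuming g is a namespace-scope int as in the question (the wrapping function name is made up for illustration). The lambda is placed inside a function because a capture-default is only permitted on a lambda in block scope:

```cpp
#include <cassert>

int g = 1;  // namespace-scope variable, assumed to match the question's g

int call_kitten_after_changing_global() {
    g = 1;
    auto kitten = [=]() { return g + 1; };  // captures nothing; g names ::g
    assert(kitten() == 2);
    g = 10;           // change the global after the lambda was created
    return kitten();  // the body reads ::g at call time
}
```

Because nothing is captured, the second call sees the updated global and returns 11 rather than 2.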

auto cat = [g=g] () {return g+1;};

Here's an init-capture, which is similar to creating a local variable and immediately capturing it. The g before the equal sign declares the init-capture, and the g after the equal sign specifies how to initialize it. Unlike most declarators (see below), the g variable created here is not in scope in its own initializer, so the g after the equal sign means the global variable ::g. So the code is similar to:

auto make_cat()
{
    int g = ::g;   // local copy of the global, like the init-capture
    return [g]() { return g+1; };
}
auto cat = make_cat();
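For contrast with the plain [=] lambda, here is a small sketch of the init-capture behaving as a snapshot. It assumes a namespace-scope int g and needs C++14 for the init-capture; the function name is illustrative:

```cpp
#include <cassert>

int g = 1;  // namespace-scope variable, assumed to match the question's g

int call_cat_after_changing_global() {
    g = 1;
    // The closure gets its own member named g, initialized from ::g (1).
    auto cat = [g = g]() { return g + 1; };
    g = 10;           // only the global changes
    assert(g == 10);
    return cat();     // uses the closure's copy, which is still 1
}
```

Here the call returns 2, since the init-capture copied the global's value when the lambda was created.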

auto kitten = [] () {int g  = g; return g+1;};

This code has a mistake not really related to lambdas. In the local variable definition int g = g;, the declared variable before the equal sign is in scope during the initializer after the equal sign. So g is initialized with its own indeterminate value. Adding one to that indeterminate value is undefined behavior, so the result is not predictable.

C++ lambda capture this vs capture by reference

For the specific example you've provided, capturing this is what you want. Conceptually, capturing this by reference doesn't make a whole lot of sense: you can't change the value of this, you can only use it as a pointer to access members of the class or to get the address of the class instance. Inside your lambda function, if you access things which implicitly use the this pointer (e.g. you call a member function or access a member variable without explicitly writing this), the compiler treats it as though you had written this anyway. You can list multiple captures too, so if you want to capture both members and local variables, you can choose independently whether to capture each of them by reference or by value. The following article should give you a good grounding in lambdas and captures:

https://crascit.com/2015/03/01/lambdas-for-lunch/
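As a rough illustration of capturing this, the sketch below uses a hypothetical Counter class (not taken from the question) whose member function returns a lambda that modifies a member through the captured pointer:

```cpp
#include <cassert>
#include <functional>

// Hypothetical class used only for illustration.
class Counter {
public:
    // The lambda captures the this pointer by value; count_ means this->count_.
    std::function<int()> make_incrementer() {
        return [this]() { return ++count_; };
    }
    int count() const { return count_; }

private:
    int count_ = 0;
};

int increment_twice() {
    Counter c;
    auto inc = c.make_incrementer();  // must not outlive c
    inc();
    inc();
    assert(c.count() == 2);  // the object itself was modified, not a copy
    return c.count();
}
```

Note that the closure only stores a pointer, so it is the caller's job to keep the object alive for as long as the lambda may be called.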

Also, your example uses std::function as the return type through which the lambda is passed back to the caller. Be aware that std::function isn't always as cheap as you may think, so if you are able to use a lambda directly rather than having to wrap it in a std::function, it will likely be more efficient. The following article, while not directly related to your original question, may still give you some useful material relating to lambdas and std::function (see the section An alternative way to store the function object, but the article in general may be of interest):

https://crascit.com/2015/06/03/on-leaving-scope-part-2/
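The difference between wrapping a lambda in std::function and using it directly can be sketched as follows. The function names are made up for illustration, and the actual cost difference depends on the compiler and standard library:

```cpp
#include <cassert>
#include <functional>

// Type-erased: accepts any callable with this signature, typically at the
// cost of an indirect call (and possibly an allocation for large closures).
int apply_erased(const std::function<int(int)>& f, int x) { return f(x); }

// Templated: the closure type is known at compile time, so the call can
// usually be inlined.
template <typename F>
int apply_direct(F f, int x) { return f(x); }

int compare_both(int offset) {
    auto add = [offset](int v) { return v + offset; };
    int erased = apply_erased(add, 1);  // add is converted to std::function
    int direct = apply_direct(add, 1);  // add is passed as its own type
    assert(erased == direct);
    return direct;
}
```

Both paths compute the same result; the templated version simply avoids the type-erasure layer when the caller does not need it.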

Is it a bad practice to always capture all in a lambda expression?

Performance

The standard guarantees that if you do a default capture, the only variables that will be captured by that default capture from the surrounding environment are those that you actually use inside the lambda.

As such, specifying individual variables to capture acts as documentation of what you expect to use, but should never affect performance.

For anybody who cares, the exact wording from the standard is (§5.1.2/11, 12):

11 If a lambda-expression has an associated capture-default and its compound-statement odr-uses (3.2) this or a variable with automatic storage duration and the odr-used entity is not explicitly captured, then the odr-used entity is said to be implicitly captured; such entities shall be declared within the reaching scope of the lambda expression. [Note elided]
12 An entity is captured if it is captured explicitly or implicitly. [...]

Readability

Opinions seem split on this point. Some people like the documentation of intent that comes with explicitly specifying what you capture. Others see this as visual noise that mostly gets in the way of understanding what the lambda is/does.

Personally, I think it's usually kind of irrelevant--most lambdas are (and should be) quite small and simple, and only need to capture a few things. As such, explicit documentation of what they capture doesn't add much noise, but also doesn't add much to making the code more understandable. If explicit capture makes a huge difference in either direction, that may hint at there being other problems with the code.

Summary

An implicit capture specification ([=] or [&]) will only capture variables that are used in the lambda, so implicit vs. explicit capture should never affect performance.

I don't think there's a simple, clear answer with respect to readability.

What is a lambda expression in C++11?

The problem

C++ includes useful generic functions like std::for_each and std::transform, which can be very handy. Unfortunately they can also be quite cumbersome to use, particularly if the functor you would like to apply is unique to the particular function.

#include <algorithm>
#include <vector>

namespace {
  struct f {
    void operator()(int) {
      // do something
    }
  };
}

void func(std::vector<int>& v) {
  f f;
  std::for_each(v.begin(), v.end(), f);
}

If you only use f once, and in that specific place, it seems overkill to write a whole class just to do something trivial and one-off.

In C++03 you might be tempted to write something like the following, to keep the functor local:

void func2(std::vector<int>& v) {
  struct {
    void operator()(int) {
       // do something
    }
  } f;
  std::for_each(v.begin(), v.end(), f);
}

However, this is not allowed: in C++03 a local type cannot be used as a template argument, so f cannot be passed to a function template such as std::for_each.

The new solution

C++11 introduces lambdas, which allow you to write an inline, anonymous functor to replace the struct f. For small, simple examples this can be cleaner to read (it keeps everything in one place) and potentially simpler to maintain, for example in the simplest form:

void func3(std::vector<int>& v) {
  std::for_each(v.begin(), v.end(), [](int) { /* do something here*/ });
}

Lambda functions are just syntactic sugar for anonymous functors.
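To make the "syntactic sugar" claim concrete, here is a sketch of a hand-written functor next to a lambda that does the same job. GreaterThan and both function names are hypothetical, and the closure type a compiler really generates is unnamed and may differ in detail:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hand-written functor roughly matching
// [threshold](int x) { return x > threshold; }
struct GreaterThan {
    explicit GreaterThan(int t) : threshold(t) {}
    bool operator()(int x) const { return x > threshold; }  // const, like a lambda's
private:
    int threshold;  // plays the role of the by-value capture
};

int count_with_functor(const std::vector<int>& v, int threshold) {
    return static_cast<int>(
        std::count_if(v.begin(), v.end(), GreaterThan(threshold)));
}

int count_with_lambda(const std::vector<int>& v, int threshold) {
    int n = static_cast<int>(std::count_if(
        v.begin(), v.end(), [threshold](int x) { return x > threshold; }));
    assert(n == count_with_functor(v, threshold));  // both forms agree
    return n;
}
```

The captured variable becomes a data member and the lambda body becomes a const operator(), which is why the two versions are interchangeable here.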

Return types

In simple cases the return type of the lambda is deduced for you, e.g.:

void func4(std::vector<double>& v) {
  std::transform(v.begin(), v.end(), v.begin(),
                 [](double d) { return d < 0.00001 ? 0 : d; }
                 );
}

however when you start to write more complex lambdas you will quickly encounter cases where the return type cannot be deduced by the compiler, e.g.:

void func4(std::vector<double>& v) {
    std::transform(v.begin(), v.end(), v.begin(),
        [](double d) {
            if (d < 0.0001) {
                return 0;
            } else {
                return d;
            }
        });
}

To resolve this you are allowed to explicitly specify a return type for a lambda function, using -> T:

void func4(std::vector<double>& v) {
    std::transform(v.begin(), v.end(), v.begin(),
        [](double d) -> double {
            if (d < 0.0001) {
                return 0;
            } else {
                return d;
            }
        });
}

"Capturing" variables

So far we've not used anything other than what was passed to the lambda, but a lambda can also use other variables. If you want to access other variables you can use the capture clause (the [] of the expression), which has so far been unused in these examples, e.g.:

void func5(std::vector<double>& v, const double& epsilon) {
    std::transform(v.begin(), v.end(), v.begin(),
        [epsilon](double d) -> double {
            if (d < epsilon) {
                return 0;
            } else {
                return d;
            }
        });
}

You can capture by both reference and value, which you can specify using & and = respectively:

[&epsilon, zeta] captures epsilon by reference and zeta by value
[&] captures all variables used in the lambda by reference
[=] captures all variables used in the lambda by value
[&, epsilon] captures all variables used in the lambda by reference but captures epsilon by value
[=, &epsilon] captures all variables used in the lambda by value but captures epsilon by reference
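The capture forms listed above can be tried out with a short sketch. It reuses the names epsilon and zeta from the list, but with int values chosen purely for illustration:

```cpp
#include <cassert>

int mixed_capture_demo() {
    int epsilon = 1;
    int zeta = 10;
    // Default by value, except epsilon, which is captured by reference.
    auto f = [=, &epsilon]() { return epsilon + zeta; };
    epsilon = 2;  // visible to the lambda through the reference
    zeta = 20;    // not visible: the closure holds the copy made earlier (10)
    assert(zeta == 20);
    return f();   // 2 + 10
}
```

The result is 12: the by-reference capture follows the later assignment, while the by-value capture keeps the value it had when the lambda was created.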

The generated operator() is const by default, which means that variables captured by value are const when you access them inside the lambda. As a result, calling the lambda cannot change its own captured copies. You can mark the lambda as mutable to request that the generated operator() is not const.
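A small sketch of what mutable changes, using a made-up counter lambda. Without the mutable keyword the body below would be rejected, because it modifies a by-value capture:

```cpp
#include <cassert>

int mutable_counter_demo() {
    int n = 0;
    // `mutable` makes operator() non-const, so the closure's copy of n
    // may be modified between calls.
    auto next = [n]() mutable { return n++; };
    next();              // returns 0
    next();              // returns 1
    int third = next();  // returns 2
    assert(n == 0);      // the outer n is untouched; only the copy changed
    return third;
}
```

The state lives in the closure object, so each copy of the lambda carries its own counter.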

Why can we avoid specifying the type in a lambda capture?

From cppreference:

A capture with an initializer acts as if it declares and explicitly captures a variable declared with type auto, whose declarative region is the body of the lambda expression (that is, it is not in scope within its initializer), [...]

Lambdas used the opportunity of a syntax that was fresh and new anyway to get some things right and allow a nice, terse syntax. For example, a lambda's operator() is const by default and you opt out via mutable, whereas ordinary member functions are non-const by default.

Omitting auto here does not create any issues or ambiguities. The example from cppreference:

int x = 4;
auto y = [&r = x, x = x + 1]()->int
    {
        r += 2;
        return x * x;
    }(); // updates ::x to 6 and initializes y to 25.

From the lambda syntax it is clear that &r is a by reference capture initialized by x and x is a by value capture initialized by x + 1. The types can be deduced from the initializers. There would be no gain in requiring to add auto.

In my experience n could have been just declared inside the lambda body with auto or int as datatype. Isn't it?

Yes, but then it would need to be static. This produces the same output in your example:

std::generate(v.begin(), v.end(), [] () mutable { 
    static int n = 0;
    return n++; });

However, the capture can be considered cleaner than the function local static.
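One practical difference between the two approaches is sketched below. The original question's code is not shown here, so the init-capture [n = 0] (C++14) and the function names are assumptions made for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Init-capture: every evaluation of the lambda expression starts from n = 0.
std::vector<int> fill_with_capture(std::size_t count) {
    std::vector<int> v(count);
    std::generate(v.begin(), v.end(), [n = 0]() mutable { return n++; });
    assert(v.empty() || v.front() == 0);
    return v;
}

// Function-local static: one n shared by all calls for the whole program.
std::vector<int> fill_with_static(std::size_t count) {
    std::vector<int> v(count);
    std::generate(v.begin(), v.end(), []() {
        static int n = 0;
        return n++;
    });
    return v;
}
```

Calling fill_with_capture repeatedly yields the same sequence each time, whereas fill_with_static continues counting from where the previous call stopped, which is one reason the capture is often considered the cleaner choice.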

Why can I captureless-capture an int variable, but not a non-capturing lambda?

This has to do with odr-use.

First, from [basic.def.odr]/10:

A local entity is odr-usable in a scope if:
either the local entity is not *this, or an enclosing class or non-lambda function parameter scope exists and, if the innermost such scope is a function parameter scope, it corresponds to a non-static member function, and
for each intervening scope ([basic.scope.scope]) between the point at which the entity is introduced and the scope (where *this is considered to be introduced within the innermost enclosing class or non-lambda function definition scope), either:
the intervening scope is a block scope, or
the intervening scope is the function parameter scope of a lambda-expression that has a simple-capture naming the entity or has a capture-default, and the block scope of the lambda-expression is also an intervening scope.
If a local entity is odr-used in a scope in which it is not odr-usable, the program is ill-formed.

So in this example, a is odr-usable but b is not:

void foo() {
    constexpr const int b { 123 };
    constexpr const auto l1 = [](int a) { return b * a; };
    (void) l1;
}

And in this example, similarly, a and c are odr-usable, but neither b nor l1 is.

void foo() {
    constexpr const int b { 123 };
    constexpr const auto l1 = [](int a) { return b * a; };
    (void) [](int c) { return l1(c) * c; };
}

But the rule isn't just "not odr-usable", it's also "odr-used". Which one(s) of these are odr-used? That's [basic.def.odr]/5:

A variable is named by an expression if the expression is an id-expression that denotes it. A variable x whose name appears as a potentially-evaluated expression E is odr-used by E unless
x is a reference that is usable in constant expressions ([expr.const]), or
x is a variable of non-reference type that is usable in constant expressions and has no mutable subobjects, and E is an element of the set of potential results of an expression of non-volatile-qualified non-class type to which the lvalue-to-rvalue conversion ([conv.lval]) is applied, or
x is a variable of non-reference type, and E is an element of the set of potential results of a discarded-value expression ([expr.context]) to which the lvalue-to-rvalue conversion is not applied.

For the b * a case, b is "a variable of non-reference type that is usable in constant expressions" and what we're doing with it is applying "the lvalue-to-rvalue conversion". That's the second bullet exception to the rule, so b is not odr-used, so we don't have the odr-used but not odr-usable problem.

For the l1(c) case, l1 is also "a variable of non-reference type that is usable in constant expressions"... but we're not doing an lvalue-to-rvalue conversion on it. We're invoking the call operator. So we don't hit the exception, so we are odr-using l1... but it's not odr-usable, which makes this ill-formed.

The solution here is to either capture l1 (making it odr-usable) or make it static or global (making the rule irrelevant since l1 wouldn't be a local entity anymore).
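Both fixes can be sketched as follows. The function names are invented for illustration, and l1 is declared const rather than constexpr so that the sketch does not depend on C++17 constexpr lambdas:

```cpp
#include <cassert>

// Fix 1: capture l1, which makes it odr-usable inside the inner lambda.
int call_through_capture(int c) {
    constexpr int b = 123;
    const auto l1 = [](int a) { return b * a; };  // b is not odr-used here
    auto l2 = [l1](int x) { return l1(x) * x; };  // l1 is explicitly captured
    return l2(c);
}

// Fix 2: make l1 static, so it is no longer an automatic variable.
int call_through_static(int c) {
    constexpr int b = 123;
    static const auto l1 = [](int a) { return b * a; };
    assert(l1(1) == b);
    auto l2 = [](int x) { return l1(x) * x; };  // no capture needed
    return l2(c);
}
```

Either way the inner lambda can call l1; the capture version copies the (empty) closure object, while the static version refers to a single shared instance.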
