Std::Bind VS Lambda Performance

std::bind vs lambda performance

I assume that lambda cannot be that better than bind.

That's quite a preconception.

Lambdas are tied into the compiler internals, so extra optimization opportunities may be found. Moreover, they're designed to avoid inefficiency.

However, there are probably no compiler optimization tricks happening here. The likely culprit is the argument to bind, bind(&decltype(result)::eval, &result). You are passing a pointer-to-member-function (PTMF) and an object. Unlike the lambda type, the PTMF does not capture what function actually gets called; it only contains the function signature (parameter and return types). The slow loop is using an indirect branch function call, because the compiler failed to resolve the function pointer through constant propagation.

If you rename the member eval() to operator () () and get rid of bind, then the explicit object will essentially behave like the lambda and the performance difference should disappear.

Efficiency of std::bind vs lambda

Is there any difference in the performance of these approaches?

Perhaps, perhaps not; as commenters suggest - profile to check, or look at the assemby code you get (e.g. using the GodBolt Compiler Explorer). But you're asking the wrong question, for two main reasons:

You should probably not be passing lambda's, nor bind() results, around in the part of your code that's performance-critical.
You should definitely avoid invoking arbitrary functions via function pointer or std::function variables in performance-critical areas of your code (except if this can be de-virtualized and inlined by the compiler).

and one mind reason:

Lambdas (and std::bind()'s) are usable, and useful, without being wrapped in std::function; this wrapper has its own performance penalty, so you would only be comparing one way of using these constructs.

Bottom line recommendation: Just use Lambdas. They're cleaner, easier to understand, cheaper to compile, and more flexible syntactically. So don't worry and be happy :-) . And in performance-critical code, either use Lambda's without std::function, or don't use any of the two.

Bind Vs Lambda?

As you said, bind and lambdas don't quite exactly aim at the same goal.

For instance, for using and composing STL algorithms, lambdas are clear winners, IMHO.

To illustrate, I remember a really funny answer, here on stack overflow, where someone asked for ideas of hex magic numbers, (like 0xDEADBEEF, 0xCAFEBABE, 0xDEADDEAD etc.) and was told that if he were a real C++ programmer he would simply have download a list of English words and use a simple one-liner of C++ :)

#include <iterator>
#include <string>
#include <algorithm>
#include <iostream>
#include <fstream>
#include <boost/lambda/lambda.hpp>
#include <boost/lambda/bind.hpp>

int main()
{
    using namespace boost::lambda;
    std::ifstream ifs("wordsEn.txt");
    std::remove_copy_if(
        std::istream_iterator<std::string>(ifs),
        std::istream_iterator<std::string>(),
        std::ostream_iterator<std::string>(std::cout, "\n"),
        bind(&std::string::size, _1) != 8u
            ||
        bind(
            static_cast<std::string::size_type (std::string::*)(const char*, std::string::size_type) const>(
                &std::string::find_first_not_of
            ),
            _1,
            "abcdef",
            0u
        ) != std::string::npos
    );
}

This snippet, in pure C++98, open the English words file, scan each word and print only those of length 8 with 'a', 'b', 'c', 'd', 'e' or 'f' letters.

Now, turn on C++0X and lambda :

#include <iterator>
#include <string>
#include <algorithm>
#include <iostream>
#include <fstream>

int main()
{
 std::ifstream ifs("wordsEn.txt");
 std::copy_if(
    std::istream_iterator<std::string>(ifs),
    std::istream_iterator<std::string>(),
    std::ostream_iterator<std::string>(std::cout, "\n"),
    [](const std::string& s)
    {
       return (s.size() == 8 && 
               s.find_first_not_of("abcdef") == std::string::npos);
    }
 );
}

This is still a bit heavy to read (mainly because of the istream_iterator business), but a lot simpler than the bind version :)

Why use std::bind over lambdas in C++14?

Scott Meyers gave a talk about this. This is what I remember:

In C++14 there is nothing useful bind can do that can't also be done with lambdas.

In C++11 however there are some things that can't be done with lambdas:

You can't move the variables while capturing when creating the lambdas. Variables are always captured as lvalues. For bind you can write:
```
auto f1 = std::bind(f, 42, _1, std::move(v));
```
Expressions can't be captured, only identifiers can. For bind you can write:
```
auto f1 = std::bind(f, 42, _1, a + b);
```
Overloading arguments for function objects. This was already mentioned in the question.
Impossible to perfect-forward arguments

In C++14 all of these possible.

Move example:

auto f1 = [v = std::move(v)](auto arg) { f(42, arg, std::move(v)); };

Expression example:

auto f1 = [sum = a + b](auto arg) { f(42, arg, sum); };

See question

Perfect forwarding: You can write

auto f1 = [=](auto&& arg) { f(42, std::forward<decltype(arg)>(arg)); };

Some disadvantages of bind:

Bind binds by name and as a result if you have multiple functions with the same name (overloaded functions) bind doesn't know which one to use. The following example won't compile, while lambdas wouldn't have a problem with it:
```
void f(int); void f(char); auto f1 = std::bind(f, _1, 42);
```
When using bind functions are less likely to be inlined

On the other hand lambdas might theoretically generate more template code than bind. Since for each lambda you get a unique type. For bind it is only when you have different argument types and a different function (I guess that in practice however it doesn't happen very often that you bind several time with the same arguments and function).

What Jonathan Wakely mentioned in his answer is actually one more reason not to use bind. I can't see why you would want to silently ignore arguments.

Speed of bound lambda (via std::function) vs operator() of functor struct

The difference between the two cases is fundamentally that with the functor, the compiler knows exactly what will be called at compile time, so the function call can be inlined. Lambdas, interestingly enough, also have a unique type. This means again, when you use a lambda, at compile type (since the compiler must know all types) the function being called is already known, so inlining can occur. On the other hand, a function pointer is type based only on its signature. The signature must be known so that it can be called to and returned from appropriately, but other than that a function pointer can point to anything at run-time, as far as the compiler is concerned. The same is true about std::function.

When you wrap the lambda in a std::function, you erase the type of the lambda from a compiler perspective. If this sounds weird/impossible, think of it this way: since a std::function of a fixed type can wrap any callable with the same signature, the compiler has no way of knowing that some other instruction won't come alone and change what the std::function is wrapping.

This link, http://goo.gl/60QFjH, shows what I mean (by the way, the godbolt page is very very handy, I suggest getting acquainted with it). I wrote three examples here similar to yours. The first uses std::function wrapping a lambda, the second a functor, the third a naked lambda (unwrapped), using decltype. You can look at the assembly on the right and see that both of the latter two get inlined, but not the first.

My guess is that you can use lambdas to do exactly the same thing. Instead of bind, you can just do value based capture with the lambdas of a and b. Each time you push back the lambda into the vector, modify a and b appropriately, and voila.

Stylistically though, I actually strongly feel you should use the struct. It's much clearer what's going on. The mere fact that you are seeming to want to capture a and b in one place, and test against c in another, means that this is used in your code in not just one place. In exchange for like, 2 extra lines of code, you get something more readable, easier to debug, and more extensible.

Why use `std::bind_front` over lambdas in C++20?

bind_front binds the first X parameters, but if the callable calls for more parameters, they get tacked onto the end. This makes bind_front very readable when you're only binding the first few parameters of a function.

The obvious example would be creating a callable for a member function that is bound to a specific instance:

type *instance = ...;

//lambda
auto func = [instance](auto &&... args) -> decltype(auto) {return instance->function(std::forward<decltype(args)>(args)...);}

//bind
auto func = std::bind_front(&type::function, instance);

The bind_front version is a lot less noisy. It gets right to the point, having exactly 3 named things: bind_front, the member function to be called, and the instance on which it will be called. And that's all that our situation calls for: a marker to denote that we're creating a binding of the first parameters of a function, the function to be bound, and the parameter we want to bind. There is no extraneous syntax or other details.

By contrast, the lambda has a lot of stuff we just don't care about at this location. The auto... args bit, the std::forward stuff, etc. It's a bit harder to figure out what it's doing, and it's definitely much longer to read.

Note that bind_front doesn't allow bind's placeholders at all, so it's not really a replacement. It's more a shorthand for the most useful forms of bind.

Use std::bind and store into a std:: function

The placeholders _1, _2, _3... are placed in namespace std::placeholders, you should qualify it like

Callback c = std::bind(handler, std::placeholders::_1);

using namespace std::placeholders;
Callback c = std::bind(handler, _1);

C++0x lambda wrappers vs. bind for passing member functions

As far as readability and style are concerned, I think std::bind looks cleaner for this purpose, actually. std::placeholders does not have anything other than _[1-29] for use with std::bind as far as I know, so I think it is fine to just use "using namespace std::placeholders;"

As for performance, I tried disassembling some test functions:

#include <functional>

void foo (int, int, int);

template <typename T>
void test_functor (const T &functor)
{
    functor (1, 2, 3);
}

template <typename T>
void test_functor_2 (const T &functor)
{
    functor (2, 3);
}

void test_lambda ()
{
    test_functor ([] (int a, int b, int c) {foo (a, b, c);});
}

void test_bind ()
{
    using namespace std::placeholders;
    test_functor (std::bind (&foo, _1, _2, _3));
}

void test_lambda (int a)
{
    test_functor_2 ([=] (int b, int c) {foo (a, b, c);});
}

void test_bind (int a)
{
    using namespace std::placeholders;
    test_functor_2 (std::bind (&foo, a, _1, _2));
}

When foo() was not defined in the same translation unit, the assembly outputs were more or less the same for both test_lambda and test_bind:

00000000004004d0 <test_lambda()>:
  4004d0:   ba 03 00 00 00          mov    $0x3,%edx
  4004d5:   be 02 00 00 00          mov    $0x2,%esi
  4004da:   bf 01 00 00 00          mov    $0x1,%edi
  4004df:   e9 dc ff ff ff          jmpq   4004c0 <foo(int, int, int)>
  4004e4:   66 66 66 2e 0f 1f 84    data32 data32 nopw %cs:0x0(%rax,%rax,1)
  4004eb:   00 00 00 00 00 

00000000004004f0 <test_bind()>:
  4004f0:   ba 03 00 00 00          mov    $0x3,%edx
  4004f5:   be 02 00 00 00          mov    $0x2,%esi
  4004fa:   bf 01 00 00 00          mov    $0x1,%edi
  4004ff:   e9 bc ff ff ff          jmpq   4004c0 <foo(int, int, int)>
  400504:   66 66 66 2e 0f 1f 84    data32 data32 nopw %cs:0x0(%rax,%rax,1)
  40050b:   00 00 00 00 00 

0000000000400510 <test_lambda(int)>:
  400510:   ba 03 00 00 00          mov    $0x3,%edx
  400515:   be 02 00 00 00          mov    $0x2,%esi
  40051a:   e9 a1 ff ff ff          jmpq   4004c0 <foo(int, int, int)>
  40051f:   90                      nop

0000000000400520 <test_bind(int)>:
  400520:   ba 03 00 00 00          mov    $0x3,%edx
  400525:   be 02 00 00 00          mov    $0x2,%esi
  40052a:   e9 91 ff ff ff          jmpq   4004c0 <foo(int, int, int)>
  40052f:   90                      nop

However, when the body of foo was included into the same translation unit, only the lambda had its contents inlined (by GCC 4.6):

00000000004008c0 <foo(int, int, int)>:
  4008c0:   53                      push   %rbx
  4008c1:   ba 04 00 00 00          mov    $0x4,%edx
  4008c6:   be 2c 0b 40 00          mov    $0x400b2c,%esi
  4008cb:   bf 60 10 60 00          mov    $0x601060,%edi
  4008d0:   e8 9b fe ff ff          callq  400770 <std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@plt>
  4008d5:   48 8b 05 84 07 20 00    mov    0x200784(%rip),%rax        # 601060 <std::cout@@GLIBCXX_3.4>
  4008dc:   48 8b 40 e8             mov    -0x18(%rax),%rax
  4008e0:   48 8b 98 50 11 60 00    mov    0x601150(%rax),%rbx
  4008e7:   48 85 db                test   %rbx,%rbx
  4008ea:   74 3c                   je     400928 <foo(int, int, int)+0x68>
  4008ec:   80 7b 38 00             cmpb   $0x0,0x38(%rbx)
  4008f0:   74 1e                   je     400910 <foo(int, int, int)+0x50>
  4008f2:   0f b6 43 43             movzbl 0x43(%rbx),%eax
  4008f6:   bf 60 10 60 00          mov    $0x601060,%edi
  4008fb:   0f be f0                movsbl %al,%esi
  4008fe:   e8 8d fe ff ff          callq  400790 <std::basic_ostream<char, std::char_traits<char> >::put(char)@plt>
  400903:   5b                      pop    %rbx
  400904:   48 89 c7                mov    %rax,%rdi
  400907:   e9 74 fe ff ff          jmpq   400780 <std::basic_ostream<char, std::char_traits<char> >::flush()@plt>
  40090c:   0f 1f 40 00             nopl   0x0(%rax)
  400910:   48 89 df                mov    %rbx,%rdi
  400913:   e8 08 fe ff ff          callq  400720 <std::ctype<char>::_M_widen_init() const@plt>
  400918:   48 8b 03                mov    (%rbx),%rax
  40091b:   be 0a 00 00 00          mov    $0xa,%esi
  400920:   48 89 df                mov    %rbx,%rdi
  400923:   ff 50 30                callq  *0x30(%rax)
  400926:   eb ce                   jmp    4008f6 <foo(int, int, int)+0x36>
  400928:   e8 e3 fd ff ff          callq  400710 <std::__throw_bad_cast()@plt>
  40092d:   0f 1f 00                nopl   (%rax)

0000000000400930 <test_lambda()>:
  400930:   53                      push   %rbx
  400931:   ba 04 00 00 00          mov    $0x4,%edx
  400936:   be 2c 0b 40 00          mov    $0x400b2c,%esi
  40093b:   bf 60 10 60 00          mov    $0x601060,%edi
  400940:   e8 2b fe ff ff          callq  400770 <std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@plt>
  400945:   48 8b 05 14 07 20 00    mov    0x200714(%rip),%rax        # 601060 <std::cout@@GLIBCXX_3.4>
  40094c:   48 8b 40 e8             mov    -0x18(%rax),%rax
  400950:   48 8b 98 50 11 60 00    mov    0x601150(%rax),%rbx
  400957:   48 85 db                test   %rbx,%rbx
  40095a:   74 3c                   je     400998 <test_lambda()+0x68>
  40095c:   80 7b 38 00             cmpb   $0x0,0x38(%rbx)
  400960:   74 1e                   je     400980 <test_lambda()+0x50>
  400962:   0f b6 43 43             movzbl 0x43(%rbx),%eax
  400966:   bf 60 10 60 00          mov    $0x601060,%edi
  40096b:   0f be f0                movsbl %al,%esi
  40096e:   e8 1d fe ff ff          callq  400790 <std::basic_ostream<char, std::char_traits<char> >::put(char)@plt>
  400973:   5b                      pop    %rbx
  400974:   48 89 c7                mov    %rax,%rdi
  400977:   e9 04 fe ff ff          jmpq   400780 <std::basic_ostream<char, std::char_traits<char> >::flush()@plt>
  40097c:   0f 1f 40 00             nopl   0x0(%rax)
  400980:   48 89 df                mov    %rbx,%rdi
  400983:   e8 98 fd ff ff          callq  400720 <std::ctype<char>::_M_widen_init() const@plt>
  400988:   48 8b 03                mov    (%rbx),%rax
  40098b:   be 0a 00 00 00          mov    $0xa,%esi
  400990:   48 89 df                mov    %rbx,%rdi
  400993:   ff 50 30                callq  *0x30(%rax)
  400996:   eb ce                   jmp    400966 <test_lambda()+0x36>
  400998:   e8 73 fd ff ff          callq  400710 <std::__throw_bad_cast()@plt>
  40099d:   0f 1f 00                nopl   (%rax)

00000000004009a0 <test_bind()>:
  4009a0:   ba 03 00 00 00          mov    $0x3,%edx
  4009a5:   be 02 00 00 00          mov    $0x2,%esi
  4009aa:   bf 01 00 00 00          mov    $0x1,%edi
  4009af:   e9 0c ff ff ff          jmpq   4008c0 <foo(int, int, int)>
  4009b4:   66 66 66 2e 0f 1f 84    data32 data32 nopw %cs:0x0(%rax,%rax,1)
  4009bb:   00 00 00 00 00 

00000000004009c0 <test_lambda(int)>:
  4009c0:   53                      push   %rbx
  4009c1:   ba 04 00 00 00          mov    $0x4,%edx
  4009c6:   be 2c 0b 40 00          mov    $0x400b2c,%esi
  4009cb:   bf 60 10 60 00          mov    $0x601060,%edi
  4009d0:   e8 9b fd ff ff          callq  400770 <std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@plt>
  4009d5:   48 8b 05 84 06 20 00    mov    0x200684(%rip),%rax        # 601060 <std::cout@@GLIBCXX_3.4>
  4009dc:   48 8b 40 e8             mov    -0x18(%rax),%rax
  4009e0:   48 8b 98 50 11 60 00    mov    0x601150(%rax),%rbx
  4009e7:   48 85 db                test   %rbx,%rbx
  4009ea:   74 3c                   je     400a28 <test_lambda(int)+0x68>
  4009ec:   80 7b 38 00             cmpb   $0x0,0x38(%rbx)
  4009f0:   74 1e                   je     400a10 <test_lambda(int)+0x50>
  4009f2:   0f b6 43 43             movzbl 0x43(%rbx),%eax
  4009f6:   bf 60 10 60 00          mov    $0x601060,%edi
  4009fb:   0f be f0                movsbl %al,%esi
  4009fe:   e8 8d fd ff ff          callq  400790 <std::basic_ostream<char, std::char_traits<char> >::put(char)@plt>
  400a03:   5b                      pop    %rbx
  400a04:   48 89 c7                mov    %rax,%rdi
  400a07:   e9 74 fd ff ff          jmpq   400780 <std::basic_ostream<char, std::char_traits<char> >::flush()@plt>
  400a0c:   0f 1f 40 00             nopl   0x0(%rax)
  400a10:   48 89 df                mov    %rbx,%rdi
  400a13:   e8 08 fd ff ff          callq  400720 <std::ctype<char>::_M_widen_init() const@plt>
  400a18:   48 8b 03                mov    (%rbx),%rax
  400a1b:   be 0a 00 00 00          mov    $0xa,%esi
  400a20:   48 89 df                mov    %rbx,%rdi
  400a23:   ff 50 30                callq  *0x30(%rax)
  400a26:   eb ce                   jmp    4009f6 <test_lambda(int)+0x36>
  400a28:   e8 e3 fc ff ff          callq  400710 <std::__throw_bad_cast()@plt>
  400a2d:   0f 1f 00                nopl   (%rax)

0000000000400a30 <test_bind(int)>:
  400a30:   ba 03 00 00 00          mov    $0x3,%edx
  400a35:   be 02 00 00 00          mov    $0x2,%esi
  400a3a:   e9 81 fe ff ff          jmpq   4008c0 <foo(int, int, int)>
  400a3f:   90                      nop

Out of curiosity, I redid the test using GCC 4.7, and found that with 4.7, both tests were inlined in the same manner.

My conclusion is that the performance should be the same in either case, but you might want to check your compiler output if it matters.

What is the performance overhead of std::function?

You can find information from the boost's reference materials: How much overhead does a call through boost::function incur? and Performance

This doesn't determine "yes or no" to boost function. The performance drop may be well acceptable given program's requirements. More often than not, parts of a program are not performance-critical. And even then it may be acceptable. This is only something you can determine.

As to the standard library version, the standard only defines an interface. It is entirely up to individual implementations to make it work. I suppose a similar implementation to boost's function would be used.

Std::Bind VS Lambda Performance