C++: Why Pass-By-Value Is Generally More Efficient Than Pass-By-Reference for Built-In (I.E., C-Like) Types

A compiler typically implements a reference as a pointer. Pointers tend to be the same size as, or larger than, many of the built-in types, so for these types at least as much data is passed by reference as by value. Inside the function, however, the actual data can only be reached by dereferencing this internal pointer. This can add an instruction to the generated code, and it means two memory locations (the pointer and the object it points to) may have to be in cache rather than one. The difference won't be large, but it can be measurable in tight loops.

A compiler vendor could choose to disregard const references (and sometimes also non-const references) when they're used on built-in types - all depending on the information available to the compiler when it deals with the function and its callers.
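As a rough illustration of the two calling styles being compared, here is a minimal sketch. The function names are hypothetical and are not taken from any of the answers; the comments describe what a compiler typically does for an out-of-line call, not what it is required to do.

```cpp
#include <cassert>

// By value: the int usually arrives in a register or stack slot
// and can be used directly.
int times_five_by_value(int x) { return x * 5; }

// By const reference: usually implemented as a hidden pointer, so an
// out-of-line call has to load the int through that pointer first.
int times_five_by_ref(const int& x) { return x * 5; }

// Both forms compute the same result; only the calling mechanics differ.
inline void check_equivalence(int n) {
    assert(times_five_by_value(n) == times_five_by_ref(n));
}
```

When both functions are inlined, an optimizing compiler will often generate identical code for them, which is the situation the paragraph above describes.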

Pass by reference more expensive than pass by value

Prefer passing primitive types (int, char, float, ...) and POD structs that are cheap to copy (Point, complex) by value.

This will be more efficient than the indirection required when passing by reference.

See Boost's Call Traits.

The template class call_traits<T> encapsulates the "best" method to pass a parameter of some type T to or from a function, and consists of a collection of typedefs defined as in the table below. The purpose of call_traits is to ensure that problems like "references to references" never occur, and that parameters are passed in the most efficient manner possible.
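To make the idea concrete without depending on Boost, here is a small sketch of a trait that picks a parameter type. The name pass_as and the "scalars by value, everything else by const reference" rule are assumptions made for this example; this is not Boost's implementation or API.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical trait: choose how a parameter of type T should be passed.
// Assumption for this sketch: scalar types (arithmetic types, enums,
// pointers) go by value, everything else by reference-to-const.
template <typename T>
struct pass_as {
    typedef typename std::conditional<std::is_scalar<T>::value,
                                      T, const T&>::type type;
};

static_assert(std::is_same<pass_as<int>::type, int>::value,
              "int is passed by value");
static_assert(std::is_same<pass_as<std::string>::type,
                           const std::string&>::value,
              "std::string is passed by reference-to-const");

// A generic function using the trait. T cannot be deduced through the
// nested typedef, so callers name it explicitly: same_value<int>(1, 2).
template <typename T>
bool same_value(typename pass_as<T>::type a, typename pass_as<T>::type b) {
    return a == b;
}
```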

Is it better in C++ to pass by value or pass by reference-to-const?

It used to be generally recommended best practice¹ to use pass by const ref for all types, except for builtin types (char, int, double, etc.), for iterators and for function objects (lambdas, classes deriving from std::*_function).

This was especially true before the existence of move semantics. The reason is simple: if you passed by value, a copy of the object had to be made and, except for very small objects, this is always more expensive than passing a reference.

With C++11, we have gained move semantics. In a nutshell, move semantics permit that, in some cases, an object can be passed “by value” without copying it. In particular, this is the case when the object that you are passing is an rvalue.

In itself, moving an object is still at least as expensive as passing by reference. However, in many cases a function will internally copy an object anyway — i.e. it will take ownership of the argument.²

In these situations we have the following (simplified) trade-off:

1. We can pass the object by reference, then copy internally.
2. We can pass the object by value.

“Pass by value” still causes the object to be copied, unless the object is an rvalue. In the case of an rvalue, the object can be moved instead, so that the second case is suddenly no longer “copy, then move” but “move, then (potentially) move again”.

For large objects that implement proper move constructors (such as vectors, strings …), the second case is then vastly more efficient than the first. Therefore, it is recommended to use pass by value if the function takes ownership of the argument, and if the object type supports efficient moving.
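The trade-off can be checked by counting copies and moves. The sketch below uses a hypothetical Tracker type and two hypothetical classes that store their constructor argument; none of these names come from the answer itself.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how often it is copied or moved.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() {}
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

inline void reset_counts() { Tracker::copies = 0; Tracker::moves = 0; }

// Case 1: take by reference-to-const, then copy internally.
struct StoresViaConstRef {
    explicit StoresViaConstRef(const Tracker& t) : member(t) {}
    Tracker member;
};

// Case 2: take by value, then move the parameter into the member.
struct StoresViaValue {
    explicit StoresViaValue(Tracker t) : member(std::move(t)) {}
    Tracker member;
};

inline void demo_sink_parameter() {
    Tracker a;

    reset_counts();
    StoresViaConstRef r1(a);             // lvalue: one copy
    assert(Tracker::copies == 1 && Tracker::moves == 0);

    reset_counts();
    StoresViaConstRef r2(std::move(a));  // rvalue binds to const&, still copied
    assert(Tracker::copies == 1 && Tracker::moves == 0);

    reset_counts();
    StoresViaValue v1(a);                // lvalue: copy, then move
    assert(Tracker::copies == 1 && Tracker::moves == 1);

    reset_counts();
    StoresViaValue v2(std::move(a));     // rvalue: move, then move
    assert(Tracker::copies == 0 && Tracker::moves == 2);
}
```

For a type whose move is much cheaper than its copy, the last case is the one that pays off: the by-value version never copies when the caller hands over an rvalue.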

A historical note:

In fact, any modern compiler should be able to figure out when passing by value is expensive, and implicitly convert the call to use a const ref if possible.

In theory. In practice, compilers can’t always change this without breaking the function’s binary interface. In some special cases (when the function is inlined) the copy will actually be elided if the compiler can figure out that the original object won’t be changed through the actions in the function.

But in general the compiler can’t determine this, and the advent of move semantics in C++ has made this optimisation much less relevant.

¹ E.g. in Scott Meyers, Effective C++.

² This is especially often true for object constructors, which may take arguments and store them internally to be part of the constructed object’s state.

Pass by value faster than pass by reference

A good way to find out why there are any differences is to check the disassembly. Here are the results I got on my machine with Visual Studio 2012.

With optimization flags, both functions generate the same code:

009D1270 57                   push        edi  
009D1271 FF 15 D4 30 9D 00    call        dword ptr ds:[9D30D4h]  
009D1277 8B F8                mov         edi,eax  
009D1279 FF 15 D4 30 9D 00    call        dword ptr ds:[9D30D4h]  
009D127F 8B 0D 48 30 9D 00    mov         ecx,dword ptr ds:[9D3048h]  
009D1285 2B C7                sub         eax,edi  
009D1287 50                   push        eax  
009D1288 E8 A3 04 00 00       call        std::operator<<<std::char_traits<char> > (09D1730h)  
009D128D 8B C8                mov         ecx,eax  
009D128F FF 15 2C 30 9D 00    call        dword ptr ds:[9D302Ch]  
009D1295 33 C0                xor         eax,eax  
009D1297 5F                   pop         edi  
009D1298 C3                   ret

This is basically equivalent to:

#include <ctime>
#include <iostream>

using namespace std ;

int main ()
{
    clock_t start, stop ;
    start = clock () ;
    stop = clock () ;
    cout << "time: " << stop - start ;
    return 0 ;
}

Without optimization flags, you will probably get different results.

function (no optimizations):

00114890 55                   push        ebp  
00114891 8B EC                mov         ebp,esp  
00114893 81 EC C0 00 00 00    sub         esp,0C0h  
00114899 53                   push        ebx  
0011489A 56                   push        esi  
0011489B 57                   push        edi  
0011489C 8D BD 40 FF FF FF    lea         edi,[ebp-0C0h]  
001148A2 B9 30 00 00 00       mov         ecx,30h  
001148A7 B8 CC CC CC CC       mov         eax,0CCCCCCCCh  
001148AC F3 AB                rep stos    dword ptr es:[edi]  
001148AE 8B 45 08             mov         eax,dword ptr [ptr]  
001148B1 8B 08                mov         ecx,dword ptr [eax]  
001148B3 6B C9 05             imul        ecx,ecx,5  
001148B6 8B 55 08             mov         edx,dword ptr [ptr]  
001148B9 89 0A                mov         dword ptr [edx],ecx  
001148BB 5F                   pop         edi  
001148BC 5E                   pop         esi  
001148BD 5B                   pop         ebx  
001148BE 8B E5                mov         esp,ebp  
001148C0 5D                   pop         ebp  
001148C1 C3                   ret

function2 (no optimizations):

00FF4850 55                   push        ebp  
00FF4851 8B EC                mov         ebp,esp  
00FF4853 81 EC C0 00 00 00    sub         esp,0C0h  
00FF4859 53                   push        ebx  
00FF485A 56                   push        esi  
00FF485B 57                   push        edi  
00FF485C 8D BD 40 FF FF FF    lea         edi,[ebp-0C0h]  
00FF4862 B9 30 00 00 00       mov         ecx,30h  
00FF4867 B8 CC CC CC CC       mov         eax,0CCCCCCCCh  
00FF486C F3 AB                rep stos    dword ptr es:[edi]  
00FF486E 8B 45 08             mov         eax,dword ptr [val]  
00FF4871 6B C0 05             imul        eax,eax,5  
00FF4874 89 45 08             mov         dword ptr [val],eax  
00FF4877 5F                   pop         edi  
00FF4878 5E                   pop         esi  
00FF4879 5B                   pop         ebx  
00FF487A 8B E5                mov         esp,ebp  
00FF487C 5D                   pop         ebp  
00FF487D C3                   ret

Why is pass by value faster (in the no optimization case)?

Well, function() has two extra mov operations. Let's take a look at the first extra mov operation:

001148AE 8B 45 08             mov         eax,dword ptr [ptr]  
001148B1 8B 08                mov         ecx,dword ptr [eax]  
001148B3 6B C9 05             imul        ecx,ecx,5

Here we are dereferencing the pointer. We first load the pointer itself (the address stored in ptr) into register eax. Then we load the value it points to into register ecx. Finally, we multiply that value by five. In function2 (), we already have the value, so we avoid the extra load.

Let's look at the second extra mov operation:

001148B3 6B C9 05             imul        ecx,ecx,5  
001148B6 8B 55 08             mov         edx,dword ptr [ptr]  
001148B9 89 0A                mov         dword ptr [edx],ecx

Now we go in the opposite direction. We have just finished multiplying the value by 5, and we need to store the result back through the pointer: ptr is reloaded into edx, and ecx is written to the address it holds.

Because function2 () does not have to deal with referencing and dereferencing a pointer, it gets to skip these two extra mov operations.
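The source of the two functions is not shown above. Judging from the parameter names ptr and val in the disassembly and the multiplication by 5, they were probably something like the following. This is a reconstruction and an assumption, not the original code.

```cpp
#include <cassert>

// Assumed shape of function(): reads through the pointer, multiplies,
// and writes the result back through the pointer.
void function(int* ptr) {
    *ptr = *ptr * 5;
}

// Assumed shape of function2(): works on its own local copy, so the
// caller's variable is not modified.
void function2(int val) {
    val = val * 5;
}

inline void demo_functions() {
    int n = 3;
    function(&n);
    assert(n == 15);   // modified through the pointer

    int m = 3;
    function2(m);
    assert(m == 3);    // the copy was multiplied, not m
}
```

Note that the two functions do not do the same work from the caller's point of view, which is part of why the optimizer could discard both calls in the benchmark.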

When is a const reference better than pass-by-value in C++11?

The general rule of thumb is to pass by value when you would end up making a copy anyway. That is to say that rather than doing this:

void f(const std::vector<int>& x) {
    std::vector<int> y(x);
    // stuff
}

where you first pass a const-ref and then copy it, you should do this instead:

void f(std::vector<int> x) {
    // work with x instead
}

This was already partially true in C++03, and it has become more useful with move semantics, as the copy may be replaced by a move in the pass-by-value case when the function is called with an rvalue.

Otherwise, when all you want to do is read the data, passing by const reference is still the preferred, efficient way.
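A short sketch of both halves of the rule, using hypothetical function names: sorted_copy needs its own copy to modify, so it takes the vector by value, while total only reads, so it takes a reference-to-const.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Needs a private, modifiable copy anyway: take the argument by value.
// Callers with an rvalue (e.g. std::move(v)) pay for a move, not a copy.
std::vector<int> sorted_copy(std::vector<int> x) {
    std::sort(x.begin(), x.end());
    return x;
}

// Only reads the data: take it by reference-to-const, no copy is made.
long total(const std::vector<int>& x) {
    return std::accumulate(x.begin(), x.end(), 0L);
}

inline void demo_vector_passing() {
    std::vector<int> v;
    v.push_back(3);
    v.push_back(1);
    v.push_back(2);

    std::vector<int> s = sorted_copy(v);   // v is copied and left untouched
    assert(s[0] == 1 && s[1] == 2 && s[2] == 3);
    assert(v[0] == 3);
    assert(total(v) == 6);

    std::vector<int> s2 = sorted_copy(std::move(v));  // v's buffer is moved
    assert(s2.size() == 3 && s2[0] == 1);
}
```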

What is the role of passing by reference when you do not modify variables?

When you pass by value, you must make a copy of the object. Depending on what type is used to instantiate the template, this can be expensive, or even impossible, because some objects are not copyable.
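As an example of the "impossible" case, std::unique_ptr is not copyable, so a template that took its argument by value could not be called with a named unique_ptr without giving up ownership. The helper below, holds_object, is a hypothetical name chosen for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// std::unique_ptr is move-only: it has no copy constructor.
static_assert(!std::is_copy_constructible<std::unique_ptr<int> >::value,
              "unique_ptr cannot be copied");

// Taking the argument by reference-to-const works for copyable and
// non-copyable types alike, because nothing is copied.
template <typename T>
bool holds_object(const T& handle) {
    return static_cast<bool>(handle);
}

inline void demo_non_copyable() {
    std::unique_ptr<int> p(new int(5));
    assert(holds_object(p));   // p is only observed
    assert(*p == 5);           // and still owns its int afterwards

    std::unique_ptr<int> empty;
    assert(!holds_object(empty));
}
```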

Is there any specific case where pass-by-value is preferred over pass-by-const-reference in C++?

Built-in types and small objects (such as STL iterators) should normally be passed by value.

This is partly to increase the compiler's opportunities for optimisation. It's surprisingly hard for the compiler to know if a reference parameter is aliasing another parameter or global - it may have to reread the state of the object from memory a number of times through the function, to be sure the value hasn't changed.

This is the reason for C99's restrict keyword (the same issue but with pointers).
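The aliasing problem is easy to provoke deliberately. In this sketch (function names are hypothetical), the reference parameter factor may refer to an element of the very vector being modified, so the compiler generally cannot keep it in a register across the loop, and the results really do differ.

```cpp
#include <cassert>
#include <vector>

// factor may alias an element of data, so it has to be treated as
// potentially changing on every iteration.
void scale_by_ref(std::vector<int>& data, const int& factor) {
    for (int& d : data) d *= factor;
}

// factor is a private copy; nothing in the loop can change it.
void scale_by_value(std::vector<int>& data, int factor) {
    for (int& d : data) d *= factor;
}

inline void demo_aliasing() {
    std::vector<int> a;
    a.push_back(2); a.push_back(3); a.push_back(4);
    scale_by_ref(a, a[0]);     // factor IS a[0]: it becomes 4 after step one
    assert(a[0] == 4 && a[1] == 12 && a[2] == 16);

    std::vector<int> b;
    b.push_back(2); b.push_back(3); b.push_back(4);
    scale_by_value(b, b[0]);   // factor is a copy of 2 throughout
    assert(b[0] == 4 && b[1] == 6 && b[2] == 8);
}
```

With the by-value version the compiler knows the factor is loop-invariant without having to prove anything about aliasing.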

Are there benefits of passing by pointer over passing by reference in C++?

A pointer parameter can receive NULL; a reference parameter cannot. If there's ever a chance that you could want to pass "no object", then use a pointer instead of a reference.

Also, passing by pointer allows you to explicitly see at the call site whether the object is passed by value or by reference:

// Is mySprite passed by value or by reference?  You can't tell 
// without looking at the definition of func()
func(mySprite);

// func2 passes "by pointer" - no need to look up function definition
func2(&mySprite);
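A minimal sketch of the "no object" case. Sprite here is a hypothetical stand-in with a single field, and the two functions are illustrative, not the func and func2 from the snippet above.

```cpp
#include <cassert>

// Hypothetical stand-in for a sprite type.
struct Sprite {
    int x;
};

// Reference parameter: the caller must supply an object.
void nudge(Sprite& s) { ++s.x; }

// Pointer parameter: the caller may pass a null pointer to mean
// "no object", and the function has to check for it.
bool nudge_if_present(Sprite* s) {
    if (s == nullptr) return false;
    ++s->x;
    return true;
}

inline void demo_pointer_vs_reference() {
    Sprite sprite = {0};
    nudge(sprite);                          // call site looks like by-value
    assert(sprite.x == 1);
    assert(nudge_if_present(&sprite));      // & makes the indirection visible
    assert(sprite.x == 2);
    assert(!nudge_if_present(nullptr));     // "no object" is representable
}
```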
