Performance Cost of Passing by Value vs. by Reference or by Pointer

Performance cost of passing by value vs. by reference or by pointer?

It depends on what you mean by "cost", and properties of the host system (hardware, operating system) with respect to operations.

If your cost measure is memory usage, then the calculation of cost is obvious - add up the sizes of whatever is being copied.

If your measure is execution speed (or "efficiency") then the game is different. Hardware (and operating systems and compilers) tends to be optimised for copying things of particular sizes, by virtue of dedicated circuits (machine registers, and how they are used).

It is common, for example, for a machine to have an architecture (machine registers, memory architecture, etc.) that results in a "sweet spot": copying variables of some size is most "efficient", but copying larger OR SMALLER variables is less so. Larger variables will cost more to copy, because there may be a need to do multiple copies of smaller chunks. Smaller ones may also cost more, because the compiler needs to copy the smaller value into a larger variable (or register), do the operations on it, then copy the value back.

Examples with floating point include some Cray supercomputers, which natively support double precision floating point (aka double in C++), while all operations on single precision (aka float in C++) are emulated in software. Some older 32-bit x86 CPUs also worked internally with 32-bit integers, and operations on 16-bit integers required more clock cycles due to translation to/from 32-bit (this is not true with more modern 32-bit or 64-bit x86 processors, as they allow copying 16-bit integers to/from 32-bit registers, and operating on them, with fewer such penalties).

It is a bit of a no-brainer that copying a very large structure by value will be less efficient than creating and copying its address. But, because of factors like the above, the cross-over point between "best to copy something of that size by value" and "best to pass its address" is less clear.

Pointers and references tend to be implemented in a similar manner (e.g. pass by reference can be implemented in the same way as passing a pointer) but that is not guaranteed.

The only way to be sure is to measure it. And realise that the measurements will vary between systems.
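
A minimal sketch of such a measurement, assuming a made-up 256-byte Payload type and two hypothetical functions that do identical work. The helper time_calls is also an invention for this example. An optimising compiler may inline both calls and erase the difference, which is itself a useful thing to learn from measuring.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical 256-byte payload; change the array length to probe other sizes.
struct Payload {
    std::int32_t data[64];
};

// Same work, two parameter-passing styles.
std::int64_t sum_by_value(Payload p) {
    std::int64_t total = 0;
    for (std::int32_t v : p.data) total += v;
    return total;
}

std::int64_t sum_by_ref(const Payload& p) {
    std::int64_t total = 0;
    for (std::int32_t v : p.data) total += v;
    return total;
}

// Calls f(p) `iterations` times and returns the elapsed nanoseconds.
// Results are accumulated into `sink` so the calls are not dead code.
template <typename F>
std::int64_t time_calls(F f, const Payload& p, int iterations, std::int64_t& sink) {
    assert(iterations > 0);
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) sink += f(p);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_calls(sum_by_value, p, n, sink) and time_calls(sum_by_ref, p, n, sink) with the same p and n gives two timings to compare. Repeat the runs, and expect different answers on different machines and optimisation levels.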

Which is faster? Pass by reference vs pass by value C++

The answer to "Which is faster" is usually "It depends".

If instead of passing four bytes of data you are passing an eight byte pointer to data, then you can't really expect that to make things faster. If instead of passing 100 bytes of data you are passing an eight byte pointer to data, that's different.

But now the function doesn't have the data, it only has a reference. So whenever it needs to read the data, it has to do that indirectly through the reference. That takes longer. If you pass a 100 byte object and only read eight bytes of it, you are still likely to win. But if you actually read all the data, and maybe multiple times, then it could easily be faster to pass the value even for large objects.

The real difference comes when you pass an object, and passing by value means a more or less complex constructor will be called. Passing by reference means no constructor. But int has no constructor anyway.
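
To make the constructor cost visible, here is a small sketch with a hypothetical Tracked type that counts its own copy constructions. The type and the two length functions are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical type that counts how often it is copy-constructed.
struct Tracked {
    static int copies;
    std::string name;

    explicit Tracked(std::string n) : name(std::move(n)) {}
    Tracked(const Tracked& other) : name(other.name) { ++copies; }
};
int Tracked::copies = 0;

// By value: the parameter is built with the copy constructor.
std::size_t length_by_value(Tracked t) { return t.name.size(); }

// By reference: no constructor runs for the parameter.
std::size_t length_by_ref(const Tracked& t) { return t.name.size(); }

void copy_count_example() {
    Tracked t("abc");
    int before = Tracked::copies;
    length_by_ref(t);
    assert(Tracked::copies == before);      // no copy was made
    length_by_value(t);
    assert(Tracked::copies == before + 1);  // one copy built the parameter
}
```

For an int there is nothing comparable to count, which is the point of the last sentence above.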

And then there is optimisation. Passing by value means the compiler knows your function is the only one with access to the data. Passing by reference means the data could be anywhere. If you have two int& parameters, I could pass the same int twice. So increasing row might increase pos. Or it might not. That kills optimisations.
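
The aliasing problem can be sketched as follows. The parameter names row and pos follow the text, while the two functions themselves are hypothetical and exist only to show the effect.

```cpp
#include <cassert>

// Reference parameters may alias: after the write to `row`, the compiler has
// to re-read `pos`, because both names might refer to the same int.
int advance_by_ref(int& row, int& pos) {
    row += 1;
    return pos;
}

// Value parameters are private copies, so `pos` cannot change here.
int advance_by_value(int row, int pos) {
    row += 1;
    (void)row;  // the incremented copy is deliberately unused
    return pos;
}

void aliasing_example() {
    int a = 5;
    int b = 10;
    int distinct = advance_by_ref(a, b);  // two different objects
    assert(distinct == 10);
    assert(a == 6);

    int c = 5;
    int aliased = advance_by_ref(c, c);   // the same int passed twice
    assert(aliased == 6);                 // increasing row also increased pos
}
```

In advance_by_value the compiler is free to keep pos in a register for the whole function. In advance_by_ref it generally cannot, unless it can see every caller.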

And then there is the rule of optimisation: "Measure it". You measured it and found what's faster. Sometimes things are faster or slower for no good reason whatsoever.

Is it better in C++ to pass by value or pass by reference-to-const?

It used to be generally recommended best practice [1] to use pass by const ref for all types, except for builtin types (char, int, double, etc.), for iterators and for function objects (lambdas, classes deriving from std::*_function).

This was especially true before the existence of move semantics. The reason is simple: if you passed by value, a copy of the object had to be made and, except for very small objects, this is always more expensive than passing a reference.

With C++11, we have gained move semantics. In a nutshell, move semantics permit that, in some cases, an object can be passed “by value” without copying it. In particular, this is the case when the object that you are passing is an rvalue.

In itself, moving an object is still at least as expensive as passing by reference. However, in many cases a function will internally copy an object anyway, i.e. it will take ownership of the argument. [2]

In these situations we have the following (simplified) trade-off:

  1. We can pass the object by reference, then copy internally.
  2. We can pass the object by value.

“Pass by value” still causes the object to be copied, unless the object is an rvalue. In the case of an rvalue, the object can be moved instead, so that the second case is suddenly no longer “copy, then move” but “move, then (potentially) move again”.

For large objects that implement proper move constructors (such as vectors, strings …), the second case is then vastly more efficient than the first. Therefore, it is recommended to use pass by value if the function takes ownership of the argument, and if the object type supports efficient moving.
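
A minimal sketch of that recommendation, assuming a hypothetical Document class whose constructor stores its argument. The class and member names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class whose constructor takes ownership of its argument.
class Document {
public:
    // Take by value, then move into the member.
    // An lvalue argument costs one copy plus one move;
    // an rvalue argument costs two moves and no copy.
    explicit Document(std::vector<std::string> lines) : lines_(std::move(lines)) {}

    std::size_t line_count() const { return lines_.size(); }

private:
    std::vector<std::string> lines_;
};

void sink_example() {
    std::vector<std::string> text{"alpha", "beta", "gamma"};

    Document copied(text);               // lvalue: text is copied and stays intact
    assert(text.size() == 3);
    assert(copied.line_count() == 3);

    Document moved_in(std::move(text));  // rvalue: the buffer is moved, not copied
    assert(moved_in.line_count() == 3);
    // text is now in a valid but unspecified state and should not be relied on.
}
```

With a const reference parameter instead, both constructions would copy the vector into the member, including the one where the caller no longer needs its own copy.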


A historical note:

In fact, any modern compiler should be able to figure out when passing by value is expensive, and implicitly convert the call to use a const ref if possible.

In theory. In practice, compilers can’t always change this without breaking the function’s binary interface. In some special cases (when the function is inlined) the copy will actually be elided if the compiler can figure out that the original object won’t be changed through the actions in the function.

But in general the compiler can’t determine this, and the advent of move semantics in C++ has made this optimisation much less relevant.


[1] E.g. in Scott Meyers, Effective C++.

[2] This is especially often true for object constructors, which may take arguments and store them internally to be part of the constructed object’s state.

C++ passing arguments by reference and pointer

The pointer and the reference methods should be quite comparable (both in speed, memory usage and generated code).

Passing a class directly forces the compiler to duplicate memory and put a copy of the bar object on the stack. What's worse, in C++ there are all sorts of nasty bits (the default copy constructor and whatnot) associated with this.

In C I always use (possibly const) pointers. In C++ you should likely use references.

Where should I prefer pass-by-reference or pass-by-value?

There are four main cases where you should use pass-by-reference over pass-by-value:

  1. If you are calling a function that needs to modify its arguments, use pass-by-reference or pass-by-pointer. Otherwise, the function only modifies its own copy of the argument.
  2. If you're calling a function that needs to take a large object as a parameter, pass it by const reference to avoid making an unnecessary copy of that object and taking a large efficiency hit.
  3. If you're writing a copy or move constructor which by definition must take a reference, use pass by reference.
  4. If you're writing a function that wants to operate on a polymorphic class, use pass by reference or pass by pointer to avoid slicing.
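
The slicing mentioned in the last case can be sketched like this. Shape and Circle are hypothetical types used only for illustration.

```cpp
#include <cassert>
#include <string>

struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

struct Circle : Shape {
    std::string name() const override { return "circle"; }
};

// By value: only the Shape part of the argument is copied (slicing),
// so the call always reaches Shape::name.
std::string name_by_value(Shape s) { return s.name(); }

// By reference: the dynamic type is preserved and the override is called.
std::string name_by_ref(const Shape& s) { return s.name(); }

void slicing_example() {
    Circle c;
    assert(name_by_value(c) == "shape");   // the Circle part was sliced off
    assert(name_by_ref(c) == "circle");    // virtual dispatch still works
}
```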

What is more efficient: pass parameter by pointer or by value?

In nearly all code, as long as we're dealing with small/simple objects, the overhead of copying the object, vs. passing it as a pointer or reference is pretty small.

Obviously, if we make a std::string with a large chunk of text in it, it will take quite some time to copy it, relative to just passing a reference.

However, the primary objective at ANY TIME when writing code is to focus on correctness. If you have "large" objects, then use const Type &val if the value is not being modified - that way, you can't accidentally modify it. If the object is to be modified, then you NEED to use a reference or pointer to get the updates back to the caller of the function.

It is entirely possible to make code that runs noticeably slower with a reference than with a value. I was once looking into the performance of some code that we were working on, and found a function that looked something like this:

void SomeClass::FindThing(int &thing)
{
    // Every update of thing goes through the reference.
    for (thing = 0; someVector[thing] != 42; thing++)
        ;
}

It really looks rather innocent, but since each update of thing means an indirect memory access [at least in the compiler we used, which was certainly not "rubbish"], it was taking quite a lot of time out of the entire process [it was also called twice as often as necessary].

I rewrote it as:

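void SomeClass::FindThing(int &thing)
{
    int i;  // declared outside the loop so it is still in scope afterwards
    for (i = 0; someVector[i] != 42; i++)
        ;
    thing = i;  // a single write through the reference
}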

And the function ran about 4x faster. Taking out the second, unnecessary call as well, we ended up with about 30% faster runtime. This was in a "fonts benchmark", and this was one of several dozen functions involved in the "draw fonts to screen". It's scary how a simple, innocent looking function can make a BIG difference to performance.

C++: Why pass-by-value is generally more efficient than pass-by-reference for built-in (i.e., C-like) types

A compiler vendor would typically implement a reference as a pointer. Pointers tend to be the same size as or larger than many of the built-in types. For these built-in types the same amount of data would be passed whether you passed by value or by reference. In the function, in order to get the actual data, you would however need to dereference this internal pointer. This can add an instruction to the generated code, and you will also have two memory locations that may not be in cache. The difference won't be much - but it could be measured in tight loops.

A compiler vendor could choose to disregard const references (and sometimes also non-const references) when they're used on built-in types - all depending on the information available to the compiler when it deals with the function and its callers.

What is the overhead of passing a reference?

Regarding complexity, returning or passing a reference is just like passing a pointer. Its overhead is equivalent to passing an integer the size of a pointer, plus a few instructions. In short, that is as fast as is possible in nearly every case. Builtin types (e.g. int, float) less than or equal to the size of a pointer are the obvious exception.

At worst, passing/returning a reference can add a few instructions or disable some optimizations. Those losses rarely exceed the costs of returning/passing objects by value (e.g. calling a copy constructor + destructor is much higher, even for a very basic object). Passing/returning by reference is a good default unless every instruction counts, and you have measured that difference.

Therefore, using references has incredibly low overhead.

One can't really quantify how much faster it would be without knowing the complexity of your types and their constructors/destructors. If it is not a builtin type, then holding a local and returning it by reference will be fastest in most cases. It all depends on the complexity of the object and its copy, but only incredibly trivial objects could come close to the speed of the reference.

Are there benefits of passing by pointer over passing by reference in C++?

A pointer can receive a NULL argument; a reference parameter cannot. If there's ever a chance that you could want to pass "no object", then use a pointer instead of a reference.

Also, passing by pointer allows you to explicitly see at the call site whether the object is passed by value or by reference:

// Is mySprite passed by value or by reference?  You can't tell 
// without looking at the definition of func()
func(mySprite);

// func2 passes "by pointer" - no need to look up function definition
func2(&mySprite);
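
Both points can be seen in a short sketch. The Sprite type and the two nudge functions are hypothetical stand-ins, since the definitions of func and func2 are not shown.

```cpp
#include <cassert>

// Hypothetical stand-in for the type of mySprite.
struct Sprite {
    int x = 0;
};

// Pointer parameter: nullptr means "no object", so the function must check.
bool nudge_ptr(Sprite* sprite) {
    if (sprite == nullptr) return false;  // nothing to move
    sprite->x += 1;
    return true;
}

// Reference parameter: always refers to an object, so no check is needed.
void nudge_ref(Sprite& sprite) { sprite.x += 1; }

void pointer_vs_reference_example() {
    Sprite mySprite;

    bool moved = nudge_ptr(&mySprite);   // the & at the call site signals mutation
    assert(moved);
    assert(mySprite.x == 1);

    nudge_ref(mySprite);                 // looks exactly like a by-value call
    assert(mySprite.x == 2);

    bool nothing = nudge_ptr(nullptr);   // "no object" is a legal argument
    assert(!nothing);
}
```

The price of the pointer version is the null check, which every such function has to remember to perform.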

