C++ Performance of Accessing Member Variables Versus Local Variables


Executive summary: In virtually all scenarios, it doesn't matter, but there is a slight advantage for local variables.

Warning: You are micro-optimizing. You will end up spending hours trying to understand code that is supposed to win a nanosecond.

Warning: In your scenario, the question shouldn't be performance but the role of the variables: are they temporaries, or part of the state of thisClass?

Warning: First, second and last rule of optimization: measure!


First of all, look at the typical assembly generated for x86 (your platform may vary):

// stack variable: load into eax
mov eax, [esp+10]

// member variable: load into eax
mov ecx, [address of object]
mov eax, [ecx+4]

Once the address of the object is loaded into a register, the instructions are identical. Loading the object address can usually be paired with an earlier instruction, so it doesn't add to execution time.

But this means the ecx register isn't available for other optimizations. However, modern CPUs do some intense trickery to make that less of an issue.

Also, when accessing many objects this may cost you extra. However, this is less than one cycle on average, and there are often more opportunities for pairing instructions.

Memory locality: here's a chance for the stack to win big time. The top of the stack is virtually always in the L1 cache, so the load takes one cycle. The object is more likely to have been pushed back to the L2 cache (rule of thumb: 10 cycles) or to main memory (100 cycles).

However, you pay this only for the first access. If all you have is a single access, the 10 or 100 cycles are unnoticeable. If you have thousands of accesses, the object data will be in the L1 cache, too.

In summary, the gain is so small that it virtually never makes sense to copy member variables into locals to achieve better performance.
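
To make the pattern concrete, here is a minimal sketch of what "copying a member into a local" looks like; the class and all names are invented for illustration:

#include <vector>

class Accumulator {
    int total_ = 0;
public:
    // hypothetical example: work on a local copy, write the member back once
    void addAll(const std::vector<int>& values) {
        int local = total_;        // member -> local (likely ends up in a register)
        for (int v : values)
            local += v;            // the loop touches only the local
        total_ = local;            // single store back to the member
    }
};

In practice a good optimizer will often generate very similar code for the straightforward version that increments total_ directly, which is exactly the point of the summary above.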

C++ performance: Local variable vs data member

On x86-64, I'd expect this code to end up with both now and delta allocated in RAX. In assembly language, the code would look something like this:

assume RSI:ptr _Delta
call steady_clock::now()
sub rax, [rsi].last
mov [rsi].last, rax
ret

Of course, in real assembly language, you'd see the mangled names for steady_clock::now() (for one example), but you get the general idea. Upon entry to any non-static member function, it's going to have this in some register. The return value always goes in rax. I don't see any particularly good reason a compiler would need (or even want) to allocate space for any other variables.
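
For context, the member function being discussed is presumably something on this order; the struct name and the member last appear in the answer's assembly, but the function name and exact types are assumptions:

#include <chrono>

struct Delta {
    std::chrono::steady_clock::time_point last;

    // returns the time elapsed since the previous call
    std::chrono::steady_clock::duration tick() {
        auto now   = std::chrono::steady_clock::now();  // fits in a register (RAX)
        auto delta = now - last;                        // likewise, no stack slot needed
        last = now;                                     // the only store to the object
        return delta;
    }
};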

On 32-bit x86, there's a much higher likelihood that this would end up using some stack space, though it's possible that it would return a 64-bit value in EDX:EAX, in which case things would end up fairly similar to what's above, just using one more register.

Most other processors start out with more registers than an x86, so the register pressure is lower. On a SPARC, for example, a routine will normally start with 8 local registers free and ready for use, so allocating now in a register would be a near certainty.

Bottom line: you're unlikely to see a significant speed difference, but if you do see a difference, I'd guess it's more likely to favor using a local variable than a member variable.

C++ local vs member variable performance

That is because your compiler can see that you never read i in local::incr. If you don't read it, there is no need to increment it, so the compiler can just optimize away everything that has to do with local. And doing nothing is, of course, faster than doing anything.

However, I doubt that you compiled with full optimization; otherwise the compiler would also have seen that the code related to Member doesn't do anything, and you would have gotten 0 both times. Good optimizers can see enough to optimize even the loops away, since they have no side effects.
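
A rough reconstruction of the kind of benchmark this answer is talking about; only the identifiers local, incr, i, and Member come from the answer, the rest is guessed:

struct local {
    void incr() { int i = 0; ++i; }   // i is never read again: the whole body is dead code
};

struct Member {
    int i = 0;
    void incr() { ++i; }              // equally dead if nothing ever reads the member afterwards
};

int main() {
    local l;
    Member m;
    for (long n = 0; n < 100000000L; ++n) l.incr();  // optimized away entirely
    for (long n = 0; n < 100000000L; ++n) m.incr();  // also removable at full optimization
}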

Local variables vs method access

When you use a . (access operator), it means that the runtime environment first needs to find out where the object is located, then calculate at which position the field is located, and then fetch that value.

If you access a property, it is even worse, because the runtime environment sometimes needs to make a call (as @RowlandShaw states, access to a property can sometimes be inlined, in which case it makes no difference whether you access the field directly or use the property). And a property does not always provide the value instantly: there can be calculations in between.

A local variable, on the other hand, can be accessed directly; in many cases it will be put in a register as well, which boosts performance further (roughly a factor of 5-10). So indeed a local variable will nearly always be faster.

If the field you aim to access is read-only, a smart compiler can optimize the access away, because it knows for sure that the value won't change. But in many cases (for instance if you access a property) you cannot know whether the value has changed between two calls, because another thread may have modified the object concurrently.

There is not much difference between var being a reference type or a value type. In the case of a reference type, only the reference is copied into the local; in the case of a value type (primitives and structs), the value itself is copied.

The only possible performance downside of using a local variable is if var is a large struct. In that case, copying the values can result in significant overhead.

Short answer: If you know the value of the thing you access won't change (or you don't care much if it does), you'd better store it in a local variable for performance reasons. Otherwise, keep accessing the field/property. In one rare case this can result in overhead: if it is a value type and the type is extremely large, the struct must be copied, which can take a large number of instructions.
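
In C++ terms, the recommended pattern looks roughly like this; Config, scale, and the other names are made up for the example:

#include <vector>

struct Config {
    int scale_ = 3;
    int scale() const { return scale_; }   // trivial accessor, often inlined anyway
};

// hoist the accessed value into a local when you know it cannot change in the loop
int sumScaled(const Config& cfg, const std::vector<int>& data) {
    const int scale = cfg.scale();         // read once into a local (likely a register)
    int sum = 0;
    for (int v : data)
        sum += v * scale;                  // no repeated call or member load in the loop
    return sum;
}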

Are global variables faster than local variables in C?

It's rather inaccurate.

If you study computer architecture, you'll find that the fastest storage is the registers, followed by the caches, followed by RAM. The thing about local variables is that the compiler optimizes them to be kept in registers if possible, or in the cache if not. This is why local variables are faster.

For embedded systems, sure, it might be possible to compile to a tiny memory model, in which case your data segment may well fit into a modern controller's SRAM cache. But in such cases your local variable usage would also be very tight, such that the locals are probably operating entirely out of registers.

Conclusion: In most cases, local variables will be faster than global variables.
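
A small sketch of the difference being described; the variable names are invented:

int g_counter = 0;                  // global: lives in the data segment

void bump_global(int n) {
    for (int i = 0; i < n; ++i)
        ++g_counter;                // the final value must end up back in memory
}

int bump_local(int n) {
    int counter = 0;                // local: free to live entirely in a register
    for (int i = 0; i < n; ++i)
        ++counter;
    return counter;                 // (a good optimizer reduces the loop to simple arithmetic)
}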

Use local variables or access multiple times a struct value (C++)

Generally speaking, in C/C++ it doesn't matter. In C and C++, the memory layout of every structure is known at compile time. When you type arr[i].path.to.value, that's going to be essentially the same as *(&arr[0] + i * (something) + offset_1 + offset_2 + offset_3), and all of that will get simplified at compile time to something like *(&arr[0] + i * (something) + something). Those somethings will be computed by the compiler and hard-coded into the binary, so effectively looking up arr[i].path.to is not faster than arr[i].path.to.value.

This is not mandated by the standard or anything as far as I know, but it's how most compilers will actually work.

If you want to be sure in some specific case, you can look at godbolt and see what assembly it cooks up: http://gcc.godbolt.org/

Note that I'm assuming that when you make the local variable, you are taking a reference to the value arr[i].path.to.value, which is most similar to what you do in JavaScript. If you actually copy the value to a new variable, that will create some copying overhead. I don't think copying it would be advantageous with respect to cache misses unless the usage pattern is pretty complicated. Once you access arr[i].path.to.value once, all the data around it is going to be in the cache, and there's no reason copying it onto the stack would make anything faster.
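
A short sketch of the two variants under discussion; the nested field names follow the answer's arr[i].path.to.value example, while the actual struct layout is assumed:

#include <vector>

struct To   { int value; };
struct Path { To to; };
struct Item { Path path; };

int sum_direct(const std::vector<Item>& arr) {
    int sum = 0;
    for (const Item& item : arr)
        sum += item.path.to.value;          // one load at a compile-time-constant offset
    return sum;
}

int sum_via_ref(const std::vector<Item>& arr) {
    int sum = 0;
    for (const Item& item : arr) {
        const int& v = item.path.to.value;  // a reference to the same address, no copy
        sum += v;                           // compiles to the same load as sum_direct
    }
    return sum;
}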


