What Is More Efficient, i++ or ++i

What is more efficient, i++ or ++i?

i++ :

  • create a temporary copy of i
  • increment i
  • return the temporary copy

++i :

  • increment i
  • return i

With optimizations on, it is quite possible that the resulting assembly is identical; conceptually, however, ++i is more efficient.

Edit: keep in mind that in C++, i may be any object that supports the prefix and postfix ++ operators. For complex objects, the cost of the temporary copy is not negligible.
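
As a rough sketch of why that copy can hurt (my own illustration, not part of the original answer; the BigCounter type is invented for the example), consider a class whose copy constructor has to duplicate a large buffer:

#include <vector>

// Hypothetical counter type invented purely for illustration.
struct BigCounter {
    std::vector<int> history;   // large member: makes copying expensive
    long value = 0;

    BigCounter& operator++() {      // prefix: increment in place, no copy
        ++value;
        return *this;
    }

    BigCounter operator++(int) {    // postfix: must preserve the old state
        BigCounter old(*this);      // copies the whole history vector
        ++value;
        return old;                 // the copy is returned, often just to be discarded
    }
};

Writing c++ on a BigCounter pays for copying history on every increment, while ++c never does.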

Why is ++i more efficient than i++?

i++ increments i and returns the initial value of i, which means:

int i = 1;
i++; // == 1 but i == 2

But ++i returns the actual incremented value:

int i = 1;
++i; // == 2 and i == 2 too, so no need for a temporary variable

In the first case, the compiler has to create a temporary variable (when the value is actually used) in order to return 1 instead of 2 (when it is not a compile-time constant, of course, but a dynamic value, for example the return value of a call).

In the second case, it does not have to, so the second case is guaranteed to be at least as efficient.

Often, the compiler will be able to optimize the first case into the second case, but sometimes it may not be able to.

Anyway, for built-in types we're talking about a negligible impact.

But with more complicated objects, such as iterator-like ones, keeping a temporary copy of the old state can be noticeably slower when the loop runs millions of times.

Rule of thumb

Use the prefix version unless you specifically need the postfix semantics.
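
For example, a typical iterator loop (a generic sketch, not taken from any of the answers above) is written with the prefix form, since the old iterator value is never needed:

#include <iostream>
#include <map>
#include <string>

int main() {
    std::map<std::string, int> ages{{"alice", 30}, {"bob", 25}};

    // ++it avoids constructing a throwaway copy of the iterator on each pass.
    for (auto it = ages.begin(); it != ages.end(); ++it)
        std::cout << it->first << ": " << it->second << '\n';
}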

Is there a performance difference between i++ and ++i in C?

Executive summary: No.

i++ could potentially be slower than ++i, since the old value of i
might need to be saved for later use, but in practice all modern
compilers will optimize this away.

We can demonstrate this by looking at the code for this function,
both with ++i and i++.

$ cat i++.c
extern void g(int i);
void f()
{
int i;

for (i = 0; i < 100; i++)
g(i);

}

The files are the same, except for ++i and i++:

$ diff i++.c ++i.c
6c6
< for (i = 0; i < 100; i++)
---
> for (i = 0; i < 100; ++i)

We'll compile them, and also get the generated assembler:

$ gcc -c i++.c ++i.c
$ gcc -S i++.c ++i.c

And we can see that both the generated object and assembler files are the same.

$ md5 i++.s ++i.s
MD5 (i++.s) = 90f620dda862cd0205cd5db1f2c8c06e
MD5 (++i.s) = 90f620dda862cd0205cd5db1f2c8c06e

$ md5 *.o
MD5 (++i.o) = dd3ef1408d3a9e4287facccec53f7d22
MD5 (i++.o) = dd3ef1408d3a9e4287facccec53f7d22

Efficiency of ++i versus i++ in Java

Neither operation is atomic; each is composed of multiple steps (read, increment, write). Unlike in C++, these operators can't be overloaded, so in Java there is no difference between them in terms of performance.

The one and only difference you should mind between x++ and ++x is that x++ returns the value before it is incremented, and ++x returns the value after the incrementation.

Inspecting the compiled bytecode (for example with javap -c) bears this out: for a local int whose result is discarded, both forms compile to the same single iinc instruction.

Is there a performance difference between i++ and ++i in C++?

[Executive Summary: Use ++i if you don't have a specific reason to use i++.]

For C++, the answer is a bit more complicated.

If i is a simple type (not an instance of a C++ class), then the answer given for C ("No there is no performance difference") holds, since the compiler is generating the code.

However, if i is an instance of a C++ class, then i++ and ++i are making calls to one of the operator++ functions. Here's a standard pair of these functions:

Foo& Foo::operator++()   // called for ++i
{
    this->data += 1;
    return *this;
}

Foo Foo::operator++(int ignored_dummy_value)   // called for i++
{
    Foo tmp(*this);   // variable "tmp" cannot be optimized away by the compiler
    ++(*this);
    return tmp;
}

Since the compiler isn't generating the increment itself, but just calling an operator++ function, it generally cannot optimize away the tmp variable and its associated copy constructor. If the copy constructor is expensive, this can have a significant performance impact.
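
As a small usage sketch (assuming a Foo that is default-constructible and has the data member used above), the two spellings resolve to the two different overloads:

Foo f;     // assumes Foo has a default constructor
++f;       // calls Foo::operator++(): increments in place, returns a reference
f++;       // calls Foo::operator++(int): constructs a Foo copy that is thrown away here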

How is i++ more efficient than i = i + 1?

It depends on the optimisation. i++ can, on most processors, be represented as a single machine-language instruction. i = i + 1, on the other hand, could be represented by up to four: load i, load 1, add, store to i; although even a moderately smart compiler should be able to recognise that it can rewrite the latter into the former.
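
A minimal sketch of that claim (my own example; the function names are made up): two functions that differ only in how the increment is spelled. With a modern optimizing compiler such as gcc -O2 or clang -O2, both are expected to produce the same single add/increment instruction.

int incr_postfix(int i)
{
    i++;            // post-increment statement, result unused
    return i;
}

int incr_addition(int i)
{
    i = i + 1;      // explicit load, add, store as written
    return i;
}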

Is It More Efficient to Do a Less Than Comparison or a Less Than or Equal To Comparison?

It depends on the architecture.

The original von Neumann IAS architecture (1945) had only a >= comparison.

The Intel 8086 can use the LOOP label paradigm, which corresponds to do { } while (--cx > 0);.
On legacy architectures, LOOP was not only smaller but also faster. On modern architectures, LOOP is considered a complex operation and is slower than dec ecx; jnz label. When optimizing for size (-Os) this can still be significant.

A further consideration is that some (RISC) architectures have no explicit flag register, so a comparison can't come for free as a side effect of the loop decrement. Some RISC architectures also have a special 'zero' register, which means that comparison (and any other arithmetic operation) against zero is always available. RISCs with branch delay slots may even benefit from a post-decrement: do { } while (a-- > 0);

An optimizing compiler should be able to convert a simple loop, regardless of the syntax used, into the most optimized form for the given architecture. A complex loop is one with a dependency on the iteration variable, side effects, or both: for (i = 0; i < 5; i++) func(i);
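
As a sketch of the count-down pattern referred to above (my own illustration; sum_first_n is a made-up function and assumes n >= 1), the decrement itself drives the exit test, so comparing against zero comes essentially for free on architectures where the decrement sets the flags:

#include <cstddef>

long sum_first_n(std::size_t n)     // assumes n >= 1
{
    long sum = 0;
    do {
        sum += static_cast<long>(n);
    } while (--n > 0);              // decrement and test against zero in one step
    return sum;
}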

Why is i=i+1 faster than i++?

Where your loop is situated can have a large impact on performance. If your loop is inside a function, Flash will perform the calculations using local registers. The loop containing i++ thus produces the following opcodes:

000013 inclocal_i (REG_2)  ; increment i 
000015 inclocal_i (REG_3) ; increment j
000017 getlocal (REG_3) ; push j onto stack
000018 pushint 5000000 ; push 5000000 onto stack
000020 iflt -12 ; jump backward if less than

The loop containing i = i + 1 produces the following:

000013 getlocal (REG_2)    ; push i onto stack
000014 pushbyte 1 ; push 1 onto stack
000016 add ; add the two
000017 convert_i ; coerce to integer
000018 setlocal (REG_2) ; save i back to register 2
000019 inclocal_i (REG_3)
000021 getlocal (REG_3)
000022 pushint 5000000
000024 iflt -16

i++ is faster than i = i + 1 here because inclocal_i modifies the register directly, without having to push the value onto the stack and save it back.

The loop becomes far less efficient when you put it inside a frame script. Flash stores declared variables as class variables, and accessing those requires more work. The i++ loop results in the following:

000017 getlocal (REG_0, this)   ; push this onto stack
000018 dup ; duplicate it
000019 setlocal (REG_2) ; save this to register 2
000020 getproperty i ; get property "i"
000022 increment_i ; add one to it
000023 setlocal (REG_3) ; save result to register 3
000024 getlocal (REG_2) ; get this from register 2
000025 getlocal (REG_3) ; get value from register 3
000026 setproperty i ; set property "i"
000028 kill (REG_3) ; kill register 3
000030 kill (REG_2) ; kill register 2
000032 getlocal (REG_0, this) ; do the same thing with j...
000033 dup
000034 setlocal (REG_2)
000035 getproperty j
000037 increment_i
000038 setlocal (REG_3)
000039 getlocal (REG_2)
000040 getlocal (REG_3)
000041 setproperty j
000043 kill (REG_3)
000045 kill (REG_2)
000047 getlocal (REG_0, this)
000048 getproperty j
000050 pushint 5000000
000052 iflt -40

The i = i + 1 version is somewhat shorter:

000017 getlocal (REG_0, this)   ; push this onto stack
000018 getlocal (REG_0, this) ; push this onto stack
000019 getproperty i ; get property "i"
000021 pushbyte 1 ; push 1 onto stack
000023 add ; add the two
000024 initproperty i ; save result to property "i"
000026 getlocal (REG_0, this) ; increment j...
000027 dup
000028 setlocal (REG_2)
000029 getproperty j
000031 increment_i
000032 setlocal (REG_3)
000033 getlocal (REG_2)
000034 getlocal (REG_3)
000035 setproperty j
000037 kill (REG_3)
000039 kill (REG_2)
000041 getlocal (REG_0, this)
000042 getproperty j
000044 pushint 5000000
000046 iflt -34

Is ++x more efficient than x++ in Java?

It's not more efficient in Java. It can be more efficient in languages where the increment/decrement operators can be overloaded, but otherwise the performance is exactly the same.

The difference between x++ and ++x is that x++ returns the value of x before it was incremented, and ++x returns the value of x after it was incremented. In terms of code generation, both compile to exactly the same number of instructions, at least when they can be used interchangeably (if you can't use them interchangeably, you shouldn't be worrying about which one is faster; you should be picking the one you need). The only difference is where the increment instruction is placed.

In C++, classes can overload both the prefix (++x) and postfix (x++) operators. When dealing with types that overload them, it is almost universally faster to use the prefix operator, because the postfix operator's semantics require returning a copy of the object as it was before the increment, even when you don't use it, while the prefix operator can simply return a reference to the modified object (and God knows C++ developers prefer returning references to copies). This could be a reason to consider ++x superior to x++: if you get into the habit of using ++x, you can save yourself some slight performance trouble when/if you switch to C++. But in the context of Java alone, the two are absolutely equivalent.

Personally, I never use the return value of x++ or ++x, and if you never do either, you should probably just stick to whichever one you prefer.


