Optimizing Away a "While(1);" in C++0X

Optimizing away a while(1); in C++0x

Does someone have a good explanation of why this was necessary to allow?

Yes, Hans Boehm provides a rationale for this in N1528: Why undefined behavior for infinite loops?, although this is WG14 document the rationale applies to C++ as well and the document refers to both WG14 and WG21:

As N1509 correctly points out, the current draft essentially gives
undefined behavior to infinite loops in 6.8.5p6. A major issue for
doing so is that it allows code to move across a potentially
non-terminating loop. For example, assume we have the following loops,
where count and count2 are global variables (or have had their address
taken), and p is a local variable, whose address has not been taken:

for (p = q; p != 0; p = p -> next) {
++count;
}
for (p = q; p != 0; p = p -> next) {
++count2;
}

Could these two loops be merged and replaced by the following loop?

for (p = q; p != 0; p = p -> next) {
++count;
++count2;
}

Without the special dispensation in 6.8.5p6 for infinite loops, this
would be disallowed: If the first loop doesn't terminate because q
points to a circular list, the original never writes to count2. Thus
it could be run in parallel with another thread that accesses or
updates count2. This is no longer safe with the transformed version
which does access count2 in spite of the infinite loop. Thus the
transformation potentially introduces a data race.

In cases like this, it is very unlikely that a compiler would be able
to prove loop termination; it would have to understand that q points
to an acyclic list, which I believe is beyond the ability of most
mainstream compilers, and often impossible without whole program
information.

The restrictions imposed by non-terminating loops are a restriction on
the optimization of terminating loops for which the compiler cannot
prove termination, as well as on the optimization of actually
non-terminating loops. The former are much more common than the
latter, and often more interesting to optimize.

There are clearly also for-loops with an integer loop variable in
which it would be difficult for a compiler to prove termination, and
it would thus be difficult for the compiler to restructure loops
without 6.8.5p6. Even something like

for (i = 1; i != 15; i += 2)

or

for (i = 1; i <= 10; i += j)

seems nontrivial to handle. (In the former case, some basic number
theory is required to prove termination, in the latter case, we need
to know something about the possible values of j to do so. Wrap-around
for unsigned integers may complicate some of this reasoning further.)

This issue seems to apply to almost all loop restructuring
transformations, including compiler parallelization and
cache-optimization transformations, both of which are likely to gain
in importance, and are already often important for numerical code.
This appears likely to turn into a substantial cost for the benefit of
being able to write infinite loops in the most natural way possible,
especially since most of us rarely write intentionally infinite loops.

The one major difference with C is that C11 provides an exception for controlling expressions that are constant expressions which differs from C++ and makes your specific example well-defined in C11.

Optimizing away static variable / passing by reference

that compilers are allowed to optimize away a static variable if the address is never taken

You seem to concentrated on the wrong part of the answer. The answer states:

the compiler can do anything it wants to your code so long as the observable behavior is the same

The end. You can take the address, don't take it, calculate the meaning of life and calculate how to heal cancer, the only thing that matters is observable effect. As long as you don't actually heal cancer (or output the results of calculations...), all calculations are just no-op.

f there exists a function which takes its arguments by reference, is the compiler still allowed to optimize away the variable

Yes. The code is just putc('3').

Is the situation different for the "perfect forwarding" case

No. The code is still just putc('3').

would the compiler be allowed to optimize the call in the following case

Yes. This code has no observable effect, contrary to the previous ones. The call to f() can just be removed.

in whether compilers do this kind of optimization?

Copy your code to https://godbolt.org/ and inspect the assembly code. Even with no experience in assembly code, you will see differences with different code and compilers.

Choose x86 gcc (trunk) and remember to enable optimizations -O. Copy code with static, then remove static - did the code change? Repeat for all code snippets.

Why doesn't my C++ compiler optimize these memory writes away?

Because GCC isn't smart enough to perform this optimization on dynamically allocated memory. However, if you change garbageto be a local array instead, GCC compiles the loop to this:

.L4:
call rand
call rand
call rand
jmp .L4

This just calls rand repeatedly (which is needed because the call has side effects), but optimizes out the reads and writes.

If GCC was even smarter, it could also optimize out the randcalls, because its side effects only affect any later randcalls, and in this case there aren't any. However, this sort of optimization would probably be a waste of compiler writers' time.

Can unused function-arguments become optimized away?

You don't need to worry - the lifetime of the function argument at the call site is guaranteed to survive the function call. (This is why things like foo(s.c_str()) for a std::string s work.)

A compiler is not allowed to break that rule, subject to as if rule flexibility.



Related Topics



Leave a reply



Submit