"Observable Behaviour" and Compiler Freedom to Eliminate/Transform Pieces C++ Code

The important bit is that the compiler must be able to prove that the code has no side effects before it can remove it (or determine which side effects it has and replace it with an equivalent piece of code). In general, because of the separate compilation model, the compiler is limited in what it can know about library calls: unless it is told otherwise, it must assume a call may have observable behavior and cannot eliminate it.

As to how deep this analysis goes, it depends on the library implementation. In GCC, the C standard library uses compiler attributes to inform the compiler of potential side effects (or the absence of them). For example, strlen is tagged with the pure attribute, which allows the compiler to transform this code:

char p[] = "Hi there\n";
for ( size_t i = 0; i < strlen(p); ++i ) std::cout << p[i];

into

char p[] = "Hi there\n";
size_t __length = strlen(p);
for ( size_t i = 0; i < __length; ++i ) std::cout << p[i];

But without the pure attribute the compiler cannot know whether the function has side effects or not (unless it inlines it and gets to see inside the function), and cannot perform the above optimization.
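As a sketch of what such an annotation looks like (my_strlen is a made-up stand-in for the library function; GCC and Clang both accept the attribute):

#include <cstddef>

// The pure attribute promises that the result depends only on the arguments
// and on readable global memory, and that the call itself has no side
// effects. This is what licenses hoisting repeated calls out of a loop.
__attribute__((pure))
std::size_t my_strlen(const char* s)
{
    std::size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}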

That is, in general, the compiler will not remove code unless it can prove that it has no side effects, i.e. will not affect the outcome of the program. Note that this does not only relate to volatile and io, since any variable change might have observable behavior at a later time.

As to question 3, the compiler will only remove your code if the program behaves exactly as if the code was present (copy elision being an exception), so you should not even care whether the compiler removes it or not. Regarding question 4, the as-if rule stands: If the outcome of the implicit refactor made by the compiler yields the same result, then it is free to perform the change. Consider:

unsigned int fact = 1;
for ( unsigned int i = 1; i <= 5; ++i ) fact *= i;

The compiler can freely replace that code with:

unsigned int fact = 120; // 5! == 120

The loop is gone, but the behavior is the same: no single loop iteration affects the outcome of the program on its own, and the variable has the correct value at the end of the loop, i.e. if it is later used in some observable operation, the result will be as-if the loop had been executed.

Don't worry too much about what observable behavior and the as-if rule mean; they basically mean that the compiler must yield the output that you programmed in your code, even though it is free to get to that outcome by a different path.
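Copy elision, the exception mentioned above, deserves a quick sketch: the copy constructor below has an observable side effect (it prints), yet the compiler may elide the copies anyway (and since C++17 must, for this prvalue case):

#include <iostream>

struct Tracer {
    Tracer() { std::cout << "constructed\n"; }
    Tracer(const Tracer&) { std::cout << "copied\n"; } // observable side effect
};

Tracer make() { return Tracer(); }

int main() {
    Tracer t = make(); // prints only "constructed": the copies are elided
}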

EDIT

@Konrad raises a really good point regarding the initial example I had with strlen: how can the compiler know that strlen calls can be elided? And the answer is that in the original example it cannot, and thus the calls could not be elided. There is nothing telling the compiler that the pointer returned from the get_string() function does not refer to memory that is being modified elsewhere. I have corrected the example to use a local array.

In the modified example, the array is local, and the compiler can verify that there are no other pointers that refer to the same memory. strlen takes a const pointer and so it promises not to modify the contained memory, and the function is pure so it promises not to modify any other state. The array is not modified inside the loop construct, and gathering all that information the compiler can determine that a single call to strlen suffices. Without the pure specifier, the compiler cannot know whether the result of strlen will differ between invocations, and has to call it each time.
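A sketch of the two situations side by side (get_string is hypothetical and assumed to be defined in another translation unit):

#include <cstring>
#include <iostream>

extern char* get_string(); // hypothetical, compiled separately

void print_local()
{
    char p[] = "Hi there\n"; // local array: the compiler sees every alias
    for (std::size_t i = 0; i < std::strlen(p); ++i) // strlen can be hoisted
        std::cout << p[i];
}

void print_external()
{
    char* p = get_string(); // may point into memory that operator<< or other
                            // unknown code modifies, so the compiler must
    for (std::size_t i = 0; i < std::strlen(p); ++i) // re-evaluate strlen on
        std::cout << p[i];                           // every iteration
}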

What does section 5.1.2.3, paragraph 4 (in n1570.pdf) mean for null operations?

I think the "including any" applies to the "needed side-effects" , whereas you seem to be reading it as applying to "part of an expression".

So the intent was to say:

... An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced.

Examples of needed side-effects include:

  • Needed side-effects caused by a function which this expression calls
  • Accesses to volatile variables

Now, the term needed side-effect is not defined by the Standard. Paragraph 4 is not attempting to define it either -- it's trying (and not succeeding very well) to provide examples.

I think the only sensible interpretation is to treat it as meaning observable behaviour which is defined by 5.1.2.3/6. So it would have been a lot simpler to write:

An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no observable behaviour would be caused.
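A minimal sketch of what that licence permits:

int f(int a, int b)
{
    int unused = a * b; // value never read, and the multiplication causes no
    (void)unused;       // observable behaviour: the implementation need not
                        // evaluate it at all
    return a + b;       // only this computation affects the result
}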


Your questions in the edit are answered by 5.1.2.3/6, sometimes known as the as-if rule, which I'll quote here:

The least requirements on a conforming implementation are:

  • Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
  • At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
  • The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.

This is the observable behaviour of the program.

Answering the specific questions in the edit:

Is it possible for a compiler to distinguish between a side effect that's "needed", and a side effect that isn't? If timing is considered a needed side effect, then why are compilers allowed to optimise away null operations like do_nothing(); or int unused_variable = 0;?

Timing isn't a side-effect. A "needed" side-effect presumably here means one that causes observable behaviour.

If a compiler is able to deduce that a function does nothing (eg. void do_nothing() { }), then is it possible that the compiler might have justification to optimise calls to that function away?

Yes, these can be optimized out because they do not cause observable behaviour.

If a compiler is able to deduce that a volatile object isn't mapped to anything crucial (i.e. perhaps it's mapped to /dev/null to form a null operation), then is it possible that the compiler might also have justification to optimise that non-crucial side-effect away?

No, because accesses to volatile objects are defined as observable behaviour.

If a compiler can perform optimisations to eliminate unnecessary code such as calls to do_nothing() in a process called "dead code elimination" (which is quite the common practice), then why can't the compiler also eliminate volatile writes to a null device?

Because volatile accesses are defined as observable behaviour and empty functions aren't.
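Putting those answers together in one sketch (device_reg is a hypothetical stand-in for a memory-mapped register):

void do_nothing() { } // no observable behaviour

volatile int device_reg; // hypothetical memory-mapped register

void run()
{
    do_nothing();            // may be optimised away (dead code elimination)
    int unused_variable = 0; // may be optimised away: value never used
    (void)unused_variable;
    device_reg = 0;          // may NOT be removed: a volatile access is
                             // observable behaviour even if the address
                             // turns out to lead nowhere at runtime
}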

Is store reordering allowed by the C++ as-if rule?

If cppreference.com disagrees with the actual text of the C++ standard, cppreference.com is wrong. The only things that can supersede the text of the standard are a newer version of the standard, and official resolutions of defect reports (which sometimes get rolled up into documents called "technical corrigenda", which is a fancy name for a minor release of the standard).

However, in this case, you have misunderstood what cppreference.com means by "input and output operations". (If memory serves, that text is taken verbatim from an older version of the standard.) Stores to memory are NOT output operations. Only writing to a file (that is, any stdio.h or iostream output stream, or other implementation-defined mechanism, e.g. a Unix file descriptor) counts as output for purposes of this rule.

The C and C++ standards, prior to their 2011 revisions, assumed a single-threaded abstract machine, and therefore did not bother specifying anything about store ordering, because there was no way to observe stores out of program order. C(++)11 added a whole bunch of rules for store ordering as part of the new multithreading specification.
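A sketch of that machinery: plain stores may still be reordered among themselves, but release/acquire atomics constrain them:

#include <atomic>

int payload = 0;
std::atomic<bool> ready{false};

void producer()
{
    payload = 42;                                 // plain store
    ready.store(true, std::memory_order_release); // the payload store cannot
                                                  // be reordered after this
}

int consumer()
{
    while (!ready.load(std::memory_order_acquire)) { } // pairs with the release
    return payload;                                    // guaranteed to read 42
}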

Is program termination of a C++ program observable behavior?

Yes, it is legal for the compiler to generate an empty main body for the code in question (thus terminating almost immediately).

The C++0x FCD (6.5) says (pay special attention to the note):

A loop that, outside of the for-init-statement in the case of a for statement,

  • makes no calls to library I/O functions, and
  • does not access or modify volatile objects, and
  • performs no synchronization operations (1.10) or atomic operations (Clause 29)

may be assumed by the implementation to terminate. [ Note: This is intended to allow compiler transformations, such as removal of empty loops, even when termination cannot be proven. — end note ]

So the compiler can assume that your for loop will terminate and since the body is empty it can optimize it away altogether.


The quotation from the draft was copied from this question and verified against my copy.
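A sketch of the kind of loop the quoted wording covers:

int main()
{
    // No library I/O, no volatile accesses, no synchronization or atomic
    // operations: the implementation may assume the loop terminates and
    // is free to delete it outright.
    for (unsigned long long i = 0; i != 0x7FFFFFFFFFFFFFFFULL; ++i) { }
    return 0;
}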

Is writing to memory an observable behaviour?

This kind of thing is what volatile exists for. Otherwise, writing to memory and never apparently reading from it is not observable behaviour. However, in the general case, it would be quite impossible for the optimizer to prove that you never read it back except in relatively trivial examples, so it's not usually an issue.

Does the as-if rule prevent compiler reordering of accesses to global/member variables?

Absolutely they may be reordered. The compiler has no obligation whatsoever to consider side effects to other threads or hardware.

The compiler is only forced to consider this if you use volatile or synchronization (and those two are not interchangeable).

The Standard memory model is known as SC-DRF, or Sequentially Consistent Data Race Free. A data race would be exactly the scenario you've just described- where one thread is observing non-synchronized variables whilst another is mutating them. This is undefined behaviour. Effectively, the Standard explicitly gives the compiler the freedom to assume that there are no other threads or hardware which is reading non-volatile non-synchronized variables. It is absolutely legitimate for a compiler to optimize on that basis.

By the way, that link is kinda crap. His "fix" does not fix anything at all. The only correct fix for multi-threaded code is to use synchronization.
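A sketch of that fix, using std::atomic as the synchronization:

#include <atomic>
#include <thread>

std::atomic<bool> stop{false}; // synchronized: no data race

void worker()
{
    while (!stop.load(std::memory_order_acquire)) {
        // do work; unlike a read of a plain bool, the compiler may not
        // hoist this load out of the loop
    }
}

int main()
{
    std::thread t(worker);
    stop.store(true, std::memory_order_release); // observable to the worker
    t.join();
}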

Why isn't this C++ member function optimized by the compiler with -O3?

The compiler can calculate the result of norm and reuse it multiple times. E.g. with the -Os switch:

test(vector<100u> const&):
xorps xmm0, xmm0
xor eax, eax
.L2:
movsd xmm1, QWORD PTR [rdi+rax]
add rax, 8
cmp rax, 800
mulsd xmm1, xmm1
addsd xmm0, xmm1
jne .L2
addsd xmm0, xmm0
ret

The missing optimization isn't due to non-associative floating-point math or some observable-behavior issue.


In an environment that is not properly mutexed, another function might change the contents of the array between the calls to norm

It may happen, but it's not a concern for the compiler (e.g. https://stackoverflow.com/a/25472679/3235496).


Compiling the example with the -O2 -fdump-tree-all switches, you can see that:

  • g++ correctly detects vector<N>::norm() as a pure function (output file .local-pure-const1);
  • inlining happens at early stages (output file .einline).

Also note that when norm is marked with __attribute__((noinline)), the compiler performs CSE:

test(vector<100u> const&):
sub rsp, 8
call vector<100u>::norm() const
add rsp, 8
addsd xmm0, xmm0
ret

Marc Glisse is (probably) right.

A more advanced form of Common Subexpression Elimination is required to un-inline the recurrent expression.
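The question's source is not reproduced above; a reconstruction consistent with the assembly listings (names taken from them) might look like:

template <unsigned N>
struct vector {
    double data[N];

    double norm() const // sum of squares; g++ detects this as pure
    {
        double n = 0.0;
        for (unsigned i = 0; i < N; ++i)
            n += data[i] * data[i];
        return n;
    }
};

double test(vector<100u> const& v)
{
    return v.norm() + v.norm(); // with -Os: computed once, then doubled
}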

Is a variable defined out of scope optimized?

For trivial types (and this has a technical meaning), their creation and destruction simply indicate where the compiler is guaranteed not to need to know their values. Any possible dangling references or pointers left after the variable goes away can be "ignored" when optimizing.

So creating data in the smallest scope in which it will be used can sometimes result in the compiler producing more optimal code. If "something" is a function whose details the compiler cannot see and which takes an integer by reference (or pointer), the compiler does not know how long it needs to keep that actual integer around in actual memory. But it is guaranteed (by the programmer) that after the end of the scope, that actual integer is no longer needed in actual memory.
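A sketch of that case (the name something is hypothetical and assumed to be opaque to the compiler):

extern void something(int&); // defined in another translation unit

void caller()
{
    {
        int value = 7;
        something(value); // value must live in real memory here: the callee
                          // holds a reference and might read or write it
    }                     // past this brace the compiler knows the storage
                          // can never be legitimately accessed again
}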

Barring a case like that, no, there is no difference. Compilers do not have to actually create objects of type int when you ask for them; but sometimes they are forced to by other requirements.

In debug builds, compilers create variables where you ask for them and clean them up when you ask them to, because that makes stepping through the code and examining the program state easier.


For non-trivial types (basically, objects with vtables, constructors, or destructors, or objects that contain non-trivial types recursively), creation and destruction can have visible side effects, so when they happen can matter. The compiler can still move them around under the as-if rule (i.e., the generated assembly only needs to act as if it were what you wrote), so if it can prove that those non-trivial construction/destruction operations do not care when exactly they run (in the standard's formal sense), they can be moved around by the compiler.
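A sketch of the difference (the destructor's output is the visible side effect here):

#include <iostream>

struct Logger {
    ~Logger() { std::cout << "scope ended\n"; } // visible side effect
};

void f()
{
    int x = 42;     // trivial: may never exist in memory at all
    Logger guard;   // non-trivial: its destructor's output must appear
                    // as if it ran exactly at the closing brace
    std::cout << x; // forces x's value, but not x's storage, to exist
}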


