Is Infinite Loop Still Undefined Behavior in C++ If It Calls Shared Library

Does while(i--) s+= a[i]; contain undefined behavior in C and C++?

Does while loop contain undefined behavior because of integer underflow?

No, overflow/underflow is only undefined behavior in case of signed integers.

Do compilers have right to assume that while loop condition is always true because of that and end up with endless loop?

No, because the expression will eventually turn out to be zero.

What if i is signed int? Doesn't it contain pitfalls related to array access?

If it is signed and over/underflows, you invoke undefined behavior.

C11 - omitting potentially infinite loop by the compiler

Can v->cntr, in the controlling expression, be considered as a synchronization

No.

From https://port70.net/~nsz/c/c11/n1570.html#5.1.2.4p5 :

The library defines a number of atomic operations (7.17) and operations on mutexes (7.26.4) that are specially identified as synchronization operations.

So basically, functions from stdatomic.h and mtx_* from thread.h are synchronization operations.

since v may be a pointer to a global structure which can be modified externally (for example by another thread)?

Does not matter. Assumptions like sound to me like they would disallow many sane optimizations, I wouldn't want my compiler to assume that.

If v were modified in another thread, then it would be unsequenced, that would just result in undefined behavior https://port70.net/~nsz/c/c11/n1570.html#5.1.2.4p25 .

Is the compiler allowed not to re-read v->cntr on each iteration if v is not defined as volatile?

Yes.

Are compilers allowed to eliminate infinite loops?

C11 clarifies the answer to this question, in the draft C11 standard section 6.8.5 Iteration statements added the following paragraph:

An iteration statement whose controlling expression is not a constant
expression,156) that performs no input/output operations, does not
access volatile objects, and performs no synchronization or atomic
operations in its body, controlling expression, or (in the case of a
for statement) its expression-3, may be assumed by the implementation
to terminate.157)

and footnote 157 says:

This is intended to allow compiler transformations such as removal of empty loops even when
termination cannot be proven.

So your specific example:

while(1) 
/* noop */;

is not fair game for optimization since the controlling expression is a constant expression.

Infinite loops as UB

So why are compilers allowed to optimize away infinite loops with the exception provided above, well Hans Boehm provides a rationale for making infinite loops undefined behavior in: N1528: Why undefined behavior for infinite loops?, the following quote gives a good feeling for the issue involved:

As N1509 correctly points out, the current draft essentially gives
undefined behavior to infinite loops in 6.8.5p6. A major issue for
doing so is that it allows code to move across a potentially
non-terminating loop. For example, assume we have the following loops,
where count and count2 are global variables (or have had their address
taken), and p is a local variable, whose address has not been taken:

for (p = q; p != 0; p = p -> next) {
++count;
}
for (p = q; p != 0; p = p -> next) {
++count2;
}

Could these two loops be merged and replaced by the following loop?

for (p = q; p != 0; p = p -> next) {
++count;
++count2;
}

Without the special dispensation in 6.8.5p6 for infinite loops, this
would be disallowed: If the first loop doesn't terminate because q
points to a circular list, the original never writes to count2. Thus
it could be run in parallel with another thread that accesses or
updates count2. This is no longer safe with the transformed version
which does access count2 in spite of the infinite loop. Thus the
transformation potentially introduces a data race.

In cases like this, it is very unlikely that a compiler would be able
to prove loop termination; it would have to understand that q points
to an acyclic list, which I believe is beyond the ability of most
mainstream compilers, and often impossible without whole program
information.

C99

Since C99 does not have this carve out, we would look to the as-if rule covered in section 5.1.2.3 which basically says that the compiler only has to emulate the observable behavior of a program, the requirements are as follows:

The least requirements on a conforming implementation are:

  • At sequence points, volatile objects are stable in the sense that previous accesses are
    complete and subsequent accesses have not yet occurred.
  • At program termination, all data written into files shall be identical to the result that
    execution of the program according to the abstract semantics would have produced.
  • The input and output dynamics of interactive devices shall take place as specified in
    7.19.3. The intent of these requirements is that unbuffered or line-buffered output
    appear as soon as possible, to ensure that prompting messages actually appear prior to
    a program waiting for input.

A strict reading of this would seem to allow an implementation to optimize an infinite loop away. We can certainly come up with scenarios where optimizing away an infinite loop would cause a change in observable behavior:

while(1) ;
printf( "hello world\n" ) ;

Many would argue that effecting the termination of a process is also observable behavior, this position is taken in C Compilers Disprove Fermat’s Last Theorem:

The compiler is given considerable freedom in how it implements the C program, but its output must have the same externally visible behavior that the program would have when interpreted by the “C abstract machine” that is described in the standard. Many knowledgeable people (including me) read this as saying that the termination behavior of a program must not be changed. Obviously some compiler writers disagree, or else don’t believe that it matters. The fact that reasonable people disagree on the interpretation would seem to indicate that the C standard is flawed.

Update

I somehow missed the the follow-up to the above article, Compilers and Termination Revisited which says the following about section 5.1.2.3:

The second requirement is the tricky one. If it is talking about termination of the program running on the abstract machine, then it is vacuously met because our program does not terminate. If it is talking about termination of the actual program generated by the compiler, then the C implementation is buggy because the data written into files (stdout is a file) differs from the data written by the abstract machine. (This reading is due to Hans Boehm; I had failed to tease this subtlety out of the standard.)

One could also make a weaker argument that the need to create a carve out in C11 to allow empty loop removal implies this was not an allowable optimization previously.

Does this apply to infinite goto loops as well?

I believe the intent is that this also applies to infinite goto loops as well. In C++ this is clearly the case since section 1.10 [intro.multithread] says:

The implementation may assume that any thread will eventually do one of the following

  • terminate,
  • make a call to a library I/O function,
  • access or modify a volatile object, or
  • perform a synchronization operation or an atomic operation.

and then intent as expressed in N1528 is that the C and C++ standard agree:

Since compiler back-ends are typically shared between C and C++ compilers, it appears most important that WG14 and WG21 agree on whatever solution is adopted. The alternatives would be special treatment of the two languages by the back-end, or disabling optimizations when processing C code. Neither appears at all desirable.

and at the end says:

WG21 is considering improved wording that makes the treatment consistent. Hopefully WG14 will track any resulting changes.

Currently the C11 standard does not contain the similar wording in section 5.1.2.4 Multi-threaded executions and data races but considering N1528 it seems wise to assume compilers will treat infinite goto loops as undefined behavior in C and C++.

Note also see US comment 38 here and N3196 which is the paper that this changed was applied from.

C++ compilation bug?

This is due to undefined behavior, you are accessing the array mc out of bounds on the last iteration of your loop. Some compilers may perform aggressive loop optimization around the assumptions of no undefined behavior. The logic would be similar to the following:

  • Accessing mc out of bounds is undefined behavior
  • Assume no undefined behavior
  • Therefore di < 4 is always true since otherwise mc[di] would invoke undefined behavior

gcc with optimization turned on and using the -fno-aggressive-loop-optimizations flag causes the infinite loop behavior to disappear(see it live). While a live example with optimization but without -fno-aggressive-loop-optimizations exhibits the infinite loop behavior you observe.

A godbolt live example of the code shows the di < 4 check is removed and replaced with and unconditional jmp:

jmp .L6

This is almost identical to the case outlined in GCC pre-4.8 Breaks Broken SPEC 2006 Benchmarks. The comments to this article are excellent and well worth the read. It notes that clang caught the case in the article using -fsanitize=undefined which I can not reproduce for this case but gcc using -fsanitize=undefined does (see it live). Probably the most infamous bug around an optimizer making an inference around undefined behavior is the Linux kernel null pointer check removal.

Although this is an aggressive optimizations, it is important to note that as the C++ standard says undefined behavior is:

behavior for which this International Standard imposes no requirements

Which essentially means anything is possible and it notes (emphasis mine):

[...]Permissible undefined behavior
ranges from ignoring the situation completely with unpredictable results, to behaving during translation or
program execution in a documented manner characteristic of the environment (with or without the issuance of
a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).[...]

In order to get a warning from gcc we need to move the cout outside the loop and then we see the following warning (see it live):

warning: iteration 3u invokes undefined behavior [-Waggressive-loop-optimizations]
for(di=0; di<4;di++,delta=mc[di]){ }
^

which would have likely been sufficient to provide the OP with enough information to figure out what was going on. Inconsistency like this are typical of the types of behavior we can see with undefined behavior. To get a better understanding of why such waring can be inconsitent in the face of undefined behavior Why can't you warn when optimizing based on undefined behavior? is a good read.

Note, -fno-aggressive-loop-optimizations is documented in the gcc 4.8 release notes.

Why does this loop produce warning: iteration 3u invokes undefined behavior and output more than 4 lines?

Signed integer overflow (as strictly speaking, there is no such thing as "unsigned integer overflow") means undefined behaviour. And this means anything can happen, and discussing why does it happen under the rules of C++ doesn't make sense.

C++11 draft N3337: §5.4:1

If during the evaluation of an expression, the result is not mathematically defined or not in the range of
representable values for its type, the behavior is undefined. [ Note: most existing implementations of C++
ignore integer overflows. Treatment of division by zero, forming a remainder using a zero divisor, and all
floating point exceptions vary among machines, and is usually adjustable by a library function. —end note ]

Your code compiled with g++ -O3 emits warning (even without -Wall)

a.cpp: In function 'int main()':
a.cpp:11:18: warning: iteration 3u invokes undefined behavior [-Waggressive-loop-optimizations]
std::cout << i*1000000000 << std::endl;
^
a.cpp:9:2: note: containing loop
for (int i = 0; i < 4; ++i)
^

The only way we can analyze what the program is doing, is by reading the generated assembly code.

Here is the full assembly listing:

    .file   "a.cpp"
.section .text$_ZNKSt5ctypeIcE8do_widenEc,"x"
.linkonce discard
.align 2
LCOLDB0:
LHOTB0:
.align 2
.p2align 4,,15
.globl __ZNKSt5ctypeIcE8do_widenEc
.def __ZNKSt5ctypeIcE8do_widenEc; .scl 2; .type 32; .endef
__ZNKSt5ctypeIcE8do_widenEc:
LFB860:
.cfi_startproc
movzbl 4(%esp), %eax
ret $4
.cfi_endproc
LFE860:
LCOLDE0:
LHOTE0:
.section .text.unlikely,"x"
LCOLDB1:
.text
LHOTB1:
.p2align 4,,15
.def ___tcf_0; .scl 3; .type 32; .endef
___tcf_0:
LFB1091:
.cfi_startproc
movl $__ZStL8__ioinit, %ecx
jmp __ZNSt8ios_base4InitD1Ev
.cfi_endproc
LFE1091:
.section .text.unlikely,"x"
LCOLDE1:
.text
LHOTE1:
.def ___main; .scl 2; .type 32; .endef
.section .text.unlikely,"x"
LCOLDB2:
.section .text.startup,"x"
LHOTB2:
.p2align 4,,15
.globl _main
.def _main; .scl 2; .type 32; .endef
_main:
LFB1084:
.cfi_startproc
leal 4(%esp), %ecx
.cfi_def_cfa 1, 0
andl $-16, %esp
pushl -4(%ecx)
pushl %ebp
.cfi_escape 0x10,0x5,0x2,0x75,0
movl %esp, %ebp
pushl %edi
pushl %esi
pushl %ebx
pushl %ecx
.cfi_escape 0xf,0x3,0x75,0x70,0x6
.cfi_escape 0x10,0x7,0x2,0x75,0x7c
.cfi_escape 0x10,0x6,0x2,0x75,0x78
.cfi_escape 0x10,0x3,0x2,0x75,0x74
xorl %edi, %edi
subl $24, %esp
call ___main
L4:
movl %edi, (%esp)
movl $__ZSt4cout, %ecx
call __ZNSolsEi
movl %eax, %esi
movl (%eax), %eax
subl $4, %esp
movl -12(%eax), %eax
movl 124(%esi,%eax), %ebx
testl %ebx, %ebx
je L15
cmpb $0, 28(%ebx)
je L5
movsbl 39(%ebx), %eax
L6:
movl %esi, %ecx
movl %eax, (%esp)
addl $1000000000, %edi
call __ZNSo3putEc
subl $4, %esp
movl %eax, %ecx
call __ZNSo5flushEv
jmp L4
.p2align 4,,10
L5:
movl %ebx, %ecx
call __ZNKSt5ctypeIcE13_M_widen_initEv
movl (%ebx), %eax
movl 24(%eax), %edx
movl $10, %eax
cmpl $__ZNKSt5ctypeIcE8do_widenEc, %edx
je L6
movl $10, (%esp)
movl %ebx, %ecx
call *%edx
movsbl %al, %eax
pushl %edx
jmp L6
L15:
call __ZSt16__throw_bad_castv
.cfi_endproc
LFE1084:
.section .text.unlikely,"x"
LCOLDE2:
.section .text.startup,"x"
LHOTE2:
.section .text.unlikely,"x"
LCOLDB3:
.section .text.startup,"x"
LHOTB3:
.p2align 4,,15
.def __GLOBAL__sub_I_main; .scl 3; .type 32; .endef
__GLOBAL__sub_I_main:
LFB1092:
.cfi_startproc
subl $28, %esp
.cfi_def_cfa_offset 32
movl $__ZStL8__ioinit, %ecx
call __ZNSt8ios_base4InitC1Ev
movl $___tcf_0, (%esp)
call _atexit
addl $28, %esp
.cfi_def_cfa_offset 4
ret
.cfi_endproc
LFE1092:
.section .text.unlikely,"x"
LCOLDE3:
.section .text.startup,"x"
LHOTE3:
.section .ctors,"w"
.align 4
.long __GLOBAL__sub_I_main
.lcomm __ZStL8__ioinit,1,1
.ident "GCC: (i686-posix-dwarf-rev1, Built by MinGW-W64 project) 4.9.0"
.def __ZNSt8ios_base4InitD1Ev; .scl 2; .type 32; .endef
.def __ZNSolsEi; .scl 2; .type 32; .endef
.def __ZNSo3putEc; .scl 2; .type 32; .endef
.def __ZNSo5flushEv; .scl 2; .type 32; .endef
.def __ZNKSt5ctypeIcE13_M_widen_initEv; .scl 2; .type 32; .endef
.def __ZSt16__throw_bad_castv; .scl 2; .type 32; .endef
.def __ZNSt8ios_base4InitC1Ev; .scl 2; .type 32; .endef
.def _atexit; .scl 2; .type 32; .endef

I can barely even read assembly, but even I can see the addl $1000000000, %edi line.
The resulting code looks more like

for(int i = 0; /* nothing, that is - infinite loop */; i += 1000000000)
std::cout << i << std::endl;

This comment of @T.C.:

I suspect that it's something like: (1) because every iteration with i of any value larger than 2 has undefined behavior -> (2) we can assume that i <= 2 for optimization purposes -> (3) the loop condition is always true -> (4) it's optimized away into an infinite loop.

gave me idea to compare the assembly code of the OP's code to the assembly code of the following code, with no undefined behaviour.

#include <iostream>

int main()
{
// changed the termination condition
for (int i = 0; i < 3; ++i)
std::cout << i*1000000000 << std::endl;
}

And, in fact, the correct code has termination condition.

    ; ...snip...
L6:
mov ecx, edi
mov DWORD PTR [esp], eax
add esi, 1000000000
call __ZNSo3putEc
sub esp, 4
mov ecx, eax
call __ZNSo5flushEv
cmp esi, -1294967296 // here it is
jne L7
lea esp, [ebp-16]
xor eax, eax
pop ecx
; ...snip...

Unfortunately this is the consequences of writing buggy code.

Fortunately you can make use of better diagnostics and better debugging tools - that's what they are for:

  • enable all warnings

  • -Wall is the gcc option that enables all useful warnings with no false positives. This is a bare minimum that you should always use.

  • gcc has many other warning options, however, they are not enabled with -Wall as they may warn on false positives

  • Visual C++ unfortunately is lagging behind with the ability to give useful warnings. At least the IDE enables some by default.

  • use debug flags for debugging

    • for integer overflow -ftrapv traps the program on overflow,
    • Clang compiler is excellent for this: -fcatch-undefined-behavior catches a lot of instances of undefined behaviour (note: "a lot of" != "all of them")

I have a spaghetti mess of a program not written by me that needs to be shipped tomorrow! HELP!!!!!!111oneone

Use gcc's -fwrapv

This option instructs the compiler to assume that signed arithmetic overflow of addition, subtraction and multiplication wraps around using twos-complement representation.

1 - this rule does not apply to "unsigned integer overflow", as §3.9.1.4 says that

Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2n where n is the number
of bits in the value representation of that particular size of integer.

and e.g. result of UINT_MAX + 1 is mathematically defined - by the rules of arithmetic modulo 2n

Why I can't escape in this while loop in Release mode?

What may be happening is that your flag variable is being optimized out by the compiler because it think it can't change. Here is a case you could use volatile, and also std::atomic:

volatile std::atomic<bool> bIsTrue = false;

Also, I would check your use of WinAPI functions. I would suggest instead using the standard C++ thread library: https://en.cppreference.com/w/cpp/header/thread .

Undefined behaviour on std::prev for transform-view

This means there’s a compiler bug or the code calling std::prev invokes undefined behaviour – which one is it?

The latter, although libstdc++ should be able to detect this failure and diagnose it better as it does if you ask it to.

The issue here is that given:

auto view = v | std::views::transform([] (auto i) { return static_cast<int>(i); });

view's iterators are C++20 random access iterators, but because their reference type is a prvalue int, they can only be C++17 input iterators. That's just the way the pre-C++20 iterator model worked: forward iterator required a true reference.

std::ranges::prev uses the C++20 iterator categories, std::prev uses the C++17 (or really C++98) iterator categories. And std::prev requires BidirectionalIterator. It's unspecified to what extent the library needs to try to validate that the iterator is indeed bidirectional, but prev(it, n) is specified as advance(it, -n); return it; and advance(it, n) for non-bidirectional iterators will just loop until it hits n... which for negative n will clearly never happen.

That said, if you're using Ranges, you should be using std::ranges::meow instead of std::meow in all cases, because of this iterator category difference. This case is pretty dramatic since it's "works" vs "infinite loop", but note that std::next(it, 100) would be a loop that increments it 100 times while std::ranges::next(it, 100) would return it + 100, so even when they both "work" it's still a significant difference.



Related Topics



Leave a reply



Submit