Does "Undefined Behaviour" Extend to Compile-Time

Does undefined behaviour extend to compile-time?

"You're all ignoring the actual definition and focusing on the note, The standard imposes no requirements." - @R.MartinhoFernandes

The message above was written by the given user in Lounge<C++> and makes a very valid argument; the standard doesn't impose any requirements when it comes to code that invokes undefined behavior.



! ! !

undefined-behavior stretches even to the far corner of parsing the input data (ie. code) by the compiler, as verified with the below quotations from both the C++11 and C99 standards.

To answer your question with one sentence;

  • undefined behavior is not limited to runtime execution, and it's permissible to crash during compilation "in a documented manner characteristic of the environment" 1

"in a documented manner characteristic of the environment" is a kind of odd statement, you could pretty much write a compiler documenting that it might crash upon any given code (that's invalid) to grant it the possibility to crash whenever it wants to.

1. quote from the C++11/C99 standards


###c++11

###1.3.24 [defns.undefined]

Undefined behavior; behavior for which this International Standard
imposes no requirements

[ Note:

Undefined behavior may be expected
when this International Standard omits any explicit definition of
behavior or when a program uses an erroneous construct or erroneous
data.

Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message).

Many erroneous program constructs do not engender undefined behavior;
they are required to be diagnosed.

end note ]


###c99

3.4.3 - Undefined Behavior

  1. behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this >International Standard imposes no requirements

  2. NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation
    or program execution
    in a documented manner characteristic of the
    environment (with or without the issuance of a diagnostic message), to
    terminating a translation or execution
    (with the issuance of a
    diagnostic message).

Can guaranteed UB be rejected at compile-time?

Can the compiler therefore reject the program and not generate an executable?

Yes. The definition of undefined behaviour is:

behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International Standard omits any explicit definition of behavior or when a program uses an erroneous construct or erroneous data. Permissible undefined behavior
ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message). Many erroneous program constructs do not engender undefined behavior; they are required to be diagnosed. — end note ]

constexpr undefined behaviour

§5.19/2 (on the second page; it really should be split into many paragraphs) forbids constant expressions containing

— an lvalue-to-rvalue conversion (4.1) unless it is applied to

— a glvalue of integral or enumeration type that refers to a non-volatile const object with a preceding initialization, initialized with a constant expression, or


— a glvalue of literal type that refers to a non-volatile object defined with constexpr, or that refers to a sub-object of such an object

str[1000] translates to * ( str + 1000 ), which does not refer to a subobject of str, in contrast with an in-bounds array access. So this is a diagnosable rule, and the compiler is required to complain.

EDIT: It seems there's some confusion about how this diagnosis comes about. The compiler checks an expression against §5.19 when it needs to be constant. If the expression doesn't satisfy the requirements, the compiler is required to complain. In effect, it is required to validate constant expressions against anything that might otherwise cause undefined behavior.* This may or may not involve attempting to evaluate the expression.

 * In C++11, "a result that is not mathematically defined." In C++14, "an operation that would have undefined behavior," which by definition (§1.3.24) ignores behavior that the implementation might define as a fallback.

Undefined behavior causing time travel

There is a flow in the reasoning.

When a compiler writer says: we use Undefined Behavior to optimize a program, there are two different interpretations:

  • most people hear: we identify Undefined Behavior and decide we can do whatever we want (*)
  • the compiler writer meant: we assume Undefined Behavior does not occur

Thus, in your case:

  • dereferencing a nullptr is Undefined Behavior
  • thus executing value_or_fallback(nullptr) is Undefined Behavior
  • thus executing the else branch is Undefined Behavior
  • thus door_is_open being false is Undefined Behavior

And since Undefined Behavior does not occur (the programmer swears she will follow the terms of use), door_is_open is necessarily true and the compiler can elide the else branch.

(*) I am slightly annoyed that Raymond Chen actually formulated it this way...

Is it legal for source code containing undefined behavior to crash the compiler?

The normative definition of undefined behavior is as follows:

[defns.undefined]

behavior for which this International Standard imposes no requirements

[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. Evaluation of a constant
expression never exhibits behavior explicitly specified as undefined.
 — end note ]

While the note itself is not normative, it does describe a range of behaviors implementations are known to exhibit. So crashing the compiler (which is translation terminating abruptly), is legitimate according to that note. But really, as the normative text says, the standard doesn't place any bounds for either execution or translation. If an implementation steals your passwords, it's not a violation of any contract laid forth in the standard.

How undefined is undefined behavior?

I'd say that the behavior is undefined only if the users inserts any number different from 0. After all, if the offending code section is not actually run the conditions for UB aren't met (i.e. the non-initialized pointer is not created neither dereferenced).

A hint of this can be found into the standard, at 3.4.3:

behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements

This seems to imply that, if such "erroneous data" was instead correct, the behavior would be perfectly defined - which seems pretty much applicable to our case.


Additional example: integer overflow. Any program that does an addition with user-provided data without doing extensive check on it is subject to this kind of undefined behavior - but an addition is UB only when the user provides such particular data.



Related Topics



Leave a reply



Submit