Why Dereferencing a Null Pointer Is Undefined Behaviour

Why dereferencing a null pointer is undefined behaviour?

Defining consistent behavior for dereferencing a NULL pointer would require the compiler to check for NULL pointers before each dereference on most CPU architectures. This is an unacceptable burden for a language that is designed for speed.

It also only fixes a small part of a larger problem - there are many ways to have an invalid pointer beyond a NULL pointer.

At what point does dereferencing the null pointer become undefined behavior?

Yes, it is undefined behavior, because the spec says that an "lvalue designates an object or function" (clause 3.10), and for the *-operator it says "the result [of dereferencing] is an lvalue referring to the object or function to which the expression points" (clause 5.3.1).

That means there is no description for what happens when you dereference a null pointer. It's simply undefined behavior.

Is dereferencing a NULL pointer considered unspecified or undefined behaviour?

The examples are associated with the wrong things, regardless of which version of the C++ standard you assume (nothing has changed between the standards in this regard).

Dereferencing a NULL pointer gives undefined behaviour. The standard does not define any constraint on what happens as a result.

The order of globals initialisation is an example of unspecified behaviour (the standard guarantees that all globals will be initialised [that's a constraint on how globals are initialised] but the order is not specified).
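As a minimal sketch of what "unspecified" means in practice, the following uses a different, well-known case: the order in which function arguments are evaluated. The names record, add, and evaluate_both are hypothetical and exist only for this illustration. Both arguments are guaranteed to be evaluated, but which one comes first is not specified, so a correct program must not depend on the order.

```cpp
#include <vector>

// Hypothetical helper: appends `id` to `log` and returns it.
int record(std::vector<int>& log, int id) {
    log.push_back(id);
    return id;
}

int add(int a, int b) { return a + b; }

// Returns the order in which the two arguments of add() were evaluated.
// The result is either {1, 2} or {2, 1}; the standard allows both.
std::vector<int> evaluate_both() {
    std::vector<int> log;
    static_cast<void>(add(record(log, 1), record(log, 2)));
    return log;
}
```

Whichever order the implementation picks, the program remains valid. That is the difference from undefined behaviour, where the standard places no constraint on the outcome at all.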

Why is dereferencing of nullptr while using a static method not undefined behaviour in C++?

Standard citations in this answer are from the C++17 spec (N4713).

One of the sections cited in your question answers the question for non-static member functions. [class.mfct.non-static]/2:

If a non-static member function of a class X is called for an object that is not of type X, or of a type derived from X, the behavior is undefined.

This applies to, for example, accessing an object through a different pointer type:

std::string foo;

A *ptr = reinterpret_cast<A *>(&foo); // not UB by itself
ptr->non_static_mem_fn(); // UB by [class.mfct.non-static]/2

A null pointer doesn't point at any valid object, so it certainly doesn't point to an object of type A either. Using your own example:

p->non_static_mem_fn(); // UB by [class.mfct.non-static]/2

With that out of the way, why does this work in the static case? Let's pull together two parts of the standard:

[expr.ref]/2:

... The expression E1->E2 is converted to the equivalent form (*(E1)).E2 ...

[class.static]/1 (emphasis mine):

... A static member may be referred to using the class member access syntax, in which case the object expression is evaluated.

The second block, in particular, says that the object expression is evaluated even for static member access. This is important if, for example, it is a function call with side effects.
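A small sketch of that point, using a valid object so that nothing here is undefined. A and static_mem_fn follow the names used in this answer; get, calls, and call_through_pointer are hypothetical names introduced for the illustration.

```cpp
struct A {
    static int static_mem_fn() { return 42; }
};

int calls = 0;  // counts how often get() has run

// Hypothetical accessor with a side effect; always returns a valid object.
A* get() {
    static A instance;
    ++calls;
    return &instance;
}

int call_through_pointer() {
    // static_mem_fn needs no object, but the object expression get()
    // is still evaluated, so its side effect takes place.
    return get()->static_mem_fn();
}
```

Each call to call_through_pointer increments calls once, which shows that the object expression is not skipped merely because the member is static.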

Put together, this implies that these two blocks are equivalent:

// 1
p->static_mem_fn();

// 2
*p;
A::static_mem_fn();

So the final question to answer is whether *p alone is undefined behavior when p is a null pointer value.

Conventional wisdom would say "yes" but this is not actually true. There is nothing in the standard that states dereferencing a null pointer alone is UB and there are several discussions that directly support this:

  • Issue 315, as you have mentioned in your question, explicitly states that *p is not UB when the result is unused.
  • DR 1102 removes "dereferencing the null pointer" as an example of UB. The given rationale is:

    There are core issues surrounding the undefined behavior of dereferencing a null pointer. It appears the intent is that dereferencing is well defined, but using the result of the dereference will yield undefined behavior. This topic is too confused to be the reference example of undefined behavior, or should be stated more precisely if it is to be retained.

  • This DR links to issue 232, which discusses adding wording that explicitly makes *p defined behavior when p is a null pointer, as long as the result is not used.

In conclusion:

p->non_static_mem_fn(); // UB by [class.mfct.non-static]/2
p->static_mem_fn(); // Defined behavior per issue 232 and 315.

Is dereferencing a pointer that's equal to nullptr undefined behavior by the standard?

As you quote C, dereferencing a null pointer is clearly undefined behavior from this Standard quote (emphasis mine):

(C11, 6.5.3.2p4) "If an invalid value has been assigned to the pointer, the behavior of the unary * operator is undefined.102)"

102): "Among the invalid values for dereferencing a pointer by the unary * operator are a null pointer, an address inappropriately aligned for the type of object pointed to, and the address of an object after the end of its lifetime."

The exact same wording appears in C99, and similar wording in C89 / C90.

Is null pointer dereference undefined behavior in Objective-C?

Since Objective-C is nothing more than an object-oriented layer on top of C, pure C statements don't acquire any special additional meaning. Consequently, *(long*)0 = 0; is evaluated and interpreted exactly as in C (since it is C), and thus it invokes undefined behavior. As such, it is not guaranteed to do anything.

Why null pointer dereference is not an exception

Because that would require an extraordinary level of runtime support, mandating checks on every single pointer access and vastly slowing down everybody's C++ programs whether they wanted this heavy-handed behaviour or not.

You are free to create a wrapper class that validates nullity on every access, and use that class when (and only when) you feel you need it. However, this would be considered a design smell, as you should never need such a device.
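For illustration only, such a wrapper might look like the sketch below. The class name checked_ptr and the choice of std::logic_error are assumptions made for this example; they are not part of any standard facility.

```cpp
#include <stdexcept>

// Hypothetical non-owning wrapper that checks for null on every access.
template <typename T>
class checked_ptr {
public:
    explicit checked_ptr(T* p = nullptr) : ptr_(p) {}

    T& operator*() const { return *checked(); }
    T* operator->() const { return checked(); }

private:
    // Throws instead of letting a null pointer be dereferenced.
    T* checked() const {
        if (ptr_ == nullptr) {
            throw std::logic_error("null pointer dereference");
        }
        return ptr_;
    }

    T* ptr_;
};
```

Every access through the wrapper pays for a branch, which is exactly the cost the language declines to impose on all pointer accesses.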

Instead, use proper memory management techniques that leave you without any null pointers whatsoever; the end of life of your pointees and your pointers should be the same.

Assigning a reference by dereferencing a NULL pointer

Dereferencing a null pointer is Undefined Behavior.

Undefined behavior means that anything can happen, so it is not possible to say what the program will do in this case.

Admittedly, I am adding this C++ standard quote for the nth time, but it seems to be needed.

Regarding Undefined Behavior,

C++ Standard section 1.3.24 states:

Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

NOTE:

Also, just to bring it to your notice:

Using a reference or pointer to a function's local variable after that function has returned is also undefined behavior. Instead, allocate the object on the free store (heap) using new and return a reference or pointer to it.

EDIT:

As @James McNellis appropriately points out in the comments,

If the returned pointer or reference is not used, the behavior is well defined.
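A brief sketch of two ways to avoid the dangling reference described in the note above. The function names are hypothetical, and std::unique_ptr with std::make_unique (C++14) is used here in place of a bare new so that the heap object is released automatically; that substitution is a choice made for this example, not something the answer prescribes.

```cpp
#include <memory>
#include <string>

// Returning by value: the caller receives its own object.
std::string make_name_by_value() {
    std::string local = "example";
    return local;
}

// Returning a heap-allocated object whose lifetime outlives the function.
std::unique_ptr<std::string> make_name_on_heap() {
    return std::make_unique<std::string>("example");
}

// For contrast, NOT compiled here: returns a reference to a dead local.
// const std::string& broken() { std::string local = "example"; return local; }
```

In both working versions the returned object is still alive when the caller uses it, which is the property the commented-out function lacks.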

Is Dereferencing a Constant Undefined Behavior

An integer constant expression with value 0, when converted to a pointer, yields a NULL pointer regardless of the actual representation of a NULL pointer.

Section 6.3.2.3p3 of the C standard states:

An integer constant expression with the value 0, or such an
expression cast to type void *, is called a null pointer
constant. If a null pointer constant is converted to a pointer type,
the resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function.

Converting any other integer value to a pointer value is implementation-defined. From section 6.3.2.3p5:

An integer may be converted to any pointer type. Except as
previously specified, the result is implementation-defined, might
not be correctly aligned, might not point to an entity of the
referenced type, and might be a trap representation.

The above typically applies to embedded implementations where it makes sense to access a specific memory address.

If you had an implementation whose NULL pointer has a non-zero representation, you could obtain a pointer with the value 0 by converting a variable that holds 0, for example:

int zero = 0;
int *zeroptr = (int *)zero;

In this case, the value of the pointer would be 0 but would not be NULL.
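The same distinction can be sketched in C++, where the conversion from a run-time integer requires reinterpret_cast. The function names are hypothetical, and std::uintptr_t is assumed to be available (it is an optional type, though mainstream platforms provide it).

```cpp
#include <cstdint>

// The literal 0 is a null pointer constant, so p is guaranteed to be null
// whatever bit pattern the implementation uses for null pointers.
bool constant_zero_is_null() {
    int* p = 0;
    return p == nullptr;
}

// A run-time zero is converted by implementation-defined rules. On most
// platforms this also yields a null pointer, but that is not guaranteed.
bool runtime_zero_is_null() {
    std::uintptr_t zero = 0;
    int* p = reinterpret_cast<int*>(zero);
    return p == nullptr;
}
```

Only the first function has a result fixed by the standard; the second reports whatever the implementation at hand does.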


