Is Incrementing a Null Pointer Well-Defined

Is incrementing a null pointer well-defined?

§5.2.6/1:

The value of the operand object is modified by adding 1 to it, unless the object is of type bool [..]

And additive expressions involving pointers are defined in §5.7/5:

If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; otherwise, the behavior
is undefined.

Incrementing NULL pointer in C

The behaviour is always undefined. You can never own the memory at NULL.

Pointer arithmetic is only valid within arrays, and you can set a pointer to an index of the array or one location beyond the final element. Note I'm talking about setting a pointer here, not dereferencing it.

You can also set a pointer to a scalar and one past that scalar.

You can't use pointer arithmetic to traverse other memory that you own.
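The rules above can be sketched in a few lines. This is a minimal illustration, not code from the original question: the function names array_span and scalar_span are made up here, and it is written in C++ although the same pointer values are valid in C.

```cpp
#include <cassert>
#include <cstddef>

// Distance from the first element to the one-past-the-end pointer
// of a 3-element array. Forming arr + 3 is allowed; dereferencing it is not.
std::ptrdiff_t array_span() {
    int arr[3] = {1, 2, 3};
    int* first = &arr[0];
    int* end = arr + 3;          // one past the last element: OK to form
    assert(end != first);        // comparing the two is OK as well
    return end - first;          // both pointers belong to the same array
}

// A scalar behaves like an array of one element for pointer arithmetic.
std::ptrdiff_t scalar_span() {
    int x = 42;
    int* p = &x;
    int* past = p + 1;           // one past the scalar: OK to form
    return past - p;             // never dereference 'past'
}
```

Going any further than these one-past-the-end positions (arr + 4, or p + 2 for the scalar) is undefined even if the result is never dereferenced.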

Is comparing to a pointer one element past the end of an array well-defined?

Comparing to a pointer one step past the end of an array is well defined. However, your pointer and pointer2 examples are undefined, even if you do literally nothing with those pointers.

A pointer may point to one element past the end of the array. That pointer may not be dereferenced (otherwise that would be undefined behavior) but it can be compared to another pointer within the array.

Section 6.5.6 of the C standard says the following regarding pointer addition (emphasis added):

8 If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

Section 6.5.8 says the following regarding pointer comparisons (emphasis added):

5 When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.

In the case of pointer1, it starts out pointing to NULL. Incrementing this pointer invokes undefined behavior because it doesn't point to a valid object.

For pointer2, it is increased by 4, putting it two elements past the end of the array, not one, so this is again undefined behavior. Had it been increased by 3, the behavior would be well defined.
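A sketch of the well-defined cases may help. The question's code is not reproduced here, so the 3-element array arr and both function names are assumptions, inferred from the statement that adding 3 would have been fine. The sketch is in C++, but the same rules apply to the C wording quoted above.

```cpp
#include <cassert>

// Walks the array with a pointer and stops at the one-past-the-end pointer.
// The end pointer is only compared, never dereferenced.
int sum_with_end_pointer() {
    int arr[3] = {10, 20, 30};   // hypothetical 3-element array
    int* end = arr + 3;          // one past the last element: well defined
    int total = 0;
    for (int* p = arr; p != end; ++p) {
        total += *p;             // p always points at a real element here
    }
    return total;
}

// Q + 1 compares greater than P when Q points to the last element
// and P points to any element of the same array.
bool end_compares_greater() {
    int arr[3] = {10, 20, 30};
    int* p = &arr[0];
    int* q = &arr[2];            // last element
    return (q + 1) > p;
}
```

Changing arr + 3 to arr + 4 would make the pointer addition itself undefined, before any comparison takes place.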

What pointer values are well-defined to compute?

In your first example, the first p++ is well-defined, because a non-array is considered a one-length array.

Here's the relevant quote (basic.compound/3.4):

For purposes of pointer arithmetic ([expr.add]) and comparison ([expr.rel], [expr.eq]), a pointer past the end of the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical array element n of x and an object of type T that is not an array element is considered to belong to an array with one element of type T.

After p++, p will point past the last (and only) element of the (hypothetical) array, which is well-defined. It is not "invalid, but ok", as pointers pointing past the end of an object are not invalid pointers (basic.compound/3.2):

Every value of pointer type is one of the following:

  • [...]

  • a pointer past the end of an object

  • [...]

  • an invalid pointer value.

The second p++ of the first example is UB, because the result will point after the hypothetical (&a)[1] element, which is not defined.

In your second example, p++ is UB, because only 0 can be added to a nullptr (expr.add/4.1):

  • If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.

  • [...]

  • Otherwise, the behavior is undefined.
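The one permitted case from the quoted rule can be shown directly. This is only a sketch of the C++ rule above; the function name is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Adding 0 to a null pointer is the only null-pointer addition
// that the quoted rule (expr.add/4.1) defines.
bool null_plus_zero_is_null() {
    int* p = nullptr;
    std::ptrdiff_t j = 0;
    int* q = p + j;              // well-defined: the result is a null pointer
    return q == nullptr;
}
```

Any non-zero value of j, including the 1 implied by p++, falls into the "otherwise" bullet and is undefined.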

Can pointer taken from reference ever be null in well-defined c++?

No. Standard quotes in this answer: Is null reference possible?

However, in the particular case of a class type with an overloaded operator&, taking the address can return anything, including null.
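A minimal sketch of that exception, assuming a made-up type Odd whose operator& deliberately returns null; std::addressof from <memory> bypasses the overload and yields the real address.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type whose overloaded operator& lies about its address.
struct Odd {
    int value = 0;
    Odd* operator&() { return nullptr; }
};

// The reference is valid, yet &ref is null because the overload is called.
bool overloaded_address_is_null() {
    Odd o;
    Odd& ref = o;
    return &ref == nullptr;
}

// std::addressof ignores the overload and returns the object's real address.
bool real_address_is_non_null() {
    Odd o;
    Odd& ref = o;
    return std::addressof(ref) != nullptr;
}
```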

Is it somehow possible for the if to trigger?

Not in A::f nor ::f. It is possible to trigger in g(A*) but not when called from g(A&).

a warning would be a really useful diagnostic.

Neither GCC nor Clang is smart enough to detect the mistake in that case, as you've observed, but they do detect a simpler version of the same mistake:

GCC

warning: the compiler can assume that the address of 'a' will never be NULL [-Waddress]
if(&a == nullptr) {
~~~^~~~~~~~~~
warning: nonnull argument 'a' compared to NULL [-Wnonnull-compare]
if(&a == nullptr) {
^~

Clang

warning: reference cannot be bound to dereferenced null pointer in well-defined C++ code; comparison may be assumed to always evaluate to false [-Wtautological-undefined-compare]
if(&a == nullptr) {

A pointer is incremented until the end of the string, as in the code below, but the NULL check in the if fails. Why?

The only way you could increase a pointer and get it to be NULL would be if you looped enough times for the pointer address to wrap around and become zero. Or if you subtract the pointer from itself so the result becomes zero.

A valid pointer will otherwise not become a null pointer by simple pointer arithmetic. It might point out of bounds, but it will not become NULL.

What happens here is that temp is the one-character string ",". This is the same as an array containing the two characters ',' and '\0'. What happens when you do *temp++ = '\0' is that you modify the string to become the two characters '\0' followed by '\0' (you replace the comma with the string terminator). After the operation temp points to the second '\0'. The variable temp is itself not a null pointer, but it points to the null character (which is something completely different).

In other words what you possibly want might be something like this:

*temp++ = '\0';
if (*temp == '\0') { ... }

It might be easier to understand if we look at it a little more "graphically".

When you create the duplicate string

temp = strdup(q);

you will have something like this


----+-----+------+----
... | ',' | '\0' | ...
----+-----+------+----
       ^
       |
    +------+
    | temp |
    +------+

I.e. the variable temp points to a memory location which happens to be the "string" containing a single comma.

When you do *temp++ = '\0', the comma that temp points to is replaced first and then the pointer is increased, which means it will look like this instead:


----+------+------+----
... | '\0' | '\0' | ...
----+------+------+----
              ^
              |
           +------+
           | temp |
           +------+
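The walkthrough above can be checked with a small sketch. A local char array stands in for the strdup(q) result so that nothing has to be freed; the Result struct and the function name are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstring>

// What we can observe about temp after the assignment.
struct Result {
    bool pointer_is_null;        // is temp itself a null pointer?
    bool points_at_terminator;   // does temp point at the '\0' character?
    std::size_t remaining_length;
};

Result overwrite_comma() {
    char buffer[] = ",";         // stand-in for strdup(q): {',', '\0'}
    char* temp = buffer;
    *temp++ = '\0';              // overwrite the comma, then advance temp
    // temp now points at buffer[1], the original terminator.
    return Result{temp == nullptr, *temp == '\0', std::strlen(temp)};
}
```

The pointer is not null, but the character it points to is the null character, which is why a check against NULL fails while a check of *temp against '\0' succeeds.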

Example of error caused by UB of incrementing a NULL pointer

How about this example:

int main(int argc, char* argv[])
{
    int a[] = { 111, 222 };

    int *p = (argc > 1) ? &a[0] : nullptr;
    p++;
    p--;

    return (p == nullptr);
}

At face value, this code says: 'If there are any command line arguments, initialise p to point to the first member of a[], otherwise initialise it to null. Then increment it, then decrement it, and tell me if it's null.'

On the face of it this should return '0' (indicating p is non-null) if we supply a command line argument, and '1' (indicating null) if we don't.
Note that at no point do we dereference p, and if we supply an argument then p always points within the bounds of a[].

Compiling with the command line clang -S --std=c++11 -O2 nulltest.cpp (Cygwin clang 3.5.1) yields the following generated code:

    .text
.def main;
.scl 2;
.type 32;
.endef
.globl main
.align 16, 0x90
main: # @main
.Ltmp0:
.seh_proc main
# BB#0:
pushq %rbp
.Ltmp1:
.seh_pushreg 5
movq %rsp, %rbp
.Ltmp2:
.seh_setframe 5, 0
.Ltmp3:
.seh_endprologue
callq __main
xorl %eax, %eax
popq %rbp
retq
.Leh_func_end0:
.Ltmp4:
.seh_endproc

This code says 'return 0'. It doesn't even bother to check the number of command line args.

(And interestingly, commenting out the decrement has no effect on the generated code.)
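For contrast, here is one possible well-defined variant, sketched under the assumption that the intent was the "face value" reading above. The helper name and its bool parameter (standing in for argc > 1) are made up; the arithmetic is simply skipped when the pointer is null.

```cpp
#include <cassert>

// Hypothetical rewrite of the example: pointer arithmetic only happens
// when p actually points into the array, so no undefined behaviour occurs.
bool is_null_after_round_trip(bool use_array) {
    int a[] = { 111, 222 };
    int* p = use_array ? &a[0] : nullptr;
    if (p != nullptr) {
        ++p;                     // now points at a[1]
        --p;                     // back at a[0]
    }
    return p == nullptr;
}
```

With the guard in place the compiler can no longer assume p is non-null, so the result depends on the argument as the face-value reading expects.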

Can incrementing a pointer without dereferencing still segfault or have other (un)defined nastiness?

Section 5.7, "Additive operators", paragraph 5 specifies this - the result of the addition itself is undefined; the program isn't valid even if you never dereference the pointers.

If both the pointer operand and the result point to elements of the
same array object, or one past the last element of the array object,
the evaluation shall not produce an overflow; otherwise, the behavior
is undefined.

It's highly unlikely to segfault even though it's allowed to, but it's still undefined with all that entails.


