Should I Worry About the Alignment During Pointer Casting

Should I worry about the alignment during pointer casting?

1. Is it REALLY safe to dereference the pointer after casting in a real project?

If the pointer happens not to be properly aligned, it really can cause problems. I've personally seen and fixed bus errors in real production code caused by casting a char* to a more strictly aligned type. Even if you don't get an obvious error, you can have less obvious issues such as slower performance. Strictly following the standard to avoid UB is a good idea even if you don't immediately see any problems. (And one rule the code is breaking is the strict aliasing rule, § 3.10/10*)

A better alternative is to use std::memcpy(), or std::memmove() if the buffers overlap (or, better yet, std::bit_cast<>() where C++20 is available).

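#include <cstring>

unsigned char data[16] = {};  // raw bytes, e.g. filled from a file or socket
int i1, i2, i3, i4;
// Offsets assume sizeof(int) == 4.
std::memcpy(&i1, data     , sizeof(int));
std::memcpy(&i2, data +  4, sizeof(int));
std::memcpy(&i3, data +  8, sizeof(int));
std::memcpy(&i4, data + 12, sizeof(int));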
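The same memcpy idea can be wrapped in a small helper so that call sites never form a misaligned pointer at all. This is only a sketch: load_unaligned and round_trip_at_odd_offset are hypothetical names introduced here for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies sizeof(T) bytes out of a byte buffer into a
// properly aligned local T, so no misaligned T* is ever created.
template <typename T>
T load_unaligned(const unsigned char* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    T value;
    std::memcpy(&value, src, sizeof(T));
    return value;
}

// Stores an int at offset 1 of a byte buffer (an address that is usually
// misaligned for int) and reads it back through the helper.
inline int round_trip_at_odd_offset(int input) {
    unsigned char buffer[sizeof(int) + 1] = {};
    std::memcpy(buffer + 1, &input, sizeof(int));  // write without casting
    return load_unaligned<int>(buffer + 1);        // read without casting
}
```

Mainstream optimizing compilers typically turn a fixed-size memcpy like this into a single load on targets that allow unaligned access, so the safe version usually costs nothing.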

That said, because programmers so often get this wrong, some compilers work harder than others to align char arrays more strictly than required. The following program shows how this varies between compilers:

#include <cstdint>
#include <typeinfo>
#include <iostream>

template<typename T> void check_aligned(void *p) {
    std::cout << p << " is " <<
      (0==(reinterpret_cast<std::uintptr_t>(p) % alignof(T))?"":"NOT ") <<
      "aligned for the type " << typeid(T).name() << '\n';
}

void foo1() {
    char a;
    char b[sizeof (int)];
    check_aligned<int>(b); // unaligned in clang
}

struct S {
    char a;
    char b[sizeof(int)];
};

void foo2() {
    S s;
    check_aligned<int>(s.b); // unaligned in clang and msvc
}

S s;

void foo3() {
    check_aligned<int>(s.b); // unaligned in clang, msvc, and gcc
}

int main() {
    foo1();
    foo2();
    foo3();
}

http://ideone.com/FFWCjf

2. Is there any difference between C-style casting and reinterpret_cast?

It depends. C-style casts do different things depending on the types involved. A C-style cast between unrelated pointer types has the same effect as a reinterpret_cast; see § 5.4 Explicit type conversion (cast notation) and § 5.2.9-11.
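To make the "it depends" concrete, here is a small sketch. The types A, B, C and the function names are made up for this illustration. For related class types the C-style cast is resolved as a static_cast, while for unrelated pointer types it falls through to reinterpret_cast.

```cpp
#include <cassert>

struct A { int a = 0; };
struct B { int b = 0; };
struct C : A, B {};  // B is a second, non-empty base; it usually sits at a nonzero offset

// Related class types: the C-style cast behaves like static_cast
// (including any base-class pointer adjustment).
inline bool c_style_matches_static_cast() {
    C c;
    return (B*)&c == static_cast<B*>(&c);
}

// Unrelated pointer types: the C-style cast behaves like reinterpret_cast.
// The storage is aligned for int, and the pointers are only compared,
// never dereferenced.
inline bool c_style_matches_reinterpret_cast() {
    alignas(int) unsigned char storage[sizeof(int)] = {};
    return (int*)storage == reinterpret_cast<int*>(storage);
}

// reinterpret_cast keeps the address as-is; it applies no base-class adjustment.
inline bool reinterpret_keeps_address() {
    C c;
    return static_cast<void*>(reinterpret_cast<B*>(&c)) == static_cast<void*>(&c);
}
```

The practical consequence is that writing reinterpret_cast explicitly documents the intent, whereas a C-style cast can silently change meaning when the types involved change.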

3. Is there any difference between C and C++?

There shouldn't be as long as you're dealing with types that are legal in C.

* Another issue is that C++ does not specify the result of casting from one pointer type to a type with stricter alignment requirements. This is to support platforms where unaligned pointers cannot even be represented. However typical platforms today can represent unaligned pointers and compilers specify the results of such a cast to be what you would expect. As such, this issue is secondary to the aliasing violation. See [expr.reinterpret.cast]/7.

Does casting to a char pointer to increment a pointer by a certain amount and then accessing as a different type violate strict aliasing?

Not inherently so.

Normally, accessing an int * casted from a char * violates strict aliasing rules

Not necessarily. Strict aliasing is about the (effective) type of the pointed-to object. It is quite possible for the object to which a char * points to be an int, or compatible with int, or to be assigned effective type int as a consequence of the (write) access. In such cases, casting to int * and dereferencing the result is perfectly valid.

There are, yes, lots of cases in which casting a char * to an int * and then dereferencing the result would constitute a strict-aliasing violation, but it is not specifically because of the involvement of, or the casting to or from, type char *.

The above applies regardless of how the particular char * value was obtained, so in your particular example case, too. If the result of your pointer computation is a valid pointer, and the object to which it points is genuinely an (effective) int or is compatible with int in one of the specific ways documented in section 6.5 of the language spec, then reading the pointed-to value via the pointer is fine. Otherwise, it is a strict-aliasing violation.
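As an illustration of the valid case, the sketch below steps through an int array using a char pointer and then reads an element back as int. The function name is hypothetical, and the example is written in C++ for consistency with the rest of this page, although the reasoning above is phrased in terms of the C rules. The pointed-to object really is an int, so the read does not violate strict aliasing.

```cpp
#include <cassert>

// Walks two ints forward using byte arithmetic, then reads through int*.
// The final pointer designates values[2], which genuinely is an int.
inline int read_third_element_via_char_pointer() {
    int values[4] = {10, 20, 30, 40};
    char* bytes = reinterpret_cast<char*>(values);  // view the array as bytes
    bytes += 2 * sizeof(int);                       // advance by two ints' worth of bytes
    int* third = reinterpret_cast<int*>(bytes);     // back to the element's real type
    return *third;
}
```

Had the byte offset not been a multiple of sizeof(int), the resulting pointer would not designate any int object, and the read would be invalid.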

Attempting to dereference a pointer value that is not correctly aligned for its type is a potential issue in general with pointer manipulation, but the strict aliasing rule is stronger than and effectively inclusive of pointer alignment considerations. If you have an access that satisfies the strict aliasing rule then the pointer involved must be satisfactorily aligned for its type. The reverse is not necessarily true.

Do note, however, that although on many platforms your align16() will indeed attempt to perform a read of a 16-byte-aligned object, the C language specifications do not require that to be so. Pointer-to-integer and integer-to-pointer conversions are explicitly allowed, but their results are implementation-defined. It is not necessarily the case that the value on the integer side of such a conversion reports on or controls the alignment of the pointer on the other side.

How does the standard deal with such case, accessing a pointer modified while casted to a uintptr_t?

See above. Pointer-to-integer and integer-to-pointer conversions have implementation-defined effect as far as the language spec is concerned. However, on most implementations you're likely to meet, your two versions of align16() will have equivalent behavior.
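The question's align16() is not reproduced here, so the following is only a hypothetical stand-in (align16_up and align16_demo are invented names) showing the usual integer round-trip technique. It assumes a flat address space where the integer value of a pointer is its address, which is exactly the implementation-defined part discussed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an align16()-style helper: rounds the integer
// form of the pointer up to the next multiple of 16.
// Assumption: the pointer <-> integer mapping is the plain address, which
// the language itself does not guarantee.
inline void* align16_up(void* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    bits = (bits + 15u) & ~static_cast<std::uintptr_t>(15u);
    return reinterpret_cast<void*>(bits);
}

// Checks that the result is 16-byte aligned and lies at most 15 bytes
// past the input pointer.
inline bool align16_demo() {
    unsigned char buffer[64] = {};
    void* aligned = align16_up(buffer + 1);
    std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(aligned);
    std::uintptr_t base = reinterpret_cast<std::uintptr_t>(buffer);
    return addr % 16 == 0 && addr >= base + 1 && addr < base + 17;
}
```

Where it fits the use case, the standard library's std::align performs a comparable adjustment without the explicit integer round trip.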

Is pointer arithmetic still well defined after casting with alignment violation?

This can in fact result in undefined behavior if ptr is not properly aligned for uint32_t. Some systems might allow it but others could trigger a fault.

A safe conversion would be to char *, then doing the pointer arithmetic on that.

return (char *)ptr + dword_offset * sizeof(uint32_t);
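For context, that return statement could live in a function along these lines. This is a sketch: the name advance_dwords and the exact signature are assumptions, since the original function is not shown. Because the arithmetic happens on char*, no uint32_t* is ever formed and alignment of the input does not matter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: advances ptr by dword_offset 32-bit words using
// byte arithmetic only.
inline char* advance_dwords(void* ptr, std::size_t dword_offset) {
    return static_cast<char*>(ptr) + dword_offset * sizeof(std::uint32_t);
}

// Starts from a deliberately odd address inside a buffer and checks the
// distance moved, in bytes.
inline bool advance_dwords_demo() {
    unsigned char buffer[32] = {};
    char* start = reinterpret_cast<char*>(buffer) + 1;  // likely misaligned for uint32_t
    char* moved = advance_dwords(start, 3);
    return moved - start == 3 * static_cast<std::ptrdiff_t>(sizeof(std::uint32_t));
}
```

The caller still needs a correctly aligned address, or a memcpy, before reading a uint32_t at the returned location.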

Assert that a pointer is aligned to some value

Just to satisfy my own neuroses, I went and checked the source of ptr::align_offset.

There's a lot of careful work around edge cases (e.g. const-evaluated it always returns usize::MAX, similarly for a pointer to a zero-sized type, and it panics if alignment is not a power of 2). But the crux of the implementation, for your purposes, is here: it takes (ptr as usize) % alignment == 0 to check if it's aligned.

Edit:
This PR is adding a ptr::is_aligned_to function, which is much more readable and also safer and better reviewed than simply (ptr as usize) % alignment == 0 (though the core of it is still that logic).

There's then some more complexity to calculate the exact offset (which may not be possible), but that's not relevant for this question.

Therefore:

assert_eq!(ptr.align_offset(alignment), 0);

should be plenty for your assertion.

Incidentally, this shows that the current Rust standard library cannot target anything that does not represent pointers as simple numerical addresses; otherwise this function would not work. In the unlikely situation that the Rust standard library is ported to the Intel 8086 or some weird DSP that doesn't represent pointers in the expected way, this function would have to change. But really, do you care about that hypothetical that much?

Does this violate strict aliasing or pointer alignment rules?

If alignof(uint16_t) != 1 then this line may cause undefined behaviour due to alignment:

uint16_t* data16 = reinterpret_cast<uint16_t*>(data);

Putting an alignment check after this line is no good: the compiler may assume the cast was valid and hardcode the check to always report "aligned", because it knows that correct code couldn't reach that point otherwise.

In Standard C++, for this check to be meaningful it must occur before the cast, and the cast must not be performed if the check fails. (UB can time travel.)
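A check-before-cast ordering might look like the sketch below. The names as_u16_if_aligned and checked_cast_demo are hypothetical, and the alignment test assumes that the integer value of a pointer is its address, which is implementation-defined. The demo backs the bytes with real uint16_t objects so that using the returned pointer would also be fine from an aliasing point of view.

```cpp
#include <cassert>
#include <cstdint>

// Tests the address first; the conversion to uint16_t* happens only on the
// path where the check has already succeeded.
inline std::uint16_t* as_u16_if_aligned(void* data) {
    if (reinterpret_cast<std::uintptr_t>(data) % alignof(std::uint16_t) != 0) {
        return nullptr;  // check failed: the cast is never performed
    }
    return static_cast<std::uint16_t*>(data);
}

inline bool checked_cast_demo() {
    std::uint16_t words[2] = {0x1234, 0x5678};
    unsigned char* bytes = reinterpret_cast<unsigned char*>(words);
    std::uint16_t* ok = as_u16_if_aligned(bytes);       // start of a real uint16_t
    std::uint16_t* bad = as_u16_if_aligned(bytes + 1);  // one byte in
    // If uint16_t has alignment 1, every address passes, so only require
    // the rejection when the alignment is stricter.
    return ok == words && (alignof(std::uint16_t) == 1 || bad == nullptr);
}
```

When the check fails, the caller can fall back to std::memcpy into a local uint16_t instead of giving up.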

Of course, individual compilers may choose to define behaviour that is not defined by the Standard, e.g. perhaps g++ targeting x86 or x64 includes a definition that you're allowed to form unaligned pointers and dereference them.

There is no strict aliasing violation, as __builtin_bswap16 is not covered by the standard and we presume g++ implements it in such a way that is consistent with itself. MSVC doesn't do strict aliasing optimizations anyway.
