C++ Aliasing Rules

What is the strict aliasing rule?

A typical situation where you encounter strict aliasing problems is when overlaying a struct (like a device/network msg) onto a buffer of the word size of your system (like a pointer to uint32_ts or uint16_ts). When you overlay a struct onto such a buffer, or a buffer onto such a struct through pointer casting you can easily violate strict aliasing rules.

So in this kind of setup, if I want to send a message to something I'd have to have two incompatible pointers pointing to the same chunk of memory. I might then naively code something like this:

#include <stdint.h>
#include <stdlib.h>

typedef struct Msg
{
    unsigned int a;
    unsigned int b;
} Msg;

void SendWord(uint32_t);

int main(void)
{
    // Get a 32-bit buffer from the system
    uint32_t* buff = malloc(sizeof(Msg));

    // Alias that buffer through message
    Msg* msg = (Msg*)(buff);

    // Send a bunch of messages
    for (int i = 0; i < 10; ++i)
    {
        msg->a = i;
        msg->b = i+1;
        SendWord(buff[0]);
        SendWord(buff[1]);
    }
}

The strict aliasing rule makes this setup illegal: dereferencing a pointer that aliases an object that is not of a compatible type or one of the other types allowed by C 2011 6.5 paragraph 7 (see footnote 1) is undefined behavior. Unfortunately, you can still code this way, maybe get some warnings, have it compile fine, and only then see weird unexpected behavior when you run the code.

(GCC appears somewhat inconsistent in its ability to give aliasing warnings, sometimes giving us a friendly warning and sometimes not.)

To see why this behavior is undefined, we have to think about what the strict aliasing rule buys the compiler. Basically, with this rule, it doesn't have to think about inserting instructions to refresh the contents of buff every run of the loop. Instead, when optimizing, with some annoyingly unenforced assumptions about aliasing, it can omit those instructions, load buff[0] and buff[1] into CPU registers once before the loop is run, and speed up the body of the loop. Before strict aliasing was introduced, the compiler had to live in a state of paranoia that the contents of buff could change by any preceding memory stores. So to get an extra performance edge, and assuming most people don't type-pun pointers, the strict aliasing rule was introduced.

Keep in mind, if you think the example is contrived, that this can also happen when you pass the buffer to another function that does the sending for you. Suppose instead you have:

void SendMessage(uint32_t* buff, size_t size32)
{
    for (size_t i = 0; i < size32; ++i)
    {
        SendWord(buff[i]);
    }
}

And suppose we rewrote our earlier loop to take advantage of this convenient function:

for (int i = 0; i < 10; ++i)
{
    msg->a = i;
    msg->b = i+1;
    SendMessage(buff, 2);
}

The compiler may or may not be able (or smart enough) to inline SendMessage, and it may or may not decide to load buff again. If SendMessage is part of another API that's compiled separately, it probably has instructions to load buff's contents. Then again, maybe you're in C++ and this is some templated header-only implementation that the compiler thinks it can inline. Or maybe it's just something you wrote in your .c file for your own convenience. Either way, undefined behavior might still ensue. Even when we know some of what's happening under the hood, it's still a violation of the rule, so no well-defined behavior is guaranteed. Merely wrapping the access in a function that takes our word-delimited buffer doesn't necessarily help.

So how do I get around this?

  • Use a union. Most compilers support this without complaining about strict aliasing. This is allowed in C99 and explicitly allowed in C11.

      union {
          Msg msg;
          unsigned int asBuffer[sizeof(Msg)/sizeof(unsigned int)];
      };

  • You can disable strict aliasing in your compiler (-f[no-]strict-aliasing in GCC).

  • You can use char* for aliasing instead of your system's word. The rules allow an exception for char* (including signed char and unsigned char). It's always assumed that char* aliases other types. However this won't work the other way: there's no assumption that your struct aliases a buffer of chars.
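
The union and character-type options above can be sketched roughly as follows. This is a minimal illustration, not a drop-in implementation: the helper names WordViaUnion and MsgToWords are made up for this example, Msg is redeclared with uint32_t members, and the sketch assumes Msg has no padding. memcpy is not mentioned above; it is used here as the usual way to apply the character-type exception, since it copies an object as raw bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct Msg {
    uint32_t a;
    uint32_t b;
} Msg;

/* Assumption for this sketch: Msg has no padding, so it is exactly two words. */
_Static_assert(sizeof(Msg) == 2 * sizeof(uint32_t), "Msg must be two words");

/* Union approach: write through one member, read through the other (C only). */
typedef union MsgWords {
    Msg msg;
    uint32_t asBuffer[sizeof(Msg) / sizeof(uint32_t)];
} MsgWords;

/* Hypothetical helper: fill a message and return the requested word of it. */
uint32_t WordViaUnion(uint32_t a, uint32_t b, size_t index)
{
    MsgWords u;
    assert(index < sizeof u.asBuffer / sizeof u.asBuffer[0]);
    u.msg.a = a;
    u.msg.b = b;
    return u.asBuffer[index];
}

/* Character-type approach: memcpy copies the object byte by byte,
   so no uint32_t lvalue ever touches the Msg object itself. */
void MsgToWords(const Msg *msg, uint32_t words[2])
{
    memcpy(words, msg, sizeof *msg);
}
```

With either helper, the words handed to SendWord come from an object that is legitimately accessed as uint32_t, so the compiler has no licence to assume the stores to the message are unrelated.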

Beginner beware

This is only one potential minefield when overlaying two types onto each other. You should also learn about endianness, word alignment, and how to deal with alignment issues through packing structs correctly.

Footnote

1 The types that C 2011 6.5 7 allows an lvalue to access are:

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

Is void** an exception to strict aliasing rules?

Basically, is this code legal when strict aliasing is enabled?

No. The effective type of pi is int* but you lvalue access the pointer variable through a void*. De-referencing a pointer to give an access which doesn't correspond to the effective type of the object is a strict aliasing violation - with some exceptions, this isn't one.

In your second example, both parameters to the function are set to point at an object of effective type int* which is done here: f(&a, (char **) &a);. Therefore *b inside the function is indeed a strict aliasing violation, since you are using a char* type for the access.

In your third example you do the same but with a void*. This is also a strict aliasing violation. There is nothing special with void* or void** in this context.

Why your compiler exhibits a certain form of undefined behavior in some situations is not very meaningful to speculate about. That said, void* must by definition be convertible to/from any other object pointer type, so the two very likely have the same representation internally, even though that's not an explicit requirement of the standard.

Also you are using -fno-strict-aliasing which turns off various pointer aliasing-based optimizations. If you wish to provoke strange and unexpected results, you shouldn't use that option.
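
One way to stay within the rules when an API hands back a pointer through a void** is to give it a real void* object and convert afterwards. The sketch below is only an illustration: GetObject, GetInt and storage are hypothetical names, not taken from the question's code.

```c
#include <assert.h>
#include <stddef.h>

static int storage = 42;

/* Hypothetical generic API that returns a pointer through a void**. */
void GetObject(void **out)
{
    *out = &storage;
}

int *GetInt(void)
{
    void *tmp = NULL;   /* a genuine void* object, so *out accesses a void* */
    int *pi;

    assert(tmp == NULL);
    GetObject(&tmp);    /* no cast of an int** to void** is needed */
    pi = tmp;           /* well-defined conversion from void* to int* */
    return pi;
}
```

The extra temporary costs nothing after optimization, and the write inside GetObject now matches the effective type of the object it modifies.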

strict aliasing in C

They both violate the strict aliasing rule, I am going to quote my answer here which says (emphasis mine going forward):

code violates the strict aliasing rules which makes it illegal to access an object through a pointer of a different type, although access through a char * is allowed. The compiler is allowed to assume that pointers of different types do not point to the same memory and optimize accordingly.

gcc is a little more detailed in the documentation of -Wstrict-aliasing=n, which says:

This option is only active when -fstrict-aliasing is active. It warns about code that might break the strict aliasing rules that the compiler is using for optimization. Higher levels correspond to higher accuracy (fewer false positives). Higher levels also correspond to more effort, similar to the way -O works. -Wstrict-aliasing is equivalent to -Wstrict-aliasing=3.

and describes each level as follows:

  • Level 1: Most aggressive, quick, least accurate. Possibly useful when
    higher levels do not warn but -fstrict-aliasing still breaks the code,
    as it has very few false negatives. However, it has many false
    positives. Warns for all pointer conversions between possibly
    incompatible types, even if never dereferenced. Runs in the front end
    only.

  • Level 2: Aggressive, quick, not too precise. May still have many false
    positives (not as many as level 1 though), and few false negatives
    (but possibly more than level 1). Unlike level 1, it only warns when
    an address is taken. Warns about incomplete types. Runs in the front
    end only.

  • Level 3 (default for -Wstrict-aliasing): Should have very few false
    positives and few false negatives. Slightly slower than levels 1 or 2
    when optimization is enabled. Takes care of the common pun+dereference
    pattern in the front end: *(int*)&some_float. If optimization is
    enabled, it also runs in the back end, where it deals with multiple
    statement cases using flow-sensitive points-to information. Only warns
    when the converted pointer is dereferenced. Does not warn about
    incomplete types.

So it is not guaranteed to catch all instances and different levels have different degrees of accuracy.

Typically the effect you are looking for can be accomplished using type punning through a union, which I cover in my linked answer above and gcc explicitly supports.
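
As a rough sketch of union-based type punning in C, the function below reads the bit pattern of a float. FloatBits is a hypothetical name chosen for this example, and the sketch assumes float is 32 bits wide; the exact bit patterns you get also depend on the platform's floating-point format.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Reads the object representation of f through a union member,
   instead of the pun+dereference pattern *(uint32_t*)&f. */
uint32_t FloatBits(float f)
{
    union {
        float    asFloat;
        uint32_t asBits;
    } pun;

    pun.asFloat = f;
    return pun.asBits;
}
```

This form is accepted by C99 and C11 and is documented as supported by gcc; it should not be carried over to C++ unchanged.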

What is the rationale behind the strict aliasing rule?

Since, in this example, all the code is visible to a compiler, a compiler can, hypothetically, determine what is requested and generate the desired assembly code. However, demonstration of one situation in which a strict aliasing rule is not theoretically needed does nothing to prove there are not other situations where it is needed.

Consider if the code instead contains:

foo(&val, ptr)

where the declaration of foo is void foo(uint64_t *a, uint32_t *b);. Then, inside foo, which may be in another translation unit, the compiler would have no way of knowing that a and b point to (parts of) the same object.

Then there are two choices: One, the language may permit aliasing, in which case the compiler, while translating foo, cannot make optimizations relying on the fact that *a and *b are different. For example, whenever something is written to *b, the compiler must generate assembly code to reload *a, since it may have changed. Optimizations such as keeping a copy of *a in registers while working with it would not be allowed.

Two, the language may prohibit aliasing (specifically, not define the behavior if a program does it). In this case, the compiler can make optimizations relying on the fact that *a and *b are different.

The C committee chose option two because it offers better performance while not unduly restricting programmers.
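
To make the trade-off concrete, here is a small sketch of a function with the same parameter types as foo. SumAroundStore is a hypothetical name, and what a compiler actually emits for it is not guaranteed; the comments describe what the rule permits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* uint64_t and uint32_t are incompatible types, so the compiler may assume
   that the store through b cannot change *a. */
uint64_t SumAroundStore(uint64_t *a, uint32_t *b)
{
    uint64_t before;

    assert(a != NULL && b != NULL);
    before = *a;        /* first load of *a */
    *b = 0;             /* store through a pointer of an incompatible type */
    return before + *a; /* the second load may be replaced by `before` */
}
```

When a and b point to distinct objects, the result is the same with or without the optimization. If a caller passes pointers into the same object, the behavior is undefined, which is exactly what frees the compiler from reloading *a.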

Strict Aliasing Rule and Type Aliasing in C++

No, it's not legal and you have Undefined Behavior:

8.2.1 Value category [basic.lval]

11 If a program attempts to access the stored value of an object
through a glvalue of other than one of the following types the
behavior is undefined (see footnote 63):

(11.1) — the dynamic type of the object,

(11.2) — a cv-qualified version of the dynamic type of the object,

(11.3) — a type similar (as defined in 7.5) to the dynamic type of the
object,

(11.4) — a type that is the signed or unsigned type corresponding to
the dynamic type of the object,

(11.5) — a type that is the signed or unsigned type corresponding to a
cv-qualified version of the dynamic type of the object,

(11.6) — an aggregate or union type that includes one of the
aforementioned types among its elements or nonstatic data members
(including, recursively, an element or non-static data member of a
subaggregate or contained union),

(11.7) — a type that is a (possibly cv-qualified) base class type of
the dynamic type of the object,

(11.8) — a char, unsigned char, or std::byte type



63) The intent of this list is to specify those circumstances in which
an object may or may not be aliased.

Is this strict aliasing violation? Can any type pointer alias a char pointer?

Strict aliasing means that to dereference a T* ptr, there must be a T object at that address, alive obviously. Effectively this means you cannot naively bit-cast between two incompatible types and also that a compiler can assume that no two pointers of incompatible types point to the same location.

The exception is unsigned char, char and std::byte, meaning you can reinterpret_cast any object pointer to a pointer to one of these three types and dereference it.

(T*)ptr; is valid because at ptr there exists a T object. That is all that is required, it does not matter how you got that pointer*, through how many casts it went. There are some more requirements when T has constant members but that has to do more with placement new and object resurrection - see this answer if you are interested.

*It probably does matter even when there are no const members, though I am not sure (see the relevant question). @eerorika's answer is more correct in suggesting std::launder or assigning from the placement new expression.

For the record, a void* can alias any other type pointer, and any type pointer can alias a void*.

That is not true; void is not one of the three allowed types. But I assume you are just misinterpreting the word "alias": strict aliasing only applies when a pointer is dereferenced, and you are of course free to have as many pointers pointing wherever you want as long as you do not dereference them. Since void* cannot be dereferenced, it's a moot point.

Addressing your second example:

char* buffer = (char*)malloc(16); // OK

// Assigning pointers is always defined; the rules only say when
// it is safe to dereference such a pointer.
// A cast is required here: a char* cannot be converted to float*
// implicitly in C++ (C only produces a warning).
float* pFloat = reinterpret_cast<float*>(buffer);

// NOT OK, there is no float at `buffer` - violates strict aliasing.
*pFloat = 6;
// Now there is a float
new (pFloat) float;
// Yes, now it is OK.
*pFloat = 7;

Why doesn't strict aliasing rule apply to int* and unsigned*?

To understand the intended meaning of the signed/unsigned exemption, one must first understand the background of those types. The C language didn't originally have an "unsigned" integer type, but was instead designed for use on two's-complement machines with quiet wraparound on overflow. While there were a few operations, most notably the relational operators, divide, remainder, and right-shift, where signed and unsigned behaviors would differ, performing most operations on signed types would yield the same bit patterns as performing those same operations on unsigned types, thus minimizing the need for the latter.

Although unsigned types are certainly useful even on quiet-wraparound two's-complement machines, they are indispensable on platforms that do not support quiet-wraparound two's-complement semantics. Because C did not initially support such platforms, however, a lot of code which logically "should" have used unsigned types, and would have used them if they'd existed sooner, was written to use signed types instead. The authors of the Standard did not want the type-access rules to create any difficulty interfacing between code which used signed types because unsigned types weren't available when it was written, and code which used unsigned types because they were available and their use would make sense.

The historical reasons for treating int and unsigned interchangeably would apply equally to allowing objects of type int* to be accessed using lvalues of type unsigned* and vice versa, int** to be accessed using unsigned**, etc. While the Standard doesn't explicitly specify that any such usages should be allowed, it also neglects to mention some other uses that should obviously be allowed, and thus cannot be reasonably viewed as fully and completely describing everything that implementations should support.

The Standard fails to distinguish between two kinds of circumstances involving pointer-based type punning--those which involve aliasing, and those which don't-- beyond a non-normative footnote saying that the purpose of the rules is to indicate when things may alias. The distinction is illustrated below:

int *x;
unsigned thing;
int *usesAliasingUnlessXandPDisjoint(unsigned **p)
{
    if (x)
        *p = &thing;
    return x;
}

If x and *p identify the same storage, there would be aliasing between *p and x, because the creation of p and the write via *p would be separated by a conflicting access to the storage using the lvalue x. However, given something like:

unsigned thing;
void writeUnsignedPtr(unsigned **p)
{ *p = &thing; }

int *x;
int *doesNotUseAliasing(void)
{
    if (x)
        writeUnsignedPtr((unsigned**)&x);
    return x;
}

there would be no aliasing between the *p argument and x, since within the lifetime of the passed pointer p, neither x nor any other pointer or lvalue not derived from p is used to access the same storage as *p. I think it's clear the authors of the Standard wanted to allow for the latter pattern. I think it's less clear whether they wanted to allow the former even for lvalues of type signed and unsigned [as opposed to signed* or unsigned*], or didn't realize that limiting application of the rule to cases that actually involve aliasing would be sufficient to allow the latter.

The way gcc and clang interpret the aliasing rules does not extend the compatibility between int and unsigned to int* and unsigned*--a limitation which is allowable given the wording of the Standard, but which--at least in cases not involving aliasing, I would regard as contrary to the Standard's stated purpose.

Your particular example does involve aliasing in cases where *a and *b overlap, since either a was created first and a conflicting access via *b happens between such creation and the last use of *a, or b was created first and a conflicting access via *a happens between such creation and the last use of b. I'm not sure whether the authors of the Standard intended to allow such usage or not, but the same reasons that would justify allowing int and unsigned would apply equally to int* and unsigned*. On the other hand, gcc and clang's behavior does not seem to be dictated by what the authors of the Standard meant to say as indicated by the published Rationale, but rather by what they fail to demand that compilers do.
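
Given how gcc and clang read the rule, a conservative way to get the effect of doesNotUseAliasing is to let the callee write into a real unsigned* and convert the result. The sketch below is an assumption-laden illustration: ReadUnsignedAsInt and setXPortably are hypothetical names, and thing is given an initial value only so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>

unsigned thing = 5;
int *x;

/* Allowed by the signed/unsigned exemption: int is the signed type
   corresponding to unsigned, so an unsigned object may be read as int. */
int ReadUnsignedAsInt(unsigned *u)
{
    assert(u != NULL);
    return *(int *)u;
}

void writeUnsignedPtr(unsigned **p)
{
    *p = &thing;
}

/* The pointer object that writeUnsignedPtr stores into really is an
   unsigned*, so no int* object is accessed through an unsigned* lvalue. */
int *setXPortably(void)
{
    unsigned *tmp = NULL;

    writeUnsignedPtr(&tmp);
    x = (int *)tmp;     /* convert the pointer value, not the pointer object */
    return x;
}
```

The exemption covers the pointed-to int/unsigned object, while the temporary sidesteps the question of whether it extends to int* and unsigned*.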


