What's a Proper Way of Type-Punning a Float to an Int and Vice-Versa

What's a proper way of type-punning a float to an int and vice-versa?

Forget casts. Use memcpy.

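float xhalf = 0.5f*x;
uint32_t i;
static_assert(sizeof(x) == sizeof(i), "float and uint32_t must have the same size");
std::memcpy(&i, &x, sizeof(i)); // copy the float's bytes into the integer
i = 0x5f375a86 - (i>>1); // initial guess computed on the bit pattern
std::memcpy(&x, &i, sizeof(i)); // copy the bytes back into the float
x = x*(1.5f - xhalf*x*x); // one Newton-Raphson refinement step
return x;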

The original code tries to initialize the int32_t by first accessing the float object through an int32_t pointer, which is where the rules are broken. The C-style cast is equivalent to a reinterpret_cast, so changing it to reinterpret_cast would not make much difference.

The important difference with memcpy is that the bytes are copied from the float into the uint32_t, but the float object is never accessed through an integer lvalue. memcpy takes pointers to void and is specified to copy the underlying bytes, which the aliasing rules permit. Compilers typically recognize a fixed-size memcpy like this one and compile it to a plain move, so it usually costs nothing extra.
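As a minimal sketch of that idea, the two helpers below copy the bytes in each direction. The names float_to_bits and bits_to_float are chosen here for illustration and do not come from the original answer, and the sketch assumes a 32-bit float.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Copy the float's object representation into an integer.
std::uint32_t float_to_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Copy an integer's bytes back into a float.
float bits_to_float(std::uint32_t bits)
{
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}
```

On an IEEE 754 platform, float_to_bits(1.0f) yields 0x3f800000, and passing that value to bits_to_float gives 1.0f back.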

Opinions on type-punning in C++?

As far as the C++ standard is concerned, litb's answer is completely correct and the most portable. Casting const char *data to a const uint32_t *, whether it be via a C-style cast, static_cast, or reinterpret_cast, breaks the strict aliasing rules (see Understanding Strict Aliasing). If you compile with full optimization, there's a good chance that the code will not do the right thing.

Casting through a union (such as litb's my_reint) is probably the best solution, although it does technically violate the rule that writing to a union through one member and reading it through another results in undefined behavior. However, practically all compilers support this, and it produces the expected result. If you absolutely want to conform to the standard 100%, go with the bit-shifting method. Otherwise, I'd recommend casting through a union, which is likely to give you better performance.
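The bit-shifting method is not shown above; a minimal sketch might look like the following. The helper name read_u32_le and the choice of least-significant-byte-first order are assumptions made for illustration. The actual data format dictates the real byte order.

```cpp
#include <cstdint>

// Assemble a 32-bit value from four bytes, least significant byte first.
// Only char-sized reads are performed, so neither the aliasing rules nor
// alignment are a concern, and the result does not depend on the host's
// endianness.
std::uint32_t read_u32_le(const char* data)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

For the byte sequence 0x78 0x56 0x34 0x12 this returns 0x12345678 on any host.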

Unions and type-punning

To re-iterate, type-punning through unions is perfectly fine in C (but not in C++). In contrast, using pointer casts to do so violates C99 strict aliasing and is problematic because different types may have different alignment requirements and you could raise a SIGBUS if you do it wrong. With unions, this is never a problem.

The relevant quotes from the C standards are:

C89 section 3.3.2.3 §5:

if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined

C11 section 6.5.2.3 §3:

A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member

with the following footnote 95:

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

This should be perfectly clear.


James is confused because C11 section 6.7.2.1 §16 reads

The value of at most one of the members can be stored in a union object at any time.

This seems contradictory, but it is not: In contrast to C++, in C, there is no concept of active member and it's perfectly fine to access the single stored value through an expression of an incompatible type.

See also C11 annex J.1 §1:

The values of bytes that correspond to union members other than the one last stored into [are unspecified].

In C99, this used to read

The value of a union member other than the last one stored into [is unspecified]

This was incorrect. As the annex isn't normative, it did not rate its own TC and had to wait until the next standard revision to get fixed.


GNU extensions to standard C++ (and to C90) do explicitly allow type-punning with unions. Other compilers that don't support GNU extensions may also support union type-punning, but it's not part of the base language standard.
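Where the union extension cannot be relied on, memcpy also avoids the alignment problem mentioned earlier for pointer casts, because it accepts any byte address. A hedged sketch, with load_u32 and store_u32 as hypothetical helper names:

```cpp
#include <cstdint>
#include <cstring>

// Read a 32-bit value from any byte address, aligned or not.
// The bytes are interpreted in the host's byte order.
std::uint32_t load_u32(const unsigned char* p)
{
    std::uint32_t v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Write a 32-bit value to any byte address, aligned or not.
void store_u32(unsigned char* p, std::uint32_t v)
{
    std::memcpy(p, &v, sizeof(v));
}
```

Storing a value at an odd offset such as buf + 1 and loading it back from the same address returns the value unchanged, with no misaligned integer access in the source.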

Type punning in a const / static initializer (building a float constant from bits)

If you can use C++20 or later, use std::bit_cast (declared in <bit>), like this:

auto myvar = std::bit_cast<type_to_cast_to>(value_to_cast);

If you want to support older versions, you can do this same thing using std::memcpy to copy the bytes from one type to another. That would give you a function like

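template <class To, class From>
To bit_cast(const From& src)
{
    static_assert(sizeof(To) == sizeof(From), "To and From must have the same size");
    To dst;
    std::memcpy(&dst, &src, sizeof(To)); // requires <cstring>
    return dst;
}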
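To pick between the two automatically, one option is to test the standard feature-test macro __cpp_lib_bit_cast. The sketch below is an illustration under assumptions: pun and kOne are names invented here, and the bit pattern 0x3f800000 is 1.0f only on IEEE 754 platforms. Note that std::memcpy is not usable in constant expressions, so with the fallback path the constant is initialized at run time.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

#ifdef __has_include
#  if __has_include(<version>)
#    include <version>   // defines __cpp_lib_bit_cast when std::bit_cast exists
#  endif
#endif
#ifdef __cpp_lib_bit_cast
#  include <bit>
#endif

// Hypothetical wrapper: std::bit_cast when the library has it, memcpy otherwise.
template <class To, class From>
To pun(const From& src)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
#ifdef __cpp_lib_bit_cast
    return std::bit_cast<To>(src);
#else
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
#endif
}

// Building a float constant from its bits (1.0f, assuming IEEE 754).
const float kOne = pun<float>(std::uint32_t{0x3f800000u});
```

If a true compile-time constant is required, calling std::bit_cast directly in a constexpr initializer under C++20 is the route to take.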

Unions, aliasing and type-punning in practice: what works and what does not?

Aliasing can be taken literally for what it means: it is when two different expressions refer to the same object. Type-punning is to "pun" a type, i.e. to use an object of some type as a different type.

Formally, type-punning is undefined behaviour with only a few exceptions. It commonly happens when you fiddle with bits carelessly:

int mantissa(float f)
{
    return (int&)f & 0x7FFFFF; // Accessing a float as if it's an int
}
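A well-defined variant of the function above could copy the bytes instead of casting. This is a sketch rather than the answer's own code: the name mantissa_bits is hypothetical, and it assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Extracts the same field as the cast-based version, but the float is only
// ever read as bytes by memcpy, so no aliasing rule is violated.
std::uint32_t mantissa_bits(float f)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits & 0x7FFFFFu;   // low 23 bits: the IEEE 754 fraction field
}
```

For 1.0f the fraction field is 0, and for 1.5f it is 0x400000.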

The exceptions are (simplified)

  • Accessing integers as their unsigned/signed counterparts
  • Accessing anything as a char, unsigned char or std::byte

This is known as the strict-aliasing rule: the compiler can safely assume two expressions of different types never refer to the same object (except for the exceptions above) because they would otherwise have undefined behaviour. This facilitates optimizations such as

void transform(float* dst, const int* src, int n)
{
    for(int i = 0; i < n; i++)
        dst[i] = src[i]; // Can be unrolled and use vector instructions
                         // If dst and src alias the results would be wrong
}
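The char exception listed above is what makes byte-level inspection legal. As a small sketch (object_bytes_hex is a hypothetical helper, not part of the original answer), any object can be read through an unsigned char pointer:

```cpp
#include <cstddef>
#include <string>

// Render an object's bytes as hex, in memory order.
// Reading through unsigned char* is permitted for any object type.
template <class T>
std::string object_bytes_hex(const T& value)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    std::string out;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        out += digits[p[i] >> 4];
        out += digits[p[i] & 0x0f];
    }
    return out;
}
```

The output follows memory order, so for multi-byte types it depends on the platform's endianness.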

What gcc says is that it relaxes the rules a bit and allows type-punning through unions, even though the standard doesn't require it to:

union {
    int64_t num;
    struct {
        int32_t hi, lo;
    } parts;
} u = {42};
u.parts.hi = 420;
This is the type-pun gcc guarantees will work. Other cases may appear to work but may one day silently be broken.

Safe and Efficient Type Punning in C++

Bit-wise arithmetic is well defined and perhaps more efficient. For this example:

return seed ^ (seed >> 32);
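In context, that expression might sit in a function like the sketch below. The name fold_seed and the 64-bit-to-32-bit signature are assumptions, since the surrounding code from the question is not reproduced here. The two extra helpers show that the same arithmetic splits a value into halves without any punning.

```cpp
#include <cstdint>

// Fold a 64-bit seed into 32 bits using only a shift and an XOR.
std::uint32_t fold_seed(std::uint64_t seed)
{
    return static_cast<std::uint32_t>(seed ^ (seed >> 32));
}

// Upper and lower 32 bits, independent of the host's byte order.
std::uint32_t high_half(std::uint64_t v)
{
    return static_cast<std::uint32_t>(v >> 32);
}

std::uint32_t low_half(std::uint64_t v)
{
    return static_cast<std::uint32_t>(v);
}
```

Unlike a union of one 64-bit and two 32-bit members, the shifts give the same halves on little-endian and big-endian machines.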

How do I reinterpret data through a different type? (type punning confusion)

Use memcpy:

memcpy(&f2, &a, sizeof(float));

If you are worried about type safety and semantics, you can easily write a wrapper:

void convert(float& x, int a) {
    static_assert(sizeof(float) == sizeof(int), "float and int must have the same size");
    memcpy(&x, &a, sizeof(float));
}

And if you want, you can turn this wrapper into a template to suit your needs.

type-punned warning

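The warning most likely concerns strict aliasing: reading the characters of a string through an int pointer accesses char objects as an int, which the aliasing rules do not allow. In addition, the string is not guaranteed to be aligned the way a declared integer variable is, so when the CPU fetches the integer values the access can be less efficient than it could be (and on some platforms it can fault).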

You could start with integers to begin with:

int a;
int b;
char* as=(char*)(&a);
char* bs=(char*)(&b);
as[0]='f'; as[1]='o'; ...
bs[0]='f'; bs[1]='r'; ...
return EQ4(a, b);

Notes:

1) you will have to make sure you do not copy the terminating '\0' character of the string, because that will be touching memory outside of a (or b) in case of the examples you provided (see next Note).

2) you will have to make sure your strings are no larger than the size of int on the particular platform you are using, otherwise you are (again) touching memory that does not belong to the int.
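A sketch that respects both notes, assuming the goal is to compare the first few characters of two strings as a single integer. The names pack_prefix and same_prefix are hypothetical, and this is not the EQ4 macro from the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy at most sizeof(std::uint32_t) characters into a zeroed integer.
// The terminating '\0' is never copied, and no memory outside `packed`
// is written.
std::uint32_t pack_prefix(const char* s)
{
    std::uint32_t packed = 0;
    std::size_t n = std::strlen(s);
    if (n > sizeof(packed))
        n = sizeof(packed);
    std::memcpy(&packed, s, n);
    return packed;
}

// True when the two strings agree on their first four characters.
bool same_prefix(const char* a, const char* b)
{
    return pack_prefix(a) == pack_prefix(b);
}
```

Because the bytes are copied with memcpy rather than read through a cast pointer, neither the alignment of the strings nor the aliasing rules come into play.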


