What's a proper way of type-punning a float to an int and vice-versa?
Forget casts. Use memcpy.
float xhalf = 0.5f*x;
uint32_t i;
assert(sizeof(x) == sizeof(i));
std::memcpy(&i, &x, sizeof(i));
i = 0x5f375a86 - (i>>1);
std::memcpy(&x, &i, sizeof(i));
x = x*(1.5f - xhalf*x*x);
return x;
The original code tries to initialize the int32_t by first accessing the float object through an int32_t pointer, which is where the rules are broken. The C-style cast is equivalent to a reinterpret_cast, so changing it to reinterpret_cast would not make much difference.
The important difference when using memcpy is that the bytes are copied from the float into the int32_t, but the float object is never accessed through an int32_t lvalue, because memcpy takes pointers to void and its insides are "magical" and don't break the aliasing rules.
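As a minimal sketch of the same idea in reusable form, the two helpers below copy the bytes in each direction. The names float_to_bits and bits_to_float are hypothetical, and the sketch assumes a 32-bit IEEE-754 float, which the language itself does not guarantee.

```cpp
#include <cstdint>
#include <cstring>

// Assumption: float is 32 bits wide on this platform.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch needs a 32-bit float");

// Copy the object representation of a float into an integer.
inline std::uint32_t float_to_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Copy the bytes back the other way.
inline float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Mainstream optimizing compilers typically turn each memcpy here into a single register move, so this usually costs nothing compared with the cast.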
Opinions on type-punning in C++?
As far as the C++ standard is concerned, litb's answer is completely correct and the most portable. Casting const char *data to a const uint32_t *, whether it be via a C-style cast, static_cast, or reinterpret_cast, breaks the strict aliasing rules (see Understanding Strict Aliasing). If you compile with full optimization, there's a good chance that the code will not do the right thing.
Casting through a union (such as litb's my_reint) is probably the best solution, although it does technically violate the rule that writing to a union through one member and reading it through another results in undefined behavior. However, practically all compilers support this, and it gives the expected result. If you absolutely desire to conform to the standard 100%, go with the bit-shifting method. Otherwise, I'd recommend going with casting through a union, which is likely to give you better performance.
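For reference, the bit-shifting method could look roughly like the sketch below. The function name read_u32_le is hypothetical, and the least-significant-byte-first order is an assumption; reverse the shifts if the data is big-endian.

```cpp
#include <cstdint>

// Assemble a 32-bit value from four bytes, least-significant byte first.
// Reading the buffer through unsigned char is allowed by the aliasing rules,
// and it avoids sign extension when plain char is signed.
inline std::uint32_t read_u32_le(const char* data) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Because the result is built from byte values rather than from the object representation, it does not depend on the host's endianness or on the alignment of data.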
Unions and type-punning
To re-iterate, type-punning through unions is perfectly fine in C (but not in C++). In contrast, using pointer casts to do so violates C99 strict aliasing and is problematic because different types may have different alignment requirements and you could raise a SIGBUS if you do it wrong. With unions, this is never a problem.
The relevant quotes from the C standards are:
C89 section 3.3.2.3 §5:
if a member of a union object is accessed after a value has been stored in a different member of the object, the behavior is implementation-defined
C11 section 6.5.2.3 §3:
A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member
with the following footnote 95:
If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.
This should be perfectly clear.
James is confused because C11 section 6.7.2.1 §16 reads
The value of at most one of the members can be stored in a union object at any time.
This seems contradictory, but it is not: In contrast to C++, in C, there is no concept of active member and it's perfectly fine to access the single stored value through an expression of an incompatible type.
See also C11 annex J.1 §1:
The values of bytes that correspond to union members other than the one last stored into [are unspecified].
In C99, this used to read
The value of a union member other than the last one stored into [is unspecified]
This was incorrect. As the annex isn't normative, it did not rate its own TC and had to wait until the next standard revision to get fixed.
GNU extensions to standard C++ (and to C90) do explicitly allow type-punning with unions. Other compilers that don't support GNU extensions may also support union type-punning, but it's not part of the base language standard.
Type punning in a const / static initializer (building a float constant from bits)
If you can use C++20 or above, then use std::bit_cast, like this:
auto myvar = std::bit_cast<type_to_cast_to>(value_to_cast);
If you want to support older versions, you can do the same thing using std::memcpy to copy the bytes from one type to another. That would give you a function like
template <class To, class From>
To bit_cast(const From& src)
{
To dst;
std::memcpy(&dst, &src, sizeof(To));
return dst;
}
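The function above compiles for any pair of types, including ones where the copy would be meaningless. A slightly more defensive sketch adds the compile-time checks that std::bit_cast enforces; the name bit_cast_checked is hypothetical, and unlike std::bit_cast this version is not constexpr and needs To to be default-constructible.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

template <class To, class From>
To bit_cast_checked(const From& src) {
    // Reject mismatched sizes and types that cannot be copied bytewise.
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
}
```

Assuming a 32-bit IEEE-754 float, bit_cast_checked<std::uint32_t>(1.0f) gives 0x3f800000, while a call such as bit_cast_checked<std::uint64_t>(1.0f) fails to compile instead of reading past the source object.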
Unions, aliasing and type-punning in practice: what works and what does not?
Aliasing can be taken literally for what it means: it is when two different expressions refer to the same object. Type-punning is to "pun" a type, i.e. to use an object of some type as a different type.
Formally, type-punning is undefined behaviour with only a few exceptions. It commonly happens when you fiddle with bits carelessly:
int mantissa(float f)
{
return (int&)f & 0x7FFFFF; // Accessing a float as if it's an int
}
The exceptions are (simplified):
- Accessing integers as their unsigned/signed counterparts
- Accessing anything as a char, unsigned char or std::byte
This is known as the strict-aliasing rule: the compiler can safely assume two expressions of different types never refer to the same object (except for the exceptions above) because they would otherwise have undefined behaviour. This facilitates optimizations such as
void transform(float* dst, const int* src, int n)
{
for(int i = 0; i < n; i++)
dst[i] = src[i]; // Can be unrolled and use vector instructions
// If dst and src alias the results would be wrong
}
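The char/unsigned char exception listed above is what makes byte-level inspection legal. A minimal sketch, with the hypothetical name object_bytes, that copies out the object representation of any value:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Read an object's bytes through unsigned char, which may alias any type.
template <class T>
std::array<unsigned char, sizeof(T)> object_bytes(const T& value) {
    std::array<unsigned char, sizeof(T)> out{};
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    for (std::size_t i = 0; i < sizeof(T); ++i)
        out[i] = p[i];
    return out;
}
```

The order of the returned bytes follows the platform's endianness, so code that interprets them should not assume a particular layout.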
What gcc says is that it relaxes the rules a bit and allows type-punning through unions, even though the standard doesn't require it to:
union {
int64_t num;
struct {
int32_t hi, lo;
} parts;
} u = {42};
u.parts.hi = 420;
This is the type-pun gcc guarantees will work. Other cases may appear to work but may one day silently be broken.
Safe and Efficient Type Punning in C++
Bit-wise arithmetic is well defined and perhaps more efficient. For this example:
return seed ^ (seed >> 32);
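Wrapped in a function, that expression might look like the sketch below. The name fold_seed is hypothetical, and the sketch assumes seed is a 64-bit unsigned value that should be reduced to 32 bits by mixing its high and low halves.

```cpp
#include <cstdint>

// XOR the high 32 bits into the low 32 bits, then keep the low half.
// Only integer arithmetic is involved, so no aliasing question arises.
inline std::uint32_t fold_seed(std::uint64_t seed) {
    return static_cast<std::uint32_t>(seed ^ (seed >> 32));
}
```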
How do I reinterpret data through a different type? (type punning confusion)
Use memcpy:
memcpy(&f2, &a, sizeof(float));
If you are worried about type safety and semantics, you can easily write a wrapper:
void convert(float& x, int a) {
memcpy(&x, &a, sizeof(float));
}
And if you want, you can make this wrapper template to satisfy your needs.
type-punned warning
The warning is because the string is not guaranteed to be aligned the same way as when an integer variable is declared. Thus when the CPU needs to fetch the integer values, you are potentially making it less efficient than it could be (hence the warning).
You could start with integers to begin with:
int a;
int b;
char* as=(char*)(&a);
char* bs=(char*)(&b);
as[0]='f'; as[1]='o'; ...
bs[0]='f'; bs[1]='r'; ...
return EQ4(a, b);
Notes:
1) you will have to make sure you do not copy the terminating '\0' character of the string, because that would touch memory outside of a (or b) in the examples you provided (see the next note).
2) you will have to make sure your strings are no larger than the size of int on the particular platform you are using, otherwise you are (again) touching memory that does not belong to the int.
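An alternative that sidesteps both notes is to copy exactly four bytes out of each string with memcpy and compare the resulting integers. This is only a sketch: load4 and equal4 are hypothetical names (not the EQ4 from the question), and both strings are assumed to hold at least four characters.

```cpp
#include <cstdint>
#include <cstring>

// Copy exactly four bytes; the terminator is never read or written,
// and the source pointer does not need to be int-aligned.
inline std::uint32_t load4(const char* s) {
    std::uint32_t v;
    std::memcpy(&v, s, sizeof v);
    return v;
}

// Compare the first four characters of two strings as one integer each.
inline bool equal4(const char* a, const char* b) {
    return load4(a) == load4(b);
}
```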