Floating Point Endianness

Floating point Endianness?

Yes, floating point can be endianness dependent. See the question Converting float values from big endian to little endian for more information, and be sure to read the comments.

Would floating point format be affected by big-endian and little-endian?

The IEEE754 specification for floating point numbers simply doesn't cover the endianness problem. Floating point numbers therefore may use different representations on different machines and, in theory, it's even possible for two processors to share the same integer endianness while differing in floating point endianness, or vice versa.

See the Wikipedia article on endianness for more information.
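
For illustration, here is a minimal C++ sketch, assuming uint32_t and float are both 4 bytes and float is IEEE-754 binary32, that probes integer and float byte order independently. It only distinguishes big from not-big for each type; a more robust test would check every byte, as a later answer below recommends.

#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    const uint32_t i = 0x01020304u;
    unsigned char b[4];
    std::memcpy(b, &i, 4);
    // On a big-endian machine the most significant byte comes first.
    std::printf("integer: %s-endian\n", b[0] == 0x01 ? "big" : "little");

    const float f = -1.0f;  // bit pattern 0xBF800000 in IEEE-754 binary32
    std::memcpy(b, &f, 4);
    std::printf("float:   %s-endian\n", b[0] == 0xBF ? "big" : "little");
}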

What is the difference between big and little endian floats?

Some sources say IEEE754 floats are always stored little-endian, but the IEEE754 specification for floating point numbers simply doesn't cover the endianness problem, so the byte order may vary from machine to machine.
Here is sample code for floating point / byte array conversion:

#include <stdio.h>

int main(int argc, char **argv) {
    char *a;
    float f = 3.14159; // number to start with

    a = (char *)&f; // point a to f's location

    // print float & byte array as hex
    printf("float: %f\n", f);
    printf("byte array: %hhX:%hhX:%hhX:%hhX\n",
           a[0], a[1], a[2], a[3]);

    // toggle the sign of f -- using the byte array
    a[3] = ((unsigned int)a[3]) ^ 128;

    // print the numbers again
    printf("float: %f\n", f);
    printf("byte array: %hhX:%hhX:%hhX:%hhX\n",
           a[0], a[1], a[2], a[3]);

    return 0;
}

Its output on a little-endian machine:

float: 3.141590
byte array: D0:F:49:40
float: -3.141590
byte array: D0:F:49:C0

Theoretically, on a big-endian machine the order of bytes would be reversed.
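
A related sketch, mine rather than the article's, again assuming float is IEEE-754 binary32: printing through a uint32_t and shifts gives the bytes most-significant first on any host, because shifts operate on the value, not on the memory layout.

#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    float f = 3.14159f;
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);  // well-defined, unlike pointer punning
    std::printf("%02X:%02X:%02X:%02X\n",
                (unsigned)((u >> 24) & 0xFF), (unsigned)((u >> 16) & 0xFF),
                (unsigned)((u >> 8) & 0xFF),  (unsigned)(u & 0xFF));
    // Expected 40:49:0F:D0 regardless of host byte order.
}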

Reference:
http://betterexplained.com/articles/understanding-big-and-little-endian-byte-order/

Endianness for floating point

A check for float endianness is also a check for the encoding.

If the encoding is not float32, detect that.

Instead of checking with a byte pattern like 0xBF800000 (-1.0f), which contains multiple zero bytes, consider a test value whose expected pattern differs in every byte. Also check every byte.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main() {
    const float f = -0x1.ca8642p-113f; // bit pattern 0x87654321, IEEE-754 binary32
    if (sizeof(float) != 4) {
        printf("float is not 4 bytes\n");
    } else if (memcmp(&f, (uint8_t[4]){0x87, 0x65, 0x43, 0x21}, sizeof f) == 0) {
        printf("Big\n");
    } else if (memcmp(&f, (uint8_t[4]){0x21, 0x43, 0x65, 0x87}, sizeof f) == 0) {
        printf("Little\n");
    } else {
        printf("Unknown\n"); // TBD endian or float encoding
    }
}

What is the most correct way to change endianness of floating point numbers

Sebastian Redl's answer is correct if you stay with plain IEEE-754 float or double, but it will fail with Intel's special long double representation, and with all the other special ideas architectures have for their long double formats; only very few architectures use a standard IEEE-754 format for long double.
Even plain MIPS, which can run big- or little-endian at will, has a special MIPS64 16-byte long double format.

So there's no correct and easy way to do a fast byteswap for floats. I did, however, write code to read floats from various architectures into the current architecture, which is a herculean task: https://github.com/parrot/parrot/blob/native_pbc2/src/packfile/pf_items.c#L553
Note: the Intel speciality is the explicit normalization bit (bit 63, the highest bit of the mantissa), marked with i in https://github.com/parrot/parrot/blob/native_pbc2/src/packfile/pf_items.c#L605

That is, I convert between the following formats, in both BE and LE:

  • Floattype 0 = IEEE-754 8-byte double (binary64)
  • Floattype 1 = Intel 80-bit long double, stored in 12 bytes (i386) or aligned to 16 bytes (x86_64/ia64)
  • Floattype 2 = IEEE-754 128-bit quad precision stored in 16 bytes; Sparc64 quad-float or __float128, gcc since 4.3 (binary128)
  • Floattype 3 = IEEE-754 4-byte float (binary32)
  • Floattype 4 = PowerPC 16-byte double-double (-mlong-double-128)

Not yet supported:

  • Floattype 5 = IEEE-754 2-byte half-precision float (binary16)
  • Floattype 6 = MIPS64 16-byte long double
  • Floattype 7 = AIX 16-byte long double
  • CRAY and more craziness

Since there was no big need, I never made a proper library out of this float-conversion code.
By the way, I use much faster native byteswap functions; see https://github.com/parrot/parrot/blob/native_pbc2/include/parrot/bswap.h
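
For illustration only, a portable sketch of the byteswap idea, assuming an 8-byte double; the linked bswap.h uses faster native builtins instead. This version swaps in the integer domain and never routes the foreign bit pattern through a double, which on some FPUs could silently alter NaN bits.

#include <cstdint>
#include <cstring>

// Load a double's bytes and return them reversed, as a raw 64-bit pattern.
uint64_t load_swapped(const double* src) {
    uint64_t u;
    std::memcpy(&u, src, sizeof u);
    u = (u << 32) | (u >> 32);  // swap the 32-bit halves
    u = ((u & 0x0000FFFF0000FFFFull) << 16) | ((u >> 16) & 0x0000FFFF0000FFFFull);
    u = ((u & 0x00FF00FF00FF00FFull) << 8)  | ((u >> 8)  & 0x00FF00FF00FF00FFull);
    return u;  // reinterpret as double only where the swapped order is native
}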

The usual portable alternative is to print with maximum precision to a string and read that string back; the only remaining problem is finding out what your maximum precision is.
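
One way to sidestep the precision question, shown here as my own sketch: the C99/C++11 hexadecimal float format %a is exact, so the print/parse round trip loses nothing.

#include <cstdio>
#include <cstdlib>

int main() {
    double d = 3.141592653589793;
    char buf[64];
    std::snprintf(buf, sizeof buf, "%a", d);   // exact hexadecimal representation
    double back = std::strtod(buf, nullptr);   // strtod accepts hex floats since C99
    std::printf("%s -> %.17g (round trip %s)\n",
                buf, back, back == d ? "ok" : "failed");
}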

C++20 and float endianness

Does the C++20 endian proposal deal only with integer types, or does it also give information on float types?

As it stands, it will tell you whether all scalar types are big-endian or little-endian, or, the horror, neither, in which case you're dealing with mixed endianness.

All arithmetic types are included in scalars, both integer and floating point types.

As to why: pure speculation, but providing a portable test, after gazillions of C type-punning versions have been ported to C++, would be one reason.

Also, the articles I could find seemed to suggest the additions will only provide a way to detect endianness, not standard library functions for converting between different encodings. If that is the case, this doesn't seem any more useful than just standardizing a preprocessor definition.

You only get a portable way to detect endianness.
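
A short sketch of what that detection looks like; note that std::endian lives in the <bit> header, and conversions remained manual until C++23's std::byteswap, which covers only integers.

#include <bit>
#include <cstdio>

int main() {
    if constexpr (std::endian::native == std::endian::big)
        std::printf("big-endian\n");
    else if constexpr (std::endian::native == std::endian::little)
        std::printf("little-endian\n");
    else
        std::printf("mixed-endian\n");  // the horror
}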

Data storage IEEE754 with little and big endian

Your interpretation is perfectly correct. It can be easily verified with a simple program.

#include <stdio.h>
#include <string.h>

int main() {
    float f = -123456.75f;
    int i;
    unsigned char c[4];

    memcpy(c, &f, 4);
    memcpy(&i, &f, 4);
    printf("decimal representation of f: %f\n", f);
    printf("hex representation of f: %a\n", f);
    printf("hex value of integer with the same bytes as f: %x\n", i);
    printf("successive bytes in f (0:3): %.2x %.2x %.2x %.2x\n", c[0], c[1], c[2], c[3]);
    /* gives
    decimal representation of f: -123456.750000
    hex representation of f: -0x1.e240cp+16
    hex value of integer with the same bytes as f: c7f12060
    successive bytes in f (0:3): 60 20 f1 c7
    */
}

I’m slightly confused now, because this must mean that e.g. the bit for the sign of the number is the first one of the AB Byte, isn’t it?

No reason to be confused. In memory, the sign bit will indeed be the MSB of the fourth byte of a float's representation on a little-endian architecture.

But endianness only concerns how bytes are stored in memory. Whatever the endianness, once loaded in a register and manipulated by a program, the behavior will be identical.

We are used to writing left to right and to starting numbers with the most significant digit. But other representation schemes are perfectly valid, provided the mathematical properties are kept.

If it can help you, write the bytes with the least significant bit at the left (but right and left have no meaning on a computer).

|--------- Byte 1 ------|--------- Byte 2 ------------|--------- Byte 3 -------------|-------- Byte 4 ---------|
|m0 m1 m2 m3 m4 m5 m6 m7|m8 m9 m10 m11 m12 m13 m14 m15|m16 m17 m18 m19 m20 m21 m22 e0|e1 e2 e3 e4 e5 e6 e7 sign|
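
A quick sketch to verify that layout on a little-endian machine, assuming a 4-byte IEEE-754 float: the sign bit should be the MSB of the fourth byte.

#include <cstdio>
#include <cstring>

int main() {
    unsigned char c[4];
    float f = -1.0f;
    std::memcpy(c, &f, 4);
    std::printf("sign bit: %d\n", (c[3] >> 7) & 1);  // 1 for -1.0f
    f = 1.0f;
    std::memcpy(c, &f, 4);
    std::printf("sign bit: %d\n", (c[3] >> 7) & 1);  // 0 for +1.0f
}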

Issues with converting from floating point little-endian to big-endian and back again

I believe you are encountering NaN collapsing. If you really need to distinguish different NaN values you are going to have more problems than just your file storage. According to the Java Language Specification, 4.2.3. Floating-Point Types, Formats, and Values:

IEEE 754 allows multiple distinct NaN values for each of its single and double floating-point formats. While each hardware architecture returns a particular bit pattern for NaN when a new NaN is generated, a programmer can also create NaNs with different bit patterns to encode, for example, retrospective diagnostic information.

For the most part, the Java SE Platform treats NaN values of a given type as though collapsed into a single canonical value, and hence this specification normally refers to an arbitrary NaN as though to a canonical value.

I asked "Why do you use Float.floatToRawIntBits(raf.readFloat()) rather than raf.readInt()?" because I was trying to understand, and possibly simplify, your test program, not in any expectation of fixing the problem.
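
The underlying IEEE-754 point is easy to demonstrate outside Java. Here is a sketch with hypothetical payload values, assuming IEEE-754 binary32: two quiet NaNs with different payloads are both NaN, yet carry distinct bit patterns, which is exactly the distinction a collapsing read loses.

#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    uint32_t bits1 = 0x7FC00001u, bits2 = 0x7FC00002u;  // quiet NaNs, payloads 1 and 2
    float n1, n2;
    std::memcpy(&n1, &bits1, 4);
    std::memcpy(&n2, &bits2, 4);
    std::printf("isnan: %d %d\n", std::isnan(n1), std::isnan(n2));  // 1 1
    std::printf("same bit pattern: %d\n", bits1 == bits2);          // 0
}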

Handling endianness of floating point values when there is no fixed size floating point type available

If you want to be completely cross-platform and standards-compliant, then the frexp/ldexp solution is the best way to go. (Although you might need to consider the highly theoretical case where either the source or the target hardware uses decimal floating point.)
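
A minimal sketch of that frexp/ldexp approach; the two-integer encoding and the 53-bit scaling are my assumptions for illustration, not part of the answer, and infinities and NaNs are left unhandled.

#include <cmath>
#include <cstdint>
#include <cstdio>

// Reduce d to an integer mantissa/exponent pair; 53 bits matches a binary64
// significand, so the round trip is exact for finite doubles.
void encode(double d, int64_t* mant, int* exp2) {
    int e;
    double m = std::frexp(d, &e);        // d == m * 2^e, with 0.5 <= |m| < 1 or m == 0
    *mant = (int64_t)std::ldexp(m, 53);  // scale the fraction up to an integer
    *exp2 = e - 53;
}

double decode(int64_t mant, int exp2) {
    return std::ldexp((double)mant, exp2);  // mant * 2^exp2
}

int main() {
    int64_t m; int e;
    encode(-123456.75, &m, &e);
    std::printf("mant=%lld exp=%d -> %f\n", (long long)m, e, decode(m, e));
}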

Suppose that one or the other machine did not have a 32-bit floating point representation. Then there is no datatype on that machine bit-compatible with a 32-bit floating point number, regardless of endianness. So there is then no standard way of converting the non-32-bit float to a transmittable 32-bit representation, or of converting the transmitted 32-bit representation to a native non-32-bit floating point number.

You could restrict your scope to machines which have a 32-bit floating point representation, but then you will need to assume that both machines have the same number and order of bits dedicated to sign, exponent and mantissa. That's likely to be the case, since IEEE-754 format is almost universal these days, but C++ does not insist on it and it is at least conceivable that there is a machine which implements 1/8/23-bit floating point numbers with the sign bit at the low-order end instead of the high-order end.

In short, endianness is only one of the possible incompatibilities between binary floating point formats. Reducing every floating point number to two integers, however, avoids having to deal with other incompatibilities (other than radix).


