Portability of Binary Serialization of Double/Float Type in C++

C - Serialization of the floating point numbers (floats, doubles)

Assuming you're using mainstream compilers on mainstream hardware, floating point values in C and C++ follow the IEEE 754 standard, and when written in binary form to a file they can be recovered on any other such platform, provided that you write and read using the same byte endianness. So my suggestion is: pick an endianness, and before writing or after reading, check whether it matches that of the current platform; if not, just swap the bytes.
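A minimal sketch of the check-and-swap idea, assuming a 64-bit IEEE 754 double whose byte order matches the platform's integer byte order. The helper names (host_is_little_endian, store_double_be, load_double_be) are made up for illustration, and big-endian is picked arbitrarily as the file format:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}

// Writes `value` into out[0..7] in big-endian order (the format chosen here).
inline void store_double_be(double value, unsigned char* out) {
    static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");
    assert(out != nullptr);
    std::memcpy(out, &value, sizeof value);
    if (host_is_little_endian()) {
        std::reverse(out, out + sizeof value);  // swap to the file's byte order
    }
}

// Reads a double that was stored with store_double_be.
inline double load_double_be(const unsigned char* in) {
    assert(in != nullptr);
    unsigned char tmp[sizeof(double)];
    std::memcpy(tmp, in, sizeof tmp);
    if (host_is_little_endian()) {
        std::reverse(tmp, tmp + sizeof tmp);    // swap back to the host's byte order
    }
    double value;
    std::memcpy(&value, tmp, sizeof value);
    return value;
}
```

The eight bytes produced by store_double_be can be written to the file as-is, and load_double_be recovers the value on a host of either byte order.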

Serialize double and float with C

Following your update, you mention the data is to be transmitted using UDP and ask for best practices. I would highly recommend sending the data as text, perhaps even with some markup added (XML). Debugging endian-related errors across a transmission line is a waste of everybody's time.

Just my 2 cents on the "best practices" part of your question

Data serialization in C?

If you want to be as portable as possible with floats you can use frexp and ldexp:

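#include <limits.h>  /* INT_MAX */
#include <math.h>    /* frexp, ldexp */

/* WriteInt and ReadInt are assumed to be existing helpers that
   transfer an int in a fixed, platform-independent format. */
void WriteFloat (float number)
{
  int exponent;
  int mantissa;

  /* frexp returns a signed fraction with magnitude in [0.5, 1), or 0,
     so the scaled value always fits in an int */
  mantissa = (int) (INT_MAX * frexp(number, &exponent));

  WriteInt (exponent);
  WriteInt (mantissa);
}

float ReadFloat (void)
{
  int exponent = ReadInt();
  int mantissa = ReadInt();

  float value = (float)mantissa / INT_MAX;

  return ldexp (value, exponent);
}

The idea behind this is that ldexp, frexp and INT_MAX are all standard C. The mantissa is kept in a signed int so that negative numbers survive the trip. An int is usually at least as wide as the mantissa of a float (no guarantee, but it is a valid assumption and I don't know a single architecture that is different here). Infinities and NaNs are not handled and would need to be transmitted separately.

Therefore the conversion works essentially without precision loss. The multiplication and division by INT_MAX may lose a bit of precision during conversion, but that's a compromise one can live with.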
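For illustration, here is a self-contained C++ variant of the frexp/ldexp approach. It is only a sketch under stated assumptions: the PortableFloat record and the function names are hypothetical, the value must be finite, and the fraction is scaled by 2^24 instead of INT_MAX (a deliberate change, so that the scaled fraction of a 24-bit float mantissa is exactly an integer):

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical wire record: two integers that can be sent in any agreed format.
struct PortableFloat {
    std::int32_t exponent;
    std::int32_t mantissa;
};

inline PortableFloat encode_float(float number) {
    static_assert(std::numeric_limits<float>::radix == 2 &&
                  std::numeric_limits<float>::digits <= 24,
                  "sketch assumes a binary float with at most 24 mantissa bits");
    assert(std::isfinite(number));  // NaN and infinity are not handled here
    int exponent = 0;
    // |fraction| is in [0.5, 1), or fraction is 0
    const float fraction = std::frexp(number, &exponent);
    // Scaling by a power of two is exact, so the result is a whole number
    const float scaled = std::ldexp(fraction, 24);
    PortableFloat p;
    p.exponent = static_cast<std::int32_t>(exponent);
    p.mantissa = static_cast<std::int32_t>(scaled);
    return p;
}

inline float decode_float(const PortableFloat& p) {
    // |mantissa| <= 2^24, so the conversion back to float is exact
    const float fraction = std::ldexp(static_cast<float>(p.mantissa), -24);
    return std::ldexp(fraction, p.exponent);
}
```

Because only powers of two are involved, a finite float should come back bit-for-bit identical, and the sign travels with the mantissa.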

Portable way to serialize float as 32-bit integer

You seem to have a bug in serialize_float: the last 4 lines should read:

buffer[(*index)++] = (res >> 24) & 0xFF;
buffer[(*index)++] = (res >> 16) & 0xFF;
buffer[(*index)++] = (res >> 8) & 0xFF;
buffer[(*index)++] = res & 0xFF;

Your method might not work correctly for infinities and/or NaNs because of the offset by 126 instead of 128. Note that you can validate it by extensive testing: there are only 4 billion values, trying all possibilities should not take very long.

The actual representation in memory of float values may differ on different architectures, but IEEE 754 (or more precisely IEC 60559) is largely prevalent today. You can verify whether your particular targets are compliant by checking if __STDC_IEC_559__ is defined. Note however that even if you can assume IEEE 754, you must handle potentially different endianness between the systems. You cannot assume the endianness of floats to be the same as that of integers for the same platform.

Note also that the simple cast would be incorrect: uint32_t res = *(uint32_t *)&number; violates the strict aliasing rule. You should either use a union or use memcpy(&res, &number, sizeof(res));
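As a sketch of how the memcpy advice and the shift-based byte writes fit together: the function names put_float_be and get_float_be are invented for this example, and it assumes a 32-bit IEEE 754 float stored in the same byte order as the platform's integers (true on common hardware, but not guaranteed, as noted above).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 && sizeof(float) == 4,
              "sketch assumes a 32-bit IEEE 754 float");

// Appends `number` to `buffer` as four bytes, most significant byte first.
inline void put_float_be(float number, unsigned char* buffer, std::size_t* index) {
    assert(buffer != nullptr && index != nullptr);
    std::uint32_t res;
    std::memcpy(&res, &number, sizeof res);  // no aliasing violation
    buffer[(*index)++] = (res >> 24) & 0xFF;
    buffer[(*index)++] = (res >> 16) & 0xFF;
    buffer[(*index)++] = (res >> 8) & 0xFF;
    buffer[(*index)++] = res & 0xFF;
}

// Reads back four bytes written by put_float_be and advances `index`.
inline float get_float_be(const unsigned char* buffer, std::size_t* index) {
    assert(buffer != nullptr && index != nullptr);
    std::uint32_t res = 0;
    for (int i = 0; i < 4; ++i) {
        res = (res << 8) | buffer[(*index)++];
    }
    float number;
    std::memcpy(&number, &res, sizeof number);
    return number;
}
```

The shifts operate on the integer's value rather than on its memory layout, so no explicit integer endianness check is needed on either side.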

Issues saving double as binary in c++

The trouble is that a base 10 ASCII representation of a double is generally not exact, and it loses information if you print too few digits (10 is not enough). You need std::numeric_limits<double>::max_digits10 significant digits (17 for an IEEE 754 double) for the value to survive a round trip, and even then the decimal text is only an approximation of the binary value, so you depend on both sides converting with correct rounding.
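A small experiment makes the effect of the digit count visible. This is a sketch only: to_decimal and survives_round_trip are hypothetical helpers, and the result relies on the C library's snprintf and strtod rounding correctly in the default "C" locale.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <limits>
#include <string>

// Formats `value` with `digits` significant decimal digits.
inline std::string to_decimal(double value, int digits) {
    assert(digits > 0 && digits <= 40);
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.*e", digits - 1, value);
    return buf;
}

// True when parsing the decimal text gives back exactly the same double.
inline bool survives_round_trip(double value, int digits) {
    const std::string text = to_decimal(value, digits);
    return std::strtod(text.c_str(), nullptr) == value;
}
```

With 10 digits a value such as 1.0 / 3.0 does not come back unchanged, while with std::numeric_limits<double>::max_digits10 digits it should.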

The other issue you have is that the binary representation of a double is not mandated by the C++ standard, so relying on it is fragile and can lead to code breaking very easily. Simply changing the compiler or compiler settings can result in a different double format, and when changing architectures you have absolutely no guarantees.

You can serialize it to text in a non lossy representation by using the hex format for doubles.

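 // Setting both the fixed and scientific flags selects hexadecimal output
 stream.setf(std::ios_base::fixed | std::ios_base::scientific, std::ios_base::floatfield);
 stream << particles[i].pos[0];

 // Since C++11 the std::hexfloat manipulator does the same thing

 stream << std::hexfloat << particles[i].pos[0];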

This has the same effect as "%a" in printf() in C, which prints the value as "Hexadecimal floating point, lowercase". Here the mantissa is written in hexadecimal and the exponent as a decimal power of two, in a very specific format. Since the underlying representation is binary, these values can be represented exactly in hex and provide a non-lossy way of transferring data between systems. It also drops leading and trailing zeros, so for a lot of numbers it is relatively compact.

The format is also supported on the Python side. You should be able to read the value as a string and then convert it to a float using float.fromhex()

see: https://docs.python.org/3/library/stdtypes.html#float.fromhex
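If the value has to be read back in C++ as well, a round trip could look roughly like this. It is a sketch with made-up helper names (to_hex_text, from_hex_text); parsing uses std::strtod, which accepts hexadecimal floating point input, rather than operator>> because stream extraction of hex floats is not reliable across standard library implementations.

```cpp
#include <cassert>
#include <cstdlib>
#include <ios>
#include <sstream>
#include <string>

// Formats a double as hexadecimal floating point text, like printf("%a").
inline std::string to_hex_text(double value) {
    std::ostringstream stream;
    stream << std::hexfloat << value;
    return stream.str();
}

// Parses text produced by to_hex_text (or by Python's float.hex()).
inline double from_hex_text(const std::string& text) {
    char* end = nullptr;
    const double value = std::strtod(text.c_str(), &end);
    assert(end != text.c_str());  // at least some input must have been consumed
    return value;
}
```

Since every binary mantissa digit maps onto hex digits exactly, from_hex_text(to_hex_text(x)) should return x unchanged for any finite x.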

But your goal is to save space:

But now, to save space, I am trying to save the configuration as a binary file.

I would ask: do you really need to save space? If you are running in a low-powered, low-resource environment, then saving space can certainly matter, but such environments are rare nowadays (though they do exist).

But it seems like you are running some form of particle simulation, which does not scream low-resource use case. Even if you have terabytes of data I would still go with a portable, easy-to-read format over binary, preferably one that is not lossy. Storage space is cheap.

how to convert values of type float and double into binary format and push into vector of type uint8_t?

If you just need to push them into vector and pop them (like a stack) you can do this:

void push( std::vector<uint8_t> &v, float f )
{
   auto offs = v.size();
   v.resize( offs + sizeof( f ) );
   std::memcpy( v.data() + offs, &f, sizeof( f ) );
}

float popFloat( std::vector<uint8_t> &v )
{
    float f = 0;
    if( v.size() >= sizeof( f ) ) {
        auto offs = v.size() - sizeof( f );
        std::memcpy( &f, v.data() + offs, sizeof( f ) );
        v.resize( offs );
    }
    return f;
}

Note that this stores them in a non-portable format, but it should work for writing them to and reading them from a file on the same hardware.

You can rewrite those two functions as templates and they will work with all integral and floating point types (short, int, long, double, etc.).
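A templated version might look like the sketch below. The static_assert on std::is_trivially_copyable is an addition of mine to keep memcpy well-defined, and pop asserts on underflow instead of returning 0 as popFloat does:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Appends the raw bytes of `value` to the end of `v` (host byte order).
template <typename T>
void push(std::vector<std::uint8_t>& v, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be copied with memcpy");
    const auto offs = v.size();
    v.resize(offs + sizeof(T));
    std::memcpy(v.data() + offs, &value, sizeof(T));
}

// Removes the last sizeof(T) bytes from `v` and returns them as a T.
template <typename T>
T pop(std::vector<std::uint8_t>& v) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be copied with memcpy");
    assert(v.size() >= sizeof(T));
    T value;
    const auto offs = v.size() - sizeof(T);
    std::memcpy(&value, v.data() + offs, sizeof(T));
    v.resize(offs);
    return value;
}
```

Values have to be popped in the reverse order and with the same types they were pushed with, for example push(v, 3.5f) followed later by pop<float>(v).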

How to implement serialization and de-serialization of a double?

Answering question #2:

This is probably my "C-way" kind of thinking, but you could copy the double into a uint64_t (mem-copying, not type-casting), serialize the uint64_t instead, and do the opposite on de-serialization.

Here is an example (without even having to copy from double into uint64_t and vice-versa):

uint64_t* pi = (uint64_t*)&small;
stringstream stream;
stream.precision(maxPrecision);
stream << *pi;
cout << "serialized:    " << stream.str() << endl;
uint64_t out = stoull(stream.str());
double* pf = (double*)&out;
cout << "de-serialized: " << *pf << endl;

Please note that to avoid breaking the strict-aliasing rule you do actually need to copy the value first: reading a double through a uint64_t* (or the reverse) is undefined behavior, and the standard does not guarantee that double and uint64_t share the same alignment. With memcpy the example becomes:

uint64_t ismall;
memcpy((void*)&ismall,(void*)&small,sizeof(small));
stringstream stream;
stream.precision(maxPrecision);
stream << ismall;
cout << "serialized:    " << stream.str() << endl;
ismall = stoull(stream.str());
double fsmall;
memcpy((void*)&fsmall,(void*)&ismall,sizeof(small));
cout << "de-serialized: " << fsmall << endl;