Why Does "Dtoa.C" Contain So Much Code

dtoa.c contains two main functions: dtoa(), which converts a double to string, and strtod(), which converts a string to a double. It also contains a lot of support functions, most of which are for its own implementation of arbitrary-precision arithmetic. dtoa.c's claim to fame is getting these conversions right, and that can only be done, in general, with arbitrary-precision arithmetic. It also has code to round conversions correctly in four different rounding modes.

Your code only tries to implement the equivalent of dtoa(), and since it uses floating-point to do its conversions, will not always get them right. (Update: see my article http://www.exploringbinary.com/quick-and-dirty-floating-point-to-decimal-conversion/ for details.)
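A quick way to see the correctness requirement in practice: a correctly rounded conversion to 17 significant digits must round-trip back to the exact same double. A minimal sketch (my illustration, not code from dtoa.c):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    double x = 0.1;               /* any finite double */
    char buf[64];
    double y;

    /* 17 significant digits are enough to round-trip any IEEE 754 double,
       provided both conversions are correctly rounded. */
    snprintf(buf, sizeof buf, "%.17g", x);
    y = strtod(buf, NULL);

    printf("%s round-trips %s\n", buf,
           memcmp(&x, &y, sizeof x) == 0 ? "exactly" : "INEXACTLY");
    return 0;
}

If either conversion rounds incorrectly, round trips like this one can fail, which is exactly the kind of bug those articles dissect.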

(I've written a lot about this on my blog, http://www.exploringbinary.com/ . Six of my last seven articles have been about strtod() conversions alone. Read through them to see how complicated it is to do correctly rounded conversions.)

What is wrong with the following code? It converts double to string without using sprintf or ostream

The problem is in your implementation of itoa. What happens if the input to itoa is 0?

Your output of -2.9879947598364142621957397469375 for an input of -2.987 should be -2.9870000000000000994759830064140260219573974609375. Notice that the zeros in my result are missing from yours. Those missing zeros are because of that bug in itoa.

Once you get to the point of dealing with single decimal digits, your itoa is complete overkill. It would be better to use an array that maps the integers 0 to 9 to the characters '0' to '9'. (Or you could just use the fact that '0' to '9' are almost certainly contiguous characters on your computer. That hasn't always been the case, but I can pretty much guarantee that you aren't working with such a beast.)
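For illustration only, a minimal sketch of the two approaches just described (the array and variable names are mine):

#include <stdio.h>

int main(void)
{
    static const char digitChars[10] =
        { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' };

    for (int d = 0; d <= 9; d++) {
        char viaTable  = digitChars[d];     /* explicit lookup table               */
        char viaOffset = (char)('0' + d);   /* relies on '0'..'9' being contiguous */
        printf("%d -> %c %c\n", d, viaTable, viaOffset);
    }
    return 0;
}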

Even better would be to recognize that the substring starting with 99475983… is completely extraneous. It would be better to print this as -2.9870000000000001, and even better to print it as -2.987.

How does printf extract digits from a floating point number?

The classic implementation is David Gay's dtoa. The exact details are somewhat arcane (see Why does "dtoa.c" contain so much code?), but in general it works by doing the base conversion with more precision than you can get from a 32-bit, 64-bit, or even 80-bit floating point number. To do this, it uses so-called "bigints" or arbitrary-precision numbers, which can hold as many digits as you can fit in memory. Gay's code has been copied, with modifications, into countless other libraries, including common implementations of the C standard library (so it might power your printf), Java, Python, PHP, JavaScript, etc.

(As a side note... not all of these copies of Gay's dtoa code were kept up to date; because PHP used an old version of strtod, it hung when parsing 2.2250738585072011e-308.)

In general, if you do things the "obvious" and simple way, like multiplying by a power of 10 and then converting the integer, you will lose a small amount of precision and some of the results will be inaccurate... but maybe you will get the first 14 or 15 digits correct. Gay's implementation of dtoa() claims to get all the digits correct... but as a result, the code is quite difficult to follow. Skip to the bottom to see strtod itself: it starts with a "fast path" that just uses ordinary floating-point arithmetic, then detects whether that result might be incorrect and, if so, falls back to a more reliable algorithm using bigints that works in all cases (but is slower).
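The flavor of that fast path can be sketched without any of Gay's machinery. The following is Clinger's classic observation, not the actual dtoa.c code, and the names fastPathStrtod and pow10Exact are mine: if the decimal significand fits exactly in a double and the power of ten is itself exactly representable, one correctly rounded multiply or divide gives the correctly rounded result, so no bigints are needed.

#include <stdio.h>
#include <stdbool.h>

/* 10^0 .. 10^22 are all exactly representable as doubles. */
static const double pow10Exact[] = {
    1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,  1e8,  1e9,  1e10, 1e11,
    1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22
};

static bool fastPathStrtod(long long sig, int exp10, double *out)
{
    if (sig < 0 || sig > 9007199254740992LL)   /* significand must fit in 2^53 */
        return false;
    if (exp10 < -22 || exp10 > 22)             /* 10^|e| must be exact         */
        return false;
    *out = exp10 < 0 ? (double)sig / pow10Exact[-exp10]
                     : (double)sig * pow10Exact[exp10];
    return true;                               /* one rounding: correct result */
}

int main(void)
{
    double d;
    if (fastPathStrtod(2987, -3, &d))          /* "2.987" = 2987 * 10^-3       */
        printf("%.17g\n", d);
    return 0;
}

Everything outside those narrow conditions is where the slow, bigint-based path has to take over.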

The implementation has the following citation, which you may find interesting:


* Inspired by "How to Print Floating-Point Numbers Accurately" by
* Guy L. Steele, Jr. and Jon L. White [Proc. ACM SIGPLAN '90, pp. 112-126].

The algorithm works by calculating a range of decimal numbers which produce the given binary number, and by using more digits, the range gets smaller and smaller until you either have an exact result or you can correctly round to the requested number of digits.

In particular, from sec 2.2 Algorithm,

The algorithm uses exact rational arithmetic to perform its computations so that there is no loss of accuracy. In order to generate digits, the algorithm scales the number so that it is of the form 0.d1d2..., where d1, d2, ..., are base-B digits. The first digit is computed by multiplying the scaled number by the output base, B, and taking the integer part. The remainder is used to compute the rest of the digits using the same approach.

The algorithm can then continue until it has the exact result (which is always possible, since floating-point numbers are base 2, and 2 is a factor of 10) or until it has as many digits as requested. The paper goes on to prove the algorithm's correctness.
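Here is a minimal sketch of that digit-generation loop in C. It keeps the value as an exact fraction num / 2^shift in 64-bit integers rather than the paper's unbounded rational arithmetic, so it only handles doubles in (0, 1) whose denominator fits in 64 bits; a real dtoa uses bigints to cover the whole exponent range. The function name printExactDigits is mine.

#include <stdio.h>
#include <stdint.h>
#include <math.h>

static void printExactDigits(double x)
{
    int exp;
    double f;
    uint64_t num, den;

    if (!(x > 0.0 && x < 1.0)) {
        printf("this sketch only handles 0 < x < 1\n");
        return;
    }

    f = frexp(x, &exp);                 /* x = f * 2^exp, 0.5 <= f < 1   */
    num = (uint64_t)ldexp(f, 53);       /* 53-bit integer significand    */
    if (53 - exp > 60) {                /* denominator must fit 64 bits  */
        printf("denominator too large for this sketch\n");
        return;
    }
    den = (uint64_t)1 << (53 - exp);    /* x == num / den, exactly       */

    printf("%.17g = 0.", x);
    while (num != 0) {                  /* terminates, since 2 divides 10 */
        num *= 10;                      /* scale by the output base B=10  */
        printf("%d", (int)(num / den)); /* integer part -> next digit     */
        num %= den;                     /* remainder drives the rest      */
    }
    printf("\n");
}

int main(void)
{
    printExactDigits(0.625);  /* exact in binary: 0.101     */
    printExactDigits(0.1);    /* not exact: 55 digits emerge */
    return 0;
}

Run on 0.1, this prints the full 55-digit expansion of the nearest double, the same kind of output the exact-printing program further down produces.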


Also note that not all implementations of printf are based on Gay's dtoa; it's just a particularly common implementation that has been copied a lot.

Get the digits of a long double

Rounding error amplified by mod

The product ftob_tmp * powl(...) likely needs to round to the nearest long double and so is not the exact mathematical result. This rounded product is then reduced with fmodl() and sometimes yields 0 or 9, because it lands on or just under an integer value for the later digits of 0.255999999999999999999.

//                  v- rounding introduced error -v
str[i] = (int)fmodl((ftob_tmp * powl(base, index++)), base);
// ^-- error magnified -----------------^

With more debug output, one can see sometimes 0, sometimes 9 where only 9 was expected.

printf("bbb %0.20Lf * %Lf = %0.20Lf  %d\n", 
ftob_tmp, powl(base, index), ftob_tmp * powl(base, index),
(int) fmodl((ftob_tmp * powl(base, index++)), base));

bbb 0.25599999999999999999 * 100.000000 = 25.59999999999999999861 2
bbb 0.25599999999999999999 * 10000.000000 = 2560.00000000000000000000 5
bbb 0.25599999999999999999 * 1000000.000000 = 255999.99999999999998578915 9
bbb 0.25599999999999999999 * 100000000.000000 = 25599999.99999999999818101060 0
bbb 0.25599999999999999999 * 10000000000.000000 = 2560000000.00000000000000000000 9
bbb 0.25599999999999999999 * 1000000000000.000000 = 255999999999.99999998509883880615 9
bbb 0.25599999999999999999 * 100000000000000.000000 = 25599999999999.99999809265136718750 0
bbb 0.25599999999999999999 * 10000000000000000.000000 = 2560000000000000.00000000000000000000 9
...


how I can achieve this, or if it is achievable at all (?)

Yes, it is achievable, but not with OP's approach, as too much error is injected in various steps. These corner cases are quite difficult and usually oblige a wide or extended integer computation instead of floating point.

Example code to print a double in base 10 exactly may help.


Other lesser issues

More rounding error

The loops with ftob_tmp *= base and ftob_tmp /= base each inject up to 0.5 ULP of error per iteration. These loops can then lead to an off-by-one j calculation.
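One way to watch for that drift (my own snippet, not OP's code; whether a given value drifts depends on the platform's long double format):

#include <stdio.h>

int main(void)
{
    long double x = 0.256L;
    long double y = x;

    for (int i = 0; i < 30; i++) y *= 10.0L;   /* each step may round        */
    for (int i = 0; i < 30; i++) y /= 10.0L;   /* each step may round again  */

    printf("x = %.21Lg\ny = %.21Lg\n%s\n", x, y,
           x == y ? "no visible drift this time" : "values drifted apart");
    return 0;
}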

-0.0

Test the sign, not the value, else -0.0 will print as 0.0.

// sign = (ld < 0.0L) ? -1 : 1;
sign = signbit(ld) ? -1 : 1;

String size

char ftl[300]; is insufficient for LDBL_MAX in base 2. Look to LDBL_MAX_EXP and LDBL_MIN_EXP to help determine the minimum required string size.
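A sketch of a worst-case bound, with a constant name of my choosing: the integer part of LDBL_MAX needs about LDBL_MAX_EXP base-2 digits, the fraction of the smallest subnormal needs about LDBL_MANT_DIG - LDBL_MIN_EXP more, plus sign, radix point and terminator.

#include <float.h>
#include <stdio.h>

/* Rough worst-case size for a base-2 textual form of a long double:
     integer part of LDBL_MAX           : LDBL_MAX_EXP binary digits
     fraction of the smallest subnormal : LDBL_MANT_DIG - LDBL_MIN_EXP binary digits
   plus sign, radix point and the terminating '\0'. */
enum { FTL_BASE2_SIZE = LDBL_MAX_EXP + (LDBL_MANT_DIG - LDBL_MIN_EXP) + 3 };

int main(void)
{
    char ftl[FTL_BASE2_SIZE];
    /* 32832 with x86 extended precision; other long double formats differ */
    printf("base-2 buffer size: %zu\n", sizeof ftl);
    return 0;
}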

What causes Python's float_repr_style to use legacy?

Short answer: it's likely not a limitation of the platform, but a limitation of Python's build machinery: it doesn't have a universal way to set 53-bit precision for floating-point computations.

For more detail, take a look at the Include/pyport.h file in the Python source distribution. Here's an excerpt:

/* If we can't guarantee 53-bit precision, don't use the code
   in Python/dtoa.c, but fall back to standard code. This
   means that repr of a float will be long (17 sig digits).

   Realistically, there are two things that could go wrong:

   (1) doubles aren't IEEE 754 doubles, or
   (2) we're on x86 with the rounding precision set to 64-bits
       (extended precision), and we don't know how to change
       the rounding precision.
 */

#if !defined(DOUBLE_IS_LITTLE_ENDIAN_IEEE754) && \
    !defined(DOUBLE_IS_BIG_ENDIAN_IEEE754) && \
    !defined(DOUBLE_IS_ARM_MIXED_ENDIAN_IEEE754)
#define PY_NO_SHORT_FLOAT_REPR
#endif

/* double rounding is symptomatic of use of extended precision on x86. If
   we're seeing double rounding, and we don't have any mechanism available for
   changing the FPU rounding precision, then don't use Python/dtoa.c. */
#if defined(X87_DOUBLE_ROUNDING) && !defined(HAVE_PY_SET_53BIT_PRECISION)
#define PY_NO_SHORT_FLOAT_REPR
#endif

Essentially, there are two things that can go wrong. One is that the Python configuration fails to identify the floating-point format of a C double. That format is almost always IEEE 754 binary64, but sometimes the config script fails to figure that out. That's the first #if preprocessor check in the snippet above. Look at the pyconfig.h file generated at compile time, and see if at least one of the DOUBLE_IS_... macros is #defined. Alternatively, try this at a Python prompt:

>>> float.__getformat__('double')
'IEEE, little-endian'

If you see something like the above, this part should be okay. If you see something like 'unknown', then Python hasn't managed to identify the floating-point format.

The second thing that can go wrong is that we do have IEEE 754 binary64 format doubles, but Python's build machinery can't figure out how to ensure 53-bit precision for floating-point computations for this platform. The dtoa.c source requires that we're able to do all floating-point operations (whether implemented in hardware or software) at a precision of 53 bits. That's particularly a problem on Intel processors that are using the x87 floating-point unit for double-precision computations (as opposed to the newer SSE2 instructions): the default precision of the x87 is 64-bits, and using it for double-precision computations with that default precision setting leads to double rounding, which breaks the dtoa.c assumptions. So at config time, the build machinery runs a check to see (1) whether double rounding is a potential problem, and (2) if so, whether there's a way to put the FPU into 53-bit precision. So now you want to look at pyconfig.h for the X87_DOUBLE_ROUNDING and HAVE_PY_SET_53BIT_PRECISION macros.

So it could be either of the above. If I had to guess, I'd guess that on that platform, double rounding is being detected as a problem, and it's not known how to fix it. The solution in that case is to adapt pyport.h to define the _Py_SET_53BIT_PRECISION_* macros in whatever platform-specific way works to get that 53-bit precision mode, and then to define HAVE_PY_SET_53BIT_PRECISION.
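For the x87 case, the platform-specific part usually boils down to rewriting the FPU control word. Here is a sketch of what those macros might expand to, assuming glibc's <fpu_control.h> on x86 Linux; this is the general pattern, not CPython's exact code.

#include <fpu_control.h>

/* Switch the x87 FPU to 53-bit ("double") rounding precision and return the
   previous control word so the caller can restore it afterwards. */
static fpu_control_t set_53bit_precision(void)
{
    fpu_control_t old_cw, new_cw;
    _FPU_GETCW(old_cw);                               /* read control word  */
    new_cw = (old_cw & ~_FPU_EXTENDED) | _FPU_DOUBLE; /* 53-bit significand */
    _FPU_SETCW(new_cw);
    return old_cw;
}

static void restore_precision(fpu_control_t old_cw)
{
    _FPU_SETCW(old_cw);
}

int main(void)
{
    fpu_control_t saved = set_53bit_precision();
    /* ... computations that rely on 53-bit precision, e.g. dtoa.c ... */
    restore_precision(saved);
    return 0;
}

With something along those lines wired into the _Py_SET_53BIT_PRECISION_* macros, the double-rounding check above no longer forces the legacy repr.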

Text to floating point: how does it work

I once described this, in a simple, easily understood way, for the value 5.2; see this SO article.

C dynamically printf double, no loss of precision and no trailing zeroes

There's probably no easier way. It's a quite involved problem.

Your code isn't solving it right for several reasons:

  • Most practical implementations of floating-point arithmetic aren't decimal, they are binary. So, when you multiply a floating-point number by 10 or divide it by 10, you may lose precision (this depends on the number).
  • Even though the standard 64-bit IEEE-754 floating-point format reserves 53 bits for the mantissa, which is equivalent to floor(log10(2 ^ 53)) = 15 decimal digits, a valid number in this format may need up to 1074 decimal digits in the fractional part (for the smallest subnormal, 2^-1074) when printed exactly, which is what you appear to be asking about.

One way of solving this is to use the %a conversion specifier in snprintf(), which prints the floating-point value using hexadecimal digits for the mantissa; the C standard from 1999 guarantees that this prints all significant digits if the floating-point format is radix-2 (AKA base-2 or simply binary). So, with this you can obtain all the binary digits of the mantissa of the number, and from there you can figure out how many decimal digits are in the fractional part.
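For example, a tiny sketch of the %a idea, separate from the full program below (the exact spelling of the output can vary between implementations):

#include <stdio.h>

int main(void)
{
    char buf[64];
    /* %a prints the value exactly, with hexadecimal mantissa digits and a
       base-2 exponent; on IEEE 754 systems 0.1 typically comes out as
       0x1.999999999999ap-4. */
    snprintf(buf, sizeof buf, "%a", 0.1);
    puts(buf);
    return 0;
}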

Now, observe that:

1.00000 = 2^0  = 1.00000 (binary)
0.50000 = 2^-1 = 0.10000
0.25000 = 2^-2 = 0.01000
0.12500 = 2^-3 = 0.00100
0.06250 = 2^-4 = 0.00010
0.03125 = 2^-5 = 0.00001

and so on.

You can clearly see here that a binary digit at i-th position to the right of the point in the binary representation produces the last non-zero decimal digit also in i-th position to the right of the point in the decimal representation.

So, if you know where the least significant non-zero bit is in a binary floating-point number, you can figure out how many decimal digits are needed to print the fractional part of the number exactly.

And this is what my program is doing.

Code:

// file: PrintFullFraction.c
//
// compile with gcc 4.6.2 or better:
// gcc -Wall -Wextra -std=c99 -O2 PrintFullFraction.c -o PrintFullFraction.exe
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
#include <float.h>
#include <assert.h>

#if FLT_RADIX != 2
#error currently supported only FLT_RADIX = 2
#endif

int FractionalDigits(double d)
{
    char buf[
        1 +                              // sign, '-' or '+'
        (sizeof(d) * CHAR_BIT + 3) / 4 + // mantissa hex digits max
        1 +                              // decimal point, '.'
        1 +                              // mantissa-exponent separator, 'p'
        1 +                              // mantissa sign, '-' or '+'
        (sizeof(d) * CHAR_BIT + 2) / 3 + // exponent decimal digits max
        1                                // string terminator, '\0'
    ];
    int n;
    char *pp, *p;
    int e, lsbFound, lsbPos;

    // convert d into "+/- 0x h.hhhh p +/- ddd" representation and check for errors
    if ((n = snprintf(buf, sizeof(buf), "%+a", d)) < 0 ||
        (unsigned)n >= sizeof(buf))
        return -1;

    //printf("{%s}", buf);

    // make sure the conversion didn't produce something like "nan" or "inf"
    // instead of "+/- 0x h.hhhh p +/- ddd"
    if (strstr(buf, "0x") != buf + 1 ||
        (pp = strchr(buf, 'p')) == NULL)
        return 0;

    // extract the base-2 exponent manually, checking for overflows
    e = 0;
    p = pp + 1 + (pp[1] == '-' || pp[1] == '+'); // skip the exponent sign at first
    for (; *p != '\0'; p++)
    {
        if (e > INT_MAX / 10)
            return -2;
        e *= 10;
        if (e > INT_MAX - (*p - '0'))
            return -2;
        e += *p - '0';
    }
    if (pp[1] == '-') // apply the sign to the exponent
        e = -e;

    //printf("[%s|%d]", buf, e);

    // find the position of the least significant non-zero bit
    lsbFound = lsbPos = 0;
    for (p = pp - 1; *p != 'x'; p--)
    {
        if (*p == '.')
            continue;
        if (!lsbFound)
        {
            int hdigit = (*p >= 'a') ? (*p - 'a' + 10) : (*p - '0'); // assuming ASCII chars
            if (hdigit)
            {
                static const int lsbPosInNibble[16] = { 0,4,3,4, 2,4,3,4, 1,4,3,4, 2,4,3,4 };
                lsbFound = 1;
                lsbPos = -lsbPosInNibble[hdigit];
            }
        }
        else
        {
            lsbPos -= 4;
        }
    }
    lsbPos += 4;

    if (!lsbFound)
        return 0; // d is 0 (integer)

    // adjust the least significant non-zero bit position
    // by the base-2 exponent (just add them), checking
    // for overflows

    if (lsbPos >= 0 && e >= 0)
        return 0; // lsbPos + e >= 0, d is integer

    if (lsbPos < 0 && e < 0)
        if (lsbPos < INT_MIN - e)
            return -2; // d isn't integer and needs too many fractional digits

    if ((lsbPos += e) >= 0)
        return 0; // d is integer

    if (lsbPos == INT_MIN && -INT_MAX != INT_MIN)
        return -2; // d isn't integer and needs too many fractional digits

    return -lsbPos;
}

const double testData[] =
{
    0,
    1,          // 2 ^ 0
    0.5,        // 2 ^ -1
    0.25,       // 2 ^ -2
    0.125,
    0.0625,     // ...
    0.03125,
    0.015625,
    0.0078125,  // 2 ^ -7
    1.0/256,    // 2 ^ -8
    1.0/256/256, // 2 ^ -16
    1.0/256/256/256, // 2 ^ -24
    1.0/256/256/256/256, // 2 ^ -32
    1.0/256/256/256/256/256/256/256/256, // 2 ^ -64
    3.14159265358979323846264338327950288419716939937510582097494459,
    0.1,
    INFINITY,
#ifdef NAN
    NAN,
#endif
    DBL_MIN
};

int main(void)
{
    unsigned i;
    for (i = 0; i < sizeof(testData) / sizeof(testData[0]); i++)
    {
        int digits = FractionalDigits(testData[i]);
        assert(digits >= 0);
        printf("%f %e %.*f\n", testData[i], testData[i], digits, testData[i]);
    }
    return 0;
}

Output (ideone):

0.000000 0.000000e+00 0
1.000000 1.000000e+00 1
0.500000 5.000000e-01 0.5
0.250000 2.500000e-01 0.25
0.125000 1.250000e-01 0.125
0.062500 6.250000e-02 0.0625
0.031250 3.125000e-02 0.03125
0.015625 1.562500e-02 0.015625
0.007812 7.812500e-03 0.0078125
0.003906 3.906250e-03 0.00390625
0.000015 1.525879e-05 0.0000152587890625
0.000000 5.960464e-08 0.000000059604644775390625
0.000000 2.328306e-10 0.00000000023283064365386962890625
0.000000 5.421011e-20 0.0000000000000000000542101086242752217003726400434970855712890625
3.141593 3.141593e+00 3.141592653589793115997963468544185161590576171875
0.100000 1.000000e-01 0.1000000000000000055511151231257827021181583404541015625
inf inf inf
nan nan nan
0.000000 2.225074e-308 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002225073858507201383090232717332404064219215980462331830553327416887204434813918195854283159012511020564067339731035811005152434161553460108856012385377718821130777993532002330479610147442583636071921565046942503734208375250806650616658158948720491179968591639648500635908770118304874799780887753749949451580451605050915399856582470818645113537935804992115981085766051992433352114352390148795699609591288891602992641511063466313393663477586513029371762047325631781485664350872122828637642044846811407613911477062801689853244110024161447421618567166150540154285084716752901903161322778896729707373123334086988983175067838846926092773977972858659654941091369095406136467568702398678315290680984617210924625396728515625

You can see that π and 0.1 are only accurate to about 15 decimal digits; the rest of the digits show what the numbers actually got rounded to, since these numbers cannot be represented exactly in a binary floating-point format.

You can also see that DBL_MIN, the smallest positive normalized double value, has 1022 digits in the fractional part and of those there are 715 significant digits.

Possible issues with this solution:

  • Your compiler's printf() functions do not support %a or do not correctly print all digits requested by the precision (this is quite possible).
  • Your computer uses non-binary floating-point formats (this is extremely rare).

