Matlab VS C++ Double Precision

You got confused by the different ways C++ and MATLAB print double values: MATLAB's format long only prints 15 significant digits, while C++ prints 17 significant digits. Internally both use the same numbers, namely IEEE 754 64-bit floating-point numbers. To reproduce the C++ behaviour in MATLAB, I defined an anonymous function disp17 which prints numbers with 17 significant digits:

>> disp17=@(x)(disp(num2str(x,17)))

disp17 =

@(x)(disp(num2str(x,17)))

>> 1.0 / 1.45

ans =

0.689655172413793

>> disp17(1.0 / 1.45)
0.68965517241379315

You see that the result in MATLAB and C++ is the same; they just print a different number of digits. If you now continue in both languages with the same constant, you get the same result.

>> D = 0.68965517241379315 %17 digits, enough to represent a double.

D =

0.689655172413793

>> ans = 2600 / D %Result looks wrong

ans =

3.770000000000000e+03

>> disp17(2600 / D) %But displaying 17 digits it is the same.
3769.9999999999995

The background for printing 17 or 15 digits:

  • A double requires 17 significant decimal digits to be written out without precision loss, which is what C++ prints (see the sketch below).
  • Any decimal number with up to 15 significant digits can be converted from string to double and back to string and still yield the original number, which is what MATLAB prints.
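
To see the same numbers from the C++ side, here is a minimal C++ sketch (my own illustration, not part of the original answer) that prints 1.0 / 1.45 with 15 and with 17 significant digits:

#include <cstdio>

int main()
{
    double d = 1.0 / 1.45;
    // 15 significant digits: what MATLAB's format long shows
    std::printf("%.15g\n", d);          // should print 0.689655172413793
    // 17 significant digits: enough to round-trip a double
    std::printf("%.17g\n", d);          // should print 0.68965517241379315
    std::printf("%.17g\n", 2600.0 / d); // should print 3769.9999999999995
    return 0;
}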

precision differences in matlab and c++

A similar discussion has occurred before; the conclusion was that IEEE 754 tolerates an error in the last bit for transcendental functions (cos, sin, exp, etc.). So you can't expect exactly the same results between MATLAB and C (not even from the same C code compiled with different compilers).

Double precision - MS visual C++ 2005 Vs Matlab

It would be more accurate if this question (and any answers) were framed in terms of significant digits rather than decimal places. The IEEE 754 standard does, as Itamar Katz has hinted, store double-precision floating-point numbers with 52 bits for the significand (what some call the mantissa). The standard also implies an extra leading bit, so doubles have 53 significant bits. When the binary number is translated into a decimal representation, this corresponds to 15 or 16 significant decimal digits.
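
If you want to confirm these figures on your own machine, here is a small C++ sketch of my own (not from the original answer) that prints them straight from std::numeric_limits:

#include <iostream>
#include <limits>

int main()
{
    // 53: significant bits of a double (52 stored + 1 implicit)
    std::cout << std::numeric_limits<double>::digits << '\n';
    // 15: decimal digits guaranteed to survive a text round trip
    std::cout << std::numeric_limits<double>::digits10 << '\n';
    // 17: decimal digits needed to write out a double without loss
    std::cout << std::numeric_limits<double>::max_digits10 << '\n';
    return 0;
}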

Neither Matlab nor Visual C++ can (without additional facilities such as arbitrary-precision libraries or 128-bit f-p numbers) store doubles with more than the standard size of significand. If your program, in either language, presents a number to you with more than 15 (or 16) decimal digits, you cannot trust any of the excess digits. They haven't come from the stored representation of the number; they have been added somewhere along the line between memory and screen -- perhaps a 'helpful' numeric formatter has simply extended the rightmost digit until you see the 19 digits you've requested (or whatever).

It's not entirely clear from the question how you transfer numbers from C++ to/from Matlab, or even that you do transfer numbers; maybe you are just trying to write a C++ program which reproduces the results of your Matlab program. (We do that a lot here so I have some experience in this field.)

If you use 'text' files then you are transferring not numbers but representations of numbers. If your program reads the text '15.833' into a double variable, it is not safe to make any assumptions about the values taken by the extra digits in the significand. In particular you should not assume that they will be set to 0 -- well, someone more knowledgeable about C++ might be able to tell us whether the language standard guarantees this, but Matlab doesn't and I don't think C++ does either. If you want to set the extra digits, specify them in your textual representation. Even this will not guarantee that you store the value exactly as specified in your text file; your variable will (probably) hold the nearest f-p number to the value in the text.

However, if your text file is written by Matlab (or C++) with 17 significant digits in the textual representation of each number (15 or 16 are not always enough to pin down a double), then it is a complete textual representation of the f-p number and, on reading by another program, should get translated back into the same f-p number. Note, though, that I write 'should', that the number has been translated at least twice, and that strange things happen when you take your eye off the numbers on the computer.

A better choice for bit-accurate transfer of data between C++ and Matlab is to use a binary file format, one which stores all 64 bits of a double as 64 bits. The Matlab MAT file format certainly stores IEEE754 numbers in the form specified by the standard.
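
As a concrete illustration of what binary transfer means (my own sketch, not part of the original answer; the file name value.bin is arbitrary), the following C++ fragment writes a double as its raw 8 bytes and reads it back bit-for-bit, with no decimal conversion anywhere:

#include <cassert>
#include <fstream>

int main()
{
    double out = 2600.0 / 0.68965517241379315;

    // Write the raw 8 bytes of the double; no decimal conversion is involved.
    std::ofstream("value.bin", std::ios::binary)
        .write(reinterpret_cast<const char*>(&out), sizeof out);

    double in = 0.0;
    std::ifstream("value.bin", std::ios::binary)
        .read(reinterpret_cast<char*>(&in), sizeof in);

    // Bit-for-bit equal, assuming both sides use IEEE 754 and the same byte order.
    assert(in == out);
    return 0;
}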

It's possible that all of the preceding guff is irrelevant to another, underlying, issue. That issue may be that your algorithm is not stable -- which is a whole other topic.

To summarise:

  1. Transfer numbers from program-to-program in their binary representation (whether you use a file or message-passing or other mechanism entirely).
  2. Don't trust more than 15 significant digits in any decimal representation of double precision floating-point numbers.

Furthermore, unless you take special measures in your code, your programs will probably lose accuracy gradually as they progress, making all the low-order digits of dubious reality. For the application that you hint at, it's unlikely that the science behind the code supports the hypothesis that two outputs which differ in the 15th significant digit represent different values. What is the accuracy of the measurements on which your inputs are based?

Matlab single precision vs C floating point?

The number 0.001044397222448 (like the vast majority of decimal fractions) cannot be exactly represented in binary floating point.

As a single-precision float, it's most closely represented as (hex) 0x0.88e428 × 2⁻⁹, which in decimal is 0.001044397242367267608642578125.

In double precision, it's most closely represented as 0x0.88e427d4327300 × 2⁻⁹, which in decimal is 0.001044397222447999984407118745366460643708705902099609375.

Those are what the numbers are, internally, in both C and Matlab.

Everything else you see is an artifact of how the numbers are printed back out, possibly rounded and/or truncated.

When I said that the single-precision representation "in decimal is 0.001044397242367267608642578125", that's mildly misleading, because it makes it look like there are 28 or more digits' worth of precision. Most of those digits, however, are an artifact of the conversion from base 2 back to base 10. As other answers have noted, single-precision floating point actually gives you only about 7 decimal digits of precision, as you can see if you notice where the single- and double-precision equivalents start to diverge:

0.001044397242367267608642578125
0.001044397222447999984407118745366460643708705902099609375
            ^
            difference

Similarly, double precision gives you roughly 16 decimal digits worth of precision, as you can see if you compare the results of converting a few previous and next mantissa values:

0x0.88e427d43272f8  0.00104439722244799976756668424826557384221814572811126708984375
0x0.88e427d4327300  0.001044397222447999984407118745366460643708705902099609375
0x0.88e427d4327308  0.00104439722244800020124755324246734744519926607608795166015625
0x0.88e427d4327310  0.0010443972224480004180879877395682342466898262500762939453125
                                    ^
                                    changes

This also demonstrates why you can never exactly represent your original value 0.001044397222448 in binary. If you're using double, you can have 0.00104439722244799998, or you can have 0.0010443972224480002, but you can't have anything in between. (You'd get a little less close with float, and you could get considerably closer with long double, but you'll never get your exact value.)
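
You can see those two neighbouring doubles directly with std::nextafter; this is a small sketch of my own (the values in the comments are what I would expect, shown as a guide rather than guaranteed output):

#include <cmath>
#include <cstdio>

int main()
{
    double d = 0.001044397222448;  // actually stores the nearest double

    // The stored value and its immediate neighbours; nothing representable
    // exists between consecutive lines.
    std::printf("%.20f\n", std::nextafter(d, 0.0));  // next double below, ~0.00104439722244799977
    std::printf("%.20f\n", d);                       //                    ~0.00104439722244799998
    std::printf("%.20f\n", std::nextafter(d, 1.0));  // next double above, ~0.00104439722244800020
    return 0;
}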

In C, and whether you're using float or double, you can ask for as little or as much precision as you want when printing things with %f, and under a high-quality implementation you'll always get properly-rounded results. (Of course the results you get will always be the result of rounding the actual, internal value, not necessarily the decimal value you started with.) For example, if I run this code:

printf("%.5f\n", 0.001044397222448);
printf("%.10f\n", 0.001044397222448);
printf("%.15f\n", 0.001044397222448);
printf("%.20f\n", 0.001044397222448);
printf("%.30f\n", 0.001044397222448);
printf("%.40f\n", 0.001044397222448);
printf("%.50f\n", 0.001044397222448);
printf("%.60f\n", 0.001044397222448);
printf("%.70f\n", 0.001044397222448);

I see these results, which as you can see match the analysis above.
(Note that this particular example is using double, not float.)

0.00104
0.0010443972
0.001044397222448
0.00104439722244799998
0.001044397222447999984407118745
0.0010443972224479999844071187453664606437
0.00104439722244799998440711874536646064370870590210
0.001044397222447999984407118745366460643708705902099609375000
0.0010443972224479999844071187453664606437087059020996093750000000000000

I'm not sure how Matlab prints things.


In answer to your specific questions:

What is the real value of this variable, that I later use in my Simulink simulation? Is it really truncated/rounded to 0.0010444?

As a float, it is really "truncated" to a number which, converted back to decimal, is exactly 0.001044397242367267608642578125. But as we've seen, most of those digits are essentially meaningless, and the result can more properly be thought of as being about 0.0010443972.

In the C code the value is read as float gf = 0.001044397222448f; and it prints out as 0.001044397242367267608642578125000

So C got the same answer I did -- but, again, most of those digits are not meaningful.

So the C code keeps good precision. But, does Matlab?

I'd be willing to bet that Matlab keeps the same internal precision for ordinary floats and doubles.
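
To reproduce the questioner's float printout and set it next to the double version, here is a minimal C++ sketch of my own (the variable name gf is taken from the question; everything else is illustrative):

#include <cstdio>

int main()
{
    float  gf = 0.001044397222448f;  // nearest single-precision value
    double gd = 0.001044397222448;   // nearest double-precision value

    // The float diverges from the original decimal after about 7 significant
    // digits, the double after about 16.
    std::printf("%.33f\n", gf);  // should print 0.001044397242367267608642578125000
    std::printf("%.33f\n", gd);  // should print 0.001044397222447999984407118745366
    return 0;
}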

c++ and matlab floating point computation

Although it's not clear what your numbers actually are, the relative difference of the first (and largest) numbers is about 1e-8, which is a typical relative tolerance for double-precision algorithms (roughly the square root of machine epsilon).

Floating-point numbers are only an approximation of the real number system, and their finite size (64 bits for double precision) limits their precision. Because of this finite precision, operations on floating-point numbers can incur round-off error and are therefore not strictly associative: in general A+(B+C) != (A+B)+C. The difference between the two is usually small, depending on their relative sizes, but it's not always zero.
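
A tiny C++ sketch of my own (with deliberately extreme, arbitrarily chosen values) makes the non-associativity concrete:

#include <cstdio>

int main()
{
    double a = 1e16, b = -1e16, c = 1.0;

    // Under the default round-to-nearest mode, (a + b) + c gives 1 but
    // a + (b + c) gives 0: the intermediate sum -1e16 + 1.0 rounds back to
    // -1e16, because the spacing between adjacent doubles near 1e16 is 2.0
    // and the 1.0 is lost to rounding.
    std::printf("%.17g\n", (a + b) + c);
    std::printf("%.17g\n", a + (b + c));
    return 0;
}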

What this means is that you should expect small differences in the relative and absolute values when you compare an algorithm coded in Matlab to one in C++. The difference may be in the libraries (i.e., there's no guarantee that Matlab uses the system math library for routines like sqrt), or it may just be that your C++ and Matlab implementations order their operations differently.

The section on floating point comparison tests in Boost::Test discusses this a bit, and has some good references. In particular, you should probably read What Every Computer Scientist Should Know About Floating-Point Arithmetic and consider picking up a copy of Knuth's TAOCP Vol. II.

why is there significant double precision difference between Matlab and Mathematica?

You can fix this problem by using ReadList instead of Import. I have added some demo steps below to explore displayed rounding and equality. Note that the final test, d == e, is False in Mathematica 7 but True in Mathematica 9 (with all the expected digits), so it looks like some precision has been added to Import by version 9. The demo uses the following file.

Contents of demo.dat:

0.22381193949113697971853298440692014992237091064453125
"0.22381193949113697971853298440692014992237091064453125"

Exploring:

a = Import["demo.dat"]
b = ReadList["demo.dat"]
a[[1, 1]] == a[[2, 1]]
b[[1]] == b[[2]]
a[[1, 1]] == b[[1]]
a[[1, 1]] == ToExpression@b[[2]]
b[[1]] // FullForm
c = First@StringSplit[ToString@FullForm@b[[1]], "`"]
b[[2]]
ToExpression /@ {c, b[[2]]}
d = N[FromDigits[RealDigits[a[[1, 1]]]], 100]
e = N[FromDigits[RealDigits[b[[1]]]], 100]
d == e

matlab and c++ precision

This is certainly not a normal difference!

The machine epsilon will be orders of magnitude smaller than the errors you are getting (i.e. 2.22e-016 vs. 2.0e-3).

You can confirm your machine epsilon with the following C++ code:

#include <iostream>
#include <limits>

int main() {
    std::cout << "Machine Epsilon is: " << std::numeric_limits<double>::epsilon() << std::endl;
}

Your Matlab script will be bound to the same limitations; you can confirm this by entering the following into the Matlab command window:

eps

If the computations you are performing in Matlab and C++ are mathematically equivalent then you should obtain the same result - especially with 4 d.p. precision!
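
If it helps to make the comparison quantitative, here is a small C++ sketch of my own (the two result values are made up for illustration) that expresses the discrepancy between two computed values as a multiple of machine epsilon:

#include <cmath>
#include <cstdio>
#include <limits>

int main()
{
    double matlab_result = 1.23456789012345;  // illustrative values only
    double cpp_result    = 1.23456789012347;

    double rel = std::fabs(matlab_result - cpp_result) / std::fabs(matlab_result);
    double eps = std::numeric_limits<double>::epsilon();

    // A discrepancy of a few eps is ordinary rounding noise; something of the
    // order of 2e-3, as in the question, points to an algorithmic difference.
    std::printf("relative difference = %g (about %.0f eps)\n", rel, rel / eps);
    return 0;
}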

In MATLAB, are variables REALLY double-precision by default?

They're doubles. vpa() is simply choosing to display non-significant digits beyond the floating-point relative accuracy, whereas printf() and disp() truncate or zero them out.

You're only getting your original four digits back out because the literal you chose to initialize num with just happens to be the exact decimal expansion of a binary double value, because it was copy and pasted from the output of the expansion of an actual double value from the other question. It won't work for other nearby values, as you show in your "BONUS" addendum.

More precisely, all numeric literals in Matlab produce values of type double. They get converted to the binary double value that is nearest to the decimal value they represent. In effect, digits in a literal beyond the limit of precision of the double type are silently dropped. When you copy and paste the output of vpa to create a new variable, as the other question's poster did with the e = ... statement, you're initializing a value from a literal, instead of dealing directly with the result of a previous expression.

The differences here are just in output formatting. I think what's going on is that vpa() is taking that double-precision binary value and treating it as an exact value. For a given binary mantissa-exponent value, you can calculate the decimal equivalent to arbitrarily many decimal places. If you have a limited precision ("width") in the binary value, as you do with any fixed-size data type, only so many of those decimal digits are significant. printf() and Matlab's default display handle this by truncating the output or displaying non-significant digits as 0. vpa() ignores the limits of precision and keeps calculating as many decimal places as you request.

Those additional digits are bogus, in the sense that if they were replaced by other values to produce a nearby decimal value, they would all get "rounded" to the same binary double value.

Here's a way to show it. These values of x are all the same when stored in doubles, and will all be represented the same by vpa().

x = [
2.7182818284590455348848081484902650117874145507812500
2.7182818284590455348848081484902650117874145507819999
2.7182818284590455348848
2.71828182845904553488485555555555555555555555555555
exp(1)
]
unique(x)

Here's another way of demonstrating it. Here are two doubles that are very close to each other.

x0 = exp(1)
x1 = x0 + eps(x0)

vpa(x0) and vpa(x1) should produce outputs that differ a lot past the 16th digit. However, you shouldn't be able to create a double value x such that vpa(x) produces a decimal representation that falls between vpa(x0) and vpa(x1).

(UPDATE: Amro points out that you can use fprintf('%bx\n', x) to display an exact representation of the underlying binary value in hex format. You can use this to confirm the literals map to the same double.)
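
For readers checking from the C++ side, here is a small sketch of my own (not from the answer) that plays the same role as Matlab's %bx: it prints the raw 64-bit pattern of a double, so two decimal literals that round to the same double print identical hex:

#include <cstdint>
#include <cstdio>
#include <cstring>

// Print the raw IEEE 754 bit pattern of a double in hex,
// roughly what Matlab's fprintf('%bx\n', x) shows.
static void print_bits(double x)
{
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);  // well-defined way to view the bytes
    std::printf("%016llx\n", static_cast<unsigned long long>(bits));
}

int main()
{
    // Two of the literals from the x list above; both should round to the
    // same double and therefore print the same bit pattern.
    print_bits(2.7182818284590455348848081484902650117874145507812500);
    print_bits(2.71828182845904553488485555555555555555555555555555);
    return 0;
}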

I suspect vpa() behaves this way because it treats its inputs as exact values, and polymorphically supports other Matlab types from the Symbolic Toolbox that have more precision than doubles. Those values will need to be initialized by means other than numeric literals, which is why sym() takes a string as an input and vpa(exp(1)) differs from vpa(sym('exp(1)')).

Make sense? Sorry for the long-windedness.

(Note I don't have the Symbolic Toolbox so I can't test vpa() myself.)

Why Matlab and Octave give out slightly different result for same equation? How can I fix that?

Floating-point calculations are inherently imprecise. Changing the order of operations will often cause rounding errors to change, which you will see in the last digit (if you are lucky, if you are unlucky the differences will be much larger!). You cannot expect two different programs, or the same program running on two different computers, to generate the exact same floating-point values.

If the difference between these two numbers is a problem to your computations, you should probably find out why this difference gets amplified, and change the order of your computations so that rounding errors do not cause this much harm.


Two additional suggestions:

  • Don't redefine pi. It is a built-in function; when you assign to it, you shadow it.

  • Use 1e-7, not 10^(-7). It is more readable and easier to type.


