Is There a Functional Difference Between "2.00" and "2.00F"

Is there a functional difference between 2.00 and 2.00f?

The f suffix makes it a single-precision (float) literal instead of a double-precision literal. This usually means a 32-bit float instead of a 64-bit double.

Floating-point constants default to type double. By using the suffixes f or l (or F or L — the suffix is not case sensitive), the constant can be specified as float or long double, respectively.

http://msdn.microsoft.com/en-us/library/tfh6f0w2(v=VS.100).aspx

Is there any difference between using floating point casts vs floating point suffixes in C and C++?

The default is double. Assuming IEEE 754 floating point, every float value is exactly representable as a double, so you never lose precision by omitting the f. EDIT: this is only true for values that float can represent exactly. If rounding occurs, the result may differ because the value is rounded twice (first to double, then to float); see Eric Postpischil's answer. So you should also use the f suffix for floats.


This example is also problematic:

long double MY_LONG_DOUBLE = (long double)3.14159265358979323846264338328;

This first gives a double constant which is then converted to long double. But because you started with a double you have already lost precision that will never come back. Therefore, if you want to use full precision in long double constants you must use the L suffix:

long double MY_LONG_DOUBLE = 3.14159265358979323846264338328L; // L suffix

What is the difference between casting to `float` and adding `f` as a suffix when initializing a `float`?

float f = 99.32f ;

That is a float literal, so the float variable is initialized directly with a float value.

float f = (float) 99.32 ;

That is a double literal whose value is cast to float before being assigned to the float variable.

What's the use of suffix `f` on float value

3.00 is interpreted as a double, as opposed to 3.00f which is seen by the compiler as a float.

The f suffix simply tells the compiler which is a float and which is a double.

See MSDN (C++)

When does appending an 'f' change the value of a floating constant when assigned to a `float`?

This is a self answer per Answer Your Own Question.

Appending an f makes the constant a float and sometimes makes a value difference.


Type

Type difference: double to float.

A compiler with suitable warnings enabled may also emit a warning when the f is omitted.

  float f = 3.1415926535897932;  // May generate a warning

warning: conversion from 'double' to 'float' changes value from '3.1415926535897931e+0' to '3.14159274e+0f' [-Wfloat-conversion]


Value

The value can differ because of potential double rounding, that is, rounding twice.

The first rounding is due to code's text being converted to the floating point type.

the result is either the nearest representable value, or the larger or smaller representable value immediately adjacent to the nearest representable value, chosen in an implementation-defined manner. C17dr § 6.4.4.2 3

Given those two choices, a very common implementation-defined manner is to convert the source code text to the closest double (without the f) or to the closest float with the f suffix. Lesser quality implementations sometimes form the 2nd closest choice.

Assignment of a double FP constant to a float incurs another rounding.

If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower representable value, chosen in an implementation-defined manner. C17dr § 6.3.1.5 1

A very common implementation-defined manner is to convert the double to the closest float - with ties to even. (Note: compile time rounding may be affected by various compiler settings.)

Double rounding value change

Consider the case when source code uses a value very close to half-way between 2 float values.

Without an f, the rounding of code to a double may result in a value exactly half-way between 2 floats. The conversion of the double to float then could differ from "with an f".

With an f, the conversion results in the closest float.

Example:

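#include <math.h>
#include <stdio.h>

int main(void) {
  float f;

  // Two adjacent float values: 10 million and the next float above it.
  f = 10000000.0f;
  printf("%.6a %.3f 10 million\n", f, f);
  f = nextafterf(f, f + f);
  printf("%.6a %.3f 10 million - next float\n", f, f);
  puts("");

  // Just above the half-way point between those two floats.
  f = 10000000.5000000001;   // rounded to double, then to float
  printf("%.6a %.3f 10000000.5000000001\n", f, f);
  f = 10000000.5000000001f;  // rounded once, directly to float
  printf("%.6a %.3f 10000000.5000000001f\n", f, f);
  puts("");

  // Just below the half-way point between the next pair of floats.
  f = 10000001.4999999999;
  printf("%.6a %.3f 10000001.4999999999\n", f, f);
  f = 10000001.4999999999f;
  printf("%.6a %.3f 10000001.4999999999f\n", f, f);
}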

Output

0x1.312d00p+23 10000000.000 10 million
0x1.312d02p+23 10000001.000 10 million - next float

// hex value, decimal value, source code
0x1.312d00p+23 10000000.000 10000000.5000000001
0x1.312d02p+23 10000001.000 10000000.5000000001f // Different, and better

0x1.312d04p+23 10000002.000 10000001.4999999999
0x1.312d02p+23 10000001.000 10000001.4999999999f // Different, and better

Rounding mode

The double rounding issue (see note 1) is less likely when the rounding mode is up, down, or towards zero. The issue arises when the second rounding compounds the direction of the first on half-way cases.

Occurrence rate

The issue occurs when the code constant converts inexactly to a double that is very near half-way between 2 float values, so it is relatively rare. The issue applies whether the code constant is in decimal or hexadecimal form. With random constants: about 1 in 2^30.

Recommendation

Rarely a major concern, yet an f suffix is better to get the best value for a float and quiet a warning.

[Update 2022]

The issue is further complicated under 2 conditions:

  • FLT_EVAL_METHOD == 2, in which case the constant may be evaluated using long double math.

  • Evaluation of floating point constants may ignore decimal digits past a certain precision. This is allowed in C and IEEE 754. Typically this is XXX_DECIMAL_DIG + 3 digits (e.g. 20 for double).

These complications change the chance of seeing this issue. Still the conclusion remains: append f to get the best float constant.


1. Here, "double" refers to doing something twice, not to the type double.

f after number

CGRect frame = CGRectMake(0.0f, 0.0f, 320.0f, 50.0f);

uses float constants. (In Objective-C, as in C, the constant 0.0 is a double; adding an f suffix, as in 0.0f, makes the constant a 32-bit float.)

CGRect frame = CGRectMake(0, 0, 320, 50);

uses ints, which are automatically converted to the parameter type, CGFloat (a float on 32-bit platforms, a double on 64-bit ones).

In this case, there's no (practical) difference between the two.


