How to Trace a NaN in C++

How to trace a NaN in C++

In Visual Studio you can use the _controlfp function to control the behavior of floating-point calculations, including which floating-point exceptions are trapped (see http://msdn.microsoft.com/en-us/library/e9b52ceh(VS.80).aspx). Your platform may offer a similar function.

Trapping quiet NaN

If you want every NaN, overflow, and division by zero to raise a signal while debugging, that is possible.

For gcc:

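#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // feenableexcept is a glibc extension; g++ usually defines this already
#endif
#include <fenv.h>

// Call once, early in main(), before any floating-point work:
#ifndef NDEBUG
feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);
#endif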

For Visual Studio (not tested):

#include <float.h>

#ifndef NDEBUG
_clearfp();
_controlfp(_controlfp(0, 0) & ~(_EM_INVALID | _EM_ZERODIVIDE | _EM_OVERFLOW),
           _MCW_EM);
#endif

References: Microsoft, gcc.

These functions let you catch NaNs produced by floating-point operations (overflows, divisions by zero, invalid operations) as well as sNaNs used as input to a floating-point operation. They do not let you catch qNaNs used as input to a floating-point operation. The only way to find such qNaNs is to check each value individually (see Luchian Grigore's answer).

So if the component that inserts a qNaN is in the same program where you want to catch it, or if it is in a separate program but you have its source code, just enable FP exceptions with feenableexcept()/_controlfp(). Otherwise, check every value in the incoming data stream with std::isnan() (C++11) or with x != x.
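The per-value check can be sketched as follows. This is only an illustration: first_nan_index is a hypothetical helper name, and the incoming data stream is assumed to have been collected into a std::vector<double> already.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of any library): returns the index of the
// first NaN in `values`, or -1 if no element is NaN.
std::ptrdiff_t first_nan_index(const std::vector<double>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        // std::isnan(x) gives the same answer as the x != x trick.
        if (std::isnan(values[i])) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}
```

Be aware that aggressive floating-point options such as gcc's -ffast-math allow the compiler to assume NaNs never occur, in which case both std::isnan and x != x may be optimized away.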

C programming nan output

You have fallen victim to garbage values, a common mistake for beginners. Specifically, it's happening in these lines -

float total;
float low;
float high;
float average;

When you write that, the system assigns each variable a memory location. The computer's memory is not infinite, though, so it reuses a location that was used by something else and is no longer in use. Most of the time the 'something else' doesn't clean up after itself (because that would take a lot of time), so whatever was there is left behind, such as fdaba7e23f. This is totally meaningless to us, so we call it a garbage value.

You can fix this by initializing the variables, like so -

float total=0;
float low;
float high;
float average=0;

Note that you will have to add some extra logic for the low variable. Here is one way to do it -

printf("\n\nPlease enter a positive number to continue or a negative number");
printf(" to stop: ");

scanf("%f", &input);
low=input;
high=input;

while (input > 0)
....
....

As you can see, I just copied your code and added two lines after the first scanf.
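The same idea, sketched in C++ as a self-contained function. The original program is not shown in full, so Summary and summarize are hypothetical names, and the sketch assumes the loop simply accumulates positive inputs until a non-positive one arrives.

```cpp
#include <vector>

// Hypothetical result type; the field names follow the variables above.
struct Summary {
    float total;
    float low;
    float high;
    float average;
};

// Hypothetical helper: summarizes the positive inputs that come before the
// first non-positive one, mirroring the `while (input > 0)` loop.
Summary summarize(const std::vector<float>& inputs) {
    Summary s{0.0f, 0.0f, 0.0f, 0.0f};  // every field starts from a known value
    int count = 0;
    for (float input : inputs) {
        if (!(input > 0)) {
            break;  // a non-positive number stops the loop
        }
        if (count == 0) {
            s.low = input;   // seed low and high from the first input,
            s.high = input;  // like the two lines added after scanf
        }
        if (input < s.low) s.low = input;
        if (input > s.high) s.high = input;
        s.total += input;
        ++count;
    }
    if (count > 0) {
        s.average = s.total / count;  // skip the division when count is 0
    }
    return s;
}
```

The guard before the division matters as well: with total and the count both at zero, a floating-point 0/0 would itself produce NaN.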

Debug C++ code: Catch first NaN appearance

The answer is given here: https://stackoverflow.com/a/5394095/1326595

Just include

#include <fenv.h>

and then add the following line to the code:

feenableexcept(FE_INVALID | FE_OVERFLOW);

The debugger is then able to catch the signal and show the very first occurrence of a NaN.

Stopping the debugger when a NaN floating point number is produced

You could enable floating-point exceptions (see glibc Control Functions); you will then get a SIGFPE when your NaN value is produced.

Can I make gcc tell me when a calculation results in NaN or inf at runtime?

Almost any floating-point operation or math library function that produces a NaN from non-NaN inputs should also signal the 'invalid operation' floating-point exception; similarly, a calculation that produces an infinity from finite inputs will typically signal either the 'divide-by-zero' or 'overflow' floating-point exception. So you want some way of turning these exceptions into a SIGFPE.
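Before turning the exceptions into a signal, you can observe the flags themselves with the standard <cfenv> functions. The following is a minimal sketch: flags_raised_by_division is a hypothetical helper name, and the volatile variables are only there so that the division really happens at run time instead of being folded by the compiler.

```cpp
#include <cfenv>

// Hypothetical helper: performs one division and reports which of the
// invalid / divide-by-zero / overflow flags it raised (0 if none).
int flags_raised_by_division(double numerator, double denominator) {
    volatile double n = numerator;    // volatile: force a run-time division
    volatile double d = denominator;
    std::feclearexcept(FE_ALL_EXCEPT);  // start with all flags cleared
    volatile double result = n / d;
    (void)result;                       // the value itself is not needed here
    return std::fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW);
}
```

For example, flags_raised_by_division(0.0, 0.0) should report FE_INVALID, flags_raised_by_division(1.0, 0.0) should report FE_DIVBYZERO, and flags_raised_by_division(1e300, 1e-300) should report FE_OVERFLOW. This only polls the flags; it does not stop the program at the faulting operation the way a trap does.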

I suspect the answer will be highly system-dependent, since control of floating-point traps and flags is likely to be supplied by the platform C library rather than by gcc itself. But here's an example that works for me, on Linux. It uses the feenableexcept function from fenv.h. The _GNU_SOURCE define is necessary for this function to be declared.

#define _GNU_SOURCE
#include <fenv.h>

int main(void) {
    volatile double x, y, z; /* volatile keeps the compiler from folding or dropping the multiplication */
    feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);

    x = 1e300;
    y = 1e300;
    z = x * y; /* should cause an FPE */

    return 0;
}

A caveat: I think it's possible with some setups that the exception isn't actually generated until the next floating-point operation after the one that (in theory) should have caused it, so you sometimes need a no-op floating-point operation (e.g. multiplying by 1.0) to trigger the exception.

Unusual cause of NaN in C++? Do limits approaching zero cause NaN?

In every case the denominator tends to infinity, which means the whole fraction tends to 0. With the limited range of a double, however, this means that somewhere in your code you really do end up dividing by infinity.

The explanation lies in the C++ exp function, which returns HUGE_VAL (positive infinity on typical IEEE 754 platforms) if the result cannot be represented as a double.

So once an intermediate result no longer fits in a double, it becomes infinity. A finite value divided by infinity is still 0, but if the numerator has overflowed as well, you get infinity divided by infinity, which is NaN.
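A small sketch of that effect. The expression from the question is not reproduced here, so naive_ratio and stable_ratio are hypothetical stand-ins for a fraction whose numerator and denominator both contain exp.

```cpp
#include <cmath>

// Hypothetical fraction: both exponentials overflow to infinity for large
// arguments, and infinity divided by infinity is NaN.
double naive_ratio(double a, double b) {
    return std::exp(a) / std::exp(b);
}

// Mathematically the same value, exp(a) / exp(b) == exp(a - b), but the
// intermediate result stays within the range of a double.
double stable_ratio(double a, double b) {
    return std::exp(a - b);
}
```

With a = b = 1000, std::exp(1000.0) overflows, so naive_ratio returns NaN while stable_ratio returns 1. Rearranging the formula like this is often enough, without resorting to a big-number class.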

By the way, if you want to operate on big numbers, you can implement a class that stores them in, e.g., a string and overloads the arithmetic operators.

C++ float number to nan

Taken from Wikipedia -> special values -> NaN

0/0
∞×0
sqrt(−1)
in general "invalid operations" (I am not sure whether there are more than the three above)

Looking at your code: infinity times 0 is possible, isn't it?

edit:

0 <= s <= +inf
1 <= m <= +inf

s / m:

+inf / +inf does indeed make minus NaN (I tested it)

I think that's the only thing that makes a NaN.
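For reference, here is a short sketch that evaluates the candidates mentioned in this answer, plus inf - inf, which IEEE 754 also classifies as an invalid operation. invalid_operation_results is a hypothetical helper name; the volatile variables just force the arithmetic to happen at run time.

```cpp
#include <array>
#include <cmath>
#include <limits>

// Hypothetical helper: evaluates five "invalid operations" and returns the
// results in the order 0/0, inf*0, sqrt(-1), inf/inf, inf-inf.
std::array<double, 5> invalid_operation_results() {
    volatile double zero = 0.0;
    volatile double inf = std::numeric_limits<double>::infinity();
    volatile double minus_one = -1.0;
    return {
        zero / zero,
        inf * zero,
        std::sqrt(minus_one),
        inf / inf,
        inf - inf,
    };
}
```

On an IEEE 754 platform every element should satisfy std::isnan. The sign that gets printed (nan or -nan) depends on the platform, so it is best not to rely on it.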

Testing for a float NaN results in a stack overflow

The solution:

First of all, thank you to @Matt for pointing me in the right direction, and @Hans Passant for providing the workaround.

The application talks to a CAN-USB adapter from Chinese manufacturer QM_CAN.

The problem is in their driver.

The DLL statements and Driver import:

    // DLL Statement
    IntPtr QM_DLL;
    TYPE_Init_can Init_can;
    TYPE_Quit_can Quit_can;
    TYPE_Can_send Can_send;
    TYPE_Can_receive Can_receive;
    delegate int TYPE_Init_can(byte com_NUM, byte Model, int CanBaudRate, byte SET_ID_TYPE, byte FILTER_MODE, byte[] RXF, byte[] RXM);
    delegate int TYPE_Quit_can();
    delegate int TYPE_Can_send(byte[] IDbuff, byte[] Databuff, byte FreamType, byte Bytes);
    delegate int TYPE_Can_receive(byte[] IDbuff, byte[] Databuff, byte[] FreamType, byte[] Bytes);

    // Driver
    [DllImport("kernel32.dll")]
    static extern IntPtr LoadLibrary(string lpFileName);
    [DllImport("kernel32.dll")]
    static extern IntPtr GetProcAddress(IntPtr hModule, string lpProcName);

The call to the offending code, including Hans' workaround:

    private void InitCanUsbDLL() // Initiate the driver for the CAN-USB dongle
    {

        // Here is an example of dynamically loaded DLL functions
        QM_DLL = LoadLibrary("QM_USB.dll");
        if (QM_DLL != IntPtr.Zero)
        {
            IntPtr P_Init_can = GetProcAddress(QM_DLL, "Init_can");
            IntPtr P_Quit_can = GetProcAddress(QM_DLL, "Quit_can");
            IntPtr P_Can_send = GetProcAddress(QM_DLL, "Can_send");
            IntPtr P_Can_receive = GetProcAddress(QM_DLL, "Can_receive");
            // The next line results in a FPU stack overflow if float.NaN is called by a handler
            Init_can = (TYPE_Init_can)Marshal.GetDelegateForFunctionPointer(P_Init_can, typeof(TYPE_Init_can));
            // Reset the FPU, otherwise we get a stack overflow when we work with float.NaN within an event handler
            // Thanks to Matt for pointing me in the right direction and to Hans Passant for this workaround:
            // http://stackoverflow.com/questions/25205112/testing-for-a-float-nan-results-in-a-stack-overflow/25206025
            try { throw new Exception("Please ignore, resetting FPU"); }
            catch { } 
            Quit_can = (TYPE_Quit_can)Marshal.GetDelegateForFunctionPointer(P_Quit_can, typeof(TYPE_Quit_can));
            Can_send = (TYPE_Can_send)Marshal.GetDelegateForFunctionPointer(P_Can_send, typeof(TYPE_Can_send));
            Can_receive = (TYPE_Can_receive)Marshal.GetDelegateForFunctionPointer(P_Can_receive, typeof(TYPE_Can_receive));
        }
    }

The reason that the application crashed when a reference was made to float.NaN in the event handler and not in the constructor was a simple matter of timing: the constructor is called before InitCanUsbDLL(), but the event handler was called long after InitCanUsbDLL() corrupted the FPU registers.

Usefulness of signaling NaN?

As I understand it, the purpose of a signaling NaN is to initialize data structures. Runtime initialization in C, however, runs the risk of loading the NaN into a float register as part of the initialization, which triggers the signal because the compiler isn't aware that this float value needs to be copied using an integer register.

I would hope that you could initialize a static value with a signaling NaN, but even that would require some special handling by the compiler to avoid having it converted to a quiet NaN. You could perhaps use a bit of casting magic to avoid having it treated as a float value during initialization.

If you were writing in ASM, this would not be an issue. But in C, and especially in C++, I think you will have to subvert the type system in order to initialize a variable with a signaling NaN. I suggest using memcpy.
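A minimal sketch of the memcpy approach, under stated assumptions: float is a 32-bit IEEE 754 type, and the platform uses the common convention (x86, ARM) in which a clear top mantissa bit marks a NaN as signaling. The names kSignalingNaNBits, store_signaling_nan and raw_bits are made up for this example.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "assumes IEEE 754 floats");

// Assumed bit pattern: exponent all ones, quiet bit (bit 22) clear,
// non-zero mantissa. Some older architectures use the opposite convention.
constexpr std::uint32_t kSignalingNaNBits = 0x7FA00000u;

// Writes the pattern straight into the float's storage, so the value never
// has to pass through a floating-point register.
void store_signaling_nan(float& target) {
    std::memcpy(&target, &kSignalingNaNBits, sizeof target);
}

// Reads the raw bits back the same way, again without a float load.
std::uint32_t raw_bits(const float& value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

The standard library also offers std::numeric_limits<float>::signaling_NaN(), but passing or returning it by value can hit exactly the register problem described above on some platforms, which is why the sketch copies bytes instead.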
