Stopping the Debugger When a NaN Floating Point Number Is Produced

Stopping the debugger when a NaN floating point number is produced

You could enable floating point exceptions (see glibc Control Functions); you will then get a SIGFPE when your NaN value is produced.

Stopping the debugger when a NaN floating point number is produced without a code change

No matter how many files your code spans, you need to call feenableexcept(FE_ALL_EXCEPT & ~FE_INEXACT) only once, as the first line of your main() function.

It will enable the exceptions for your whole program until you disable the exceptions by calling another function such as fedisableexcept().

Debug C++ code: Catch first NaN appearance

The answer is given here: https://stackoverflow.com/a/5394095/1326595

Just include

#include <fenv.h>

and then add the following line to the code:

feenableexcept(FE_INVALID | FE_OVERFLOW);

The debugger is then able to capture the signal and show the very first occurrence of a NaN.

How to trace a NaN in C++

In Visual Studio you can use the _controlfp function to set the behavior of floating-point calculations (see http://msdn.microsoft.com/en-us/library/e9b52ceh(VS.80).aspx). Maybe there is a similar variant for your platform.

Why is GDB backtrace limited when using floating point control?

Is there a way to preserve the trace information with SIGFPE?

The trace info has ~nothing to do with which signal is being raised, and ~everything to do with the function it is raised in.

Somehow your pow is missing an unwind descriptor (which is what GDB uses to unwind the stack).

This often happens with assembly-level implementations (where the developer neglects to put in the appropriate .cfi directives), or when building code with broken compilers.

The broken compiler seems unlikely, and I can't find any recent versions of GLIBC that used assembly to implement pow.

To recover the stack, the following techniques may work:

Use a reverse debugger (such as rr) and go backwards from the SIGFPE. This is the best solution, but I doubt rr is available for your (apparently quite old) system.
Count the number of times pow is called before the crash:
(gdb) break pow
(gdb) commands 1
silent
cont
end
(gdb) run
# run until SIGFPE
(gdb) info break
You will now know how many times pow was called before the crash.

Run the program again, ignoring the breakpoint $N-1 times (you'll need to remove the commands from the breakpoint first and use GDB's ignore 1 $N-1 command). You should now be stopped just before the crash, and since you are not yet inside pow, GDB should have no trouble showing you the stack trace.

This approach only works if your program is deterministic.

Is it possible to break the execution in gdb at the first assignment of inf?

With gcc on Linux you can turn floating-point exceptions into a SIGFPE signal. You can call feenableexcept(FE_DIVBYZERO) to catch all subsequent floating point divisions by zero in your code. If you run this code in gdb, it will stop on the SIGFPE signal. This is the default behavior for most other signals as well.
This code was taken and modified from here: https://stackoverflow.com/a/2949452/72178

#define _GNU_SOURCE
#include <fenv.h>

int main(void) {
    double x, y, z;
    feenableexcept(FE_DIVBYZERO);

    x = 1;
    y = 0;
    z = x / y;

    return 0;
}

gdb will stop on division by zero:

$ gdb -q ./a.out
Reading symbols from ./a.out...
(gdb) r
Program received signal SIGFPE, Arithmetic exception.
0x0000000000401153 in main () at 1.c:10
10      z = x / y;
(gdb)

Detecting when any variable in a large JS program is set to NaN

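If you're doing a good job keeping things off of the global namespace and nesting things in objects, this might be of help. And I will preface this by saying this is by no means a fully complete solution, but at the very least, this should help you on your search. Note that it relies on Object.observe, which was later withdrawn as a language proposal and removed from browsers, so it only runs on engines that still provide it.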

function deepNaNWatch(objectToWatch) {
  'use strict';

  // Setting this to true will check object literals for NaN
  // For example: obj.example = { myVar : NaN };
  // This will, however, cost even more performance
  var configCheckObjectLiterals = true;

  var observeAllChildren = function observeAllChildren(parentObject) {

    for (var key in parentObject) {
      if (parentObject.hasOwnProperty(key)) {
        var childObject = parentObject[key];

        examineObject(childObject);
      }
    }
  };

  var examineObject = function examineObject(obj) {
    var objectType = typeof obj;

    // typeof null is 'object', and Object.observe(null) would throw
    if ((objectType === 'object' && obj !== null) || objectType === 'function') {
      Object.observe(obj, recursiveWatcher);
      if (configCheckObjectLiterals) {
        observeAllChildren(obj);
      }
    } else if (objectType === 'number' && isNaN(obj)) {
      console.log('A wild NaN appears!');
    }
  };

  var recursiveWatcher = function recursiveWatcher(changes) {
    // Changes are delivered in batches, so examine every record,
    // not just the first one
    changes.forEach(function (changeInfo) {
      examineObject(changeInfo.object[changeInfo.name]);
    });
  };

  Object.observe(objectToWatch, recursiveWatcher);
}

Call deepNaNWatch(parentObject) for every top level object/function you're using to nest things under as soon as they are created. Any time an object or function is created within a watched object/function, it itself will become watched as well. Any time a number is created or changed under a watched object--remember that typeof NaN == 'number'--it will check if it's NaN, and if so will run the code at console.log('A wild NaN appears!');. Be sure to change that to whatever sort of debugging output you feel will help.

This function would be more helpful if someone could find a way to force it onto the global object, but every attempt I made to do so simply told me I should sit in time out and think about what I've done.

Oh, and if it's not obvious from the above, on a large scale project, this function is bound to make pesky features like "speed" and "efficiency" a thing of the past.

Force gfortran to stop program at first NaN

The flag you're looking for is -ffpe-trap=invalid; I usually add ,zero,overflow to check for related floating point exceptions.

program nantest
    real :: a, b, c

    a = 1.
    b = 2.

    c = a/b
    print *, c,a,b

    a = 0.
    b = 0.

    c = a/b
    print *, c,a,b

    a = 2.
    b = 1.

    c = a/b
    print *,c,a,b
end program nantest

Then compiling it and running it in a debugger gives:

$ gfortran -o nantest nantest.f90 -ffpe-trap=invalid,zero,overflow -g -static
$ gdb nantest
[...]
(gdb) run
Starting program: /scratch/ljdursi/Testing/fortran/nantest 
  0.50000000       1.0000000       2.0000000    

Program received signal SIGFPE, Arithmetic exception.
0x0000000000400384 in nantest () at nantest.f90:13
13          c = a/b
Current language:  auto; currently fortran

With the Intel Fortran compiler (ifort), using the option -fpe0 will do the same thing.

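It's a little trickier with C/C++ code; we have to actually insert a call to feenableexcept(), which enables floating point exceptions and is declared in fenv.h (it is a glibc extension, so in C you may need to define _GNU_SOURCE before the include, e.g. by compiling with -D_GNU_SOURCE):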

#include <stdio.h>
#include <fenv.h>

int main(int argc, char **argv) {  
    float a, b, c;
    feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);

    a = 1.;
    b = 2.;

    c = a/b;
    printf("%f %f %f\n", a, b, c);

    a = 0.;
    b = 0.;

    c = a/b;
    printf("%f %f %f\n", a, b, c);

    a = 2.;
    b = 1.;

    c = a/b;
    printf("%f %f %f\n", a, b, c);

    return 0;
}

but the effect is the same:

$ gcc -o nantest nantest.c -lm -g
$ gdb ./nantest
[...]
(gdb) run
Starting program: /scratch/s/scinet/ljdursi/Testing/exception/nantest  
1.000000 2.000000 0.500000

Program received signal SIGFPE, Arithmetic exception.  
0x00000000004005d0 in main (argc=1, argv=0x7fffffffe4b8) at nantest.c:17  
17        c = a/b;

Either way, you have a much better handle on where the errors are occurring.
