Tracing Back a Pointer in C++ Code with GDB

Understanding C pointers using GDB by examining core and call stack

When a function is called in C, the parameters are copied into registers or pushed onto the stack. The called function can reuse those registers and stack locations for any purpose. Often, but not always, a parameter is kept in the same register or stack location for the entire lifetime of the function call.

On a 32-bit system, the first argument to a function (in your case, inp) is often located on the stack at a fixed offset from the address held in the stack frame's base pointer, 12 bytes in this example. See stackoverflow.com: what exactly is program stack's growth direction.

When gdb does a backtrace, the only guidance it has from the compiler is something like "the first argument to func2 is named inp and is a 4-byte value of type UTYPE * located at a 12-byte offset from the %ebp register".

If, somewhere in func2, you alter inp, as you do at location (4), then any backtrace from that point on may well show the altered value of inp, in your case 0. The value that inp had when func2 was entered is lost unless the compiler has emitted extra guidance such as "the first argument to func2 is named inp, is a 4-byte value of type UTYPE *, and its value upon entry to func2 can be found by unwinding the stack to the previous frame and looking at the value of ptr, which is located at a -4-byte offset from the %ebp register." Newer versions of the DWARF debugging format can express this kind of thing; DWARF 5, for example, added the DW_OP_entry_value operation for describing a value as it was on entry to the function.

I cannot explain why your gdb's backtrace shows ptr in func1's frame as having the value 0. Setting inp to NULL should have no effect on ptr's value nor on gdb's ability to show ptr's value.
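The claim that overwriting inp cannot touch ptr can be checked with a small sketch. The names func1, func2, inp, ptr, and UTYPE follow the question as quoted above; the function bodies and the definition of UTYPE are assumptions made for illustration, since the original code is not shown here.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's UTYPE; the real definition is unknown.
struct UTYPE { int value; };

// func2 receives a COPY of the caller's pointer. Overwriting that copy
// changes what a debugger may later print for inp, but not the caller's ptr.
int func2(UTYPE *inp) {
    assert(inp != nullptr);
    int seen = inp->value;  // use the argument while it is still valid
    inp = nullptr;          // from here on, a backtrace may show inp=0x0
    return inp == nullptr ? seen : -1;
}

// Plays the role of func1: owns ptr and reports whether it survived the call.
bool func1_ptr_survives() {
    UTYPE object{42};
    UTYPE *ptr = &object;
    int seen = func2(ptr);
    return ptr == &object && seen == 42;
}
```

Calling func1_ptr_survives() should return true: the assignment inside func2 only changes func2's own copy of the pointer.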

this pointer changes in GDB backtrace

The this pointer can change between frames in a gdb trace if the function in the next frame is called on a different object (even if the objects are the same type), since this is for the specific instance. This is probably not your problem.

0x200 is not a valid value for this, and almost certainly indicates memory corruption of some type. The this pointer is sometimes stored on the stack and passed as an invisible first argument to a function. So if you have corrupted the stack (by going out of bounds writing to another variable) you could see the this pointer corrupted.

The value 0x200 itself is interesting. Because it is so close to 0, but not actually 0, it indicates that the instance you're looking at is probably part of another object or array, located 0x200 bytes from the beginning of that object/array, and that the object/array's address is actually NULL. Looking at your code you should be able to pretty easily figure out which object has gotten set to NULL, which is causing this to report 0x200.
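As a rough illustration of how a null enclosing object can produce a this value of 0x200, the sketch below uses offsetof on a made-up layout. Outer, Inner, padding, and member_address are hypothetical names; the real types in your program will differ, and nothing here dereferences a null pointer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: an Inner object sits 0x200 bytes into Outer.
struct Inner {
    int x;
};

struct Outer {
    char padding[0x200];
    Inner member;
};

static_assert(offsetof(Outer, member) == 0x200, "member starts 0x200 bytes in");

// Address the member WOULD have if the enclosing Outer lived at `base`.
// This is pure arithmetic; no memory is read.
constexpr std::uintptr_t member_address(std::uintptr_t base) {
    return base + offsetof(Outer, member);
}
```

With a base of 0 (a null Outer pointer), member_address yields 0x200, which is the value a method called on that member would receive as this.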

How gdb reconstructs stacktrace for C++?

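In pseudocode, you could call the stack "an array of packed stack frames", where every stack frame is a data structure of variable size that you could express like this:

#include <cstddef>
#include <cstdint>

template <std::size_t N>
struct stackframe {
    std::uintptr_t contents[N];  // locals, spills, outgoing arguments
#ifndef OMIT_FRAME_POINTER
    void *nextfp;                // caller's frame, which has its own N
#endif
    void *retaddr;               // address to resume at in the caller
};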

Problem is that every function has a different <N> - frame sizes vary.

The compiler knows frame sizes, and if creating debugging information will usually emit these as part of that. All the debugger then needs to do is to locate the last program counter, look up the function in the symbol table, then use that name to look up the framesize in the debugging information. Add that to the stackpointer and you get to the beginning of the next frame.
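The lookup-and-skip loop described above can be sketched against a simulated stack. Everything here is an assumption made for illustration: FunctionInfo stands in for the symbol table plus debugging information, the stack is a plain vector of words with index 0 at the stack pointer, and each frame is simply its contents followed by a return address. Real debuggers read this information from formats such as DWARF call frame information.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical debug-info record: a function's code range and frame size.
struct FunctionInfo {
    std::uintptr_t start, end;  // [start, end) in the simulated text segment
    std::size_t frame_words;    // words of frame contents below the return address
    std::string name;
};

// Find the function whose code range contains pc, or nullptr if none does.
const FunctionInfo *find_function(const std::vector<FunctionInfo> &table,
                                  std::uintptr_t pc) {
    for (const FunctionInfo &f : table)
        if (pc >= f.start && pc < f.end) return &f;
    return nullptr;
}

// stack[0] is the word at the stack pointer; higher indices are older data.
std::vector<std::string> simulated_backtrace(const std::vector<FunctionInfo> &table,
                                             const std::vector<std::uintptr_t> &stack,
                                             std::uintptr_t pc) {
    std::vector<std::string> names;
    std::size_t sp = 0;
    while (const FunctionInfo *f = find_function(table, pc)) {
        names.push_back(f->name);
        sp += f->frame_words;           // skip this frame's contents
        if (sp >= stack.size()) break;  // ran off the simulated stack
        pc = stack[sp++];               // return address is the caller's pc
    }
    return names;
}
```

The walk stops as soon as a return address does not fall inside any known function, which is also roughly where a real backtrace would start printing ??.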

With this method you do not need frame linkage, and backtracing works fine even if you compile with -fomit-frame-pointer. On the other hand, if you do have frame linkage, iterating the stack is just following a linked list, because the function prologue initializes the frame pointer in each new stack frame to point to the previous one.
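The linked-list view of frame pointers can be sketched as follows. FrameRecord and walk_frames are hypothetical names, and the two-field record is a simplification of what a typical prologue leaves on the stack; the sketch walks records you build yourself rather than the live machine stack.

```cpp
#include <cstddef>
#include <vector>

// Simplified frame record: the caller's saved frame pointer, followed by
// the return address into the caller.
struct FrameRecord {
    const FrameRecord *caller_fp;  // nullptr marks the outermost frame
    const void *return_address;
};

// Follow the chain from the innermost frame outward, collecting return
// addresses. max_depth guards against a corrupted (cyclic) chain.
std::vector<const void *> walk_frames(const FrameRecord *fp,
                                      std::size_t max_depth = 64) {
    std::vector<const void *> addresses;
    while (fp != nullptr && addresses.size() < max_depth) {
        addresses.push_back(fp->return_address);
        fp = fp->caller_fp;
    }
    return addresses;
}
```

The depth limit is a design choice for the sketch: a corrupted stack can make the chain loop back on itself, and a walker without a bound would never terminate.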

If you have neither frame size information nor frame pointers, but still have a symbol table, you can also backtrace with a bit of reverse engineering to calculate the frame sizes from the actual binary. Start with the program counter, look up the function it belongs to in the symbol table, and then disassemble the function from the start. Isolate all operations between the beginning of the function and the program counter that actually modify the stack pointer (write anything to the stack and/or allocate stack space). That gives the frame size for the current function. Undo that adjustment (on a downward-growing stack, add the frame size back to the stack pointer) and you should, on most architectures, find the last word written to the stack before the function was entered, which is usually the return address into the caller. Repeat as necessary.

Finally, you can perform a heuristic analysis of the contents of the stack. Isolate all words in the stack that point into executable-mapped segments of the process address space (and could therefore be return addresses), then play a what-if game: look up the memory just before each candidate, disassemble the instruction there, and see whether it actually is a call instruction of some sort; if so, check whether it really called the 'next' function and whether you can construct an uninterrupted call sequence from that. This works to a degree even if the binary is completely stripped (although all you get in that case is a list of return addresses). I don't think GDB employs this technique, but some low-level embedded debuggers do. On x86 this is very difficult because instruction lengths vary and you can't easily "step back" through an instruction stream; on RISC architectures with fixed instruction lengths, such as ARM, it is much simpler.
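Only the first step of that heuristic, filtering stack words by address range, is easy to show in a few lines. In this sketch ExecRange and candidate_return_addresses are hypothetical names, and the executable ranges are passed in by the caller; on Linux a tool might read them from /proc/<pid>/maps, which is not done here. Checking that a call instruction precedes each candidate is left out.

```cpp
#include <cstdint>
#include <vector>

// One executable mapping of the process, as the half-open range [start, end).
struct ExecRange {
    std::uintptr_t start, end;
};

// Return every stack word that points into an executable range, in stack
// order. These are only CANDIDATE return addresses: a real tool would still
// have to verify that the instruction before each one is a call.
std::vector<std::uintptr_t> candidate_return_addresses(
        const std::vector<std::uintptr_t> &stack_words,
        const std::vector<ExecRange> &exec_ranges) {
    std::vector<std::uintptr_t> candidates;
    for (std::uintptr_t word : stack_words) {
        for (const ExecRange &r : exec_ranges) {
            if (word >= r.start && word < r.end) {
                candidates.push_back(word);
                break;  // one match is enough for this word
            }
        }
    }
    return candidates;
}
```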

There are some holes that make simple or even complex/exhaustive implementations of these algorithms fall over sometimes, like tail-recursive functions, inlined code, and so on. The gdb sourcecode might give you some more ideas:

https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/frame.c

GDB employs a variety of such techniques.

How can one see content of stack with GDB?

Use info frame to show information about the current stack frame.

To read the memory at a given address, use the x (examine) command:

x/x $esp (hexadecimal)
x/d $esp (signed decimal)
x/u $esp (unsigned decimal)

x takes a format suffix, so you can also look at the current instruction with x/i $eip. On x86-64 the registers are $rsp and $rip; the aliases $sp and $pc work on any architecture.

gdb backtrace of a core file prints error no such file or directory

This error looks mystifying but it is correct. It shows that a NULL pointer de-reference was being made by strcmp, which was called from line 1144 of your code.

A segmentation fault means the program tried to access memory that is invalid for that kind of access: the page is not mapped, or is not mapped for reading or writing, in the MMU. In this case, strcmp is trying to access page 0 because you passed it a NULL pointer. A null pointer is address zero, and page 0 is an invalid page.

The reference to file:

../sysdeps/x86_64/multiarch/strcmp-sse2-unaligned.S

refers to the assembler file (.S) that implements strcmp for x86-64. Since you do not have that implementation file on your system, gdb complains that it cannot access it.
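One common way to avoid this kind of crash is to check for null before calling strcmp. The helper below is a hypothetical sketch, not part of any library; ordering a null pointer before every non-null string is an arbitrary choice made for the example.

```cpp
#include <cstring>

// Hypothetical helper: compare two C strings without ever passing a null
// pointer to strcmp. A null pointer sorts before any non-null string, and
// two null pointers compare equal.
int safe_strcmp(const char *a, const char *b) {
    if (a == nullptr || b == nullptr) {
        // Yields -1, 0, or 1 depending on which side is null.
        return (a != nullptr) - (b != nullptr);
    }
    return std::strcmp(a, b);
}
```

In a real fix it is usually better to find out why the pointer at line 1144 is null in the first place; a wrapper like this only stops the crash.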

Determine the line of code that causes a segmentation fault?

GCC can't do that, but GDB (a debugger) can. Compile your program with the -g switch, like this:

gcc program.c -g

Then use gdb:

$ gdb ./a.out
(gdb) run
<segfault happens here>
(gdb) backtrace
<offending code is shown here>

Here is a nice tutorial to get you started with GDB.

The place where the segfault occurs is generally only a clue to where the mistake that causes it sits in the code. The reported location is not necessarily where the problem resides.

What does ?? in gdb backtrace mean and how to get the actual stack frames?

Those ?? are usually where the name of the function is displayed. GDB does not know the name of those functions and therefore displays ??.

Now, why is this happening? Depends. GCC compiles including symbols (e.g. function names and similar) by default. Most probably you are working with a stripped version, where symbols have been removed, or just with the wrong file.

As @zwol suggests, the line you see warning: exec file is newer than core file is an indication of the fact that something else is going on that you don't show in your question. You are working on a core dump file generated by the crashed executable, which is outdated.

I suggest recompiling the program from scratch and making sure that you open the right file with GDB. First produce a new core dump by crashing the new program, then open it in GDB.

Assuming the following program.c:

int main(void) { return 1/0; }

This should work:

$ rm -f core
$ gcc program.c -o program
$ ./program
Floating point exception (core dumped)

$ gdb program core
Reading symbols from program...(no debugging symbols found)...done.
[New LWP 11896]
Core was generated by `./program'.
Program terminated with signal SIGFPE, Arithmetic exception.
#0 0x000055d24a4cd790 in main ()
(gdb) bt
#0 0x000055d24a4cd790 in main ()
(gdb)

NOTE: if you don't see (core dumped) when running the process that means that a core dump was not generated (which leaves you with the old one). If you are using Bash, try running the command ulimit -c unlimited before crashing the program.
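If you would rather raise the limit from inside the program than from the shell, the POSIX getrlimit and setrlimit calls can be used. This is a minimal sketch assuming a POSIX system; enable_core_dumps is a hypothetical name. It only raises the soft limit up to the existing hard limit, so it cannot help if the hard limit is already 0, and where the core file ends up still depends on system configuration.

```cpp
#include <sys/resource.h>

// Raise the soft core-file size limit to the hard limit for this process
// and any children it starts afterwards. Returns true on success.
bool enable_core_dumps() {
    struct rlimit limit;
    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return false;  // could not read the current limits
    }
    limit.rlim_cur = limit.rlim_max;  // soft limit may go up to the hard limit
    return setrlimit(RLIMIT_CORE, &limit) == 0;
}
```

Call it early in main, before the code that may crash.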


