How does gdb start an assembly compiled program and step one line at a time?
starti implementation
As usual for a process that wants to start another process, it does a fork/exec, like a shell does. But in the new process, GDB doesn't just make an execve system call right away.
Instead, the child first calls ptrace(PTRACE_TRACEME), which asks the kernel to make its parent its tracer. That way GDB (the parent) is already attached before the child makes the execve() system call that starts executing the specified executable file.
Also note in the execve(2) man page:
If the current program is being ptraced, a SIGTRAP signal is sent
to it after a successful execve().
So that's how the kernel debugging API supports stopping before the first user-space instruction is executed in a newly-execed process, i.e. exactly what starti wants. This doesn't depend on setting a breakpoint; that can't happen until after execve anyway, and with ASLR the correct address isn't even known until after execve picks a base address. (GDB by default disables ASLR, but it still works if you tell it not to disable ASLR.)
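The PTRACE_TRACEME / execve handshake described above can be reproduced outside GDB. What follows is a minimal sketch in Python, assuming Linux with glibc; it is not GDB's actual code. It calls the C library's ptrace() through ctypes, and the function name first_stop_after_exec and the PTRACE_TRACEME constant definition are made up for this illustration.

```python
import ctypes
import os
import signal
import sys

PTRACE_TRACEME = 0  # request number from <sys/ptrace.h> on Linux

def first_stop_after_exec(argv):
    """Fork, request tracing in the child, exec argv, and report the first stop.

    Returns the stop signal number seen by the parent, or None if the child
    exited instead of stopping (e.g. because ptrace was not permitted).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_long,
                            ctypes.c_void_p, ctypes.c_void_p]
    libc.ptrace.restype = ctypes.c_long

    pid = os.fork()
    if pid == 0:
        # Child: become a tracee of the parent, then exec the target.
        try:
            if libc.ptrace(PTRACE_TRACEME, 0, None, None) == 0:
                os.execv(argv[0], argv)
        finally:
            os._exit(127)  # only reached if ptrace or execv failed

    # Parent: the kernel stops the child with a signal after a successful execve.
    _, status = os.waitpid(pid, 0)
    if not os.WIFSTOPPED(status):
        return None
    sig = os.WSTOPSIG(status)
    os.kill(pid, signal.SIGKILL)  # this sketch does not go on to debug the child
    os.waitpid(pid, 0)
    return sig
```

Calling first_stop_after_exec([sys.executable, "-c", "pass"]) should, per the man page text quoted above, report signal.SIGTRAP: the child is stopped before it runs a single user-space instruction of the new program. If the calling process is itself being traced, the child's ptrace call is expected to fail and the function returns None.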
This is also what GDB uses if you set breakpoints before run, either manually or by using start to set a one-time breakpoint on main. Before the starti command existed, a hack to emulate that functionality was to set an invalid breakpoint before run, so GDB would stop on that error, giving you control at that point.
If you strace -f -o gdb.trace gdb ./foo or something, you'll see some of what GDB does. (Nested tracing apparently doesn't work, so running GDB under strace means GDB's ptrace system call fails, but we can see what it does leading up to that.)
...
231566 execve("/usr/bin/gdb", ["gdb", "./foo"], 0x7ffca2416e18 /* 57 vars */) = 0
# the initial GDB process is PID 231566.
... whole bunch of stuff
231566 write(1, "Starting program: /tmp/foo \n", 28) = 28
231566 personality(0xffffffff) = 0 (PER_LINUX)
231566 personality(PER_LINUX|ADDR_NO_RANDOMIZE) = 0 (PER_LINUX)
231566 personality(0xffffffff) = 0x40000 (PER_LINUX|ADDR_NO_RANDOMIZE)
231566 vfork( <unfinished ...>
# 231584 is the new PID created by vfork that would go on to execve the new program
231584 openat(AT_FDCWD, "/proc/self/fd", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 13
231584 newfstatat(13, "", {st_mode=S_IFDIR|0500, st_size=0, ...}, AT_EMPTY_PATH) = 0
231584 getdents64(13, 0x558403e20360 /* 16 entries */, 32768) = 384
231584 close(3) = 0
... all these FDs
231584 close(12) = 0
231584 getdents64(13, 0x558403e20360 /* 0 entries */, 32768) = 0
231584 close(13) = 0
231584 getpid() = 231584
231584 getpid() = 231584
231584 setpgid(231584, 231584) = 0
231584 ptrace(PTRACE_TRACEME) = -1 EPERM (Operation not permitted)
231584 write(2, "warning: ", 9) = 9
231584 write(2, "Could not trace the inferior pro"..., 37) = 37
231584 write(2, "\n", 1) = 1
231584 write(2, "warning: ", 9) = 9
231584 write(2, "ptrace", 6) = 6
231584 write(2, ": ", 2) = 2
231584 write(2, "Operation not permitted", 23) = 23
231584 write(2, "\n", 1) = 1
# gotta love unbuffered stderr
231584 exit_group(127) = ?
231566 <... vfork resumed>) = 231584 # in the parent
231584 +++ exited with 127 +++
# then the parent is running again
231566 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=231584, si_uid=1000, si_status=127, si_utime=0, si_stime=0} ---
231566 rt_sigreturn({mask=[]}) = 231584
... then I typed "quit" and hit return
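A trace like this gets long, so it can help to filter it mechanically. The sketch below is only an illustration: parse_strace_line and failed_calls are hypothetical helper names, and the regular expression handles only complete single-line records of the form shown above, not the "<unfinished ...>" / "resumed" pairs.

```python
import re

# PID, syscall name, argument text, and everything after " = ".
LINE_RE = re.compile(r"^(\d+)\s+(\w+)\((.*)\)\s+=\s+(.*)$")

def parse_strace_line(line):
    """Return (pid, syscall, result) for a complete record, else None."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    pid, name, _args, result = m.groups()
    return int(pid), name, result

def failed_calls(lines):
    """Yield records whose result starts with -1, i.e. the call failed."""
    for line in lines:
        rec = parse_strace_line(line)
        if rec is not None and rec[2].startswith("-1"):
            yield rec

# A few lines copied from the trace above.
sample = [
    "231566 vfork( <unfinished ...>",
    "231584 setpgid(231584, 231584) = 0",
    "231584 ptrace(PTRACE_TRACEME) = -1 EPERM (Operation not permitted)",
    "231584 exit_group(127) = ?",
]
```

Running list(failed_calls(sample)) picks out the ptrace(PTRACE_TRACEME) call in the vforked child, which is the one line in this trace that explains why the inferior never started.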
There were some earlier clone system calls to create more threads in the main GDB process, but those didn't exit until after the vforked PID that attempted ptrace(PTRACE_TRACEME). They were all just threads since they used clone with CLONE_VM. There was one earlier vfork / execve of /usr/bin/iconv.
Annoyingly, modern Linux has moved to PIDs wider than 16-bit so the numbers get inconveniently large for human minds.
step implementation:
Unlike stepi, which would use PTRACE_SINGLESTEP on ISAs that support it (e.g. x86 where the kernel can use the TF trap flag, but interestingly not ARM), step is based on source-level line number <-> address debug info. That's usually pointless for asm, unless you want to step past macro expansions or something.
But for step, GDB will use ptrace(PTRACE_POKETEXT) to write an int3 debug-break opcode over the first byte of an instruction, then ptrace(PTRACE_CONT) to let execution run in the child process until it hits a breakpoint or other signal. (Then it puts back the original opcode byte when this instruction needs to execute.) The place where it puts that breakpoint is something it finds by looking for the next address of a line number in the DWARF or STABS debug info (metadata) in the executable. That's why only stepi (aka si) works when you don't have debug info.
Or possibly it would use PTRACE_SINGLESTEP one or two times as an optimization if it saw it was close.
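The two ingredients described here, looking up the next line-start address and patching in a breakpoint byte, can be modelled without a real tracee. This is a simplified sketch and not GDB's implementation: the "text segment" is a bytearray, line_starts stands in for the line table from debug info, all function names are invented, and jumps or calls that leave the straight-line path are ignored.

```python
import bisect

INT3 = 0xCC  # x86 one-byte breakpoint opcode

def next_line_address(line_starts, pc):
    """Return the first line-start address strictly after pc, or None.

    line_starts is a sorted list of addresses where a source line begins,
    standing in for the line table a debugger reads from debug info.
    """
    i = bisect.bisect_right(line_starts, pc)
    return line_starts[i] if i < len(line_starts) else None

def insert_breakpoint(memory, addr):
    """Overwrite the byte at addr with int3 and return the original byte."""
    original = memory[addr]
    memory[addr] = INT3
    return original

def remove_breakpoint(memory, addr, original):
    """Put the saved opcode byte back so the instruction can execute."""
    memory[addr] = original

# Toy "text segment": mov eax,1 (5 bytes); mov ebx,2 (5 bytes); ret.
text = bytearray(b"\xb8\x01\x00\x00\x00\xbb\x02\x00\x00\x00\xc3")
line_starts = [0, 5, 10]  # hypothetical: one source line per instruction

target = next_line_address(line_starts, 0)  # where a step from offset 0 should stop
saved = insert_breakpoint(text, target)
# ... a real debugger would now PTRACE_CONT and wait for the SIGTRAP ...
remove_breakpoint(text, target, saved)
```

In a real debugger the reads and writes on the bytearray would be PTRACE_PEEKTEXT / PTRACE_POKETEXT calls on the traced process. With an empty line table, next_line_address has nothing to return, which mirrors why source-level stepping needs debug info.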
(I normally only use si or ni for debugging asm, not s or n. layout reg is also nice, when GDB doesn't crash. See the bottom of the x86 tag wiki for more GDB asm debugging tips.)
If you meant to ask how the x86 ISA supports debugging, rather than the Linux kernel API which exposes those features via a target-independent API, see the related Q&As:
- How is PTRACE_SINGLESTEP implemented?
- Why Single Stepping Instruction on X86?
- How to tell length of an x86-64 instruction opcode using CPU itself?
Also How does a debugger work? has some Windowsy answers.
Using gdb to single-step assembly code outside specified executable causes error cannot find bounds of current function
You can use stepi or nexti (which can be abbreviated to si or ni) to step through your machine code.
Debugging a compiled C program with GDB to learn Assembly programming
First, start the program so that it stops exactly at the beginning of the main function.
(gdb) start
Switch to assembly layout to see assembly instructions interactively in a separate window.
(gdb) layout asm
Use the stepi or nexti commands to step through the program. You will see the current instruction pointer move in the assembly window as you walk over the assembly instructions in your program.
How to call assembly in gdb?
Prior to GCC 5 (1), I don't know of a way to run arbitrary machine code unless you actually enter the machine code into memory and then run it.
If you want to run code that's already in memory, you can just set the instruction pointer to the start, a breakpoint at the end, then go. Then, after the breakpoint, change the instruction pointer back to its original value.
But I can't actually see the use case for this. That doesn't mean there isn't one, just that anything you can do by running code, you can also achieve by directly modifying the registers, flags, memory and so forth.
For example, the command:
info registers
will dump the current values of the registers while:
set $eax = 42
will change the eax register to 42.
You can also change memory in this way:
set *((char*)0xb7ffeca0) = 4
This writes a single byte to memory location 0xb7ffeca0, and you can use the same method to store wider data types.
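To make "wider data types" concrete, the following sketch models target memory as a bytearray and shows which bytes a 1-byte and a 4-byte store would touch, assuming a little-endian target such as x86. The helper names store_u8 and store_u32 are invented for this illustration; in GDB itself the width comes from the pointer type in the cast, e.g. (char*) versus (int*).

```python
import struct

def store_u8(memory, offset, value):
    """Model of: set *((char*)addr) = value  (touches one byte)."""
    struct.pack_into("<B", memory, offset, value)

def store_u32(memory, offset, value):
    """Model of: set *((int*)addr) = value  (touches four bytes, little-endian)."""
    struct.pack_into("<I", memory, offset, value)

memory = bytearray(b"\xff" * 8)
store_u8(memory, 0, 4)   # only memory[0] changes
store_u32(memory, 4, 4)  # memory[4:8] becomes 04 00 00 00
```

The practical point is that a wider store also overwrites the neighbouring bytes, so the cast should match the size of the object you actually mean to change.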
(1) With GCC 5 or later, GDB can compile and execute arbitrary code with the compile code command, as documented here.
Gdb step through assembly output of objdump from C compiled
I would like to be able to run gdb on the output of an objdump
That request makes no sense whatsoever.
What you are probably asking is "can I single-step in GDB, one instruction at a time?", in which case the answer is yes: use the stepi command.
How do I step through an executable file using gdb?
Use a function name or a memory address when setting a breakpoint, or compile with debug info (-g, ideally without optimizations) if you want line numbers.
(gdb) b main // will put a break point at start of function main
(gdb) r // run
Alternatively, use the start command, which sets a temporary breakpoint on main() and starts executing.
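Use ni to move to the next instruction (stepping over calls) and si to step a single instruction, following calls into a function / label. (Plain n moves to the next source line rather than the next instruction.)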
To display the registers, you can use the info registers command or i r. Alternatively, use the registers layout, which is much better. To get the value inside a particular register, use print, e.g. print $rax.
(gdb) layout regs
How does a debugger work?
The details of how a debugger works will depend on what you are debugging, and what the OS is. For native debugging on Windows you can find some details on MSDN: Win32 Debugging API.
The user tells the debugger which process to attach to, either by name or by process ID. If it is a name then the debugger will look up the process ID, and initiate the debug session via a system call; under Windows this would be DebugActiveProcess.
Once attached, the debugger will enter an event loop much like for any UI, but instead of events coming from the windowing system, the OS will generate events based on what happens in the process being debugged – for example an exception occurring. See WaitForDebugEvent.
The debugger is able to read and write the target process' virtual memory, and even adjust its register values through APIs provided by the OS. See the list of debugging functions for Windows.
The debugger is able to use information from symbol files to translate from addresses to variable names and locations in the source code. The symbol file information is a separate set of APIs and isn't a core part of the OS as such. On Windows this is through the Debug Interface Access SDK.
If you are debugging a managed environment (.NET, Java, etc.) the process will typically look similar, but the details are different, as the virtual machine environment provides the debug API rather than the underlying OS.
Stopping at the first machine code instruction in GDB
Starting with GDB 8.1, there's a special command for this: starti. Example GDB session:
$ gdb /bin/true
Reading symbols from /bin/true...(no debugging symbols found)...done.
(gdb) starti
Starting program: /bin/true
Program stopped.
0xf7fdd800 in _start () from /lib/ld-linux.so.2
(gdb) x/5i $pc
=> 0xf7fdd800 <_start>: mov eax,esp
0xf7fdd802 <_start+2>: call 0xf7fe2160 <_dl_start>
0xf7fdd807 <_dl_start_user>: mov edi,eax
0xf7fdd809 <_dl_start_user+2>: call 0xf7fdd7f0
0xf7fdd80e <_dl_start_user+7>: add ebx,0x1f7e6
Can GDB change the assembly code of a running program?
You can write bytes to memory directly, but GDB doesn't have an assembler built in by default. You can, however, do something like set *(unsigned char*)0x80FFDDEE = 0x90 to change the instruction byte at that address to a NOP, for example. You could also use NASM to write shellcode and use perl or python to inject it into the program :)
You might also like this little .gdbinit file to make debugging a lot easier: https://gist.github.com/985474
How to go to the previous line in GDB?
Yes! With the new version 7.0 gdb, you can do exactly that!
The command would be "reverse-step", or "reverse-next".
You can get gdb-7.0 from ftp.gnu.org:/pub/gnu/gdb
If you run into the error "Target child does not support this command.", then try adding target record at the beginning of execution, after starting run.
Edit: Since GDB 7.6 target record is deprecated, use target record-full instead.