Signal handling in asm: Why am I receiving SIGSEGV when invoking the sys_pause syscall?
There were two corrections that needed to be made before the application worked correctly.
sa_restorer
Jester pointed me to this answer, which mentioned that the kernel requires the sa_restorer member of sigaction to be filled in.
Fixing this required defining SA_RESTORER:
%define SA_RESTORER 0x04000000
...and initializing the sa_flags and sa_restorer members:
mov [act + sigaction.sa_flags], dword SA_RESTORER
lea rax, [restorer]
mov [act + sigaction.sa_restorer], rax
I then added an empty stub for the restorer function:
restorer:
ret
At this point, the handler was invoked without error but the application was still crashing...
sys_rt_sigreturn
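The sa_restorer function needs to invoke the sys_rt_sigreturn syscall; a plain ret is not enough, because the saved context that the kernel placed on the stack is only restored by that syscall. This required defining sys_rt_sigreturn: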
%define sys_rt_sigreturn 0x0f
The restorer function was then modified:
restorer:
; return from the signal handler
mov rax, sys_rt_sigreturn
syscall
At this point, the application ran without crashing.
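The same two fixes can be sketched from C, which may make the moving parts easier to see. This is only an illustration, assuming x86-64 Linux; the names kernel_sigaction, demo_restorer, demo_handler and install_and_raise are hypothetical and not part of the original program. The struct uses the kernel's field order rather than the libc one.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: x86-64 Linux only. Same value as SA_RESTORER above. */
#define KERNEL_SA_RESTORER 0x04000000UL

/* Hypothetical name: the layout the x86-64 kernel reads for rt_sigaction. */
struct kernel_sigaction {
    void (*handler)(int);
    unsigned long flags;
    void (*restorer)(void);
    unsigned long mask;   /* 64-bit signal mask, hence sigsetsize = 8 */
};

/* Restorer stub: it must not return, only issue rt_sigreturn (15). */
void demo_restorer(void);
__asm__(
    ".pushsection .text\n"
    ".globl demo_restorer\n"
    "demo_restorer:\n"
    "    mov $15, %rax\n"
    "    syscall\n"
    ".popsection\n");

static volatile sig_atomic_t demo_hits;

static void demo_handler(int sig) { (void)sig; demo_hits++; }

/* Installs the handler with a raw syscall, raises SIGUSR1 once, and
 * returns how many times the handler ran (-1 if installation failed). */
int install_and_raise(void) {
    struct kernel_sigaction act = {0};
    act.handler = demo_handler;
    act.flags = KERNEL_SA_RESTORER;
    act.restorer = demo_restorer;
    if (syscall(SYS_rt_sigaction, (long)SIGUSR1, &act, NULL,
                sizeof act.mask) != 0)
        return -1;
    demo_hits = 0;
    raise(SIGUSR1);
    return demo_hits;
}
```

If the restorer is changed to a bare ret, or SA_RESTORER is left out of the flags, this sketch should crash in the same way the assembly version did.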
Why SIGSEGV while push instruction
It appears that gdb stepped over the syscall instruction and some of the surrounding instructions. The SIGSEGV probably has something to do with the value of the rcx register, used in the instruction at 0x7fffffffef12. If you want gdb to stop at every instruction rather than stepping over function calls, stepi is likely to be better for that than nexti.
The instruction at 0x7fffffffef12 (the presumed location of the crash) seems strange; other instructions in that disassembly also seem strange. If I look at the same address range in a gdb session on my own system, what I see in that part of that page is a bunch of null-terminated strings that look a whole lot like pieces of my command line, followed by my environment. The addresses of the first three match the first three elements of argv in my main frame, and argv itself is also in that page.
It might be interesting to examine the addresses you disassembled with x/s rather than x/i. In my session (in the main frame) x/29s argv[0] shows a bunch of stuff in that address range.
If it turns out that your crash occurred while attempting to treat your environment as code, perhaps the more interesting question is how a branch to that range of addresses occurred. If gdb shows a coherent backtrace for this crash, that might provide some insight.
Memory access error sys_rt_sigaction (signal handler)
On x86-64 Linux, it's mandatory to supply a sa_restorer, and you haven't done so.
The relevant part of kernel source:
/* x86-64 should always use SA_RESTORER. */
if (ksig->ka.sa.sa_flags & SA_RESTORER) {
put_user_ex(ksig->ka.sa.sa_restorer, &frame->pretcode);
} else {
/* could use a vstub here */
err |= -EFAULT;
}
The C library wrapper does this for you:
kact.sa_flags = act->sa_flags | SA_RESTORER;
kact.sa_restorer = &restore_rt;
With the updated code you do indeed have a restorer, but you have two problems: it's broken and you pass it wrong. Looking at the above mentioned C library source you can find this comment:
/* The difference here is that the sigaction structure used in the
kernel is not the same as we use in the libc. Therefore we must
translate it here. */
Also, you can't have a C++ function as restorer due to the function prologue. Furthermore, calling printf from a signal handler is not supported (but works here). Finally, as David Wohlferd pointed out, your clobbers are wrong. All in all, the following could be a reworked version:
#include<stdio.h>
#include<unistd.h>
#include<time.h>
void handler(int){
const char msg[] = "handler\n";
write(1, msg, sizeof(msg) - 1); // write to stdout, without the trailing NUL
}
extern "C" void restorer();
asm volatile("restorer:mov $15,%rax\nsyscall");
struct kernel_sigaction {
void (*k_sa_handler) (int);
unsigned long sa_flags;
void (*sa_restorer) (void);
unsigned long sa_mask;
};
struct kernel_sigaction act{handler};
timespec ts{10,0};
int main(){
act.sa_flags=0x04000000;
act.sa_restorer=&restorer;
asm volatile("\
mov $13,%%rax\n\
mov %0,%%rdi\n\
mov %1,%%rsi\n\
mov %2,%%rdx\n\
mov $8,%%r10\n\
syscall\n\
"::"i"(7),"p"(&act),"p"(0):"rax","rcx", "rdi","rsi","rdx","r8", "r9", "r10", "r11");
nanosleep(&ts,0);
}
It's still hacky, and you shouldn't really be doing it this way, obviously.
When will Linux kernel reset the signal handler for SIGSEGV to SIG_DFL?
It depends on how you register the signal handler.
With sigaction and without the SA_RESETHAND flag, there will be no resetting to SIG_DFL (although returning from a signal handler run in response to a SIGSEGV delivered due to a segmentation fault is technically UB).
With SA_RESETHAND it will get reset. If you register the handler with signal, then whether the handler will be reset or not is unspecified (so don't use signal()).
Example:
#include <signal.h>
#include <unistd.h>
int volatile*a;
void h(int Sig) { write(1,"h\n", 2); }
int main()
{
//sigaction(SIGSEGV,&(struct sigaction){.sa_handler=h}, 0); //won't reset the handler, will likely loop
sigaction(SIGSEGV,&(struct sigaction){.sa_handler=h,.sa_flags=SA_RESETHAND}, 0); //will reset the handler
//signal(SIGSEGV,h); //may or may not reset the handler
*a=1;
return 0;
}
Does linux allow any system call to be made from signal handlers?
I believe that any real system call can be called from a signal handler. A true syscall has a number in <asm/unistd.h> (or <asm/unistd_64.h>).
Some POSIX functions from section 2 of the man pages are implemented through a "multiplexing" syscall, so they are not "true syscalls" in my sense.
A system call is an atomic operation from the point of view of the application; it is almost like a single machine instruction (from inside the application). See this answer.
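As a small illustration of a "true syscall" issued from inside a handler, the sketch below calls getpid by number through the syscall(2) wrapper. It assumes Linux with glibc or a compatible libc; the names getpid_handler and handler_saw_own_pid are hypothetical. Since the wrapper can set errno, the handler saves and restores it.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

static volatile sig_atomic_t pid_seen_in_handler;

static void getpid_handler(int sig) {
    (void)sig;
    int saved_errno = errno;   /* the syscall wrapper may overwrite errno */
    pid_seen_in_handler = (sig_atomic_t)syscall(SYS_getpid);
    errno = saved_errno;
}

/* Returns 1 if the pid obtained inside the handler matches getpid(),
 * 0 if it does not, and -1 on error. */
int handler_saw_own_pid(void) {
    struct sigaction sa = {0}, old;
    sa.sa_handler = getpid_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, &old) != 0)
        return -1;

    pid_seen_in_handler = 0;
    if (raise(SIGUSR1) != 0)
        return -1;

    sigaction(SIGUSR1, &old, NULL);   /* restore the previous disposition */
    return pid_seen_in_handler == (sig_atomic_t)getpid();
}
```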
If your question is: can a SIGSEGV handler change the faulty address mapping through mprotect or mmap? then I believe the answer is yes (at least on x86-64 & x86-32 architectures), as said here in a question you quoted, but I did not try. I've read that doing that is quite inefficient (SIGSEGV handling is not very fast, and mprotect or mmap is also a bit slow). In particular, mimicking Hurd/Mach external pagers this way might be inefficient.
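A rough sketch of that idea follows, assuming Linux, where mprotect is a direct system call (POSIX does not list it as async-signal-safe, so this is not portable). The names on_segv and touch_protected_page are made up for the example. The page starts out inaccessible, the first store faults, the handler makes the page writable, and the faulting instruction is retried when the handler returns.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static long page_size;
static volatile sig_atomic_t fault_count;

/* SIGSEGV handler: make the faulting page readable and writable. */
static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    uintptr_t page = (uintptr_t)info->si_addr & ~((uintptr_t)page_size - 1);
    fault_count++;
    if (mprotect((void *)page, (size_t)page_size,
                 PROT_READ | PROT_WRITE) != 0)
        _exit(1);   /* cannot repair the mapping, give up */
}

/* Returns the value read back from the page (42) if exactly one fault
 * was taken and repaired, or -1 on any failure. */
int touch_protected_page(void) {
    struct sigaction sa = {0}, old;
    page_size = sysconf(_SC_PAGESIZE);
    fault_count = 0;

    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, &old) != 0)
        return -1;

    void *mem = mmap(NULL, (size_t)page_size, PROT_NONE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        sigaction(SIGSEGV, &old, NULL);
        return -1;
    }

    volatile int *p = mem;
    p[0] = 42;              /* faults once; retried after the handler */
    int value = p[0];

    munmap(mem, (size_t)page_size);
    sigaction(SIGSEGV, &old, NULL);
    return fault_count == 1 ? value : -1;
}
```

Each repaired fault costs a signal delivery, an mprotect call and a sigreturn, which is consistent with the remark above that this approach is slow.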
Handling SIGCHLD NASM
Ooooook after a long time poking I finally figured this out! The problem was setting the restorer correctly in the sigact struct.
When I checked sigaction(2) to get the struct definition it ended up not being what I thought it was at all. I got this:
struct sigaction {
void (*sa_handler)(int);
void (*sa_sigaction)(int, siginfo_t *, void *);
sigset_t sa_mask;
int sa_flags;
void (*sa_restorer)(void);
};
but that is the C definition of the struct (well, not quite: the man page mentions the first two might be a union, which was the case for me).
However, after some more poking around I found that the struct I needed to build looks more like this:
struct asm_sigaction {
void (*sa_handler)(int);
unsigned long sa_flags;
void (*sa_restorer)(void);
sigset_t sa_mask;
};
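To translate that layout into assembly offsets, it can help to let a C compiler compute them. The sketch below assumes an LP64 target such as x86-64 Linux, and it models the mask as a single unsigned long because the kernel's signal set is 64 bits there (the libc sigset_t is larger). The names asm_sigaction_layout and layout_matches_expected_offsets are hypothetical.

```c
#include <stddef.h>

/* Assumed kernel-side layout on x86-64 (LP64): four 8-byte fields. */
struct asm_sigaction_layout {
    void (*sa_handler)(int);
    unsigned long sa_flags;
    void (*sa_restorer)(void);
    unsigned long sa_mask;   /* 64-bit kernel signal set */
};

/* Returns 1 if the fields sit at offsets 0, 8, 16 and 24, which is what
 * an assembly structure of consecutive quadwords would produce. */
int layout_matches_expected_offsets(void) {
    return offsetof(struct asm_sigaction_layout, sa_handler)  == 0
        && offsetof(struct asm_sigaction_layout, sa_flags)    == 8
        && offsetof(struct asm_sigaction_layout, sa_restorer) == 16
        && offsetof(struct asm_sigaction_layout, sa_mask)     == 24;
}
```

Because the mask is 8 bytes in this view, the last argument of the rt_sigaction system call (the signal set size) is 8.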
I found this out by digging around in what my C code was really doing. I found the spot where the same syscall that I was making was made and dumped the bytes for what they were passing for the sigaction struct:
(gdb) x/38wx $rsi
0x7fffffffddc0: 0x004007f5 0x00000000 0x14000000 0x00000000
0x7fffffffddd0: 0xf7a434a0 0x00007fff 0x00000000 0x00000000
0x7fffffffdde0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffddf0: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffde00: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffde10: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffde20: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffde30: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffde40: 0x00000000 0x00000000 0x00000000 0x00000000
The part at 0x7fffffffddd0 looked like an address to me so I checked it out:
(gdb) disas 0x00007ffff7a434a0
Dump of assembler code for function __restore_rt:
0x00007ffff7a434a0 <+0>: mov rax,0xf
0x00007ffff7a434a7 <+7>: syscall
0x00007ffff7a434a9 <+9>: nop DWORD PTR [rax+0x0]
Sure enough they were setting the restorer which was calling the sigreturn (in my case rt_sigreturn) system call! The man page said applications don't normally mess with that, but that is for typical C programs I guess. So I went ahead and copied this function in the restorer label and put it in the appropriate spot in my struc and wooooooo it worked.
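The same observation can be made without gdb by registering a handler through the libc sigaction and then reading the action back with a raw rt_sigaction call, which returns the kernel's view of the struct. This is a sketch under the assumption of x86-64 Linux with glibc (or another libc that installs a restorer); kernel_sigaction_view, noop_handler and libc_installed_restorer are hypothetical names.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: x86-64 Linux value of SA_RESTORER. */
#define KERNEL_SA_RESTORER 0x04000000UL

/* Kernel-side field order, as found in the memory dump above. */
struct kernel_sigaction_view {
    void (*handler)(int);
    unsigned long flags;
    void (*restorer)(void);
    unsigned long mask;
};

static void noop_handler(int sig) { (void)sig; }

/* Returns 1 if the kernel's copy of the action has SA_RESTORER set and a
 * non-null restorer even though the caller never supplied one, 0 if not,
 * and -1 on error. */
int libc_installed_restorer(void) {
    struct sigaction sa = {0};
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    /* Query only: new action is NULL, old action is written to seen. */
    struct kernel_sigaction_view seen = {0};
    if (syscall(SYS_rt_sigaction, (long)SIGUSR1, NULL, &seen,
                sizeof seen.mask) != 0)
        return -1;

    signal(SIGUSR1, SIG_DFL);
    return seen.handler == noop_handler
        && (seen.flags & KERNEL_SA_RESTORER) != 0
        && seen.restorer != NULL;
}
```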
Here is the now working NASM. I changed things around a bit with a new C program, which I tried to make look and act more like my NASM program, and switched out pause for nanosleep.
New C program:
#include <signal.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <errno.h>
#include <time.h>
#include <sys/types.h>
#include <sys/wait.h>
const char *parentmsg = "from parent\n\0";
const char *childmsg = "from child\n\0";
const char *handlemsg = "in handle\n\0";
const char *forkfailed = "fork failed\n\0";
const char *parentexit = "parent exiting\n\0";
const char *sleepfailed = "sleep failed\n\0";
const char *sleepinterrupted = "sleep interrupted\n\0";
void print(const char *msg) {
write(STDOUT_FILENO, msg, strlen(msg));
}
static void handle(int sig) {
print(handlemsg);
waitid(P_ALL, -1, NULL, WEXITED|WSTOPPED|WCONTINUED);
}
int main(int argc, char* argv[]) {
struct timespec tsreq = {0}; // zero tv_nsec as well
struct timespec tsrem = {0};
tsreq.tv_sec = 2;
struct sigaction act = {0}; // zero sa_flags and sa_mask before use
act.sa_handler = &handle;
sigaction(SIGCHLD, &act, NULL);
pid_t pid;
if ( (pid = fork()) == 0 ) {
print(childmsg);
exit(0);
}
print(parentmsg);
if (nanosleep((const struct timespec*)&tsreq, &tsrem) == -1) {
if (errno == EINTR) {
print(sleepinterrupted);
nanosleep((const struct timespec*)&tsrem, NULL);
} else {
print(sleepfailed);
}
}
print(parentexit);
exit(0);
}
And the new working NASM (with some help from Peter to hopefully make it look and function a little better):
USE64
STRUC sigact
.handler resq 1
.flag resq 1
.restorer resq 1
.mask resq 16
ENDSTRUC
STRUC timespec
.tv_sec resq 1
.tv_nsec resq 1
ENDSTRUC
section .text
global _start
_start:
; register SIGCHLD handler
mov DWORD [act+sigact.handler], handle
mov QWORD [act+sigact.restorer], restorer
mov DWORD [act+sigact.flag], 0x04000000
mov rax, 13
mov rdi, 17
lea rsi, [act]
xor rdx, rdx
mov r10, 0x8
syscall
cmp eax, 0
jne sigaction_fail
mov rax, 57
syscall
cmp eax, -1
je fork_failed
cmp eax, 0
je child
mov rax, parentmsg
call print
mov rax, 35
mov QWORD [tsreq+timespec.tv_sec], 2
lea rdi, [tsreq]
lea rsi, [tsrem]
syscall
cmp eax, -1
je .exit
mov rax, sleepagain
call print
mov rax, 35
mov rdi, tsrem
xor rsi, rsi
syscall
.exit:
mov rax, parentexit
call print
mov rax, 60
xor rdi, rdi
syscall
restorer:
mov rax, 15
syscall
fork_failed:
mov rax, forkfailed
call print
mov rax, 60
mov rdi, -1
syscall
sigaction_fail:
mov rax, safailed
call print
mov rax, 60
mov rdi, -1
syscall
handle:
mov rax, handlemsg
call print
lea rsi, [rsp-0x4]
mov rax, 247
xor rdi, rdi
xor rdx, rdx
mov r10, 14
syscall
cmp eax, -1
jne .success
mov rax, hdfailed
call print
mov rax, 60
mov rdi, -1
syscall
.success:
mov rax, hdsuccess
call print
ret
child:
mov rax, childmsg
call print
mov rax, 60
xor rdi, rdi
syscall
; print a null terminated string stored in rax
print:
push rbx
push rdx
push rdi
push rsi
mov rbx, rax
call strlen
mov rdx, rax
mov rax, 1
mov rdi, 1 ; stdout
mov rsi, rbx
syscall
pop rsi
pop rdi
pop rdx
pop rbx
ret
strlen:
push rbp
mov rbp, rsp
push rbx
mov rbx, rax
.countchar:
cmp BYTE [rax], 0 ; compare it to null byte
jz .exit
inc rax
jmp .countchar
.exit:
sub rax, rbx
pop rbx
mov rsp, rbp
pop rbp
ret
section .data
childmsg: db "from child", 0xa, 0 ; null terminated
parentmsg db "from parent", 0xa, 0
handlemsg db "in handle", 0xa, 0
safailed db "failed to set signal handler", 0xa, 0
hdfailed db "failed waiting for child", 0xa, 0
hdsuccess db "successfully waited on child", 0xa, 0
parentexit db "parent exiting", 0xa, 0
forkfailed db "fork failed", 0xa, 0
sleepagain db "sleeping again", 0xa, 0
section .bss
tsreq: resb timespec_size
tsrem: resb timespec_size
act: resb sigact_size