Why doesn't time() from time.h have a syscall to sys_time?
Read time(7). Probably your call to time(2) uses the vdso(7) (maybe via clock_gettime(2) or via __vdso_time). If vdso(7) is used, then, quoting that man page:
When tracing systems calls with strace(1), symbols (system calls) that are exported by the vDSO will not appear in the trace output.
Details could be kernel and libc specific (and of course architecture specific).
For similar vDSO reasons, strace date doesn't show any time-related syscalls.
And vDSO is a really handy feature (subject to ASLR). Thanks to it, timing calls (e.g. clock_gettime(2)...) go really quick (about 40 nanoseconds on my i5-4690S). AFAIU, no context switch (or user to kernel mode transition) is happening.
So your 0x7ffff7ff80a8 is probably sitting in the vDSO (and the kernel ensures it contains the current time). You might check by using proc(5) (e.g. reading and showing /proc/self/maps from your program), or perhaps using ldd(1) and pmap(1).
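The /proc/self/maps check could look roughly like the sketch below. The function names line_is_vdso and print_vdso_mapping are made up for illustration; the only assumption about the kernel is that it labels the mapping "[vdso]", as described in proc(5).

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if a line from /proc/<pid>/maps describes the vDSO mapping. */
static int line_is_vdso(const char *line)
{
    return strstr(line, "[vdso]") != NULL;
}

/* Prints the vDSO line(s) of the calling process.
 * Returns 1 if a vDSO mapping was found, 0 if not, -1 if the file
 * could not be opened (e.g. /proc is not mounted). */
static int print_vdso_mapping(void)
{
    char line[512];
    int found = 0;
    FILE *maps = fopen("/proc/self/maps", "r");

    if (maps == NULL)
        return -1;
    while (fgets(line, sizeof line, maps) != NULL) {
        if (line_is_vdso(line)) {
            fputs(line, stdout);
            found = 1;
        }
    }
    fclose(maps);
    return found;
}
```

If the address you are curious about falls inside the range printed at the start of that line, it lives in the vDSO.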
glibc time function implementation
This complicated-looking inline assembly just causes the following assembly instructions to be emitted by the compiler:
    mov eax, 201
    syscall
So, the entire time function is just:
time:
    mov eax, 201
    syscall
    ret
The immediate value 201 (0xC9 in hexadecimal notation) is moved into the EAX register, and then the syscall instruction is executed. This instruction does just what the name suggests: it makes a system call. This is basically the way you call platform API functions on Linux. See also section A.2 ("AMD64 Linux Kernel Conventions") of the System V AMD64 ABI.
In brief:
- The system call ID number is placed into rax. (In this case, the number fits in 32 bits, so the assembly code places it into eax. The upper 32 bits are implicitly zeroed, saving some bytes in the size of the mov instruction.)
- The arguments for the system call, if any, are placed in registers: rdi, rsi, rdx, r10, r8, and r9. (In this case, system call #201 takes a single argument, the time_t* pointer, which is already in rdi on entry to the function, so none of these registers need to be initialized by the time function.)
- After syscall is invoked, its result is contained in rax. Conventionally, negative values (−4095 to −1) indicate an error, corresponding to −errno.
- For system calls, the rcx and r11 registers are treated as volatile, which means that their contents are subject to being clobbered. If the caller cares about those values, it needs to preserve them. All other registers' values (apart from rax, which holds the result) are preserved across the system call. (This is why the clobbers are there in the extended inline asm syntax.)
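The conventions above can be sketched in GCC-style extended inline assembly. This is an illustration only, not glibc's actual code: raw_sys_time and decode_syscall_result are hypothetical names, the number 201 is taken from the text above, and the inline-assembly path assumes x86-64 Linux (other targets fall back to the ordinary time() call).

```c
#include <errno.h>
#include <time.h>

/* Hypothetical helper: invoke system call 201 (sys_time) directly. */
static long raw_sys_time(time_t *t)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile ("syscall"
                      : "=a" (ret)               /* result comes back in rax   */
                      : "0" (201L), "D" (t)      /* number in rax, arg1 in rdi */
                      : "rcx", "r11", "memory"); /* clobbered by syscall       */
    return ret;
#else
    return (long) time(t);                       /* fallback for other targets */
#endif
}

/* Hypothetical helper: turn a raw kernel result into the libc convention
 * (-1 with errno set) when it lies in the error range -4095..-1. */
static long decode_syscall_result(long raw)
{
    if (raw >= -4095 && raw <= -1) {
        errno = (int) -raw;
        return -1;
    }
    return raw;
}
```

A caller would typically write decode_syscall_result(raw_sys_time(&t)), which mirrors what a libc wrapper does after the syscall instruction returns.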
There is a reference for 64-bit Linux system calls available here (32-bit Linux system calls are here). You can see that 201 (0xC9) corresponds to sys_time.
sys_time interprets the RDI register as a time_t* value. This code:
long int __arg1 = (long int) (t);
register long int _a1 asm ("rdi") = __arg1;
causes the function's parameter, t, to be stored in the RDI register. That doesn't actually cause any machine instructions to be generated, though, because the System V AMD64 calling convention already passes the first parameter of a function in RDI, so t is already in RDI.
The sys_time system call just fills the pointer it finds in RDI, which is the same as the time function's t argument. It also returns its result (the time value, or a negative error code) in RAX, which is always used for the return value of a function under the System V AMD64 calling convention, so no machine instructions are required there, either.
Perhaps more clearly:
# inputs: RDI is a pointer to time_t that will be filled in
# returns: result is left in RAX
time:
    mov eax, 201
    syscall
    ret
Why does this ptrace program say syscall returned -38?
The code doesn't account for the notification of the exec from the child, and so ends up handling syscall entry as syscall exit, and syscall exit as syscall entry. That's why you see "syscall 12 returned" before "syscall 12 called", etc. (-38 is -ENOSYS; ENOSYS is put into RAX as a default return value by the kernel's syscall entry code.)
As the ptrace(2) man page states:
PTRACE_TRACEME
Indicates that this process is to be traced by its parent. Any signal (except SIGKILL) delivered to this process will cause it to stop and its parent to be notified via wait(). Also, all subsequent calls to exec() by this process will cause a SIGTRAP to be sent to it, giving the parent a chance to gain control before the new program begins execution. [...]
You said that the original code you were running was "the same as this one except that I'm running execl("/bin/ls", "ls", NULL);". Well, it clearly isn't, because you're working with x86_64 rather than 32-bit and have changed the messages at least.
But, assuming you didn't change too much else, the first time the wait() wakes up the parent, it's not for syscall entry or exit - the parent hasn't executed ptrace(PTRACE_SYSCALL,...) yet. Instead, you're seeing this notification that the child has performed an exec (on x86_64, syscall 59 is execve).
The code incorrectly interprets that as syscall entry. Then it calls ptrace(PTRACE_SYSCALL,...), and the next time the parent is woken it is for a syscall entry (syscall 12), but the code reports it as syscall exit.
Note that in this original case, you never see the execve syscall entry/exit - only the additional notification - because the parent does not execute ptrace(PTRACE_SYSCALL,...) until after it happens.
If you do arrange the code so that the execve syscall entry/exit are caught, you will see the new behaviour that you observe. The parent will be woken three times: once for execve syscall entry (due to use of ptrace(PTRACE_SYSCALL,...)), once for execve syscall exit (also due to use of ptrace(PTRACE_SYSCALL,...)), and a third time for the exec notification (which happens anyway).
Here is a complete example (for x86 or x86_64) which takes care to show the behaviour of the exec itself by stopping the child first:
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/reg.h>

#ifdef __x86_64__
#define SC_NUMBER  (8 * ORIG_RAX)
#define SC_RETCODE (8 * RAX)
#else
#define SC_NUMBER  (4 * ORIG_EAX)
#define SC_RETCODE (4 * EAX)
#endif

static void child(void)
{
    /* Request tracing by parent: */
    ptrace(PTRACE_TRACEME, 0, NULL, NULL);

    /* Stop before doing anything, giving parent a chance to catch the exec: */
    kill(getpid(), SIGSTOP);

    /* Now exec: */
    execl("/bin/ls", "ls", NULL);
}

static void parent(pid_t child_pid)
{
    int status;
    long sc_number, sc_retcode;

    while (1)
    {
        /* Wait for child status to change: */
        wait(&status);

        if (WIFEXITED(status)) {
            printf("Child exit with status %d\n", WEXITSTATUS(status));
            exit(0);
        }
        if (WIFSIGNALED(status)) {
            printf("Child exit due to signal %d\n", WTERMSIG(status));
            exit(0);
        }
        if (!WIFSTOPPED(status)) {
            printf("wait() returned unhandled status 0x%x\n", status);
            exit(0);
        }
        if (WSTOPSIG(status) == SIGTRAP) {
            /* Note that there are *three* reasons why the child might stop
             * with SIGTRAP:
             *  1) syscall entry
             *  2) syscall exit
             *  3) child calls exec
             */
            sc_number = ptrace(PTRACE_PEEKUSER, child_pid, SC_NUMBER, NULL);
            sc_retcode = ptrace(PTRACE_PEEKUSER, child_pid, SC_RETCODE, NULL);
            printf("SIGTRAP: syscall %ld, rc = %ld\n", sc_number, sc_retcode);
        } else {
            printf("Child stopped due to signal %d\n", WSTOPSIG(status));
        }
        fflush(stdout);

        /* Resume child, requesting that it stops again on syscall enter/exit
         * (in addition to any other reason why it might stop):
         */
        ptrace(PTRACE_SYSCALL, child_pid, NULL, NULL);
    }
}

int main(void)
{
    pid_t pid = fork();

    if (pid == 0)
        child();
    else
        parent(pid);

    return 0;
}
which gives something like this (this is for 64-bit - system call numbers are different for 32-bit; in particular execve is 11, rather than 59):
Child stopped due to signal 19
SIGTRAP: syscall 59, rc = -38
SIGTRAP: syscall 59, rc = 0
SIGTRAP: syscall 59, rc = 0
SIGTRAP: syscall 63, rc = -38
SIGTRAP: syscall 63, rc = 0
SIGTRAP: syscall 12, rc = -38
SIGTRAP: syscall 12, rc = 5324800
...
Signal 19 is the explicit SIGSTOP; the child stops three times for the execve as just described above; then twice (entry and exit) for other system calls.
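One way to tell those three kinds of SIGTRAP stop apart, instead of counting them, is to ask the kernel to mark them. The sketch below is not part of the example above: it assumes the parent has called ptrace(PTRACE_SETOPTIONS, child_pid, NULL, PTRACE_O_TRACESYSGOOD | PTRACE_O_TRACEEXEC) once the child has first stopped, as documented in ptrace(2). The enum and the function name classify_stop are hypothetical.

```c
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

enum stop_kind { STOP_SYSCALL, STOP_EXEC, STOP_SIGNAL, STOP_NOT_STOPPED };

/* Hypothetical helper: classify a wait() status, assuming the tracer set
 * PTRACE_O_TRACESYSGOOD and PTRACE_O_TRACEEXEC on the child. */
static enum stop_kind classify_stop(int status)
{
    if (!WIFSTOPPED(status))
        return STOP_NOT_STOPPED;

    /* With PTRACE_O_TRACESYSGOOD, syscall entry/exit stops report
     * SIGTRAP with bit 7 set in the stop signal. */
    if (WSTOPSIG(status) == (SIGTRAP | 0x80))
        return STOP_SYSCALL;

    /* With PTRACE_O_TRACEEXEC, the exec notification carries the
     * event code in the bits above the stop signal. */
    if ((status >> 8) == (SIGTRAP | (PTRACE_EVENT_EXEC << 8)))
        return STOP_EXEC;

    return STOP_SIGNAL;
}
```

Syscall entry and exit still have to be told apart by keeping a toggle in the parent, but the exec notification can no longer shift that toggle by one.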
If you're really interested in all the gory details of ptrace(), the best documentation I'm aware of is the README-linux-ptrace file in the strace source. As it says, the "API is complex and has subtle quirks"....
NASM printing out time - code doesn't output anything
This is your example translated to C. You are copying the buffer's address into eax instead of storing eax (the result of time) into the buffer. Even with that fixed it wouldn't work, because write expects a character array; the raw integer bytes would print as garbage.
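#include <stdlib.h>
#include <time.h>
#include <unistd.h>

char b[255];

int
main(void)
{
    /* You wanted to do this, which doesn't work
     * because write won't take int* but char arrays:
     * *(int*)b=time(NULL);
     */
    /* Instead you did this: */
    time(NULL);       /* the result is discarded */
    b;                /* only the buffer's address is loaded; nothing is stored */
    write(1, b, 255); /* writes 255 zero bytes */
    exit(1);
}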
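For comparison, here is a rough sketch of what the working version has to do: convert the integer to text first, then hand that text to write. The helper names format_time and print_time_now are invented for this illustration, and printing the raw seconds count (rather than a calendar date) is just an assumption about the desired output.

```c
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: render a time_t as decimal text plus newline.
 * Returns the number of characters written (excluding the NUL). */
static int format_time(char *buf, size_t size, time_t t)
{
    return snprintf(buf, size, "%lld\n", (long long) t);
}

/* Hypothetical helper: print the current time in seconds since the
 * epoch to standard output. Returns 0 on success, -1 on failure. */
static int print_time_now(void)
{
    char buf[32];
    int len = format_time(buf, sizeof buf, time(NULL));

    if (len < 0 || (size_t) len >= sizeof buf)
        return -1;
    return write(STDOUT_FILENO, buf, (size_t) len) == (ssize_t) len ? 0 : -1;
}
```

In assembly the same idea applies: divide repeatedly by 10 to produce ASCII digits in a buffer, then pass that buffer and its length to the write system call.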
C segmentation fault after accessing the system time by inline assembly
You have a lot of bugs here, of which the most important are:
- You pushed things onto the stack, but you didn't pop them off again before leaving the inline assembly block. The compiler doesn't know you did that, so it will look for everything on the stack (such as the return address) in the wrong place afterward. This is very likely to be what caused the crash.
- More generally, compilers that use this style of inline assembly don't interpret the assembly instructions at all. They trust you to have used the input, output, and clobber annotations correctly. If you neglect to mention even one register or memory area that has been modified, the compiler will generate incorrect code surrounding the assembly insert and the program won't work.
- "Ubuntu 16.04" is a distribution of Linux, so you are using the wrong calling convention. Linux takes system call arguments in registers, not on the stack, as documented here, and gettimeofday is not system call number 116 on x86-32/Linux. (Always use the SYS_foo constants, from sys/syscall.h, for system call numbers.)
Also, it is best to do as little as possible in the actual inserted assembly. In this case, that means just the int instruction itself. Set up arguments using the input and output constraints, instead. This gives the compiler maximum leeway to optimize. (If you are writing assembly by hand because the compiler is failing to do a sufficiently good job of optimizing, you should write an entire ".s" file of pure assembly, rather than a .c file with gigantic assembly inserts; this is more maintainable.)
Correct code for this task would be something like
#include <assert.h>
#include <sys/time.h>
#include <sys/syscall.h>

/* x86-32 Linux only: int $0x80 takes the system call number in eax
 * and the first two arguments in ebx and ecx. */
struct timeval
call_gettimeofday(void)
{
    struct timeval ret;
    int dummy;
    asm("int $0x80"
        : "=m" (ret), "=a" (dummy)
        : "1" (SYS_gettimeofday), "b" (&ret), "c" (0));
    assert(!dummy); // gettimeofday should never fail
    return ret;
}
As a final note, it is almost always a mistake to use inline assembly to make system calls. The C library's wrapper functions may be doing more work than is apparent to you, and they know how to use a more efficient trap sequence (using sysenter or syscall instead of int) when possible. In the case of gettimeofday, the difference is even more profound: the C library knows how to do a gettimeofday operation without trapping into the kernel at all! (Read up on the vDSO to understand how this is possible.)
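To make that comparison concrete, the sketch below obtains the time both ways without any inline assembly: once through the C library wrapper, and once by forcing a real trap with syscall(2) and the SYS_gettimeofday constant. The two function names are hypothetical, and the sketch assumes a Linux target on which SYS_gettimeofday is defined.

```c
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: the usual route through the C library,
 * which may be served by the vDSO without entering the kernel. */
static int time_via_libc(struct timeval *tv)
{
    return gettimeofday(tv, NULL);
}

/* Hypothetical helper: force a real system call. syscall(2) handles the
 * architecture's trap sequence and the -errno conversion for us. */
static long time_via_syscall(struct timeval *tv)
{
    return syscall(SYS_gettimeofday, tv, NULL);
}
```

Running a program that calls both under strace(1) should normally show only the second call in the trace, which is the vDSO effect described above.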