How does ltrace() display rand()
When ltrace has no prototype for a function, it shows the contents of the first few argument-passing registers, according to the x86-64 ABI conventions.
For other functions, ltrace knows their API (i.e. their signature), so it can show the arguments more cleverly.
See ltrace(1), in particular the PROTOTYPE LIBRARY DISCOVERY section.
ltrace does not show sin() in the output
This is because the argument to your sin call is a constant, so gcc evaluates the call at compile time and optimizes it out (even when compiling with -O0 and without -lm). This is the result of running disass main in gdb:
0x0000000000400580 <+0>: push %rbp
0x0000000000400581 <+1>: mov %rsp,%rbp
0x0000000000400584 <+4>: sub $0x10,%rsp
0x0000000000400588 <+8>: mov 0xee(%rip),%eax # 0x40067c
0x000000000040058e <+14>: mov %eax,-0x4(%rbp)
0x0000000000400591 <+17>: mov $0x400660,%edi
0x0000000000400596 <+22>: callq 0x400450 <puts@plt>
0x000000000040059b <+27>: mov 0xdf(%rip),%eax # 0x400680
0x00000000004005a1 <+33>: mov %eax,-0x4(%rbp)
0x00000000004005a4 <+36>: movss -0x4(%rbp),%xmm0
0x00000000004005a9 <+41>: cvtps2pd %xmm0,%xmm0
0x00000000004005ac <+44>: mov $0x40066e,%edi
0x00000000004005b1 <+49>: mov $0x1,%eax
0x00000000004005b6 <+54>: callq 0x400460 <printf@plt>
0x00000000004005bb <+59>: mov $0x0,%eax
0x00000000004005c0 <+64>: leaveq
0x00000000004005c1 <+65>: retq
There is no call to sin here.
Changing your code to read:
#include <stdio.h>
#include <math.h>

int main()
{
    float x, y;

    /* x is only known at run time, so gcc cannot fold sin(x) away */
    scanf("%f", &x);
    y = sin(x);
    printf("sin(%f)=%f\n", x, y);
    return 0;
}
will require -lm when compiling:
$ gcc -Wall -Wextra -O0 -g 1.c -lm
and now you'll see this disassembled output:
...
0x00000000004006c9 <+25>: callq 0x4005b0 <__isoc99_scanf@plt>
0x00000000004006ce <+30>: movss -0x8(%rbp),%xmm0
0x00000000004006d3 <+35>: unpcklps %xmm0,%xmm0
0x00000000004006d6 <+38>: cvtps2pd %xmm0,%xmm0
0x00000000004006d9 <+41>: callq 0x4005a0 <sin@plt>
...
and the sin call in the ltrace output:
__libc_start_main(0x4006b0, 1, 0x7fffd25ecff8, 0x400720 <unfinished ...>
__isoc99_scanf(0x4007b0, 0x7fffd25ecf08, 0x7fffd25ed008, 0x400720) = 1
sin(0x7fffd25ec920, 0x7fa1a6388a20, 1, 16) = 0x7fa1a643b780
printf("sin(%f)=%f\n", 3.000000, 0.141120sin(3.000000) =0.141120
) = 23
+++ exited (status 0) +++
No output when running ltrace
This may have to do with binaries being linked with -z now. I created a quick test program (I'm using Ubuntu 16.04):
#include <unistd.h>

int main() {
    write(0, "hello\n", 6);
    return 0;
}
If I compile it with gcc -O2 test.c -o test then ltrace works:
$ ltrace ./test
__libc_start_main(0x400430, 1, 0x7ffc12326528, 0x400550 <unfinished ...>
write(0, "hello\n", 6hello
) = 6
+++ exited (status 0) +++
However, when I compile with gcc -O2 test.c -Wl,-z,relro -Wl,-z,now -o test2 then it doesn't:
$ ltrace ./test2
hello
+++ exited (status 0) +++
You can check whether a binary was built this way using scanelf from the pax-utils package on Ubuntu:
$ scanelf -a test*
TYPE    PAX    PERM ENDIAN STK/REL/PTL TEXTREL RPATH BIND FILE
ET_EXEC PeMRxS 0775 LE     RW- R-- RW- -       -     LAZY test
ET_EXEC PeMRxS 0775 LE     RW- R-- RW- -       -     NOW  test2
Note the LAZY (ltrace works) versus NOW (ltrace doesn't).
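If scanelf is not available, the same check can probably be scripted around readelf from binutils. The following is only a rough sketch: it assumes that readelf -d marks immediate binding with a BIND_NOW entry, with BIND_NOW in the FLAGS entry, or with NOW in the FLAGS_1 entry, and the helper names has_bind_now and binary_is_bind_now are made up for illustration.

```python
import re
import subprocess


def has_bind_now(dynamic_text):
    """Return True if `readelf -d` style text indicates immediate binding.

    Assumption: each dynamic entry is printed as "<addr> (TAG) <value>".
    """
    for line in dynamic_text.splitlines():
        match = re.search(r"\((\w+)\)\s*(.*)", line)
        if not match:
            continue  # header lines carry no "(TAG)" field
        tag, value = match.group(1), match.group(2)
        words = value.replace("Flags:", "").split()
        if tag == "BIND_NOW":
            return True
        if tag == "FLAGS" and "BIND_NOW" in words:
            return True
        if tag == "FLAGS_1" and "NOW" in words:
            return True
    return False


def binary_is_bind_now(path):
    """Run `readelf -d` on path and inspect its dynamic section."""
    result = subprocess.run(
        ["readelf", "-d", path], capture_output=True, text=True, check=True
    )
    return has_bind_now(result.stdout)
```

For the two binaries above, binary_is_bind_now("test2") would be expected to return True and binary_is_bind_now("test") False, matching the NOW and LAZY entries reported by scanelf.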
There is a little bit more discussion (but no resolution) here:
https://bugzilla.redhat.com/show_bug.cgi?id=1333481
gdb prints long values watching a variable set with rand()
Unoptimized gcc assembly can be strange:
jmp .L2
.L3:
call rand
movl %eax, %edx
movslq %edx, %rax
imulq $1717986919, %rax, %rax
shrq $32, %rax
sarl $2, %eax
movl %edx, %ecx
sarl $31, %ecx
subl %ecx, %eax
movl %eax, -4(%rbp)
movl -4(%rbp), %ecx
movl %ecx, %eax
sall $2, %eax
addl %ecx, %eax
addl %eax, %eax
subl %eax, %edx
movl %edx, -4(%rbp)
addl $1, -8(%rbp)
.L2:
cmpl $9, -8(%rbp)
jle .L3
It seems you are watching -4(%rbp). First, movl %eax, -4(%rbp) puts a "big number" there (the intermediate quotient of the division by 10), then the value is read back by movl -4(%rbp), %ecx, and finally movl %edx, -4(%rbp) puts the result of % 10 there. So you are seeing a number from the middle of the calculation. That is, one loop iteration corresponds to:
New value = 32015002
0x00005555555551f8 in main () at demo.c:12
12 var = rand() % 10;
Hardware access (read/write) watchpoint 2: var
Value = 32015002
0x00005555555551fb in main () at demo.c:12
12 var = rand() % 10;
Hardware access (read/write) watchpoint 2: var
Old value = 32015002
New value = 7
main () at demo.c:10
10 for (int i = 0; i < 5; i++)
Hardware access (read/write) watchpoint 2: var
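The two writes to var can be replayed outside gdb. Below is a minimal Python sketch of the arithmetic in the assembly listing above, valid only for the non-negative values rand() returns. The function name mod10_stores is made up for illustration, and the input 320150027 is not taken from the gdb session but inferred from it as 10 * 32015002 + 7.

```python
MAGIC = 1717986919  # 0x66666667, the constant from the imulq instruction


def mod10_stores(x):
    """Return the two values written to -4(%rbp) for a non-negative int x.

    Mirrors the listing: the quotient x / 10 is stored first, then x % 10.
    The sign correction (sarl $31 / subl) is omitted because it is zero
    for non-negative x.
    """
    high = (x * MAGIC) >> 32                # imulq + shrq $32: upper half of the product
    quotient = high >> 2                    # sarl $2: a shift by 34 in total, giving x // 10
    first_store = quotient                  # movl %eax, -4(%rbp)
    ten_q = (quotient * 4 + quotient) * 2   # sall $2, addl, addl: 10 * quotient
    second_store = x - ten_q                # subl %eax, %edx; movl %edx, -4(%rbp)
    return first_store, second_store


print(mod10_stores(320150027))  # (32015002, 7)
```

The first element is the "big number" the watchpoint reports, the second is the final value of var.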
What is the best approach to compute the trace of a (sparse) matrix product efficiently in python
Another option is (A.conj().multiply(B)).sum().
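The reason this gives the same number is that the trace only needs the diagonal of the product, and each diagonal entry is a column-wise dot product. Written out, with A_ij and B_ij denoting the matrix entries and an overline denoting complex conjugation (notation introduced here for illustration):

```latex
\operatorname{Tr}\!\left(A^{\mathsf{H}} B\right)
  = \sum_{j} \left(A^{\mathsf{H}} B\right)_{jj}
  = \sum_{j} \sum_{i} \overline{A_{ij}}\, B_{ij}
```

The right-hand side is the sum over all entries of the elementwise product of conj(A) and B, so the full matrix product never has to be formed.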
In [111]: Dimension = 2**12
In [112]: A = rand(Dimension, Dimension, density=0.001, format='csr')
...: B = rand(Dimension, Dimension, density=0.001, format='csr')
Compare to sum((A.conj().T @ B).diagonal()):
In [113]: sum((A.conj().T @ B).diagonal())
Out[113]: 4.152218112255467
In [114]: (A.conj().multiply(B)).sum()
Out[114]: 4.152218112255466
In [115]: %timeit sum((A.conj().T @ B).diagonal())
2.7 ms ± 11.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [116]: %timeit (A.conj().multiply(B)).sum()
1.12 ms ± 4.39 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Of course, for larger values of Dimension, the relative performance difference is much greater (at a fixed density, roughly O(Dimension**3) for the full matrix multiply vs O(Dimension**2) for the elementwise multiply):
In [119]: Dimension = 2**14
In [120]: A = rand(Dimension, Dimension, density=0.001, format='csr')
...: B = rand(Dimension, Dimension, density=0.001, format='csr')
In [121]: sum((A.conj().T @ B).diagonal())
Out[121]: 69.23254213582365
In [122]: (A.conj().multiply(B)).sum()
Out[122]: 69.23254213582364
In [123]: %timeit sum((A.conj().T @ B).diagonal())
124 ms ± 1.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [124]: %timeit (A.conj().multiply(B)).sum()
8.67 ms ± 63.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)