How Does Ltrace() Display Rand()

How does ltrace() display rand()

For a function whose prototype it does not know, ltrace shows the contents of the first few argument-passing registers, as defined by the x86-64 ABI calling convention.

For other functions, ltrace knows the API (i.e. the signature), so it can show the arguments more cleverly.

See the PROTOTYPE LIBRARY DISCOVERY section of ltrace(1).

ltrace does not show sin() in the output

This is because the argument to your sin call is a constant, so gcc evaluates the call at compile time and optimizes it out (even when compiling with -O0, which is also why the program links without -lm). This is the result of running disass main in gdb:

0x0000000000400580 <+0>: push %rbp
0x0000000000400581 <+1>: mov %rsp,%rbp
0x0000000000400584 <+4>: sub $0x10,%rsp
0x0000000000400588 <+8>: mov 0xee(%rip),%eax # 0x40067c
0x000000000040058e <+14>: mov %eax,-0x4(%rbp)
0x0000000000400591 <+17>: mov $0x400660,%edi
0x0000000000400596 <+22>: callq 0x400450 <puts@plt>
0x000000000040059b <+27>: mov 0xdf(%rip),%eax # 0x400680
0x00000000004005a1 <+33>: mov %eax,-0x4(%rbp)
0x00000000004005a4 <+36>: movss -0x4(%rbp),%xmm0
0x00000000004005a9 <+41>: cvtps2pd %xmm0,%xmm0
0x00000000004005ac <+44>: mov $0x40066e,%edi
0x00000000004005b1 <+49>: mov $0x1,%eax
0x00000000004005b6 <+54>: callq 0x400460 <printf@plt>
0x00000000004005bb <+59>: mov $0x0,%eax
0x00000000004005c0 <+64>: leaveq
0x00000000004005c1 <+65>: retq

There is no call to sin here.

Changing your code to read:

#include <stdio.h>
#include <math.h>

int main()
{
    float x, y;
    scanf("%f", &x);
    y = sin(x);
    printf("sin(%f)=%f\n", x, y);
    return 0;
}

means the argument is no longer known at compile time, so the call to sin stays in the program and you need -lm when linking:

$ gcc -Wall -Wextra -O0 -g 1.c -lm

and now you'll see this disassembled output:

   ...
0x00000000004006c9 <+25>: callq 0x4005b0 <__isoc99_scanf@plt>
0x00000000004006ce <+30>: movss -0x8(%rbp),%xmm0
0x00000000004006d3 <+35>: unpcklps %xmm0,%xmm0
0x00000000004006d6 <+38>: cvtps2pd %xmm0,%xmm0
0x00000000004006d9 <+41>: callq 0x4005a0 <sin@plt>
...

and the call in ltrace:

__libc_start_main(0x4006b0, 1, 0x7fffd25ecff8, 0x400720 <unfinished ...>
__isoc99_scanf(0x4007b0, 0x7fffd25ecf08, 0x7fffd25ed008, 0x400720) = 1
sin(0x7fffd25ec920, 0x7fa1a6388a20, 1, 16) = 0x7fa1a643b780
printf("sin(%f)=%f\n", 3.000000, 0.141120sin(3.000000) =0.141120
) = 23
+++ exited (status 0) +++

No output when running ltrace

This may have to do with binaries being compiled with -z now. I created a quick test program (I'm using Ubuntu 16.04):

#include <unistd.h>

int main() {
    write(0, "hello\n", 6);
    return 0;
}

If I compile it with gcc -O2 test.c -o test then ltrace works:

$ ltrace ./test 
__libc_start_main(0x400430, 1, 0x7ffc12326528, 0x400550 <unfinished ...>
write(0, "hello\n", 6hello
) = 6
+++ exited (status 0) +++

However when I compile with gcc -O2 test.c -Wl,-z,relro -Wl,-z,now -o test2 then it doesn't:

$ ltrace ./test2 
hello
+++ exited (status 0) +++

You can check whether a binary was linked this way using scanelf from the pax-utils package on Ubuntu:

$ scanelf -a test*
 TYPE    PAX   PERM ENDIAN STK/REL/PTL TEXTREL RPATH BIND FILE
ET_EXEC PeMRxS 0775 LE     RW- R-- RW-    -      -   LAZY test
ET_EXEC PeMRxS 0775 LE     RW- R-- RW-    -      -   NOW  test2

Note the LAZY (ltrace works) versus NOW (ltrace doesn't).
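
If scanelf is not installed, the same information should be readable from the dynamic section with readelf -d (from binutils). The following is only a sketch and not part of the original answer: the helper names is_bind_now and check_binary are hypothetical, and it assumes readelf's usual output, where eager binding appears as BIND_NOW on the (FLAGS) line, as NOW on the (FLAGS_1) line, or as a (BIND_NOW) entry.

```python
import re
import subprocess

# Dynamic-section entries that request eager ("now") binding.
# Assumption: readelf -d prints them as "(FLAGS) ... BIND_NOW",
# "(FLAGS_1) ... Flags: NOW", or a bare "(BIND_NOW)" entry.
_NOW_PATTERNS = (
    re.compile(r"\(FLAGS\)\s+.*\bBIND_NOW\b"),
    re.compile(r"\(FLAGS_1\)\s+.*\bNOW\b"),
    re.compile(r"\(BIND_NOW\)"),
)


def is_bind_now(readelf_dynamic_output):
    """Return True if the given `readelf -d` text indicates -z now."""
    return any(
        pattern.search(line)
        for line in readelf_dynamic_output.splitlines()
        for pattern in _NOW_PATTERNS
    )


def check_binary(path):
    """Run `readelf -d` on path and report "NOW" or "LAZY" (hypothetical helper)."""
    result = subprocess.run(
        ["readelf", "-d", path],
        capture_output=True, text=True, check=True,
    )
    return "NOW" if is_bind_now(result.stdout) else "LAZY"
```

With the two binaries from above, check_binary("test") would be expected to report LAZY and check_binary("test2") to report NOW, matching the BIND column of scanelf.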

There is a little bit more discussion (but no resolution) here:

https://bugzilla.redhat.com/show_bug.cgi?id=1333481

gdb prints long values watching a variable set with rand()

Unoptimized gcc assembly can be strange:

        jmp     .L2
.L3:
        call    rand
        movl    %eax, %edx
        movslq  %edx, %rax
        imulq   $1717986919, %rax, %rax
        shrq    $32, %rax
        sarl    $2, %eax
        movl    %edx, %ecx
        sarl    $31, %ecx
        subl    %ecx, %eax
        movl    %eax, -4(%rbp)
        movl    -4(%rbp), %ecx
        movl    %ecx, %eax
        sall    $2, %eax
        addl    %ecx, %eax
        addl    %eax, %eax
        subl    %eax, %edx
        movl    %edx, -4(%rbp)
        addl    $1, -8(%rbp)
.L2:
        cmpl    $9, -8(%rbp)
        jle     .L3

And it seems you are watching -4(%rbp). First, movl %eax, -4(%rbp) stores a "big number" there (the intermediate quotient rand() / 10). That value is then read back by movl -4(%rbp), %ecx, and finally movl %edx, -4(%rbp) stores the result of % 10. So the watchpoint is showing you a number from the middle of the calculation. I.e. one loop iteration corresponds to:

New value = 32015002
0x00005555555551f8 in main () at demo.c:12
12 var = rand() % 10;

Hardware access (read/write) watchpoint 2: var

Value = 32015002
0x00005555555551fb in main () at demo.c:12
12 var = rand() % 10;

Hardware access (read/write) watchpoint 2: var

Old value = 32015002
New value = 7
main () at demo.c:10
10 for (int i = 0; i < 5; i++)

Hardware access (read/write) watchpoint 2: var
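
To make the two stores easier to follow, here is a small sketch that replays the arithmetic of the assembly above in Python. It is an illustration, not gdb output: the function name watched_values is made up, it only handles non-negative inputs (which is what rand() returns), and the input 320150027 is an assumption inferred from the trace as 10 * 32015002 + 7.

```python
MAGIC = 1717986919  # 0x66666667, the constant used by the imulq instruction


def watched_values(r):
    """Return the two values written to -4(%rbp) for `var = r % 10`."""
    if not 0 <= r < 2**31:
        raise ValueError("this sketch only covers non-negative 32-bit ints")

    # imulq $1717986919 ; shrq $32 ; sarl $2  ==  (r * MAGIC) >> 34
    quotient = (r * MAGIC) >> 34
    # sarl $31, %ecx ; subl %ecx, %eax: sign correction (0 for r >= 0)
    quotient -= r >> 31

    first_store = quotient                 # movl %eax, -4(%rbp)

    # sall $2 ; addl %ecx, %eax ; addl %eax, %eax  ==  quotient * 10
    times_ten = (quotient * 4 + quotient) * 2
    second_store = r - times_ten           # subl %eax, %edx ; movl %edx, -4(%rbp)

    return first_store, second_store


print(watched_values(320150027))  # (32015002, 7)
```

The first element is the "big number" the watchpoint reports, and the second is the final value of rand() % 10.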

What is the best approach to compute the trace of a (sparse) matrix product efficiently in python

Another option is (A.conj().multiply(B)).sum().

In [111]: Dimension = 2**12

In [112]: A = rand(Dimension, Dimension, density=0.001, format='csr')
...: B = rand(Dimension, Dimension, density=0.001, format='csr')

Compare to sum((A.conj().T @ B).diagonal()):

In [113]: sum((A.conj().T @ B).diagonal())
Out[113]: 4.152218112255467

In [114]: (A.conj().multiply(B)).sum()
Out[114]: 4.152218112255466

In [115]: %timeit sum((A.conj().T @ B).diagonal())
2.7 ms ± 11.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [116]: %timeit (A.conj().multiply(B)).sum()
1.12 ms ± 4.39 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Of course, for larger values of Dimension, the relative performance difference is much greater. At a fixed density, the cost of the full matrix multiply grows roughly as O(Dimension**3), versus O(Dimension**2) for the elementwise multiply:

In [119]: Dimension = 2**14

In [120]: A = rand(Dimension, Dimension, density=0.001, format='csr')
...: B = rand(Dimension, Dimension, density=0.001, format='csr')

In [121]: sum((A.conj().T @ B).diagonal())
Out[121]: 69.23254213582365

In [122]: (A.conj().multiply(B)).sum()
Out[122]: 69.23254213582364

In [123]: %timeit sum((A.conj().T @ B).diagonal())
124 ms ± 1.22 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [124]: %timeit (A.conj().multiply(B)).sum()
8.67 ms ± 63.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
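
The reason the two expressions agree is that the j-th diagonal entry of A.conj().T @ B is the sum over i of conj(A[i, j]) * B[i, j], so the trace only involves positions where both matrices are nonzero. The sketch below checks this on a toy example in pure Python. It is not how scipy stores matrices: the dict-of-keys representation and the function names trace_full and trace_elementwise are assumptions made for illustration.

```python
def trace_full(A, B, n):
    """tr(A^H B) by summing every term of every diagonal entry."""
    total = 0
    for j in range(n):          # diagonal entry (j, j) of A^H B
        for i in range(n):      # (A^H)[j, i] * B[i, j] = conj(A[i, j]) * B[i, j]
            a = complex(A.get((i, j), 0))
            total += a.conjugate() * B.get((i, j), 0)
    return total


def trace_elementwise(A, B):
    """Same value, touching only positions stored in both A and B."""
    return sum(
        complex(a).conjugate() * B[key]
        for key, a in A.items()
        if key in B
    )


# Toy 3x3 sparse matrices stored as {(row, col): value}.
A = {(0, 0): 1 + 2j, (1, 2): 3, (2, 1): -1j}
B = {(0, 0): 2, (1, 2): 1j, (2, 2): 5}

print(trace_full(A, B, 3))       # (2-1j)
print(trace_elementwise(A, B))   # (2-1j)
```

Only the two shared positions, (0, 0) and (1, 2), contribute, which is why the elementwise version does so much less work on sparse inputs.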

