perf-report show value of CPU register
Try perf script
data dumping command with the iregs
field: perf script -F ip,sym,iregs
. All fields -F
are documented with source code of tools/perf/builtin-script.c - struct output_option .. all_output_options
, and iregs
is still here (also OPT_CALLBACK('F', "fields" ...
in same file). There is also perf-script.txt documentation - https://github.com/torvalds/linux/blob/master/tools/perf/Documentation/perf-script.txt#L115
-F::
--fields::
Comma separated list of fields to print. Options are:
comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
srcline, period, iregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
callindent, insn, insnlen. Field list can be prepended with the type, trace, sw or hw,
to indicate to which event type the field list applies.
There were commits into linux kernel git which mention flag --intr-regs
. Start from option implementation:
https://github.com/torvalds/linux/search?utf8=%E2%9C%93&q=intr-regs&type=
tools/perf/builtin-record.c
OPT_CALLBACK_OPTARG('I', "intr-regs", &record.opts.sample_intr_regs, NULL, "any register",
Then search for 'sample_intr_regs' in commits: https://github.com/torvalds/linux/search?q=sample_intr_regs&type=Commits
Several commits were about kernel part and perf_attr debug print. But this has example of intr-regs
printing (Sep 1, 2015)
https://github.com/torvalds/linux/commit/532026612455a4a6fd27c1b2e7111263f63218a2
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
- Add ability to specify to select which registers to record,
to reduce the size of perf.data files, and also allow printing
the registers in 'perf script': (Stephane Eranian)# perf record --intr-regs=AX,SP usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.016 MB perf.data (8 samples) ]
# perf script -F ip,sym,iregs | tail -5
ffffffff8105f42a native_write_msr_safe AX:0xf SP:0xffff8802629c3c00
ffffffff8105f42a native_write_msr_safe AX:0xf SP:0xffff8802629c3c00
ffffffff81761ac0 _raw_spin_lock AX:0xffff8801bfcf8020 SP:0xffff8802629c3ce8
ffffffff81202bf8 __vma_adjust_trans_huge AX:0x7ffc75200000 SP:0xffff8802629c3b30
ffffffff8122b089 dput AX:0x101 SP:0xffff8802629c3c78
#
Recompile Linux Kernel not to use specific CPU register
You should not try to recompile kernel when you just want to sample some register with perf. As I understand, kernel has its own set of registers and will not overwrite user R11. syscall
interface uses some fixed registers which can't be changed (can you try different reg?) and there are often glibc gateways to syscall which may use some additional registers (they are not in kernel, they are user-space code; often generated or written in assembler). You may try using gdb to monitor the register to change to find who did it. It can do this (hmm, one more link to the same user on SO): gdb: breakpoint when register will have value 0xffaa like gdb ./program
then gdb commands start; watch $r11; continue; where
.
Two weeks age there was question perf-report show value of CPU register about register value sampling with perf:
I follow this document and using
perf record
with--intr-regs=ax,bx,r15
, trying to log additional CPU register information with PEBS record.
While there was x86 & PEBS, ARM may have --intr-regs
implemented too. Check output of perf record --intr-regs=\?
(man perf-record: "To list the available registers use --intr-regs=\?
") to find support status and register names.
To print registers, use perf script -F ip,sym,iregs
command. There was example in some linux commits:
# perf record --intr-regs=AX,SP usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.016 MB perf.data (8 samples) ]
# perf script -F ip,sym,iregs | tail -5
ffffffff8105f42a native_write_msr_safe AX:0xf SP:0xffff8802629c3c00
ffffffff8105f42a native_write_msr_safe AX:0xf SP:0xffff8802629c3c00
ffffffff81761ac0 _raw_spin_lock AX:0xffff8801bfcf8020 SP:0xffff8802629c3ce8
ffffffff81202bf8 __vma_adjust_trans_huge AX:0x7ffc75200000 SP:0xffff8802629c3b30
ffffffff8122b089 dput AX:0x101 SP:0xffff8802629c3c78
#
Why is moving data between two CPU registers so slow that it costs 30% of total time?
That's not moving between rax
and rdx
.
That's indexing into an array pointed to by rax
by rdx
and putting the result in rax
. Probable L1 cache miss.
Meaning of time measures given by Linux perf profiler
I found this on the perf wiki:
Interrupt-based sampling introduces skids on modern processors. That
means that the instruction pointer stored in each sample designates
the place where the program was interrupted to process the PMU
interrupt, not the place where the counter actually overflows, i.e.,
where it was at the end of the sampling period. In some case, the
distance between those two points may be several dozen instructions or
more if there were taken branches. When the program cannot make
forward progress, those two locations are indeed identical. For this
reason, care must be taken when interpreting profiles.
That may be the explanation. Unfortunately, the wiki does not say how to figure out if this is indeed the problem or how to correct for this issue.
Related Topics
How to Redirect Output of Echo Over Ssh to a File
How to Show Read Prompt with a New Line
Why Exported Variables in Makefile Is Not Received by Executable
How to Use a Linux Expect Script to Enter Answer a Prompt for Password
How to Capture All the Commands Typed in Unix/Linux by Any User
Iptables Script to Block All Internet Access Except for Desired Applications
In Linux, Schedule Task to Hour, Minute, Second Precision
Error Marking Master: Timed Out Waiting for the Condition [Kubernetes]
How to Replace a Multi Line String in a Bunch Files
Linux: Instantiate from User-Space:Eeprom New_Device
Parsing Data from Ifconfig with Awk or Sed
Does Using Xvfb to Run Opengl Effects Version
Save Modifications in Place with Non Gnu Awk
Erge Text Files Ordered by Numerical Filenames in Bash
Difference Between Linux Kernel and Unix Kernel(Such as Freebsd) from Programmer's Point of View
How to Set a Dynamic Variable in Haproxy