perf_event_open - how to monitor multiple events
This is a bit tricky.
We create the first counter as usual, but additionally pass PERF_FORMAT_GROUP and PERF_FORMAT_ID in read_format so that we can work with multiple counters simultaneously. This counter will be our group leader.
struct perf_event_attr pea;
int fd1, fd2;
uint64_t id1, id2;
memset(&pea, 0, sizeof(struct perf_event_attr));
pea.type = PERF_TYPE_HARDWARE;
pea.size = sizeof(struct perf_event_attr);
pea.config = PERF_COUNT_HW_CPU_CYCLES;
pea.disabled = 1;
pea.exclude_kernel = 1;
pea.exclude_hv = 1;
pea.read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;
fd1 = syscall(__NR_perf_event_open, &pea, 0, -1, -1, 0);
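Note that the call above does not check for failure. perf_event_open returns -1 and sets errno when the event cannot be opened, so it is worth checking before using the descriptor. Below is a minimal sketch of a checked helper that uses the same attribute settings as this answer; the name open_counter is hypothetical and not part of any library:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: opens one user-space-only counter for the calling
 * thread on any CPU. Pass group_fd = -1 to create a group leader, or the
 * leader's descriptor to join its group. Returns the new descriptor, or -1
 * with errno set by the kernel. */
static int open_counter(uint32_t type, uint64_t config, int group_fd)
{
    struct perf_event_attr pea;

    assert(group_fd >= -1);

    memset(&pea, 0, sizeof(pea));
    pea.type = type;
    pea.size = sizeof(pea);
    pea.config = config;
    pea.disabled = 1;
    pea.exclude_kernel = 1;
    pea.exclude_hv = 1;
    pea.read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;

    long fd = syscall(__NR_perf_event_open, &pea, 0, -1, group_fd, 0UL);
    if (fd == -1)
        perror("perf_event_open");
    return (int)fd;
}
```

With such a helper, the leader would be opened as open_counter(PERF_TYPE_HARDWARE, PERF_COUNT_HW_CPU_CYCLES, -1), and the caller can stop early if the result is -1.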
Next, we retrieve the identifier of the first counter:
ioctl(fd1, PERF_EVENT_IOC_ID, &id1);
The second counter (and any further counters) are created in the same fashion, with one exception: we pass the value of fd1 as the group leader (group_fd) argument:
memset(&pea, 0, sizeof(struct perf_event_attr));
pea.type = PERF_TYPE_SOFTWARE;
pea.size = sizeof(struct perf_event_attr);
pea.config = PERF_COUNT_SW_PAGE_FAULTS;
pea.disabled = 1;
pea.exclude_kernel = 1;
pea.exclude_hv = 1;
pea.read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;
fd2 = syscall(__NR_perf_event_open, &pea, 0, -1, fd1, 0); // <-- here
ioctl(fd2, PERF_EVENT_IOC_ID, &id2);
Next, we need to declare a data structure for reading multiple counters at once. The set of fields you have to declare depends on the flags you pass to perf_event_open in read_format; the manual page lists all possible fields. In our case we passed the PERF_FORMAT_ID flag, which adds the id field. This allows us to distinguish between the different counters.
struct read_format {
    uint64_t nr;
    struct {
        uint64_t value;
        uint64_t id;
    } values[/*2*/];
};
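To make the dependence on the flags concrete, here is a small sketch that computes how many bytes a group read returns, following the layout described in the perf_event_open manual page. The function name group_read_size is hypothetical, and the sketch deliberately ignores any read_format flag other than the ones it tests:

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes returned by read() on a group leader that has `nr` counters.
 * Sketch only: assumes PERF_FORMAT_GROUP is set and handles just the
 * three optional fields tested below. */
static size_t group_read_size(uint64_t read_format, uint64_t nr)
{
    assert(read_format & PERF_FORMAT_GROUP);

    size_t header = 1;                      /* nr */
    if (read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
        header++;                           /* time_enabled */
    if (read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
        header++;                           /* time_running */

    size_t per_counter = 1;                 /* value */
    if (read_format & PERF_FORMAT_ID)
        per_counter++;                      /* id */

    return (header + per_counter * nr) * sizeof(uint64_t);
}
```

For the two counters and the flags used in this answer, that is (1 + 2 * 2) * 8 = 40 bytes, so a 4096-byte buffer is more than enough.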
Now we call the standard profiling ioctls:
ioctl(fd1, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
ioctl(fd1, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);
do_something();
ioctl(fd1, PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);
Finally, we read the counters from the group leader's file descriptor. Both counters are returned in a single read_format structure of the kind we declared:
char buf[4096];
struct read_format* rf = (struct read_format*) buf;
uint64_t val1 = 0, val2 = 0; // remain 0 if a counter is missing from the result
uint64_t i;
read(fd1, buf, sizeof(buf));
for (i = 0; i < rf->nr; i++) {
    if (rf->values[i].id == id1) {
        val1 = rf->values[i].value;
    } else if (rf->values[i].id == id2) {
        val2 = rf->values[i].value;
    }
}
printf("cpu cycles: %"PRIu64"\n", val1);
printf("page faults: %"PRIu64"\n", val2);
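The loop above trusts that read() filled the buffer completely. A slightly more defensive variant is sketched below: it takes the byte count returned by read() and refuses to look past it. The helper name find_counter is hypothetical, and the struct is repeated only so that the sketch is self-contained:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as in the answer (PERF_FORMAT_GROUP | PERF_FORMAT_ID). */
struct read_format {
    uint64_t nr;
    struct {
        uint64_t value;
        uint64_t id;
    } values[];
};

/* Hypothetical helper: looks up the counter with the given id in a buffer
 * filled by read() on the group leader. `len` is the byte count that read()
 * returned. Returns 0 and stores the count in *out, or -1 if the buffer is
 * too short for the advertised number of entries or the id is absent. */
static int find_counter(const void *buf, size_t len, uint64_t id, uint64_t *out)
{
    assert(buf != NULL && out != NULL);

    const struct read_format *rf = buf;
    if (len < sizeof(rf->nr))
        return -1;
    if ((len - sizeof(rf->nr)) / sizeof(rf->values[0]) < rf->nr)
        return -1;

    for (uint64_t i = 0; i < rf->nr; i++) {
        if (rf->values[i].id == id) {
            *out = rf->values[i].value;
            return 0;
        }
    }
    return -1;
}
```

It would be called as find_counter(buf, n, id1, &val1), where n is the positive return value of read(fd1, buf, sizeof(buf)).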
Below is the full program listing:
#define _GNU_SOURCE
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>
#include <asm/unistd.h>
#include <errno.h>
#include <stdint.h>
#include <inttypes.h>

/* Layout returned by read() for PERF_FORMAT_GROUP | PERF_FORMAT_ID */
struct read_format {
    uint64_t nr;
    struct {
        uint64_t value;
        uint64_t id;
    } values[];
};

void do_something(void) {
    int i;
    char* ptr;

    ptr = malloc(100*1024*1024);
    if (ptr == NULL) {
        return;
    }
    for (i = 0; i < 100*1024*1024; i++) {
        ptr[i] = (char) (i & 0xff); // pagefault
    }
    free(ptr);
}

int main(int argc, char* argv[]) {
    struct perf_event_attr pea;
    int fd1, fd2;
    uint64_t id1, id2;
    uint64_t val1 = 0, val2 = 0; // remain 0 if a counter is missing from the result
    char buf[4096];
    struct read_format* rf = (struct read_format*) buf;
    uint64_t i;

    /* Group leader: CPU cycles */
    memset(&pea, 0, sizeof(struct perf_event_attr));
    pea.type = PERF_TYPE_HARDWARE;
    pea.size = sizeof(struct perf_event_attr);
    pea.config = PERF_COUNT_HW_CPU_CYCLES;
    pea.disabled = 1;
    pea.exclude_kernel = 1;
    pea.exclude_hv = 1;
    pea.read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;
    fd1 = syscall(__NR_perf_event_open, &pea, 0, -1, -1, 0);
    if (fd1 == -1) {
        perror("perf_event_open (cpu cycles)");
        return 1;
    }
    ioctl(fd1, PERF_EVENT_IOC_ID, &id1);

    /* Second counter: page faults, attached to the leader's group */
    memset(&pea, 0, sizeof(struct perf_event_attr));
    pea.type = PERF_TYPE_SOFTWARE;
    pea.size = sizeof(struct perf_event_attr);
    pea.config = PERF_COUNT_SW_PAGE_FAULTS;
    pea.disabled = 1;
    pea.exclude_kernel = 1;
    pea.exclude_hv = 1;
    pea.read_format = PERF_FORMAT_GROUP | PERF_FORMAT_ID;
    fd2 = syscall(__NR_perf_event_open, &pea, 0, -1, fd1 /*!!!*/, 0);
    if (fd2 == -1) {
        perror("perf_event_open (page faults)");
        return 1;
    }
    ioctl(fd2, PERF_EVENT_IOC_ID, &id2);

    /* Reset, enable and disable the whole group through the leader */
    ioctl(fd1, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
    ioctl(fd1, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);
    do_something();
    ioctl(fd1, PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);

    /* One read on the leader returns every counter in the group */
    if (read(fd1, buf, sizeof(buf)) == -1) {
        perror("read");
        return 1;
    }
    for (i = 0; i < rf->nr; i++) {
        if (rf->values[i].id == id1) {
            val1 = rf->values[i].value;
        } else if (rf->values[i].id == id2) {
            val2 = rf->values[i].value;
        }
    }
    printf("cpu cycles: %"PRIu64"\n", val1);
    printf("page faults: %"PRIu64"\n", val2);

    close(fd2);
    close(fd1);
    return 0;
}
perf.data to text or csv
You should use perf record -e <event-name> ... to sample events every 1ms. It seems you are trying to read the perf.data file and turn it into human-readable data. If you are not aware of it, perf report is the tool for that: it reads the perf.data file and generates a concise execution profile. The link below should help you:
Sample analysis with perf report
You can tailor the perf report output to your requirements. For example, perf report -F lets you choose the output columns, given as a comma-separated list.
In addition, perf stat has a mechanism for emitting its counts in CSV form: perf stat -x <separator> (for example, perf stat -x,) prints the results as fields delimited by that separator.
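If the CSV output is to be consumed by a program, each line has to be split into fields. The sketch below extracts the counter value and the event name from one line. It assumes the line starts with the fields value, unit, event name (that is, no interval or per-CPU columns were requested); the exact column set depends on the perf version and options, so check the CSV FORMAT section of the perf-stat manual page. The helper name parse_stat_line is hypothetical:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: pulls the counter value and the event name out of
 * one line of `perf stat -x, ...` output, assuming the leading fields are
 * value,unit,event. Returns 0 on success and -1 for lines whose first field
 * is not a plain integer (for example "<not counted>" or a fractional
 * value such as a task-clock time). */
static int parse_stat_line(const char *line, unsigned long long *value,
                           char *event, size_t event_len)
{
    assert(line != NULL && value != NULL && event != NULL && event_len > 0);

    char *end;
    *value = strtoull(line, &end, 10);
    if (end == line || *end != ',')
        return -1;                          /* first field is not an integer */

    const char *name = strchr(end + 1, ','); /* skip the unit field */
    if (name == NULL)
        return -1;
    name++;

    size_t n = strcspn(name, ",\n");        /* event name ends at next comma */
    if (n == 0 || n >= event_len)
        return -1;
    memcpy(event, name, n);
    event[n] = '\0';
    return 0;
}
```

The manual split is deliberate: strtok would silently merge the empty unit field with its neighbours and shift the columns.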
Edit #1:
(I am using Linux-Kernel 4.14.3 for evaluation.)
Since you want the number of events per sample, there are a couple of things to note. To count the number of events per sample, you need to know the sampling period. The sampling period is the number of events after which the performance counter overflows and the kernel records a sample. So essentially, in your case,
sampling period = number of events per sample
There are two cases: either you specify the sampling period yourself, or you do not.
If you specify the sampling period when running perf record, the command looks something like this:
perf record -e <some_event_name> -c 1000 ...
Here -c 1000 means that the sampling period is 1000. In this case, you deliberately force the system to record 1000 events per sample, because the sampling period is fixed by you.
On the other hand, if you do not specify the sampling period, the system will try to record samples at a default frequency of 1000 samples/sec. This means that the system automatically adjusts the sampling period, if need be, to maintain that frequency. In such a case, to determine the sampling period, you need to inspect the perf.data file.
Specifically, you need to dump the perf.data file using the command:
perf script -D
The output will look something like this:
0 0 0x108 [0x38]: PERF_RECORD_FORK(1:1):(0:0)
0x140 [0x30]: event: 3
.
. ... raw event: size 48 bytes
. 0000: 03 00 00 00 00 00 30 00 01 00 00 00 01 00 00 00 ......0.........
. 0010: 73 79 73 74 65 6d 64 00 00 00 00 00 00 00 00 00 systemd.........
. 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0 0 0x140 [0x30]: PERF_RECORD_COMM: systemd:1/1
0x170 [0x38]: event: 7
.
. ... raw event: size 56 bytes
. 0000: 07 00 00 00 00 00 38 00 02 00 00 00 00 00 00 00 ......8.........
. 0010: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0030: 00 00 00 00 00 00 00 00 ........
You can see different types of records, such as PERF_RECORD_FORK and PERF_RECORD_COMM, and even PERF_RECORD_MMAP. You need to look specifically for the records that begin with PERF_RECORD_SAMPLE inside the file, like this:
14 173826354106096 0x10d40 [0x28]: PERF_RECORD_SAMPLE(IP, 0x1): 28179/28179: 0xffffffffa500d3b5 period: 3000 addr: 0
... thread: perf:28179
...... dso: [kernel.kallsyms]
perf 28179 [014] 173826.354106: cache-misses: ffffffffa500d3b5 [unknown] ([kernel.kallsyms])
As you can see, in this case the period is 3000, i.e. 3000 events occurred between the previous sample and this one (the number of events per sample is 3000). Note that, as mentioned above, this period may be adjusted from sample to sample, so you need to collect all of the PERF_RECORD_SAMPLE records from the perf.data file.
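Collecting the periods by hand gets tedious for large files. As a rough sketch, the following function scans perf script -D text and adds up the period of every PERF_RECORD_SAMPLE line. It assumes the line format shown above (a "period: N" field on the same line as PERF_RECORD_SAMPLE), which may differ between perf versions; the name sum_sample_periods is hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads `perf script -D` text from `in` and adds up the "period: N" field
 * of every PERF_RECORD_SAMPLE line. Stores the number of samples found in
 * *samples and returns the summed period, which approximates the total
 * number of events. Assumes each sample header fits in one line of at most
 * 4095 characters. */
static uint64_t sum_sample_periods(FILE *in, uint64_t *samples)
{
    assert(in != NULL && samples != NULL);

    char line[4096];
    uint64_t total = 0;
    *samples = 0;

    while (fgets(line, sizeof(line), in) != NULL) {
        if (strstr(line, "PERF_RECORD_SAMPLE") == NULL)
            continue;                       /* FORK, COMM, MMAP, hex dumps */
        const char *p = strstr(line, "period: ");
        if (p == NULL)
            continue;
        total += strtoull(p + strlen("period: "), NULL, 10);
        (*samples)++;
    }
    return total;
}
```

A small main that calls sum_sample_periods(stdin, &n) could then be fed the output of perf script -D through a pipe; dividing the returned total by n gives the average number of events per sample.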