Arm64 Linux Page Table Walk

I finally solved the problem.

Actually, my code is correct. The only part I missed is a page table entry check.

According to the ARMv8 page table design, ARM uses a 4-level page table with the 4 KB granule. The levels (level 0 to 3, as defined in the link) correspond to pgd, pud, pmd, and pte in the Linux code.

In the ARM architecture, the entry at each level can be either a block entry or a table entry (see the AArch64 Descriptor Format section in the link).

If the virtual address is mapped by a 4 KB page, the walk has to go all the way down to the level 3 entry (ptep). If the address belongs to a larger chunk, however, the mapping is stored as a block entry at a higher level, such as pud or pmd.

By checking the low 2 bits of the entry at each level, you know whether it is a block entry or a table entry, and you only keep walking down for a table entry.

Here is how to improve my code above:

Retrieve the descriptor through the page table pointer (desc = *pgd) and then check the low 2 bits of the descriptor.

If the descriptor is a table entry (0b11), you need to fetch the lower-level entry, as my code above shows.
If you get a block entry (0b01) at any level, you can stop there and translate the VA to PA based on the descriptor desc you just read.

int find_physical_pte(void *addr)
{
    pgd_t *pgd;
    pud_t *pud;
    pmd_t *pmd;
    pte_t *ptep;
    unsigned long long address;

    address = (unsigned long long)addr;

    pgd = pgd_offset(current->mm, address);
    printk(KERN_INFO "\npgd is: %p\n", (void *)pgd);
    printk(KERN_INFO "pgd value: %llx\n", (unsigned long long)pgd_val(*pgd));
    if (pgd_none(*pgd) || pgd_bad(*pgd))
        return -1;
    // On arm64 a pgd entry that passes pgd_bad() is a table entry, so keep walking.

    pud = pud_offset(pgd, address);
    printk(KERN_INFO "\npud is: %p\n", (void *)pud);
    printk(KERN_INFO "pud value: %llx\n", (unsigned long long)pud_val(*pud));
    if (pud_none(*pud))
        return -2;
    // Low 2 bits == 0b01: block entry, the walk ends at this level.
    // Check this before pud_bad(), which on arm64 rejects anything that is not a table entry.
    if ((pud_val(*pud) & 3) == 1)
        return 1;
    if (pud_bad(*pud))
        return -2;

    pmd = pmd_offset(pud, address);
    printk(KERN_INFO "\npmd is: %p\n", (void *)pmd);
    printk(KERN_INFO "pmd value: %llx\n", (unsigned long long)pmd_val(*pmd));
    if (pmd_none(*pmd))
        return -3;
    // Same check as for the pud: a block entry ends the walk here.
    if ((pmd_val(*pmd) & 3) == 1)
        return 1;
    if (pmd_bad(*pmd))
        return -3;

    ptep = pte_offset_kernel(pmd, address);
    printk(KERN_INFO "\npte is: %p\n", (void *)ptep);
    if (!ptep)
        return -4;
    printk(KERN_INFO "pte value: %llx\n", (unsigned long long)pte_val(*ptep));

    return 1;
}
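
To make the low-2-bits check concrete, here is a minimal userspace sketch in C. It is not kernel code, and the names desc_kind, classify_desc, and leaf_to_pa are made up for illustration. It assumes the 4 KB granule and a 48-bit output address stored in descriptor bits [47:12]; it does not check whether a block entry is actually permitted at the given level.

```c
#include <assert.h>
#include <stdint.h>

/* Descriptor kinds in the AArch64 translation table format. */
enum desc_kind { DESC_INVALID, DESC_BLOCK, DESC_TABLE, DESC_PAGE };

/* Classify a raw 64-bit descriptor read at the given level (0..3). */
static enum desc_kind classify_desc(uint64_t desc, int level)
{
    assert(level >= 0 && level <= 3);

    if ((desc & 1) == 0)        /* bit 0 clear: invalid entry */
        return DESC_INVALID;
    if (level == 3)             /* level 3: 0b11 is a page, 0b01 is reserved */
        return (desc & 2) ? DESC_PAGE : DESC_INVALID;
    /* levels 0..2: 0b11 points to the next table, 0b01 is a block */
    return (desc & 2) ? DESC_TABLE : DESC_BLOCK;
}

/*
 * Physical address for a block or page descriptor (assumption: 4 KB
 * granule, output address in descriptor bits [47:12]).
 */
static uint64_t leaf_to_pa(uint64_t desc, int level, uint64_t va)
{
    /* One entry maps 4 KB at level 3, 2 MB at level 2, 1 GB at level 1. */
    unsigned shift = 12 + 9 * (3 - level);
    uint64_t offset_mask = (UINT64_C(1) << shift) - 1;
    uint64_t oa_mask = ((UINT64_C(1) << 48) - 1) & ~offset_mask;

    assert(level >= 1 && level <= 3);
    return (desc & oa_mask) | (va & offset_mask);
}
```

In a walk you would call classify_desc() on the value read at each level, descend on DESC_TABLE, and call leaf_to_pa() as soon as you see DESC_BLOCK or DESC_PAGE.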

Getting error when compiling kernel for page table walk

I recently ran into the same problem and found that, just like pgd_offset and pud_offset, there is a p4d_offset. Put it between pgd and pud:

pgd_t *pgd;
p4d_t *p4d;
pud_t *pud;
pmd_t *pmd;
pte_t *ptep, pte;

...

pgd = pgd_offset(task_mm, vmpage);
if (pgd_none(*pgd) || pgd_bad(*pgd))
return 0;

p4d = p4d_offset(pgd, vmpage);
if (p4d_none(*p4d) || p4d_bad(*p4d))
return 0;

pud = pud_offset(p4d, vmpage);
if (pud_none(*pud) || pud_bad(*pud))
return 0;

...

Edit: Here is some information about the additional level: Five-level page tables.

The extra level (p4d) has been in the kernel since version 4.11.

Page table bits in linux virtual address (4-level paging)

The kernel documentation explains this very clearly:
https://www.kernel.org/doc/Documentation/arm64/memory.txt

Translation table lookup with 4KB pages:

+--------+--------+--------+--------+--------+--------+--------+--------+
|63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
+--------+--------+--------+--------+--------+--------+--------+--------+
 |                 |         |         |         |         |
 |                 |         |         |         |         v
 |                 |         |         |         |   [11:0]  in-page offset
 |                 |         |         |         +-> [20:12] L3 index
 |                 |         |         +-----------> [29:21] L2 index
 |                 |         +---------------------> [38:30] L1 index
 |                 +-------------------------------> [47:39] L0 index
 +-------------------------------------------------> [63] TTBR0/1

L0 - PGD, L1 - PUD, L2 - PMD, L3 - PTE

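arm64 Linux can also be configured with a 39-bit virtual address space (4 KB pages, 3-level paging), in which case only bits [38:0] are translated. The PUD is then folded into the PGD, so PGD = PUD = L1 index [38:30]. The rest of the mapping remains the same.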
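
The bit ranges in the diagram translate directly into shifts and masks. Below is a small userspace sketch for the 4 KB granule with a 48-bit address; struct va_fields and split_va are hypothetical names chosen for this example, not kernel APIs.

```c
#include <stdint.h>

/* Table indices and page offset of a virtual address (4 KB granule). */
struct va_fields {
    unsigned l0, l1, l2, l3;   /* PGD, PUD, PMD, PTE index, 9 bits each */
    unsigned offset;           /* byte offset inside the 4 KB page */
};

static struct va_fields split_va(uint64_t va)
{
    struct va_fields f;

    f.l0 = (va >> 39) & 0x1ff;   /* bits [47:39] */
    f.l1 = (va >> 30) & 0x1ff;   /* bits [38:30] */
    f.l2 = (va >> 21) & 0x1ff;   /* bits [29:21] */
    f.l3 = (va >> 12) & 0x1ff;   /* bits [20:12] */
    f.offset = va & 0xfff;       /* bits [11:0]  */
    return f;
}
```

With a 39-bit configuration the l0 field is simply not used and the walk starts from the l1 index.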

MMU page table descriptor size of indexes

First of all, here is a general description about ARMv8 page table design.

The number of index bits per level is fixed by the page size: 9 bits for 4 KB, 11 bits for 16 KB, and 13 bits for 64 KB. The details about the bits can be found in the link above.
The number of levels alone does not tell you the granule (4 KB pages with a 39-bit address space also use 3 levels), but if your 3-level page table uses the 16 KB granule, each level has 11 index bits.
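
The 9/11/13 figures follow from each table filling one granule with 8-byte descriptors. Here is a rough sketch of that arithmetic in userspace C; index_bits and levels_needed are names invented for this example, and the level count is just the number of address bits left to resolve divided by the bits per level, rounded up.

```c
#include <assert.h>

/* Index bits resolved by one full table: log2(granule_bytes / 8). */
static unsigned index_bits(unsigned long granule_bytes)
{
    unsigned long entries = granule_bytes / 8;   /* 8-byte descriptors */
    unsigned bits = 0;

    assert(granule_bytes == 4096 || granule_bytes == 16384 ||
           granule_bytes == 65536);
    while (entries > 1) {
        entries >>= 1;
        bits++;
    }
    return bits;
}

/* Number of table levels needed to translate va_bits of virtual address. */
static unsigned levels_needed(unsigned va_bits, unsigned long granule_bytes)
{
    unsigned per_level = index_bits(granule_bytes);
    unsigned offset_bits = per_level + 3;        /* log2(granule_bytes) */

    assert(va_bits > offset_bits);
    /* Round up: the top level may use fewer than per_level bits. */
    return (va_bits - offset_bits + per_level - 1) / per_level;
}
```

For example, levels_needed(39, 4096) and levels_needed(47, 16384) both give 3, which is why a 3-level table can mean either granule.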

Finally, here is the answer I posted before to explain the AArch64 page table walk. Maybe it's helpful for you to understand the ARM page table.


