In Linux, Are Physical Memory Pages Belonging to the Kernel Data Segment Swappable or Not?

Are kernel virtual memory pages swappable?

Kernel-space pages are never paged in or out, by design; they are pinned in memory. Kernel pages can usually be trusted from a security point of view, while user-space pages should NOT be trusted.

For this reason, kernel code can access kernel buffers directly without worrying about handling page faults. The same is not true of user-space buffers, where an access may fault.

Kernel-space pages cannot be paged out, by design: consider what your application would do if the page containing the instructions for handling a page fault were itself paged out!
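For comparison, user space can ask for a similar guarantee on its own pages. This is a minimal sketch, assuming a Linux/POSIX system: mlock() pins a buffer so that it is not paged out, which is the property kernel pages have by default. The helper names alloc_pinned and free_pinned are made up for this illustration, and pinning can fail (for example when RLIMIT_MEMLOCK is too low), so the caller has to handle NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: allocate `len` bytes of page-aligned memory and pin
 * it with mlock() so it is not paged out. Returns NULL if the allocation
 * or the pinning fails.
 */
static void *alloc_pinned(size_t len)
{
    void *buf = NULL;
    long page = sysconf(_SC_PAGESIZE);

    assert(len > 0);
    if (page <= 0 || posix_memalign(&buf, (size_t)page, len) != 0)
        return NULL;
    if (mlock(buf, len) != 0) {
        free(buf);
        return NULL;
    }
    memset(buf, 0, len);   /* the pages are locked, so they stay resident */
    return buf;
}

/* Hypothetical helper: unpin and release a buffer from alloc_pinned(). */
static void free_pinned(void *buf, size_t len)
{
    if (buf != NULL) {
        munlock(buf, len);
        free(buf);
    }
}
```

A caller would do something like `char *p = alloc_pinned(4096);`, use the buffer if `p` is not NULL, and finish with `free_pinned(p, 4096);`.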

Physical storage of the kernel data

  1. The first GB of physical memory is mapped linearly to the highest GB of virtual addresses, but the kernel can modify these mappings.
  2. Yes, it is.
  3. No, the Linux kernel is not swappable. Only user-process memory can be swapped out.

Note that this is only valid for 32-bit systems. Mappings on 64-bit systems are different.
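To make the linear mapping concrete, here is a small user-space sketch of the address arithmetic. It assumes the common 3G/1G split, where the kernel's region starts at 0xC0000000; the function names and the PAGE_OFFSET_32 constant are hypothetical (the kernel's own helpers are the __pa() and __va() macros), and the 1 GB bound simply follows the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 3G/1G split: the kernel's linear mapping starts at 0xC0000000. */
#define PAGE_OFFSET_32   UINT32_C(0xC0000000)
#define LINEAR_MAP_SIZE  UINT32_C(0x40000000)   /* first GB of physical memory */

/* Physical address -> kernel virtual address inside the linear mapping. */
static uint32_t phys_to_virt32(uint32_t phys)
{
    assert(phys < LINEAR_MAP_SIZE);
    return phys + PAGE_OFFSET_32;
}

/* Kernel virtual address inside the linear mapping -> physical address. */
static uint32_t virt_to_phys32(uint32_t virt)
{
    assert(virt >= PAGE_OFFSET_32);
    return virt - PAGE_OFFSET_32;
}
```

For example, phys_to_virt32(0x00100000) gives 0xC0100000, and virt_to_phys32() reverses it. Because the translation is a constant offset, no page-table walk is needed for these addresses.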

Allocate swappable memory in the Linux kernel

You can create a file in the kernel's internal shmem (shared memory) filesystem.

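#include <linux/shmem_fs.h>	/* shmem_file_setup() */

const char *name = "example";	/* a label only; the file is never linked */
loff_t size = PAGE_SIZE;	/* size of the file in bytes */
unsigned long flags = 0;	/* VM_* flags, e.g. VM_NORESERVE */
struct file *filp = shmem_file_setup(name, size, flags);
/* assert(!IS_ERR(filp)); */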

The file isn't actually linked, so the name isn't visible. The flags may include VM_NORESERVE to skip accounting up-front, instead accounting as pages are allocated. Now you have a shmem file. You can map a page like so:

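struct address_space *mapping = filp->f_mapping;
pgoff_t index = 0;	/* page offset within the file */
struct page *p = shmem_read_mapping_page(mapping, index);
/* assert(!IS_ERR(p)); */
/*
 * page_to_virt() assumes the page is in the kernel's direct mapping;
 * a highmem page would need kmap_local_page() instead.
 */
void *data = page_to_virt(p);
memset(data, 0, PAGE_SIZE);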

There is also shmem_read_mapping_page_gfp(..., gfp_t) to specify how the page is allocated. Don't forget to put the page back when you're done with it.

put_page(p);

Ditto with the file.

fput(filp);

How does the kernel know which pages in the virtual address space correspond to a swapped-out physical page frame?

Linux:

When a page is swapped out, its Page Table Entry is replaced with one that is marked invalid (not present) and that records where the page is saved in the swap file. That is: an index into the swap_info array and an offset within the swap_map.
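The idea can be sketched in user-space C as plain bit packing. The layout below (present bit 0 clear, swap area index in bits 1-6, slot offset in bits 8-31) is an assumption modelled on the older 32-bit x86 scheme; real layouts differ between architectures and kernel versions, and all names here are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_PRESENT       UINT32_C(0x001)  /* bit 0: clear for a swapped-out page */
#define SWP_TYPE_SHIFT    1                /* assumed: bits 1-6, index into swap_info */
#define SWP_TYPE_MASK     UINT32_C(0x3f)
#define SWP_OFFSET_SHIFT  8                /* assumed: bits 8-31, offset within swap_map */
#define SWP_OFFSET_LIMIT  (UINT32_C(1) << 24)

/* Build the value stored in the PTE of a swapped-out page. */
static uint32_t make_swap_pte(uint32_t type, uint32_t offset)
{
    assert(type <= SWP_TYPE_MASK);
    assert(offset < SWP_OFFSET_LIMIT);
    return (type << SWP_TYPE_SHIFT) | (offset << SWP_OFFSET_SHIFT);
}

/* Non-zero if the hardware would treat this entry as a valid mapping. */
static int pte_is_present(uint32_t pte)
{
    return (pte & PTE_PRESENT) != 0;
}

/* Which swap area the page lives in. */
static uint32_t swap_pte_type(uint32_t pte)
{
    return (pte >> SWP_TYPE_SHIFT) & SWP_TYPE_MASK;
}

/* Which slot inside that swap area holds the page. */
static uint32_t swap_pte_offset(uint32_t pte)
{
    return pte >> SWP_OFFSET_SHIFT;
}
```

With this layout, make_swap_pte(2, 0x1234) yields 0x00123404: the present bit is clear, so an access faults, and the fault handler can recover swap area 2 and slot 0x1234 from the entry itself.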

Example from a (somewhat old) Page Table Entry type (pte_t) on x86. Some of the bits are used as flags by the hardware:

Bit             Function
_PAGE_PRESENT   Page is resident in memory and not swapped out
_PAGE_PROTNONE  Page is resident but not accessible
_PAGE_RW        Set if the page may be written to
_PAGE_USER      Set if the page is accessible from user space
_PAGE_DIRTY     Set if the page is written to
_PAGE_ACCESSED  Set if the page is accessed

Table 3.1: Page Table Entry Protection and Status Bits
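As a rough illustration of how these bits are tested, here is a user-space sketch. The bit values follow the 32-bit x86 hardware layout; the position of PROTNONE (bit 7) is an assumption taken from older kernels, and the PTE_* names and helper functions are made up for this example rather than being the kernel's own.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values; names shortened from the _PAGE_* flags in the table. */
#define PTE_PRESENT   UINT32_C(0x001)
#define PTE_RW        UINT32_C(0x002)
#define PTE_USER      UINT32_C(0x004)
#define PTE_ACCESSED  UINT32_C(0x020)
#define PTE_DIRTY     UINT32_C(0x040)
#define PTE_PROTNONE  UINT32_C(0x080)

/* Resident in memory: mapped normally, or resident but not accessible. */
static int pte_resident(uint32_t pte)
{
    return (pte & (PTE_PRESENT | PTE_PROTNONE)) != 0;
}

/* User space may write only if all three bits are set. */
static int pte_user_writable(uint32_t pte)
{
    const uint32_t need = PTE_PRESENT | PTE_RW | PTE_USER;
    return (pte & need) == need;
}

static int pte_is_dirty(uint32_t pte)     { return (pte & PTE_DIRTY) != 0; }
static int pte_was_accessed(uint32_t pte) { return (pte & PTE_ACCESSED) != 0; }

/* Model of a write to a present page: accessed and dirty both get set. */
static uint32_t pte_after_write(uint32_t pte)
{
    assert(pte & PTE_PRESENT);
    return pte | PTE_ACCESSED | PTE_DIRTY;
}
```

Starting from PTE_PRESENT | PTE_RW | PTE_USER (0x007), pte_after_write() gives 0x067, and both pte_is_dirty() and pte_was_accessed() then report true.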

See also another SO answer with a diagram of the x86-64 page table format. When the low (present) bit is 0, the hardware ignores all the other bits, so the kernel can use them for anything. Even in a "present" entry, there are some guaranteed-ignored bits that aren't reserved for future hardware use, so the kernel can use them for its own purposes.

Presumably other architectures are similar.


In simple terms: a process points to a page, and it is the page's entry that gets updated. Thus the processes are, in effect, also updated. When the physical page is requested, it is swapped back in, and with it the view of every process that uses it. The point is that the Page Table Entry is not removed when memory is swapped out.
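A toy user-space model may make the life cycle clearer. Everything here is invented for illustration (tiny 16-byte "pages", four frames, four swap slots, a single PTE word with bit 0 as the present flag), and it models only one mapping, not sharing between processes. What it shows is that the entry stays in place and is merely rewritten on swap-out and swap-in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16u           /* tiny pages keep the model small */
#define TOY_NPAGES    4u
#define TOY_PRESENT   UINT32_C(1)   /* bit 0: page currently sits in a frame */

static uint8_t toy_frames[TOY_NPAGES][TOY_PAGE_SIZE];  /* "physical memory" */
static uint8_t toy_swap[TOY_NPAGES][TOY_PAGE_SIZE];    /* "swap area" slots */

/* Present PTE: frame number in the upper bits. Swapped PTE: slot number. */
static uint32_t toy_mk_pte(uint32_t index, int present)
{
    assert(index < TOY_NPAGES);
    return (index << 1) | (present ? TOY_PRESENT : 0);
}

/* Swap out: copy the frame to a slot and rewrite the PTE in place. */
static void toy_swap_out(uint32_t *pte, uint32_t slot)
{
    assert(*pte & TOY_PRESENT);
    assert(slot < TOY_NPAGES);
    memcpy(toy_swap[slot], toy_frames[*pte >> 1], TOY_PAGE_SIZE);
    *pte = toy_mk_pte(slot, 0);
}

/* Access: a non-present PTE causes a "fault" that swaps the page back in. */
static uint8_t *toy_access(uint32_t *pte, uint32_t free_frame)
{
    if (!(*pte & TOY_PRESENT)) {
        assert(free_frame < TOY_NPAGES);
        memcpy(toy_frames[free_frame], toy_swap[*pte >> 1], TOY_PAGE_SIZE);
        *pte = toy_mk_pte(free_frame, 1);
    }
    return toy_frames[*pte >> 1];
}
```

A page written through toy_access(), pushed out with toy_swap_out(), and touched again comes back with the same contents, possibly in a different frame; only the value stored in the PTE changed along the way.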

You might find some of this useful:

  • Gustavo Duarte: How The Kernel Manages Your Memory.

From Mel Gorman's book (2007), included with the kernel documentation:

  • 11.2 Mapping Page Table Entries to Swap Entries
  • 3.2 Describing a Page Table Entry

  • Red Hat on VM's Life of a page.


