Linux 3/1 Virtual Address Split


The reason that kernel virtual space is a limiting factor on usable physical memory is that the kernel needs access to all physical memory, and the way it accesses physical memory is through kernel virtual addresses. The kernel doesn't use special instructions that allow direct access to physical memory locations - it has to set up page table entries for any physical range it wants to talk to.

In the "old style" scheme, the kernel set things up so that every process's page tables mapped virtual addresses from 0xC0000000 to 0xFFFFFFFF directly to physical addresses from 0x00000000 to 0x3FFFFFFF (these pages were marked so that they were only accessible in ring 0 - kernel mode). These are the "kernel virtual addresses". Under this scheme, the kernel could directly read and write any physical memory location without having to fiddle with the MMU to change the mappings.

Under the HIGHMEM scheme, the mappings from kernel virtual addresses to physical addresses aren't fixed - parts of physical memory are mapped in and out of the kernel virtual address space as the kernel needs access to that memory. This allows more physical memory to be used, but at the cost of having to constantly change the virtual-to-physical mappings, which is quite an expensive operation.
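In kernel code the "map it in when needed" step looks roughly like the sketch below (kernel-module context, so not runnable as a normal program; kmap_local_page()/kunmap_local() are the current interface, older code used kmap_atomic()/kunmap_atomic()):

    /* Sketch: temporarily mapping a page that may live in high memory
     * so the kernel can touch its contents. Kernel-module context. */
    #include <linux/errno.h>
    #include <linux/gfp.h>
    #include <linux/highmem.h>
    #include <linux/string.h>

    static int touch_one_highmem_page(void)
    {
        struct page *page = alloc_page(GFP_HIGHUSER);  /* may come from ZONE_HIGHMEM */
        void *vaddr;

        if (!page)
            return -ENOMEM;

        vaddr = kmap_local_page(page);   /* create a temporary kernel mapping */
        memset(vaddr, 0, PAGE_SIZE);     /* the kernel can now access the data */
        kunmap_local(vaddr);             /* tear the mapping down again */

        __free_page(page);
        return 0;
    }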

Is an entire process’s virtual address space split into pages

First note that "pages" are simply regions of an address space. A region that is "non-pageable" (by which I assume you mean it cannot be swapped to disk) is still logically divided into pages, but the OS might implement a different policy on those pages.

The most common page size is 4096 bytes. Many architectures support use of multiple page sizes at the same time (e.g. 4K pages as well as 1MB pages). However, operating systems often stick with just one page size, since under most circumstances, the costs of managing multiple page sizes are much higher than the benefits this provides. Exceptions exist but I don't think you need worry about them.
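As a small illustration (assuming the common 4096-byte page size), translating a virtual address into a page number plus an offset is just a shift and a mask:

    /* Split a virtual address into virtual page number + offset,
     * assuming 4 KiB pages (PAGE_SHIFT = 12). */
    #include <stdio.h>

    #define PAGE_SIZE  4096UL
    #define PAGE_SHIFT 12           /* log2(4096) */

    int main(void)
    {
        unsigned long vaddr  = 0x08049a70UL;
        unsigned long vpn    = vaddr >> PAGE_SHIFT;        /* virtual page number */
        unsigned long offset = vaddr & (PAGE_SIZE - 1);    /* offset inside the page */
        printf("vaddr 0x%08lx -> page 0x%lx, offset 0x%03lx\n", vaddr, vpn, offset);
        return 0;
    }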

Every virtual page has certain permissions attached to it, like whether it's readable, writeable, executable (varies depending on hardware support). The OS can use this to help enforce security, cache coherency (for shared memory), and swapping pages out of physical memory.
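Here's a user-space sketch of those per-page permissions in action: an anonymous page is made read-only with mprotect(), after which a write to it would fault (SIGSEGV).

    /* Per-page permissions from user space: map a page read/write,
     * then drop the write permission with mprotect(). */
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 4096;
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return 1;

        strcpy(p, "writable for now");

        if (mprotect(p, len, PROT_READ) != 0)   /* make this page read-only */
            return 1;

        printf("still readable: %s\n", p);
        /* p[0] = 'X';   <- this write would now raise SIGSEGV */

        munmap(p, len);
        return 0;
    }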

The .text, .bss and .data regions need not be known to the OS (though most OSes do know about them, for security and performance reasons).

The OS may not actually allocate memory for a stack/heap page until the first time that page is accessed. The OS may provide system calls to request more pages of heap/stack space. Some OSes provide shared memory or shared library functionality which leads to more regions appearing in the address space. Depends on the OS.
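A small sketch of that lazy behaviour, using an anonymous mmap() as the "give me more pages" system call (malloc()/brk() behave similarly under the hood): the mapping costs essentially no physical memory until its pages are first touched.

    /* Demand paging: a large anonymous mapping is only virtual address
     * space at first; each first write faults one physical page in. */
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 256UL * 1024 * 1024;   /* 256 MiB of virtual address space */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return 1;

        /* Nothing is resident yet; touching each page faults it in. */
        for (size_t i = 0; i < len; i += 4096)
            p[i] = 1;

        puts("touched every page; roughly 256 MiB is now resident");
        munmap(p, len);
        return 0;
    }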

Kernel space and user space virtual address division

No, the split is only for dividing up the virtual address space.

It just means that the address space from 0x00000000 up to 0xBFFFFFFF 'belongs' to or is available for mapping in user-space. Virtual addresses 0xC0000000 to 0xFFFFFFFF belong to the kernel.

The amount of available RAM and how it is used has nothing to do with how the virtual address space is partitioned in the Linux kernel.

FWIW, on ARM, you can configure what the split is so it doesn't HAVE to be 3:1 (user:kernel). It can be 1:3, 2:2 or 3:1. I'm assuming there is a similar option for the x86 arch.
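For reference, on 32-bit kernels this is a Kconfig choice; the option names below come from the arch Kconfig files (ARM and 32-bit x86 both have them), though the exact set of choices varies by architecture and kernel version:

    # VM split choices for a 32-bit kernel (names from arch/arm and arch/x86 Kconfig)
    CONFIG_VMSPLIT_3G=y      # 3G user / 1G kernel, PAGE_OFFSET=0xC0000000 (the usual default)
    # CONFIG_VMSPLIT_2G=y    # 2G user / 2G kernel, PAGE_OFFSET=0x80000000
    # CONFIG_VMSPLIT_1G=y    # 1G user / 3G kernel, PAGE_OFFSET=0x40000000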

How much memory does a 64bit Linux Kernel take up?


Each user-space process can use its own 2^47 bytes (128 TiB) of virtual address space, or more on a system with PML5 (5-level paging) support.

The available physical RAM to back those pages is the total size of physical RAM, minus maybe 30 MiB or so that the kernel needs for its own code/data. (Not including the pagecache: Linux will use any spare pages as buffers and disk cache). This is mostly unrelated to virtual address-space limits.


1G is how much virtual address space a 32-bit kernel used up, not how much physical RAM.

The address-space question mattered for how much memory a single process could use at the same time, but the kernel can still use all your RAM for caching file data, etc. Unless you're finding the 2^47 bytes (or 2^56 with 5-level paging) of the low-half virtual address-space range cramped, there's no equivalent problem.

See the kernel's Documentation/x86/x86_64/mm.txt for the x86-64 virtual memory map. Also see "Why 4-level paging can only cover 64 TiB of physical address" regarding x86-64 Linux not doing inconvenient HIGHMEM stuff - the entire high half of virtual address space is reserved for the kernel, and it maps all the RAM because it's a kernel.
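The 64 TiB figure falls out of the address-space sizes (a rough breakdown; see mm.txt for the exact layout):

    2^48 B = 256 TiB  total virtual address space with 4-level paging
    2^47 B = 128 TiB  of that is the kernel's high half
    2^46 B =  64 TiB  of the kernel half is the direct map of all physical RAM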

Virtual address space usage does indirectly set a 64 TiB limit on how much physical RAM the kernel can use, but if you have less than that there's no effect. Just like how a 32-bit kernel wasn't a problem if your machine had less than 1 or 2 GiB of RAM.


The amount of physical RAM actually reserved by the kernel depends on build options and modules, but might be something like 16 to 32 MiB.

Check dmesg output and look for something like this line, from the boot log of an x86-64 5.16.3-arch1 kernel:

    Memory: 32538176K/33352340K available (14344K kernel code, 2040K rwdata, 8996K rodata, 1652K init, 4336K bss, 813904K reserved, 0K cma-reserved)

Don't count the init (freed after boot) or reserved parts; I'm pretty sure Linux doesn't actually reserve ~800 MiB in a way that makes it unusable for anything else.

Also look for the later "Freeing unused decrypted memory: 2036K" / "Freeing unused kernel image (initmem) memory: 1652K" etc. messages. (That's the same size as the init part listed earlier, which is why you don't have to count it.)

It might also dynamically allocate some memory during startup; that initial "Memory:" line is just the sum of the kernel image's static code+data sections (.text, .data, .bss, etc.).
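Summing the static pieces from the log line above gives roughly the "few tens of MiB" figure mentioned earlier:

    14344K code + 2040K rwdata + 8996K rodata + 4336K bss = 29716K ≈ 29 MiB
    (+ 1652K init, but that part is freed after boot)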

High memory mappings in kernel virtual address space


  1. The x86-32 kernel needs high memory to access more than 1G of physical memory, as it is impossible to permanently map more than 2^32 addresses within a 32-bit address space and the kernel/user split is 1G/3G.
     The x86-64 kernel has no such limitation, as the amount of physically-addressable memory (currently 256T) fits within its 64-bit address space and thus may always be permanently mapped.
     High memory is a hack. Ideally you don't need it. Indeed, the point of x86-64 is to be able to directly address all the memory you could possibly want. Taken from https://www.quora.com/Linux-Kernel/What-is-the-difference-between-high-memory-and-normal-memory

  2. I think "page descriptor" means struct page, and considering the size of struct page - yes, all of them can be stored in ZONE_NORMAL (a rough calculation is sketched after this list).
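A back-of-the-envelope check, assuming 4 KiB pages and a struct page of about 32 bytes (typical for 32-bit kernels; the exact size depends on configuration):

    4 GiB RAM / 4 KiB per page = 2^20 pages, i.e. 2^20 struct page entries
    2^20 x 32 B = 32 MiB of page descriptors
    32 MiB fits easily in ZONE_NORMAL (~896 MiB of directly-mapped low memory on x86-32)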


