Difference Between Flat Memory Model and Protected Memory Model

Difference between flat memory model and protected memory model?

In order to give an answer that makes sense, let's review some concepts first.

Most modern processors have a Memory Management Unit (MMU), which is used for a number of purposes.

One purpose is to map between the Virtual Address (the one the CPU "sees") and the Physical Address (where the chips are actually connected). This is called address translation.

Another purpose is to set access attributes for certain virtual memory locations (things like Read-Write, Read-Only, or not accessible).

With an MMU, you can have what is called "unity mapping" where the processor's virtual address is the same as the physical address (i.e. you don't use address translation). For example, if the processor accesses 0x10000, then it is accessing the physical location 0x10000.
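As a rough illustration of translation vs. unity mapping, here is a toy one-level lookup in C (a sketch only; the table size, page size, and identity initialisation are invented for the example and do not reflect any real MMU or page-table format):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12                 /* 4 KiB pages */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define NUM_PAGES  16                 /* tiny toy address space */

/* toy "page table": virtual page number -> physical page number */
static uint32_t page_table[NUM_PAGES];

static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;      /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);  /* offset within the page */
    return (page_table[vpn] << PAGE_SHIFT) | offset;
}

int main(void)
{
    /* "unity" (identity) mapping: virtual page n -> physical page n */
    for (uint32_t n = 0; n < NUM_PAGES; n++)
        page_table[n] = n;
    printf("0x%x -> 0x%x\n", 0x3456u, (unsigned)translate(0x3456u));  /* same address */

    /* remap virtual page 3 to physical page 9: address translation in action */
    page_table[3] = 9;
    printf("0x%x -> 0x%x\n", 0x3456u, (unsigned)translate(0x3456u));  /* now 0x9456 */
    return 0;
}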

A "flat" memory model typically refers to the fact that any virtual address the CPU accesses is unique. Thus, for a 32-bit CPU, you are limited to a maximum of 4G of address space.

It is most often (though not necessarily) used to refer to a unity mapping between virtual and physical memory.

In contrast, in the workstation world, most Operating Systems (Linux/Windows) use an "overlapped" memory model. For example, any program you launch (a Process) in Windows will have a start address of 0x10000.

How can Windows have 10 processes all running from address 0x10000?

It's because each process uses the MMU to map the virtual address 0x10000 to a different physical address. So P1 could have 0x10000 = 0x10000 while P2 has 0x10000 = 0x40000, etc...

In RAM, the programs are at different physical addresses, but the CPU Virtual address space looks the same for each process.
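You can observe this on a POSIX system with a small demo (hedged sketch): after fork(), parent and child print the same virtual address for the same variable, yet each sees its own value, because that virtual address is backed by a different physical page in each process:

#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int value = 1;
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }

    if (pid == 0) {                 /* child */
        value = 2;                  /* writes its own physical copy */
        printf("child : &value = %p, value = %d\n", (void *)&value, value);
        _exit(0);
    }
    wait(NULL);                     /* parent: the child's write is not visible here */
    printf("parent: &value = %p, value = %d\n", (void *)&value, value);
    return 0;
}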

As far as I know, Windows and standard Linux always use an overlapped model (i.e. they don't have a flat model). It is possible that uClinux or another special kernel could use a flat model.

Now, protection has nothing to do with a flat vs. overlapped model.
I would say that most overlapped-model OSes will use protection so that one process does not affect (i.e. write into the memory of) another one.

With VxWorks 6.x and the introduction of Real-Time Processes, even with a flat memory model, individual RTPs are protected from each other (and kernel apps) by using protection.

If you don't use RTPs and run everything in the VxWorks kernel, then no protection is used.


So, how does protection work (whether for VxWorks RTPs or other OSes' processes)?
Essentially, the RTP/Process exists inside a "memory bubble" with a certain range of (virtual) addresses that contains code, data, heap, and other assorted memory locations.

If the RTP/Process attempts to access a memory location outside its bubble, the MMU generates an exception and the OS (or a signal handler) gets called. The typical result is a segmentation violation / bus exception.
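On a POSIX system this shows up in user code as a SIGSEGV carrying the offending address. A hedged sketch (the wild address 0xdeadbeef and the printf inside the handler are just for the demo; printf is not async-signal-safe):

#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    /* async-signal-safety is ignored here for brevity */
    printf("SIGSEGV: faulting address %p\n", info->si_addr);
    _exit(1);   /* returning would just re-run the faulting instruction */
}

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    volatile int *outside_the_bubble = (int *)(uintptr_t)0xdeadbeef;  /* hypothetical bad address */
    *outside_the_bubble = 42;   /* MMU faults -> kernel -> our handler */
    return 0;
}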

But how can a process send a packet to an Ethernet port if it can't escape its memory bubble? This varies by processor architecture, but essentially the user-side (RTP) socket library (for example) makes a "system call" - a special instruction which switches the CPU to kernel space and supervisor mode. At this point, some sort of device driver (which typically resides in the kernel) runs to push the data to some hardware device. Once that's done, the system call returns and we're back in the RTP/process space running the user code.
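On Linux, for example, you can make that boundary explicit by issuing the system call yourself through the generic syscall() wrapper (the underlying instruction - syscall, int 0x80, svc, ... - depends on the architecture):

#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    const char msg[] = "hello from user space\n";
    /* Trap into the kernel: the write() path, including any driver work,
     * runs in supervisor mode, then execution returns here in user mode. */
    syscall(SYS_write, 1, msg, sizeof msg - 1);
    return 0;
}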

The OS takes care of all the MMU programming, system call handling, etc... This is invisible to the application.

flat memory model vs. real-address mode memory model

Real-address mode also uses segments.

The flat memory model is the intuitive, straightforward memory model used by most non-Intel processors. Most processors support only a single memory model; Intel supports the flat model plus several others for compatibility.

Real-address mode uses segment registers. An address is the offset specified by the programmer plus 16 times the value in a segment register. In some cases, the segment register is implicit in the particular instruction.
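A small worked example of that arithmetic (a C sketch; the function name is made up):

#include <stdint.h>
#include <stdio.h>

/* Real-address mode: linear address = segment * 16 + offset (20-bit result) */
static uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

int main(void)
{
    /* 0x1234:0x5678 -> 0x12340 + 0x5678 = 0x179B8 */
    printf("0x1234:0x5678 -> 0x%05X\n", (unsigned)real_mode_linear(0x1234, 0x5678));
    /* many segment:offset pairs alias the same linear address */
    printf("0x1700:0x09B8 -> 0x%05X\n", (unsigned)real_mode_linear(0x1700, 0x09B8));
    return 0;
}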

Which type of memory model (i.e. flat / segmentation) is used by linux kernel?

Linux generally uses neither. On x86, Linux has separate page tables for userspace processes and the kernel. The userspace page tables do not contain mappings of kernel memory, which makes it impossible for user-space processes to access kernel memory directly.

Technically, "virtual addresses" on x86 pass through segmentation first (converting logical addresses to linear addresses) before being remapped from linear addresses to physical addresses through the page tables. Except in unusual cases, segmentation won't change the resulting physical address in 64-bit mode (segmentation is just used to store traits like the current privilege level, and to enforce features like SMEP).

One well-known "unusual case" is the implementation of thread-local storage by most compilers on x86, which uses the FS or GS segment to give each thread its own offset into the address space. The other segments cannot have non-zero bases in 64-bit mode, and therefore cannot shift addresses through segmentation.
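You can see the TLS case concretely with a plain C thread-local variable (a hedged sketch; thread count and names are arbitrary). On x86-64 Linux, compilers typically address a _Thread_local variable relative to the FS base, so each thread gets its own copy behind the same name:

#include <pthread.h>
#include <stdio.h>

static _Thread_local int counter;   /* one instance per thread, reached via the FS base on x86-64 */

static void *worker(void *arg)
{
    counter = (int)(long)arg;
    printf("thread %d: &counter = %p\n", counter, (void *)&counter);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}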

Segmented Memory vs Flat Memory

If you're only interested in applications running on existing 32/64-bit operating systems, you can simply forget segmented memory. On 32-bit OSes, you can assume that you have 4 GB of “flat” memory space. Flat means that you can manipulate addresses with 32-bit values and registers, as you would expect.

On 16-bit processors, I believe an address was 20 bits wide, and you couldn't store that in a register, so you had to store a base in one register, and to specify an actual address, you had to add an offset to that base. (If I remember correctly, the base was multiplied by 16, then the offset was added to get the actual address.) This means that you could only address 64 KB at once; memory had to be “segmented” into 64 KB blocks.

To be honest, I think the only reason beginners still hear about that is because a lot of old 16-bit tutorials and books are still around. It's really not needed to understand how a program works at the assembly level. Now if you want to learn OS development, that's another story. Since a PC starts up in 16-bit mode, you will need to learn at least enough to be able to activate the flat 32-bit mode.

Just noticed you also asked about real mode vs protected mode. Real mode is the mode that MS-DOS used. Any program had access to any hardware feature; for example, it was common to talk directly to the graphics card's controller to print something. It didn't cause any problem because it wasn't a multitasking OS.

But on any modern OS, normal programs don't access hardware directly; they don't even access memory directly. The OS manages the hardware and decides which process gets to run on the processor(s). It also manages a virtual address space for every process. These features rely on protected mode (with paging), which in its 32-bit form came with the 386, the first 32-bit PC processor.

what is the difference between small memory model and large memory model?

It refers to the very old concept of 16-bit memory models. 32-bit and 64-bit systems know nothing about these memory models.

So, returning to your question: small declares that a pointer lets you address only 64 KB of data or code. Pointers are 16 bits long, and your code and your data each have to fit in a single 64 KB segment. To address another part of memory, you have to explicitly declare a pointer as far. large declares that pointers to code or data are 32 bits wide (segment:offset), so they are far by default.
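A rough model of the difference in modern C (the struct and function names are invented; real 16-bit compilers used near/far keywords and did this implicitly):

#include <stdint.h>
#include <stdio.h>

/* small model: a "near" pointer is just a 16-bit offset into the current segment */
typedef uint16_t near_ptr;

/* large model: every pointer carries a segment and an offset (a "far" pointer) */
typedef struct { uint16_t segment; uint16_t offset; } far_ptr;

static uint32_t far_to_linear(far_ptr p)
{
    return ((uint32_t)p.segment << 4) + p.offset;   /* 20-bit real-mode address */
}

int main(void)
{
    far_ptr p = { 0x2000, 0x0010 };
    printf("sizeof(near_ptr) = %zu, sizeof(far_ptr) = %zu\n",
           sizeof(near_ptr), sizeof(far_ptr));
    printf("0x%04X:0x%04X -> linear 0x%05X\n",
           p.segment, p.offset, (unsigned)far_to_linear(p));
    return 0;
}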

Hopefully you won't spend too long on these questions, since this is an obsolete concept.

Linux memory segmentation

Yes, Linux uses paging so all addresses are always virtual. (To access memory at a known physical address, Linux keeps all physical memory 1:1 mapped to a range of kernel virtual address space, so it can simply index into that "array" using the physical address as the offset. Modulo complications for 32-bit kernels on systems with more physical RAM than kernel address space.)
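Conceptually that 1:1 "direct map" lookup is just an add or subtract. A hedged sketch modeled on Linux's __va()/__pa() macros (the base constant below is the usual x86-64 non-KASLR value, but treat it as an assumption):

#include <stdint.h>

/* Assumed start of the kernel's direct map of all physical RAM
 * (roughly PAGE_OFFSET / page_offset_base on x86-64). */
#define DIRECT_MAP_BASE 0xffff888000000000UL

/* physical -> kernel virtual: just add the direct-map base ... */
static inline void *phys_to_virt_sketch(uint64_t phys)
{
    return (void *)(phys + DIRECT_MAP_BASE);
}

/* ... and kernel virtual (inside the direct map) -> physical: subtract it */
static inline uint64_t virt_to_phys_sketch(const void *virt)
{
    return (uint64_t)(uintptr_t)virt - DIRECT_MAP_BASE;
}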

This linear address space constituted of pages, is split into four segments

No, Linux uses a flat memory model. The base and limit for all 4 of those segment descriptors are 0 and -1 (unlimited), i.e. they all fully overlap, covering the entire 32-bit virtual linear address space.

So the red part consists of two segments __KERNEL_CS and __KERNEL_DS

No, this is where you went wrong. x86 segment registers are not used for segmentation; they're x86 legacy baggage that's only used for CPU mode and privilege-level selection on x86-64. Instead of adding new mechanisms for that and dropping segments entirely for long mode, AMD just neutered segmentation in long mode (base fixed at 0 like everyone used in 32-bit mode anyway) and kept using segments only for machine-config purposes that are not particularly interesting unless you're actually writing code that switches to 32-bit mode or whatever.

(Except you can set a non-zero base for FS and/or GS, and Linux does so for thread-local storage. But this has nothing to do with how copy_from_user() is implemented, or anything. It only has to check the pointer value, without reference to any segment or to the CPL / RPL of a segment descriptor.)

In 32-bit legacy mode, it is possible to write a kernel that uses a segmented memory model, but none of the mainstream OSes actually did that. Some people wish that had become a thing, though, e.g. see this answer lamenting x86-64 making a Multics-style OS impossible. But this is not how Linux works.

Linux is a higher-half kernel (https://wiki.osdev.org/Higher_Half_Kernel), where kernel pointers have one range of values (the red part) and user-space addresses are in the green part. The kernel can simply dereference user-space addresses if the right user-space page tables are mapped; it doesn't need to translate them or do anything with segments. This is what it means to have a flat memory model. (The kernel can use "user" page-table entries, but not vice versa.) For x86-64 specifically, see https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt for the actual memory map.
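And checking that a user-supplied pointer really points into the green part is just an address-range comparison, with no segments involved. A hedged sketch of the idea behind access_ok() (the limit constant is an assumed x86-64 TASK_SIZE-style value, not the kernel's actual definition):

#include <stdbool.h>
#include <stdint.h>

/* Assumed top of the user half of the address space on x86-64 (roughly TASK_SIZE). */
#define USER_ADDR_LIMIT 0x00007ffffffff000UL

/* "Is [ptr, ptr+len) entirely inside the user (green) half?" - just a range
 * check on the flat virtual address, written to avoid overflow. */
static bool user_range_ok(const void *ptr, uint64_t len)
{
    uint64_t addr = (uint64_t)(uintptr_t)ptr;
    return len <= USER_ADDR_LIMIT && addr <= USER_ADDR_LIMIT - len;
}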


The only reason those 4 GDT entries all need to be separate is for privilege-level reasons, and because data vs. code segment descriptors have different formats. (A GDT entry contains more than just the base/limit; those are the parts that need to be different. See https://wiki.osdev.org/Global_Descriptor_Table)

And especially https://wiki.osdev.org/Segmentation#Notes_Regarding_C which describes how and why the GDT is typically used by a "normal" OS to create a flat memory model, with a pair of code and data descriptors for each privilege level.

For a 32-bit Linux kernel, only gs gets a non-zero base for thread-local storage (so addressing modes like [gs: 0x10] will access a linear address that depends on the thread that executes it). Or in a 64-bit kernel (and 64-bit user-space), Linux uses fs. (Because x86-64 made GS special with the swapgs instruction, intended for use with syscall for the kernel to find the kernel stack.)

But anyway, the non-zero bases for FS or GS don't come from a GDT entry; they're set with the wrfsbase / wrgsbase instructions (or, on CPUs that don't support those, with a write to an MSR).
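For example, on x86-64 Linux a thread can ask the kernel to set its GS base with the arch_prctl system call (the kernel then uses wrgsbase or the MSR on its behalf). This sketch assumes x86-64 Linux and just sets the base and reads it back; note that no GDT entry is touched:

#define _GNU_SOURCE
#include <asm/prctl.h>      /* ARCH_SET_GS, ARCH_GET_GS */
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    static long per_thread_blob[4] = { 42, 0, 0, 0 };
    unsigned long base = 0;

    /* Point this thread's GS base at our blob... */
    syscall(SYS_arch_prctl, ARCH_SET_GS, (unsigned long)per_thread_blob);
    /* ...and read the base back from the kernel. */
    syscall(SYS_arch_prctl, ARCH_GET_GS, (unsigned long)&base);

    printf("GS base is now %p (blob at %p)\n", (void *)base, (void *)per_thread_blob);
    return 0;
}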


but what are those flags, namely 0xc09b, 0xa09b and so on ? I tend to believe they are the segments selectors

No, segment selectors are indices into the GDT. The kernel is defining the GDT as a C array, using designated-initializer syntax like [GDT_ENTRY_KERNEL32_CS] = initializer_for_that_selector.

(Actually the low 2 bits of a selector, i.e. of a segment register value, are the requested privilege level, and bit 2 is the GDT/LDT table indicator. So GDT_ENTRY_DEFAULT_USER_CS should be __USER_CS >> 3.)
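A hedged sketch of the shape of that code (simplified stand-ins for the kernel's gdt_page array and GDT_ENTRY_INIT macro; the index values and flag words are modeled on the x86-64 kernel source but treat them as illustrative):

#include <stdint.h>

/* Invented, simplified stand-in for a GDT entry (the real one packs base/limit/flags into 8 bytes). */
struct gdt_entry_sketch { uint16_t flags; uint32_t base; uint32_t limit; };

/* Indices modeled on the kernel's GDT layout. */
enum {
    GDT_ENTRY_KERNEL_CS       = 2,
    GDT_ENTRY_KERNEL_DS       = 3,
    GDT_ENTRY_DEFAULT_USER_DS = 5,
    GDT_ENTRY_DEFAULT_USER_CS = 6,
};

/* Designated initializers: "slot N of the array gets this descriptor".
 * The flag words are the 0xa09b-style values the question asks about. */
static struct gdt_entry_sketch gdt[16] = {
    [GDT_ENTRY_KERNEL_CS]       = { 0xa09b, 0, 0xfffff },  /* 64-bit ring-0 code */
    [GDT_ENTRY_KERNEL_DS]       = { 0xc093, 0, 0xfffff },  /* ring-0 data        */
    [GDT_ENTRY_DEFAULT_USER_DS] = { 0xc0f3, 0, 0xfffff },  /* ring-3 data        */
    [GDT_ENTRY_DEFAULT_USER_CS] = { 0xa0fb, 0, 0xfffff },  /* 64-bit ring-3 code */
};

/* A selector is (index << 3) | table-indicator | RPL, so e.g.
 * __USER_CS == (GDT_ENTRY_DEFAULT_USER_CS << 3) | 3, and index == selector >> 3. */
#define SELECTOR(index, rpl) ((uint16_t)(((index) << 3) | (rpl)))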

mov ds, eax triggers the hardware to index the GDT, not linear search it for matching data in memory!

GDT data format:

You're looking at x86-64 Linux source code, so the kernel will be in long mode, not protected mode. We can tell because there are separate entries for USER_CS and USER32_CS. The 32-bit code segment descriptor will have its L bit cleared. The current CS segment descriptor is what puts an x86-64 CPU into 32-bit compat mode vs. 64-bit long mode. To enter 32-bit user-space, an iret or sysret will set CS:RIP using a user-mode 32-bit code-segment selector.

I think you can also have the CPU in 16-bit compat mode (like compat mode not real mode, but the default operand-size and address size are 16). Linux doesn't do this, though.

Anyway, as explained in https://wiki.osdev.org/Global_Descriptor_Table and Segmentation,

Each segment descriptor contains the following information:

  • The base address of the segment
  • The default operation size in the segment (16-bit/32-bit)
  • The privilege level of the descriptor (Ring 0 -> Ring 3)
  • The granularity (Segment limit is in byte/4kb units)
  • The segment limit (The maximum legal offset within the segment)
  • The segment presence (Is it present or not)
  • The descriptor type (0 = system; 1 = code/data)
  • The segment type (Code/Data/Read/Write/Accessed/Conforming/Non-Conforming/Expand-Up/Expand-Down)

These are the extra bits. I'm not particularly interested in which bits are which because I (think I) understand the high level picture of what different GDT entries are for and what they do, without getting into the details of how that's actually encoded.

But if you check the x86 manuals or the osdev wiki, and the definitions for those init macros, you should find that they result in a GDT entry with the L bit set for 64-bit code segments, cleared for 32-bit code segments. And obviously the type (code vs. data) and privilege level differ.

Assembly Segmented Model 32bit Memory Limit

Edit: My answer assumes that by "4GB limit" you are referring to the maximum size of linear (virtual) address space, rather than of physical address space. As explained in the comments below, the latter is not actually limited to 4GB at all - even when using a flat memory model.


Repeating your quote, with emphasis:

the logical address space consists of as many as 16,383 segments of up to 4 gigabytes each

Now, quoting from "Intel® 64 and IA-32 Architectures Software Developer's Manual Volume 1: Basic Architecture" (PDF available here):

Internally, all the segments that are defined for a system are mapped into the processor’s linear address space.

It is this linear address space which (on a 32-bit processor) is limited to 4 GB. So, a segmented memory model would still be subject to that limit.

How are segment registers unused in protected mode memory addressing in modern x86 systems?

A funny combination of them, perhaps. What happens from a high-level (if it can be called high) perspective is that most segments are configured with a base of 0 and a limit of 0xFFFFFFFF (fs and gs may be used for special purposes though).

But configuring a segment with a non-zero base may have performance consequences. For example on AMD K8 and K10, configuring the code segment to have a non-zero base increases the latency of branch mispredictions by two cycles, and a general address costs a cycle longer to compute if a segment with a non-zero base is involved. This may mean that the processor has a special fast-path for segments with a base of zero, so that the base does not participate in the calculation of the address at all rather than adding zero (which would still take time).

I could find no reference to this effect existing on any other µarch, but it may simply not have been explored, because non-zero segment bases are relatively rare, especially in performance-sensitive code. In a quick test, a similar effect seems to exist on Haswell, with this code (some trivial set-up omitted):

.loop:
mov rax, [rsp+rax]      ; dependent load chain; the set-up is assumed to leave 0 at [rsp] so rax stays 0
add ecx, 1
jnz .loop               ; ecx is assumed pre-loaded with a negative iteration count

Running two cycles per iteration faster (5 cycles/iteration) than this code (7 cycles/iteration):

.loop:
mov rax, [gs:rax]       ; same dependent load, but with a gs override whose base is assumed non-zero from the set-up
add ecx, 1
jnz .loop

Possibly that means that more Intel µarchs are affected as well, though perhaps this is inaccurate since no segment override is involved in the first loop at all (it's 64-bit code) and perhaps that is what mattered.


