Why Does High Memory Not Exist for 64-Bit CPUs

Why does high memory not exist for 64-bit CPUs?

A 32-bit system can only address 4 GB of memory. In Linux this is divided into 3 GB of user space and 1 GB of kernel space. The kernel can permanently map only part of physical RAM into that 1 GB window; physical memory beyond the permanently mapped region is called "high memory", and to reach it the kernel has to temporarily map and unmap pages on demand, which incurs a fairly significant performance penalty. Hence the "high memory problem".
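As a rough illustration, here is a minimal kernel-side sketch of that temporary mapping, using Linux's long-standing kmap()/kunmap() API (newer kernels prefer kmap_local_page()). The function itself and the assumption that `page` refers to a high-memory page are illustrative only, not taken from any real driver:

    #include <linux/highmem.h>

    /* Illustrative only: touch one byte of a high-memory page. */
    static void touch_highmem_page(struct page *page)
    {
        /* Map the page into the kernel's small address window... */
        unsigned char *vaddr = kmap(page);

        vaddr[0] = 0;   /* ...use it through the temporary mapping... */

        /* ...and unmap it so the window can be reused. This map/unmap
         * churn is the performance penalty 32-bit kernels pay. */
        kunmap(page);
    }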

A 64-bit system can address a huge amount of memory (up to 16 EB with full 64-bit addresses), so this issue does not occur there.

Why can't the OS use the entire 64 bits for addressing? Why only 48 bits?

"If we now decide to use only 48 of the 64 bits for addressing". Why? & Why only 48bits? Why not some other number?

System architects make tradeoffs. 256 TB seems like more than enough room for one process's address space. Remember that virtual address != physical address, and generally speaking, each process has its own address space.

As long as pointers are 64 bits, this is more of a performance/capability issue than anything else. If and when 48 bits becomes a limitation, the OS could be tweaked to use more bits of the 64-bit address space without breaking application compatibility. For now, the architects are just buying themselves a very comfortable amount of time.

It may have to do with processor-side virtual addressing capabilities, as many processors now have memory management units to handle the virtual -> physical memory mapping.

How can one know the number of address pins (= the size of the address bus) for a processor? The specifications at http://ark.intel.com don't include this.

This is for the most part irrelevant; it's just a way for a processor to implement various physical addressing schemes. A 64-bit processor could drive its complete address space with external address/data buses of 64, 32, 16, 8, 4, 2, or even 1 address pin if the bus is synchronous and the address bits are multiplexed in time. Again, virtual address != physical address: 64-bit virtual addressing could be implemented with 48-bit or 32-bit physical addresses (you would just be limited to 2^48 or 2^32 words of memory).

Update: if you really want to know, you have to look at the datasheet of each processor in question. E.g., for the Intel Core 2 Duo, section 4.2 of the datasheet describes the signals: the address bus is 36 bits wide (but is really 33 signal lines; the data width is 64 bits = 8 bytes, so the other 3 lines are probably unnecessary with proper data alignment).

Well, I'm just a regular PC user and programmer. It's just hard for me to believe that 32-bit addressing, i.e. a 4 GB (2 GB/3 GB, to be more precise) address space per process, is a limit. If you have really encountered this limit, please give me an example.

Two words: memory-mapped files.
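To make that concrete, here is a small POSIX sketch; the file name and its assumed 8 GB size are hypothetical. Mapping a file of that size is routine in a 64-bit process, while a 32-bit process simply has no contiguous address range to offer:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "huge.dat";   /* hypothetical 8 GB data file */
        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); close(fd); return 1; }

        /* In a 32-bit process this mmap fails once the file size
           approaches the ~2-3 GB of usable address space; in a
           64-bit process it is routine. */
        unsigned char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

        printf("first byte: %u\n", p[0]);
        munmap(p, st.st_size);
        close(fd);
        return 0;
    }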

CLR / High memory consumption after switching from 32-bit process to 64-bit process

There are a few things to consider:

1) You mentioned you're using server GC mode. In server GC mode, the CLR creates one heap for every CPU core on the machine, which is more efficient for multi-threaded processing in server processes, e.g. ASP.NET processes. Each heap has two segments: one for small objects, one for large objects. Each segment starts with 4 GB of reserved memory. Basically, server GC mode uses more memory on the system in exchange for overall system performance.

2) Pointers are bigger on 64-bit, of course.

3) Foreground Gen2 GC becomes very expensive in server GC mode because the heaps are much larger, so the CLR tries hard to reduce the number of foreground Gen2 collections, sometimes using background Gen2 GC instead.

4) Depending on usage, fragmentation can become a real issue. I've seen heaps with 98% fragmentation (98% of the heap is free blocks).

To really solve your problem, you need to get an ETW trace + a memory dump, and then use tools like PerfView for detailed analysis.

What is the memory usage overhead for a 64-bit application?

It depends on the programming style (and on the language, but you are referring to C).

  • If you work a lot with pointers (or you have a lot of references in some languages), RAM consumption goes up.
  • If you use a lot of data with fixed size, such as double or int32_t, RAM consumption does not go up.
  • For types like int or long, it depends on the data model, and there are differences between Linux and Windows. In short, Windows uses LLP64, meaning that long long and pointers are 64 bits, while Linux uses LP64, where long is 64 bits as well. Other data models might make int or even short 64 bits too, but these are quite uncommon.
  • float and double should remain the same in size in all cases.

So you see it strongly depends on the usage of the data types.
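A quick way to see which data model you are on is to print the type sizes; this sketch just assumes a hosted C99 compiler:

    #include <stdio.h>

    int main(void)
    {
        /* LP64  (Linux 64-bit):   int=4  long=8  long long=8  void*=8 */
        /* LLP64 (Windows 64-bit): int=4  long=4  long long=8  void*=8 */
        printf("int       : %zu bytes\n", sizeof(int));
        printf("long      : %zu bytes\n", sizeof(long));
        printf("long long : %zu bytes\n", sizeof(long long));
        printf("void *    : %zu bytes\n", sizeof(void *));
        printf("double    : %zu bytes\n", sizeof(double));
        return 0;
    }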

Why is the CPU responsible for setting the dirty and accessed bits while the OS is responsible for clearing them?

The accessed bit couldn't be set by the kernel unless it intercepted every memory access, which would ruin performance. The same goes for the dirty bit: it's far easier, simpler, and cheaper for the CPU to set it, since the CPU is the one actually doing the write.

Clearing the dirty bit isn't done by the CPU because it's a policy decision that belongs to paging and swapping, which only the OS can handle: only the OS knows when a dirty page has been written back to disk and can be considered clean again.
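As a sketch of this division of labor, here are the real x86 accessed/dirty bit positions in a page-table entry, with hypothetical OS-side helpers that clear them. The pte_t type and function names are made up for illustration, and a real kernel would also flush the corresponding TLB entry:

    #include <stdint.h>

    #define PTE_ACCESSED (1u << 5)  /* set by the CPU on any access to the page */
    #define PTE_DIRTY    (1u << 6)  /* set by the CPU on any write to the page  */

    typedef uint64_t pte_t;

    /* Called by the OS after it has written a dirty page back to disk;
       only the OS knows when the on-disk copy is up to date again. */
    static void mark_page_clean(pte_t *pte)
    {
        *pte &= ~(pte_t)PTE_DIRTY;
    }

    /* Called periodically by page-replacement code (e.g., a clock
       algorithm): clearing the accessed bit lets the OS detect whether
       the page gets referenced again. */
    static void clear_accessed(pte_t *pte)
    {
        *pte &= ~(pte_t)PTE_ACCESSED;
    }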

How does a 64-bit computer change one byte in memory?

On x86-64, the hardware will read one cache line, modify the byte in cache, and eventually that cache line will be written back to memory.

The main reason for the write-back to happen is that the CPU needs the cache line for other data. There are explicit instructions to force the write-back, but a C compiler would be unlikely to use them: forcing an unnecessary write just slows the CPU down.
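For illustration, the single-byte store below turns into a read-modify-write of a whole 64-byte cache line in hardware. _mm_clflush is one of those explicit write-back instructions, shown here only to make the point (an x86 target with SSE2 is assumed):

    #include <immintrin.h>
    #include <stdio.h>

    static unsigned char buffer[64];

    int main(void)
    {
        buffer[3] = 0xFF;        /* CPU fetches the line into cache and
                                    modifies just this byte there */

        _mm_clflush(&buffer[3]); /* force the dirty line back to memory;
                                    normal code never needs this, and it
                                    only slows things down */

        printf("%u\n", buffer[3]);
        return 0;
    }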

Why do x86-64 systems have only a 48 bit virtual address space?

Because that's all that's needed. 48 bits give you an address space of 256 terabytes. That's a lot. You're not going to see a system that needs more than that any time soon.

So CPU manufacturers took a shortcut. They use an instruction set which allows a full 64-bit address space, but current CPUs just use the lower 48 bits. The alternative was wasting transistors on handling a bigger address space that wasn't going to be needed for many years.

So once we get near the 48-bit limit, it's just a matter of releasing CPUs that handle the full address space; it won't require any changes to the instruction set, and it won't break compatibility.
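Concretely, current x86-64 CPUs enforce a "canonical address" rule: bits 63..48 must be copies of bit 47, so a 64-bit pointer carries only 48 bits of information. A minimal sketch of that check (the function is illustrative, not any OS API; an arithmetic right shift is assumed, as on all mainstream compilers):

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static bool is_canonical(uint64_t vaddr)
    {
        /* Sign-extend from bit 47 and see if anything changed. */
        int64_t extended = (int64_t)(vaddr << 16) >> 16;
        return (uint64_t)extended == vaddr;
    }

    int main(void)
    {
        printf("%d\n", is_canonical(0x00007fffffffffffULL)); /* 1: top of lower half   */
        printf("%d\n", is_canonical(0xffff800000000000ULL)); /* 1: start of upper half */
        printf("%d\n", is_canonical(0x0000800000000000ULL)); /* 0: non-canonical hole  */
        return 0;
    }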

Restrictions on Memory Available to Applications

Firstly, I think it will be really helpful if you dig a little into operating system concepts such as "process address space" and how processes and virtual memory work at a basic level.
This is a good starting point: http://duartes.org/gustavo/blog/post/anatomy-of-a-program-in-memory

Each process has its own address space and thereby its own heap. The heap is the part of the "virtual address space" that caters to dynamic memory allocation. Its size also depends on the range of addressable memory your OS can accommodate; for instance, on a 32-bit system the addressable range can never exceed 4 GB.

Adding more RAM can make some difference (not a night and day one though) by reducing things like thrashing.

64-bit Performance Advantages

In order to take advantage of the 64-bit architecture of the latest CPUs, you have to:

  • use a 64-bit CPU and OS
  • develop specifically for 64 bits using a 64-bit API - the build has to target the 64-bit architecture all the way down to the most basic code working with the CPU registers (normally written in assembler) to take advantage of the extra registers (see the sketch after this list)
  • develop an application that will really benefit from the extra registers - WinRAR is an application that takes full advantage of them because it does a lot of computation with complex algorithms. If you instead write an application with very simple algorithms, it will not need the extra registers and it will not run faster on 64 bit
  • also take into consideration that a CPU register takes up its full 64 bits even if a value doesn't need all of them, and pointers double in size; therefore compiling a small application for 64 bit purely to get "optimized" code will just not work - that app can take up considerably more RAM than its 32-bit build (pointer-heavy data can approach twice the size) and may even be slower. Programming in 64 bit makes sense for applications that use heavy algorithms or need to allocate huge amounts of memory (4 GB is the limit for a 32-bit app)
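As a small sketch of where "developing for 64 bits" starts, the compile-time macros below are the ones GCC/Clang and MSVC actually predefine for x86-64 targets:

    #include <stdio.h>

    int main(void)
    {
    #if defined(__x86_64__) || defined(_M_X64)
        puts("x86-64 build: 16 general-purpose registers, 64-bit pointers.");
    #else
        puts("32-bit x86 build: 8 general-purpose registers.");
    #endif
        printf("pointer size: %zu bytes\n", sizeof(void *));
        return 0;
    }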

