Does gcc, icc, or Microsoft's C/C++ compiler support or know anything about NUMA?
The Linux kernel knows about NUMA and will try to give your process pages from memory local to the current CPU (source: U. Drepper, "What Every Programmer Should Know About Memory").
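To see which CPU and NUMA node "the current CPU" refers to at a given moment, a process can ask the kernel directly. The following is a minimal sketch, assuming Linux and the raw `getcpu` system call; the function name `current_cpu_and_node` is made up for this example, and the answer can be stale immediately if the scheduler migrates the thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes syscall() */
#endif
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical helper: ask the kernel which CPU and NUMA node the
 * calling thread is running on right now.
 * Returns 0 on success, -1 on error (errno is set). */
int current_cpu_and_node(unsigned *cpu, unsigned *node)
{
    /* The third argument is unused by modern kernels and may be NULL. */
    return (int)syscall(SYS_getcpu, cpu, node, NULL);
}

/* Print the current location, or an error if the call fails. */
void print_location(void)
{
    unsigned cpu = 0, node = 0;
    if (current_cpu_and_node(&cpu, &node) == 0)
        printf("running on CPU %u, NUMA node %u\n", cpu, node);
    else
        perror("getcpu");
}
```

On a machine with a single memory node the reported node will simply be 0 for every CPU.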
NUMA memory regions allocation in Windows 7
Windows will allocate memory local to the requesting thread; however, Microsoft does not specify what "local" means. It could be one of three things: the thread's ideal processor, the thread's processor affinity mask, or the thread's current processor (I forget what the current implementation is).
In essence, the answer is yes; however, a common gotcha is allocating all memory from a "controller thread" that isn't affinitized, so the memory ends up near the controller thread rather than near the worker threads that have a specific affinity.
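The usual way around this gotcha is to have each worker set its own affinity first and only then allocate and touch its memory. The sketch below illustrates the idea with Linux calls (`sched_setaffinity`, `malloc`, `memset`), since the same pitfall exists under Linux's default local allocation policy; on Windows the corresponding calls would be along the lines of `SetThreadAffinityMask` and `VirtualAllocExNuma`. The helper names `pin_to_cpu` and `worker_init` are hypothetical, and the placement described in the comments assumes the default policy has not been overridden.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes cpu_set_t, CPU_SET, sched_setaffinity */
#endif
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/* Pin the calling thread to a single CPU.
 * Returns 0 on success, -1 on error (errno is set). */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Hypothetical helper, meant to be called by each worker thread itself
 * at startup, NOT by the controller thread on the worker's behalf.
 * Returns a zeroed buffer, or NULL on failure. */
char *worker_init(int cpu, size_t bytes)
{
    if (pin_to_cpu(cpu) != 0)
        return NULL;

    char *buf = malloc(bytes);
    if (buf == NULL)
        return NULL;

    /* Writing to the buffer faults its pages in while the thread is
     * already pinned. For fresh pages (large malloc requests are
     * typically served by new mappings) the kernel then prefers memory
     * on the node of the pinned CPU. */
    memset(buf, 0, bytes);
    return buf;
}
```

A worker thread would call `worker_init(my_cpu, size)` as the first thing in its thread function; the controller thread only hands out CPU numbers and sizes, never the memory itself.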
In NUMA, does each CPU also have a local I/O controller similar to local RAM?
Nodes in a NUMA system have local RAM and can have local I/O. The latter depends heavily on how the system is configured at the hardware level. If memory is exchanged between nodes through I/O accesses, then each CPU must have its own I/O controller.
Here is an example of an (old) NUMA system with local I/O for each node:
http://lse.sourceforge.net/numa/older_stuff/meetings/mtg.2001.07.25/minutes.html