Allocate Writable Memory in the .text Section

If you want to allocate memory at runtime, reserve some space on the stack with sub rsp, 4096 (or however much you need). Or make an mmap system call, or call malloc if you linked against libc.
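For reference, a minimal C-level sketch of the latter two options (sizes are placeholders and error checks are mostly omitted):

#include <stdlib.h>
#include <sys/mman.h>

int main(void)
{
    /* one page of anonymous read/write memory straight from the kernel */
    void *a = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    /* or an ordinary heap allocation, which requires linking against libc */
    void *b = malloc(4096);
    /* ... use a and b as ordinary writable scratch memory ... */
    free(b);
    munmap(a, 4096);
    return 0;
}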


If you want to test shellcode / self-modifying code, or have some other reason to have a writable .text:

Link with ld --omagic or gcc -Wl,--omagic. From the ld(1) man page:

-N
--omagic
    Set the text and data sections to be readable and writable. Also, do not page-align the data segment, and disable linking against shared libraries. If the output format supports Unix style magic numbers, mark the output as "OMAGIC".

See also: "How can I make GCC compile the .text section as writable in an ELF binary?"


Or probably you can use a linker script. It might also be possible to use NASM section attributes to declare a custom section that has read, write, and exec permission.

There's normally (outside of shellcode testing) no reason to do any of this: just put your static storage in .data or .bss, and your static read-only data in .rodata like a normal person.

Putting read/write data near code is actively bad for performance: possible pipeline nukes from the hardware that detects self-modifying code, and at the very least it pollutes the iTLB with data and the dTLB with code if you have a page that includes some of both instead of being full of one or the other.

If V8 uses the code or text memory type, or if everything is in the heap/stack

The below is true for the major operating systems running on the major CPUs in common use today. Things will differ on old or some embedded operating systems (in particular things are a lot simpler on operating systems without virtual memory) or when running code without an OS or on CPUs with no support for memory protection.

The picture in your question is a bit of a simplification. One thing it does not show is that (virtual) memory is made up of pages provided to you by the operating system. Each page has its own permissions controlling whether your process can read, write and/or execute the data in that page.

The text section of a binary will be loaded onto pages that are executable, but not writable. The read-only data section will be loaded onto pages that are neither writable nor executable. All other memory in your picture ((un)initialized data, heap, stack) will be stored on pages that are writable, but not executable.
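One way to see these per-page permissions in practice (assuming Linux) is to dump the process's own memory map; the executable's segments show up with r-xp (text), r--p (read-only data), and rw-p (data/bss) permissions, alongside the heap and stack:

#include <stdio.h>

int main(void)
{
    /* print /proc/self/maps: one line per mapping, permissions in the second column */
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f) { perror("fopen"); return 1; }
    int c;
    while ((c = getc(f)) != EOF)
        putchar(c);
    fclose(f);
    return 0;
}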

These permissions prevent security flaws (such as buffer overruns) that could otherwise allow attackers to execute arbitrary code by making the program jump into code provided by the attacker or letting the attacker overwrite code in the text section.

Now the problem with these permissions, with regards to JIT compilation, is that you can't execute your JIT-compiled code: if you store it on the stack or the heap (or within a global variable), it won't be on an executable page, so the program will crash when you try to jump into the code. If you try to store it in the text area (by making use of left-over memory on the last page or by overwriting parts of the JIT compiler's code), the program will crash because you're trying to write to read-only memory.

But thankfully operating systems allow you to change the permissions of a page (on POSIX systems this can be done using mprotect and on Windows using VirtualProtect). So your first idea might be to store the generated code on the heap and then simply make the containing pages executable. However, this can be somewhat problematic: VirtualProtect and some implementations of mprotect require a pointer to the beginning of a page, but your array does not necessarily start at the beginning of a page if you allocated it using malloc (or new or your language's equivalent). Further, your array may share a page with other data, which you don't want to be executable.
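As an illustration of the alignment issue, this is roughly what the malloc-plus-mprotect route would have to do (the helper name is made up); note that it still makes every byte sharing those pages executable, which is exactly the problem just described:

#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/* Round an arbitrary buffer out to whole pages before changing permissions. */
int make_region_executable(void *p, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)p & ~((uintptr_t)page - 1);                       /* round down */
    uintptr_t end   = ((uintptr_t)p + len + page - 1) & ~((uintptr_t)page - 1);    /* round up   */
    return mprotect((void *)start, end - start, PROT_READ | PROT_EXEC);
}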

To prevent these issues, you can use functions, such as mmap on Unix-like operating systems and VirtualAlloc on Windows, that give you pages of memory "to yourself". These functions will allocate enough pages to contain as much memory as you requested and return a pointer to the beginning of that memory (which will be at the beginning of the first page). These pages will not be available to malloc. That is, even if your array is significantly smaller than the size of a page on your OS, the page will only be used to store your array; a subsequent call to malloc will not return a pointer to memory in that page.

So the way that most JIT compilers work is that they allocate read-write memory using mmap or VirtualAlloc, copy the generated machine instructions into that memory, use mprotect or VirtualProtect to make the memory executable and non-writable (for security reasons, you never want memory to be executable and writable at the same time if you can avoid it), and then jump into it. In terms of its (virtual) address, the memory will be part of the heap's area of memory, but it will be separate from the heap in the sense that it won't be managed by malloc and free.
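Putting those steps together, a minimal sketch of the whole sequence might look like this (Linux/x86-64 assumed; the hard-coded bytes are simply mov eax, 42 followed by ret):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };  /* mov eax,42 ; ret */

    /* 1. allocate read/write pages that belong only to us */
    void *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) { perror("mmap"); return 1; }

    /* 2. copy the generated machine code in */
    memcpy(mem, code, sizeof code);

    /* 3. flip the pages to read+execute (W^X) */
    if (mprotect(mem, 4096, PROT_READ | PROT_EXEC)) { perror("mprotect"); return 1; }

    /* 4. jump into it */
    int (*fn)(void) = (int (*)(void))mem;
    printf("%d\n", fn());

    munmap(mem, 4096);
    return 0;
}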

x86 Assembly: Data in the Text Section

The top-level answer is that the x86 machine is not aware of ".text" and ".data" sections. A modern x86 CPU provides the OS with tools to create a virtual address space with specific rights (such as read-only, no-exec, and read-write).

But the content of memory is just bytes, which can be read, written, or executed. The CPU has no means of guessing which parts of memory are data and which are code, and it will happily execute anything you point it at.

Those .text/.data/... sections are logical constructs supported by the compiler, linker, and OS (executable loader), which cooperate to prepare the runtime environment for the code in such a way that .text is read-only nowadays; writable variables need to go into .data, .bss, or similar. A non-executable stack may also be provided by some OSes and configurations.

The OS usually also has an API so an application can change the rights of a memory mapping, or allocate further memory with the attributes it needs (for example, JIT compilers would get nowhere if they could not first write compiled code into memory and then execute it).

So if you use your code example on common Linux in the default config, it will very likely segfault, as .text will be read-only. Many of those "exploits" books have a whole dedicated chapter on how to compile and set up the runtime environment for their examples in such a way that several protections (ASLR, NX, ...) are switched off, thus allowing their samples to work.

A real exploit in the wild will usually use some bug or weak spot in the application to inject its payload somewhere. Depending on the hostility of that "somewhere", the exploit may have to first elevate its rights to get writable+executable memory (or it must be written so that it does not write into code areas and uses other memory for its variables), unless the app itself already provides a friendly environment for the exploit due to its internal needs.

Keep in mind that the OS and applications are not written to make sure exploits will work; quite the opposite. Each exploit usually targets a particular vulnerable version of an application on a particular version of the OS, and it is expected to break with a later security update. So if you know you have writable and executable memory, you just exploit it as is, without worrying about what will happen in the next version, when the app is fixed to keep its code memory read-only.
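To make the rights-changing API mentioned above concrete, here is a hypothetical sketch (Linux/x86-64, GCC, no CET/IBT enforcement assumed) that uses mprotect to make the page holding one of the program's own functions writable and then patches its first byte to 0xC3 (ret); this is exactly the kind of thing the default read-only .text mapping is meant to prevent:

#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/mman.h>

/* noinline so the calls below really go through this out-of-line copy */
__attribute__((noinline)) static void patch_me(void)
{
    puts("original code ran");
}

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);
    uintptr_t base = (uintptr_t)patch_me & ~((uintptr_t)page - 1);

    patch_me();                               /* prints the message */

    /* make the code page writable as well as readable/executable */
    if (mprotect((void *)base, (size_t)page,
                 PROT_READ | PROT_WRITE | PROT_EXEC)) {
        perror("mprotect");
        return 1;
    }

    /* overwrite the first instruction with ret (undefined behaviour as far as
       ISO C is concerned; it only shows that the mapping's rights were the barrier) */
    *(volatile unsigned char *)(uintptr_t)patch_me = 0xC3;

    patch_me();                               /* now returns immediately */
    return 0;
}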

How do you allocate memory on the heap without using libc in Linux

Well, a great opportunity to use strace and the debugger. From man 2 syscall:

   Arch/ABI      arg1  arg2  arg3  arg4  arg5  arg6  arg7  Notes
   ──────────────────────────────────────────────────────────────
   x86-64        rdi   rsi   rdx   r10   r8    r9    -

gdb a.out:

(gdb) b syscalls.s:5
Breakpoint 1 at 0x1050: file syscalls.s, line 5.
(gdb) r
Starting program: /dev/shm/.1000.home.tmp.dir/a.out

Breakpoint 1, mmap () at syscalls.s:5
5           syscall
(gdb) info registers
rax            0x9                 9
rbx            0x0                 0
rcx            0x22                34                # here it is
rdx            0x3                 3
rsi            0x1000              4096
rdi            0x0                 0
rbp            0x7fffffffd968      0x7fffffffd968
rsp            0x7fffffffd950      0x7fffffffd950
r8             0xffffffffffffffff  -1
r9             0x0                 0
r10            0x555555554000      93824992231424    # WRONG!
r11            0x206               518
r12            0x555555555000      93824992235520
r13            0x7fffffffd970      140737488345456
r14            0x0                 0
r15            0x0                 0
rip            0x555555555050      0x555555555050 <mmap+7>
eflags         0x202               [ IF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0

We see that the value in r10 is garbage, while 0x22 (the flags argument) ended up in rcx: the arguments were passed using the userspace function-call convention, but the kernel expects the fourth syscall argument in r10, not rcx. Consult https://uclibc.org/docs/psABI-x86_64.pdf .

You have to do some argument shuffling, as in https://github.com/numactl/numactl/blob/master/syscall.c#L160 . Here, just mov %rcx, %r10 before the syscall instruction is enough.

Overall, use the approach of glibc's own syscall wrapper: https://github.com/lattera/glibc/blob/master/sysdeps/unix/sysv/linux/x86_64/syscall.S#L29 .
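For reference, a hedged sketch of the same idea in GNU C with inline assembly (so no libc wrapper is involved); the point is simply that the fourth argument has to be placed in r10, not rcx, and the constant 9 is __NR_mmap on x86-64:

#include <stddef.h>

/* raw mmap system call on x86-64 Linux; errors come back as small negative values (-errno) */
static void *raw_mmap(void *addr, size_t len, long prot, long flags, long fd, long off)
{
    register long r10 __asm__("r10") = flags;  /* 4th syscall argument */
    register long r8  __asm__("r8")  = fd;     /* 5th */
    register long r9  __asm__("r9")  = off;    /* 6th */
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"(9L /* __NR_mmap */), "D"(addr), "S"(len), "d"(prot),
                        "r"(r10), "r"(r8), "r"(r9)
                      : "rcx", "r11", "memory");
    return (void *)ret;
}

int main(void)
{
    void *p = raw_mmap(NULL, 4096, 0x3 /* PROT_READ|PROT_WRITE */,
                       0x22 /* MAP_PRIVATE|MAP_ANONYMOUS */, -1, 0);
    /* the kernel signals errors as values in [-4095, -1] */
    return ((unsigned long)p >= (unsigned long)-4095L) ? 1 : 0;
}

Called with those arguments it requests one anonymous read/write page, matching the register values in the gdb dump above.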

When is a variable placed in `.rdata` section and not in `.text` section?

In the first case, the variable is declared as a local variable. It has "automatic" storage duration, which means that it goes away at the end of the enclosing scope. It cannot occupy any piece of memory permanently because of its storage duration (this is true regardless of const). Thus, it is usually stored on the stack, or in a register.

In the second case, the variable is declared as a global variable. It has static storage duration, so it persists for the lifetime of the program. This can be stored in many places, such as .data, .bss, .text, or .rdata (or .rodata).

.data is generally used for writable static data with some predefined (nonzero) content, e.g. a global int foo = 42;. .bss is used for writable static data initialized to zero (or not initialized). .rdata is used for constant static data like strings and const variables. Of course, these uses are all "in general" and may vary from compiler to compiler.

So why did .text get bigger in the first case? It is because the compiler had to generate some extra instructions to load 10 onto the stack or into a register.
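The exact code from the question is not reproduced here, but a minimal pair consistent with the answer might look like this (the names are made up):

/* Static storage duration: a const-qualified global typically
   ends up in .rdata (or .rodata). */
const int global_ten = 10;

int use_global(void) { return global_ten; }

/* Automatic storage duration: the constant 10 is materialized by
   instructions (or folded into them), so .text grows instead. */
int use_local(void)
{
    const int local_ten = 10;
    return local_ten;
}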


