Allocating a Data Page in Linux with the NX Bit Turned Off

Allocating a data page in Linux with the NX bit turned off

The mmap(2) (with munmap(2)) and mprotect(2) syscalls are the elementary operations for this. (Recall that syscalls are the elementary operations from an application's point of view.) You want the PROT_EXEC protection flag.

You could simply strace any dynamically linked executable to see how they are called, since the dynamic linker ld.so uses them.

Generating a shared object might be less expensive than you imagine. Generating C code, running the compiler, and then dlopen-ing the resulting shared object makes sense even when you work interactively. My MELT domain-specific language (to extend GCC) does this. Recall that you can do a large number of dlopen-s without issues.
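The generate, compile, dlopen cycle described above can be sketched roughly as follows. This is only an illustration, not how MELT does it: the file paths, the function name generated_answer, and the assumption that a compiler called cc is on PATH are all made up for the example, and fixed names under /tmp are not safe for real use.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

/* Writes a one-function C file, compiles it into a shared object with the
 * system compiler, loads it with dlopen and calls the function.
 * Returns the function's result, or -1 on any failure.
 * Paths and the symbol name are hypothetical, chosen for this sketch. */
int compile_and_call(void)
{
    const char *src = "/tmp/gen_example.c";
    const char *so  = "/tmp/gen_example.so";

    /* Step 1: emit the C source. */
    FILE *f = fopen(src, "w");
    if (!f)
        return -1;
    fputs("int generated_answer(void) { return 6 * 7; }\n", f);
    if (fclose(f) != 0)
        return -1;

    /* Step 2: run the compiler (assumes "cc" is available on PATH). */
    if (system("cc -shared -fPIC -O1 -o /tmp/gen_example.so "
               "/tmp/gen_example.c") != 0)
        return -1;

    /* Step 3: load the shared object and look up the symbol. */
    void *handle = dlopen(so, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }

    /* POSIX permits converting dlsym's void* result to a function pointer. */
    int (*fn)(void) = (int (*)(void))dlsym(handle, "generated_answer");
    int result = fn ? fn() : -1;

    dlclose(handle);
    return result;
}
```

On older glibc versions you may need to link the host program with -ldl.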

If you want to generate machine code in memory, you could use GNU lightning (quick generation of slow machine code), libjit from DotGNU (generates less bad machine code), LuaJIT, AsmJit (x86 or amd64 specific), or LLVM (slowly generates optimized machine code). BTW, the SBCL Common Lisp implementation compiles dynamically to memory and produces good machine code at runtime (and the JIT compilers in JVMs do the same).

Was there ever a need to have the stack be executable?

Creating a non-executable stack requires help from the hardware. Early Intel processors had no NX bit; on x86, a per-page no-execute bit only arrived with AMD's Athlon 64 / Opteron and Intel's Pentium 4 (Prescott). (There were earlier processors that did have no-execute protections, but they weren't ubiquitous.)

At various times people have made use of executable stacks and even writable code segments to write self-modifying programs. I think we've all come to agree that self-modifying code is a bad idea, but I recall there being a lot of interest in it when I got started in the 80s. You can really write some amazingly tight programs if you throw away pesky restrictions like arbitrary separations between code and data.

Speaking of which, the concept of executing your data, and storing lots of that data on the stack, is an important part of Lisp. See "Allocating a data page in Linux with NX bit turned off" for one discussion of how this impacts JIT compilers. If you want to JIT compile something very small and very often, allocating and running it on the stack can be convenient.

That said, as you say, it's been a big problem, which is why there has generally been a move away from it. This brings a bit of the Harvard architecture back into the Von Neumann architecture most of us have grown up with.

Allocate executable RAM in C on Linux

See mprotect(). Once you have filled a (n-)page-sized memory region (allocated with mmap()) with code, change its permissions to disallow writes and allow execution.
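A minimal sketch of that sequence, assuming an x86-64 Linux system: map a writable page, copy code into it, then flip it to read/execute before calling it. The function name run_generated_code and the six bytes of machine code (mov eax, 42 ; ret) are illustrative choices, not something from the question.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std= */
#endif
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#if !defined(__x86_64__)
#error "the machine code bytes in this sketch are x86-64 only"
#endif

/* Copies a tiny routine into a fresh page, changes the page from
 * read/write to read/execute, runs it, and returns its result
 * (-1 on any error). */
int run_generated_code(void)
{
    /* mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    /* Step 1: anonymous private page, writable but not executable. */
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    /* Step 2: fill it with code. */
    memcpy(page, code, sizeof code);

    /* Step 3: disallow writes, allow execution. */
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return -1;
    }

    /* Step 4: call it. Converting void* to a function pointer is
     * accepted by POSIX systems, though not guaranteed by ISO C. */
    int (*fn)(void) = (int (*)(void))page;
    int result = fn();

    munmap(page, len);
    return result;
}
```

Because the page is never writable and executable at the same time, this also works on systems that enforce W^X.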

How does the kernel give a seg. fault in a scenario like this?

That's correct. But typically you're not allocating memory directly from the operating system. You usually allocate it via some library function (new or malloc, etc). The library function takes the 4KB page (usually it requests more than 4KB in one chunk, too) and splits it up into the actual chunks that you ask for. So usually when you ask for 100 bytes of memory, those 100 bytes will be "wedged" in between two other allocations that you've made.

This is why it's "undefined behaviour" when you access data off the end of an array: you might get a segmentation fault, you might trash some other variable that happens to be stored there, or you might be OK and it actually works (for a while at least).
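To see that the kernel only notices at page granularity, here is a small sketch. It deliberately bypasses malloc and uses mmap so the layout is known: two adjacent pages, the second one with no access rights. The helper name write_faults_at is made up for this example, and the write is done in a forked child so the parent can observe whether SIGSEGV was delivered.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std= */
#endif
#include <signal.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if writing one byte at buf[offset] kills a child process with
 * SIGSEGV, 0 if the write succeeds, -1 on setup failure. buf spans two
 * pages: the first is readable/writable, the second is PROT_NONE. */
int write_faults_at(size_t offset)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *buf = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* Turn the second page into a guard page. */
    if (mprotect(buf + page, page, PROT_NONE) != 0) {
        munmap(buf, 2 * page);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, 2 * page);
        return -1;
    }
    if (pid == 0) {
        /* Child: attempt the write; volatile keeps it from being elided. */
        ((volatile unsigned char *)buf)[offset] = 0x41;
        _exit(0);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    munmap(buf, 2 * page);
    return WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV;
}
```

Writing to the last byte of the first page succeeds; writing one byte further faults. With malloc the same out-of-bounds write would usually land in a neighbouring allocation on a mapped page, so nothing would stop it.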

Can I execute code that resides in the data segment (ELF binary)?

I receive a segmentation fault.

This is hardware-enforced data execution prevention (https://en.wikipedia.org/wiki/Executable_space_protection#Linux): you can't just jump to a data page if it has no 'x' (execute) permission in the page tables. Memory mappings with all their permission bits are listed in the /proc/$pid/maps and /proc/$pid/smaps files: 'rwx' for writable code, 'rw-' for data without execution, 'r--' for read-only data, 'r-x' for normal code.
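If you want to check these permissions from inside the program instead of reading the file by hand, something like the following sketch can help. The function names are made up for illustration; it parses /proc/self/maps (the current process's own view) and assumes each line fits in the buffer.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Finds the mapping that contains addr in /proc/self/maps and copies its
 * permission string (for example "r-xp") into perms. Returns 0 on success,
 * -1 if the file cannot be read or no mapping contains addr. */
int mapping_perms(const void *addr, char perms[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (!f)
        return -1;

    uintptr_t target = (uintptr_t)addr;
    char line[1024];
    int found = -1;

    while (fgets(line, sizeof line, f)) {
        unsigned long start, end;
        char p[5];
        /* Each line starts with "start-end perms", addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) != 3)
            continue;
        if (target >= start && target < end) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}

/* volatile keeps this variable in writable data rather than letting the
 * compiler fold it into a constant. */
static volatile int some_data = 1;

/* 1 if the page holding this program's own code has the 'x' bit. */
int code_is_executable(void)
{
    char p[5];
    return mapping_perms((const void *)(uintptr_t)&mapping_perms, p) == 0
           && p[2] == 'x';
}

/* 1 if the page holding an ordinary global variable has the 'x' bit. */
int data_is_executable(void)
{
    char p[5];
    return mapping_perms((const void *)&some_data, p) == 0 && p[2] == 'x';
}
```

In a normally linked program the first check should report an executable mapping and the second a non-executable one, which is exactly why jumping into .data faults.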

If you want to execute data, you should call the mprotect syscall with the PROT_EXEC flag on the part of your data that is meant to be code.
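One catch is that mprotect wants a page-aligned start address, while a variable in the data segment can sit anywhere inside a page. A rough sketch for x86-64 follows; the names payload, make_range_executable and call_payload are invented for the example. It keeps PROT_WRITE on purpose, because other global variables may share the same page and would otherwise fault on their next write. Systems that forbid 'rwx' mappings will reject this call.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#if !defined(__x86_64__)
#error "the machine code bytes in this sketch are x86-64 only"
#endif

/* Machine code stored as ordinary initialized data: mov eax, 42 ; ret */
static unsigned char payload[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

/* Adds execute permission to every page overlapping [addr, addr + len).
 * The start is rounded down to a page boundary; the kernel rounds the
 * length up to whole pages itself. Returns mprotect's result. */
int make_range_executable(void *addr, size_t len)
{
    uintptr_t page  = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t)addr & ~(page - 1);
    size_t span     = (size_t)((uintptr_t)addr + len - start);

    return mprotect((void *)start, span,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}

/* Makes the payload's page(s) executable and jumps into it.
 * Returns the payload's result, or -1 if mprotect failed. */
int call_payload(void)
{
    if (make_range_executable(payload, sizeof payload) != 0)
        return -1;
    return ((int (*)(void))payload)();
}
```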

In the x86 world this was fully implemented as the "NX bit" / "XD bit" feature in Pentium 4 (Prescott) and newer (Core, Core 2, Core i*, Core M) and in Athlon 64 / Opteron and newer. If the OS runs in 32-bit mode, it must turn on PAE to have this bit in the page tables. In x86_64 (64-bit) mode the NX/XD bit is always supported.

The first variants of support were added to Linux around 2004: http://linuxgazette.net/107/pramode.html

In 2007 you may have had outdated hardware, an old kernel, or a 32-bit kernel without PAE.

Info about NX/XD bits: https://en.wikipedia.org/wiki/NX_bit

Sometimes 'rwx' mode may be prohibited, check https://en.wikipedia.org/wiki/W^X.

For pre-NX systems there were solutions based on the x86 segment registers that marked part of the address space as non-executable.

Can I execute the program above without getting a segmentation fault?

You can:

  • make the data page executable by calling mprotect on it with PROT_READ|PROT_EXEC
  • mark the data segment of the ELF file as executable (you need to hack deeply inside ld scripts; the default script is shown by ld --verbose)
  • make all pages executable, including .data and the heap (not just the stack), by linking with -z execstack (ld or gcc)
  • move the shellcode to the .text section of the ELF file
  • try to disable the NX/XD bit in the kernel (hard; recompilation may be needed)
  • use a 32-bit OS (kernel) without the PAE option enabled (a build-time option)
  • use an older CPU without NX/XD

Stack resident buffer overflow on 64-bit?

Those two instructions are doing exactly what you expect them to do. You have overwritten the previous stack frame with 0x41's so when you hit the leaveq, you are doing this:

mov rsp, rbp
pop rbp

Now rsp points to where rbp did before. However, you have overwritten that region of memory, so when you do the pop rbp, the hardware is essentially doing this

mov rbp, [rsp]
add rsp, 8

But [rsp] now has 0x41's. So this is why you're seeing rbp get filled with that value.
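If it helps, the effect of leave on an overwritten frame can be modelled in plain C. This is a toy model only: the byte array stands in for stack memory, "addresses" are offsets into it, and the struct and function names are invented for illustration. Note that in 64-bit mode pop advances rsp by 8 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Just the two registers that leave touches. */
struct regs {
    uint64_t rsp;
    uint64_t rbp;
};

/* Model of the x86-64 leave instruction. mem stands in for memory and
 * register values are offsets into it. */
void simulate_leave(struct regs *r, const unsigned char *mem)
{
    r->rsp = r->rbp;                     /* mov rsp, rbp   */
    memcpy(&r->rbp, mem + r->rsp, 8);    /* mov rbp, [rsp] */
    r->rsp += 8;                         /* add rsp, 8     */
}

/* Fills a fake stack with 0x41 bytes, as the overflow did, and returns
 * the value rbp holds after leave has run. */
uint64_t rbp_after_overflow(void)
{
    unsigned char stack[64];
    memset(stack, 0x41, sizeof stack);   /* the overflow */

    /* The frame pointer refers to a slot inside the smashed region. */
    struct regs r = { .rsp = 0, .rbp = 16 };
    simulate_leave(&r, stack);
    return r.rbp;
}
```

Every byte of the saved frame pointer slot is 0x41, so rbp comes back as 0x4141414141414141 and rsp is left pointing at the (equally overwritten) return address.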

As for why rip isn't getting set like you expect: ret tries to load 0x4141414141414141 into rip, but that is not a canonical address on x86-64, so the CPU raises a general-protection fault on the ret itself and rip is never updated. That is why GDB still shows rip pointing at the ret instruction. Try overwriting the return address with a valid address within the program's text segment and you likely won't see this weird behavior.
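One thing worth checking when picking a replacement return address on x86-64 is whether it is canonical at all. A small sketch, assuming the common 48-bit virtual address layout (4-level paging; CPUs running with 5-level paging use 57 bits instead); the function name is made up for the example.

```c
#include <stdint.h>

/* Returns 1 if addr is canonical under 48-bit virtual addressing, meaning
 * bits 63..47 are all copies of bit 47, and 0 otherwise. */
int is_canonical48(uint64_t addr)
{
    uint64_t upper = addr >> 47;         /* bits 63..47, 17 bits in total */
    return upper == 0 || upper == 0x1FFFF;
}
```

A buffer full of 'A' bytes yields 0x4141414141414141, which fails this check, whereas typical user-space addresses such as 0x00007fffffffe000 pass it.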


