Allocating a data page in linux with NX bit turned off
The mmap(2) (with munmap(2)) and mprotect(2) syscalls are the elementary operations to do that. Recall that syscalls are elementary operations from the point of view of an application. You want PROT_EXEC.
You could just strace any dynamically linked executable to get a clue about how you might call them, since the dynamic linker ld.so is using them.
Generating a shared object might be less expensive than you imagine. Actually, generating C code, running the compiler, then dlopen-ing the resulting shared object makes sense, even when you work interactively. My MELT domain specific language (to extend GCC) is doing this. Recall that you can do a great many dlopen-s without issues.
If you want to generate machine code in memory, you could use GNU lightning (quick generation of slow machine code), libjit from dotgnu (generate less bad machine code), LuaJit, asmjit (x86 or amd64 specific), LLVM (slowly generate optimized machine code). BTW, the SBCL Common Lisp implementation is dynamically compiling to memory and produces good machine code at runtime (and there is also all the JIT for JVMs doing that).
Was there ever a need to have the stack be executable?
Creating a non-executable stack requires help from the hardware. Early Intel processors had no NX bit; on x86, per-page no-execute protection only became available with the Athlon 64/Opteron and the Pentium 4 (Prescott). (There were earlier processors that did have no-execute protections, but they weren't ubiquitous.)
At various times people have made use of executable stacks and even writable code segments to write self-modifying programs. I think we've all come to agree that self-modifying code is a bad idea, but I recall there being a lot of interest in it when I got started in the 80s. You can really write some amazingly tight programs if you throw away pesky restrictions like arbitrary separations between code and data.
Speaking of which, the concept of executing your data, and storing lots of that data on the stack, is an important part of Lisp. See "Allocating a data page in linux with NX bit turned off" for one discussion of how this impacts JIT compilers. If you want to JIT compile something very small and very often, allocating and running it on the stack can be convenient.
That said, as you say, it's been a big problem, which is why there have generally been moves away from it. This brings a bit of the Harvard architecture back into the Von Neumann architecture most of us have grown up with.
Allocate executable ram in c on linux
See mprotect(). Once you have filled a (n-)page-sized memory region (allocated with mmap()) with code, change its permissions to disallow writes and allow execution.
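The sequence described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper name run_generated_code is hypothetical, and the six bytes of machine code are an assumption that only holds on x86-64 (on other architectures the helper simply returns -1).

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: copies a tiny routine into a fresh page, flips the
 * page from read/write to read/execute, calls it and returns its result.
 * Returns -1 on failure or on architectures other than x86-64. */
int run_generated_code(void)
{
#if defined(__x86_64__)
    /* x86-64 machine code for: mov eax, 42 ; ret */
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    /* 1. Get one anonymous page that is writable but not executable. */
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    /* 2. Fill it with code while it is still writable. */
    memcpy(page, code, sizeof code);

    /* 3. Disallow writes and allow execution. */
    if (mprotect(page, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, len);
        return -1;
    }

    /* 4. Call it. Converting an object pointer to a function pointer is
     *    not strict ISO C, but it is the usual idiom on POSIX systems. */
    int (*fn)(void) = (int (*)(void))page;
    int result = fn();

    munmap(page, len);
    return result;
#else
    return -1;
#endif
}
```

Keeping the page writable only while it is being filled means it is never writable and executable at the same time, which also works on systems that refuse 'rwx' mappings.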
How the kernel gives seg. fault for a scenario like this?
That's correct. But typically you're not allocating memory directly from the operating system. You usually allocate it via some library function (new or malloc, etc). The library function will take the 4KB (usually it allocates more than 4KB in one chunk, too) and split it up into the actual chunks that you ask for. So usually when you ask for 100 bytes of memory, that 100 bytes will be "wedged" in between two other allocation requests that you've made.
This is why it's "undefined behaviour" when you access data off the end of an array: you might get a segmentation fault, you might trash some other variable that happens to be stored there, or you might be OK and it actually works (for a while at least).
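To see why an overrun of a small allocation usually does not fault, it helps to look at page boundaries rather than at the allocation itself. The sketch below only compares addresses and never reads or writes out of bounds; the helper names same_page and overrun_stays_on_mapped_page are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Returns 1 if both addresses fall inside the same virtual-memory page. */
int same_page(const void *a, const void *b)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    return (uintptr_t)a / page == (uintptr_t)b / page;
}

/* Allocates n bytes and reports whether the first byte past the end of the
 * block lies on the same page as the block's last valid byte. If it does,
 * the hardware cannot tell an access to block[n] apart from an access to
 * block[n - 1]: protection is per page, not per allocation.
 * Returns 1 or 0, or -1 if the allocation fails or n is 0. */
int overrun_stays_on_mapped_page(size_t n)
{
    if (n == 0)
        return -1;
    char *block = malloc(n);
    if (block == NULL)
        return -1;
    /* block + n is a valid one-past-the-end pointer; it is not dereferenced. */
    int result = same_page(block + n - 1, block + n);
    free(block);
    return result;
}
```

For a 100-byte request the answer is 1 with any allocator that returns blocks aligned to at least 8 bytes, because block + 100 can then never sit exactly on a page boundary. Whether the bytes after the block belong to another allocation, to allocator bookkeeping, or to nothing in particular is up to the library.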
Can i execute code that resides in data segment (ELF binary)?
I receive a segmentation fault.
This is hardware-enforced data execution prevention (https://en.wikipedia.org/wiki/Executable_space_protection#Linux): you can't just jump to a data page if it has no 'x' (execute) bit set in the page tables. Memory mappings with their permission bits are listed in the /proc/$pid/maps and /proc/$pid/smaps files: 'rwx' for writable code, 'rw-' for data without execution, 'r--' for read-only data, 'r-x' for normal code.
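A process can inspect its own mappings through /proc/self/maps. The following is a rough sketch of how the permission string for a given address might be looked up; mapping_perms is a hypothetical helper, and it assumes the usual "start-end perms ..." line layout with hexadecimal addresses.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: finds the mapping that contains addr in
 * /proc/self/maps and copies its permission string (for example "r-xp")
 * into perms. Returns 0 on success, -1 if the file cannot be read or the
 * address is not mapped. */
int mapping_perms(const void *addr, char perms[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    unsigned long target = (unsigned long)(uintptr_t)addr;
    char line[1024];
    int found = -1;

    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        char p[5];
        /* Each line begins with "start-end perms", addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) == 3 &&
            target >= start && target < end) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Passing the address of a global variable would typically yield a string starting with "rw-", while the address of a function would typically yield "r-x", matching the categories listed above. The fourth character ('p' or 's') says whether the mapping is private or shared.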
If you want to execute data, you should call the mprotect syscall with the PROT_EXEC flag on the part of your data that is meant to be code.
In the x86 world this was fully implemented as the "NX bit" / "XD bit" feature in Pentium 4 (Prescott) and newer (Core, Core2, Core i*, Core M) / in Athlon 64 / Opteron and newer. If the OS runs in 32-bit mode, it must turn on PAE to have this bit in the page table. In x86_64 mode (64-bit) the NX/XD bit is always supported.
First variants of support were added to linux around 2004: http://linuxgazette.net/107/pramode.html
In 2007 you may have outdated hardware, old kernel or 32-bit mode kernel without PAE.
Info about NX/XD bits: https://en.wikipedia.org/wiki/NX_bit
Sometimes 'rwx' mode may be prohibited, check https://en.wikipedia.org/wiki/W^X.
For pre-NX systems there were solutions based on segment registers of x86 to partially disable part of memory space from executing.
Can I execute the program above without getting a segmentation fault?
You can:
- make the data page executable by calling mprotect on it with PROT_READ|PROT_EXEC
- mark the data segment of the ELF file as executable (you need to hack deeply inside ld scripts; the default one is printed by ld --verbose)
- make all pages, including .data and the heap, executable (not just the stack) with ld or gcc -z execstack
- move the shellcode into the text section of the ELF file
- try to disable the NX/XD bit in the kernel (hard; recompilation may be needed)
- use a 32-bit OS (kernel) without the PAE option enabled (build-time option)
- use an older CPU without NX/XD
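The first option in the list can be sketched like this. It is an illustration under assumptions: data_code and run_from_data_segment are hypothetical names, and the machine code bytes are x86-64 only. Note one deliberate deviation from the list: the sketch requests PROT_READ|PROT_WRITE|PROT_EXEC instead of PROT_READ|PROT_EXEC, because mprotect works on whole pages and other global variables sharing those pages would otherwise stop being writable.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Initialized, non-const array: normally placed in .data, which is
 * writable but not executable.
 * x86-64 machine code for: mov eax, 7 ; ret */
static unsigned char data_code[] = { 0xB8, 0x07, 0x00, 0x00, 0x00, 0xC3 };

/* Hypothetical helper: makes the page(s) holding data_code executable and
 * calls the bytes as a function. Returns -1 on failure or on
 * architectures other than x86-64. */
int run_from_data_segment(void)
{
#if defined(__x86_64__)
    uintptr_t page  = (uintptr_t)sysconf(_SC_PAGESIZE);
    /* mprotect needs a page-aligned start, so round down and up. */
    uintptr_t start = (uintptr_t)data_code & ~(page - 1);
    uintptr_t end   = ((uintptr_t)data_code + sizeof data_code + page - 1)
                      & ~(page - 1);

    /* Keep PROT_WRITE so neighbouring globals on the same page still work.
     * Systems that enforce W^X may reject this call. */
    if (mprotect((void *)start, (size_t)(end - start),
                 PROT_READ | PROT_WRITE | PROT_EXEC) != 0)
        return -1;

    int (*fn)(void) = (int (*)(void))(uintptr_t)data_code;
    return fn();
#else
    return -1;
#endif
}
```

If the code lives in a buffer that occupies whole pages by itself, dropping PROT_WRITE as the list suggests is the safer choice.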
Stack resident buffer overflow on 64-bit?
Those two instructions are doing exactly what you expect them to do. You have overwritten the previous stack frame with 0x41's, so when you hit the leaveq, you are doing this:
    mov rsp, rbp
    pop rbp
Now rsp points to where rbp did before. However, you have overwritten that region of memory, so when you do the pop rbp, the hardware is essentially doing this:
    mov rbp, [rsp]
    add rsp, 8
But [rsp] now has 0x41's. So this is why you're seeing rbp get filled with that value.
As for why rip isn't getting set like you expect: ret pops 0x4141414141414141 as the return address, which is not a canonical address on x86-64, so the CPU raises an exception (a general protection fault rather than a page fault) on the ret itself and rip is never loaded with that value. GDB will therefore show rip still pointing at the ret. You should try overwriting the return address with a valid address within the program's text segment and you likely won't see this weird behavior.
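The effect of leaveq and retq on an overwritten frame can be traced with a toy model. Everything here is an assumption made for illustration: the struct regs type, the helper names, and the frame layout (a 16-byte buffer followed by the saved rbp and the return address). The model only follows the data flow and ignores the exception a real CPU raises for such an address.

```c
#include <stdint.h>
#include <string.h>

/* Toy model of the three registers involved; not real CPU state. */
struct regs { uint64_t rsp, rbp, rip; };

/* Reads 8 bytes from the simulated stack; addresses are offsets into mem. */
static uint64_t load64(const unsigned char *mem, uint64_t addr)
{
    uint64_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* leaveq is equivalent to: mov rsp, rbp ; pop rbp */
static void do_leave(struct regs *r, const unsigned char *mem)
{
    r->rsp = r->rbp;
    r->rbp = load64(mem, r->rsp);
    r->rsp += 8;
}

/* retq pops the return address into rip (fault behaviour not modelled). */
static void do_ret(struct regs *r, const unsigned char *mem)
{
    r->rip = load64(mem, r->rsp);
    r->rsp += 8;
}

/* Hypothetical frame: buffer at offsets 0..15, saved rbp at 16,
 * return address at 24. */
struct regs overflow_demo(void)
{
    unsigned char mem[64] = {0};
    struct regs r = { .rsp = 0, .rbp = 16, .rip = 0 };

    /* The overflow: 32 bytes of 0x41 cover the buffer, the saved rbp and
     * the return address. */
    memset(mem, 0x41, 32);

    do_leave(&r, mem);   /* rbp now holds eight 0x41 bytes */
    do_ret(&r, mem);     /* the popped return address is eight 0x41 bytes too */
    return r;
}
```

After overflow_demo() both rbp and rip in the model hold 0x4141414141414141, and rsp has advanced by 8 for each pop.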