Loading an ELF File in C in User Space

Loading a non-relocatable, static ELF binary in userspace

Archimedes cried "heureka" when he found that only one object can occupy a given location. If your ELF binary must sit at one particular address because you can't rebuild it for another, you have to relocate the loader itself.

The non-relocatable ELF doesn't include enough information to move it to a different address. You could probably write a decompiler that detects all address references in the code, but it's not worth the effort. You would still have problems analyzing data references such as pointers stored in pre-initialized variables.

Rewrite the loader if you can't get the source code of your ELF binary or a relocatable version.

BTW: Archimedes' heureka was deadly for the goldsmith who cheated. I hope it's not so expensive in your case.

How to load an ELF file all at once in Linux?

You can have your program call mlockall() when it starts:

mlockall() locks all pages mapped into the address space of the calling process. This includes the pages of the code, data and stack segment, as well as shared libraries, user space kernel data, shared memory, and memory-mapped files. All mapped pages are guaranteed to be resident in RAM when the call returns successfully; the pages are guaranteed to stay in RAM until later unlocked.

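Note that you generally have to be root for this, or have the CAP_IPC_LOCK capability, since ordinary user processes aren't allowed to forcibly hog physical memory this way. (Since Linux 2.6.9 an unprivileged process may lock memory, but only up to its RLIMIT_MEMLOCK limit, which is often too small to cover a whole process.)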
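A minimal sketch of such a startup call, assuming Linux and glibc. The wrapper name lock_all_memory is made up for illustration; MCL_CURRENT covers what is mapped right now, and MCL_FUTURE is added on the assumption that you also want later mappings locked.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: lock every current and future page of this process
 * into RAM. Returns 0 on success, -1 on failure with errno left as set by
 * mlockall() (typically EPERM or ENOMEM when privileges or limits are
 * insufficient). */
int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int saved = errno;            /* fprintf may clobber errno */
        fprintf(stderr, "mlockall failed: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

Calling lock_all_memory() as the first thing in main() is the usual placement. If it fails, the program can still run, it just won't have the residency guarantee.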

load ELF file into memory

I'm trying to put an elf file into memory and then execute it,

For a fully-statically-linked executable, your steps would work (except you need to jump to _start == entry point 0x8120, not main).
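Rather than hard-coding the entry address, it can be read from the ELF header's e_entry field. A rough sketch, assuming a 32-bit ELF and the Linux <elf.h> definitions; the function name elf32_entry is invented for this example.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the entry address (e_entry) of a 32-bit
 * ET_EXEC image held in memory, or 0 if the buffer does not look like one. */
Elf32_Addr elf32_entry(const unsigned char *image, size_t image_len)
{
    Elf32_Ehdr eh;

    if (image_len < sizeof eh)
        return 0;
    memcpy(&eh, image, sizeof eh);    /* avoid unaligned access to image */

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return 0;                     /* missing "\177ELF" magic */
    if (eh.e_ident[EI_CLASS] != ELFCLASS32)
        return 0;                     /* not a 32-bit ELF */
    if (eh.e_type != ET_EXEC)
        return 0;                     /* not a fixed-address executable */

    return eh.e_entry;                /* address of _start, not main */
}
```

Once the segments are in place, control would go to that address, e.g. through a function pointer cast. Keep in mind that _start also expects the initial stack (argc, argv, envp, auxv) laid out as the ABI describes, which this sketch does not set up.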

Then I have copied all the elf bytes into the allocations

Another possible problem is not paying attention to .p_offset. Your memcpy calls should look something like this:

unsigned char buf1[0x16828];  // read 0x16828 bytes from start of file
memcpy((void *)0x8000, buf1, 0x16828);      // first segment: file offset 0 -> address 0x8000

unsigned char buf2[0x250]; // read 0x250 bytes from offset 0x016840 into the file
memcpy((void *)0x0001f840, buf2, 0x250);    // second segment: file offset 0x16840 -> address 0x1f840
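The same idea can be written generically by walking the program headers instead of hard-coding the numbers. What follows is only a sketch under some assumptions: a 32-bit ELF already read into a buffer, Linux <elf.h>, and a plain byte array (mem, standing in for the address range starting at mem_base) as the destination so the copy logic can be tried without touching fixed addresses. The function name and parameters are invented for this example.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy every PT_LOAD segment of a 32-bit ELF image into
 * `mem`, which stands in for virtual addresses [mem_base, mem_base + mem_len).
 * Returns the number of segments copied, or -1 if the image is malformed or
 * a segment does not fit. */
int load_segments(const unsigned char *image, size_t image_len,
                  unsigned char *mem, Elf32_Addr mem_base, size_t mem_len)
{
    Elf32_Ehdr eh;
    int loaded = 0;

    if (image_len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (eh.e_phentsize != sizeof(Elf32_Phdr))
        return -1;

    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf32_Phdr ph;
        size_t off = (size_t)eh.e_phoff + (size_t)i * sizeof ph;

        if (off > image_len || image_len - off < sizeof ph)
            return -1;
        memcpy(&ph, image + off, sizeof ph);
        if (ph.p_type != PT_LOAD)
            continue;                 /* only load directives matter here */

        /* Source range [p_offset, p_offset + p_filesz) must lie in the file. */
        if (ph.p_filesz > ph.p_memsz)
            return -1;
        if (ph.p_offset > image_len || image_len - ph.p_offset < ph.p_filesz)
            return -1;

        /* Destination range [p_vaddr, p_vaddr + p_memsz) must lie in mem. */
        if (ph.p_vaddr < mem_base)
            return -1;
        size_t dst = ph.p_vaddr - mem_base;
        if (dst > mem_len || mem_len - dst < ph.p_memsz)
            return -1;

        /* Copy the file-backed part, then zero the rest (.bss). */
        memcpy(mem + dst, image + ph.p_offset, ph.p_filesz);
        memset(mem + dst + ph.p_filesz, 0, ph.p_memsz - ph.p_filesz);
        loaded++;
    }
    return loaded;
}
```

In a real loader, mem would be memory obtained at the exact addresses the binary expects (for instance via mmap with a fixed address), and page protections would be set from p_flags afterwards. Neither is shown here.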

Which part of an ELF file must be loaded into memory?

Sections and segments are two completely different concepts. Sections pertain to the semantics of the data stored there (i.e. what it will be used for) and are actually irrelevant once a program or shared library is linked, except for debugging purposes. You could even remove the section headers entirely (or overwrite them with random garbage) and a program would still work.

Segments (i.e. program header load directives) are what the kernel and/or dynamic linker actually look at when loading a program. For example, in your case you have two load directives. The first one causes the first 4k (1 page) of the file to be mapped at address 0x08048000, and indicates that only the first 0x4b8 bytes of this mapping are actually to be used (the rest is alignment). The second causes the first 8k (2 pages) of the file to be mapped at address 0x08049000. The vast majority of that is alignment. The first 0xf14 bytes are not part of the load directive (just alignment) and will be wasted. Beginning at 0x08049f14, 0x108 bytes mapped from the file are actually used, and another 0x10 bytes (to reach the MemSize of 0x118) are zero-filled by the loader (kernel or dynamic linker). This spans up to 0x0804a02c (in the second mapped page). The rest of the second mapped page is unused/wasted (but malloc might be able to recover it for use as part of the heap).
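The arithmetic in that paragraph can be checked with a few lines of C. This is just a sketch of the page-rounding a loader has to do, assuming 4 KiB pages; the struct and function names are made up for illustration.

```c
#include <stdint.h>

#define PAGE_SIZE 0x1000u   /* assumption: 4 KiB pages */

/* Hypothetical result type describing how one load directive is placed. */
struct seg_layout {
    uint32_t map_start;   /* page-aligned address where the mapping begins */
    uint32_t map_len;     /* mapping length, rounded up to whole pages */
    uint32_t pad_before;  /* alignment bytes mapped before the segment */
    uint32_t zero_start;  /* first address the loader must zero-fill */
    uint32_t seg_end;     /* one past the last byte of the segment */
};

/* Compute the layout from a load directive's VirtAddr, FileSiz and MemSiz. */
struct seg_layout layout_segment(uint32_t vaddr, uint32_t filesz,
                                 uint32_t memsz)
{
    struct seg_layout l;

    l.map_start  = vaddr & ~(PAGE_SIZE - 1);        /* round down */
    l.pad_before = vaddr - l.map_start;
    l.zero_start = vaddr + filesz;                  /* end of file data */
    l.seg_end    = vaddr + memsz;
    l.map_len    = (l.seg_end - l.map_start + PAGE_SIZE - 1)
                   & ~(PAGE_SIZE - 1);              /* round up */
    return l;
}
```

For the second load directive above, layout_segment(0x08049f14, 0x108, 0x118) gives map_start 0x08049000, pad_before 0xf14, zero_start 0x0804a01c, seg_end 0x0804a02c and map_len 0x2000, i.e. the two pages described.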

Finally, while the section headers will not be used at all, the contents of many different sections may be used by your program while it's running. Note that the address ranges of .ctors and .dtors lie at the beginning of the second load mapping, so they are mapped and accessible by the program at runtime (the runtime startup/exit code will use them to run global constructors and destructors, if C++ or "GNU C" code with the ctor/dtor attribute was used). Also note that .data starts at address 0x0804a00c, in the second mapped page. This allows the first page of that mapping to be protected read-only after relocations are applied (the RELRO directive in the program header).


