Can a Program Read Its Own Elf Section

Can a program read its own ELF section?

How can I print the build version of the program (located in the .note.gnu.build-id ELF section) from the program itself?

  1. You need to read the ElfW(Ehdr) (at the beginning of the file) to find program headers in your binary (.e_phoff and .e_phnum will tell you where program headers are, and how many of them to read).

  2. You then read the program headers until you find the PT_NOTE segment of your program. That segment gives you the offset of the beginning of all the notes in your binary.

  3. You then need to read the ElfW(Nhdr) and skip the rest of the note (the total size of the note is sizeof(Nhdr) + .n_namesz + .n_descsz, with the name and descriptor each padded to the note alignment), until you find a note with .n_type == NT_GNU_BUILD_ID.

  4. Once you find the NT_GNU_BUILD_ID note, skip past its name (.n_namesz bytes, padded to the alignment) and read the next .n_descsz bytes; those bytes are the actual build-id.

You can verify that you are reading the right data by comparing what you read with the output of readelf -n a.out.
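The four steps above can be sketched as follows. This is a minimal illustration in Python rather than C, working on the bytes of the file instead of on the running image. It assumes a little-endian ELF64 file and 4-byte note alignment; the function name find_build_id is made up for this example.

```python
import struct

PT_NOTE = 4
NT_GNU_BUILD_ID = 3


def align4(n):
    """Round n up to the next multiple of 4 (assumed note alignment)."""
    return (n + 3) & ~3


def find_build_id(elf):
    """Return the GNU build-id bytes from a little-endian ELF64 image, or None."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    # Step 1: the ELF header says where the program headers are.
    e_phoff, = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)
    for i in range(e_phnum):
        # Step 2: look for PT_NOTE segments.
        base = e_phoff + i * e_phentsize
        p_type, = struct.unpack_from("<I", elf, base)
        if p_type != PT_NOTE:
            continue
        p_offset, = struct.unpack_from("<Q", elf, base + 8)
        p_filesz, = struct.unpack_from("<Q", elf, base + 32)
        pos, end = p_offset, p_offset + p_filesz
        # Step 3: walk the notes; each one starts with a 12-byte Nhdr.
        while pos + 12 <= end:
            n_namesz, n_descsz, n_type = struct.unpack_from("<III", elf, pos)
            name_off = pos + 12
            desc_off = name_off + align4(n_namesz)
            name = elf[name_off:name_off + n_namesz]
            # Step 4: the descriptor of the GNU build-id note is the build-id.
            if n_type == NT_GNU_BUILD_ID and name == b"GNU\x00":
                return elf[desc_off:desc_off + n_descsz]
            pos = desc_off + align4(n_descsz)
    return None
```

On Linux, calling find_build_id(open("/proc/self/exe", "rb").read()).hex() should give the same hex string that readelf -n prints for the running executable, provided that executable carries a build-id note.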

P.S.

If you are going to go through the trouble to decode build-id as above, and if your executable is not stripped, it may be better for you to just decode and print symbol names instead (i.e. to replicate what backtrace_symbols does) -- it's actually easier to do than decoding ELF notes, because the symbol table contains fixed-sized entries.

ELF section identification

I over-wrote all the section names ...

... the file can be linked ... correctly.

Unlike the object files used in 32-bit Windows, the section names in ELF object files are ignored if no linker script is used.

Each "PROGBITS" section contains flags that specify if the section is writeable, executable and/or not even part of the image (debug information).

(Actually, the object files used by Windows also have such flags, but they are typically set to 0 and the section name is used to distinguish between code and data sections.)

For other section types (such as symbol tables) it is clear how they have to be handled anyway.

... the file can be ... executed correctly.

For executable files and shared libraries, the sections are ignored anyway. Instead, the "program headers" of the file are used.

A "program header" tells the OS that a certain address range in the file must be loaded to memory. A "program header" may cover multiple sections. And "program headers" don't have names.

Example:

Sections:
Name      Address   Offset in file   Length   Access
.text     0x10100   0x100            0x30     read-only
.rodata   0x10130   0x130            0x20     read-only
.data     0x20250   0x150            0x10     read-write
.sdata    0x20260   0x160            0x10     read-write

Program headers:
Address   Offset in file   Length   Access
0x10100   0x100            0x50     read-only
0x20250   0x150            0x20     read-write
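
To make the "one program header covers several sections" point concrete, here is a small sketch that checks which program header contains each section. The numbers are copied from the example tables above; the helper name covering_segment is invented for illustration.

```python
# (name, address, file offset, length), copied from the example above.
sections = [
    (".text",   0x10100, 0x100, 0x30),
    (".rodata", 0x10130, 0x130, 0x20),
    (".data",   0x20250, 0x150, 0x10),
    (".sdata",  0x20260, 0x160, 0x10),
]
# (address, file offset, length, access), copied from the example above.
segments = [
    (0x10100, 0x100, 0x50, "read-only"),
    (0x20250, 0x150, 0x20, "read-write"),
]


def covering_segment(section):
    """Return the index of the program header whose range contains the section."""
    _, addr, off, length = section
    for i, (seg_addr, seg_off, seg_len, _) in enumerate(segments):
        in_file = seg_off <= off and off + length <= seg_off + seg_len
        in_mem = seg_addr <= addr and addr + length <= seg_addr + seg_len
        if in_file and in_mem:
            return i
    return None


mapping = {s[0]: covering_segment(s) for s in sections}
```

Here .text and .rodata both land in the first (read-only) program header, and .data and .sdata in the second (read-write) one, without the loader ever needing the section names.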

Reading the contents of an ELF section(programmatically)

To extract the .text section, copy 0x182 (Size) bytes starting at offset 0x440 (Offset) in your binary file.

Ignore the 0x400440 (Address) value. It has nothing to do with file offsets; it is the address in memory where the loader will place your .text section. From the ELF format specification:

sh_addr: If the section will appear in the memory image of a process, this member gives the
address at which the section’s first byte should reside. Otherwise, the member contains 0.

The Align value is actually decimal, not hexadecimal. So it's 16, not 0x16. Alignment means that the section address must be a multiple of 16 (bytes).


You can verify all this by exploring the binary yourself. First, look at the disassembly of your binary:

$ objdump -D your-file | less

Find where .text starts and look at the bytes of the .text section. Now take a plain hex dump of the file:

$ hexdump -C your-file | less

Now find the Offset value in the hex dump and look at the bytes starting there. You will find that they are the same bytes as in the disassembled .text section.

Conclusion: use the Offset value (from the readelf output) when working with your file, not the Address value.
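
As a small sketch of that conclusion, the extraction is just a slice of the file's bytes. The constants are the values quoted above; the function name extract_section is an assumption made for this example.

```python
def extract_section(image, offset, size):
    """Return `size` bytes of a file image starting at file offset `offset`."""
    if offset + size > len(image):
        raise ValueError("section extends past the end of the file")
    return image[offset:offset + size]


# Values quoted in the text above for .text.
TEXT_OFFSET, TEXT_SIZE, TEXT_ADDRESS, TEXT_ALIGN = 0x440, 0x182, 0x400440, 16

# The load address is a multiple of the alignment, but it is not a file position.
address_is_aligned = TEXT_ADDRESS % TEXT_ALIGN == 0
```

With the file contents read via open("your-file", "rb").read(), calling extract_section(contents, TEXT_OFFSET, TEXT_SIZE) would return the bytes that hexdump shows starting at 0x440.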

How can I read a custom section within the loader?

The ELF header (Elf32_Ehdr or Elf64_Ehdr) contains information pointing to the section header table (members e_shoff, e_shentsize and e_shnum). Together with the section string table index (e_shstrndx), this information can be used to read the section headers, look up each section's name, and eventually locate the data you are interested in.
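
A minimal sketch of that lookup, in Python and for little-endian ELF64 files only. The function name read_section is hypothetical, and the code ignores corner cases such as SHT_NOBITS sections (which occupy no file bytes) and files with more sections than e_shnum can express.

```python
import struct


def read_section(elf, wanted):
    """Return the bytes of the section named `wanted`, or None if it is absent."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    e_shoff, = struct.unpack_from("<Q", elf, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", elf, 0x3A)

    def header(index):
        # Only sh_name, sh_offset and sh_size are needed for this lookup.
        base = e_shoff + index * e_shentsize
        sh_name, = struct.unpack_from("<I", elf, base)
        sh_offset, sh_size = struct.unpack_from("<QQ", elf, base + 24)
        return sh_name, sh_offset, sh_size

    # The section string table holds the NUL-terminated section names.
    _, str_off, str_size = header(e_shstrndx)
    strtab = elf[str_off:str_off + str_size]
    for i in range(e_shnum):
        sh_name, sh_offset, sh_size = header(i)
        name = strtab[sh_name:strtab.index(b"\x00", sh_name)]
        if name == wanted.encode():
            return elf[sh_offset:sh_offset + sh_size]
    return None
```

Note that this only works while the section header table is still present; a loader that sees nothing but the mapped segments cannot rely on it.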

use program or section headers to load an ELF

The program loader should look at the program header only. The section headers are for tools such as debuggers. I don't think this is spelled out explicitly in the original ELF specification or the System V ABI specification, but it is very much implied:

  • System V Application Binary Interface

Even today, when new features are defined which are used by the dynamic linker, references are added to the dynamic section, even though in theory the information could also be obtained from the section headers (but there are probably some exceptions for certain architectures).

ELF files - What is a section and why do we need it?

  1. How are ELF files generated? Is it the compiler's responsibility?

    They can be generated by a compiler, an assembler, or any other tool that can generate them. Even your own program you wrote for generating ELF files ;) They're just streams of bytes after all, so they can be generated by just writing bytes into a file in binary mode. You can do that too.

  2. What are sections and why do we need them?

    ELF files are subdivided into sections. Sections are the smallest contiguous regions in the file. You can think of them as pages in an organizer, each with its own name and a type that describes what it contains. Linkers use this information to combine different parts of the program coming from different modules into one executable file or a library, by merging sections of the same type (gluing pages together, if you will).

    In executable files, sections are optional, but they're usually there to describe what's in the file, where each part begins, and how many bytes it takes.

  3. What are program headers and why do we need them?

    They're mostly for making executable files. In order to run a program, sections aren't enough, because you have to specify not only what's there in the file, but also where it should be loaded into memory in the running process. Program headers are just for that purpose: they describe segments, which are regions of memory in the running process, with different access privileges & stuff.

    Each program header describes one segment. It tells the loader where to load a certain region of the file into memory and what permissions to set for that region (e.g. should it be allowed to execute code from it? should it be writable or just for reading?)

    Segments can be further subdivided into sections. For example, you can specify that your code segment is further subdivided into code and static read-only strings for the messages the program displays. Or that your data segment is subdivided into funky data and hardcore data :J It's for you to decide.

    In executable files, sections are optional, but it's nice to have them, because they describe what's in the file and allow for dumping selected parts of it (e.g. with the objdump tool). Sometimes they are needed, though, for storing dynamic linking information, symbol tables, debugging information, stuff like that.

  4. Inside program headers, what's the meaning of the fields p_vaddr and p_paddr?

    Those are the addresses at which the data in the file will be loaded. They map the contents of the file into their corresponding memory locations. The first one is a virtual address, the second one is physical address.

    Physical addresses are the "raw" memory addresses. On modern operating systems, those are no longer used in the userland. Instead, userland programs use virtual addresses. The operating system deceives the userland program that it is alone in memory, and that the entire address space is available for it. Under the hood, the operating system maps those virtual addresses to physical ones in the actual memory, and it does it transparently to the program.

    Of course, not every address in the virtual address space is available at the same time. There are limitations imposed by the actual physical memory available. So the operating system just maps the memory for the segments the program actually uses (here's where the "segments" part from the ELF file's program headers comes into play). If the process tries to access some unmapped memory, the operating system steps in and says, "sorry, chap, this memory doesn't belong to you". (The program can address it, but it cannot access it.)

  5. Does each section have its own section header?

    Yes. If it doesn't have an entry in the Section Headers Table, it's not a section :q Because the only way to tell if some part of the file is a section is by looking into the Section Headers Table, which tells you what sections are defined in the file and where you can find them.

    You can think of the Section Headers Table as a table of contents in a book. Without the table of contents, there aren't any chapters after all, because they're not listed anywhere. The book may have headings, but the content is not subdivided into logical chapters that can be found through the table of contents. Same goes with sections in ELF files: there can be some regions of data, but you can't tell without the "table of contents" which is the SHT.
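
To illustrate the program header fields discussed in items 3 and 4, here is a hedged sketch that unpacks one Elf64_Phdr and renders its permission flags. It assumes the little-endian ELF64 layout; the names Phdr, parse_phdr and permissions are invented for this example.

```python
import struct
from collections import namedtuple

PF_X, PF_W, PF_R = 1, 2, 4

Phdr = namedtuple(
    "Phdr", "p_type p_flags p_offset p_vaddr p_paddr p_filesz p_memsz p_align"
)


def parse_phdr(raw):
    """Unpack one 56-byte little-endian Elf64_Phdr."""
    return Phdr(*struct.unpack("<IIQQQQQQ", raw[:56]))


def permissions(p_flags):
    """Render p_flags as a three-character string such as 'r-x'."""
    return "".join(
        ch if p_flags & bit else "-"
        for ch, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X))
    )
```

A segment whose p_flags is PF_R | PF_X renders as "r-x" (typical for code), and PF_R | PF_W renders as "rw-" (typical for data). The p_vaddr field is the one a userland loader cares about; p_paddr only matters on systems where physical addressing is relevant.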


