How to extract only the raw contents of an ELF section?
A rather inelegant hack around objdump and dd:
IN_F=/bin/echo
OUT_F=./tmp1.bin
SECTION=.text
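# Match the section name exactly (column 2); column 3 is the size and
# column 6 the file offset, both in hex, which $((0x...)) converts for dd.
objdump -h "$IN_F" |
awk -v sec="$SECTION" -v in_f="$IN_F" -v out_f="$OUT_F" \
    '$2 == sec {print "dd if=" in_f " of=" out_f " bs=1 count=$((0x" $3 ")) skip=$((0x" $6 "))"}' |
bash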
objdump -h produces predictable output that includes each section's size and offset in the ELF file. I used awk to generate a dd command for the shell, since dd doesn't accept hexadecimal numbers, and fed that command to the shell.
In the past I did all of that manually, without writing any scripts, since it is rarely needed.
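The same idea can be sketched without generating a command line for bash. This is only an illustration under assumptions: the function names section_extent and copy_range are made up here, and the column positions are those of GNU objdump -h output.

```shell
# section_extent NAME: read `objdump -h` output on stdin and print
# "<size> <offset>" in decimal for the section named exactly NAME.
section_extent() {
    # Column 2 is the name, column 3 the size, column 6 the file offset (hex).
    set -- $(awk -v name="$1" '$2 == name { print $3, $6; exit }')
    [ $# -eq 2 ] || return 1
    printf '%d %d\n' "$((0x$1))" "$((0x$2))"
}

# copy_range FILE OFFSET SIZE: write SIZE bytes starting at byte OFFSET to stdout.
copy_range() {
    tail -c "+$(($2 + 1))" "$1" | head -c "$3"
}

# Hypothetical usage:
#   extent=$(objdump -h /bin/echo | section_extent .text) &&
#   copy_range /bin/echo "${extent#* }" "${extent% *}" > tmp1.bin
```

Using tail and head instead of dd bs=1 avoids copying one byte per system call. Sections without file contents (such as .bss) still report a size, so the result is only meaningful for sections that are actually stored in the file.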
How can I examine contents of a data section of an ELF file on Linux?
objdump -s -j .rodata exefile
gives a side-by-side hex/printable ASCII dump of the contents of the rodata section like:
Contents of section .rodata:
0000 67452301 efcdab89 67452301 efcdab89 gE#.....gE#.....
0010 64636261 68676665 64636261 68676665 dcbahgfedcbahgfe
It doesn't look like there's anything in there to control formatting, but it's a start. You could always undump the hex and feed it to od, I suppose :)
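As a rough sketch of the "undump" idea, the hex words can be turned back into raw bytes with standard tools. The function name undump is hypothetical, and the pattern assumes the usual GNU objdump -s row layout (a leading space, a hex address, then up to four hex words before the ASCII column).

```shell
# undump: read `objdump -s` output on stdin, write the raw bytes to stdout.
undump() {
    # Keep only dump rows and cut them down to the hex words after the address.
    sed -n 's/^ [0-9a-f]\{4,\} \(\([0-9a-f]\{1,8\} \)\{1,4\}\).*/\1/p' |
    tr -d ' ' | fold -w 2 |
    while read -r byte; do
        # Convert each hex pair to an octal escape and let printf emit the byte.
        printf "\\$(printf '%03o' "0x$byte")"
    done
}

# Hypothetical usage:
#   objdump -s -j .rodata exefile | undump | od -c
```

This is slow for large sections because it runs printf once per byte, but it needs nothing beyond sed, tr, fold and the shell.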
Extract read-only data sections from an archive/lib (ELF i guess?) for compression
Eureka! I figured it out!
While I fully acknowledge and appreciate the advice given in comments / other answers, it still bothered me that I speculated it should be relatively easy to "play the linker" and extract blobs of hardcoded data from e.g an object-file.
Well, it turns out to be relatively easy (for ELF objects, anyway), just as expected, using readelf.
To dump a symbol, I used two steps:
Figure out the index of the section that holds the symbol (the Ndx column) by looking at the symbols in the object:
$ readelf --syms company_logo.o
Symbol table '.symtab' contains 17 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS company_logo.c
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    2
     4: 00000000     0 SECTION LOCAL  DEFAULT    3
     5: 00000000     0 SECTION LOCAL  DEFAULT    4
     6: 00000000     0 NOTYPE  LOCAL  DEFAULT    4 $d
     7: 00000000     0 SECTION LOCAL  DEFAULT    6
     8: 00000000     0 SECTION LOCAL  DEFAULT    7
     9: 00000000     0 SECTION LOCAL  DEFAULT    9
    10: 00000000     0 SECTION LOCAL  DEFAULT   10
    11: 00000000     0 SECTION LOCAL  DEFAULT   12
    12: 00000000     0 SECTION LOCAL  DEFAULT   13
    13: 00000000     0 SECTION LOCAL  DEFAULT   14
    14: 00000000     0 SECTION LOCAL  DEFAULT   15
    15: 00000000    12 OBJECT  GLOBAL DEFAULT    4 company_logo
    16: 00000000 21879 OBJECT  GLOBAL DEFAULT    6 company_logo_map
Dump the contents of the section that holds the symbol.
company_logo_map was my target, so I used its section index (Ndx) 6, as follows:
$ readelf --hex-dump=6 company_logo.o

Hex dump of section '.rodata.company_logo_map':
  0x00000000 00000000 00000000 00000000 00000000 ................
  0x00000010 00000000 00000000 00000000 00000000 ................
  0x00000020 00000000 00000000 00000000 00000000 ................
  0x00000030 00000000 00000000 00000000 00000000 ................
  ... lots more data here
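The two manual steps could be chained roughly like this. The helper name symbol_ndx is made up for the sketch, and it assumes the eight-column readelf --syms layout shown above (Ndx in column 7, Name in column 8).

```shell
# symbol_ndx NAME: read `readelf --syms --wide` output on stdin and print
# the Ndx (section index) column for the symbol called NAME.
symbol_ndx() {
    awk -v name="$1" '
        $8 == name { print $7; found = 1; exit }
        END        { exit !found }   # non-zero status if the symbol is missing
    '
}

# Hypothetical usage, mirroring the two steps above:
#   ndx=$(readelf --syms --wide company_logo.o | symbol_ndx company_logo_map) &&
#   readelf --hex-dump="$ndx" company_logo.o
```

Note that this dumps the whole section. It matches the symbol exactly here only because the object has one section per data object (.rodata.company_logo_map); in a shared .rodata the symbol's Value and Size columns would be needed to cut out the right range.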
How do I get the real position of a section in an ELF archive file?
the File Offset column appears to be relative to the object file position in the archive
The file offset you get from objdump is relative to the beginning of the individual object file. You can think of an archive library as a bookshelf, and the output of objdump -h as the index within each individual book. You wouldn't expect the index to change depending on which other books are on the shelf, or when you take the book off the shelf. Similarly, the object file itself (and the output of objdump -h) does not change when you put it into the library or extract it again (you get a bit-identical copy).
I expected I could use dd to extract binary information from the archive file
You could use dd, but you'd have to first find the position of each individual object file in the archive. That's not too difficult: the format of UNIX archive files is documented. But the format can change depending on which UNIX variant you use, and it's not really necessary for the task you want to perform.
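For illustration only, here is a minimal sketch of locating members by walking the common ar layout: an 8-byte "!<arch>" magic line, then for each member a 60-byte header whose first 16 bytes are the name and whose bytes 49-58 are the decimal size, with member data padded to an even length. The function name ar_members is hypothetical, and the sketch does not resolve GNU long-name tables or BSD "#1/" extended names.

```shell
# ar_members ARCHIVE: print "<data offset> <size> <raw name field>" per member.
ar_members() (
    lib=$1
    # Command substitution strips the newline that ends the 8-byte magic.
    [ "$(head -c 8 "$lib")" = '!<arch>' ] || exit 1
    total=$(($(wc -c < "$lib")))
    pos=8
    while [ $((pos + 60)) -le "$total" ]; do
        hdr=$(tail -c "+$((pos + 1))" "$lib" | head -c 58)
        name=$(printf '%s' "$hdr" | cut -c 1-16)
        size=$(printf '%s' "$hdr" | cut -c 49-58)
        name=${name%% *}    # drop the space padding
        size=${size%% *}
        case $size in ''|*[!0-9]*) exit 1 ;; esac
        printf '%d %d %s\n' "$((pos + 60))" "$size" "$name"
        # Member data is padded to an even number of bytes.
        pos=$((pos + 60 + size + size % 2))
    done
)
```

The absolute position of a section inside the archive would then be the member's data offset plus the File off value that objdump -h reports for that object.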
How do I do this with an archive?
If you know that .mysection has identical contents in all object files in libmylib.a (as would be the case for the objcopy --add-section command you gave), then extract one object from the archive, then extract the section:
firstobj=$(ar t libmylib.a | grep '\.o$' | head -n 1)
ar x libmylib.a "$firstobj"
# Use objdump -h and dd to extract the section contents,
# or use: readelf -p .mysection "$firstobj"
rm -f "$firstobj"
If the contents of .mysection may be different in different object files, extract them to a temporary directory:
mkdir tmp.$$ && cd tmp.$$ && ar x ../libmylib.a
for obj in $(find . -type f); do
# extract .mysection from $obj
done
cd .. && rm -rf tmp.$$
objdump to extract contents of text segment to a binary format
We have to specify the input file format explicitly using the -I option.
objcopy -I #input file format# -j #ELF section to copy# -O #output format: binary, etc.# #input file# #output file#
e.g.
objcopy -I elf32-little -j .text -O binary firmware.ko content.bin
c# capture raw data from a GNU objcopy process, that dumps to a file
On Windows there is a device "CON" which you might leverage.
objcopy file "someFile" --dump-section .text=CON
I did not test it, because I do not have objcopy installed, but the same trick worked with OpenSSL, so it should write everything to the console.
How to see the GNU debuglink value of an ELF file?
Something like this should work:
objcopy --output-target=binary --set-section-flags .gnu_debuglink=alloc \
--only-section=.gnu_debuglink helloworld helloworld.dbg
--output-target=binary avoids adding ELF headers. --set-section-flags .gnu_debuglink=alloc is needed because objcopy only writes allocated sections by default (with the binary emulation). And --only-section=.gnu_debuglink finally identifies the answer. See this earlier answer.
Note that the generated file may have a trailing NUL byte and four bytes of CRC, so some post-processing is needed to extract everything up to the first NUL byte (perhaps using head -z -n 1 helloworld.dbg | tr -d '\0' or something similar).
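One possible shape for that post-processing, as a sketch: the function names are invented here, and it assumes the dumped file holds the file name, a NUL terminator plus any padding, and then the four CRC bytes at the very end.

```shell
# debuglink_name FILE: print the debug file name (everything before the first NUL).
debuglink_name() {
    tr '\0' '\n' < "$1" | head -n 1
}

# debuglink_crc FILE: print the last four bytes as hex, in file order.
debuglink_crc() {
    tail -c 4 "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

# Hypothetical usage on the file produced above:
#   debuglink_name helloworld.dbg
#   debuglink_crc helloworld.dbg
```

The CRC is printed byte by byte as stored, so on a little-endian target the hex digits appear least significant byte first.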
DT_USED entry in .dynamic section of ELF file
In general, when looking at Solaris dynamic linker features, it is possible to find more information in the public Illumos sources (which were once derived from OpenSolaris). In this case, it seems that DT_USED is always treated like DT_NEEDED, so they are essentially the same thing. One of the header files, usr/src/uts/common/sys/link.h, also contains this:
/*
* DT_* entries between DT_HIPROC and DT_LOPROC are reserved for processor
* specific semantics.
*
* DT_* encoding rules apply to all tag values larger than DT_LOPROC.
*/
#define DT_LOPROC 0x70000000 /* processor specific range */
#define DT_AUXILIARY 0x7ffffffd /* shared library auxiliary name */
#define DT_USED 0x7ffffffe /* ignored - same as needed */
#define DT_FILTER 0x7fffffff /* shared library filter name */
#define DT_HIPROC 0x7fffffff
Something may have been planned here, but it doesn't seem to be implemented (or it used to be and no longer is).