how to self dlopen an executable binary
You need code like this:
// file ds.c
#include <stdio.h>
#include <stdlib.h>
#include <dlfcn.h>
void hello (void)
{
    printf ("hello world\n");
}

int main (int argc, char **argv)
{
    const char *buf = "hello";
    /* A NULL filename yields a handle for the main program itself. */
    void *hndl = dlopen (NULL, RTLD_LAZY);
    if (!hndl) {
        fprintf (stderr, "dlopen failed: %s\n", dlerror ());
        exit (EXIT_FAILURE);
    }
    void (*fptr) (void) = dlsym (hndl, buf);
    if (fptr != NULL)
        fptr ();
    else
        fprintf (stderr, "dlsym %s failed: %s\n", buf, dlerror ());
    dlclose (hndl);
    return 0;
}
Read dlopen(3) carefully, always check that the dlopen and dlsym calls succeeded, and use dlerror on failure.
Then compile the above ds.c file with:
gcc -std=c99 -Wall -rdynamic ds.c -o ds -ldl
Don't forget -Wall to get all warnings, and the -rdynamic flag, which puts your own symbols into the dynamic symbol table so that dlsym can find them.
On my Debian/Sid/x86-64 system (gcc version 4.8.2; libc6 version 2.17-93, which provides -ldl; a 3.11.6 kernel compiled by me; binutils package 2.23.90, which provides ld), running ./ds gives the expected output:
% ./ds
hello world
and even:
% ltrace ./ds
__libc_start_main(0x4009b3, 1, 0x7fff1d0088b8, 0x400a50, 0x400ae0 <unfinished ...>
dlopen(NULL, 1) = 0x7f1e06c9e1e8
dlsym(0x7f1e06c9e1e8, "hello") = 0x004009a0
puts("hello world"hello world
) = 12
dlclose(0x7f1e06c9e1e8) = 0
+++ exited (status 0) +++
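The advice above about checking dlsym and reporting failures through dlerror can be sketched as a small helper. The name checked_dlsym and its err out-parameter are hypothetical (they are not part of the original answer); the pattern of clearing dlerror before the lookup and testing it afterwards is the one described in dlopen(3) and dlsym(3).

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: look up `name` in `handle`.
 * On success, returns the symbol address and sets *err to NULL.
 * On failure, returns NULL and points *err at the dlerror() message. */
void *checked_dlsym(void *handle, const char *name, const char **err)
{
    dlerror();                        /* clear any stale error state */
    void *sym = dlsym(handle, name);
    *err = dlerror();                 /* non-NULL only if dlsym failed */
    return *err ? NULL : sym;
}
```

Testing dlerror rather than the returned pointer matters because a symbol's value can legitimately be NULL. With the ds.c program above, the lookup would become something like fptr = checked_dlsym(hndl, buf, &err), printing err when it is non-NULL.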
Using dlopen() on an executable
You can't open executables as libraries. The entry point of an executable will attempt to re-initialize the C library and take over the brk pointer, which will corrupt your malloc heap. Additionally, the executable is likely to be mapped at a fixed address with no relocations; if that address overlaps with anything already loaded, it cannot be mapped at all.
You need to refactor the other program into a library, or add an RPC interface to the other program.
Note that this does not necessarily apply to PIE executables. However, unless the executable is specifically designed to be dlopen()ed, this is unsafe: main() will not be run, so any initialization done in main() will not occur. Also, glibc 2.30 and later refuse to dlopen() a PIE executable at all.
How to use dlopen() to get the executable's path
Am I wrong in my assumption that the handle for the executable can be used in the dlinfo functions the same way a .so handle can be used?
Yes, you are.
The dynamic linker has no idea which file the main executable was loaded from. That's because the kernel performs all the mmaps for the main executable, and only passes a file descriptor to the dynamic loader (whose job it is to load the other required libraries and start the executable running).
I'm trying to replicate some of the functionality of GetModuleFileName() on linux
There is no reliable way to do that. In fact the executable may no longer exist anywhere on disk at all -- it's perfectly fine to run the executable and remove the executable file while the program is still running.
Also, hard links mean that there could be multiple correct answers -- if a.out and b.out are hard linked, there isn't an easy way to tell whether a.out or b.out was used to start the program running.
Your best options are probably reading /proc/self/exe, or parsing /proc/self/cmdline and/or /proc/self/maps.
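As a minimal sketch of the /proc/self/exe option, assuming a Linux system with /proc mounted: the helper name get_exe_path is hypothetical, and readlink(2) is the only system call involved.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the target of /proc/self/exe into buf.
 * Returns the path length, or -1 on error or possible truncation. */
ssize_t get_exe_path(char *buf, size_t size)
{
    if (size == 0)
        return -1;
    /* readlink() does not NUL-terminate, so keep one byte spare. */
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n < 0)
        return -1;
    if ((size_t)n == size - 1)
        return -1;                 /* buffer full: path may be truncated */
    buf[n] = '\0';
    return n;
}
```

This inherits the caveats above: if the executable has been removed while running, Linux reports the old path with " (deleted)" appended, and with hard links you only learn one of the possible names.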
shared object can't find symbols in main binary, C++
Try:
g++ -fPIC -rdynamic -o testexe testexe.cpp -ldl
Without -rdynamic (or something equivalent, like -Wl,--export-dynamic), symbols from the application itself will not be available for dynamic linking.
Finding number of dlopen calls of an ELF binary in C
You could simply use ltrace.
Example:
#include <dlfcn.h>
#include <stdio.h>
int main(int C, char **V)
{
    char **a = V+1;
    while(*a){
        void *h;
        if(0==(h=dlopen(*a++, RTLD_LAZY)))
            fprintf(stderr, "%s\n", dlerror());
    }
}
Compile it:
$ gcc example.c -fpic -pie -ldl
Invoke it on itself and count the dlopen calls:
$ ltrace -o /dev/fd/3 \
./a.out ./a.out ./a.out ./a.out 3>&1 >/dev/null| \
grep ^dlopen\( -c
3
How to compile ELF binary so that it can be loaded as dynamic library?
Based on the links provided in comments and other answers, here is how it can be done without linking these programs together at compile time:
test1.c:
#include <stdio.h>
int a(int b)
{
    return b+1;
}
int c(int d)
{
    return a(d)+1;
}
int main()
{
    int b = a(3);
    printf("Calling a(3) gave %d \n", b);
    int d = c(3);
    printf("Calling c(3) gave %d \n", d);
}
test2.c:
#include <dlfcn.h>
#include <stdio.h>
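int (*a_ptr)(int b);
int (*c_ptr)(int d);
int main(void)
{
    void *lib = dlopen("./test1", RTLD_LAZY);
    if (!lib) {
        /* dlopen() of an executable can be refused by the dynamic loader. */
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    a_ptr = dlsym(lib, "a");
    c_ptr = dlsym(lib, "c");
    if (!a_ptr || !c_ptr) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(lib);
        return 1;
    }
    int d = c_ptr(6);
    int b = a_ptr(5);
    printf("b is %d d is %d\n", b, d);
    dlclose(lib);
    return 0;
}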
Compilation:
$ gcc -fPIC -pie -o test1 test1.c -Wl,-E
$ gcc -o test2 test2.c -ldl
Execution results:
$ ./test1
Calling a(3) gave 4
Calling c(3) gave 5
$ ./test2
b is 6 d is 8
References:
- building a .so that is also an executable
- Compile C program using dlopen and dlsym with -fPIC
PS: To avoid symbol clashes, the imported symbols and the pointers they are assigned to should have different names. See comments here.
Calling aarch64 shared library from amd64 executable, maybe using binary translation/QEMU
The solution that I implemented for this is to use shared memory IPC. This solution is particularly nice since it integrates well with fixed-length C structs, allowing you to simply use the same struct on both ends.
Let's say you have a function with the signature uint32_t so_lib_function_a(uint32_t c[2]). You can write a wrapper function in an amd64 library: uint32_t wrapped_so_lib_function_a(uint32_t c[2]).
Then, you create a shared memory structure:
typedef struct {
uint32_t c[2];
uint32_t ret;
volatile int turn; // turn = 0 means amd64 library, turn = 1 means arm library
} ipc_call_struct;
Initialise a struct like this, then call shmget(SOME_SHM_KEY, sizeof(ipc_call_struct), IPC_CREAT | 0777);, pass the ID it returns to shmat() to get a pointer to the shared memory, and copy the initialised struct into that memory.
You then call shmget(2) and shmat(2) on the ARM binary side to get a pointer to the same shared memory. The ARM binary runs an infinite loop, waiting for its "turn." Once the amd64 binary sets turn to 1, it blocks in a busy loop until turn is 0 again. Meanwhile the ARM binary executes the function, taking its parameters from the shared struct and writing the return value back into it. The ARM side then sets turn to 0 and waits until turn is 1 again, which lets the amd64 binary carry on until it is ready to call the ARM function again.
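The turn-based handoff described above can be tried out inside a single program before involving a second architecture. The following is only a sketch under several assumptions: the "ARM side" is played by a forked child of the same binary, the segment uses IPC_PRIVATE instead of a fixed key, the turn flag is a C11 atomic_int rather than a plain int, and demo_call_struct, demo_worker_function and demo_roundtrip are hypothetical names.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variant of ipc_call_struct with an atomic turn flag. */
typedef struct {
    uint32_t c[2];
    uint32_t ret;
    atomic_int turn;   /* 0 = caller's turn, 1 = worker's turn */
} demo_call_struct;

/* Stand-in for the function that lives in the foreign library. */
static uint32_t demo_worker_function(const uint32_t c[2])
{
    return (c[0] + c[1]) * 0x44E;
}

/* Performs one call through shared memory.
 * Returns 0 and stores the result in *out, or -1 on failure. */
int demo_roundtrip(uint32_t a, uint32_t b, uint32_t *out)
{
    int shm_id = shmget(IPC_PRIVATE, sizeof(demo_call_struct), IPC_CREAT | 0600);
    if (shm_id == -1)
        return -1;
    demo_call_struct *h = shmat(shm_id, NULL, 0);
    /* Mark the segment for removal; it is destroyed after the last detach. */
    shmctl(shm_id, IPC_RMID, NULL);
    if (h == (void *)-1)
        return -1;

    h->c[0] = a;
    h->c[1] = b;
    h->ret = 0;
    atomic_init(&h->turn, 0);

    pid_t pid = fork();            /* the child inherits the attached segment */
    if (pid == -1) {
        shmdt(h);
        return -1;
    }
    if (pid == 0) {
        /* Worker: wait for our turn, serve one call, hand control back. */
        while (atomic_load(&h->turn) != 1)
            sched_yield();
        h->ret = demo_worker_function(h->c);
        atomic_store(&h->turn, 0);
        _exit(0);
    }

    /* Caller: hand off to the worker and wait until it hands back. */
    atomic_store(&h->turn, 1);
    while (atomic_load(&h->turn) != 0)
        sched_yield();
    *out = h->ret;

    waitpid(pid, NULL, 0);
    shmdt(h);
    return 0;
}
```

The atomic flag is a design choice of this sketch, not of the original answer: it gives both sides a well-defined view of turn and of the data written before it. A real cross-architecture setup would still need both binaries to agree on the struct layout.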
Here is an example (it might not compile yet, but it gives you a general idea):
Our "unknown" library : shared.h
#include <stdint.h>
#define MAGIC_NUMBER 0x44E
uint32_t so_lib_function_a(uint32_t c[2]) {
    // Adds the args and multiplies the sum by MAGIC_NUMBER
    uint32_t ret = 0; // must start at zero before accumulating
    for (int i = 0; i < 2; i++) {
        ret += c[i];
    }
    ret *= MAGIC_NUMBER;
    return ret;
}
Hooking into the "unknown" library: shared_executor.c
#include <dlfcn.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/shm.h>
#define SHM_KEY 22828 // Some random SHM ID
uint32_t (*so_lib_function_a)(uint32_t c[2]);
typedef struct {
    uint32_t c[2];
    uint32_t ret;
    // volatile so the busy-wait re-reads shared memory on every iteration
    volatile int turn; // turn = 0 means amd64 library, turn = 1 means arm library
} ipc_call_struct;
int main(void) {
    ipc_call_struct *handle;
    void *lib_dlopen = dlopen("./shared.so", RTLD_LAZY);
    if (!lib_dlopen) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    so_lib_function_a = dlsym(lib_dlopen, "so_lib_function_a");
    if (!so_lib_function_a) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        return 1;
    }
    // setup shm
    int shm_id = shmget(SHM_KEY, sizeof(ipc_call_struct), IPC_CREAT | 0777);
    if (shm_id == -1) { perror("shmget"); return 1; }
    handle = shmat(shm_id, NULL, 0);
    if (handle == (void *)-1) { perror("shmat"); return 1; }
    // We expect the handle to already be initialised by the time we get here, so we don't have to do anything
    while (true) {
        if (handle->turn == 1) { // our turn
            handle->ret = so_lib_function_a(handle->c);
            handle->turn = 0; // hand off for later
        }
    }
}
On the amd64 side: shm_shared.h
#include <stdint.h>
#include <string.h>
#include <sys/shm.h>
typedef struct {
    uint32_t c[2];
    uint32_t ret;
    // volatile so the busy-wait re-reads shared memory on every iteration
    volatile int turn; // turn = 0 means amd64 library, turn = 1 means arm library
} ipc_call_struct;
#define SHM_KEY 22828 // Some random SHM ID
static ipc_call_struct* handle;
void wrapper_init(void) {
    // setup shm here
    int shm_id = shmget(SHM_KEY, sizeof(ipc_call_struct), IPC_CREAT | 0777);
    handle = shmat(shm_id, NULL, 0);
    // Initialise the handle
    // Currently, we don't want to call the ARM library, so the turn is still zero
    ipc_call_struct temp_handle = { .c={0}, .ret=0, .turn=0 };
    *handle = temp_handle;
    // you should be able to fork the ARM binary using "qemu-aarch64-static"
    // (or "qemu-arm-static" for 32-bit ARM) here
    // (and add code for that if you'd like)
}
uint32_t wrapped_so_lib_function_a(uint32_t c[2]) {
    memcpy(handle->c, c, sizeof handle->c); // arrays cannot be assigned directly
    handle->turn = 1; // hand off execution to the ARM library
    while (handle->turn != 0) {} // wait
    return handle->ret;
}
Again, there's no guarantee this code even compiles (yet), but just a general idea.