I can load functions from a dynamic library when I link against it, but I cannot load them with 'dlsym' without linking against it

Have you considered name mangling? C++ identifiers are typically "mangled" to incorporate information on their namespace and arguments (which historically helped linkers differentiate overloaded functions). You may want to make the function extern "C" to prevent mangling, or find its mangled name to use with dlsym (e.g. on Linux, run nm on the object file, or gcc -S -o /dev/tty ... | grep some_func on the source).

Unable to load symbol even if it is present in .so file?

The function is defined as a C++ function (I can see that because it has argument types in the listing). So you need to figure out what the mangled name is, probably _Z25vmcGetDiskChangedInfoStrmPvilPKcS1_S1_S1_RSt18basic_stringstreamIcSt11char_traitsIcESaIcEEb, and then look up that name instead.

Calling function from dynamic library?

dlsym - obtain address of a symbol in a shared object or executable

This means that when you do dlsym(dl, "show_version"); you are not actually calling the function show_version in your shared library. You obtain the address of that function - which can be used to call the function over and over again.

To "decode" what char *(*ver)() means, you can use what is often called the Clockwise/Spiral Rule

        +-----+
        |     V
char* (*ver) ()       ver is a
 ^     ^ |    |       pointer to
 |     | |    |       a function (taking no arguments)
 |     +-+    |       returning char*
 |            |
 +------------+

I assume the above matches the signature of the show_version function that you put in the shared library. Example:

// a function (taking no arguments), returning a char*
char *show_version(void) {
    static char version[] = "1.0";
    return version;
}

Using the same rule on your first attempt, char* ver:

char* ver
 ^    |       ver is a
 |    |       char*
 +----+

You need a pointer to a function (with the correct signature) to be able to call the function and get the result you want. You can't call a char* and when you do printf("%s\n", ver); it'll just start reading the memory at the address (where your function is stored) until it finds a null terminator. You probably see just gibberish.

If, on the other hand, you have a proper function pointer, you can, as you've noticed, call the function it points at with ver() and get back a char* that points at the string your dynamically loaded function returned.
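As a minimal sketch of the same idea, assuming a dynamically linked program on a POSIX system: the helper below looks up the C library's strlen (only a stand-in for show_version, chosen because the C library is always loaded) and calls it through a correctly typed function pointer. The names strlen_fn and dynamic_strlen are hypothetical and not from the question.

```c
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Function-pointer type matching: size_t strlen(const char *). */
typedef size_t (*strlen_fn)(const char *);

/* Looks up "strlen" at run time and calls it through the pointer.
 * Returns 0 and stores the length in *out on success, -1 on failure. */
int dynamic_strlen(const char *text, size_t *out) {
    /* A NULL path asks for a handle to the running program and the
     * libraries already loaded with it (assumption: libc is among them). */
    void *handle = dlopen(NULL, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    strlen_fn fn;
    *(void **)(&fn) = dlsym(handle, "strlen"); /* address only, no call yet */
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym: %s\n", err);
        dlclose(handle);
        return -1;
    }

    *out = fn(text); /* the actual call, like ver() above */
    dlclose(handle);
    return 0;
}
```

Calling dynamic_strlen("hello", &n) should leave 5 in n. On older glibc versions you may need to link with -ldl.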

You can also use function pointers in your programs without involving shared libraries.

#include <stdio.h>

long foo(short x, int y) {
    return x + y;
}

int main() {
    long (*foo_ptr)(short, int) = foo;

    // foo_ptr is a pointer to a function taking (short, int) as
    // arguments and returning a long

    printf("%ld\n", foo(1, 2));     // prints 3
    printf("%ld\n", foo_ptr(1, 2)); // also prints 3
}
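Function pointers also let you build a small name-to-function table, which is roughly what dlsym does for a shared library. The following is only an illustrative sketch; binary_op, named_op, ops and find_op are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Pointer to a function taking (short, int) and returning long. */
typedef long (*binary_op)(short, int);

static long add(short x, int y) { return x + y; }
static long mul(short x, int y) { return (long)x * y; }

/* One table entry: a name and the function it refers to. */
struct named_op {
    const char *name;
    binary_op fn;
};

static const struct named_op ops[] = {
    { "add", add },
    { "mul", mul },
};

/* Returns the function registered under name, or NULL if there is none. */
binary_op find_op(const char *name) {
    for (size_t i = 0; i < sizeof ops / sizeof ops[0]; i++) {
        if (strcmp(ops[i].name, name) == 0) {
            return ops[i].fn;
        }
    }
    return NULL;
}
```

As with dlsym, the caller must check the result for NULL before calling through it, for example find_op("add")(1, 2) after a successful lookup.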

dlsym-like functionality for non-dynamically-loaded code?

You can indeed just use dlsym() for that purpose. You just have to export all symbols to the dynamic symbol table, which you do by linking the binary with gcc -rdynamic.

Example:

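#define _GNU_SOURCE /* for RTLD_DEFAULT on glibc */
#include <stdio.h>
#include <dlfcn.h>

void foo(void) {
    puts("foo");
}

int main(void) {
    /* RTLD_DEFAULT searches the global symbol scope, which includes the
       executable itself when it is linked with -rdynamic. */
    void (*foo_ptr)(void) = (void (*)(void))dlsym(RTLD_DEFAULT, "foo");
    if (foo_ptr == NULL) {
        fprintf(stderr, "dlsym: %s\n", dlerror());
        return 1;
    }
    foo_ptr();
    return 0;
}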

Compile with: gcc -rdynamic -O2 dl.c -o dl -ldl

$ ./dl
foo
$

Manually loading libcrypto (dlmopen, dlsym) segfaults; dynamically linked works

Thanks for providing excellent repro instructions.

What am I doing wrong?

You are using dlmopen, which is a minefield.

I suspect you are doing this in order to have several incompatible versions of OpenSSL in a single process. My advice: just don't do it™️.

What's happening... is complicated.

Let's call the first libpthread.so linked into ./a.out P1, and the second copy (which is brought in via dlmopen as a dependency of libcrypto.so) P2. Let's call the dlmopened version of libcrypto C2.

Both P1 and P2 have separate hidden variables __pthread_keys. When C2 calls P2:pthread_key_create, that function looks in P2:__pthread_keys, and discovers that no keys have been used (which is true in P2, but not in P1 -- the loader has already used some keys from P1:__pthread_keys).

So C2 gets an answer from P2:pthread_key_create -- use key==0 (P2 is oblivious to the fact that key==0 has already been used in P1!).

Now C2 calls P2:pthread_getspecific(0), and expects to get a NULL back -- it hasn't called pthread_setspecific(0, ...) yet.

But pthread_getspecific looks in the thread control block, which is unique for the given thread and shared between P1 and P2, and herein lies the disaster: P2 doesn't get a NULL, it gets whatever P1:pthread_setspecific(0, ...) has set previously!

At that point, C2 decides that some other code must have already set up C2's thread-local data appropriately, and proceeds to use that data, with the resulting SIGSEGV.

So who calls P1:pthread_setspecific? It happens here:

Breakpoint 2, __GI___pthread_setspecific (key=0, value=value@entry=0x5555555592a0) at pthread_setspecific.c:33
33 pthread_setspecific.c: No such file or directory.
(gdb) bt
#0 __GI___pthread_setspecific (key=0, value=value@entry=0x5555555592a0) at pthread_setspecific.c:33
#1 0x00007ffff7f9eb3c in _dlerror_run (operate=operate@entry=0x7ffff7f9ee90 <dlmopen_doit>, args=args@entry=0x7fffffffdb60) at dlerror.c:157
#2 0x00007ffff7f9efd9 in __dlmopen (nsid=<optimized out>, file=<optimized out>, mode=<optimized out>) at dlmopen.c:93
#3 0x00005555555551f8 in main ()

And the subsequent call to P2:pthread_getspecific (note the same key==0 being re-used) happens here:

#0  __GI___pthread_getspecific (key=0) at pthread_getspecific.c:30
#1 0x00007ffff77985cd in CRYPTO_THREAD_get_local (key=<optimized out>) at crypto/threads_pthread.c:160
#2 0x00007ffff778a2d2 in get_thread_default_context () at crypto/context.c:166
#3 0x00007ffff778a2ee in get_default_context () at crypto/context.c:171
#4 0x00007ffff778a43b in ossl_lib_ctx_get_concrete (ctx=<optimized out>) at crypto/context.c:278
#5 0x00007ffff778a681 in ossl_lib_ctx_get_data (ctx=<optimized out>, index=index@entry=0, meth=meth@entry=0x7ffff79684e0) at crypto/context.c:356
#6 0x00007ffff776ab6c in get_evp_method_store (libctx=<optimized out>) at crypto/evp/evp_fetch.c:82
#7 0x00007ffff776ab9b in inner_evp_generic_fetch (methdata=methdata@entry=0x7fffffffd9c0, prov=<optimized out>, prov@entry=0x0, operation_id=operation_id@entry=10, name_id=name_id@entry=0, name=0x7ffff78aa0ac "X25519", properties=0x0, new_method=0x7ffff7772e68 <keymgmt_from_algorithm>, up_ref_method=0x7ffff7772d7a <EVP_KEYMGMT_up_ref>,
free_method=0x7ffff7772d88 <EVP_KEYMGMT_free>) at crypto/evp/evp_fetch.c:248
#8 0x00007ffff776b37a in evp_generic_fetch (libctx=<optimized out>, operation_id=operation_id@entry=10, name=<optimized out>, properties=<optimized out>, new_method=new_method@entry=0x7ffff7772e68 <keymgmt_from_algorithm>, up_ref_method=up_ref_method@entry=0x7ffff7772d7a <EVP_KEYMGMT_up_ref>, free_method=0x7ffff7772d88 <EVP_KEYMGMT_free>)
at crypto/evp/evp_fetch.c:372
#9 0x00007ffff77732e8 in EVP_KEYMGMT_fetch (ctx=<optimized out>, algorithm=<optimized out>, properties=<optimized out>) at crypto/evp/keymgmt_meth.c:230
#10 0x00007ffff777d066 in int_ctx_new (libctx=0x0, pkey=pkey@entry=0x0, e=e@entry=0x0, keytype=0x7ffff78aa0ac "X25519", propquery=0x0, id=<optimized out>, id@entry=-1) at crypto/evp/pmeth_lib.c:280
#11 0x00007ffff777d299 in EVP_PKEY_CTX_new_from_name (libctx=<optimized out>, name=<optimized out>, propquery=<optimized out>) at crypto/evp/pmeth_lib.c:368
#12 0x00007ffff7778c70 in new_raw_key_int (libctx=libctx@entry=0x0, strtype=strtype@entry=0x0, propq=propq@entry=0x0, nidtype=1034, e=0x0, key=0x555555556020 <scalar> "\001\002\003\004\005\006\a\b\t\020\021\022\023\024\025\026\027\030\031 !\"#$%&'()012main.c", len=32, key_is_priv=1) at crypto/evp/p_lib.c:406
#13 0x00007ffff7778f3e in EVP_PKEY_new_raw_private_key (type=<optimized out>, e=<optimized out>, priv=<optimized out>, len=<optimized out>) at crypto/evp/p_lib.c:497
#14 0x000055555555529d in main ()

P.S. This only took me 3 hours to debug, and is probably only the first of many problems you are likely to encounter.

P.P.S. Indeed this is only the first problem of many. See this GLIBC bug.

Loading so files with dlsym, cannot load library

Read carefully and several times the dlopen(3) man page.

When the file path does not contain any /, some specific processing happens: dlopen searches the dynamic linker's library search path (LD_LIBRARY_PATH, the ld.so cache, the default library directories) rather than the current directory.

You should dlopen a path like ./foo.so (otherwise add . at the end of your LD_LIBRARY_PATH environment variable, but this might open a security risk, so I don't advise doing that).

Always test the results of both dlopen and dlsym against NULL, and display dlerror() on failure.
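The two pieces of advice above can be sketched as one helper. This is only an illustration under the assumption of a POSIX system; load_symbol is a hypothetical name, and the path and symbol you pass in are your own.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Opens the shared object at path and looks up symbol in it.
 * Returns the symbol's address, or NULL on failure (after printing dlerror()).
 * On success, *handle_out receives the library handle, which the caller
 * should eventually pass to dlclose(). */
void *load_symbol(const char *path, const char *symbol, void **handle_out) {
    *handle_out = NULL;

    /* Pass a path containing '/', such as "./foo.so", to skip the
     * library search path and open exactly that file. */
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
        return NULL;
    }

    dlerror(); /* clear any stale error before calling dlsym */
    void *addr = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL || addr == NULL) {
        fprintf(stderr, "dlsym(%s) failed: %s\n", symbol,
                err != NULL ? err : "symbol resolves to NULL");
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return addr;
}
```

The returned void* still has to be converted to a function pointer of the correct type before it is called.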

BTW, your plugin manager should be compiled with:

      g++ -Wall -g plugin_manager.cpp -o plugin_manager -ldl

Don't mention plugin_manager.o

How to properly setup dynamic library loading with header file?

You have a few options, in order of preference:

  1. Get the libraries from the maintainer. Providing the header but not the library (at least a stub library like we do for libraries in the NDK) just won't work.
  2. Build your own stub library. It's pretty straightforward if you have a list of symbols to expose. Put int foo; void bar() {} in a C file for all the variables and functions you need to expose and build it as a shared lib. If you have the list of symbols in a version script, you might be able to use Android's gen_stub_libs.py to do it for you.
  3. Mark all the symbols with __attribute__((weak)) in the header file. The linker won't complain that they are missing. If they're missing at runtime, the library will still load but each function's address will be nullptr. This is not really what you want in most cases, because if your definition of the library is wrong you turn build-time failures into runtime failures, but in some cases it can be handy because it's easier to check for function availability with if (foo) { foo(); } than to do the same with dlsym.
  4. Add -Wl,--allow-shlib-undefined to your ldflags. This is even worse than 3 because it affects all the libraries you link, but it wouldn't require you to meddle with the header.
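Option 3 can be sketched as follows, assuming GCC or Clang targeting an ELF platform such as Linux. optional_feature stands in for a function declared in the vendor's header and is hypothetical, as is call_optional_feature.

```c
#include <stdio.h>

/* Hypothetical library function. Because the declaration is weak, a missing
 * definition is not a link error; the address is simply null at run time. */
extern void optional_feature(void) __attribute__((weak));

/* Calls optional_feature if it is available.
 * Returns 1 if the call was made, 0 if the symbol is missing. */
int call_optional_feature(void) {
    if (optional_feature) {
        optional_feature();
        return 1;
    }
    puts("optional_feature is not available");
    return 0;
}
```

When nothing in the program or its libraries defines optional_feature, the check fails and the function returns 0 instead of crashing.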

Requiring a different CMake version results in a dlopen undefined symbol error

This is caused by policy CMP0065 which states:

New in version 3.4.

Do not add flags to export symbols from executables without the ENABLE_EXPORTS target property.

CMake 3.3 and below, for historical reasons, always linked executables on some platforms with flags like -rdynamic to export symbols from the executables for use by any plugins they may load via dlopen. CMake 3.4 and above prefer to do this only for executables that are explicitly marked with the ENABLE_EXPORTS target property.

The OLD behavior of this policy is to always use the additional link flags when linking executables regardless of the value of the ENABLE_EXPORTS target property.

The NEW behavior of this policy is to only use the additional link flags when linking executables if the ENABLE_EXPORTS target property is set to True.

This policy was introduced in CMake version 3.4. Unlike most policies, CMake version 3.24.0-rc2 does not warn by default when this policy is not set and simply uses OLD behavior. See documentation of the CMAKE_POLICY_WARNING_CMP0065 variable to control the warning.

Note: The OLD behavior of a policy is deprecated by definition and may be removed in a future version of CMake.

Hence, CMake is working as intended. Indeed this is a reasonable policy; adding such flags affects how the linker loads libraries and can have performance consequences when such functionality isn't actually used.

To fix your program, write:

add_executable(loader
  Loader.cpp
  common/log/Log2Console.cpp
  common/misc/Converts.cpp
)
target_link_libraries(loader PRIVATE ${CMAKE_DL_LIBS})
set_target_properties(loader PROPERTIES ENABLE_EXPORTS TRUE)

The last line is the important one for your problem; it will make sure that symbols in your executable are available to dynamically loaded libraries/plugins. The change from dl to ${CMAKE_DL_LIBS} is to account for systems which don't have a separate dl lib or have it named differently (for instance most BSDs including macOS, and HP-UX).

Why does this dynamic library loading code work with gcc?

As aschepler says, it's because you got lucky.

As it turns out, the ABI used for gcc (and most other compilers) for both x86 and x64 returns 'large' structs (too big to fit in a register) by passing an extra 'hidden' pointer arg to the function, which uses that pointer as space to store the return value, and then returns the pointer itself. So it turns out that a function of the form

struct foo func(...)

is roughly equivalent to

struct foo *func(..., struct foo *)

where the caller is expected to allocate space for a 'foo' (probably on the stack) and pass in a pointer to it.

So if you have a function that expects to be called this way (because it returns a struct) and you instead call it through a function pointer declared as returning a pointer, it MAY appear to work. If the garbage bits it receives for the extra argument (random register contents left there by the caller) happen to point somewhere writable, the called function will happily write its return value there and then return that pointer, so the calling code gets back something that looks like a valid pointer to the struct it expects. The code may superficially appear to work, but it's probably clobbering a random piece of memory that may be important later.
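For contrast, here is a minimal sketch of the well-defined way to call such a function through a pointer: the pointer type returns the struct by value, so the compiler passes the hidden result pointer correctly. struct big, make_big, make_big_fn and last_value_via_pointer are hypothetical names used only for illustration.

```c
/* A struct too large to be returned in registers on common ABIs. */
struct big {
    long values[8];
};

/* Returns the struct by value; the compiler arranges the hidden pointer. */
struct big make_big(long seed) {
    struct big b;
    for (int i = 0; i < 8; i++) {
        b.values[i] = seed + i;
    }
    return b;
}

/* Correct pointer type: same parameters, same by-value return type. */
typedef struct big (*make_big_fn)(long);

/* Calls make_big through a correctly typed pointer and returns the
 * last element of the result. */
long last_value_via_pointer(long seed) {
    make_big_fn fn = make_big;
    struct big result = fn(seed); /* caller supplies storage for the result */
    return result.values[7];
}
```

With a pointer obtained from dlsym, the same rule applies: convert the void* to a pointer type whose return type matches the real function, not to a pointer-returning one.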


