Shared Libraries (Dlopen) and Thread-Safety of Library Static Pointers

I'll expand on what Basile said. I followed up with glibc and found out that dlopen there does indeed use mmap. All memory-visibility guarantees are inherited from the mmap system call; dlopen itself doesn't make any additional guarantees.

Users of mmap generally assume that it will map memory correctly across all processors at the point of its return such that visibility is not a concern. This does not appear to be an explicit guarantee, but the OS would probably be unusable without such a guarantee. There is also no known system where this doesn't work as expected.

What happens to the thread spawned by a shared library, upon dlclose

Congratulations - you've rediscovered that a non-nop dlclose implementation is fundamentally unsafe. The C language makes no provision for code or pseudo-static-storage data whose lifetime is anything other than the whole lifetime of the program, and in general library code cannot be safely removed since there are all sorts of ways that references to it may have leaked and still be reachable. You've actually found an exceptionally good example; common implementations attempt to catch and "fix" leaks via atexit and such by running the handlers at dlclose time, but there doesn't seem to be any way to catch and fix a thread left running.

As a workaround, there is a special ELF flag you can set by passing -Wl,-z,nodelete when linking the shared library libB.so (or libA.so if that's more convenient, e.g. if you don't have control over how libB.so is linked) that will prevent it from being unloaded by dlclose. Unfortunately, this design is backwards: unloading is fundamentally unsafe unless a library is specifically written to be safe against unloading, so the default should be "nodelete", with an explicit option required to make unloading possible. There's little chance this will ever be fixed.

Another way to prevent unloading is to have a constructor call dlopen on the library itself to leak a reference, so that the reference count will always be positive and dlclose will do nothing.
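A minimal sketch of that self-pinning trick, assuming a glibc-like platform where dladdr is available. The helper name pin_object_containing is made up for illustration; it is not part of any library. It looks up which loaded object contains a given address, then opens that object once more and never closes the handle.

```cpp
#include <dlfcn.h>   // dladdr, dlopen, dlerror
#include <cassert>
#include <cstdio>

// Hypothetical helper (not a libc function): take an extra reference on
// whichever loaded object contains `addr` and never release it.
bool pin_object_containing(const void* addr)
{
    assert(addr != nullptr);
    Dl_info info{};
    if (dladdr(addr, &info) == 0 || info.dli_fname == nullptr) {
        return false;                 // address is not inside a loaded object
    }
    // The returned handle is leaked on purpose: the reference count stays
    // positive, so a later dlclose by the application cannot unload the object.
    void* handle = dlopen(info.dli_fname, RTLD_NOW | RTLD_NODELETE);
    if (handle == nullptr) {
        std::fprintf(stderr, "pin failed: %s\n", dlerror());
        return false;
    }
    return true;
}

// Inside the shared library, a constructor would then pin the library itself:
//
//   __attribute__((constructor)) static void pin_this_library()
//   {
//       static const char anchor = 0;       // any object defined in the library
//       pin_object_containing(&anchor);
//   }
```

The constructor attribute shown in the comment is a GCC/Clang extension. On older glibc versions the program may also need to be linked with -ldl.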

When using dlopen, shall I link against a library I open?

Your "partial answer" is a correct fix. For an explanation, see that nice answer about the virtual keyword.

In short:

As long as the init method in ntclass is not declared virtual, the expression

A->init("A")

uses the definition of the method from that class (regardless of the actual type of the object A points to). Because this definition is absent from main_dlopen.cpp, the linker reports an error.

With the virtual keyword, resolution of the init method is deferred to run time, when the actual type of the object is known.
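To illustrate, here is a sketch that reuses the names ntclass and init from the question. Everything else (ntclass_impl, name(), create_ntclass, destroy_ntclass) is hypothetical. I made the methods pure virtual, which is one way to guarantee the executable never needs their definitions. In the real setup the implementation and the factory functions would be compiled into the shared library, and the executable would find the factory with dlsym.

```cpp
#include <cassert>
#include <string>

// Header shared by the executable and the library. Every method is virtual,
// so calls go through the vtable and need no link-time definition.
class ntclass {
public:
    virtual ~ntclass() = default;
    virtual void init(const char* name) = 0;
    virtual const std::string& name() const = 0;   // hypothetical accessor
};

// --- the part below would live in the shared library ---

class ntclass_impl : public ntclass {
public:
    void init(const char* name) override
    {
        assert(name != nullptr);
        name_ = name;
    }
    const std::string& name() const override { return name_; }

private:
    std::string name_;
};

// Unmangled factory functions, so they can be looked up by name with dlsym.
extern "C" ntclass* create_ntclass() { return new ntclass_impl; }
extern "C" void destroy_ntclass(ntclass* p) { delete p; }
```

The executable only ever touches ntclass through a pointer obtained from the factory, so A->init("A") is dispatched at run time to the definition inside the library.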

using std::thread in shared library causes SIGSEGV

I guess the problem is more related to dynamic linking
than threads.

The call dlopen("libso.so", RTLD_LAZY) will try to
find the library in a standard location.

Unless you set the LD_LIBRARY_PATH environment
variable to something that includes . (the current
directory), this library won't be found.

For a simple test you can either:

use export LD_LIBRARY_PATH=. in the terminal before
launching your program,
use dlopen("./libso.so", RTLD_LAZY) in your source code.

If dlopen() or dlsym() returns a null pointer,
dlerror() can help display the reason for
the failure.

Note that on Windows the current directory and the executable
path are standard search paths for dynamic libraries; on UNIX
this is not the case, which could be surprising when changing
the target platform.

edit

cmake uses the -Wl,-rpath option to hardcode a library search
path in the executable, so all of what I explained above becomes
useless for this problem.

Assuming the dynamic library is found, the only way I can reproduce
the crash is to forget pthread in target_link_libraries for
test.

second edit

I finally managed to reproduce the crash with Ubuntu (in WSL).

Apparently your linker decides to ignore the libraries that are
not directly used by the executable.

This behavior suggests that the linker option --as-needed is
switched on by default.

To override this default behaviour, you need to pass the linker
option --no-as-needed before -lpthread.

This way, you don't have to insert a dummy thread in your
executable.

Using set(CMAKE_CXX_FLAGS -Wl,--no-as-needed) in the CMakeLists.txt
file you provided did the trick for me.

What happens to global and static variables in a shared library when it is dynamically linked?

This is a pretty famous difference between Windows and Unix-like systems.

No matter what:

Each process has its own address space, meaning that there is never any memory being shared between processes (unless you use some inter-process communication library or extensions).
The One Definition Rule (ODR) still applies, meaning that you can only have one definition of the global variable visible at link-time (static or dynamic linking).

So, the key issue here is really visibility.

In all cases, static global variables (or functions) are never visible from outside a module (dll/so or executable). The C++ standard requires that these have internal linkage, meaning that they are not visible outside the translation unit (which becomes an object file) in which they are defined. So, that settles that issue.

Where it gets complicated is when you have extern global variables. Here, Windows and Unix-like systems are completely different.

In the case of Windows (.exe and .dll), the extern global variables are not part of the exported symbols. In other words, different modules are in no way aware of global variables defined in other modules. This means that you will get linker errors if you try, for example, to create an executable that is supposed to use an extern variable defined in a DLL, because this is not allowed. You would need to provide an object file (or static library) with a definition of that extern variable and link it statically with both the executable and the DLL, resulting in two distinct global variables (one belonging to the executable and one belonging to the DLL).

To actually export a global variable in Windows, you have to use a syntax similar to the function export/import syntax, i.e.:

#ifdef COMPILING_THE_DLL
#define MY_DLL_EXPORT extern "C" __declspec(dllexport)
#else
#define MY_DLL_EXPORT extern "C" __declspec(dllimport)
#endif

MY_DLL_EXPORT int my_global;

When you do that, the global variable is added to the list of exported symbols and can be linked like all the other functions.

In the case of Unix-like environments (like Linux), the dynamic libraries, called "shared objects" (with extension .so), export all extern global variables (or functions). In this case, if you do load-time linking from anywhere to a shared object file, then the global variables are shared, i.e., linked together as one. Basically, Unix-like systems are designed to make it so that there is virtually no difference between linking with a static or a dynamic library. Again, ODR applies across the board: an extern global variable will be shared across modules, meaning that it should have only one definition across all the modules loaded.

Finally, in both cases, for Windows or Unix-like systems, you can do run-time linking of the dynamic library, i.e., using either LoadLibrary() / GetProcAddress() / FreeLibrary() or dlopen() / dlsym() / dlclose(). In that case, you have to manually get a pointer to each of the symbols you wish to use, and that includes the global variables you wish to use. For global variables, you can use GetProcAddress() or dlsym() just the same as you do for functions, provided that the global variables are part of the exported symbol list (by the rules of the previous paragraphs).
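For the dlsym() side, a minimal sketch of fetching a pointer to an exported int variable. The helper name find_exported_int is made up for illustration; the symbol passed in must be exported under an unmangled name (for instance declared extern "C", like my_global above).

```cpp
#include <dlfcn.h>
#include <cassert>

// Hypothetical helper: returns the address of an exported `int` variable,
// or nullptr if the symbol cannot be found through `handle`.
int* find_exported_int(void* handle, const char* name)
{
    assert(handle != nullptr && name != nullptr);
    dlerror();                           // reset the error state
    void* sym = dlsym(handle, name);     // address of the variable itself
    if (dlerror() != nullptr) {
        return nullptr;                  // lookup failed
    }
    return static_cast<int*>(sym);
}
```

Usage would look like int* g = find_exported_int(handle, "my_global"); followed by a null check before reading or writing *g. Note that dlsym() has no idea about types, so the cast is entirely the caller's responsibility.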

And of course, as a necessary final note: global variables should be avoided. And I believe that the text you quoted (about things being "unclear") is referring exactly to the platform-specific differences that I just explained (dynamic libraries are not really defined by the C++ standard, this is platform-specific territory, meaning it is much less reliable / portable).

Injecting dynamic lib into a thread

The shared libraries are loaded into a process. It shouldn't matter what thread you've used to load the library, the code should equally be available to all the threads once the library has been loaded. This is because the POSIX threads share the same memory space, namely the memory space of the process.

When a CPU jumps to a memory address belonging to a loaded library, the memory is already there, mapped by the operating system into the process memory space. Threads have no influence whatsoever on the availability of that memory address.

Keep in mind though, that according to the library documentation you might want to avoid calling Library::new from multiple threads simultaneously.

What's a good, threadsafe, way to pass error strings back from a C shared library

My approach to the problem would be a little different from everyone else's. They're not wrong, it's just that I've had to wrestle with a different aspect of this problem.

A C API needs to provide numeric error codes, so that the code using the API can take sensible measures to recover from errors when appropriate, and pass them along when not. The errno.h codes demonstrate a good categorization of errors; in fact, if you can reuse those codes (or just pass them along, e.g. if all your errors come ultimately from system calls), do so.
- Do not copy errno itself. If possible, return error codes directly from functions that can fail. If that is not possible, have a GetLastError() method on your state object. You have a state object, yes?
If you have to invent your own codes (the errno.h codes don't cut it), provide a function analogous to strerror, that converts these codes to human-readable strings.
- It may or may not be appropriate to translate these strings. If they're meant to be read only by developers, don't bother. But if you need to show them to the end user, then yeah, you need to translate them.
- The untranslated version of these strings should indeed be just string constants, so you have no allocation headaches. However, do not waste time and effort coding your own translation infrastructure. Use GNU gettext.
If your code is layered on top of another piece of code, it is vital that you provide direct access to all the error information and relevant context information that that code produces, and you make it easy for developers against your code to wrap up all that information in an error message for the end user.
- For instance, if your library produces error codes of its own devising as a direct consequence of failing system calls, your state object needs methods that return the errno value observed immediately after the system call that failed, the name of the file involved (if any), and ideally also the name of the system call itself. People get this wrong waaay too often -- for instance, SQLite, otherwise a well designed API, does not expose the errno value or the name of the file, which makes it infuriatingly hard to distinguish "the file permissions on the database are wrong" from "you have a bug in your code".
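Here is a rough sketch of the points above, written in the C-compatible subset of C++. Every mylib_ name is made up for illustration: a small set of error codes, a strerror analogue returning string constants, and a state object that keeps the errno value, the system call name and the file name from the failing call.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

enum mylib_status { MYLIB_OK = 0, MYLIB_E_IO, MYLIB_E_INVALID };

// State object: carries full context about the most recent failure.
struct mylib_ctx {
    mylib_status last_status;
    int          sys_errno;    // errno observed right after the failing call
    const char*  sys_call;     // name of that system call (string constant)
    char         path[256];    // file involved, if any (truncated if longer)
};

// strerror analogue: plain string constants, so no allocation headaches.
const char* mylib_strerror(mylib_status s)
{
    switch (s) {
    case MYLIB_OK:        return "success";
    case MYLIB_E_IO:      return "I/O error";
    case MYLIB_E_INVALID: return "invalid argument";
    }
    return "unknown error";
}

// The error code is returned directly; details stay in the state object.
mylib_status mylib_check_readable(mylib_ctx* ctx, const char* path)
{
    assert(ctx != nullptr);
    ctx->sys_errno = 0;
    ctx->sys_call = "";
    ctx->path[0] = '\0';
    if (path == nullptr) {
        return ctx->last_status = MYLIB_E_INVALID;
    }
    std::snprintf(ctx->path, sizeof ctx->path, "%s", path);
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        ctx->sys_errno = errno;          // save it before anything can clobber it
        ctx->sys_call = "open";
        return ctx->last_status = MYLIB_E_IO;
    }
    close(fd);
    return ctx->last_status = MYLIB_OK;
}
```

With this, a caller can build a message such as "I/O error: open(/some/file): No such file or directory" from mylib_strerror, ctx.sys_call, ctx.path and strerror(ctx.sys_errno), which is exactly the information needed to tell a permissions problem from a bug.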

EDIT: Addendum: common mistakes in this area include:

Contorting your API (e.g. with use of out-parameters) so that functions that would naturally return some other value can return an error code.
Not exposing enough detail for callers to be able to produce an error message that allows a knowledgeable human to fix the problem. (This knowledgeable human may not be the end user. It may be that your error messages wind up in server log files or crash reports for developers' eyes only.)
Exposing too many different fine distinctions among errors. If your callers will never plausibly do different things in response to two different error codes, they should be the same code.
Providing more than one success code. This is asking for subtle bugs.

Also, think very carefully about which APIs ought to be allowed to fail. Here are some things that should never fail:

Read-only data accessors, especially those that return scalar quantities, most especially those that return Booleans.
Destructors, in the most general sense. (This is a classic mistake in the UNIX kernel API: close and munmap should not be able to fail. Thankfully, at least _exit can't.)
There is a strong case that you should immediately call abort if malloc fails rather than trying to propagate it to your caller. (This is not true in C++ thanks to exceptions and RAII -- if you are so lucky as to be working on a C++ project that uses both of those properly.)

In closing: for an example of how to do just about everything wrong, look no further than XPCOM.

In C++, can two other different shared objects access a Singleton from a third shared object?

A dynamic library (shared object), which includes a singleton class with a thread-safe Queue.

Singletons are used when you want to constrain a class to be instantiated only once. That's not what you want: you want all your plugins to work on a particular instance of a class. There is no "only one can live" requirement here.

A thread-safe singleton in C++11 using Meyers' pattern may look like this:

class Singleton
{
private:
    Singleton();

public:
    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

    static Singleton& get_instance()
    {
         static Singleton s;
         return s;
    }
};

Default constructor is declared private, and copy/assignment operations are deleted to avoid multiple instances.

You need something simpler: a function always returning the same instance. Something like this:

class Manager
{
public:
    static Resource& get_resource()
    {
         static Resource r;
         return r;
    }
};

No need to prevent multiple instantiation: if you want the same instance, just ask for the same instance.

You can also extend the design with a resource pool returning the same instance for a given id:

enum class ResourceId
{
    ID_FOR_A_FAMILY_OF_PLUGIN,
    ID_FOR_AN_OTHER_FAMILY_OF_PLUGIN
};

class Pool
{
public:
    static Resource& get_resource(ResourceId id)
    {
         static std::map<ResourceId, Resource> p;
         return p[id];
    }
};

Note that in this example p[id] is created on the fly with Resource's default constructor. You may want to pass parameters during construction:

class Resource
{
public:
    Resource():ready(false){}

    void init(/* some parameters */)
    {
        // do some initialization
        ready = true;
    }

    bool is_ready() const { return ready; }

private:
    bool ready;
};

class Pool
{
public:
    static Resource& get_resource(ResourceId id)
    {
         static std::map<ResourceId, Resource> p;
         auto& r = p[id];
         if(!r.is_ready())
         {
             r.init(/* some parameters */);
         }
         return r;
    }
};

Or, using pointers to allow polymorphism:

class Pool
{
public:
    static std::unique_ptr<Resource>& get_resource(ResourceId id)
    {
         static std::map<ResourceId, std::unique_ptr<Resource>> p;
         auto& r = p[id];
         if(!r)
         {
             r = std::make_unique<SomeResourceTypeForId>(/* some parameters */);
         }
         return r;
    }
};

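Note that the last two implementations need a mutex around the code that follows the static declaration to be thread-safe: C++11 guarantees that the static map itself is initialized safely, but not that concurrent lookups and insertions on it are safe.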
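A minimal sketch of the locked variant, assuming C++14 for std::make_unique. Resource is reduced to an empty polymorphic base here, and returning Resource& instead of the unique_ptr is my own choice, so that callers cannot reseat the pointer outside the lock.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>

enum class ResourceId
{
    ID_FOR_A_FAMILY_OF_PLUGIN,
    ID_FOR_AN_OTHER_FAMILY_OF_PLUGIN
};

// Placeholder for the real resource type.
class Resource
{
public:
    virtual ~Resource() = default;
};

class Pool
{
public:
    static Resource& get_resource(ResourceId id)
    {
        static std::mutex m;   // initialized thread-safely, like the map
        static std::map<ResourceId, std::unique_ptr<Resource>> p;

        std::lock_guard<std::mutex> lock(m);   // guards lookup and insertion
        auto& r = p[id];
        if (!r)
        {
            r = std::make_unique<Resource>();
        }
        assert(r);
        return *r;   // stays valid: the pool never erases or replaces entries
    }
};
```

The mutex only protects the pool's bookkeeping. If several plugins call methods on the same Resource concurrently, the resource itself still has to be thread-safe (like the thread-safe queue in the question).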
