Does Linking with '-lpthread' Change Application Behaviour? (Linux, Glibc)

glibc itself contains stubs for many pthread functions, and these stubs do nothing. When the program is linked with libpthread, the stubs are replaced by the real pthread locking functions.

This is intended for libraries that need to be thread safe but do not use threads themselves. Such libraries can call the pthread locking functions, but the locks only take effect once a program or library that links to libpthread is loaded.
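As a rough illustration of that pattern, here is a minimal sketch of a library routine that takes a pthread mutex without ever creating a thread. The names lib_next_id, id_lock and next_id are made up for this example. On the glibc versions described above, the lock calls would resolve to the do-nothing stubs in libc unless libpthread is loaded; on glibc 2.34 and later, where libpthread was merged into libc, the real implementations are always used.

```c
#include <pthread.h>

/* Hypothetical library state, protected by a statically initialized mutex. */
static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long next_id = 0;

/* Returns a fresh identifier. The library never creates threads itself;
   it only brackets the update with lock/unlock calls so that it is safe
   if the calling program happens to be multithreaded. */
unsigned long lib_next_id(void)
{
    unsigned long id;

    pthread_mutex_lock(&id_lock);
    id = ++next_id;
    pthread_mutex_unlock(&id_lock);
    return id;
}
```

A single-threaded caller sees the same results either way; the only difference is whether the two pthread calls do any real work.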

How to explain this glibc modification on libpthread?

From the standard:

Attempting to destroy a condition variable upon which other threads
are currently blocked results in undefined behavior.

From my perspective it brings in a defect.

Your program depended on undefined behavior. Now you pay the price of doing so.

undefined behavior in shared lib using libpthread, but not having it in ELF as dependency

There's a lot happening here: differences between gcc and clang, differences between GNU ld and gold, the --as-needed linker flag, two different failure modes, and maybe even some timing issues.

Let's start with how to link a program using POSIX threads.

The compiler's -pthread flag is all you should need. It's a compiler flag, so you should use it both when compiling code that uses threads and when linking the final executable. When you use -pthread on the link step, the compiler will provide the -lpthread flag automatically, and in the right place in the link line.

Typically, you would only use it when linking the final executable, and not when linking a shared library. If you simply want to make your library thread safe, but don't want to force every program that uses your library to link with pthreads, you'd want to use a runtime check to see if the pthreads library is loaded, and call the pthread APIs only if it is. On Linux, this is typically done by checking a "canary" -- for example, make a weak reference to an arbitrary symbol like __pthread_key_create, which will only be defined if the library is loaded, and will have the value 0 if the program was linked without it.
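The canary check could look roughly like the sketch below. The weak declaration of __pthread_key_create follows the description above; threads_active, record_hit, stats_lock and hits are hypothetical names for illustration. Note that on glibc 2.34 and later, where libpthread was merged into libc, the canary should always be nonzero, so the check mainly matters on older systems.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Weak reference to the canary symbol. If no loaded object defines it,
   the dynamic loader leaves its address as 0 instead of failing. */
extern int __pthread_key_create(pthread_key_t *, void (*)(void *))
    __attribute__((weak));

/* True if the pthread implementation appears to be loaded. */
bool threads_active(void)
{
    return __pthread_key_create != NULL;
}

/* Hypothetical library state. */
static pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
static long hits = 0;

/* Increments a counter, locking only when threads may actually exist. */
long record_hit(void)
{
    bool threaded = threads_active();
    long n;

    if (threaded)
        pthread_mutex_lock(&stats_lock);
    n = ++hits;
    if (threaded)
        pthread_mutex_unlock(&stats_lock);
    return n;
}
```

This is a simplification: here only the canary is weak and the lock calls themselves are ordinary references, which is enough to show the runtime check but is an assumption of this sketch rather than how any particular library does it.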

In your case, however, your library libodr.so pretty much depends on threads, so it's reasonable to link it with the -pthread flag.

That brings us to the first failure mode: if you use g++ and gold for both link steps, the program throws std::system_error and says you need to enable multithreading. This is due to the --as-needed flag. GCC passes --as-needed to the linker by default, while clang (apparently) does not. With --as-needed, the linker will only record library dependencies that resolve a strong reference. Since all the references to pthread APIs are weak, none of them are sufficient to tell the linker that libpthread.so should be added to the dependency list (via a DT_NEEDED entry in the dynamic table). Changing to clang or adding a -Wl,--no-as-needed flag solves this problem, and the program will load the pthread library.

But, wait, why don't you need to do this when using the GNU linker? It uses the same rule: only a strong reference causes the library to be recorded as a dependency. The difference is that GNU ld also considers references from other shared libraries, while gold only considers references from regular object files. It turns out that the pthread library provides overriding definitions of several libc symbols, and there are strong references from libstdc++.so to some of those symbols (e.g., write). Those strong references are enough to get GNU ld to record libpthread.so as a dependency. This is more of an accident than design; I don't think changing gold to consider references from other shared libraries would actually be a robust fix. I think the proper solution is for GCC to put --no-as-needed in front of the -lpthread flag when you use -pthread.

This raises the question of why this issue doesn't come up all the time when using POSIX threads and the gold linker. But this is a small test program; a larger program is almost certain to contain strong references to some of those libc symbols that libpthread.so overrides.

Now let's look at the second failure mode, where both Notify() and Get() block indefinitely if you link libodr.so with g++, gold and -lpthread.

In Notify(), you're holding the lock through the end of the function, including the call to cv.notify_one(). You really only need to hold the lock while setting the ready flag. If we release the lock before calling notify_one(), the thread calling Get() times out after 300 ms and does not block. So it's really the call to notify_one() that's blocking, and the program deadlocks because Get() is waiting on that same lock.
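The original code is C++ and is not reproduced here, so the following is only a C analogue of the "release the lock before notifying" pattern, using the pthread calls that std::condition_variable is built on. The names notify_ready and get_ready, and the globals, are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static bool ready = false;

/* Producer: hold the lock only while setting the flag, then signal. */
void notify_ready(void)
{
    pthread_mutex_lock(&mtx);
    ready = true;
    pthread_mutex_unlock(&mtx);
    pthread_cond_signal(&cv);
}

/* Consumer: wait up to timeout_ms for the flag; returns its final value. */
bool get_ready(long timeout_ms)
{
    struct timespec deadline;
    bool result;
    int rc = 0;

    /* pthread_cond_timedwait takes an absolute CLOCK_REALTIME deadline. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&mtx);
    /* Loop to handle spurious wakeups; stop on timeout or any error. */
    while (!ready && rc == 0)
        rc = pthread_cond_timedwait(&cv, &mtx, &deadline);
    result = ready;
    pthread_mutex_unlock(&mtx);
    return result;
}
```

If notify_ready() has already run, get_ready(300) returns true without waiting; otherwise it should give up after roughly 300 ms.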

So why does it block only when __pthread_key_create is FUNC instead of NOTYPE? I think the type of the symbol is a red herring, and that the real problem is caused by the fact that gold doesn't record the symbol versions for references resolved by a library that isn't added as a needed library. The implementation of wait_for calls pthread_cond_timedwait, which has two versions in both libpthread and libc. It's possible that the loader is binding the reference to the wrong version, causing a deadlock by failing to unlock the mutex. I made a temporary patch to gold to record those versions, and that made the program work. Unfortunately, that's not a solution, as that patch can cause ld.so to crash under other circumstances.

I tried changing cv.wait_for(...) to cv.wait(lock, []{ return ready; }), and the program runs perfectly in all scenarios, which further suggests that the problem is with pthread_cond_timedwait.

The bottom line is that adding the --no-as-needed flag will fix the problem for this very small test case. Anything larger is likely to work without the extra flag, as you'll be increasing the odds of making a strong reference to a symbol in libpthread. (For example, adding a call to std::this_thread::sleep_for anywhere in odr.cpp adds a strong reference to nanosleep, which puts libpthread in the needed list.)

Update: I've verified that the failing program is linking to the wrong version of pthread_cond_timedwait. In glibc 2.3.2, the pthread_cond_t type was changed, and the old versions of the APIs that use the type were changed to dynamically allocate a new (bigger) structure and store a pointer to it in the original type. So now, if the consuming thread reaches cv.wait_for before the producing thread reaches cv.notify_one, the implementation of cv.wait_for calls the old version of pthread_cond_timedwait, which initializes what it thinks is an old pthread_cond_t in cv with a pointer to a new pthread_cond_t. After that, when the other thread reaches cv.notify_one, its implementation assumes that cv contains a new-style pthread_cond_t rather than a pointer to one, so it calls pthread_mutex_lock with the pointer to the new pthread_cond_t instead of the pointer to the mutex. It locks that would-be mutex, which then never gets unlocked, because the other thread unlocks the real mutex.

Why no error without library specifiers in Linux?

Can't reproduce:

#include <pthread.h>

void *thread(void *arg)
{
    (void) arg;
    return 0;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, 0, thread, 0);
    return 0;
}

Trying to link without libpthread:

> gcc -Wall -o thread thread.c
/tmp/ccyyu0cn.o: In function `main':
thread.c:(.text+0x2e): undefined reference to `pthread_create'
collect2: error: ld returned 1 exit status

edit: you can check the symbols defined in a library with nm -D, e.g. in my case:

> nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep pthread_create
> nm -D /lib/x86_64-linux-gnu/libpthread.so.0 | grep pthread_create
00000000000082e0 T pthread_create

(so pthread_create is not found in libc, but indeed in libpthread)

edit2: The only possible reason for the behavior you claim to observe would be that one of the libraries linked by default (libc, maybe libgcc) defines pthread_create. Then it would probably still depend on things only defined in libpthread. I now wonder whether this really is the case for some particular version. Please give feedback.

-pthread, -lpthread and minimal dynamic linktime dependencies

The idea that you should use GCC's special option -pthread instead of -lpthread has been outdated for probably a decade and a half (with respect to glibc, that is). In modern glibc, the switch to threading is entirely dynamic, based on whether the pthreads library is linked or not. Nothing in the glibc headers changes its behavior based on whether _REENTRANT is defined.

As an example of the dynamic switching, consider FILE * streams. Certain operations on streams are locking, like putc. Whether you're compiling a single-threaded program or not, it calls the same putc function; it is not re-routed by the preprocessor to a "pthread-aware" putc. What happens is that do-nothing stub functions are used to go through the motions of locking and unlocking. These functions get overridden to real ones when the threading library is linked in.
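To make the "motions of locking and unlocking" concrete, here is a sketch of what a locking putc does conceptually, written with the POSIX flockfile, putc_unlocked and funlockfile functions. The wrapper name locked_putc is made up, and this is not glibc's actual implementation, only the as-if behavior that POSIX describes for stdio functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Conceptual equivalent of putc: take the stream's lock, write one
   character with the non-locking variant, then release the lock.
   The same function is called whether or not the program has threads. */
int locked_putc(int c, FILE *fp)
{
    int result;

    flockfile(fp);
    result = putc_unlocked(c, fp);
    funlockfile(fp);
    return result;
}
```

Calling locked_putc('x', stdout) behaves like putc('x', stdout) from the caller's point of view.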



I just did a cursory grep through the include file tree of a glibc installation. In features.h, _REENTRANT causes __USE_REENTRANT to be defined. In turn, exactly one thing seems to depend on whether __USE_REENTRANT is present, but has a parallel condition which also enables it. Namely, in <unistd.h> there is this:

#if defined __USE_REENTRANT || defined __USE_POSIX199506
/* Return at most NAME_LEN characters of the login name of the user in NAME.
If it cannot be determined or some other error occurred, return the error
code. Otherwise return 0.

This function is a possible cancellation point and therefore not
marked with __THROW. */
extern int getlogin_r (char *__name, size_t __name_len) __nonnull ((1));
#endif

This looks dubious and is obsolete; I can't find it in the master branch of the glibc git repo.

And, oh look, just mere days ago (December 6) a commit was made on this topic:

https://sourceware.org/git/?p=glibc.git;a=commit;h=c03073774f915fe7841c2b551fe304544143470f

Make _REENTRANT and _THREAD_SAFE aliases for _POSIX_C_SOURCE=199506L.

For many years, the only effect of these macros has been to make
unistd.h declare getlogin_r. _POSIX_C_SOURCE >= 199506L also causes
this function to be declared. However, people who don't carefully
read all the headers might be confused into thinking they need to
define _REENTRANT for any threaded code (as was indeed the case a long
time ago).

Among the changes:

--- a/posix/unistd.h
+++ b/posix/unistd.h
@@ -849,7 +849,7 @@ extern int tcsetpgrp (int __fd, __pid_t __pgrp_id) __THROW;
 This function is a possible cancellation point and therefore not
 marked with __THROW. */
 extern char *getlogin (void);
-#if defined __USE_REENTRANT || defined __USE_POSIX199506
+#ifdef __USE_POSIX199506
 /* Return at most NAME_LEN characters of the login name of the user in NAME.
 If it cannot be determined or some other error occurred, return the error
 code. Otherwise return 0.

See? :)
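In practice this means a POSIX feature-test macro is enough to get the getlogin_r declaration, with no mention of _REENTRANT. A minimal sketch, assuming a glibc system; current_login is a hypothetical wrapper name:

```c
/* Per the commit message above, any _POSIX_C_SOURCE >= 199506L should
   expose getlogin_r; 200809L is used here. _REENTRANT is not defined. */
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <unistd.h>

/* Copies the login name into buf. Returns 0 on success, or a positive
   error code (for example when there is no controlling terminal). */
int current_login(char *buf, size_t len)
{
    return getlogin_r(buf, len);
}
```

Whether the call succeeds depends on the environment, so callers should check the return value before using the buffer.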

What are the correct link options to use std::thread in GCC under linux?

I think on Linux pthread is used to implement std::thread so you need to specify the -pthread compiler option.

As this is a linking option, this compiler option needs to come AFTER the source files:

$ g++ -std=c++0x test.cpp -pthread

