Calling pthread_cond_signal Without Locking Mutex

Calling pthread_cond_signal without locking mutex

If you do not lock the mutex in the codepath that changes the condition and signals, you can lose wakeups. Consider this pair of processes:

Process A:

pthread_mutex_lock(&mutex);
while (condition == FALSE)
    pthread_cond_wait(&cond, &mutex);
pthread_mutex_unlock(&mutex);

Process B (incorrect):

condition = TRUE;
pthread_cond_signal(&cond);

Then consider this possible interleaving of instructions, where condition starts out as FALSE:

Process A                             Process B

pthread_mutex_lock(&mutex);
while (condition == FALSE)

                                      condition = TRUE;
                                      pthread_cond_signal(&cond);

pthread_cond_wait(&cond, &mutex);

The condition is now TRUE, but Process A is stuck waiting on the condition variable - it missed the wakeup signal. If we alter Process B to lock the mutex:

Process B (correct):

pthread_mutex_lock(&mutex);
condition = TRUE;
pthread_cond_signal(&cond);
pthread_mutex_unlock(&mutex);

...then the above cannot occur; the wakeup will never be missed.

(Note that you can actually move the pthread_cond_signal() itself after the pthread_mutex_unlock(), but this can result in less optimal scheduling of threads, and you've necessarily locked the mutex already in this code path due to changing the condition itself).
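As a minimal sketch of the corrected pattern, here are the waiter (Process A) and the fixed signaller (Process B) put together in one function. The names waiter and run_handoff are made up for this illustration and are not part of pthreads, and error handling is reduced to an assert:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool condition = false;          /* shared state, protected by mutex */

/* Process A: re-checks the predicate every time it wakes up. */
static void *waiter(void *arg)
{
    bool *saw_condition = arg;

    pthread_mutex_lock(&mutex);
    while (!condition)
        pthread_cond_wait(&cond, &mutex);
    *saw_condition = condition;
    pthread_mutex_unlock(&mutex);
    return NULL;
}

/* Hypothetical driver: returns true if the waiter observed the condition. */
bool run_handoff(void)
{
    pthread_t tid;
    bool saw_condition = false;
    int rc;

    pthread_mutex_lock(&mutex);
    condition = false;                  /* reset so the function can be reused */
    pthread_mutex_unlock(&mutex);

    rc = pthread_create(&tid, NULL, waiter, &saw_condition);
    assert(rc == 0);                    /* sketch-level error handling only */
    (void)rc;

    /* Process B (correct): change the condition and signal under the mutex. */
    pthread_mutex_lock(&mutex);
    condition = true;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&mutex);

    pthread_join(tid, NULL);
    return saw_condition;
}
```

Whichever thread gets the mutex first, the waiter either sees condition already set and skips the wait, or is blocked in pthread_cond_wait() when the signal arrives, so run_handoff() should return true without hanging.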

What happens to a thread calling pthread_cond_signal?

The thread that calls pthread_cond_signal returns immediately. It does not wait for the woken thread (if there is one) to do anything.

If you call pthread_cond_signal while holding the mutex that the blocked thread is using with pthread_cond_wait, then the blocked thread will potentially wake from the condition variable wait, then immediately block waiting for the mutex to be acquired, since the signalling thread still holds the lock.

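On some implementations you may get slightly better performance by unlocking the mutex before calling pthread_cond_signal, as long as the condition itself was changed while the mutex was held.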

Also, pthread_cond_wait may return even though no thread has signalled the condition variable. This is called a "spurious wake". You typically need to use pthread_cond_wait in a loop:

pthread_mutex_lock(&mut);
while(!ready){
    pthread_cond_wait(&cond,&mut);
}
// do stuff
pthread_mutex_unlock(&mut);

The signalling thread then sets the flag before signalling:

pthread_mutex_lock(&mut);
ready = 1;
pthread_mutex_unlock(&mut);
pthread_cond_signal(&cond);

Does the pthread_cond_signal function unlock the mutex the calling thread locked?

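Only the mutex passed to pthread_cond_wait() or pthread_cond_timedwait() is unlocked, to give other threads the chance to change the condition. Before pthread_cond_wait() returns, that mutex is locked again. None of the other condition variable functions, including pthread_cond_signal() and pthread_cond_broadcast(), lock or unlock a mutex on the caller's behalf.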
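One way to check that pthread_cond_signal() leaves the caller's mutex alone is to use an error-checking mutex, which reports EDEADLK when its owner tries to lock it a second time. This is only a sketch: relock_after_signal is a made-up helper name, and the feature-test macro is an assumption about what your toolchain needs to expose PTHREAD_MUTEX_ERRORCHECK:

```c
#define _POSIX_C_SOURCE 200809L         /* assumed to be needed for PTHREAD_MUTEX_ERRORCHECK */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: returns the result of trying to re-lock the mutex
 * immediately after pthread_cond_signal(). */
int relock_after_signal(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    pthread_cond_t c = PTHREAD_COND_INITIALIZER;
    int rc;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    rc = pthread_mutex_init(&m, &attr);
    assert(rc == 0);                    /* sketch-level error handling only */
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);
    pthread_cond_signal(&c);            /* no waiters: allowed, has no effect */
    rc = pthread_mutex_lock(&m);        /* EDEADLK if this thread still owns m */

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    pthread_cond_destroy(&c);
    return rc;
}
```

If the signal call had released the mutex, the second pthread_mutex_lock() would succeed and return 0. Getting EDEADLK instead shows the calling thread still owns it.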

Not locking mutex for pthread_cond_timedwait and pthread_cond_signal (on Linux)

The first is not OK:

The pthread_cond_timedwait() and pthread_cond_wait() functions shall block on a condition variable. They shall be called with mutex locked by the calling thread or undefined behavior results.

http://opengroup.org/onlinepubs/009695399/functions/pthread_cond_timedwait.html

The reason is that the implementation may want to rely on the mutex being locked in order to safely add you to a waiter list. And it may want to release the mutex without first checking it is held.

The second is disturbing:

if predictable scheduling behaviour is required, then that mutex is locked by the thread calling pthread_cond_signal() or pthread_cond_broadcast().

http://www.opengroup.org/onlinepubs/007908775/xsh/pthread_cond_signal.html

Off the top of my head, I'm not sure what the specific race condition is that messes up scheduler behaviour if you signal without taking the lock. So I don't know how bad the undefined scheduler behaviour can get: for instance maybe with broadcast the waiters just don't get the lock in priority order (or however your particular scheduler normally behaves). Or maybe waiters can get "lost".

Generally, though, with a condition variable you want to set the condition (at least a flag) and signal, rather than just signal, and for this you need to take the mutex. The reason is that otherwise, if you're concurrent with another thread calling wait(), then you get completely different behaviour according to whether wait() or signal() wins: if the signal() sneaks in first, then you'll wait for the full timeout even though the signal you care about has already happened. That's rarely what users of condition variables want, but may be fine for you. Perhaps this is what the docs mean by "unpredictable scheduler behaviour" - suddenly the timeslice becomes critical to the behaviour of your program.
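To make the "set a flag and signal" advice concrete for the timed case, here is a rough sketch. The names event_happened, post_event and wait_for_event are invented for the example, and the feature-test macro is an assumption about what is needed for clock_gettime(). Because the waiter checks the flag before it blocks, a signal that "sneaks in first" is not lost and the waiter does not sit out the full timeout:

```c
#define _POSIX_C_SOURCE 200809L         /* assumed to be needed for clock_gettime() */
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
static bool event_happened = false;     /* the condition, protected by mtx */

/* Signalling side: record the event under the mutex, then signal. */
void post_event(void)
{
    pthread_mutex_lock(&mtx);
    event_happened = true;
    pthread_cond_signal(&cv);
    pthread_mutex_unlock(&mtx);
}

/* Waiting side: returns 0 if the event was seen, or an error code such as
 * ETIMEDOUT if the deadline passed first. */
int wait_for_event(int timeout_sec)
{
    struct timespec deadline;
    int rc = 0;

    assert(timeout_sec >= 0);
    clock_gettime(CLOCK_REALTIME, &deadline);   /* timedwait takes an absolute time */
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&mtx);
    while (!event_happened && rc == 0)
        rc = pthread_cond_timedwait(&cv, &mtx, &deadline);
    if (event_happened) {
        event_happened = false;         /* consume the event */
        rc = 0;
    }
    pthread_mutex_unlock(&mtx);
    return rc;
}
```

Calling post_event() and then wait_for_event(5) should return 0 straight away, since the flag is already set when the waiter looks at it. With no event posted, wait_for_event(0) should come back with ETIMEDOUT.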

Btw, in Java you have to have the lock in order to notify() or notifyAll():

This method should only be called by a thread that is the owner of this object's monitor.

http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Object.html#notify()

The Java synchronized {/}/wait/notify/notifyAll behaviour is analogous to pthread_mutex_lock/pthread_mutex_unlock/pthread_cond_wait/pthread_cond_signal/pthread_cond_broadcast, and not by coincidence.

pthread_cond_signal() not giving enough time for the signaled thread to run

Signalling the condition variable does not give any kind of priority on locking the mutex to a thread that was waiting on that condition variable. All it means is that at least one thread waiting on the condition variable will start trying to acquire the mutex so it can return from the pthread_cond_wait(). The signalling thread will keep executing and can easily re-acquire the mutex first, as you've seen.

You should never have a condition variable without an actual condition over some shared state that you're waiting for - returning from a pthread_cond_wait() doesn't mean a thread should definitely proceed, it means that it should check if the condition it was waiting for is true. That's why they're called condition variables.

In this case, the state your writing thread wants to wait for is "the main thread has consumed the last data I wrote". However, your reading (main) thread also needs to wait on a condition: "the writing thread has written some new data". You can achieve both these conditions with a flag variable that indicates that some new, unconsumed data has been written to the data variable. The flag starts out unset, is set by the writing thread when it updates data, and is unset by the main thread when it reads from data. The writing thread waits for the flag to be unset, and the reading thread waits for the flag to be set.

With this arrangement, you also don't need to have the mutex locked when you start the writing thread - it doesn't matter which order the threads start, because everything is consistent either way.

The updated code looks like:

#include <stdio.h>
#include <pthread.h>

pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

int data = 0;
int data_available = 0;

void *thread(void *arg)
{
    int length = *(int *) arg;
    for (int i = 0; i < length; i++) {
        // Do some work
        pthread_mutex_lock(&lock);
        fprintf(stdout, "Waiting to write\n");
        while (data_available)
            pthread_cond_wait(&cond, &lock);
        fprintf(stdout, "Writing\n");
        data = i;
        data_available = 1;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }
    pthread_exit(0);
}


int main(int argc, const char *argv[])
{
    pthread_t worker;
    int length = 4;

    pthread_create(&worker, 0, thread, &length);

    for (int i = 0; i < length; i++) {
        pthread_mutex_lock(&lock);
        fprintf(stdout, "Waiting to read\n");
        while (!data_available)
            pthread_cond_wait(&cond, &lock);
        fprintf(stdout, "read data: %d\n", data);
        data_available = 0;
        pthread_cond_signal(&cond);
        pthread_mutex_unlock(&lock);
    }

    pthread_join(worker, NULL);
    return 0;
}

Of course, the threads end up working in lockstep - but essentially you have a producer-consumer with a maximum queue length of 1, so that's expected.

Condition variable - why is calling pthread_cond_signal() before calling pthread_cond_wait() a logical error?

The answer of blaze comes closest, but is not totally clear:

Condition variables should only be used to signal a change in a condition.

Thread 1 checks a condition. If the condition is not met, it waits on the condition variable until the condition is met. Because the condition is checked first, it doesn't need to care whether the condition variable was already signalled:

pthread_mutex_lock(&mutex); 
while (!condition)
    pthread_cond_wait(&cond, &mutex); 
pthread_mutex_unlock(&mutex);

Thread 2 changes the condition and signals the change via the condition variable. It doesn't care whether any threads are waiting or not:

pthread_mutex_lock(&mutex);
changeCondition();
pthread_mutex_unlock(&mutex);
pthread_cond_signal(&cond);

The bottom line is: the communication is done via some condition. A condition variable only wakes up waiting threads so they can check the condition.

Examples for conditions:

The queue is not empty, so an entry can be taken from the queue.
A boolean flag is set; the thread waits until the other thread signals that it's okay to continue.
Some bits in a bitset are set, so the waiting thread can handle the corresponding events.
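To illustrate the first of these, here is a small sketch of a fixed-size FIFO where the condition is "the queue is not empty". All of the names (QUEUE_CAP, queue_push, queue_pop and the q_ variables) are invented for the example, and a full queue is simply reported as an error rather than waited on:

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_CAP 8

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t q_not_empty = PTHREAD_COND_INITIALIZER;
static int q_items[QUEUE_CAP];
static int q_head = 0;                  /* index of the oldest entry */
static int q_count = 0;                 /* the condition is q_count > 0 */

/* Adds a value; returns 0 on success or -1 if the queue is full. */
int queue_push(int value)
{
    int rc = -1;

    pthread_mutex_lock(&q_lock);
    if (q_count < QUEUE_CAP) {
        q_items[(q_head + q_count) % QUEUE_CAP] = value;
        q_count++;                      /* change the condition ... */
        pthread_cond_signal(&q_not_empty);  /* ... then wake one waiter */
        rc = 0;
    }
    pthread_mutex_unlock(&q_lock);
    return rc;
}

/* Removes and returns the oldest value, blocking while the queue is empty. */
int queue_pop(void)
{
    int value;

    pthread_mutex_lock(&q_lock);
    while (q_count == 0)                /* check the condition, not the signal */
        pthread_cond_wait(&q_not_empty, &q_lock);
    assert(q_count > 0);
    value = q_items[q_head];
    q_head = (q_head + 1) % QUEUE_CAP;
    q_count--;
    pthread_mutex_unlock(&q_lock);
    return value;
}
```

A consumer thread would call queue_pop() in a loop while producers call queue_push(). It does not matter whether a push happens before or after the consumer starts waiting, because the consumer always looks at q_count first.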
