Linux Thread Synchronization

Thanks to all who answered. We resorted to using gcc atomic operations to synchronize all of our threads. The atomic ops were about 2x slower than setting a value without synchronization, but orders of magnitude faster than locking a mutex, changing the value, and then unlocking the mutex (this becomes super slow once threads start contending for the locks). We only use pthread_create, the pthread_attr functions, pthread_cancel, and pthread_kill. We use pthread_kill to signal sleeping threads to wake up. This method is 40x faster than pthread_cond_wait. So basically: use pthread mutexes if you have time to waste.
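
For anyone wondering what the atomic approach looks like, here is a minimal sketch of a shared counter updated with the GCC/Clang __atomic builtins instead of a mutex. The names (shared_counter, add_worker, atomic_count_demo, MAX_THREADS) are made up for this example, it joins its threads only so the result can be checked, and it makes no attempt to reproduce the timings quoted above.

```c
#include <pthread.h>

#define MAX_THREADS 16   /* arbitrary limit for this sketch */

/* Hypothetical shared counter; every thread adds to it without a mutex. */
static long shared_counter;

static void *add_worker(void *arg)
{
    long iterations = *(const long *)arg;
    for (long i = 0; i < iterations; i++) {
        /* Atomic read-modify-write: no increment can be lost. */
        __atomic_fetch_add(&shared_counter, 1, __ATOMIC_SEQ_CST);
    }
    return NULL;
}

/* Starts nthreads workers, waits for all of them, and returns the final
   counter value, or -1 if the arguments are invalid or a thread could
   not be created. */
long atomic_count_demo(int nthreads, long iterations)
{
    pthread_t tids[MAX_THREADS];
    int created = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    __atomic_store_n(&shared_counter, 0, __ATOMIC_SEQ_CST);
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, add_worker, &iterations) != 0)
            break;
        created++;
    }
    /* Join whatever was started so &iterations stays valid while in use. */
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    if (created != nthreads)
        return -1;
    return __atomic_load_n(&shared_counter, __ATOMIC_SEQ_CST);
}
```

Compiled with -pthread, atomic_count_demo(4, 100000) should return 400000 on every run, which a plain shared_counter++ would not guarantee.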

Synchronization of threads from different processes in C, Linux

I think the problem is your wait() calls. Main forks pid8, then waits for it to complete before moving on to fork pid2 (which in turn forks pid4). Pid2 cannot run until pid8 has completed, so their threads cannot coordinate.

Instead of waiting after each fork, you should fire off (from main) all three,
then wait three times. In pid2, you should do the same thing.
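
A minimal sketch of that fork-first, wait-later pattern, assuming each child simply exits with status 0 in place of its real work. The function name fork_all_then_wait is hypothetical and not taken from the original program.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks nchildren processes first and only then reaps them, so the
   children run concurrently. Returns the number of children that exited
   with status 0, or -1 if not all of them could be started. */
int fork_all_then_wait(int nchildren)
{
    int started = 0;

    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;          /* fork failed: stop, but still reap below */
        if (pid == 0) {
            /* Child: placeholder for the real work (creating threads etc.). */
            _exit(0);
        }
        started++;          /* parent: do NOT wait here */
    }

    int ok = 0;
    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return started == nchildren ? ok : -1;
}
```

With three children, fork_all_then_wait(3) returns 3 once all of them have exited. The same shape applies inside pid2 for its own children.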

I am not sure what the expected output is, but when I changed it, it went from printing 11 lines and hanging to generating this and exiting:

[ ] BEGIN P1 T0 pid=17861 ppid=17097 tid=1351051072
[ ] BEGIN P2 T0 pid=17863 ppid=17861 tid=1351051072
[ ] BEGIN P8 T0 pid=17862 ppid=17861 tid=1351051072
[ ] BEGIN P3 T0 pid=17864 ppid=17861 tid=1351051072
[ ] BEGIN P8 T1 pid=17862 ppid=17861 tid=1308952320
[ ] BEGIN P8 T5 pid=17862 ppid=17861 tid=1342523136
[ ] BEGIN P8 T4 pid=17862 ppid=17861 tid=1334130432
[ ] BEGIN P4 T0 pid=17867 ppid=17863 tid=1351051072
[ ] BEGIN P4 T6 pid=17867 ppid=17863 tid=1342523136
[ ] BEGIN P7 T0 pid=17871 ppid=17864 tid=1351051072
[ ] BEGIN P8 T2 pid=17862 ppid=17861 tid=1317345024
[ ] BEGIN P4 T3 pid=17867 ppid=17863 tid=1317345024
[ ] BEGIN P9 T0 pid=17879 ppid=17863 tid=1351051072
[ ] BEGIN P4 T5 pid=17867 ppid=17863 tid=1334130432
[ ] BEGIN P4 T1 pid=17867 ppid=17863 tid=1300559616
[ ] END P4 T5 pid=17867 ppid=17863 tid=1334130432
[ ] END P4 T6 pid=17867 ppid=17863 tid=1342523136
[ ] BEGIN P4 T2 pid=17867 ppid=17863 tid=1308952320
[ ] END P7 T0 pid=17871 ppid=17864 tid=1351051072
[ ] END P8 T2 pid=17862 ppid=17861 tid=1317345024
[ ] BEGIN P8 T3 pid=17862 ppid=17861 tid=1325737728
[ ] BEGIN P5 T0 pid=17873 ppid=17863 tid=1351051072
[ ] END P4 T3 pid=17867 ppid=17863 tid=1317345024
[ ] END P9 T0 pid=17879 ppid=17863 tid=1351051072
[ ] END P4 T1 pid=17867 ppid=17863 tid=1300559616
[ ] END P8 T1 pid=17862 ppid=17861 tid=1308952320
[ ] END P4 T2 pid=17867 ppid=17863 tid=1308952320
[ ] END P8 T3 pid=17862 ppid=17861 tid=1325737728
[ ] END P3 T0 pid=17864 ppid=17861 tid=1351051072
[ ] BEGIN P5 T34 pid=17873 ppid=17863 tid=1325737728
[ ] BEGIN P5 T33 pid=17873 ppid=17863 tid=1317345024
[ ] BEGIN P5 T31 pid=17873 ppid=17863 tid=1300559616
[ ] BEGIN P4 T4 pid=17867 ppid=17863 tid=1325737728
[ ] BEGIN P5 T36 pid=17873 ppid=17863 tid=1342523136
[ ] BEGIN P5 T30 pid=17873 ppid=17863 tid=1292166912
[ ] END P5 T33 pid=17873 ppid=17863 tid=1317345024
[ ] END P5 T30 pid=17873 ppid=17863 tid=1292166912
[ ] BEGIN P5 T35 pid=17873 ppid=17863 tid=1334130432
[ ] BEGIN P5 T2 pid=17873 ppid=17863 tid=1057171200
[ ] END P5 T36 pid=17873 ppid=17863 tid=1342523136
[ ] END P5 T34 pid=17873 ppid=17863 tid=1325737728
[ ] END P8 T5 pid=17862 ppid=17861 tid=1342523136
[ ] END P8 T4 pid=17862 ppid=17861 tid=1334130432
[ ] END P4 T4 pid=17867 ppid=17863 tid=1325737728
[ ] END P5 T31 pid=17873 ppid=17863 tid=1300559616
[ ] END P5 T35 pid=17873 ppid=17863 tid=1334130432
[ ] END P4 T0 pid=17867 ppid=17863 tid=1351051072
[ ] BEGIN P5 T26 pid=17873 ppid=17863 tid=1258596096
[ ] BEGIN P5 T28 pid=17873 ppid=17863 tid=1275381504
[ ] BEGIN P5 T29 pid=17873 ppid=17863 tid=1283774208
[ ] END P5 T28 pid=17873 ppid=17863 tid=1275381504
[ ] END P5 T2 pid=17873 ppid=17863 tid=1057171200
[ ] END P8 T0 pid=17862 ppid=17861 tid=1351051072
[ ] END P5 T26 pid=17873 ppid=17863 tid=1258596096
[ ] BEGIN P5 T27 pid=17873 ppid=17863 tid=1266988800
[ ] END P5 T29 pid=17873 ppid=17863 tid=1283774208
[ ] BEGIN P5 T25 pid=17873 ppid=17863 tid=1250203392
[ ] BEGIN P5 T24 pid=17873 ppid=17863 tid=1241810688
[ ] END P5 T27 pid=17873 ppid=17863 tid=1266988800
[ ] END P5 T25 pid=17873 ppid=17863 tid=1250203392
[ ] BEGIN P5 T23 pid=17873 ppid=17863 tid=1233417984
[ ] END P5 T24 pid=17873 ppid=17863 tid=1241810688
[ ] BEGIN P5 T22 pid=17873 ppid=17863 tid=1225025280
[ ] BEGIN P5 T17 pid=17873 ppid=17863 tid=1183061760
[ ] END P5 T22 pid=17873 ppid=17863 tid=1225025280
[ ] END P5 T23 pid=17873 ppid=17863 tid=1233417984
[ ] BEGIN P5 T14 pid=17873 ppid=17863 tid=1157883648
[ ] END P5 T17 pid=17873 ppid=17863 tid=1183061760
[ ] BEGIN P5 T18 pid=17873 ppid=17863 tid=1191454464
[ ] BEGIN P5 T13 pid=17873 ppid=17863 tid=1149490944
[ ] BEGIN P5 T15 pid=17873 ppid=17863 tid=1166276352
[ ] END P5 T14 pid=17873 ppid=17863 tid=1157883648
[ ] END P5 T18 pid=17873 ppid=17863 tid=1191454464
[ ] BEGIN P5 T11 pid=17873 ppid=17863 tid=1132705536
[ ] END P5 T13 pid=17873 ppid=17863 tid=1149490944
[ ] END P5 T15 pid=17873 ppid=17863 tid=1166276352
[ ] BEGIN P5 T10 pid=17873 ppid=17863 tid=1124312832
[ ] BEGIN P5 T5 pid=17873 ppid=17863 tid=1082349312
[ ] END P5 T11 pid=17873 ppid=17863 tid=1132705536
[ ] BEGIN P5 T7 pid=17873 ppid=17863 tid=1099134720
[ ] BEGIN P5 T4 pid=17873 ppid=17863 tid=1073956608
[ ] END P5 T5 pid=17873 ppid=17863 tid=1082349312
[ ] END P5 T7 pid=17873 ppid=17863 tid=1099134720
[ ] END P5 T4 pid=17873 ppid=17863 tid=1073956608
[ ] END P5 T10 pid=17873 ppid=17863 tid=1124312832
[ ] BEGIN P5 T6 pid=17873 ppid=17863 tid=1090742016
[ ] BEGIN P5 T3 pid=17873 ppid=17863 tid=1065563904
[ ] BEGIN P5 T8 pid=17873 ppid=17863 tid=1107527424
[ ] BEGIN P5 T9 pid=17873 ppid=17863 tid=1115920128
[ ] BEGIN P5 T12 pid=17873 ppid=17863 tid=1141098240
[ ] END P5 T6 pid=17873 ppid=17863 tid=1090742016
[ ] END P5 T12 pid=17873 ppid=17863 tid=1141098240
[ ] END P5 T8 pid=17873 ppid=17863 tid=1107527424
[ ] END P5 T9 pid=17873 ppid=17863 tid=1115920128
[ ] BEGIN P5 T20 pid=17873 ppid=17863 tid=1208239872
[ ] BEGIN P5 T21 pid=17873 ppid=17863 tid=1216632576
[ ] BEGIN P5 T19 pid=17873 ppid=17863 tid=1199847168
[ ] BEGIN P5 T16 pid=17873 ppid=17863 tid=1174669056
[ ] END P5 T3 pid=17873 ppid=17863 tid=1065563904
[ ] END P5 T20 pid=17873 ppid=17863 tid=1208239872
[ ] END P5 T21 pid=17873 ppid=17863 tid=1216632576
[ ] END P5 T19 pid=17873 ppid=17863 tid=1199847168
[ ] END P5 T16 pid=17873 ppid=17863 tid=1174669056
[ ] BEGIN P5 T32 pid=17873 ppid=17863 tid=1308952320
[ ] BEGIN P5 T1 pid=17873 ppid=17863 tid=1048778496
[ ] END P5 T1 pid=17873 ppid=17863 tid=1048778496
[ ] END P5 T32 pid=17873 ppid=17863 tid=1308952320
[ ] BEGIN P6 T0 pid=17916 ppid=17873 tid=1351051072
[ ] END P6 T0 pid=17916 ppid=17873 tid=1351051072
[ ] END P5 T0 pid=17873 ppid=17863 tid=1351051072
[ ] END P2 T0 pid=17863 ppid=17861 tid=1351051072
[ ] END P1 T0 pid=17861 ppid=17097 tid=1351051072

Thread synchronization in Linux?

I think you have a few problems here. With apologies (I'm on my phone so typing a long answer is hard) I'm just going to focus on a couple things, since it's not 100% clear to me what you're actually trying to do.

When all your threads start, they all try to acquire the mutex, and only one succeeds. Probably l3, but I don't think that's guaranteed here. It then calls pthread_cond_wait, which unlocks the mutex and allows one of the other threads to reach its own pthread_cond_wait. But in the meantime, you've allowed your main thread to call pthread_cond_broadcast, and you've taken no steps to synchronize this with the other threads. The broadcast may happen before the others get unblocked from waiting for the mutex, and before their wait call, so they could miss the signal and block forever.

Further, I think it's a bit sketchy to call pthread_cond_destroy immediately. Like I said, there's no synchronization between your main thread and your worker threads, so main could call pthread_cond_broadcast followed by pthread_cond_destroy while some of your threads have yet to call pthread_cond_wait. Those threads would then be waiting on an invalid condition variable and could deadlock.

Check the return values of pthread_cond_wait. If I'm right, it might return EINVAL in some cases. But I haven't tested this so there might be a flaw in my reasoning.
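
One way to avoid both problems, sketched under the assumption that the workers only need a one-time "go" signal: guard the wait with a flag so that an early broadcast is not lost, and join the workers before destroying the condition variable. The names go, woken, wait_worker, broadcast_demo and NWORKERS are invented for this example and do not come from your code.

```c
#include <pthread.h>

#define NWORKERS 3

static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cv;
static int go;      /* predicate: set to 1 by main before it broadcasts */
static int woken;   /* number of workers that got past the wait */

static void *wait_worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mtx);
    /* If main already broadcast, go is 1 and we never wait at all. */
    while (!go)
        pthread_cond_wait(&cv, &mtx);
    woken++;
    pthread_mutex_unlock(&mtx);
    return NULL;
}

/* Returns the number of workers released by the broadcast, or -1 if the
   condition variable could not be initialized. */
int broadcast_demo(void)
{
    pthread_t tids[NWORKERS];
    int created = 0;

    go = 0;
    woken = 0;
    if (pthread_cond_init(&cv, NULL) != 0)
        return -1;

    for (int i = 0; i < NWORKERS; i++) {
        if (pthread_create(&tids[i], NULL, wait_worker, NULL) != 0)
            break;
        created++;
    }

    pthread_mutex_lock(&mtx);
    go = 1;                         /* change the predicate under the lock */
    pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&mtx);

    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    pthread_cond_destroy(&cv);      /* safe: no thread can still be waiting */
    return woken;
}
```

Because every worker checks go under the mutex, it does not matter whether the broadcast happens before or after a given worker reaches its wait.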

C/Linux: Uses of thread synchronization

You are assuming that operations are atomic. Now assume that they are not, which is the case in real systems.

Being a global variable, count can be accessed by all threads in the system. This means that, without synchronization, all threads will perform the following operations in a non-deterministic and interleaved way:

for(int i = 0; i < 1000; i++){
    aux = count;
    aux++;
    usleep(random() % 10);
    count = aux;
}

To sum up, in each iteration of the loop, each thread takes a copy of the value of count at some time instant, increments that copy (aux++), and then writes it back with count = aux;.

Problem 1: The value of count read by one thread may already be stale, because another thread (or several) may modify count immediately after the read (remember, operations are not atomic and can be executed in an interleaved manner).

Problem 2: The assignment to count is not protected by any locking mechanism, which means that several threads may execute it in an interleaved manner or even at the same time (on a multiprocessor system this is possible). Whichever thread happens to write last (and you don't know which one that is) determines the value of count after count = aux.

A simple example of a possible execution scenario:

For instance, let's assume three threads. Thread 1 reads the value count = 100 and is preempted. Thread 2 reads the value 100, executes for some time, setting count to (let's say) 300, and is then preempted. Finally, thread 3 reads the value 300 and executes some loop iterations. When thread 1 runs again and finishes its iteration with count = aux, the value is set to 101, and the increments made by threads 2 and 3 are lost. See the problem?
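
The scenario can be replayed deterministically in a single thread, which makes the arithmetic easy to check. This is only a sketch: the function name lost_update_trace and the iteration counts for threads 2 and 3 (200 and 50) are arbitrary choices, and real threads would of course interleave unpredictably.

```c
/* Replays the interleaving described above step by step in one thread. */
int lost_update_trace(void)
{
    int count = 100;

    /* Thread 1 reads count = 100 and is preempted before writing back. */
    int aux1 = count;

    /* Thread 2 runs 200 full iterations: count goes from 100 to 300. */
    for (int i = 0; i < 200; i++) {
        int aux2 = count;
        aux2++;
        count = aux2;
    }

    /* Thread 3 runs 50 full iterations: count goes from 300 to 350. */
    for (int i = 0; i < 50; i++) {
        int aux3 = count;
        aux3++;
        count = aux3;
    }

    /* Thread 1 resumes with its stale copy and overwrites everything. */
    aux1++;
    count = aux1;

    return count;   /* 101: the 250 increments in between are lost */
}
```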

Synchronization is needed to make sure that only one thread at a time is doing the read, increment and assignment; in effect, it makes these operations behave as if they were atomic.


Q: If I put a semaphore, sem_wait before the for loop, and sem_post after the for loop, does that mean that my threads are not running in parallel anymore?

A: Effectively, yes. Assuming the semaphore is initialized to 1, the thread that gets past sem_wait executes all of its loop iterations while the other threads block, so the loops run one after another instead of in parallel. Remember: the scheduler controls the execution of each thread, so which thread goes first is not controlled by the user. Your code is synchronized but not in an ideal manner.


Q: Where should I put sem_wait and sem_post in order for my threads to be correctly synchronized?

A: You should hold the semaphore around the smallest number of operations that actually require synchronization, in order to benefit as much as possible from concurrent/parallel code execution. For instance, using semaphores your code could be:

for(int i = 0; i < 1000; i++){
    sem_wait(...);
    count++;
    sem_post(...);
}

You don't need aux anymore as the semaphore ensures that only one thread is incrementing the value of count.

Note, as you are using threads you can also use mutexes instead of semaphores.
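
For completeness, here is a minimal sketch of the whole thing with an unnamed POSIX semaphore initialized to 1. The thread count, the function names (increment, semaphore_count_demo) and the EINTR retry are assumptions of this example, not something taken from your code.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define NTHREADS   4      /* arbitrary for this sketch */
#define ITERATIONS 1000

static int count;
static sem_t sem;

static void *increment(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        /* Retry if a signal interrupts the wait. */
        while (sem_wait(&sem) == -1 && errno == EINTR)
            continue;
        count++;            /* the only statement inside the critical section */
        sem_post(&sem);
    }
    return NULL;
}

/* Runs NTHREADS incrementing threads and returns the final value of count,
   or -1 if the semaphore could not be initialized. */
int semaphore_count_demo(void)
{
    pthread_t tids[NTHREADS];
    int created = 0;

    count = 0;
    if (sem_init(&sem, 0, 1) != 0)   /* 0: shared between threads only */
        return -1;

    for (int i = 0; i < NTHREADS; i++) {
        if (pthread_create(&tids[i], NULL, increment, NULL) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);

    sem_destroy(&sem);
    return count;
}
```

If all four threads start, the result is always 4000, because each increment happens while the semaphore is held.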

I hope this clarifies your doubts.

Synchronization among 2 threads in linux pthreads

You need a mutex, a condition variable and a helper variable.

in thread 1:

pthread_mutex_lock(&mtx);

// We wait for helper to change (which is the true indication we are
// ready) and use a condition variable so we can do this efficiently.
while (helper == 0)
{
    pthread_cond_wait(&cv, &mtx);
}

pthread_mutex_unlock(&mtx);

in thread 2:

pthread_mutex_lock(&mtx);

helper = 1;
pthread_cond_signal(&cv);

pthread_mutex_unlock(&mtx);

You need a helper variable for two reasons: condition variables can suffer from spurious wakeups, and a signal sent before thread 1 starts waiting would otherwise be lost. It's the combination of a helper variable and a condition variable that gives you exact semantics and efficient waiting.

How to synchronize threads without blocking?

You cannot synchronize threads without blocking, by the very definition of synchronization. However, good synchronization technique will limit the scope of where things are blocked to the absolute minimum. To illustrate, and to point out exactly why the article is wrong, consider the following:

From the article:

pthread_t tid[2];
int counter;
pthread_mutex_t lock;

void* doSomeThing(void *arg)
{
    pthread_mutex_lock(&lock);

    unsigned long i = 0;
    counter += 1;
    printf("\n Job %d started\n", counter);

    for(i=0; i<(0xFFFFFFFF);i++);

    printf("\n Job %d finished\n", counter);

    pthread_mutex_unlock(&lock);

    return NULL;
}

What it should be:

pthread_t tid[2];
int counter;
pthread_mutex_t lock;

void* doSomeThing(void *arg)
{
    unsigned long i = 0;

    pthread_mutex_lock(&lock);
    counter += 1;
    int myJobNumber = counter;
    pthread_mutex_unlock(&lock);

    printf("\n Job %d started\n", myJobNumber);

    for(i=0; i<(0xFFFFFFFF);i++);

    printf("\n Job %d finished\n", myJobNumber);

    return NULL;
}

Notice that in the article, the work being done (the pointless for loop) is done while holding the lock. This is complete nonsense, since the work is supposed to be done concurrently. The reason the lock is needed is only to protect the counter variable. Thus the threads only need to hold the lock when changing that variable as in the second example.

Mutex locks protect critical sections of code, which are those areas of code that only one thread at a time should touch; all other threads must block if they try to enter the critical section at the same time. However, if thread 1 is in the critical section and thread 2 is not, then it's perfectly fine for both to run concurrently.


