Creating a Semaphore with an Initial Value of 0 Causes Issues with Execution


You can use a semaphore in two different ways:

  1. To signal that work or a resource is ready.
    In this case you start the semaphore at 0. The creator calls signal when something is ready. The consumer calls wait to wait for the expected item or resource.
  2. To limit the number of concurrent operations / requests / usages.
    In this case you start the semaphore at a positive value, like 4. The users each call wait, and if a resource is available they are allowed to continue; if not, they are blocked. When each has finished with the resource, it calls signal.

So, what you see is expected: you're setting the semaphore up as a ready flag but using it as an access limit (because you call wait first).
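
For illustration, here is a rough sketch of both patterns with POSIX semaphores (the function names are just placeholders):

#include <semaphore.h>

sem_t ready;   /* pattern 1: starts at 0 -- "an item is ready"           */
sem_t slots;   /* pattern 2: starts at 4 -- "at most 4 concurrent users" */

void setup(void)
{
    sem_init(&ready, 0, 0);   /* nothing ready yet, so a consumer blocks in wait */
    sem_init(&slots, 0, 4);   /* four permits are available up front             */
}

void producer(void)           /* pattern 1, producing side */
{
    /* ... make the item ... */
    sem_post(&ready);         /* announce that it exists */
}

void consumer(void)           /* pattern 1, consuming side */
{
    sem_wait(&ready);         /* block until the producer has posted */
    /* ... use the item ... */
}

void user(void)               /* pattern 2, any of the concurrent users */
{
    sem_wait(&slots);         /* take a permit; block if all 4 are in use */
    /* ... use the limited resource ... */
    sem_post(&slots);         /* return the permit */
}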

When is a semaphore initialized with a value of 0?

While both values are allowed and perfectly reasonable, that doesn't mean they are interchangeable.


Imagine you wanted Thread 2 to wait for Thread 1 to complete its work before proceeding. You could use the following:

main:

sem_t sem; 
sem_init(&sem, 0, 0);

Thread 1:

do_work();
sem_post(&sem);

Thread 2:

do_some_work();

// Wait for Thread 1 to complete before proceeding.
sem_wait(&sem);
do_other_work();

As you can see, initializing a semaphore to zero can be perfectly reasonable. That doesn't mean it does the same thing as initializing it to one. Your program is buggy because your thread waits for the value of the semaphore to be raised above zero, but nothing ever does that.
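
If it helps, here is a complete, compilable version of that sketch (the do_* functions are just stand-ins, and error checking is omitted):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t sem;

static void *thread1(void *arg)
{
    printf("thread 1: doing work\n");        /* do_work() */
    sem_post(&sem);                          /* tell thread 2 the work is done */
    return NULL;
}

static void *thread2(void *arg)
{
    printf("thread 2: doing some work\n");   /* do_some_work() */
    sem_wait(&sem);                          /* block here until thread 1 posts */
    printf("thread 2: doing other work\n");  /* do_other_work() */
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;

    sem_init(&sem, 0, 0);                    /* value 0: nothing is "done" yet */
    pthread_create(&t1, NULL, thread1, NULL);
    pthread_create(&t2, NULL, thread2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&sem);
    return 0;
}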

Unexpected semaphore behavior, threads executing beyond sem_wait()

Named semaphores persist after a process dies unless you specifically unlink them. The behavior you're seeing is due to the threads pulling old jobs sem_post'd to the semaphore by a previous process. Your sem_getvalue call would show the existence of those old jobs if the call actually worked, but it is failing and you're not noticing because you aren't checking sem_getvalue's return value. The "Function not implemented" errno value is in fact from sem_getvalue, not sem_wait.

Add

sem_unlink("semaphore");

before your call to sem_open and the weird behavior will go away.
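
Concretely, the setup could look something like this (using the name "semaphore" from the question, with error handling added for illustration):

#include <semaphore.h>
#include <fcntl.h>
#include <stdio.h>

int main(void)
{
    /* Throw away any semaphore left over from a previous run so we start fresh.
       If none exists, sem_unlink simply fails with ENOENT, which is harmless. */
    sem_unlink("semaphore");

    sem_t *sem = sem_open("semaphore", O_CREAT, 0644, 0);
    if (sem == SEM_FAILED) {
        perror("sem_open");
        return 1;
    }

    int value;
    if (sem_getvalue(sem, &value) == -1)
        perror("sem_getvalue");   /* e.g. "Function not implemented" on macOS */
    else
        printf("initial semaphore value: %d\n", value);

    sem_close(sem);
    return 0;
}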

Semaphore in iOS Swift

Let us imagine that you decided to adopt Swift concurrency, rather than using semaphores. So, what would this look like with async-await?

Let us imagine for a second that you refactored performFirstTask, performSecondTask, and performThirdTask to adopt Swift concurrency. Then, you eliminate the semaphore completely, and your fifteen lines of code are reduced to:

Task {
    await performFirstTask()
    await performSecondTask()
    await performThirdTask()
}

That performs those three asynchronous tasks sequentially, but avoids all of the downsides of semaphores. The whole idea of async-await is that you can represent dependencies between a series of asynchronous tasks very elegantly.

Now, generally you would refactor the performXXXTask methods to adopt Swift concurrency. Alternatively, you could also just write async “wrapper” functions for them, e.g.:

func performFirstTask() async {
    await withCheckedContinuation { continuation in
        performFirstTask() {
            continuation.resume(returning: ())
        }
    }
}

That is an async rendition of performFirstTask that calls the completion handler rendition.

However you decide to do it (refactor these three methods or just write wrappers for them), Swift concurrency simplifies the process greatly. See WWDC 2021 video Swift concurrency: Update a sample app for more examples about how to convert legacy code to adopt Swift concurrency.

How do I recover a semaphore when the process that decremented it to zero crashes?

Turns out there isn't a way to reliably recover the semaphore. Sure, anyone can sem_post() to the named semaphore to get the count to increase past zero again, but how do you tell when such a recovery is needed? The API provided is too limited and doesn't indicate in any way when this has happened.

Beware of the IPC tools that are also available -- the common tools ipcmk, ipcrm, and ipcs are only for the outdated System V semaphores. They specifically do not work with the newer POSIX semaphores.

But it turns out there are other things that can be used as locks which the operating system releases automatically when an application dies, even in a way that cannot be caught in a signal handler. Two examples: a listening socket bound to a particular port, or a lock on a specific file.

I decided a lock on a file was the solution I needed. So instead of sem_wait() and sem_post() calls, I'm using:

lockf( fd, F_LOCK, 0 )

and

lockf( fd, F_ULOCK, 0 )

When the application exits in any way, the file is automatically closed which also releases the file lock. Other client apps waiting for the "semaphore" are then free to proceed as expected.
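
Put together, the file-lock replacement amounts to something like this (the lock-file path is just an example):

#include <fcntl.h>
#include <unistd.h>

static int lock_fd = -1;

/* Open (or create) the lock file once at startup. */
void lock_init(void)
{
    lock_fd = open("/tmp/myapp.lock", O_RDWR | O_CREAT, 0644);
}

/* Replacement for sem_wait(): block until we hold the lock. */
void lock_acquire(void)
{
    lockf(lock_fd, F_LOCK, 0);
}

/* Replacement for sem_post(): release the lock so the next waiter can proceed.
   If the process dies instead, the kernel closes the descriptor and drops the
   lock automatically, so no stale "semaphore" is left behind. */
void lock_release(void)
{
    lockf(lock_fd, F_ULOCK, 0);
}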

Thanks for the help, guys.


UPDATE:

12 years later, I thought I should point out that POSIX mutexes do have a "robust" attribute. That way, if the owner of the mutex gets killed or exits, the next user to lock the mutex gets the non-error return value EOWNERDEAD, allowing the mutex to be recovered. This makes it similar to the file- and socket-locking solutions. Look up pthread_mutexattr_setrobust() and pthread_mutex_consistent() for details. Thanks, Reinier Torenbeek, for this hint.
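
A rough sketch of the robust-mutex approach (in a real program the mutex would live in memory shared between the processes; that setup is omitted here):

#include <pthread.h>
#include <errno.h>

/* Initialize a process-shared, robust mutex. */
void init_robust_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
}

/* Lock it, recovering if the previous owner died while holding it. */
void lock_robust(pthread_mutex_t *m)
{
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* We now hold the lock, but the previous owner died with it held.
           Repair any shared state here, then mark the mutex usable again. */
        pthread_mutex_consistent(m);
    }
}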

When to use Semaphore instead of Dispatch Group?

Conceptually, both DispatchGroup and Semaphore serve the same purpose (unless I misunderstand something).

The above is not exactly true. You can use a semaphore to do the same thing as a dispatch group, but a semaphore is much more general.

Dispatch groups are used when you have a load of things you want to do that can all happen at once, but you need to wait for them all to finish before doing something else.

Semaphores can be used for the above but they are general purpose synchronisation objects and can be used for many other purposes too. The concept of a semaphore is not limited to Apple and can be found in many operating systems.

In general, a semaphore has a value, which is a non-negative integer, and two operations:

  • wait: if the value is not zero, decrement it; otherwise, block until something signals the semaphore.

  • signal: if there are threads waiting, unblock one of them; otherwise, increment the value.

Needless to say, both operations have to be thread safe. In olden days, when you only had one CPU, you'd simply disable interrupts whilst manipulating the value and the queue of waiting threads. Nowadays, it is more complicated because of multiple CPU cores, on-chip caches, etc.

A semaphore can be used in any case where you have a resource that can be accessed by at most N threads at the same time. You set the semaphore's initial value to N; the first N threads that wait on it are not blocked, but the next thread has to wait until one of the first N has signaled the semaphore. The simplest case is N = 1, in which case the semaphore behaves like a mutex lock.

A semaphore can be used to emulate a dispatch group. You start the semaphore at 0, start all the tasks (keeping track of how many you have started), and wait on the semaphore that number of times. Each task must signal the semaphore when it completes.
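
Outside of Dispatch, the same emulation in plain C with POSIX semaphores might look roughly like this (the task bodies are placeholders):

#include <pthread.h>
#include <semaphore.h>

#define NTASKS 3                        /* the separate count mentioned below */

static sem_t done;

static void *task(void *arg)
{
    /* ... do this task's work ... */
    sem_post(&done);                    /* each task signals once when finished */
    return NULL;
}

int main(void)
{
    pthread_t tid[NTASKS];

    sem_init(&done, 0, 0);              /* start at 0 */
    for (int i = 0; i < NTASKS; i++)
        pthread_create(&tid[i], NULL, task, NULL);

    for (int i = 0; i < NTASKS; i++)
        sem_wait(&done);                /* wait once per task we started */

    /* everything has finished; the "group is done" work goes here */
    for (int i = 0; i < NTASKS; i++)
        pthread_join(tid[i], NULL);
    return 0;
}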

However, there are some gotchas. For example, you need a separate count to know how many times to wait. If you want to be able to add more tasks to the group after you have started waiting, the count can only be updated in a mutex-protected block, and that may lead to problems with deadlocking. Also, I think the Dispatch implementation of semaphores might be vulnerable to priority inversion. Priority inversion occurs when a high-priority thread waits for a resource that a low-priority thread has grabbed. The high-priority thread is blocked until the low-priority thread releases the resource. If there is a medium-priority thread running, this may never happen.

You can do pretty much anything with a semaphore that other, higher-level synchronisation abstractions can do, but doing it with raw semaphores is often tricky to get right. The higher-level abstractions are (hopefully) carefully written, and you should use them in preference to a "roll your own" implementation with semaphores, if possible.

Parent is executing before child process, how to force opposite through semaphores?

You're mis-using semaphores. The general idea is that a semaphore counts "how many entities (threads, whatever) are allowed to use this data right now". By starting the count at 2, you're saying "two threads may use this now". Semaphores do not say which entities, nor how (read vs write), only how many. For example, semaphores can be used to count the number of retrievable items in a producer/consumer queue: the producer increments and the consumer decrements. (Of course, semaphores come in all kinds of expanded flavors; since you say these are "not POSIX", but not what they are, it's hard to generalize much more.)

One way to make this work as described—but of course, actual code tends to vary from descriptions—is to start the semaphore count at 0, fork a child, have the child write without looking at the semaphore count, fork another child, have that child also write without looking at the semaphore count, and then have the parent wait on the semaphore (P). That is, the semaphore says "none shall pass" but the children don't actually look at it. Then, the two children each do V operations (+1 in each). Once the semaphore has gone to 1, the parent starts: he can then find at least one (but perhaps only one) child-result. The parent can do another P immediately if he needs to have both results.
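
A compressed sketch of that arrangement with System V semaphores (two children, error checking omitted; see also the SEM_UNDO note at the end of this answer):

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

union semun { int val; };               /* Linux requires the caller to define this */

static void sem_adjust(int semid, int delta)
{
    struct sembuf op = { 0, delta, 0 }; /* no SEM_UNDO: the V must survive exit */
    semop(semid, &op, 1);
}

int main(void)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    union semun arg = { .val = 0 };     /* start at 0: "none shall pass" */
    semctl(semid, 0, SETVAL, arg);

    for (int i = 0; i < 2; i++) {
        if (fork() == 0) {
            /* child: write its result to shared memory (not shown),
               then V (+1) to say "data produced", and exit */
            sem_adjust(semid, +1);
            _exit(0);
        }
    }

    sem_adjust(semid, -1);              /* P: at least one child has finished */
    sem_adjust(semid, -1);              /* P again: both children have finished */

    while (wait(NULL) > 0) { }
    semctl(semid, 0, IPC_RMID);
    return 0;
}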

(More generally, though, you may want reader/writer locks or mutexes and condition variables. If you have POSIX threads, see pthread_cond_init(), pthread_cond_wait(), pthread_cond_signal(), pthread_mutex_init(), etc.)


Aha, from the comment and question-edit, I see that you're using the wretched System V shared memory and semaphore interface.

  • Are you really stuck with that? The POSIX thread stuff is nicer, in my opinion (and generally lighter-weight).
  • How do you intend to organize your shared-memory? You may get less lock-contention if each car has its own lap times region, shared only with the display thread/proc: there's one single producer (the car) and one single consumer (display thread/proc), but 24 such locks (one per car). If all cars share one shared-memory region with the display thread/proc, you need only one lock, but it's much more active. Which one is "better" depends on what you are doing.
  • And, if you want to wait for "some car to finish 50 laps", consider having each car have its own private (or possibly shared-with-display) counter, and one counting semaphore for "number of cars that have hit 50 laps". Each car simply counts up and upon reaching 50, increments the counting semaphore (once) too.

Final (I hope) edit: after fixing smaller problems, the last remaining one was the use of SEM_UNDO in each child process, which would do a V (of +1) to signal "data produced and all done" and then exit. SEM_UNDO records a balancing adjustment value that is applied on process exit, so the semaphore would count up, but then immediately count right back down, leaving the parent waiting for another V that would never occur.

Understanding parallel thread execution

If you want to get the desired output, you should use two semaphores instead of one. Each thread should wait on its own semaphore and post the other thread's semaphore after it is done with each loop iteration. The main thread can create one semaphore with a value of 1 and the other with a value of zero to start things off right. This will force the two threads to run in an alternating sequence.

As the program is currently written, doing a sem_post followed by a sem_wait is likely to result in the same thread grabbing the semaphore right away (on a single-CPU system). I'm surprised that pthread_yield doesn't help, but using two semaphores will guarantee correct ordering no matter what.
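
A minimal sketch of that two-semaphore arrangement (the loop bodies are placeholders):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define ITERATIONS 5

static sem_t sem_a;                     /* starts at 1: thread A runs first */
static sem_t sem_b;                     /* starts at 0: thread B waits for A */

static void *thread_a(void *arg)
{
    for (int i = 0; i < ITERATIONS; i++) {
        sem_wait(&sem_a);               /* wait for my turn */
        printf("A %d\n", i);            /* ... this thread's work ... */
        sem_post(&sem_b);               /* hand the turn to B */
    }
    return NULL;
}

static void *thread_b(void *arg)
{
    for (int i = 0; i < ITERATIONS; i++) {
        sem_wait(&sem_b);               /* wait for A to finish an iteration */
        printf("B %d\n", i);            /* ... this thread's work ... */
        sem_post(&sem_a);               /* hand the turn back to A */
    }
    return NULL;
}

int main(void)
{
    pthread_t ta, tb;

    sem_init(&sem_a, 0, 1);             /* one "turn" available, given to A */
    sem_init(&sem_b, 0, 0);
    pthread_create(&ta, NULL, thread_a, NULL);
    pthread_create(&tb, NULL, thread_b, NULL);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}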


