Is It Safe to Fork from Within a Thread

Is it safe to fork from within a thread?

The problem is that fork() only copies the calling thread, and any mutexes held in child threads will be forever locked in the forked child. The pthread solution was the pthread_atfork() handlers. The idea was you can register 3 handlers: one prefork, one parent handler, and one child handler. When fork() happens prefork is called prior to fork and is expected to obtain all application mutexes. Both parent and child must release all mutexes in parent and child processes respectively.

This isn't the end of the story though! Libraries call pthread_atfork to register handlers for library specific mutexes, for example Libc does this. This is a good thing: the application can't possibly know about the mutexes held by 3rd party libraries, so each library must call pthread_atfork to ensure it's own mutexes are cleaned up in the event of a fork().

The problem is that the order that pthread_atfork handlers are called for unrelated libraries is undefined (it depends on the order that the libraries are loaded by the program). So this means that technically a deadlock can happen inside of a prefork handler because of a race condition.

For example, consider this sequence:

Thread T1 calls fork()
libc prefork handlers are called in T1 (e.g. T1 now holds all libc locks)
Next, in Thread T2, a 3rd party library A acquires its own mutex AM, and then makes a libc call which requires a mutex. This blocks, because libc mutexes are held by T1.
Thread T1 runs prefork handler for library A, which blocks waiting to obtain AM, which is held by T2.

There's your deadlock and its unrelated to your own mutexes or code.

This actually happened on a project I once worked on. The advice I had found at that time was to choose fork or threads but not both. But for some applications that's probably not practical.

What happens when a thread forks?

The new process will be the child of the main thread that created the thread. I think.

fork creates a new process. The parent of a process is another process, not a thread. So the parent of the new process is the old process.

Note that the child process will only have one thread because fork only duplicates the (stack for the) thread that calls fork. (This is not entirely true: the entire memory is duplicated, but the child process will only have one active thread.)

If its parent finishes first, the new process will be attached to init process.

If the parent finishes first a SIGHUP signal is sent to the child. If the child does not exit as a result of the SIGHUP it will get init as its new parent. See also the man pages for nohup and signal(7) for a bit more information on SIGHUP.

And its parent is main thread, not the thread that created it.

The parent of a process is a process, not a specific thread, so it is not meaningful to say that the main or child thread is the parent. The entire process is the parent.

One final note: Mixing threads and fork must be done with care. Some of the pitfalls are discussed here.

How to safely fork() a multi-threaded process?

So now what I want to do is to fork the process before the 3rd party static library is loaded. So is there a proper way to achieve this?

"Before the static library is loaded" is not a thing, static libraries are linked in at build time, not loaded dynamically at runtime.

The library will start a thread triggered by the initialization of a global variable

You should consider using a better-designed library, or work with the maintainer to come up with a better-thought-out initialization scheme. Life before main is usually a bad idea. Spawning threads before main is a really bad idea. Alternatives might be, use a lazy static that gets triggered whenever an API of the library is called, or better yet, introducing an explicit init function, or using context pointers so that the user can manage library state. Sorry, it sounds like this thing is a tire fire and theres no sane way to fork it until you invest some time in fixing it.

Calling fork on a multithreaded process

Read carefully what POSIX says about fork() and threads. In particular:

A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.

The child process will have a single thread running in the context of the calling thread. Other parts of the original process may be tied up by threads that no longer exist (so mutexes may be locked, for example).

The rationale section (further down the linked page) says:

There are two reasons why POSIX programmers call fork(). One reason is to create a new thread of control within the same program (which was originally only possible in POSIX by creating a new process); the other is to create a new process running a different program. In the latter case, the call to fork() is soon followed by a call to one of the exec functions.
The general problem with making fork() work in a multi-threaded world is what to do with all of the threads. There are two alternatives. One is to copy all of the threads into the new process. This causes the programmer or implementation to deal with threads that are suspended on system calls or that might be about to execute system calls that should not be executed in the new process. The other alternative is to copy only the thread that calls fork(). This creates the difficulty that the state of process-local resources is usually held in process memory. If a thread that is not calling fork() holds a resource, that resource is never released in the child process because the thread whose job it is to release the resource does not exist in the child process.
When a programmer is writing a multi-threaded program, the first described use of fork(), creating new threads in the same program, is provided by the pthread_create() function. The fork() function is thus used only to run new programs, and the effects of calling functions that require certain resources between the call to fork() and the call to an exec function are undefined.

Safe to call multiprocessing from a thread in Python?

It sounds like the state of any locks in the process are copied into the child process when the current thread forks (which seems like a design error and certain to deadlock).

It is not a design error, rather, fork() predates single-process multithreading. The state of all locks is copied into the child process because they're just objects in memory; the entire address-space of the process is copied as is in fork. There are only bad alternatives: either copy all threads over fork, or deny forking in multithreaded application.

Therefore, fork()ing in a multithreading program was never the safe thing to do, unless then followed by execve() or exit() in the child process.

Does replacing threading.Lock with multiprocessing.Lock everywhere avoid this issue and allow us to safely combine threads and forks?

No. Nothing makes it safe to combine threads and forks, it cannot be done.

The problem is that when you have multiple threads in a process, after fork() system call you cannot continue safely running the program in POSIX systems.

For example, Linux manuals fork(2):

After a fork(2) in a multithreaded program, the child can safely call
only async-signal-safe functions (see signal(7)) until such time as it
calls execve(2).

I.e. it is OK to fork() in a multithreaded program and then only call async-signal-safe C functions (which is a rather limited subset of C functions), until the child process has been replaced with another executable!

Unsafe C function calls in child processes are then for example

malloc for dynamic memory allocation
any <stdio.h> functions for formatted input
most of the pthread_* functions required for thread state handling, including creation of new threads...

Thus there is very little what the child process can actually safely do. Unfortunately CPython core developers have been downplaying the problems caused by this. Even now the documentation says:

Note that safely forking a multithreaded process is
problematic.

Quite an euphemism for "impossible".

It is safe to use multiprocessing from a Python process that has multiple threads of control provided that you're not using the fork start method; in Python 3.4+ it is now possible to change the start method. In previous Python versions including all of Python 2, the POSIX systems always behaved as if fork was specified as the start method; this would result in undefined behaviour.

The problems are not limited to just threading.Lock objects but all locks held by the C standard library, the C extensions etc. What is worse that most of the time people would say "it works for me"... until it stops from working.

There were even a cases where a seemingly single-threading Python program is actually multithreading in MacOS X, causing failures and deadlocks upon using multiprocessing.

Another problem is that all opened file handles, their use, shared sockets might behave oddly in programs that forks, but that would be the case even in single-threaded programs.

TL;DR: using multiprocessing in multithreaded programs, with C extensions, with opened sockets etc:

fine in 3.4+ & POSIX if you explicitly specify a starting method that is not fork,
fine in Windows because it doesn't support forking;
in Python 2 - 3.3 on POSIX: you'll mostly shoot yourself in the foot.

Is it possible to use fork in modern C++?

The answer to this question, like almost everything else in C++, is "it depends".

If we assume there are other threads in the program, and those threads are synchronizing with each other, calling fork is dangerous. This is because, fork does not wait for all threads to be a synchronization point (i.e. mutex release) to fork the process. In the forked process, only the thread that called fork will be present, and the others will have been terminated, possibly in the middle of a critical section. This means any memory shared with other threads, that wasn't a std::atomic<int> or similar, is an undefined state.

If your forked process reads from this memory, or indeed expects the other threads to be running, it is likely not going to work reliably. However, most uses of fork actually have effectively no preconditions on program state. That is because the most common thing to do is to immediately call execv or similar to spawn a subprocess. In this case your entire process is kinda "replaced" by some new process, and all memory from your old process is discarded.

tl;dr - Calling fork may not be safe in multithreaded programs. Sometimes it is safe; like if no threads have spawned yet, or evecv is called immediately. If you are using fork for something else, consider using a thread instead.

See the fork man page and this helpful blog post for the nitty-gritty.

Are threads copied when calling fork?

No.

Threads are not copied on fork(). POSIX specification says (emphasize is mine):

fork - create a new process

A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called.

To circumvent this problem, there exists a pthread_atfork() function to help.

What happens to other threads when one thread forks()?

Nothing. Only the thread calling fork() gets duplicate. The child process has to start any new threads. The parents threads are left alone.

Is It Safe to Fork from Within a Thread