What Are the Differences Between the Threading and Multiprocessing Modules

Multiprocessing vs Threading Python

The threading module uses threads; the multiprocessing module uses processes. The difference is that threads run in the same memory space, while processes have separate memory. This makes it a bit harder to share objects between processes with multiprocessing. Since threads use the same memory, precautions have to be taken or two threads will write to the same memory at the same time. This is what the global interpreter lock is for, at least for the interpreter's own internal state; your own shared data structures still need explicit locks.
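A minimal sketch of the memory difference described above. The module-level list `shared_items` and the helper names are hypothetical, invented only for this illustration. An append made by a thread shows up in the parent's list; an append made by a child process lands in the child's own copy.

```python
import multiprocessing
import threading

# Hypothetical module-level state, used only for this illustration.
shared_items = []


def append_item(value):
    """Append to the module-level list in whichever thread or process runs this."""
    shared_items.append(value)


def run_in_thread(value):
    """Run append_item in a thread and return the parent's view of the list."""
    worker = threading.Thread(target=append_item, args=(value,))
    worker.start()
    worker.join()
    return list(shared_items)


def run_in_process(value):
    """Run append_item in a child process and return the parent's view of the list."""
    worker = multiprocessing.Process(target=append_item, args=(value,))
    worker.start()
    worker.join()
    return list(shared_items)


if __name__ == "__main__":
    print(run_in_thread("from thread"))    # the thread's append is visible to the parent
    print(run_in_process("from process"))  # the child's append is not
```

The `if __name__ == "__main__":` guard matters for the process half: start methods that re-import the main module would otherwise re-run the demo in every child.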

Spawning processes is a bit slower than spawning threads.

What are the differences between the threading and multiprocessing modules?

What Giulio Franco says is true for multithreading vs. multiprocessing in general.

However, Python* has an added issue: There's a Global Interpreter Lock that prevents two threads in the same process from running Python code at the same time. This means that if you have 8 cores, and change your code to use 8 threads, it won't be able to use 800% CPU and run 8x faster; it'll use the same 100% CPU and run at the same speed. (In reality, it'll run a little slower, because there's extra overhead from threading, even if you don't have any shared data, but ignore that for now.)
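A rough way to observe this on your own machine, assuming a standard (GIL-enabled) CPython build. The function names and the loop size are hypothetical choices for this sketch, and the timings will vary, so no particular numbers are promised.

```python
import threading
import time


def count_down(n):
    """Pure-Python CPU-bound loop; it needs the GIL for every iteration."""
    while n > 0:
        n -= 1


def time_sequential(n, repeats):
    """Run count_down `repeats` times back to back and return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, repeats):
    """Run count_down in `repeats` threads at once and return elapsed seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print(f"sequential: {time_sequential(n, 4):.2f}s")
    print(f"4 threads:  {time_threaded(n, 4):.2f}s")
```

If the GIL behaves as described, the two timings should be in the same ballpark rather than the threaded run being four times faster.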

There are exceptions to this. If your code's heavy computation doesn't actually happen in Python, but in some library with custom C code that does proper GIL handling, like a numpy app, you will get the expected performance benefit from threading. The same is true if the heavy computation is done by some subprocess that you run and wait on.

More importantly, there are cases where this doesn't matter. For example, a network server spends most of its time reading packets off the network, and a GUI app spends most of its time waiting for user events. One reason to use threads in a network server or GUI app is to allow you to do long-running "background tasks" without stopping the main thread from continuing to service network packets or GUI events. And that works just fine with Python threads. (In technical terms, this means Python threads give you concurrency, even though they don't give you core-parallelism.)
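To illustrate the waiting case, here is a small sketch in which `time.sleep` stands in for a blocking network read (an assumption made so the example is self-contained; `fake_io` and `run_concurrently` are hypothetical names). Blocking calls like this release the GIL, so the waits overlap.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(task_id, delay=0.1):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return task_id


def run_concurrently(task_ids, delay=0.1):
    """Run every wait in a thread pool; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(task_ids)) as executor:
        results = list(executor.map(lambda t: fake_io(t, delay), task_ids))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_concurrently([1, 2, 3, 4])
    # Elapsed time should be close to one delay, not the sum of all four.
    print(results, f"{elapsed:.2f}s")
```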

But if you're writing a CPU-bound program in pure Python, using more threads is generally not helpful.

Using separate processes has no such problems with the GIL, because each process has its own separate GIL. Of course, you still have all the same tradeoffs between threads and processes as in any other language: it's more difficult and more expensive to share data between processes than between threads, it can be costly to run a huge number of processes or to create and destroy them frequently, etc. But the GIL weighs heavily on the balance toward processes, in a way that isn't true for, say, C or Java. So, you will find yourself using multiprocessing a lot more often in Python than you would in C or Java.

Meanwhile, Python's "batteries included" philosophy brings some good news: It's very easy to write code that can be switched back and forth between threads and processes with a one-liner change.

If you design your code in terms of self-contained "jobs" that don't share anything with other jobs (or the main program) except input and output, you can use the concurrent.futures library to write your code around a thread pool like this:

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    executor.submit(job, argument)
    executor.map(some_function, collection_of_independent_things)
    # ...

You can even get the results of those jobs and pass them on to further jobs, wait for things in order of execution or in order of completion, etc.; read the section on Future objects for details.
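As a sketch of what working with results can look like, the following uses a hypothetical `job` function (squaring a number) and a hypothetical `run_jobs` wrapper. It shows results in input order via `map`, results in completion order via `as_completed`, and one result fed into a further job.

```python
import concurrent.futures


def job(x):
    """Hypothetical self-contained job: nothing shared, just input and output."""
    return x * x


def run_jobs(inputs):
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
        # map() yields results in the same order as the inputs.
        in_order = list(executor.map(job, inputs))

        # submit() returns Future objects; as_completed() yields each one as it finishes.
        futures = {executor.submit(job, x): x for x in inputs}
        by_completion = {}
        for future in concurrent.futures.as_completed(futures):
            by_completion[futures[future]] = future.result()

        # A result can be passed on to a further job.
        chained = executor.submit(job, in_order[-1]).result()
    return in_order, by_completion, chained


if __name__ == "__main__":
    print(run_jobs([1, 2, 3]))
```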

Now, if it turns out that your program is constantly using 100% CPU, and adding more threads just makes it slower, then you're running into the GIL problem, so you need to switch to processes. All you have to do is change that first line:

with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:

The only real caveat is that your jobs' arguments and return values have to be pickleable (and not take too much time or memory to pickle) to be usable cross-process. Usually this isn't a problem, but sometimes it is.
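One way to check the pickling caveat before switching executors is to try pickling the arguments yourself. The helper `is_picklable` below is a hypothetical name, not part of any library; it simply wraps `pickle.dumps`.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, which cross-process jobs require."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    print(is_picklable([1, 2, 3]))         # plain data
    print(is_picklable(lambda x: x + 1))   # lambdas are pickled by name, which fails
    print(is_picklable(threading.Lock()))  # locks cannot be pickled
```

Functions are pickled by reference to their module and name, so jobs submitted to a process pool generally need to be defined at module top level.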

But what if your jobs can't be self-contained? If you can design your code in terms of jobs that pass messages from one to another, it's still pretty easy. You may have to use threading.Thread or multiprocessing.Process instead of relying on pools. And you will have to create queue.Queue or multiprocessing.Queue objects explicitly. (There are plenty of other options—pipes, sockets, files with flocks, … but the point is, you have to do something manually if the automatic magic of an Executor is insufficient.)
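A minimal message-passing sketch using threading.Thread and queue.Queue. The producer/consumer split, the doubling step, and the use of `None` as a stop marker are all assumptions made for illustration.

```python
import queue
import threading

STOP = None  # sentinel telling the consumer to exit; real items must not be None


def producer(out_queue, items):
    """Send each item to the consumer, then signal that no more are coming."""
    for item in items:
        out_queue.put(item)
    out_queue.put(STOP)


def consumer(in_queue, results):
    """Receive items until the sentinel arrives, doubling each one."""
    while True:
        item = in_queue.get()
        if item is STOP:
            break
        results.append(item * 2)


def run_pipeline(items):
    messages = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=producer, args=(messages, items)),
        threading.Thread(target=consumer, args=(messages, results)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

A process-based version would swap in multiprocessing.Process and multiprocessing.Queue, and would need a second queue to return results, since the `results` list would not be shared with the child.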

But what if you can't even rely on message passing? What if you need two jobs to both mutate the same structure, and see each other's changes? In that case, you will need to do manual synchronization (locks, semaphores, conditions, etc.) and, if you want to use processes, explicit shared-memory objects to boot. This is when multithreading (or multiprocessing) gets difficult. If you can avoid it, great; if you can't, you will need to read more than someone can put into an SO answer.
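The simplest form of manual synchronization is a lock around every mutation of the shared structure. The `Counter` class and `hammer` helper below are hypothetical names for this sketch; real programs usually need more care than this.

```python
import threading


class Counter:
    """Shared counter whose increments are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, the read-modify-write below could interleave
        # between threads and lose updates.
        with self._lock:
            self.value += 1


def hammer(counter, n_threads, n_increments):
    """Increment the counter from several threads and return the final value."""

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(hammer(Counter(), n_threads=4, n_increments=10_000))
```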

From a comment, you wanted to know what's different between threads and processes in Python. Really, if you read Giulio Franco's answer and mine and all of our links, that should cover everything… but a summary would definitely be useful, so here goes:

1. Threads share data by default; processes do not.
2. As a consequence of (1), sending data between processes generally requires pickling and unpickling it.**
3. As another consequence of (1), directly sharing data between processes generally requires putting it into low-level formats like Value, Array, and ctypes types.
4. Processes are not subject to the GIL.
5. On some platforms (mainly Windows), processes are much more expensive to create and destroy.
6. There are some extra restrictions on processes, some of which are different on different platforms. See Programming guidelines for details.
7. The threading module doesn't have some of the features of the multiprocessing module. (You can use multiprocessing.dummy to get most of the missing API on top of threads, or you can use higher-level modules like concurrent.futures and not worry about it.)
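To make the point about low-level shared formats concrete, here is a sketch that shares a single C integer between processes with multiprocessing.Value. The function names `add_many` and `run_workers` are hypothetical.

```python
import multiprocessing


def add_many(counter, n):
    """Increment a shared multiprocessing.Value n times under its built-in lock."""
    for _ in range(n):
        with counter.get_lock():
            counter.value += 1


def run_workers(n_workers, n_increments):
    """Start several processes that all update one shared integer."""
    counter = multiprocessing.Value("i", 0)  # "i" is a C int in shared memory
    workers = [
        multiprocessing.Process(target=add_many, args=(counter, n_increments))
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers(n_workers=4, n_increments=1000))
```

Note that only the typed value is shared; an ordinary Python int or list passed the same way would be copied into each child instead.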

* It's not actually Python, the language, that has this issue, but CPython, the "standard" implementation of that language. Some other implementations don't have a GIL, like Jython.

** If you're using the fork start method for multiprocessing (which you can on most non-Windows platforms), each child process gets any resources the parent had when the child was started, which can be another way to pass data to children.

Comparison between threading module and multiprocessing module

Adding to @zmbq's answer: threading will be slower only when you are doing a computationally intensive task, because of the GIL. If your operations are I/O-bound (or otherwise spend most of their time waiting), threading will generally be faster, since there is less overhead involved. Please refer to the following blog post for a better understanding of this.

Exploiting Multiprocessing and Multithreading in Python as a Data Scientist

Hope this helps!

What's the difference between ThreadPool vs Pool in the multiprocessing module?

The multiprocessing.pool.ThreadPool behaves the same as the multiprocessing.Pool, the only difference being that it uses threads instead of processes to run the worker logic.

The reason you see

hi outside of main()

being printed multiple times with the multiprocessing.Pool is due to the fact that the pool will spawn 5 independent processes. Each process will initialize its own Python interpreter and load the module resulting in the top level print being executed again.

Note that this happens only if the spawn process creation method is used (the only method available on Windows). If you use the fork method (Unix), you will see the message printed only once, as with the threads.
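The script from the question is not shown here, so the following is only a hypothetical reconstruction of the situation: a top-level print, a worker function `double`, and a helper `run_with` that accepts either pool class. All three names are assumptions.

```python
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool

# Top-level code: with the spawn start method, every worker process
# re-imports this module and runs this line again.
print("hi outside of main()")


def double(x):
    return x * 2


def run_with(pool_class, values):
    """Map double over values using either Pool or ThreadPool (same interface)."""
    with pool_class(5) as pool:
        return pool.map(double, values)


if __name__ == "__main__":
    # The guard keeps re-imported copies of the module from starting pools of their own.
    print(run_with(ThreadPool, [1, 2, 3]))
    print(run_with(Pool, [1, 2, 3]))
```

Moving anything that should run once under the `if __name__ == "__main__":` guard avoids the repeated output regardless of start method.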

At the time of writing, the multiprocessing.pool.ThreadPool is not documented, as its implementation has never been completed. It lacks tests and documentation. You can see its implementation in the source code.

I believe the next natural question is: when to use a thread based pool and when to use a process based one?

The rule of thumb is:

IO bound jobs -> multiprocessing.pool.ThreadPool
CPU bound jobs -> multiprocessing.Pool
Hybrid jobs -> depends on the workload, I usually prefer the multiprocessing.Pool due to the advantage process isolation brings

On Python 3, you might want to take a look at the concurrent.futures.Executor pool implementations.

deciding among subprocess, multiprocessing, and thread in Python?

multiprocessing is a great Swiss-army knife type of module. It is more general than threads, as you can even perform remote computations. This is therefore the module I would suggest you use.

The subprocess module would also allow you to launch multiple processes, but I found it to be less convenient to use than the new multiprocessing module.
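For comparison, launching a separate interpreter with subprocess looks roughly like this. The helper name `run_python_snippet` is hypothetical; the child here is just another Python process started from `sys.executable`.

```python
import subprocess
import sys


def run_python_snippet(code):
    """Run code in a separate Python interpreter and return its stdout as text."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout.strip()


if __name__ == "__main__":
    print(run_python_snippet("print(6 * 7)"))
```

Everything crosses the process boundary as text or bytes here, which is part of why multiprocessing tends to be more convenient when the child is also Python code.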

Threads are notoriously subtle, and with CPython you are often limited to one core when using them (even though, as noted in one of the comments, the Global Interpreter Lock (GIL) can be released in C code called from Python code).

I believe that most of the functions of the three modules you cite can be used in a platform-independent way. On the portability side, note that multiprocessing has only been part of the standard library since Python 2.6 (a backport for some older versions of Python does exist, though). But it's a great module!

multiprocessing vs multithreading vs asyncio in Python 3

They are intended for (slightly) different purposes and/or requirements. CPython (the typical, mainline Python implementation) still has the global interpreter lock, so a multi-threaded application (a standard way to implement parallel processing nowadays) is suboptimal. That's why multiprocessing may be preferred over threading. But not every problem can be effectively split into [almost independent] pieces, so there may be a need for heavy interprocess communication. That's why multiprocessing may not be preferred over threading in general.

asyncio (the technique is not unique to Python; other languages and frameworks also have it, e.g. Boost.ASIO) is a method to efficiently handle a lot of I/O operations from many simultaneous sources without needing parallel code execution. So it's just a solution (a good one indeed!) for a particular task, not for parallel processing in general.
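A small sketch of the asyncio approach, with `asyncio.sleep` standing in for a network wait (an assumption to keep the example self-contained; `fake_request`, `fetch_all`, and `run` are hypothetical names). All the waits are handled by one thread.

```python
import asyncio
import time


async def fake_request(request_id, delay=0.1):
    """Stand-in for a network call; awaiting the sleep yields control to the event loop."""
    await asyncio.sleep(delay)
    return request_id


async def fetch_all(request_ids, delay=0.1):
    """Start every request at once and collect the results in input order."""
    return await asyncio.gather(*(fake_request(r, delay) for r in request_ids))


def run(request_ids, delay=0.1):
    """Run the event loop to completion; return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(fetch_all(request_ids, delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run(list(range(10)))
    # Ten overlapping waits should take roughly one delay, not ten.
    print(results, f"{elapsed:.2f}s")
```

No code runs in parallel here; the event loop simply switches between tasks whenever one of them is waiting.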

python multiprocessing vs threading for cpu bound work on windows and linux

Processes are much more lightweight under UNIX variants. Windows processes are heavy and take much more time to start up. Threads are the recommended way of running tasks concurrently on Windows.