Does Python Support Multithreading? Can It Speed Up Execution Time?

Does Python support multithreading? Can it speed up execution time?

The GIL does not prevent threading. All the GIL does is make sure only one thread is executing Python code at a time; control still switches between threads.

What the GIL prevents, then, is making use of more than one CPU core or separate CPUs to run threads in parallel.

This only applies to Python code. C extensions can and do release the GIL to allow multiple threads of C code and one Python thread to run across multiple cores. This extends to I/O controlled by the kernel, such as select() calls for socket reads and writes, making Python handle network events reasonably efficiently in a multi-threaded multi-core setup.
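
As an illustration, here is a minimal sketch (Python 3, with a placeholder URL) of blocking network I/O overlapping across threads, because the GIL is released while each thread waits in kernel/C code:

# A minimal sketch (placeholder URL): each urlopen call blocks in kernel/C code
# and releases the GIL, so the two downloads overlap in time even though only
# one thread at a time executes Python bytecode.
import threading
from urllib.request import urlopen

def fetch(url):
    with urlopen(url) as response:
        response.read()

threads = [threading.Thread(target=fetch, args=("https://example.com/",))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()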

What many server deployments do, then, is run more than one Python process and let the OS handle the scheduling between processes to utilize your CPU cores to the max. You can also use the multiprocessing library to handle parallel processing across multiple processes from one codebase and parent process, if that suits your use cases.
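
For example, a minimal multiprocessing sketch (the worker function and inputs below are purely illustrative): each worker process has its own interpreter and its own GIL, so the calls can run on separate cores.

# A minimal sketch; cpu_heavy() is a made-up stand-in for real CPU-bound work.
from multiprocessing import Pool

def cpu_heavy(n):
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool() as pool:                      # one worker process per CPU core by default
        results = pool.map(cpu_heavy, [10**6] * 8)
    print(results[:2])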

Note that the GIL is only applicable to the CPython implementation; Jython and IronPython use a different threading implementation (the native Java VM and .NET common runtime threads respectively).

To address your update directly: any task that tries to get a speed boost from parallel execution using pure Python code will not see a speed-up, as threaded Python code is locked to one thread executing at a time. If you mix in C extensions and I/O, however (such as PIL or numpy operations), then any C code can run in parallel with one active Python thread.

Python threading is great for creating a responsive GUI, or for handling multiple short web requests where I/O is the bottleneck more than the Python code. It is not suitable for parallelizing computationally intensive Python code; stick to the multiprocessing module for such tasks, or delegate to a dedicated external library.

How to get faster speed when using multithreading in Python

In many cases, Python's threading doesn't improve execution speed very much... sometimes, it makes it worse. For more information, see David Beazley's PyCon 2010 presentation on the Global Interpreter Lock and the accompanying PyCon 2010 GIL slides. The presentation is very informative; I highly recommend it to anyone considering threading.

Even though David Beazley's talk explains that network traffic improves the scheduling of the Python threading module, you should use the multiprocessing module. I included this as an option in your code (see the bottom of my answer).

Running this on one of my older machines (Python 2.6.6):

current_post.mode == "Process"  (multiprocessing)  --> 0.2609 seconds
current_post.mode == "Multiple" (threading) --> 0.3947 seconds
current_post.mode == "Simple" (serial execution) --> 1.650 seconds

I agree with TokenMacGuy's comment, and the numbers above include moving the .join() to a different loop. As you can see, Python's multiprocessing is significantly faster than threading.


from multiprocessing import Process
import threading
import time
import urllib
import urllib2

class Post:

    def __init__(self, website, data, mode):
        self.website = website
        self.data = data

        #mode is either:
        # "Simple"   (Simple POST)
        # "Multiple" (Multi-thread POST)
        # "Process"  (Multiprocessing)
        self.mode = mode
        self.run_job()

    def post(self):

        #post data
        req = urllib2.Request(self.website)
        open_url = urllib2.urlopen(req, self.data)

        if self.mode == "Multiple":
            time.sleep(0.001)

        #read HTMLData
        HTMLData = open_url.read()

        #print "OK"

    def run_job(self):
        """This was refactored from the OP's code"""
        origin_time = time.time()
        if self.mode == "Multiple":

            #multithreading POST
            threads = list()
            for i in range(0, 10):
                thread = threading.Thread(target=self.post)
                thread.start()
                threads.append(thread)
            for thread in threads:
                thread.join()
            #calculate the time interval
            time_interval = time.time() - origin_time
            print "mode - {0}: {1}".format(self.mode, time_interval)

        if self.mode == "Process":

            #multiprocessing POST
            processes = list()
            for i in range(0, 10):
                process = Process(target=self.post)
                process.start()
                processes.append(process)
            for process in processes:
                process.join()
            #calculate the time interval
            time_interval = time.time() - origin_time
            print "mode - {0}: {1}".format(self.mode, time_interval)

        if self.mode == "Simple":

            #simple POST
            for i in range(0, 10):
                self.post()
            #calculate the time interval
            time_interval = time.time() - origin_time
            print "mode - {0}: {1}".format(self.mode, time_interval)
        return time_interval

if __name__ == "__main__":

    for method in ["Process", "Multiple", "Simple"]:
        Post("http://forum.xda-developers.com/login.php",
             "vb_login_username=test&vb_login_password&securitytoken=guest&do=login",
             method
             )

Why is there no execution time difference between multithreading and single-threading

CPython (the standard implementation of Python) does not run Python threads on different CPUs in parallel. So you can indeed have multiple threads, but only one of them executes Python code at any given moment, and you will not get speed improvements for CPU-bound work (you would for I/O-bound work).

The reason for that is the infamous GIL (global interpreter lock). Python's core is not thread-safe because of the way it does memory management and garbage collection (reference counting), so it uses a lock, which means threads executing Python code run one after the other.

In your particular case, you are doing some I/O and some processing. There is significant overhead to multiprocessing in Python, and it is not compensated by the speed gained on your I/O (the time to read your files is probably small compared to the time to process them).

If you need real multithreading, look at Cython (not to be confused with CPython) and its nogil support, or C extensions, or the multiprocessing module.
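
As a rough, hypothetical sketch of the Cython route (this is a .pyx file that must be compiled, and the function name is made up; it is not plain Python):

# sum_squares.pyx -- hypothetical Cython sketch; compile with cythonize (and
# OpenMP flags for real parallelism). prange(..., nogil=True) releases the GIL,
# so the loop iterations can run on several cores at once.
from cython.parallel import prange

def sum_squares(long n):
    cdef long i
    cdef long total = 0
    for i in prange(n, nogil=True):
        total += i * i
    return total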

Improve execution time with multithreading in Python

The short answer: Not easily.

Here is an example of using a multiprocessing pool to speed up your code:

import random
import time
from multiprocessing import Pool

startTime = time.time()

def f(_):
    number = str(random.randint(1, 9999))
    data.write(number + '\n')

data = open("file2.txt", "a+")
with Pool() as p:
    p.map(f, range(1000000))
data.close()

executionTime = (time.time() - startTime)
print(f'Execution time in seconds: {executionTime}')

Looks good? Wait! This is not a drop-in replacement, as it lacks synchronization between the processes, so not all 1000000 lines will be written (some will be overwritten in the same buffer)!
See Python multiprocessing safely writing to a file

So we need to separate computing the numbers (in parallel) from writing them (in serial). We can do this as follows:

import random
import time
from multiprocessing import Pool

startTime = time.time()

def f(_):
    return str(random.randint(1, 9999))

with Pool() as p:
    output = p.map(f, range(1000000))

with open("file2.txt", "a+") as data:
    data.write('\n'.join(output) + '\n')

executionTime = (time.time() - startTime)
print(f'Execution time in seconds: {executionTime}')

With that fixed, note that this is not multithreading but uses multiple processes. You can change it to multithreading with a different pool object:

from multiprocessing.pool import ThreadPool as Pool

On my system, the time drops from 1 second to 0.35 seconds with the process pool. With the ThreadPool, however, it takes up to 2 seconds!

The reason is that Python's global interpreter lock prevents multiple threads from processing your task efficiently; see What is the global interpreter lock (GIL) in CPython?

To conclude, multithreading is not always the right answer:

  1. In your scenario, one limit is file access: only one thread can write to the file at a time, or you would need to introduce locking, making any performance gain moot.
  2. Also, in Python multithreading is only suitable for specific tasks, e.g. long computations that happen in a library below Python and can therefore run in parallel. In your scenario, the overhead of multithreading negates the small potential performance benefit.

The upside: Yes, with multiprocessing instead of multithreading I got a 3x speedup on my system.

Python - Why doesn't multithreading increase the speed of my code?

The primary reason you aren't seeing any performance improvements with multiple threads is that your program only enables one thread to do anything useful at a time. The other thread is always blocked.

Two things:

Remove the print statement that's invoked inside the lock. print statements drastically impact performance and timing, and the I/O channel to stdout is essentially single-threaded, so you've built another implicit lock into your code. Let's just remove the print statement.

Use a proper sleep technique instead of "spin locking" and counting up from 0 to 30000. That's just going to burn a core needlessly.

Try this as your main loop:

while True:
    arr_lock.acquire()
    if len(arr) > 0:
        item = arr.pop(0)
        arr_lock.release()
        time.sleep(0)
    else:
        arr_lock.release()
        break

This should run slightly better... I would even advocate getting the sleep statement out altogether so you can just let each thread have a full quantum.

However, because each thread is either doing "nothing" (sleeping or blocked on acquire) or just doing a single pop call on the array while in the lock, the majority of the time spent is going to be in the acquire/release calls instead of actually operating on the array. Hence, multiple threads aren't going to make your program run faster.
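
If the goal is simply to hand items out to worker threads, a queue.Queue keeps the locking inside its own optimized code and avoids the hand-rolled acquire/release loop; a rough sketch (the per-item work is a placeholder):

# A minimal sketch using queue.Queue instead of a manual Lock; "process item"
# is a placeholder for whatever the real work is.
import queue
import threading

work = queue.Queue()
for i in range(10000):
    work.put(i)

def worker():
    while True:
        try:
            item = work.get_nowait()
        except queue.Empty:
            break
        # ... process item here ...

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()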

Multithreading in Python: is it really performance-efficient most of the time?

You're right about the GIL: there is no point in using multithreading for CPU-bound computation, as only one thread will use the CPU at a time.

But that previous statement may have enlightened you: if your computation is not CPU-bound, you may take advantage of multithreading.

A typical example is when your application spends most of its time waiting for something.

One of many examples of a non-CPU-bound program:
Say you want to build a web crawler. You have to crawl many websites and store them in a database. What costs time? Waiting for the servers to send data, actually downloading the data, and storing it in the database; nothing CPU-bound here. Here you may get a faster crawler by using a pool of crawlers instead of a single one. Typically, if one website is almost down and very slow to respond (~30 s), a single-threaded application will sit waiting for that website and you're stuck. In a multithreaded application, the other threads keep crawling in the meantime, and that's cool.
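
A rough sketch of that idea (placeholder URL, no error handling, and the "store in the database" step is just a print): a pool of threads fetches pages concurrently, so one slow site doesn't stall the others.

# A minimal crawler sketch; while one thread waits on a slow server, the other
# threads keep downloading.
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

sites = ["https://example.com/"] * 20  # placeholder URLs

def crawl(url):
    with urlopen(url, timeout=30) as resp:
        return url, len(resp.read())

with ThreadPoolExecutor(max_workers=10) as pool:
    for url, size in pool.map(crawl, sites):
        print(url, size)  # a real crawler would store the page in the database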

On the other hand, as there is one GIL per process, you may use multiprocessing to do CPU-bound computation.

As a side note, there exist some more or less complete implementations of Python without the GIL; I'd like to mention one that I think is well on its way to achieving something cool: PyPy STM. Searching for "get rid of the GIL", you'll easily find a lot of threads about the subject.

Python multithreading and multiprocessing to speed up loops

You won't get any speedups in Python using multithreading because of the GIL; it's a mutex for the interpreter. You need to use the multiprocessing package instead, which is included in the standard distribution.

from multiprocessing import Pool

pool = Pool()

Then just use map or starmap; you can find the docs here. But first consider whether you can vectorize your code using numpy; it would be faster that way.
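
For instance, a quick sketch (square() is a stand-in for your own loop body) of the same work done with Pool.map and then as a vectorized numpy expression:

# A minimal sketch; square() stands in for whatever your loop body does.
from multiprocessing import Pool
import numpy as np

def square(x):
    return x * x

if __name__ == "__main__":
    values = list(range(1000000))

    with Pool() as pool:                      # spread the Python-level calls over processes
        squared = pool.map(square, values)

    squared_np = np.asarray(values) ** 2      # vectorized: the loop runs in C, often faster still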

How to use multithreading in Python and speed up code

For this CPU-bound task you can use multiprocessing.pool.Pool to get parallelism. Here's a reduced example that saturates all four cores on my system:

import matplotlib.pyplot as plt
from multiprocessing.pool import Pool
from types import SimpleNamespace
import numpy as np

def map_kg1_efit(arg):
    data = arg[0]
    chan = arg[1]
    density = np.zeros(968)
    for it in range(0, data.ntefit):
        density[it] = it
        for jj in range(0, data.ntkg1v):
            density[it] = density[it] + jj
    data.KG1LH_data.lid[chan] = density
    return (data, chan)

if __name__ == "__main__":
    data = SimpleNamespace()
    data.KG1LH_data = SimpleNamespace()
    data.ntkg1v = 30039
    data.ntefit = 968
    data.KG1LH_data.lid = [[], [], [], [], [], [], [], []]
    with Pool(4) as pool:
        results = pool.map(map_kg1_efit, [(data, chan) for chan in range(1, 8)])
    for r in results:
        plt.figure()
        plt.plot(range(0, r[0].ntefit), r[0].KG1LH_data.lid[r[1]])
    plt.show()

