Python: multithreading multiple bash subprocesses, but with setting an environment variable?
Here's the working code, thanks to the comments from Chepner and JohanL:
```python
import subprocess, os

def make_tmp(tmp_path):
    my_env = os.environ.copy()
    my_env['TMP'] = tmp_path
    dos = 'echo ' + tmp_path[-1] + ' > ' + my_env['TMP'] + '\\output.log'
    # or run a bat script
    # dos = 'C:\\Temp\\launch.bat'
    subprocess.Popen(dos, env=my_env, stdout=subprocess.PIPE, shell=True)

from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(2)
path_array = ['C:\\Temp\\1', 'C:\\Temp\\2', 'C:\\Temp\\3']
results = pool.map(make_tmp, path_array)
```
Note that the paths in path_array must already exist as folders.
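The same pattern (copy the parent environment, override one variable per task, hand the copy to the child via env=) can be sketched portably. In this sketch, MY_TMP and run_with_env are made-up names for the demo, and the child is a Python one-liner that echoes the variable back so the result is observable:

```python
import os
import subprocess
import sys
from multiprocessing.dummy import Pool as ThreadPool  # thread pool, same API as multiprocessing.Pool

def run_with_env(value):
    # Copy the parent environment and override one variable for this task only.
    env = os.environ.copy()
    env['MY_TMP'] = value  # MY_TMP is a hypothetical variable for the demo

    # Ask a child process to print the variable back; capture its stdout.
    out = subprocess.run(
        [sys.executable, '-c', 'import os; print(os.environ["MY_TMP"])'],
        env=env, capture_output=True, text=True,
    )
    return out.stdout.strip()

pool = ThreadPool(2)
results = pool.map(run_with_env, ['a', 'b', 'c'])
print(results)  # → ['a', 'b', 'c']
```

Because each task gets its own copy of the environment, the threads cannot clobber each other's TMP/MY_TMP values, and pool.map still returns results in input order.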
How do I run multiple subprocesses in parallel and wait for them to finish in Python
You can still use Popen, which takes the same input parameters as subprocess.call but is more flexible. From the docs on subprocess.call: "The full function signature is the same as that of the Popen constructor - this function passes all supplied arguments directly through to that interface." One difference is that subprocess.call blocks and waits for the subprocess to complete (it is built on top of Popen), whereas Popen doesn't block and consequently allows you to launch other processes in parallel.
Try the following:
```python
from subprocess import Popen

commands = ['command1', 'command2']
procs = [Popen(i) for i in commands]
for p in procs:
    p.wait()
```
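Since wait() also returns each process's exit status, the same launch-then-wait pattern can collect return codes. A minimal runnable sketch, using Python one-liners as harmless stand-ins for real commands:

```python
from subprocess import Popen
import sys

# Placeholder commands: two short Python one-liners standing in for real programs.
commands = [
    [sys.executable, '-c', 'print("one")'],
    [sys.executable, '-c', 'print("two")'],
]

# Launch everything first (non-blocking), then wait for all of them.
procs = [Popen(cmd) for cmd in commands]
return_codes = [p.wait() for p in procs]
print(return_codes)  # → [0, 0]
```

If the commands produce a lot of output on captured pipes, prefer p.communicate() over p.wait() to avoid a pipe-buffer deadlock.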
deciding among subprocess, multiprocessing, and thread in Python?
multiprocessing is a great Swiss-army-knife type of module. It is more general than threads, as you can even perform remote computations. This is therefore the module I would suggest you use.
The subprocess module would also allow you to launch multiple processes, but I found it to be less convenient to use than the new multiprocessing module.
Threads are notoriously subtle, and, with CPython, you are often limited to one core with them (even though, as noted in one of the comments, the Global Interpreter Lock (GIL) can be released in C code called from Python code).
I believe that most of the functions of the three modules you cite can be used in a platform-independent way. On the portability side, note that multiprocessing has only come standard since Python 2.6 (a version for some older versions of Python does exist, though). But it's a great module!
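As a minimal illustration of the multiprocessing module (square and parallel_squares are hypothetical names for this sketch):

```python
from multiprocessing import Pool

def square(x):
    return x * x

def parallel_squares(values):
    # Worker processes may re-import this module (spawn/forkserver start
    # methods), so pool creation must not happen at import time.
    with Pool(2) as pool:
        return pool.map(square, values)

if __name__ == '__main__':
    print(parallel_squares([1, 2, 3, 4]))  # → [1, 4, 9, 16]
```

The `if __name__ == '__main__'` guard is required for multiprocessing on platforms that spawn fresh interpreters for the workers; Pool.map otherwise behaves like the built-in map, but runs the calls across processes and so sidesteps the GIL limitation mentioned above.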
Python: execute cat subprocess in parallel
Another approach (rather than the other suggestion of putting shell processes in the background) is to use multithreading.
The run method that you have would then do something like this:

```python
thread.start_new_thread(myFuncThatDoesZGrep, ())
```

(Note that start_new_thread requires an args tuple as its second argument; the thread module is Python 2, renamed _thread in Python 3.)
To collect results, you can do something like this:
```python
import threading

class MyThread(threading.Thread):
    def run(self):
        self.finished = False
        # Your code to run the command here.
        blahBlah()
        # Store the results before raising the flag, so a poller
        # that sees finished == True can safely read them.
        self.results = []
        self.finished = True
```
Run the thread as shown above. When your thread object has myThread.finished == True, you can collect the results via myThread.results.
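In most cases the finished flag can be dropped entirely, because threading.Thread.join() blocks until run() has returned. A runnable sketch of that approach, with GrepThread as a hypothetical stand-in for the zgrep work:

```python
import threading

class GrepThread(threading.Thread):
    """Stand-in for a thread that runs a command and stores its output."""
    def run(self):
        # Placeholder for the real zgrep/command invocation.
        self.results = ['match1', 'match2']

t = GrepThread()
t.start()
t.join()            # blocks until run() has returned
print(t.results)    # → ['match1', 'match2']
```

join() gives you a happens-before guarantee: after it returns, everything run() wrote (including self.results) is visible to the caller, with no busy-polling of a flag.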