Multiprocessing example giving AttributeError
This behavior seems to be a design limitation of multiprocessing.Pool rather than a bug; see https://bugs.python.org/issue25053. Pool does not always work with objects that are not defined in an importable module, so the workaround is to put your function in a separate file and import that module.
File: defs.py
def f(x):
    return x*x
File: run.py
from multiprocessing import Pool
import defs
if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(defs.f, [1, 2, 3]))
If you use print or a different built-in function, the example should work. If this is not a bug (according to the link), the given example is chosen badly.
Multiprocessing example giving AttributeError on Mac
Python for macOS changed the default "start method" from "fork" to "spawn" in Python 3.8. This means you now need to protect your code from execution on import by placing all multiprocessing code (and anything else that should only run in the parent process) inside an if __name__ == "__main__": block. You will also run into issues with interactive interpreters like Jupyter or IPython, because there isn't always a "main" file to import. It is sometimes possible to configure your IDE to get around this, but the most compatible solution is to run your code from a system terminal. Alternatively, you could manually switch back to "fork", which is faster and uses less memory in some instances but can deadlock when used with certain threaded modules (like logging).
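As a rough sketch of the options described above, the start method can be selected per pool through multiprocessing.get_context() instead of changing the global default. The helper name choose_start_method and the use of math.sqrt as the worker function are assumptions made for this illustration, not part of the original answer.

```python
import math
import multiprocessing


def choose_start_method(preferred="spawn"):
    """Return `preferred` if this platform supports it, else the platform default.

    Hypothetical helper for this sketch, not part of multiprocessing.
    """
    available = multiprocessing.get_all_start_methods()
    # The first entry of get_all_start_methods() is the platform default.
    return preferred if preferred in available else available[0]


if __name__ == "__main__":
    # get_context() returns an object with the same API as the
    # multiprocessing module, bound to one start method, without
    # changing the global default.
    ctx = multiprocessing.get_context(choose_start_method("spawn"))
    with ctx.Pool(2) as pool:
        print(pool.map(math.sqrt, [1, 4, 9]))
```

Passing "fork" instead of "spawn" would select the older behavior on platforms that support it, with the deadlock caveat mentioned above.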
Python multiprocessing.Pool: AttributeError
Error 1:
AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'
You solved this error yourself by moving the nested target function single() out to the top level.
Background:
Pool needs to pickle (serialize) everything it sends to its worker processes (IPC). Pickling a function only saves a reference to it (its module and qualified name), and unpickling requires re-importing the function by that name. For that to work, the function needs to be defined at the top level of a module. Nested functions are not importable by the child, and even trying to pickle them raises an exception.
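The pickling behavior described above can be checked without starting a pool. This is a minimal sketch: json.dumps stands in for any importable top-level function, and make_nested is a hypothetical helper that only exists to produce a nested function.

```python
import json
import pickle


def make_nested():
    # A function defined inside another function has no importable name.
    def single(x):
        return x * x
    return single


# A module-level function is pickled as a reference (module + name),
# so the payload contains those names rather than the function's code.
payload = pickle.dumps(json.dumps)
names_in_payload = b"json" in payload and b"dumps" in payload
same_object = pickle.loads(payload) is json.dumps

# Pickling the nested function fails before anything is sent to a worker.
# The exception type may differ between Python versions, so both
# candidates are caught here.
try:
    pickle.dumps(make_nested())
    nested_error = None
except (AttributeError, pickle.PicklingError) as exc:
    nested_error = exc

print(names_in_payload, same_object, type(nested_error).__name__)
```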
Error 2:
AttributeError: Can't get attribute 'single' on <module '__main__' from '.../test.py'>
You are starting the pool before you define your function and classes, so the child processes cannot inherit that code. Move the pool start to the bottom of the file and protect it with if __name__ == '__main__':
import multiprocessing

class OtherClass:
    def run(self, sentence, graph):
        return False

def single(params):
    other = OtherClass()
    sentences, graph = params
    return [other.run(sentence, graph) for sentence in sentences]

class SomeClass:
    def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]

    def some_method(self):
        return list(pool.map(single, zip(self.sentences, self.graphs)))

if __name__ == '__main__':  # <- prevent RuntimeError for 'spawn'
    # and 'forkserver' start_methods
    with multiprocessing.Pool(multiprocessing.cpu_count() - 1) as pool:
        print(SomeClass().some_method())
Appendix
...I would like to spread the work over all of my cores.
Potentially helpful background on how multiprocessing.Pool is chunking work:
Python multiprocessing: understanding logic behind chunksize
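For a rough idea of what that chunking looks like, here is a sketch of the default chunksize calculation. It is an assumption based on CPython's multiprocessing/pool.py, which is an implementation detail and may change; default_chunksize is a hypothetical name, not a library function.

```python
def default_chunksize(n_items, n_workers):
    """Approximate the chunksize Pool.map() picks when none is passed.

    Assumption: mirrors the calculation in CPython's multiprocessing/pool.py.
    """
    if n_items == 0:
        return 0
    # Aim for roughly four chunks per worker, rounding up.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize


print(default_chunksize(100, 4))  # 100 items, 4 workers -> 7
```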
Multiprocessing with Python 2.7 throwing attribute error
- In Python 2.7, multiprocessing.Pool is not a context manager and thus it can't be used in a with statement.
  - Solution - create a pool using regular assignment to a variable:
    my_pool = Pool(4)
    my_pool.map(...)
- lambda functions don't work with multiprocessing.Pool, even in Python 3.
  - Solution - emulate a closure using functools.partial:
    from functools import partial
    def run_test_function(x, fun_arg2, fun_arg3, fun_arg4):
        # your code here
        ...
    process_func = partial(run_test_function, fun_arg2=arg2, fun_arg3=arg3, fun_arg4=arg4)
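The difference between a lambda and a partial can be seen directly with pickle, which is what Pool uses to ship the callable to its workers. This is a small Python 3 sketch; the built-in pow is used as the wrapped function purely for illustration.

```python
import pickle
from functools import partial

# A partial over an importable function pickles fine: pickle stores a
# reference to the function plus the bound arguments.
two_to_the = partial(pow, 2)  # two_to_the(x) == pow(2, x)
restored = pickle.loads(pickle.dumps(two_to_the))

# The equivalent lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: pow(2, x))
    lambda_error = None
except (pickle.PicklingError, AttributeError) as exc:
    lambda_error = exc

print(restored(10), type(lambda_error).__name__)
```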
Putting this together:
from multiprocessing import Pool
from functools import partial

def run_test_function(x, fun_arg2, fun_arg3, fun_arg4):
    # this is an example
    print x, fun_arg2, fun_arg3, fun_arg4

if __name__ == "__main__":
    arg1 = 1,2,3,4
    arg2 = "hello"
    arg3 = "world"
    arg4 = "!"
    process_func = partial(run_test_function, fun_arg2=arg2, fun_arg3=arg3, fun_arg4=arg4)
    my_pool = Pool(4)
    my_pool.map(process_func, arg1)
Output:
~/test $ python2.7 so10.py
1 hello world !
2 hello world !
3 hello world !
4 hello world !
Python multiprocessing returning AttributeError when following documentation code
You're in interactive mode. That basically doesn't work with multiprocessing, because the workers have to import __main__ and get something that mostly resembles the main process's __main__. This is one of the many ways in which the multiprocessing API is horribly confusing.
Put your code in a script and run the script.
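One way to notice this situation early is to check whether __main__ was loaded from a file. This is only a heuristic sketch: running_from_script is a hypothetical helper, and the assumption is that interactive sessions leave __main__ without a __file__ attribute.

```python
import sys


def running_from_script():
    """Heuristic: True if __main__ came from a file that workers could import."""
    main_module = sys.modules.get("__main__")
    # Interactive sessions typically have no __file__ on __main__.
    return getattr(main_module, "__file__", None) is not None


if not running_from_script():
    print("No main file found; multiprocessing workers may fail to import __main__.")
```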