Multiprocessing Example Giving AttributeError

Multiprocessing example giving AttributeError

This behaviour seems to be a design feature of multiprocessing.Pool rather than a bug; see https://bugs.python.org/issue25053. Pool does not always work with objects that are not defined in an imported module, so the workaround is to put your function in a separate file and import that module.

File: defs.py

def f(x):
    return x*x

File: run.py

from multiprocessing import Pool
import defs

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(defs.f, [1, 2, 3]))

If you pass print or another built-in function instead, the example should work. If this is not a bug (as the linked issue suggests), then the given example is badly chosen.

Multiprocessing example giving AttributeError on Mac

Python 3.8 changed the default "start method" on macOS from "fork" to "spawn". This means you now need to protect your code from being executed on import by placing all multiprocessing code (and anything else that should only run in the parent process) inside an if __name__ == "__main__": block. You will also run into issues with interactive interpreters such as Jupyter or IPython, because there isn't always a "main" file to import. It is sometimes possible to configure your IDE to work around this, but the most compatible solution is to run your code from a system terminal. Alternatively, you could manually switch back to "fork", which is faster and uses less memory in some cases but can deadlock when used with certain threaded modules (such as logging).
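
A minimal sketch of switching the start method, assuming you want "fork" where the platform offers it and the platform default otherwise. The helper name pick_context and the worker function square are hypothetical names chosen for illustration; get_context, get_all_start_methods and get_start_method are the standard multiprocessing calls.

```python
import multiprocessing as mp


def square(x):
    # Top-level function, so worker processes can find it by name.
    return x * x


def pick_context(preferred="fork"):
    """Return a context for `preferred` if available, else the platform default."""
    if preferred in mp.get_all_start_methods():
        return mp.get_context(preferred)
    return mp.get_context()


if __name__ == "__main__":
    # Everything that starts processes stays inside the guard.
    ctx = pick_context("fork")
    print("start method:", ctx.get_start_method())
    with ctx.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))
```

Using a context object instead of calling set_start_method() keeps the choice local to this pool, so other code in the same program is not affected.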

Python multiprocessing.Pool: AttributeError

Error 1:

AttributeError: Can't pickle local object 'SomeClass.some_method.<locals>.single'

You solved this error yourself by moving the nested target-function single() out to the top-level.

Background:

Pool needs to pickle (serialize) everything it sends to its worker processes (IPC). Pickling a function only saves its name, and unpickling requires re-importing the function by that name. For that to work, the function must be defined at the top level of a module. Nested functions are not importable by the child, and even trying to pickle them raises an exception.
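
The pickling rule can be checked without starting any processes. The sketch below uses hypothetical names (top_level, make_nested, can_pickle) and simply tries pickle.dumps on a top-level function, a nested function, and a lambda. Depending on the Python version the failure surfaces as AttributeError or pickle.PicklingError, so both are caught.

```python
import pickle


def top_level(x):
    # Defined at module level: pickled by reference to its name.
    return x * x


def make_nested():
    def nested(x):
        # Defined inside another function: not reachable by name.
        return x * x
    return nested


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if pickling is refused."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        return False


if __name__ == "__main__":
    print("top-level function:", can_pickle(top_level))
    print("nested function:   ", can_pickle(make_nested()))
    print("lambda:            ", can_pickle(lambda x: x * x))
```

Run as a script, the top-level function should pickle while the nested function and the lambda should not, which is the same restriction Pool runs into.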


Error 2:

AttributeError: Can't get attribute 'single' on <module '__main__' from '.../test.py'>

You are starting the pool before you define your function and classes, so the child processes cannot inherit that code. Move the pool start down to the bottom of the file and protect it with if __name__ == '__main__':

import multiprocessing

class OtherClass:
    def run(self, sentence, graph):
        return False

def single(params):
    other = OtherClass()
    sentences, graph = params
    return [other.run(sentence, graph) for sentence in sentences]

class SomeClass:
    def __init__(self):
        self.sentences = [["Some string"]]
        self.graphs = ["string"]

    def some_method(self):
        return list(pool.map(single, zip(self.sentences, self.graphs)))

if __name__ == '__main__':  # <- prevent RuntimeError for 'spawn'
    # and 'forkserver' start_methods
    with multiprocessing.Pool(multiprocessing.cpu_count() - 1) as pool:
        print(SomeClass().some_method())

Appendix

...I would like to spread the work over all of my cores.

Potentially helpful background on how multiprocessing.Pool is chunking work:

Python multiprocessing: understanding logic behind chunksize
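
As a rough illustration of the chunking mentioned above, the sketch below mirrors the heuristic CPython's Pool.map appears to use when no chunksize is given: aim for roughly four chunks per worker. The helper name default_chunksize is hypothetical, and the heuristic is an implementation detail that may differ between Python versions.

```python
def default_chunksize(n_items, n_workers):
    """Approximate the chunksize Pool.map picks when chunksize is None.

    Assumption: mirrors the CPython heuristic of about four chunks per
    worker; this is an implementation detail, not a documented guarantee.
    """
    if n_items == 0:
        return 0
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        # Round up so every item is covered.
        chunksize += 1
    return chunksize


if __name__ == "__main__":
    for n_items in (16, 100, 1000):
        print(n_items, "items, 4 workers ->", default_chunksize(n_items, 4))
```

Passing chunksize explicitly, as in pool.map(func, iterable, chunksize=...), overrides this default.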

Multiprocessing with Python 2.7 throwing attribute error

  1. In Python 2.7, multiprocessing.Pool is not a context manager and thus it can't be used in a with statement
    1. Solution - create a pool using regular assignment to a variable:
      my_pool = Pool(4)
      my_pool.map(...)
  2. Lambda functions don't work with multiprocessing.Pool, even in Python 3.
    1. Solution - emulate a closure using functools.partial:
      from functools import partial

      def run_test_function(x, fun_arg2, fun_arg3, fun_arg4):
          # your code here
          ...

      process_func = partial(run_test_function, fun_arg2=arg2, fun_arg3=arg3, fun_arg4=arg4)

Putting this together:

from multiprocessing import Pool
from functools import partial

def run_test_function(x, fun_arg2, fun_arg3, fun_arg4):
    # this is an example
    print x, fun_arg2, fun_arg3, fun_arg4

if __name__ == "__main__":
    arg1 = (1, 2, 3, 4)
    arg2 = "hello"
    arg3 = "world"
    arg4 = "!"

    process_func = partial(run_test_function, fun_arg2=arg2, fun_arg3=arg3, fun_arg4=arg4)

    my_pool = Pool(4)
    my_pool.map(process_func, arg1)

Output:

~/test $ python2.7 so10.py
1 hello world !
2 hello world !
3 hello world !
4 hello world !

Python multiprocessing returning AttributeError when following documentation code

You're in interactive mode. That basically doesn't work with multiprocessing, because the workers have to import __main__ and get something that mostly resembles the main process's __main__. This is one of the many ways in which the multiprocessing API is horribly confusing.

Put your code in a script and run the script.


