Can't pickle <type 'instancemethod'> when using multiprocessing Pool.map()

Can't pickle <type 'instancemethod'> when using multiprocessing Pool.map()

The problem is that multiprocessing must pickle things to sling them among processes, and bound methods are not picklable. The workaround (whether you consider it "easy" or not ;-) is to add the infrastructure to your program to allow such methods to be pickled, registering it with the copy_reg standard library module (renamed copyreg in Python 3).

For example, Steven Bethard's contribution to this thread (towards the end of the thread) shows one perfectly workable approach to allow method pickling/unpickling via copy_reg.

Python multiprocessing PicklingError: Can't pickle <type 'function'>

Here is a list of what can be pickled. In particular, functions are only picklable if they are defined at the top-level of a module.
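As a quick check of that rule, here is a minimal sketch in Python 3 that asks pickle directly, with no pool involved. The names square, make_adder and is_picklable are made up for this illustration, and the exact exception type raised for unpicklable callables differs between Python versions, so the helper catches several.

```python
import pickle


def square(x):
    """Top-level function: pickled by reference to its module and name."""
    return x * x


def make_adder(n):
    """Return a nested function, which has no module-level name to look up."""
    def add(x):
        return x + n
    return add


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    candidates = [
        ("top-level function", square),
        ("lambda", lambda x: x * x),
        ("nested function", make_adder(1)),
    ]
    for label, obj in candidates:
        print(label, "->", is_picklable(obj))
```

Running a candidate through is_picklable before handing it to a pool is a cheap way to find out whether it will survive the trip to a worker process.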

This piece of code:

import multiprocessing as mp

class Foo():
    @staticmethod
    def work(self):
        pass

if __name__ == '__main__':
    pool = mp.Pool()
    foo = Foo()
    pool.apply_async(foo.work)
    pool.close()
    pool.join()

yields an error almost identical to the one you posted:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 315, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

The problem is that the pool methods all use a mp.SimpleQueue to pass tasks to the worker processes. Everything that goes through the mp.SimpleQueue must be picklable, and foo.work is not picklable since it is not defined at the top level of the module.

It can be fixed by defining a function at the top level, which calls foo.work():

def work(foo):
    foo.work()

pool.apply_async(work, args=(foo,))

Notice that foo is picklable, since Foo is defined at the top level and foo.__dict__ is picklable.
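Put together as a complete script, the top-level wrapper approach might look like the sketch below (Python 3). Some details are assumptions for illustration only: Foo.work is written as an ordinary instance method that returns a value, the value attribute is hypothetical, and run is a helper name invented here.

```python
import multiprocessing as mp


class Foo:
    def __init__(self, value):
        self.value = value  # hypothetical attribute, for illustration only

    def work(self):
        return self.value * 2


def work(foo):
    """Top-level wrapper: pickled by name, so it can be sent to a worker."""
    return foo.work()


def run(values):
    """Submit one task per value and collect the results in order."""
    with mp.Pool(processes=2) as pool:
        pending = [pool.apply_async(work, args=(Foo(v),)) for v in values]
        # get() re-raises any exception from the worker instead of hiding it.
        return [p.get() for p in pending]


if __name__ == "__main__":
    print(run([1, 2, 3]))
```

Calling get() on each result is worth the extra line: without it, an exception raised inside the worker is silently dropped and the task just appears to do nothing.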

Can't pickle <type 'instancemethod'> using python's multiprocessing Pool.apply_async()

This works, using copy_reg, as suggested by Alex Martelli in the first link you provided:

# Python 2 only: copy_reg, im_self, im_class and im_func
# do not exist under these names in Python 3.
import copy_reg
import types
import multiprocessing


def _pickle_method(m):
    # Reduce a method to a getattr() call on its class (unbound)
    # or on its instance (bound).
    if m.im_self is None:
        return getattr, (m.im_class, m.im_func.func_name)
    else:
        return getattr, (m.im_self, m.im_func.func_name)

# Tell pickle to use _pickle_method for every method object.
copy_reg.pickle(types.MethodType, _pickle_method)


class Controler(object):
    def __init__(self):
        nProcess = 10
        pages = 10
        self.__result = []
        self.manageWork(nProcess, pages)

    def BarcodeSearcher(self, x):
        return x*x

    def resultCollector(self, result):
        self.__result.append(result)

    def manageWork(self, nProcess, pages):
        pool = multiprocessing.Pool(processes=nProcess)
        for pag in range(pages):
            pool.apply_async(self.BarcodeSearcher, args=(pag,),
                             callback=self.resultCollector)
        pool.close()
        pool.join()

        print(self.__result)

if __name__ == '__main__':
    Controler()

python multiprocessing Can't pickle <type 'function'>

Python's multiprocessing module cannot deal with functions or methods that cannot be pickled, which means you cannot use class or instance methods without a lot of hassle. I recommend using multiprocess, which uses dill for serialization instead of pickle and can deal with class or instance methods.

As far as I know, the interface is exactly the same as the one used in multiprocessing, so you can use it as a drop-in replacement.

See also https://stackoverflow.com/a/21345423/1170207

Pickling error while using pool.map in multiprocessing

I changed my function to accept a list of dates:

import datetime as dt
from multiprocessing import Pool

def func1(datelist):
    date1 = datelist[0]
    date2 = datelist[1]


if __name__ == '__main__':
    pool = Pool(processes=4)
    dates = [[dt.datetime(2016, 6, 17), dt.datetime(2016, 6, 23)],
             [dt.datetime(2016, 6, 24), dt.datetime(2016, 6, 30)],
             [dt.datetime(2016, 7, 1), dt.datetime(2016, 7, 7)],
             [dt.datetime(2016, 7, 8), dt.datetime(2016, 7, 14)]]
    result = pool.map(func1, dates)
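The reason this shape works is that pool.map hands each call exactly one item from the iterable, so two dates have to travel together as one list (or tuple). Here is a runnable Python 3 sketch of the same pattern. Since the real body of func1 is not shown, span_days is a hypothetical stand-in that just returns the number of days between the two dates, and run is a helper name invented here.

```python
import datetime as dt
from multiprocessing import Pool


def span_days(date_pair):
    """Hypothetical stand-in for func1: days between the two dates of a pair."""
    start, end = date_pair
    return (end - start).days


def run():
    """Map span_days over four one-week date pairs using a pool of workers."""
    date_pairs = [
        [dt.datetime(2016, 6, 17), dt.datetime(2016, 6, 23)],
        [dt.datetime(2016, 6, 24), dt.datetime(2016, 6, 30)],
        [dt.datetime(2016, 7, 1), dt.datetime(2016, 7, 7)],
        [dt.datetime(2016, 7, 8), dt.datetime(2016, 7, 14)],
    ]
    with Pool(processes=4) as pool:
        # Each worker call receives one [start, end] pair as its only argument.
        return pool.map(span_days, date_pairs)


if __name__ == "__main__":
    print(run())
```

Both span_days and the datetime objects are picklable, which is what lets the pool ship them to the workers.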

Multiprocessing: How to use Pool.map on a function defined in a class?

I also was annoyed by restrictions on what sort of functions pool.map could accept. I wrote the following to circumvent this. It appears to work, even for recursive use of parmap.

from multiprocessing import Process, Pipe
from itertools import izip

def spawn(f):
    def fun(pipe, x):
        pipe.send(f(x))
        pipe.close()
    return fun

def parmap(f, X):
    pipe = [Pipe() for x in X]
    proc = [Process(target=spawn(f), args=(c, x)) for x, (p, c) in izip(X, pipe)]
    [p.start() for p in proc]
    [p.join() for p in proc]
    return [p.recv() for (p, c) in pipe]

if __name__ == '__main__':
    print parmap(lambda x: x**x, range(1, 5))
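A Python 3 take on the same idea might look like the sketch below. It rests on one loud assumption: a Unix-like system where the "fork" start method is available, because the nested function and the lambda are never pickled only when the child is a forked copy of the parent. It also reads the results before joining, a deliberate change from the code above, so that a child sending a large result cannot block on a full pipe while the parent waits in join().

```python
import multiprocessing as mp

# Assumption: a Unix-like OS where the "fork" start method exists.
# With "spawn", the nested target function below could not be pickled.
_ctx = mp.get_context("fork")


def _spawn(f):
    """Wrap f so the child sends f(x) back through its end of the pipe."""
    def fun(conn, x):
        conn.send(f(x))
        conn.close()
    return fun


def parmap(f, xs):
    """Apply f to every item of xs, one process per item, keeping order."""
    xs = list(xs)
    pipes = [_ctx.Pipe() for _ in xs]
    procs = [_ctx.Process(target=_spawn(f), args=(child, x))
             for x, (parent, child) in zip(xs, pipes)]
    for p in procs:
        p.start()
    # Receive before joining so no child blocks on a full pipe.
    results = [parent.recv() for parent, child in pipes]
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    print(parmap(lambda x: x ** x, range(1, 5)))
```

Note that only f itself escapes pickling here; the values f returns still go through the pipe and must be picklable.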

