Is there an easy way to pickle a python function (or otherwise serialize its code)?
You could serialise the function's bytecode and then reconstruct it in the receiving process. The marshal module can be used to serialise code objects, which can then be reassembled into a function, e.g.:
import marshal
def foo(x): return x*x
code_string = marshal.dumps(foo.__code__)
Then in the remote process (after transferring code_string):
import marshal, types
code = marshal.loads(code_string)
func = types.FunctionType(code, globals(), "some_func_name")
func(10) # gives 100
A few caveats:
marshal's format (any Python bytecode, for that matter) may not be compatible between major Python versions.
This will only work on the CPython implementation.
If the function references globals (including imported modules, other functions etc) that you need to pick up, you'll need to serialise these too, or recreate them on the remote side. My example just gives it the remote process's global namespace.
You'll probably need to do a bit more to support more complex cases, like closures or generator functions.
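A minimal sketch of the caveat about globals and defaults, assuming a plain function with no closure. The helper names dump_function and load_function and the sample function scale are made up for illustration; they are not part of marshal or types. The receiving side passes an explicit namespace instead of its own globals():

```python
import builtins
import marshal
import types


def dump_function(func):
    """Serialise the code object, name and positional defaults of func."""
    # Assumption: func has no closure and its defaults are marshal-able.
    return marshal.dumps((func.__code__, func.__name__, func.__defaults__))


def load_function(payload, namespace):
    """Rebuild the function so that its globals are the given dict."""
    code, name, defaults = marshal.loads(payload)
    return types.FunctionType(code, namespace, name, defaults)


def scale(x, factor=3):
    return abs(x) * factor


payload = dump_function(scale)
# The namespace must supply every global the function uses; here only builtins.
rebuilt = load_function(payload, {"__builtins__": builtins})
print(rebuilt(-2))  # 6
```

Keyword-only defaults and closures are not covered here; they would need to be transferred separately.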
How to pickle a python function with its dependencies?
I have tried basically the same approach to sending g over as f but f can still not see g. How do I get g into the global namespace so that it can be used by f in the receiving process?
Assign it to the global name g. (I see you are assigning f to func2 rather than to f. If you are doing something like that with g, then it is clear why f can't find g. Remember that name resolution happens at runtime -- g isn't looked up until you call f.)
Of course, I'm guessing since you didn't show the code you're using to do this.
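The runtime lookup can be seen in a small sketch. The dictionary called namespace is an assumption standing in for the receiving process's globals; f fails while g is missing and works as soon as g is stored under that exact name:

```python
import types


def f(x):
    # g is a global name here; it is resolved each time f runs.
    return g(x) + 1


namespace = {}  # stands in for the receiving process's globals
f_remote = types.FunctionType(f.__code__, namespace, "f")

try:
    f_remote(1)
    error = None
except NameError as exc:
    error = str(exc)

# Bind the dependency under the name that f looks up.
namespace["g"] = lambda x: x * 2
result = f_remote(1)

print(error)
print(result)  # 3
```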
It might be best to create a separate dictionary to use for the global namespace for the functions you're unpickling -- a sandbox. That way all their global variables will be separate from the module you're doing this in. So you might do something like this:
import marshal, types
sandbox = {}
with open("functions.pickle", "rb") as funcfile:
    while True:
        try:
            code = marshal.load(funcfile)
        except EOFError:
            break
        sandbox[code.co_name] = types.FunctionType(code, sandbox, code.co_name)
In this example I assume that you've put the code objects from all your functions in one file, one after the other, and when reading them in, I get the code object's name and use it as the basis for both the function object's name and the name under which it's stored in the sandbox dictionary.
Inside the unpickled functions, the sandbox dictionary is their globals() and so inside f(), g gets its value from sandbox["g"]. To call f then would be: sandbox["f"]("blah")
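For completeness, here is a sketch of the whole round trip, including the writing side that the answer does not show. The functions f and g and the temporary file name are placeholders chosen for the example:

```python
import marshal
import os
import tempfile
import types


def g(text):
    return text.upper()


def f(text):
    return g(text) + "!"


sandbox = {}
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "functions.marshal")  # hypothetical file name

    # Sending side: write the code objects one after the other.
    with open(path, "wb") as funcfile:
        for func in (f, g):
            marshal.dump(func.__code__, funcfile)

    # Receiving side: rebuild every function with the sandbox as its globals.
    with open(path, "rb") as funcfile:
        while True:
            try:
                code = marshal.load(funcfile)
            except EOFError:
                break
            sandbox[code.co_name] = types.FunctionType(code, sandbox, code.co_name)

result = sandbox["f"]("blah")
print(result)  # BLAH!
```

Because both rebuilt functions share the same sandbox dictionary, the order in which they are loaded does not matter; g only has to be present by the time f is called.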
Python3: pickle a function without side effects
I solved it by recreating the function from its code object, giving it an empty globals dictionary.
In /my_project/module.py:
def f(n):
    return n+1
In my_project, before pickling the function:
import dill
import types
import module
f = types.FunctionType(module.f.__code__, {})
with open("my_func.pkl", 'wb') as fs:
    dill.dump(f, fs)
Somewhere else:
import dill
with open("my_func.pkl", 'rb') as fs:
    f = dill.load(fs)
serialize / pickle a function that is defined in a string
I'm not sure why you used the dill tag in your question, because you only used pickle... but if you really need your code to be exactly how you wrote it, with the one caveat that you can replace pickle with dill... it works:
>>> import dill as pickle
>>> namespace = {}
>>> exec('def f(x): return x', namespace)
>>> _f = pickle.dumps(namespace['f'])
>>> _f
'\x80\x02cdill.dill\n_create_function\nq\x00(cdill.dill\n_load_type\nq\x01U\x08CodeTypeq\x02\x85q\x03Rq\x04(K\x01K\x01K\x01KCU\x04|\x00\x00Sq\x05N\x85q\x06)U\x01xq\x07\x85q\x08U\x08<string>q\tU\x01fq\nK\x01U\x00q\x0b))tq\x0cRq\r}q\x0e(U\x0c__builtins__q\x0fc__builtin__\n__dict__\nh\nh\x00(h\rh\x0eh\nNN}q\x10tq\x11Rq\x12uh\nNNh\x10tq\x13R0h\x12.'
>>> f = pickle.loads(_f)
>>> f(5)
5
>>>
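If dill is not available, a rough alternative for simple cases is to ship the code object with marshal, as in the first answer above. This is a different technique from the dill session, and the sketch assumes a function without closures or defaults. It also shows that the standard pickle module refuses such a function, because pickle stores functions by reference to an importable name and a function created by exec in a private dictionary has none:

```python
import marshal
import pickle
import types

namespace = {}
exec("def f(x): return x", namespace)

# Plain pickle cannot find this function under any importable module name.
try:
    pickle.dumps(namespace["f"])
    plain_pickle_failed = False
except (pickle.PicklingError, AttributeError):
    plain_pickle_failed = True

# marshal serialises the code object itself rather than a reference.
payload = marshal.dumps(namespace["f"].__code__)
restored = types.FunctionType(marshal.loads(payload), {}, "f")

print(plain_pickle_failed)
print(restored(5))  # 5
```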
Python: Pickle class object that has functions/callables as attributes
Edit: the solution is complex because partial uses __setstate__()
Didn't test it, but you probably need to override the method partial.__reduce__() in your CustomPartial class to match its __new__() signature with an extra argument.
This is the partial.__reduce__() definition in Python 3.10:
def __reduce__(self):
    return type(self), (self.func,), (self.func, self.args,
           self.keywords or None, self.__dict__ or None)
You should include the extra argument/attribute in the second item of the returned tuple, which is passed as *args to __new__() when unpickling an object of this class. Plus, as partial uses __setstate__() to set its __dict__ attribute, you'll need to take care of that, otherwise the func_name attribute will be erased. If you use at least Python 3.8, and if you want to preserve the original __setstate__() method, you can use the sixth field of the reduce value to pass a callable that controls how the update is made.
Try to add this to your class:
def __reduce__(self):
    return (
        type(self),
        (self.func_name, self.func),
        (self.func, self.args, self.keywords or None, self.__dict__ or None),
        None,
        None,
        self._setstate
    )
@staticmethod
def _setstate(obj, state):
    func_name = obj.func_name
    obj.__setstate__(state)  # erases func_name
    obj.func_name = func_name
Reference: https://docs.python.org/3/library/pickle.html#object.__reduce__
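To try the suggestion end to end, here is a sketch with a guessed CustomPartial class. The question's real class is not shown, so the __new__() signature taking func_name first is an assumption, as is the use of operator.mul as the wrapped function. It needs Python 3.8 or later for the sixth reduce field:

```python
import functools
import operator
import pickle


class CustomPartial(functools.partial):
    # Assumed shape of the question's class: an extra leading func_name argument.
    def __new__(cls, func_name, func, *args, **kwargs):
        self = super().__new__(cls, func, *args, **kwargs)
        self.func_name = func_name
        return self

    def __reduce__(self):
        return (
            type(self),
            (self.func_name, self.func),  # passed to __new__() on unpickling
            (self.func, self.args, self.keywords or None, self.__dict__ or None),
            None,
            None,
            self._setstate,  # sixth field: custom state setter (Python 3.8+)
        )

    @staticmethod
    def _setstate(obj, state):
        func_name = obj.func_name
        obj.__setstate__(state)  # partial's own state restoration
        obj.func_name = func_name


double = CustomPartial("double", operator.mul, 2)
restored = pickle.loads(pickle.dumps(double))
print(restored.func_name, restored(5))  # double 10
```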
Serializing object methods with Pickle
The short answer is no, you cannot pickle methods but you can pickle functions (built-in and user-defined) accessible from the top level of a module (using def, not lambda).
The Python documentation states the types that can be pickled and unpickled are:
- None, True, and False;
- integers, floating-point numbers, complex numbers;
- strings, bytes, bytearrays;
- tuples, lists, sets, and dictionaries containing only picklable objects;
- functions (built-in and user-defined) accessible from the top level of a module (using def, not lambda);
- classes accessible from the top level of a module;
- instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).
As per the Wikipedia definition, serialization is:
the process of translating a data structure or object state into a
format that can be stored (for example, in a file or memory data
buffer) or transmitted (for example, over a computer network) and
reconstructed later (possibly in a different computer environment).
The pickle module simply wasn't built to serialize methods. Its main purpose is to serialize state, that is, the values of the attributes. The object is then instantiated when unpickling. As the method is part of the class definition, your code works fine, but only with the later definition of the class. Thus the etienne.name value ends up being "etienne..".
In the context of an instance, saving the class definition also is usually inadequate and undesirable as instructions on how to use the serialized data may change in the future.
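A small sketch of this behaviour, using a made-up Person class rather than the question's code: the pickled bytes hold only the attribute values and a reference to the class, so the unpickled instance picks up whatever the class's methods are at load time.

```python
import pickle


class Person:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


data = pickle.dumps(Person("etienne"))


# Simulate a later version of the class by replacing the method after pickling.
def greet_v2(self):
    return "Bonjour, " + self.name


Person.greet = greet_v2

restored = pickle.loads(data)
print(restored.greet())  # Bonjour, etienne
print(b"Hello" in data)  # False
```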
why python's pickle is not serializing a method as default argument?
pickle is loading your dictionary data before it has restored the attributes on your instance. As such the self.cond attribute is not yet set when __setitem__ is called for the dictionary key-value pairs.
Note that pickle will never call __init__; instead it'll create an entirely blank instance and restore the __dict__ attribute namespace on that directly.
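The failure can be reproduced with a sketch like the following. NaiveDict is an assumed reconstruction of the kind of class in the question, with a module-level function as the default so that plain pickle can store it:

```python
import pickle


def _default_cond(v):
    return v is not None


class NaiveDict(dict):
    def __init__(self, cond=_default_cond):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if self.cond(value):
            dict.__setitem__(self, key, value)


original = NaiveDict()
original["a"] = 1
data = pickle.dumps(original)

# On loading, the key-value pairs are set on a blank instance before
# __dict__ (and therefore self.cond) has been restored.
try:
    pickle.loads(data)
    error = None
except AttributeError as exc:
    error = exc

print(repr(error))
```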
You have two options:
- Default to cond=None and ignore the condition if it is still set to None:
  class CustomDict(dict):
      def __init__(self, cond=None):
          super().__init__()
          self.cond = cond
      def __setitem__(self, key, value):
          if getattr(self, 'cond', None) is None or self.cond(value):
              dict.__setitem__(self, key, value)
  The getattr() there is needed because a blank instance has no cond attribute at all (it is not set to None, the attribute is entirely missing). You could add cond = None to the class:
  class CustomDict(dict):
      cond = None
  and then just test for if self.cond is None or self.cond(value):.
- Define a custom __reduce__ method to control how the initial object is created when restored:
  def _default_cond(v): return v is not None
  class CustomDict(dict):
      def __init__(self, cond=_default_cond):
          super().__init__()
          self.cond = cond
      def __setitem__(self, key, value):
          if self.cond(value):
              dict.__setitem__(self, key, value)
      def __reduce__(self):
          return (CustomDict, (self.cond,), None, None, iter(self.items()))
  __reduce__ is expected to return a tuple with:
  - A callable that can be pickled directly (here the class does fine)
  - A tuple of positional arguments for that callable; on unpickling the first element is called passing in the second as arguments, so by setting this to (self.cond,) we ensure that the new instance is created with cond passed in as an argument and now CustomDict.__init__() will be called.
  - The next 2 positions are for a __setstate__ method (ignored here) and for list-like types, so we set these to None.
  - The last element is an iterator for the key-value pairs that pickle then will restore for us.
  Note that I replaced the default value for cond with a function here too so you don't have to rely on dill for the pickling.
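A usage sketch for the __reduce__ variant, repeating the class so the example runs on its own. The is_positive condition is a made-up module-level function; the round trip keeps both the stored items and the condition:

```python
import pickle


def _default_cond(v):
    return v is not None


def is_positive(v):  # hypothetical custom condition
    return v > 0


class CustomDict(dict):
    def __init__(self, cond=_default_cond):
        super().__init__()
        self.cond = cond

    def __setitem__(self, key, value):
        if self.cond(value):
            dict.__setitem__(self, key, value)

    def __reduce__(self):
        return (CustomDict, (self.cond,), None, None, iter(self.items()))


original = CustomDict(is_positive)
original["a"] = 1
original["b"] = -1  # rejected by the condition

restored = pickle.loads(pickle.dumps(original))
restored["c"] = -5  # still rejected after the round trip
restored["d"] = 7

print(restored)  # {'a': 1, 'd': 7}
```

The condition is pickled by reference, so is_positive has to be importable under the same name wherever the data is loaded.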