How to Pickle a Python Function (Or Otherwise Serialize Its Code)

Is there an easy way to pickle a python function (or otherwise serialize its code)?

You could serialise the function's bytecode and then reconstruct it on the receiving side. The marshal module can be used to serialise code objects, which can then be reassembled into a function, i.e.:

import marshal
def foo(x): return x*x
code_string = marshal.dumps(foo.__code__)

Then in the remote process (after transferring code_string):

import marshal, types

code = marshal.loads(code_string)
func = types.FunctionType(code, globals(), "some_func_name")

func(10) # gives 100

A few caveats:

  • marshal's format (any Python bytecode, for that matter) may not be compatible between major Python versions.

  • It will only work for the CPython implementation.

  • If the function references globals (including imported modules, other functions, etc.) that it needs, you'll have to serialise these too, or recreate them on the remote side. My example just gives it the remote process's global namespace.

  • You'll probably need to do a bit more to support more complex cases, like closures or generator functions.
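For the closure case in particular, here is a rough sketch of what "a bit more" looks like (my own illustration, not from the answer above; it needs Python 3.8+ for types.CellType). marshal only carries the code object, so the captured values have to be rebuilt as cells on the receiving side:

import marshal, types

def make_adder(n):
    def add(x):
        return x + n
    return add

code_string = marshal.dumps(make_adder(5).__code__)

# Receiving side: rebuild one cell per name in code.co_freevars ('n' here)
# and pass the tuple as the closure argument.
code = marshal.loads(code_string)
cells = tuple(types.CellType(v) for v in (5,))
add5 = types.FunctionType(code, globals(), code.co_name, None, cells)

add5(10) # gives 15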

How to pickle a python function with its dependencies?

I have tried basically the same approach to sending g over as f but f can still not see g. How do I get g into the global namespace so that it can be used by f in the receiving process?

Assign it to the global name g. (I see you are assigning f to func2 rather than to f. If you are doing something like that with g, then it is clear why f can't find g. Remember that name resolution happens at runtime -- g isn't looked up until you call f.)

Of course, I'm guessing since you didn't show the code you're using to do this.

It might be best to create a separate dictionary to use for the global namespace for the functions you're unpickling -- a sandbox. That way all their global variables will be separate from the module you're doing this in. So you might do something like this:

import marshal, types

sandbox = {}

with open("functions.pickle", "rb") as funcfile:
    while True:
        try:
            code = marshal.load(funcfile)
        except EOFError:
            break
        sandbox[code.co_name] = types.FunctionType(code, sandbox, code.co_name)

In this example I assume that you've put the code objects from all your functions in one file, one after the other, and when reading them in, I get the code object's name and use it as the basis for both the function object's name and the name under which it's stored in the sandbox dictionary.

Inside the unpickled functions, the sandbox dictionary is their globals(), so inside f(), g gets its value from sandbox["g"]. Calling f would then look like: sandbox["f"]("blah")
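For reference, a minimal sketch of the writing side this assumes (the function bodies are placeholders of my own):

import marshal

def g(s):
    return s.upper()

def f(s):
    return g(s) + "!"

# Dump each function's code object back to back; the loop above reads
# them with marshal.load() until it hits EOFError.
with open("functions.pickle", "wb") as funcfile:
    for func in (g, f):
        marshal.dump(func.__code__, funcfile)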

Python3: pickle a function without side effects

I solved it by recreating the function from the code object, giving it an empty globals dictionary.

In /my_project/module.py:

def f(n):
    return n + 1

In my_project, before pickling the function:

import dill
import types
import module

f = types.FunctionType(module.f.__code__, {})

with open("my_func.pkl", 'wb') as fs:
    dill.dump(f, fs)

Somewhere else:

import dill

with open("my_func.pkl", 'rb') as fs:
    f = dill.load(fs)
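After loading, the function can be called as usual. One caveat worth adding to the answer above: this only works because f needs nothing from module's globals, since it was rebuilt with an empty globals dictionary.

f(1) # gives 2

# A function that looked up a module-level name would raise NameError here,
# because the rebuilt function's globals dictionary is empty.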

serialize / pickle a function that is defined in a string

I'm not sure why you used the dill tag in your question, because you only used pickle... but if you really need your code to be exactly how you wrote it, with the one caveat that you can replace pickle with dill... it works:

>>> import dill as pickle
>>> namespace = {}
>>> exec('def f(x): return x', namespace)
>>> _f = pickle.dumps(namespace['f'])
>>> _f
'\x80\x02cdill.dill\n_create_function\nq\x00(cdill.dill\n_load_type\nq\x01U\x08CodeTypeq\x02\x85q\x03Rq\x04(K\x01K\x01K\x01KCU\x04|\x00\x00Sq\x05N\x85q\x06)U\x01xq\x07\x85q\x08U\x08<string>q\tU\x01fq\nK\x01U\x00q\x0b))tq\x0cRq\r}q\x0e(U\x0c__builtins__q\x0fc__builtin__\n__dict__\nh\nh\x00(h\rh\x0eh\nNN}q\x10tq\x11Rq\x12uh\nNNh\x10tq\x13R0h\x12.'
>>> f = pickle.loads(_f)
>>> f(5)
5
>>>
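The same bytes travel across interpreter boundaries too; a hedged sketch (the file name is made up), assuming dill is also installed wherever you load it:

import dill

namespace = {}
exec('def f(x): return x', namespace)

with open("string_func.pkl", "wb") as out:
    dill.dump(namespace['f'], out)  # dill serialises the function by value

# Later, in a different process:
with open("string_func.pkl", "rb") as infile:
    f = dill.load(infile)

f(5) # gives 5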

Python: Pickle class object that has functions/callables as attributes

Edit: the solution is complex because partial uses __setstate__()

Didn't test it, but you probably need to override partial.__reduce__() in your CustomPartial class to match its __new__() signature, which takes an extra argument.

This is the partial.__reduce__() definition in Python 3.10:

def __reduce__(self):
    return type(self), (self.func,), (self.func, self.args,
                                      self.keywords or None, self.__dict__ or None)

You should include the extra argument/attribute in the second item of the returned tuple, which is passed as *args to __new__() when unpickling an object of this class. Plus, as partial uses __setstate__() to set its __dict__ attribute, you'll need to take care of that, otherwise the func_name attribute will be erased. If you use at least Python 3.8, and if you want to preserve the original __setstate__() method, you can use the sixth field of the reduce value to pass a callable that controls how the update is made.

Try to add this to your class:

def __reduce__(self):
    return (
        type(self),
        (self.func_name, self.func),
        (self.func, self.args, self.keywords or None, self.__dict__ or None),
        None,
        None,
        self._setstate
    )

@staticmethod
def _setstate(obj, state):
    func_name = obj.func_name
    obj.__setstate__(state)  # erases func_name
    obj.func_name = func_name

Reference: https://docs.python.org/3/library/pickle.html#object.__reduce__
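Also untested, but to show how the pieces fit together, here is a rough sketch; the CustomPartial below is a guess at the class from the question, with __new__ taking an extra func_name argument in front of the wrapped callable (Python 3.8+ for the six-item __reduce__ tuple):

import pickle
from functools import partial

def greet(greeting, name):
    return f"{greeting}, {name}!"

class CustomPartial(partial):
    # Hypothetical class: __new__ takes an extra func_name argument.
    def __new__(cls, func_name, func, *args, **keywords):
        self = super().__new__(cls, func, *args, **keywords)
        self.func_name = func_name
        return self

    def __reduce__(self):
        return (
            type(self),
            (self.func_name, self.func),
            (self.func, self.args, self.keywords or None, self.__dict__ or None),
            None,
            None,
            self._setstate
        )

    @staticmethod
    def _setstate(obj, state):
        func_name = obj.func_name
        obj.__setstate__(state)  # partial's __setstate__ replaces __dict__
        obj.func_name = func_name

p = CustomPartial("greeter", greet, "Hello")
p2 = pickle.loads(pickle.dumps(p))

p2.func_name  # gives 'greeter'
p2("world")   # gives 'Hello, world!'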

Serializing object methods with Pickle

The short answer is no, you cannot pickle methods, but you can pickle functions (built-in and user-defined) accessible from the top level of a module (using def, not lambda).

The Python documentation states that the types that can be pickled and unpickled are:

  • None, True, and False;
  • integers, floating-point numbers, complex numbers;
  • strings, bytes, bytearrays;
  • tuples, lists, sets, and dictionaries containing only picklable objects;
  • functions (built-in and user-defined) accessible from the top level of a module (using def, not lambda);
  • classes accessible from the top level of a module;
  • instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).

As per Wikipedia's definition, serialization is:

the process of translating a data structure or object state into a
format that can be stored (for example, in a file or memory data
buffer) or transmitted (for example, over a computer network) and
reconstructed later (possibly in a different computer environment).

The pickle module simply wasn't built to serialize methods. Its main purpose is to serialize state, i.e. the state of the attributes. The object is then instantiated when unpickling. As the method is part of the class definition, your code works fine, but only with the later definition of the class. Thus the etienne.name value ends up being "etienne..".

In the context of an instance, saving the class definition is usually also inadequate and undesirable, as the instructions on how to use the serialized data may change in the future.
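To make the distinction concrete, a small made-up illustration of what actually gets stored:

import pickle

class Person:
    def __init__(self, name):
        self.name = name
    def shout(self):
        return self.name.upper()

data = pickle.dumps(Person("etienne"))  # stores only the state: {'name': 'etienne'}

# Unpickling looks Person up again in this module; whatever shout() is
# defined there at load time is the behaviour the restored instance gets.
restored = pickle.loads(data)

restored.shout() # gives 'ETIENNE'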

why python's pickle is not serializing a method as default argument?

pickle is loading your dictionary data before it has restored the attributes on your instance. As such the self.cond attribute is not yet set when __setitem__ is called for the dictionary key-value pairs.

Note that pickle will never call __init__; instead it'll create an entirely blank instance and restore the __dict__ attribute namespace on that directly.

You have two options:

  • default to cond=None and ignore the condition if it is still set to None:

    class CustomDict(dict):
        def __init__(self, cond=None):
            super().__init__()
            self.cond = cond

        def __setitem__(self, key, value):
            if getattr(self, 'cond', None) is None or self.cond(value):
                dict.__setitem__(self, key, value)

    The getattr() there is needed because a blank instance has no cond attribute at all (it is not set to None, the attribute is entirely missing). You could add cond = None to the class:

    class CustomDict(dict):
        cond = None

    and then just test for if self.cond is None or self.cond(value):.

  • Define a custom __reduce__ method to control how the initial object is created when restored:

    def _default_cond(v): return v is not None

    class CustomDict(dict):
        def __init__(self, cond=_default_cond):
            super().__init__()
            self.cond = cond

        def __setitem__(self, key, value):
            if self.cond(value):
                dict.__setitem__(self, key, value)

        def __reduce__(self):
            return (CustomDict, (self.cond,), None, None, iter(self.items()))

    __reduce__ is expected to return a tuple with:

    • A callable that can be pickled directly (here the class does fine)
    • A tuple of positional arguments for that callable; on unpickling, the first element is called with the second as its arguments, so by setting this to (self.cond,) we ensure that the new instance is created with cond passed in as an argument, and now CustomDict.__init__() will be called.
    • The next 2 positions are for a __setstate__ method (ignored here) and for list-like types, so we set these to None.
    • The last element is an iterator for the key-value pairs that pickle then will restore for us.

    Note that I replaced the default value for cond with a function here too so you don't have to rely on dill for the pickling.
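
    A quick round trip with this variant (a sketch; it assumes the CustomDict and _default_cond definitions above live at module level so pickle can find them by name):

    import pickle

    d = CustomDict()
    d['a'] = 1
    d['b'] = None  # rejected by _default_cond, never stored

    restored = pickle.loads(pickle.dumps(d))

    dict(restored) # gives {'a': 1}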


