How to Load Files Using Pickle and Multiple Modules

Unable to load files using pickle and multiple modules

The issue is that you're pickling objects defined in Settings by actually running the 'Settings' module, then you're trying to unpickle the objects from the GUI module.

Remember that pickle doesn't actually store information about how a class/object is constructed, and needs access to the class when unpickling. See wiki on using Pickle for more details.

In the pkl data, you see that the object being referenced is __main__.Manager, as the 'Settings' module was main when you created the pickle file (i.e. you ran the 'Settings' module as the main script to invoke the addUser function).

Then you try unpickling in 'Gui', so that module now has the name __main__, and you import Settings within it. The Manager class is therefore Settings.Manager. But the pkl file doesn't know this: it looks for the Manager class within __main__ and throws an AttributeError because it doesn't exist there (Settings.Manager does, but __main__.Manager doesn't).
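To check which name pickle will look up at load time, you can inspect the class's module attribute and the pickled bytes. This is only a minimal sketch: recorded_class_path is a hypothetical helper name, and Fraction from the standard library stands in for a class such as Manager.

```python
import pickle
from fractions import Fraction


def recorded_class_path(obj):
    """Return the (module, class name) pair that identifies obj's class."""
    cls = type(obj)
    return cls.__module__, cls.__qualname__


value = Fraction(1, 2)
module_name, class_name = recorded_class_path(value)
payload = pickle.dumps(value)

# The payload carries the module and class names, not the class definition.
print(module_name, class_name)
print(module_name.encode() in payload)
print(class_name.encode() in payload)
```

For a class defined in a script that is run directly, the module part would be __main__, which is exactly the name the loader later fails to resolve.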

Here's a minimal code set to demonstrate.

The class_def.py module:

import pickle

class Foo(object):
    def __init__(self, name):
        self.name = name

def main():
    foo = Foo('a')
    with open('test_data.pkl', 'wb') as f:
        pickle.dump([foo], f, -1)

if __name__=='__main__':
    main()

You run the above to generate the pickle data.
The main_module.py module:

import pickle

import class_def

if __name__=='__main__':
    with open('test_data.pkl', 'rb') as f:
        users = pickle.load(f)

You run the above to attempt to open the pickle file, and this throws roughly the same error that you were seeing. (Slightly different, but I'm guessing that's because I'm on Python 2.7)

The solution is either:

  1. You make the class available within the namespace of the top-level module (i.e. GUI or main_module) through an explicit import, or
  2. You create the pickle file from the same top-level module as the one that you will open it in (i.e. call Settings.addUser from GUI, or class_def.main from main_module). This means that the pkl file will save the objects as Settings.Manager or class_def.Foo, which can then be found in the GUI or main_module namespace.

Option 1 example:

import pickle

import class_def
from class_def import Foo  # Import Foo into main_module's namespace explicitly

if __name__=='__main__':
    with open('test_data.pkl', 'rb') as f:
        users = pickle.load(f)

Option 2 example:

import pickle

import class_def

if __name__=='__main__':
    class_def.main()  # Objects are being pickled with main_module as the top-level
    with open('test_data.pkl', 'rb') as f:
        users = pickle.load(f)
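If the pickle file already exists and regenerating it is inconvenient, a third route (not part of the answer above) is to subclass pickle.Unpickler and override find_class so that the old name is redirected to the real class. The sketch below is an assumption-laden illustration: RemappingUnpickler and the module name old_home are hypothetical, and old_home plays the role that __main__ plays in the question.

```python
import io
import pickle
import sys
import types


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that redirects selected (module, name) lookups."""

    def __init__(self, file, remap):
        super().__init__(file)
        self._remap = remap  # maps (module, name) -> replacement class

    def find_class(self, module, name):
        try:
            return self._remap[(module, name)]
        except KeyError:
            return super().find_class(module, name)


class Foo:
    def __init__(self, name):
        self.name = name


# Simulate a pickle written while Foo lived in a module named "old_home".
Foo.__module__ = "old_home"
sys.modules["old_home"] = types.ModuleType("old_home")
sys.modules["old_home"].Foo = Foo
payload = pickle.dumps([Foo("a")])
del sys.modules["old_home"]  # "old_home" can no longer be resolved

# Redirect the stale name to the class we actually have.
remap = {("old_home", "Foo"): Foo}
users = RemappingUnpickler(io.BytesIO(payload), remap).load()
print(users[0].name)
```

For the files in the question, the mapping would presumably look like {("__main__", "Manager"): Settings.Manager}, with the unpickler reading from the opened pkl file instead of a BytesIO buffer.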

Load pickled object in different file - Attribute error

In your class_def.py file you have this code:

if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)

This means that doc will be a __main__.Document object, so once it is pickled, loading it expects to find a Document class in the main module. To fix this, use the definition of Document from a module called class_def, which means adding an import here:

(In general you can just do from <own module name> import * right inside the if __name__ == "__main__" block.)

if __name__ == '__main__':
    from class_def import Document
    # ^ so that it is using the Document class defined under the class_def module
    doc = Document()
    utils.save_document(doc)

That way the class_def.py file is run twice, once as __main__ and once as class_def, but the data will be pickled as a class_def.Document object, so loading it will retrieve the class from the correct place. Otherwise, if you have a way of constructing one Document object from another, you can do something like this in utils.py:

def save_document(doc):
    if doc.__class__.__module__ == "__main__":
        from class_def import Document  # get the class from the reference-able module
        doc = Document(doc)  # convert it to the class we are able to use

    # file_path is assumed to be defined elsewhere in utils.py
    with open(file_path, 'wb') as write_file:
        pickle.dump(doc, write_file)

Although usually I'd prefer the first way.

Saving and loading multiple objects in pickle file?

Using a list, tuple, or dict is by far the most common way to do this:

import pickle
PIK = "pickle.dat"

data = ["A", "b", "C", "d"]
with open(PIK, "wb") as f:
    pickle.dump(data, f)
with open(PIK, "rb") as f:
    print(pickle.load(f))

That prints:

['A', 'b', 'C', 'd']

However, a pickle file can contain any number of pickles. Here's code producing the same output. But note that it's harder to write and to understand:

with open(PIK, "wb") as f:
    pickle.dump(len(data), f)
    for value in data:
        pickle.dump(value, f)
data2 = []
with open(PIK, "rb") as f:
    for _ in range(pickle.load(f)):
        data2.append(pickle.load(f))
print(data2)

If you do this, you're responsible for knowing how many pickles are in the file you write out. The code above does that by pickling the number of list objects first.
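An alternative to storing the count up front is to keep calling pickle.load until it raises EOFError. The following is a rough sketch of that idea, not something taken from the answer above: load_all is a hypothetical helper, and an in-memory BytesIO buffer stands in for the file.

```python
import io
import pickle


def load_all(stream):
    """Yield every pickled object in stream until the data runs out."""
    while True:
        try:
            yield pickle.load(stream)
        except EOFError:
            return


buffer = io.BytesIO()  # stands in for a file opened in "wb" mode
for value in ["A", "b", "C", "d"]:
    pickle.dump(value, buffer)

buffer.seek(0)  # rewind, as reopening the file in "rb" mode would
data2 = list(load_all(buffer))
print(data2)
```

One trade-off: this approach cannot tell a cleanly finished file apart from one that was cut off exactly between two pickles, which the explicit count would catch.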

Pickling and Unpickling in different modules

The issue here is the lambda (anonymous function).

It is completely possible to pickle a self-contained object like the Vectorizer. However, the preprocessing function used in the example is scoped to the Updater class so the Updater class is required to unpickle.

Rather than having a preprocessor function, preprocess the data yourself and pass that in to fit the vectorizer. That will remove the need for the Updater class when unpickling.
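The difference between a lambda and a named, importable function can be checked directly with the standard library. This is only an illustrative sketch: can_pickle is a hypothetical helper, and a plain dict stands in for the vectorizer that holds a preprocessor.

```python
import html
import pickle


def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A lambda has no importable name, so pickle cannot store a reference to it.
with_lambda = {"preprocessor": lambda text: text.lower()}

# html.unescape is a named module-level function, so pickle can refer to it
# by module and name.
with_named_function = {"preprocessor": html.unescape}

print(can_pickle(with_lambda))
print(can_pickle(with_named_function))
```

Even with a named function, whoever loads the pickle still needs to be able to import the module that defines it, which is why doing the preprocessing before fitting removes the dependency altogether.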

Unable to load file via pickle

You need to dump it first, then load it. Try the following:

# dump df_train_train
import pickle

with open('df_train_train', 'wb') as file:
    pickle.dump(df_train_train, file)

Then

with open('df_train_train', 'rb') as file:
    df_train_train = pickle.load(file)

How do I pickle a dictionary containing a module & class?

This works for a single class. To handle multiple modules and classes, you can extend the following code.

module_class_writer.py

import pickle

import module_example
from module_example import ClassExample

included_module = ["module_example"]
d = {}
# Copy the items first: the loop variables are themselves added to globals()
for name, val in list(globals().items()):
    # Record classes that were defined in one of the listed modules
    if getattr(val, "__module__", None) in included_module:
        d["module"] = val.__module__
        d["class"] = name

# d is now {'module': 'module_example', 'class': 'ClassExample'}

with open("imports.pkl", "wb") as filehandler:
    pickle.dump(d, filehandler)

module_class_reader.py

import pickle

try:
    from importlib import reload  # Python 3
except ImportError:
    pass  # Python 2: reload is a builtin

with open("imports.pkl", "rb") as filehandler:
    d = pickle.load(filehandler)

def reload_class(module_name, class_name):
    # A non-empty fromlist makes __import__ return the named module itself
    mod = __import__(module_name, fromlist=[class_name])
    reload(mod)
    return getattr(mod, class_name)

if "class" in d and "module" in d:
    reload(__import__(d["module"]))
    ClassExample = reload_class(d["module"], d["class"])
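The same round trip can be sketched with importlib.import_module instead of __import__. This is a hedged variation rather than the approach above: describe_class and resolve_class are hypothetical helper names, Fraction stands in for ClassExample, and only top-level classes are handled.

```python
import importlib
import pickle
from fractions import Fraction


def describe_class(cls):
    """Return a picklable description of where cls can be imported from."""
    return {"module": cls.__module__, "class": cls.__name__}


def resolve_class(description):
    """Import the described module and return the named class."""
    module = importlib.import_module(description["module"])
    return getattr(module, description["class"])


# Only strings are pickled, so the reader needs nothing but the module on its path.
payload = pickle.dumps(describe_class(Fraction))
restored = resolve_class(pickle.loads(payload))
print(restored is Fraction)
```

Extending this to several classes could be as simple as pickling a list of such descriptions and resolving each one in turn.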

