Unable to load files using pickle and multiple modules
The issue is that you're pickling objects defined in Settings by actually running the 'Settings' module, then you're trying to unpickle the objects from the GUI module.
Remember that pickle doesn't actually store information about how a class/object is constructed, and needs access to the class when unpickling. See wiki on using Pickle for more details.
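To make that concrete, here is a minimal sketch of what a pickle records for an instance: a reference to the class by module and name, not the class's code. It uses fractions.Fraction from the standard library purely as a stand-in; that class is an assumption for illustration and is not part of the original question.

```python
import pickle
from fractions import Fraction

value = Fraction(1, 2)
payload = pickle.dumps(value)

# The class itself is not serialized; the stream only names where to import it from.
module_name = type(value).__module__.encode("ascii")
class_name = type(value).__name__.encode("ascii")
has_module_ref = module_name in payload
has_class_ref = class_name in payload

# Loading works because fractions.Fraction can be imported under that name.
restored = pickle.loads(payload)
print(has_module_ref, has_class_ref, restored == value)
```

A class defined in a script that is run directly has its __module__ set to "__main__", so that is the module name the stream ends up recording.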
In the pkl data, you see that the object being referenced is __main__.Manager, as the 'Settings' module was main when you created the pickle file (i.e. you ran the 'Settings' module as the main script to invoke the addUser function).
Then, you try unpickling in 'Gui', so that module has the name __main__, and you're importing Settings within that module. So of course the Manager class will actually be Settings.Manager. But the pkl file doesn't know this, and looks for the Manager class within __main__, and throws an AttributeError because it doesn't exist (Settings.Manager does, but __main__.Manager doesn't).
Here's a minimal code set to demonstrate.
The class_def.py module:
import pickle

class Foo(object):
    def __init__(self, name):
        self.name = name

def main():
    foo = Foo('a')
    with open('test_data.pkl', 'wb') as f:
        pickle.dump([foo], f, -1)

if __name__=='__main__':
    main()
You run the above to generate the pickle data.
The main_module.py module:
import pickle
import class_def

if __name__=='__main__':
    with open('test_data.pkl', 'rb') as f:
        users = pickle.load(f)
You run the above to attempt to open the pickle file, and this throws roughly the same error that you were seeing. (Slightly different, but I'm guessing that's because I'm on Python 2.7)
The solution is either:
- You make the class available within the namespace of the top-level module (i.e. GUI or main_module) through an explicit import, or
- You create the pickle file from the same top-level module as the one that you will open it in (i.e. call Settings.addUser from GUI, or class_def.main from main_module). This means that the pkl file will save the objects as Settings.Manager or class_def.Foo, which can then be found from the GUI or main_module namespace.
Option 1 example:
import pickle
import class_def
from class_def import Foo # Import Foo into main_module's namespace explicitly

if __name__=='__main__':
    with open('test_data.pkl', 'rb') as f:
        users = pickle.load(f)
Option 2 example:
import pickle
import class_def

if __name__=='__main__':
    class_def.main() # Objects are being pickled with main_module as the top-level
    with open('test_data.pkl', 'rb') as f:
        users = pickle.load(f)
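If the pickle file already exists and cannot be regenerated, a further route, which is not one of the two options above, is to remap the module name while loading by overriding pickle.Unpickler.find_class. The sketch below (Python 3) only simulates the situation: old_main and class_def_demo are hypothetical stand-in module names, and RemappingUnpickler is a name invented here. In the real case the mapping would presumably go from "__main__" to "Settings".

```python
import io
import pickle
import sys
import types


class Foo(object):
    def __init__(self, name):
        self.name = name


# Stand-in modules (hypothetical names): "old_main" plays the script that wrote
# the pickle, "class_def_demo" plays the module Foo can be imported from.
for module_name in ("old_main", "class_def_demo"):
    module = types.ModuleType(module_name)
    module.Foo = Foo
    sys.modules[module_name] = module

Foo.__module__ = "old_main"            # the pickle will record old_main.Foo
payload = pickle.dumps([Foo("a")])

del sys.modules["old_main"]            # the writing script is no longer around
Foo.__module__ = "class_def_demo"


class RemappingUnpickler(pickle.Unpickler):
    """Redirect lookups for the old module name to the importable one."""

    def find_class(self, module, name):
        if module == "old_main":
            module = "class_def_demo"
        return super().find_class(module, name)


users = RemappingUnpickler(io.BytesIO(payload)).load()
print(users[0].name)
```

This keeps old files readable, but the mapping has to be maintained by hand, so regenerating the pickle from an importable module is usually the simpler fix when that is possible.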
Load pickled object in different file - Attribute error
In your class_def.py file you have this code:
if __name__ == '__main__':
    doc = Document()
    utils.save_document(doc)
This means that doc will be a __main__.Document object, so the pickle records it as such and, on loading, expects to be able to get a Document class from the main module. To fix this, you need to use the definition of Document from a module called class_def, meaning you would add an import here (in general you can just do from <own module name> import * right inside the if __name__ == "__main__" block):
if __name__ == '__main__':
    from class_def import Document
    # ^ so that it is using the Document class defined under the class_def module
    doc = Document()
    utils.save_document(doc)
That way Python has to run the class_def.py file twice, once as __main__ and once as class_def, but it does mean that the data will be pickled as a class_def.Document object, so loading it will retrieve the class from the correct place. Otherwise, if you have a way of constructing one document object from another, you can do something like this in utils.py:
import pickle

def save_document(doc):
    if doc.__class__.__module__ == "__main__":
        from class_def import Document  # get the class from the importable module
        doc = Document(doc)  # convert it to the class we are able to use
    # file_path is assumed to be defined elsewhere in utils.py
    with open(file_path, 'wb') as write_file:
        pickle.dump(doc, write_file)
Although usually I'd prefer the first way.
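A small guard can also catch the problem at save time instead of at load time. The sketch below uses a hypothetical helper, check_importable, that refuses objects whose class lives in __main__; the Document class here is only a placeholder for the one in the question.

```python
def check_importable(obj):
    """Raise if obj's class would be pickled as __main__.<name>."""
    cls = type(obj)
    if cls.__module__ == "__main__":
        raise ValueError(
            "%s is defined in __main__; import it from its module before pickling"
            % cls.__name__
        )
    return obj


class Document(object):
    pass


# Force the situation from the question, regardless of how this file is run.
Document.__module__ = "__main__"

try:
    check_importable(Document())
    rejected = False
except ValueError:
    rejected = True

# float lives in the builtins module, so this value passes through unchanged.
accepted = check_importable(3.5)
print(rejected, accepted)
```

Calling such a check at the top of save_document would turn a confusing load-time AttributeError into an immediate, explicit error.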
Saving and loading multiple objects in pickle file?
Using a list, tuple, or dict is by far the most common way to do this:
import pickle

PIK = "pickle.dat"
data = ["A", "b", "C", "d"]

with open(PIK, "wb") as f:
    pickle.dump(data, f)

with open(PIK, "rb") as f:
    print(pickle.load(f))
That prints:
['A', 'b', 'C', 'd']
However, a pickle file can contain any number of pickles. Here's code producing the same output. But note that it's harder to write and to understand:
with open(PIK, "wb") as f:
    pickle.dump(len(data), f)
    for value in data:
        pickle.dump(value, f)

data2 = []
with open(PIK, "rb") as f:
    for _ in range(pickle.load(f)):
        data2.append(pickle.load(f))
print(data2)
If you do this, you're responsible for knowing how many pickles are in the file you write out. The code above does that by pickling the number of list objects first.
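If storing the count up front is inconvenient, another common pattern is to keep calling pickle.load until it raises EOFError. The sketch below wraps that in a hypothetical generator, load_all, and uses an in-memory io.BytesIO buffer in place of a real file.

```python
import io
import pickle


def load_all(f):
    """Yield every pickle stored in the binary file object f, in order."""
    while True:
        try:
            yield pickle.load(f)
        except EOFError:
            return


data = ["A", "b", "C", "d"]
buffer = io.BytesIO()
for value in data:
    pickle.dump(value, buffer)

buffer.seek(0)
data2 = list(load_all(buffer))
print(data2)
```

The trade-off is that a truncated file simply looks like a shorter one, whereas the count-first layout makes a missing pickle detectable.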
Pickling and Unpickling in different modules
The issue here is the lambda (anonymous function).
It is completely possible to pickle a self-contained object like the Vectorizer. However, the preprocessing function used in the example is scoped to the Updater class so the Updater class is required to unpickle.
Rather than having a preprocessor function, preprocess the data yourself and pass that in to fit the vectorizer. That will remove the need for the Updater class when unpickling.
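The difference between a lambda and a named, module-level function can be seen without the vectorizer. This is a sketch of general pickle behaviour only, using textwrap.dedent as an arbitrary importable function; it does not reproduce the Updater code from the question.

```python
import pickle
import textwrap

# A named, module-level function is pickled by reference (module + name).
restored = pickle.loads(pickle.dumps(textwrap.dedent))
same_function = restored is textwrap.dedent

# A lambda has no importable name, so pickle cannot record a reference to it.
try:
    pickle.dumps(lambda text: text.lower())
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(same_function, lambda_pickled)
```

Moving a preprocessing function to module level would therefore make it picklable, though the defining module would still have to be importable wherever the pickle is loaded.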
Unable to load file via pickle
You need to dump it first, then load it. Try the following:
# dump df_train_train
file = open('df_train_train', 'wb')
pickle.dump(df_train_train, file)
file.close()
Then
file = open('df_train_train', 'rb')
df_train_train = pickle.load(file)
file.close()
How do I pickle a dictionary containing a module & class?
This works for a single class. If you want to do this for multiple modules and classes, you can extend the following code.
module_class_writer.py
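import pickle

import module_example
from module_example import ClassExample

included_module = ["module_example"]
d = {}
# Iterate over a copy: the loop variables are themselves globals, so looping
# over the live dict would change its size mid-iteration on Python 3.
for name, val in list(globals().items()):
    # Classes record the name of their defining module in __module__.
    if getattr(val, "__module__", None) in included_module:
        d["module"] = val.__module__
        d["class"] = name
# d == {'module': 'module_example', 'class': 'ClassExample'}

with open("imports.pkl", "wb") as filehandler:
    pickle.dump(d, filehandler)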
module_class_reader.py
import importlib
import pickle

with open("imports.pkl", "rb") as filehandler:
    d = pickle.load(filehandler)

def reload_class(module_name, class_name):
    # Import the module by name, reload it to pick up changes, then look up the class.
    # (On Python 3, reload lives in importlib; on Python 2 it is a builtin.)
    mod = importlib.import_module(module_name)
    mod = importlib.reload(mod)
    return getattr(mod, class_name)

if "class" in d and "module" in d:
    ClassExample = reload_class(d["module"], d["class"])
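The same idea can be tried end to end with a class that is always importable. The sketch below is a stand-in under stated assumptions: it stores the names collections and OrderedDict instead of module_example and ClassExample, keeps the pickle in memory rather than in imports.pkl, and targets Python 3's importlib.

```python
import importlib
import pickle

# Store names, not the module or class objects themselves.
d = {"module": "collections", "class": "OrderedDict"}
payload = pickle.dumps(d)

# Later, possibly in another process: resolve the names back to a class.
loaded = pickle.loads(payload)
module = importlib.import_module(loaded["module"])
cls = getattr(module, loaded["class"])

instance = cls(a=1)
print(cls.__name__, dict(instance))
```

Storing plain strings keeps the pickle independent of which script wrote it, since nothing in the stream refers to __main__.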