How to save and load numpy.array() data properly?
The most reliable way I have found to do this is to use np.savetxt with np.loadtxt, and not np.fromfile, which is better suited to binary files written with tofile. The np.fromfile and np.tofile methods read and write binary files, whereas np.savetxt writes a text file.
So, for example:
import numpy as np
a = np.array([1, 2, 3, 4])
np.savetxt('test1.txt', a, fmt='%d')
b = np.loadtxt('test1.txt', dtype=int)
a == b
# array([ True, True, True, True], dtype=bool)
Or:
a.tofile('test2.dat')
c = np.fromfile('test2.dat', dtype=int)
c == a
# array([ True, True, True, True], dtype=bool)
I use the former method even if it is slower and creates bigger files (sometimes): the binary format can be platform dependent (for example, the file format depends on the endianness of your system).
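To make the endianness point concrete, here is a minimal sketch, assuming you are free to choose the on-disk dtype. It pins the byte order with an explicit dtype string ('<i8', little-endian 64-bit integers) on both the write and the read; the file name and temporary directory are placeholders.

```python
import os
import tempfile

import numpy as np

a = np.array([1, 2, 3, 4])

# tofile writes raw bytes only: no dtype, shape, or byte-order header.
# Pinning the dtype to little-endian 64-bit integers ('<i8') on both
# sides makes the file layout explicit rather than platform-dependent.
path = os.path.join(tempfile.mkdtemp(), 'test2_le.dat')  # hypothetical file name
a.astype('<i8').tofile(path)

c = np.fromfile(path, dtype='<i8')
same = bool(np.array_equal(a, c))

# Reading the same bytes as big-endian ('>i8') gives different numbers,
# which is what happens when writer and reader disagree on byte order.
swapped = np.fromfile(path, dtype='>i8')
```

Even with the byte order pinned, the raw file still carries no shape information, so a multi-dimensional array would have to be reshaped by hand after reading.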
There is a platform-independent format for NumPy arrays, which can be saved and read with np.save and np.load:
np.save('test3.npy', a) # .npy extension is added if not given
d = np.load('test3.npy')
a == d
# array([ True, True, True, True], dtype=bool)
How to save a list of numpy arrays into a single file and load file back to original form
I would go with np.save and np.load because they are platform-independent, faster than savetxt, and work with lists of arrays, for example:
import numpy as np
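a = [
    np.arange(100),
    np.arange(200),
]
# The arrays have different lengths, so wrap the list in an object array
# explicitly; recent NumPy versions refuse to build a ragged array implicitly.
np.save('a.npy', np.array(a, dtype=object), allow_pickle=True)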
b = np.load('a.npy', allow_pickle=True)
See the documentation for np.save and np.load. The answer to "How to save and load numpy.array() data properly?" has a fuller discussion.
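If you would rather not rely on pickling, np.savez is another option for keeping several arrays in one file. The sketch below is one possible approach, not the method from the answer above; the file name is a placeholder, and the arr_0, arr_1 keys are the names np.savez assigns to positional arguments.

```python
import os
import tempfile

import numpy as np

arrays = [np.arange(100), np.arange(200)]

path = os.path.join(tempfile.mkdtemp(), 'arrays.npz')  # hypothetical file name

# Positional arguments are stored under the keys arr_0, arr_1, ...
np.savez(path, *arrays)

# np.load returns a dict-like NpzFile; rebuild the list in key order.
with np.load(path) as data:
    loaded = [data[f'arr_{i}'] for i in range(len(data.files))]
```

Each array keeps its own shape and dtype, and loading does not need allow_pickle=True.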
Best way to preserve numpy arrays on disk
I'm a big fan of hdf5 for storing large numpy arrays. There are two options for dealing with hdf5 in python:
http://www.pytables.org/
http://www.h5py.org/
Both are designed to work with numpy arrays efficiently.
How can I save and load multiple NumPy arrays at a single url?
I think you just need to give np.load the filename, not the open DataSource object. This seems to work:
import numpy as np
url = "https://www.dropbox.com/s/1vpn5k3gt41nhtn/Test.npz"
file = np.DataSource().open(url)
data = np.load(file.name)
Now data['x'] is array([1, 2, 3]) and data['y'] is array([4, 5, 6]).
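As a possible alternative, np.load also accepts a file-like object, so bytes that have already been downloaded by some HTTP client could be wrapped in io.BytesIO. The sketch below skips the download and builds a stand-in archive in memory with the same x and y keys; the payload variable is an assumption standing in for the response body.

```python
import io

import numpy as np

# Stand-in for the bytes a download would return: build an .npz archive
# in memory with the same keys as the example above.
buffer = io.BytesIO()
np.savez(buffer, x=np.array([1, 2, 3]), y=np.array([4, 5, 6]))
payload = buffer.getvalue()

# np.load accepts a seekable file-like object, not only a filename.
data = np.load(io.BytesIO(payload))
x, y = data['x'], data['y']
```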
By the way, I learned something. I thought that to get a nice plain file out of Dropbox you had to stick ?raw=1 at the end of the URL. Turns out that's not true.
Last thing, kudos for setting up your question and example so nicely. Hardly anyone does that.
Saving 4D array without losing its format
The usual np.save and np.load work:
>>> P.shape
(100000, 8, 4, 4)
>>> np.save("P.npy", P)
>>> P2 = np.load("P.npy")
>>> P2.shape
(100000, 8, 4, 4)
>>> np.allclose(P, P2)
True
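The reason this works is that the .npy format records the shape and dtype in its header. A small sketch of the contrast with np.savetxt, which only accepts 1-D and 2-D input, follows; the array here is a small stand-in for P, and the paths are placeholders.

```python
import os
import tempfile

import numpy as np

# Small stand-in for the (100000, 8, 4, 4) array in the session above.
P = np.arange(2 * 8 * 4 * 4, dtype=np.float32).reshape(2, 8, 4, 4)
tmpdir = tempfile.mkdtemp()

# .npy stores shape and dtype in its header, so both survive the round trip.
npy_path = os.path.join(tmpdir, 'P.npy')
np.save(npy_path, P)
P2 = np.load(npy_path)

# savetxt only handles 1-D and 2-D input and rejects the 4-D array.
try:
    np.savetxt(os.path.join(tmpdir, 'P.txt'), P)
    savetxt_failed = False
except ValueError:
    savetxt_failed = True
```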
How do I save numpy arrays such that they can be loaded later appropriately?
Are you looking for something like np.savetxt?
If you want to append data to an existing file, you can open the file with append mode.
with open('data.txt', 'a') as f:
    np.savetxt(f, newdata)
Check out this post: "Appending a matrix to an existing file using numpy".
You can read the text file back using np.loadtxt.
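Put together, the append-then-read workflow might look like the sketch below. The array contents and the temporary path are made up for illustration; the appended rows need the same number of columns as the existing ones for np.loadtxt to read the file back as one array.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'data.txt')

first = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
newdata = np.array([[7.0, 8.0, 9.0]])  # hypothetical rows to append

np.savetxt(path, first)

# Append mode adds the new rows after the existing ones.
with open(path, 'a') as f:
    np.savetxt(f, newdata)

# loadtxt reads every row back as one 2-D array.
combined = np.loadtxt(path)
```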