How to Save and Load numpy.array() Data Properly

How to save and load numpy.array() data properly?

The most reliable way I have found is to use np.savetxt with np.loadtxt rather than np.fromfile, which is better suited to binary files written with tofile. The ndarray.tofile method and np.fromfile write and read raw binary files, whereas np.savetxt writes a text file.
For example:

import numpy as np

a = np.array([1, 2, 3, 4])
np.savetxt('test1.txt', a, fmt='%d')
b = np.loadtxt('test1.txt', dtype=int)
a == b
# array([ True,  True,  True,  True])

Or:

a.tofile('test2.dat')
c = np.fromfile('test2.dat', dtype=a.dtype)  # dtype must match the one that was written
c == a
# array([ True,  True,  True,  True])

I use the former method even though it is slower and sometimes creates bigger files: the raw binary format can be platform dependent (for example, the byte order in the file depends on the endianness of your system).
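
Besides byte order, a file written with tofile stores neither the dtype nor the shape of the array, so both have to be supplied again when reading. A minimal sketch of that pitfall, assuming a small int16 array and a hypothetical file name raw.dat in a temporary directory:

```python
import os
import tempfile

import numpy as np

a = np.arange(6, dtype=np.int16).reshape(2, 3)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "raw.dat")  # hypothetical file name
    a.tofile(path)                       # writes 12 raw bytes, no header

    # Wrong dtype: the same 12 bytes are reinterpreted as three int32 values
    wrong = np.fromfile(path, dtype=np.int32)

    # Right dtype, but the result is flat; the shape must be restored by hand
    flat = np.fromfile(path, dtype=np.int16)
    restored = flat.reshape(a.shape)

print(wrong.shape)                  # (3,)
print(flat.shape)                   # (6,)
print(np.array_equal(restored, a))  # True
```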

There is a platform independent format for NumPy arrays, which can be saved and read with np.save and np.load:

np.save('test3.npy', a)    # .npy extension is added if not given
d = np.load('test3.npy')
a == d
# array([ True,  True,  True,  True])

How to save a list of numpy arrays into a single file and load file back to original form

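I would go with np.save and np.load because the format is platform-independent, faster than savetxt, and works with lists of arrays. When the arrays have different lengths, as below, the list has to be stored as an object array, which is pickled on save and needs allow_pickle=True on load:

import numpy as np

# Arrays of different lengths: build the object array explicitly, because
# recent NumPy versions refuse to create one implicitly from a ragged list
a = np.empty(2, dtype=object)
a[0] = np.arange(100)
a[1] = np.arange(200)
np.save('a.npy', a, allow_pickle=True)
b = np.load('a.npy', allow_pickle=True)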

See the documentation for np.save and np.load. The answers to "How to save and load numpy.array() data properly?" have a fuller discussion.
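
A possible alternative, not taken from the answer above, is np.savez, which stores each array as its own entry in one .npz archive and so avoids pickling altogether. A minimal sketch, assuming a hypothetical file name arrays.npz; positional arguments are stored under the keys arr_0, arr_1, and so on:

```python
import os
import tempfile

import numpy as np

arrays = [np.arange(100), np.arange(200)]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "arrays.npz")  # hypothetical file name
    np.savez(path, *arrays)                 # keys become arr_0, arr_1

    # NpzFile works as a context manager and closes the archive on exit
    with np.load(path) as data:
        loaded = [data[f"arr_{i}"] for i in range(len(data.files))]

print([arr.shape for arr in loaded])  # [(100,), (200,)]
```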

Best way to preserve numpy arrays on disk

I'm a big fan of HDF5 for storing large numpy arrays. There are two options for dealing with HDF5 in Python:

http://www.pytables.org/

http://www.h5py.org/

Both are designed to work with numpy arrays efficiently.

How can I save and load multiple NumPy arrays at a single url?

I think you just need to give np.load the filename, not the open DataSource object. This seems to work:

import numpy as np

url = "https://www.dropbox.com/s/1vpn5k3gt41nhtn/Test.npz"
file = np.DataSource().open(url)
data = np.load(file.name)

Now data['x'] is array([1, 2, 3]) and data['y'] is array([4, 5, 6]).
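
np.load also accepts a seekable file-like object, so bytes that are already in memory can be loaded without going through a filename. A minimal offline sketch, assuming an in-memory archive stands in for the downloaded file (the names buffer and payload are illustrative only):

```python
import io

import numpy as np

# Stand-in for bytes fetched from a URL: build an .npz archive in memory
buffer = io.BytesIO()
np.savez(buffer, x=np.array([1, 2, 3]), y=np.array([4, 5, 6]))
payload = buffer.getvalue()

# Wrap the raw bytes in a seekable file-like object and load from it
with np.load(io.BytesIO(payload)) as data:
    names = sorted(data.files)
    x = data["x"]
    y = data["y"]

print(names)  # ['x', 'y']
print(x, y)   # [1 2 3] [4 5 6]
```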

By the way, I learned something. I thought that to get a nice plain file out of Dropbox you had to stick ?raw=1 at the end of the URL. Turns out that's not true.

Last thing, kudos for setting up your question and example so nicely. Hardly anyone does that.

Saving 4D array without losing its format

The usual np.save and np.load work:

>>> P.shape
(100000, 8, 4, 4)
>>> np.save("P.npy", P)
>>> P2 = np.load("P.npy")
>>> P2.shape
(100000, 8, 4, 4)
>>> np.allclose(P, P2)
True
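
For comparison, np.savetxt only accepts 1D or 2D arrays, so a text round trip of a 4D array needs a manual reshape on both sides. A minimal sketch, assuming a small stand-in array and a hypothetical file name P.txt:

```python
import os
import tempfile

import numpy as np

P = np.arange(2 * 3 * 4 * 4, dtype=float).reshape(2, 3, 4, 4)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "P.txt")  # hypothetical file name

    # savetxt rejects arrays with more than two dimensions
    try:
        np.savetxt(path, P)
        savetxt_accepts_4d = True
    except ValueError:
        savetxt_accepts_4d = False

    # Workaround: flatten to 2D for writing, restore the shape after reading
    np.savetxt(path, P.reshape(P.shape[0], -1))
    P2 = np.loadtxt(path).reshape(P.shape)

print(savetxt_accepts_4d)     # False
print(P2.shape)               # (2, 3, 4, 4)
print(np.array_equal(P, P2))  # True
```

With np.save the shape is stored in the file header, so no such bookkeeping is needed.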

How do I save numpy arrays such that they can be loaded later appropriately?

Are you looking for something like np.savetxt?

If you want to append data to an existing file, you can open the file with append mode.

with open('data.txt', 'a') as f:
    np.savetxt(f, newdata)

See also the post "Appending a matrix to an existing file using numpy".

You can read the text file back using np.loadtxt.
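
The append-then-read round trip can be sketched as follows, assuming two small arrays with the same number of columns and a hypothetical data.txt placed in a temporary directory:

```python
import os
import tempfile

import numpy as np

first = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
newdata = np.array([[7.0, 8.0, 9.0]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.txt")  # hypothetical file name
    np.savetxt(path, first)               # creates the file with two rows

    # Append mode adds rows below the existing ones
    with open(path, "a") as f:
        np.savetxt(f, newdata)

    combined = np.loadtxt(path)

print(combined.shape)  # (3, 3)
```

Appended rows need the same number of columns as the existing ones, otherwise np.loadtxt cannot parse the file as a single array.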


