How to Create a View Onto a Numpy Array

How do I create a view onto a NumPy array?

Sure, just index it as you normally would. E.g. y = x[:k, :] This will return a view into the original array. No data will be copied, and any updates made to y will be reflected in x and vice versa.
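A minimal sketch of that behaviour, assuming a small example array (the names x, y and k here are only illustrative). np.shares_memory reports whether two arrays overlap in memory:

```python
import numpy as np

x = np.arange(12).reshape(4, 3)
k = 2
y = x[:k, :]                      # basic slicing: a view, no data copied

y[0, 0] = 100                     # write through the view...
x[1, 2] = -1                      # ...and through the original

shares = np.shares_memory(x, y)   # True when both arrays use the same buffer
print(shares, x[0, 0], y[1, 2])
```

Both writes should be visible from either name, since there is only one underlying buffer.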


Edit:

I commonly work with >10GB 3D arrays of uint8's, so I worry about this a lot... Numpy can be very efficient at memory management if you keep a few things in mind.
Here are a few tips on avoiding making copies of arrays in memory:

Use +=, -=, *=, etc to avoid making a copy of the array. E.g. x += 10 will modify the array in place, while x = x + 10 will make a copy and modify it. (also, have a look at numexpr)

If you do want to make a copy with x = x + 10, be aware that x = x + 10.0 will cause x to automatically be up-casted to a floating point array, if it wasn't already. However, x += 10.0, where x is an integer array, cannot change the dtype of x: older NumPy versions silently down-cast the result to an int of the same precision as the array, while current versions raise a casting error instead.

Additionally, many numpy functions take an out parameter, so you can do things like np.abs(x, x) to take the absolute value of x in-place.
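As a rough illustration of these tips (the array contents are made up for the example), you can compare the address of the data buffer before and after an operation to see whether it ran in place:

```python
import numpy as np

x = np.arange(5, dtype=np.int64)
addr = x.__array_interface__['data'][0]    # address of x's data buffer

x += 10                                    # in place: the buffer is reused
same_buffer = x.__array_interface__['data'][0] == addr

x = x + 10.0                               # builds a new array, up-cast to float
upcast_dtype = x.dtype

z = np.array([-1, 2, -3])
np.abs(z, out=z)                           # result is written back into z
print(same_buffer, upcast_dtype, z)
```

Passing out=z explicitly does the same thing as np.abs(z, z), but is a little easier to read.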


As a second edit, here are a few more tips on views vs. copies with numpy arrays:

Unlike python lists, y = x[:] does not return a copy, it returns a view. If you do want a copy (which will, of course, double the amount of memory you're using) use y = x.copy()

You'll often hear about "fancy indexing" of numpy arrays. Using a list (or integer array) as an index is "fancy indexing". It can be very useful, but copies the data.

As an example of this: y = x[[0, 1, 2], :] returns a copy, while y = x[:3,:] would return a view.

Even really crazy indexing like x[4:100:5, :-10:-1, None] is "normal" indexing and will return a view, though, so don't be afraid to use all kinds of slicing tricks on large arrays.
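If you are ever unsure which kind of indexing you ended up with, a quick check along these lines can help (a sketch with a small made-up array; np.shares_memory does the actual test):

```python
import numpy as np

x = np.arange(20).reshape(5, 4)

fancy = x[[0, 1, 2], :]           # list as index: fancy indexing, copies
sliced = x[:3, :]                 # plain slice: view
odd = x[4:0:-2, ::-1, None]       # strided, reversed, new axis: still a view
full = x[:]                       # unlike a list, this is a view too
dup = x.copy()                    # explicit copy

results = [np.shares_memory(x, arr) for arr in (fancy, sliced, odd, full, dup)]
print(results)
```

Only the slice-based results should report shared memory.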

x.astype(<dtype>) will return a copy of the data as the new type, while x.view(<dtype>) will return a view.

Be careful with this, however... It's extremely powerful and useful, but you need to understand how the underlying data is stored in memory. If you have an array of floats and view them as ints (or vice versa), numpy will reinterpret the underlying bits of the array as the new type without converting the values.

For example, this means that 1.0 as a 64bit float on a little-endian system will be 4607182418800017408 when viewed as a 64bit int, and an array of [ 0, 0, 0, 0, 0, 0, 240, 63] if viewed as a uint8. This is really nice when you need to do bit-twiddling of some sort on large arrays, though... You have low level control over how the memory buffer is interpreted.
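Here is a small sketch of the difference. The dtypes are spelled out as little-endian ('<f8', '<i8') so the numbers match the ones quoted above regardless of the machine; the variable names are just for illustration:

```python
import numpy as np

f = np.array([1.0], dtype='<f8')      # little-endian 64-bit float

converted = f.astype('<i8')           # converts the value: a new array holding 1
as_int = f.view('<i8')                # same 8 bytes read as one 64-bit int
as_bytes = f.view(np.uint8)           # same 8 bytes read as eight uint8 values

int_value = int(as_int[0])
byte_values = as_bytes.tolist()

# Bit-twiddling through the view: rewrite the two highest bytes so the
# bit pattern becomes 0x4000000000000000, which is 2.0 as a float64.
as_bytes[6:8] = [0, 64]
print(int_value, byte_values, f[0])
```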

How can I create a numpy array of views?

You can do it using the as_strided function:

import numpy as np
from numpy.lib.stride_tricks import as_strided
N=10
L=4*N
H=3*N
step=5
a=(np.arange(3*H*L)%256).reshape(3,H,L)
(k,j,i)=a.strides
# use integer division (//): shape entries must be ints in Python 3
b=as_strided(a,shape=(H//step,L//step,3,step,step),strides=(j*step,i*step,k,j,i))

b then addresses each block without copying.

In [29]: np.all(b[1,2]==a[:,5:10,10:15])
Out[29]: True

In [30]: a[:,5,10]=0 # modification of a

In [31]: np.all(b[1,2]==a[:,5:10,10:15])
Out[31]: True # b also modified
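Since as_strided does no bounds checking, it may be worth knowing that for this particular layout (a contiguous array whose dimensions divide evenly by step) the same block view can apparently be had from reshape and transpose alone. This is an alternative sketch, not part of the answer above; the name c is just for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

N = 10
L = 4 * N
H = 3 * N
step = 5
a = (np.arange(3 * H * L) % 256).reshape(3, H, L)
k, j, i = a.strides

b = as_strided(a, shape=(H // step, L // step, 3, step, step),
               strides=(j * step, i * step, k, j, i))

# Split H and L into (block index, offset inside block) pairs,
# then move the two block axes to the front.
c = a.reshape(3, H // step, step, L // step, step).transpose(1, 3, 0, 2, 4)

same = np.array_equal(b, c)
c_is_view = np.shares_memory(a, c)

a[:, 5, 10] = 0                       # modify a; c sees it as well
still_same = np.array_equal(c[1, 2], a[:, 5:10, 10:15])
print(same, c_is_view, still_same)
```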

Is there a way to get a view into a python array.array()?

Numpy is incredibly flexible and powerful when it comes to views into arrays whilst minimising copies. For example:

import numpy
a = numpy.random.randint(0, 10, size=10)
b = a[3:10]

b is now a view of the original array that was created.

Numpy arrays allow all manner of access directly to the data buffers, and can be trivially typecast. For example:

a = numpy.random.randint(0, 10, size=10)
b = numpy.frombuffer(a.data, dtype='int8')

b is now a view into the memory with the data all as 8-bit integers (the data itself remains unchanged, so each 64-bit int now becomes eight 8-bit ints). These buffer objects (from a.data) are standard Python buffer objects and so can be used in all the places that are defined to work with buffers.

The same is true for multi-dimensional arrays. However, you have to bear in mind how the data lies in memory. For example:

a = numpy.random.randint(0, 10, size=(10, 10))
b = numpy.frombuffer(a[3,:].data, dtype='int8')

will work, but

b = numpy.frombuffer(a[:,3].data, dtype='int8')

raises an error about being unable to get a single-segment buffer for discontiguous arrays (the exact message depends on the NumPy version). This problem is not obvious because simply assigning that same view to a variable using

b  = a[:,3]

returns a perfectly adequate numpy array. However, it is not contiguous in memory as it's a view into the other array, which need not be (and in this case isn't) a view of contiguous memory. You can get info about the array using the flags attribute on an array:

a[:,3].flags

which returns (among other things) both C_CONTIGUOUS (C order, row major) and F_CONTIGUOUS (Fortran order, column major) as False, but

a[3,:].flags

returns them both as True (a 1D array can be both at once; a 2D array with more than one row and more than one column can be at most one of the two).
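To make the contiguity point concrete, here is a sketch that checks the flags first and falls back to np.ascontiguousarray when a contiguous buffer is needed. The dtype is fixed to int64 so the byte counts are predictable; note that the fallback makes a copy, so it is no longer a view of a:

```python
import numpy as np

a = np.arange(100, dtype=np.int64).reshape(10, 10)

row = a[3, :]
col = a[:, 3]

row_contig = row.flags['C_CONTIGUOUS']    # one row: elements are adjacent
col_contig = col.flags['C_CONTIGUOUS']    # one column: elements are a full row apart
col_strides = col.strides                 # bytes to step between elements

row_bytes = np.frombuffer(row.data, dtype='int8')       # still a view of a

col_copy = np.ascontiguousarray(col)                    # contiguous copy
col_bytes = np.frombuffer(col_copy.data, dtype='int8')
print(row_contig, col_contig, col_strides, row_bytes.size, col_bytes.size)
```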

How do I convert a numpy array to (and display) an image?

You could use PIL to create (and display) an image:

from PIL import Image
import numpy as np

w, h = 512, 512
data = np.zeros((h, w, 3), dtype=np.uint8)
data[0:256, 0:256] = [255, 0, 0] # red patch in upper left
img = Image.fromarray(data, 'RGB')
img.save('my.png')
img.show()

How to create a view of an irregularly spaced slice of a numpy array?

The attributes of an array consist of shape, strides and the data.

a[3:10] is a view because it can use the original data buffer, and just use a different shape (7,) and a different start point in the buffer.

a[some_ind], with some_ind = [0,5,6,24] for example, cannot be a view because the indices do not form a regular pattern. The result can't be expressed as shape, strides and data pointer. Therefore it has to have its own copy of the data.

In [534]: a=np.zeros(25,int)
In [535]: np.info(a)
class: ndarray
shape: (25,)
strides: (4,)
itemsize: 4
aligned: True
contiguous: True
fortran: True
data pointer: 0xafa5cc0
...
In [536]: np.info(a[3:10])
class: ndarray
shape: (7,)
strides: (4,)
itemsize: 4
aligned: True
contiguous: True
fortran: True
data pointer: 0xafa5ccc # ccc v cc0
....
In [537]: np.info(a[[0,5,6,24]])
class: ndarray
shape: (4,)
strides: (4,)
itemsize: 4
aligned: True
contiguous: True
fortran: True
data pointer: 0xae9c038 # different
...

or looking at the data buffer pointer in decimal format:

In [538]: a.__array_interface__['data']
Out[538]: (184179904, False)
In [539]: a[3:10].__array_interface__['data']
Out[539]: (184179916, False) # 12 bytes larger
In [540]: a[[0,5,6]].__array_interface__['data']
Out[540]: (181099224, False)

Another way to put it is: the only alternative to copying elements of a is to hang on to the indexing array (or mask), and apply it each time you need those elements.
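A short sketch of what "hanging on to the index" looks like in practice (the names idx and picked are only for illustration). Indexing on the right-hand side gives a copy, but indexing as the target of an assignment writes into the original array:

```python
import numpy as np

a = np.zeros(25, dtype=int)
idx = np.array([0, 5, 6, 24])

picked = a[idx]              # fancy indexing: an independent copy
picked += 1                  # only the copy changes
sum_after_copy_edit = int(a.sum())

a[idx] += 1                  # index applied on assignment: a itself changes
a[idx] *= 5
print(sum_after_copy_edit, a[idx])
```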

How to make a `numpy` array view that repeats

With stride tricks we can make a 4d view:

In [18]: x = np.array([[1, 2], [3, 4]])
In [19]: as_strided = np.lib.stride_tricks.as_strided
In [20]: X = as_strided(x, shape=(2,2,2,2), strides=(0,16,0,8))
In [21]: X
Out[21]:
array([[[[1, 2],
         [1, 2]],

        [[3, 4],
         [3, 4]]],


       [[[1, 2],
         [1, 2]],

        [[3, 4],
         [3, 4]]]])

Which can be reshaped into your desired array:

In [22]: X.reshape(4,4)
Out[22]:
array([[1, 2, 1, 2],
       [3, 4, 3, 4],
       [1, 2, 1, 2],
       [3, 4, 3, 4]])

But that reshaping will create a copy of X.
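You can check the copy for yourself with something like the following. This sketch reads the strides from x.strides instead of hard-coding 16 and 8, so it should not depend on the platform's default integer size; np.tile is used only as a cross-check of the expected result:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.array([[1, 2], [3, 4]])
s0, s1 = x.strides                       # row and column strides in bytes

X = as_strided(x, shape=(2, 2, 2, 2), strides=(0, s0, 0, s1))
Y = X.reshape(4, 4)                      # cannot be expressed with strides: copies

view_shares = np.shares_memory(x, X)
reshaped_shares = np.shares_memory(x, Y)
matches_tile = np.array_equal(Y, np.tile(x, (2, 2)))
print(view_shares, reshaped_shares, matches_tile)
```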

That (2,2) array can be used in calculations as (1,1,2,2) array, which if needed is expanded to (2,2,2,2):

In [25]: x[None,None,:,:]
Out[25]:
array([[[[1, 2],
         [3, 4]]]])
In [26]: np.broadcast_to(x,(2,2,2,2))
Out[26]:
array([[[[1, 2],
         [3, 4]],

        [[1, 2],
         [3, 4]]],


       [[[1, 2],
         [3, 4]],

        [[1, 2],
         [3, 4]]]])

Thus broadcasting lets us use a view of an array in larger calculations.
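For what it's worth, the broadcast result can be inspected the same way as the strided one. In this sketch (the weights array is a made-up example) the repeated axes have stride 0, and the result is read-only:

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
big = np.broadcast_to(x, (2, 2, 2, 2))

leading_strides = big.strides[:2]        # repeated axes step by 0 bytes
shares = np.shares_memory(x, big)        # no data was duplicated
writeable = big.flags.writeable          # broadcast_to returns a read-only view

# The view can take part in a larger calculation like any other array.
weights = np.arange(2).reshape(2, 1, 1, 1)
total = int((big * weights).sum())
print(leading_strides, shares, writeable, total)
```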


