How to Extend an Array In-Place in Numpy

How to extend an array in-place in Numpy?

The underlying data in a numpy array always occupies one contiguous block of memory. Now imagine other objects, say other numpy arrays, occupying the memory just to the left and right of our array. There would be no room to append to or extend our array.

So any request to append to or extend our numpy array can only be satisfied by allocating a whole new larger block of memory, copying the old data into the new block and then appending or extending.

So:

It will not occur in-place.
It will not be efficient.
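
A quick check makes this visible. The following is a minimal sketch (the variable names are illustrative only) showing that np.append returns a separate, larger array and leaves the original untouched:

```python
import numpy as np

a = np.arange(5)
b = np.append(a, [5, 6])

# The original array keeps its size; the result is a separate, larger array.
print(a.shape, b.shape)

# The two arrays share no memory, so the old data was copied.
print(np.shares_memory(a, b))
```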

What's the simplest way to extend a numpy array in 2 dimensions?

For the first question, the shortest version in terms of lines of code I can think of is:

>>> import numpy as np
>>> p = np.array([[1,2],[3,4]])

>>> p = np.append(p, [[5,6]], 0)
>>> p = np.append(p, [[7],[8],[9]],1)

>>> p
array([[1, 2, 7],
       [3, 4, 8],
       [5, 6, 9]])

And for the second question:

>>> p = np.array(range(20))
>>> p.shape = (4,5)
>>> p
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
>>> n = 2
>>> p = np.append(p[:n],p[n+1:],0)
>>> p = np.append(p[...,:n],p[...,n+1:],1)
>>> p
array([[ 0,  1,  3,  4],
       [ 5,  6,  8,  9],
       [15, 16, 18, 19]])
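
The same row-and-column removal can probably be written more directly with np.delete, which is not used in the answer above. This is a sketch under that assumption; like np.append, each call returns a new array rather than changing p:

```python
import numpy as np

p = np.arange(20).reshape(4, 5)
n = 2

# Remove row n, then column n; each call returns a new array.
q = np.delete(np.delete(p, n, axis=0), n, axis=1)
print(q)
```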

numpy - Append to array without making a copy

Based on your updated question, it looks like you can handily solve the problem by keeping a dictionary of numpy arrays:

import re
import numpy as np

x = np.array([])
y = np.array([])
Arrays = {"x": x, "y": y}

with open("./data.txt", "r") as f:
    for line in f:
        if re.match('x values', line):
            print("reading x values")
            key = "x"
        elif re.match('y', line):
            print("reading y values")
            key = "y"
        else:
            values = re.match(r"^\s+((?:[0-9.E+-]+\s*)*)", line)
            if values:
                # convert the matched tokens to floats before appending;
                # np.append returns a new, larger array on every call
                numbers = [float(v) for v in values.group(1).split()]
                Arrays[key] = np.append(Arrays[key], numbers)

As Sven Marnach points out in comments both here and on your question, this is an inefficient use of numpy arrays: every call to np.append copies the whole array.

A better approach (again, as Sven points out) would be:

import re
import numpy as np

Arrays = {"x": [], "y": []}

with open("./data.txt", "r") as f:
    for line in f:
        if re.match('x values', line):
            print("reading x values")
            key = "x"
        elif re.match('y', line):
            print("reading y values")
            key = "y"
        else:
            values = re.match(r"^\s+((?:[0-9.E+-]+\s*)*)", line)
            if values:
                # extend (not append) keeps each list flat
                Arrays[key].extend(values.group(1).split())

# convert each list to a float array once, after reading everything
Arrays = {key: np.array(Arrays[key], dtype=float) for key in Arrays}
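
To try the list-based approach without a data file, here is a self-contained sketch. The sample text is hypothetical (the real layout of data.txt is not shown in the question), and the names sample, columns and arrays are made up for illustration:

```python
import io
import re
import numpy as np

# Hypothetical sample standing in for ./data.txt
sample = io.StringIO(
    "x values\n"
    "  1.0 2.0 3.0\n"
    "  4.0\n"
    "y values\n"
    "  0.5 1.5\n"
)

columns = {"x": [], "y": []}
key = None
for line in sample:
    if re.match("x values", line):
        key = "x"
    elif re.match("y", line):
        key = "y"
    else:
        values = re.match(r"^\s+((?:[0-9.E+-]+\s*)*)", line)
        if values and key is not None:
            # list.extend grows the list itself instead of building a new array
            columns[key].extend(float(v) for v in values.group(1).split())

# Convert to arrays once, after all the data has been read
arrays = {name: np.array(vals) for name, vals in columns.items()}
print(arrays)
```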

How to extend numpy array

Use -

np.concatenate((a, b), axis=0)

Or -

np.vstack((a,b))

Or, if a and b are plain Python lists rather than numpy arrays -

a.extend(b) # extends the list in-place; numpy arrays have no append or extend method

Output

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9],
       [0, 0, 0],
       [1, 1, 1]])
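
For reference, a runnable sketch of the two array-based options. The inputs a and b are assumptions, chosen so that the result matches the output shown above:

```python
import numpy as np

# Assumed inputs, picked to reproduce the output above
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([[0, 0, 0], [1, 1, 1]])

stacked = np.concatenate((a, b), axis=0)

# np.vstack gives the same result for 2-D inputs with matching column counts
assert np.array_equal(stacked, np.vstack((a, b)))

print(stacked.shape)  # (5, 3)
print(a.shape)        # (3, 3): a itself is not modified
```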

numpy extend/append ndarray

The problem in your approach is that the shape of each element varies and hence you cannot have a fixed shape. However you can define each element as type object and achieve what you are trying to do.

import numpy as np

tf = np.empty((500, 4, 1), dtype=object)

will produce

array([[[None],
        [None],
        [None],
        [None]],

       [[None],
        [None],
        [None],
        [None]],

       [[None],
        [None],
        [None],
        [None]],

       ...,
       [[None],
        [None],
        [None],
        [None]],

       [[None],
        [None],
        [None],
        [None]],

       [[None],
        [None],
        [None],
        [None]]], dtype=object)

Now add your constant initial element as a list to each of these array elements. You might be tempted to use fill() here, but that assigns the same list object to every array element, so modifying one element would change all of them. To give each element its own list, you cannot avoid iterating through the entire array.

for i,v in enumerate(tf):
    for j,w in enumerate(v):
        tf[i][j][0] = [[500.0,1.0]]

will produce

array([[[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       ...,
       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]]], dtype=object)

Now you can access each element separately. Use append or extend as you prefer.

tf[0][0][0].append([100,0.33])

will give

array([[[list([[500.0, 1.0], [100, 0.33]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       ...,
       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]],

       [[list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])],
        [list([[500.0, 1.0]])]]], dtype=object)

Only the initialization requires iterating through the array.
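
The initialization loop could also be written with np.ndindex, which yields every index tuple of a given shape. This is only a sketch on a smaller, hypothetical shape, and it checks that each cell really holds its own list:

```python
import numpy as np

# Smaller than (500, 4, 1), purely for illustration
tf = np.empty((3, 4, 1), dtype=object)

# Give every cell its own, freshly created list
for idx in np.ndindex(tf.shape):
    tf[idx] = [[500.0, 1.0]]

tf[0, 0, 0].append([100, 0.33])

print(tf[0, 0, 0])  # [[500.0, 1.0], [100, 0.33]]
print(tf[0, 1, 0])  # [[500.0, 1.0]]
```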

Good ways to expand a numpy ndarray?

There are the index tricks r_ and c_.

>>> import numpy as np
>>> a = np.array([[1, 2], [3, 4]])
>>> z = np.zeros((2, 3), dtype=a.dtype)
>>> np.c_[a, z]
array([[1, 2, 0, 0, 0],
       [3, 4, 0, 0, 0]])

If this is performance critical code, you might prefer to use the equivalent np.concatenate rather than the index tricks.

>>> np.concatenate((a,z), axis=1)
array([[1, 2, 0, 0, 0],
       [3, 4, 0, 0, 0]])

There are also np.resize and np.ndarray.resize, but they have some limitations (due to the way numpy lays out data in memory) so read the docstring on those ones. You will probably find that simply concatenating is better.
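
To illustrate the kind of limitation meant here, a small sketch: both functions work on the flattened data, so neither keeps the original 2x2 block in the top-left corner of the result.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# np.resize returns a new array and fills the extra space with repeated copies of a
r = np.resize(a, (2, 3))
print(r)
# [[1 2 3]
#  [4 1 2]]

# ndarray.resize works on the array itself and fills the extra space with zeros.
# refcheck=False skips the reference check, which is only safe when no other
# array or view shares this memory.
b = a.copy()
b.resize((2, 3), refcheck=False)
print(b)
# [[1 2 3]
#  [4 0 0]]
```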

By the way, when I've needed to do this I usually just do it the basic way you've already mentioned (create an array of zeros and assign the smaller array inside it). I don't see anything wrong with that!

python how to pad numpy array with zeros

Very simple, you create an array containing zeros using the reference shape:

result = np.zeros(b.shape)
# actually you can also use result = np.zeros_like(b) 
# but that also copies the dtype not only the shape

and then insert the array where you need it:

result[:a.shape[0],:a.shape[1]] = a

and voila you have padded it:

result
array([[ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 1.,  1.,  1.,  1.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.]])

You can also make it a bit more general if you define where your upper left element should be inserted:

result = np.zeros_like(b)
x_offset = 1  # 0 would be what you wanted
y_offset = 1  # 0 in your case
result[x_offset:a.shape[0]+x_offset,y_offset:a.shape[1]+y_offset] = a
result

array([[ 0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  1.,  1.,  1.,  1.,  1.],
       [ 0.,  1.,  1.,  1.,  1.,  1.]])

but then be careful that you don't have offsets bigger than allowed. With x_offset = 2, for example, the slice runs past the edge of result and the assignment fails with a shape mismatch.
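
NumPy also ships np.pad, which the answer above does not use. As a sketch, assuming a has shape (3, 5) and the target shape is (4, 6) as in the output shown, the pad widths are given per axis as (before, after) pairs:

```python
import numpy as np

a = np.ones((3, 5))

# ((rows before, rows after), (columns before, columns after))
padded = np.pad(a, ((0, 1), (0, 1)), mode="constant", constant_values=0)
print(padded.shape)  # (4, 6)

# The same target shape with an offset of one row and one column
shifted = np.pad(a, ((1, 0), (1, 0)), mode="constant", constant_values=0)
print(shifted.shape)  # (4, 6)
```

Because the pad widths are explicit, there is no offset that can run past the edge of the result.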

If you have an arbitrary number of dimensions, you can define one slice per dimension to insert the original array. I found it interesting to play around a bit and created a padding function that can pad (with offset) an arbitrarily shaped array, as long as the array and reference have the same number of dimensions and the offsets are not too big.

def pad(array, reference, offsets):
    """
    array: Array to be padded
    reference: Reference array with the desired shape
    offsets: list of offsets (number of elements must be equal to the dimension of the array)
    """
    # Create an array of zeros with the reference shape
    result = np.zeros(reference.shape)
    # Create a tuple of slices from offset to offset + shape in each dimension
    insertHere = tuple(slice(offsets[dim], offsets[dim] + array.shape[dim]) for dim in range(array.ndim))
    # Insert the array in the result at the specified offsets
    result[insertHere] = array
    return result

And some test cases:

import numpy as np

# 1 Dimension
a = np.ones(2)
b = np.ones(5)
offset = [3]
pad(a, b, offset)

# 3 Dimensions

a = np.ones((3,3,3))
b = np.ones((5,4,3))
offset = [1,0,0]
pad(a, b, offset)

Add single element to array in numpy

append() does not modify the array in place; it returns a new array consisting of the old array plus the appended element.

I think it's more normal to use the proper method for adding an element:

a = numpy.append(a, a[0])
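
A short sketch of what that assignment does (the name original is introduced here only for illustration): the name a is rebound to the new array, while any other reference still sees the old one.

```python
import numpy as np

a = np.array([1, 2, 3])
original = a            # second name for the same array

a = np.append(a, a[0])  # rebinds a to a new, longer array

print(a)         # [1 2 3 1]
print(original)  # [1 2 3]
```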

Concatenate a NumPy array to another NumPy array

In [1]: import numpy as np

In [2]: a = np.array([[1, 2, 3], [4, 5, 6]])

In [3]: b = np.array([[9, 8, 7], [6, 5, 4]])

In [4]: np.concatenate((a, b))
Out[4]: 
array([[1, 2, 3],
       [4, 5, 6],
       [9, 8, 7],
       [6, 5, 4]])

or this:

In [1]: a = np.array([1, 2, 3])

In [2]: b = np.array([4, 5, 6])

In [3]: np.vstack((a, b))
Out[3]: 
array([[1, 2, 3],
       [4, 5, 6]])
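
Note that the two calls differ for 1-D inputs. A minimal sketch of the difference, reusing the arrays from the second example:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# concatenate joins along the existing axis, so the result stays 1-D
joined = np.concatenate((a, b))
print(joined)  # [1 2 3 4 5 6]

# vstack treats each 1-D input as a row, so the result is 2-D
rows = np.vstack((a, b))
print(rows.shape)  # (2, 3)
```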
