How to Add an Extra Column to a Numpy Array

How do I add an extra column to a NumPy array?

I think a more straightforward, and faster, solution is to preallocate the result array and copy the original into it:

import numpy as np
N = 10
a = np.random.rand(N,N)
b = np.zeros((N,N+1))  # preallocate the result with one extra column
b[:,:-1] = a           # copy a into every column except the last

And timings:

In [23]: N = 10

In [24]: a = np.random.rand(N,N)

In [25]: %timeit b = np.hstack((a,np.zeros((a.shape[0],1))))
10000 loops, best of 3: 19.6 us per loop

In [27]: %timeit b = np.zeros((a.shape[0],a.shape[1]+1)); b[:,:-1] = a
100000 loops, best of 3: 5.62 us per loop
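
The comparison above can also be reproduced outside IPython with the standard timeit module. This is only a minimal sketch: the helper names with_hstack and with_prealloc are made up for illustration, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

N = 10
a = np.random.rand(N, N)

def with_hstack(a):
    # Build a zero column and join it to the right of a.
    return np.hstack((a, np.zeros((a.shape[0], 1))))

def with_prealloc(a):
    # Allocate the final array once, then copy a into all but the last column.
    b = np.zeros((a.shape[0], a.shape[1] + 1))
    b[:, :-1] = a
    return b

# Both approaches should produce the same array.
same = np.array_equal(with_hstack(a), with_prealloc(a))

# Time each approach; results depend on hardware and NumPy version.
runs = 10000
t_hstack = timeit.timeit(lambda: with_hstack(a), number=runs)
t_prealloc = timeit.timeit(lambda: with_prealloc(a), number=runs)
print(f"hstack:   {t_hstack / runs * 1e6:.2f} us per call")
print(f"prealloc: {t_prealloc / runs * 1e6:.2f} us per call")
```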

How to add column to numpy array

I think your problem is that you expect np.append to add the column in place. Because of how NumPy stores array data, it instead creates a copy of the joined arrays, as the docstring says:

Returns
-------
append : ndarray
A copy of `arr` with `values` appended to `axis`. Note that `append`
does not occur in-place: a new array is allocated and filled. If
`axis` is None, `out` is a flattened array.

So you need to assign the output to a name, e.g. all_data = np.append(...):

my_data = np.random.random((210,8)) #recfromcsv('LIAB.ST.csv', delimiter='\t')
new_col = my_data.sum(1)[...,None] # None keeps (n, 1) shape
new_col.shape
#(210,1)
all_data = np.append(my_data, new_col, 1)
all_data.shape
#(210,9)

Alternative ways:

all_data = np.hstack((my_data, new_col))
#or
all_data = np.concatenate((my_data, new_col), 1)
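
A small check of the "not in place" behavior, using a tiny made-up array instead of the CSV data (the variable names here are illustrative only):

```python
import numpy as np

my_data = np.arange(6.0).reshape(3, 2)
new_col = my_data.sum(axis=1)[:, None]   # shape (3, 1)

# The return value is a new array; my_data keeps its original shape.
np.append(my_data, new_col, axis=1)      # result discarded
shape_after_discard = my_data.shape

# Assigning the result keeps the extended array.
via_append = np.append(my_data, new_col, axis=1)
via_hstack = np.hstack((my_data, new_col))
via_concat = np.concatenate((my_data, new_col), axis=1)

print(shape_after_discard)   # (3, 2)
print(via_append.shape)      # (3, 3)
```

All three calls should return equal arrays here, and none of them modifies my_data.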

I believe that the only difference between these three functions (as well as np.vstack) is their default behavior when axis is unspecified:

  • concatenate defaults to axis = 0
  • hstack joins along axis = 1, unless the inputs are 1d, in which case it joins along axis = 0
  • vstack joins along axis = 0, after adding an axis if the inputs are 1d
  • append flattens its inputs when axis is None
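
The defaults listed above can be checked by looking at result shapes. A minimal sketch, with small arrays chosen only for illustration:

```python
import numpy as np

u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# 1-D inputs
shape_1d = {
    "concatenate": np.concatenate((u, v)).shape,  # (6,)   joined along axis 0
    "hstack": np.hstack((u, v)).shape,            # (6,)   1-D inputs joined along axis 0
    "vstack": np.vstack((u, v)).shape,            # (2, 3) each input becomes a row first
    "append": np.append(u, v).shape,              # (6,)
}

# 2-D inputs
shape_2d = {
    "concatenate": np.concatenate((A, B)).shape,  # (4, 2) default axis=0
    "hstack": np.hstack((A, B)).shape,            # (2, 4) joined along axis 1
    "vstack": np.vstack((A, B)).shape,            # (4, 2) joined along axis 0
    "append": np.append(A, B).shape,              # (8,)   flattened because axis is None
}

print(shape_1d)
print(shape_2d)
```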

Based on your comment, and looking more closely at your example code, I now believe that what you are probably trying to do is add a field to a record array. You imported both genfromtxt, which returns a structured array, and recfromcsv, which returns the subtly different record array (recarray). You used recfromcsv, so right now my_data is actually a recarray. That most likely means my_data.shape = (210,), since recarrays are 1d arrays of records, where each record is a tuple with the given dtype.

So you could try this:

import numpy as np
from numpy.lib.recfunctions import append_fields
x = np.random.random(10)
y = np.random.random(10)
z = np.random.random(10)
data = np.array( list(zip(x,y,z)), dtype=[('x',float),('y',float),('z',float)])
data = np.recarray(data.shape, data.dtype, buf=data)
data.shape
#(10,)
tot = data['x'] + data['y'] + data['z'] # sum(axis=1) won't work on recarray
tot.shape
#(10,)
all_data = append_fields(data, 'total', tot, usemask=False)
all_data
#array([(0.4374783740738456 , 0.04307289878861764, 0.021176067323686598, 0.5017273401861498),
# (0.07622262416466963, 0.3962146058689695 , 0.27912715826653534 , 0.7515643883001745),
# (0.30878532523061153, 0.8553768789387086 , 0.9577415585116588 , 2.121903762680979 ),
# (0.5288343561208022 , 0.17048864443625933, 0.07915689716226904 , 0.7784798977193306),
# (0.8804269791375121 , 0.45517504750917714, 0.1601389248542675 , 1.4957409515009568),
# (0.9556552723429782 , 0.8884504475901043 , 0.6412854758843308 , 2.4853911958174133),
# (0.0227638618687922 , 0.9295332854783015 , 0.3234597575660103 , 1.275756904913104 ),
# (0.684075052174589 , 0.6654774682866273 , 0.5246593820025259 , 1.8742119024637423),
# (0.9841793718333871 , 0.5813955915551511 , 0.39577520705133684 , 1.961350170439875 ),
# (0.9889343795296571 , 0.22830104497714432, 0.20011292764078448 , 1.4173483521475858)],
# dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('total', '<f8')])
all_data.shape
#(10,)
all_data.dtype.names
#('x', 'y', 'z', 'total')

How can I add a column to a numpy array

  • You can use np.insert to insert a column of 1 values at index 0

    x = np.array([[1, 2], [3, 4], [5, 6]])
    new_x = np.insert(x, 0, 1, axis=1)
  • You can use np.append to place your array to the right of a column of 1 values

    x = np.array([[1, 2], [3, 4], [5, 6]])
    ones = np.array([[1]] * len(x))
    new_x = np.append(ones, x, axis=1)

Both will give you the expected result:

[[1 1 2]
 [1 3 4]
 [1 5 6]]

Add an extra column to ndarray in python

This solved my problem. I used np.column_stack.

import numpy as np

feature_matrix = [[0.1, 0.3], [0.7, 0.8], [0.8, 0.8]]
position = [10, 20, 30]
# position becomes the first column of the result
feature_matrix = np.column_stack((position, feature_matrix))
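
For what it's worth, np.column_stack treats a 1-D input as a column, which is why the plain list of positions works here. A short sketch of that behavior, with an explicit np.hstack equivalent for comparison (the names stacked and explicit are illustrative):

```python
import numpy as np

feature_matrix = np.array([[0.1, 0.3], [0.7, 0.8], [0.8, 0.8]])
position = np.array([10, 20, 30])           # 1-D, shape (3,)

# column_stack turns the 1-D array into a (3, 1) column before joining.
stacked = np.column_stack((position, feature_matrix))

# The same result with hstack needs the column shape made explicit.
explicit = np.hstack((position[:, None], feature_matrix))

print(stacked.shape)   # (3, 3)
print(stacked.dtype)   # float64
```

Note that the integer positions are promoted to float to match the rest of the matrix.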

How I can add column to matrix in numpy

You do not need to predefine an array of ones. You can use the numpy.insert function directly:

arr = np.arange(25).reshape(5,5)
arr_with_ones = np.insert(arr, 0, 1, axis=1)

np.insert(arr, 0, 1, axis=1) inserts the value 1 at index 0 along axis=1 (the columns of a 2D array) of the array arr.

Output:

[[ 1  0  1  2  3  4]
 [ 1  5  6  7  8  9]
 [ 1 10 11 12 13 14]
 [ 1 15 16 17 18 19]
 [ 1 20 21 22 23 24]]
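
The same call can place the column elsewhere or use a different value per row. A minimal sketch on a smaller array (the names ones_last and per_row are made up for this example):

```python
import numpy as np

arr = np.arange(6).reshape(3, 2)

# Passing the number of columns as the index places the new column last.
ones_last = np.insert(arr, arr.shape[1], 1, axis=1)

# A 1-D sequence with one value per row becomes the inserted column.
per_row = np.insert(arr, 1, [7, 8, 9], axis=1)

print(ones_last)
# [[0 1 1]
#  [2 3 1]
#  [4 5 1]]
print(per_row)
# [[0 7 1]
#  [2 8 3]
#  [4 9 5]]
```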

Add new column to Numpy Array as a function of the rows

You can generally join arrays with np.concatenate when they have the same number of dimensions. You can guarantee that sum keeps the number of dimensions, regardless of axis, by passing the keepdims argument:

np.concatenate((M.sum(axis=1, keepdims=True), M), axis=1)

This is equivalent to

np.concatenate((M.sum(1)[:, None], M), axis=1)
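
To make the equivalence concrete, here is a small sketch with a made-up 2x3 matrix M. Without keepdims or [:, None], M.sum(1) would have shape (2,) and could not be concatenated with the 2-D M along axis=1.

```python
import numpy as np

M = np.arange(6).reshape(2, 3)

row_sums_keep = M.sum(axis=1, keepdims=True)   # shape (2, 1)
row_sums_none = M.sum(1)[:, None]              # shape (2, 1)

with_keepdims = np.concatenate((row_sums_keep, M), axis=1)
with_newaxis = np.concatenate((row_sums_none, M), axis=1)

print(with_keepdims)
# [[ 3  0  1  2]
#  [12  3  4  5]]
```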

Python: Add a column to numpy 2d array

Let me just throw in a very simple example with much smaller size. The principle should be the same.

a = np.zeros((6,2))
array([[ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.],
       [ 0.,  0.]])
b = np.ones((6,1))
array([[ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.],
       [ 1.]])

np.hstack((a,b))
array([[ 0.,  0.,  1.],
       [ 0.,  0.,  1.],
       [ 0.,  0.,  1.],
       [ 0.,  0.,  1.],
       [ 0.,  0.,  1.],
       [ 0.,  0.,  1.]])

How to add multiple extra columns to a NumPy array

An alternative to the concatenate approach is to allocate the result array first and copy the values into it:

In [483]: a = np.arange(300).reshape(100,3)
In [484]: b=np.array([8,9])
In [485]: res = np.zeros((100,5),int)
In [486]: res[:,:3]=a
In [487]: res[:,3:]=b

Sample timings:

In [488]: %%timeit
...: res = np.zeros((100,5),int)
...: res[:,:3]=a
...: res[:,3:]=b
...:
...:
6.11 µs ± 20.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [491]: timeit np.concatenate((a, b.repeat(100).reshape(2,-1).T),1)
7.74 µs ± 15.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [164]: timeit np.concatenate([a, np.ones([a.shape[0],1], dtype=int).dot(np.array([b]))], axis=1)
8.58 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
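
The preallocation approach relies on broadcasting: assigning the 1-D b to a two-column slice repeats it on every row. A minimal sketch that checks this against an explicitly tiled reference (np.tile is used here only for the comparison, it is not part of the timings above):

```python
import numpy as np

a = np.arange(300).reshape(100, 3)
b = np.array([8, 9])

# Preallocate, then fill by slice; b is broadcast across all 100 rows.
res = np.zeros((a.shape[0], a.shape[1] + b.size), dtype=a.dtype)
res[:, :a.shape[1]] = a
res[:, a.shape[1]:] = b

# Reference result built with concatenate and an explicitly tiled block.
ref = np.concatenate((a, np.tile(b, (a.shape[0], 1))), axis=1)

print(res.shape)   # (100, 5)
print(res[0])      # [0 1 2 8 9]
```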

How to insert new column using numpy and conditionally add values to it

import numpy as np

rgb = np.random.uniform(0, 255, (100, 100, 3)).astype("uint")
# example position with [0,0,0]
rgb[0, 0, :] = 0

# mask where condition is satisfied
msk = np.sum(rgb, axis=2)==0

# Prep the alpha layer
nrows, ncols, _ = rgb.shape
alpha = np.ones((nrows, ncols), dtype="uint") * 255

# Make positions that satisfy condition equal to 0
alpha[msk]= 0

# add extra alpha layer
rgba = np.dstack([rgb, alpha])
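
The same idea on a tiny, hand-written image makes the result easy to inspect. This sketch swaps in a slightly different phrasing from the code above: the mask is built with an all-channels-zero test and alpha with np.where, and the 2x2 pixel values are made up for illustration.

```python
import numpy as np

# Tiny 2x2 "image": one pure-black pixel, three non-black pixels.
rgb = np.array([[[0, 0, 0], [10, 20, 30]],
                [[255, 0, 0], [0, 0, 1]]], dtype=np.uint8)

# True where all three channels are zero.
msk = (rgb == 0).all(axis=2)

# Alpha is 0 for masked pixels and 255 elsewhere.
alpha = np.where(msk, 0, 255).astype(rgb.dtype)

# Stack alpha as a fourth channel along the last axis.
rgba = np.dstack([rgb, alpha])

print(rgba.shape)      # (2, 2, 4)
print(rgba[0, 0])      # [0 0 0 0]
print(rgba[0, 1])      # [ 10  20  30 255]
```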

