How do I add an extra column to a NumPy array?
I think a more straightforward solution, and a faster one, is to preallocate the larger array and copy the original into it:
import numpy as np
N = 10
a = np.random.rand(N,N)
b = np.zeros((N,N+1))
b[:,:-1] = a
And timings:
In [23]: N = 10
In [24]: a = np.random.rand(N,N)
In [25]: %timeit b = np.hstack((a,np.zeros((a.shape[0],1))))
10000 loops, best of 3: 19.6 us per loop
In [27]: %timeit b = np.zeros((a.shape[0],a.shape[1]+1)); b[:,:-1] = a
100000 loops, best of 3: 5.62 us per loop
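As a minimal sketch, the preallocation approach can be wrapped in a small helper and checked against the np.hstack version. The name add_zero_column is hypothetical, and the helper assumes a 2D input:

```python
import numpy as np

def add_zero_column(a):
    """Return a copy of the 2D array `a` with one extra zero column on the right."""
    # Allocate the target once, then copy `a` into all but the last column.
    b = np.zeros((a.shape[0], a.shape[1] + 1), dtype=a.dtype)
    b[:, :-1] = a
    return b

a = np.random.rand(4, 3)
b = add_zero_column(a)

# Reference result built with hstack, as in the first timing above.
ref = np.hstack((a, np.zeros((a.shape[0], 1))))

print(b.shape)                 # (4, 4)
print(np.array_equal(b, ref))  # True
```

Both versions build the same array; the timings above only suggest that the preallocated one is cheaper for small inputs.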
How to add column to numpy array
I think that your problem is that you are expecting np.append to add the column in-place, but what it does, because of how numpy data is stored, is create a copy of the joined arrays:
Returns
-------
append : ndarray
A copy of `arr` with `values` appended to `axis`. Note that `append`
does not occur in-place: a new array is allocated and filled. If
`axis` is None, `out` is a flattened array.
so you need to save the output, all_data = np.append(...):
my_data = np.random.random((210,8)) #recfromcsv('LIAB.ST.csv', delimiter='\t')
new_col = my_data.sum(1)[...,None] # None keeps (n, 1) shape
new_col.shape
#(210,1)
all_data = np.append(my_data, new_col, 1)
all_data.shape
#(210,9)
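A small check, on made-up data, that np.append leaves its input untouched and returns a new array, which is why the result has to be assigned:

```python
import numpy as np

my_data = np.arange(6.0).reshape(3, 2)      # small stand-in for the real data
new_col = my_data.sum(axis=1)[:, None]      # shape (3, 1)

result = np.append(my_data, new_col, axis=1)

print(my_data.shape)                        # (3, 2): the input is unchanged
print(result.shape)                         # (3, 3)
print(np.shares_memory(my_data, result))    # False: result is a fresh copy
```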
Alternative ways:
all_data = np.hstack((my_data, new_col))
#or
all_data = np.concatenate((my_data, new_col), 1)
I believe that the only difference between these three functions (as well as np.vstack) is their default behavior when axis is unspecified:
concatenate assumes axis = 0
hstack assumes axis = 1, unless the inputs are 1d, then axis = 0
vstack assumes axis = 0, after adding an axis if the inputs are 1d
append flattens the array
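The default behaviors listed above can be seen by comparing result shapes. This is only an illustrative sketch on small made-up arrays:

```python
import numpy as np

a = np.array([1, 2, 3])             # 1d inputs
b = np.array([4, 5, 6])
A = np.array([[1, 2], [3, 4]])      # 2d inputs
B = np.array([[5, 6], [7, 8]])

print(np.concatenate((A, B)).shape)  # (4, 2): joined along axis 0
print(np.hstack((A, B)).shape)       # (2, 4): joined along axis 1
print(np.hstack((a, b)).shape)       # (6,):   1d inputs are joined along axis 0
print(np.vstack((a, b)).shape)       # (2, 3): 1d inputs become rows first
print(np.append(A, B).shape)         # (8,):   flattened when axis is omitted
```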
Based on your comment, and looking more closely at your example code, I now believe that what you are probably looking to do is add a field to a record array. You imported both genfromtxt, which returns a structured array, and recfromcsv, which returns the subtly different record array (recarray). You used recfromcsv, so right now my_data is actually a recarray, which means that most likely my_data.shape = (210,), since recarrays are 1d arrays of records, where each record is a tuple with the given dtype.
So you could try this:
import numpy as np
from numpy.lib.recfunctions import append_fields
x = np.random.random(10)
y = np.random.random(10)
z = np.random.random(10)
data = np.array( list(zip(x,y,z)), dtype=[('x',float),('y',float),('z',float)])
data = np.recarray(data.shape, data.dtype, buf=data)
data.shape
#(10,)
tot = data['x'] + data['y'] + data['z'] # sum(axis=1) won't work on recarray
tot.shape
#(10,)
all_data = append_fields(data, 'total', tot, usemask=False)
all_data
#array([(0.4374783740738456 , 0.04307289878861764, 0.021176067323686598, 0.5017273401861498),
# (0.07622262416466963, 0.3962146058689695 , 0.27912715826653534 , 0.7515643883001745),
# (0.30878532523061153, 0.8553768789387086 , 0.9577415585116588 , 2.121903762680979 ),
# (0.5288343561208022 , 0.17048864443625933, 0.07915689716226904 , 0.7784798977193306),
# (0.8804269791375121 , 0.45517504750917714, 0.1601389248542675 , 1.4957409515009568),
# (0.9556552723429782 , 0.8884504475901043 , 0.6412854758843308 , 2.4853911958174133),
# (0.0227638618687922 , 0.9295332854783015 , 0.3234597575660103 , 1.275756904913104 ),
# (0.684075052174589 , 0.6654774682866273 , 0.5246593820025259 , 1.8742119024637423),
# (0.9841793718333871 , 0.5813955915551511 , 0.39577520705133684 , 1.961350170439875 ),
# (0.9889343795296571 , 0.22830104497714432, 0.20011292764078448 , 1.4173483521475858)],
# dtype=[('x', '<f8'), ('y', '<f8'), ('z', '<f8'), ('total', '<f8')])
all_data.shape
#(10,)
all_data.dtype.names
#('x', 'y', 'z', 'total')
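For comparison, here is a rough sketch of adding a field by hand, without append_fields. This is not what append_fields does internally; it only illustrates that adding a field means allocating a new array with a wider dtype and copying each field across. It assumes a simple, non-nested dtype, and the data is made up:

```python
import numpy as np

data = np.zeros(3, dtype=[('x', float), ('y', float)])
data['x'] = [1.0, 2.0, 3.0]
data['y'] = [10.0, 20.0, 30.0]

# Build a dtype with one extra field, then copy the existing fields over.
new_dtype = np.dtype(data.dtype.descr + [('total', float)])
out = np.empty(data.shape, dtype=new_dtype)
for name in data.dtype.names:
    out[name] = data[name]
out['total'] = data['x'] + data['y']

print(out.dtype.names)   # ('x', 'y', 'total')
print(out.shape)         # (3,)
```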
How can I add a column to a numpy array
You can use np.insert:
x = np.array([[1, 2], [3, 4], [5, 6]])
new_x = np.insert(x, 0, 1, axis=1)
Or you can use np.append to add your array to the right of a column of 1 values:
ones = np.array([[1]] * len(x))
new_x = np.append(ones, x, axis=1)
Both will give you the expected result
[[1 1 2]
[1 3 4]
[1 5 6]]
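A quick self-contained check that the two approaches agree. The names via_insert and via_append are illustrative only, and np.ones is used here in place of the list-multiplication idiom:

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])

# Insert the scalar 1 as a new column at index 0.
via_insert = np.insert(x, 0, 1, axis=1)

# Build an explicit column of ones and join x to its right.
ones = np.ones((x.shape[0], 1), dtype=x.dtype)
via_append = np.append(ones, x, axis=1)

print(np.array_equal(via_insert, via_append))  # True
```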
Add an extra column to ndarray in python
This solved my problem. I used np.column_stack.
feature_matrix = [[0.1, 0.3], [0.7, 0.8], [0.8, 0.8]]
position = [10, 20, 30]
feature_matrix = np.column_stack((position, feature_matrix))
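np.column_stack treats a 1D input as a column, so position does not need reshaping first. A small sketch comparing it with np.hstack, where the reshape has to be done by hand:

```python
import numpy as np

feature_matrix = np.array([[0.1, 0.3], [0.7, 0.8], [0.8, 0.8]])
position = np.array([10, 20, 30])           # 1D, shape (3,)

stacked = np.column_stack((position, feature_matrix))

# Equivalent with hstack, after turning position into a (3, 1) column.
same = np.hstack((position[:, None], feature_matrix))

print(stacked.shape)                  # (3, 3)
print(np.array_equal(stacked, same))  # True
```

Note that the integer positions are converted to floats, because the result has a single dtype.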
How I can add column to matrix in numpy
You do not need to predefine a ones array. You can use the numpy.insert function directly:
arr = np.array(range(25)).reshape(5,5)
arr_with_ones = np.insert(arr, 0, 1, axis=1)
np.insert(arr, 0, 1, axis=1) inserts value=1 at index 0 along axis=1 (the columns of a 2D array) of array arr.
output:
[[ 1 0 1 2 3 4]
[ 1 5 6 7 8 9]
[ 1 10 11 12 13 14]
[ 1 15 16 17 18 19]
[ 1 20 21 22 23 24]]
Add new column to Numpy Array as a function of the rows
You can generally append arrays to each other using np.concatenate when they have the same number of dimensions. You can guarantee that sum will retain dimensionality regardless of axis using the keepdims argument:
np.concatenate((M.sum(axis=1, keepdims=True), M), axis=1)
This is equivalent to
np.concatenate((M.sum(1)[:, None], M), axis=1)
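A short worked example on a small made-up matrix M, showing that both forms produce the same result:

```python
import numpy as np

M = np.arange(6).reshape(2, 3)              # [[0, 1, 2], [3, 4, 5]]

row_sums = M.sum(axis=1, keepdims=True)     # shape (2, 1) instead of (2,)
with_keepdims = np.concatenate((row_sums, M), axis=1)
with_newaxis = np.concatenate((M.sum(1)[:, None], M), axis=1)

print(row_sums.shape)                                # (2, 1)
print(with_keepdims)
print(np.array_equal(with_keepdims, with_newaxis))   # True
```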
Python: Add a column to numpy 2d array
Let me just throw in a very simple example with much smaller size. The principle should be the same.
a = np.zeros((6,2))
array([[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.]])
b = np.ones((6,1))
array([[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.],
[ 1.]])
np.hstack((a,b))
array([[ 0., 0., 1.],
[ 0., 0., 1.],
[ 0., 0., 1.],
[ 0., 0., 1.],
[ 0., 0., 1.],
[ 0., 0., 1.]])
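The hstack call works here because b already has shape (6, 1). As a sketch, assuming the new column starts out as a 1D array (col is a hypothetical name), it needs to be reshaped into a column first:

```python
import numpy as np

a = np.zeros((6, 2))
col = np.arange(6.0)                    # 1D, shape (6,)

# reshape(-1, 1) turns the 1D array into a (6, 1) column so the
# number of dimensions matches `a`.
result = np.hstack((a, col.reshape(-1, 1)))

print(result.shape)                     # (6, 3)
```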
How to add multiple extra columns to a NumPy array
An alternative to the concatenate approach is to make a recipient array and copy values into it:
In [483]: a = np.arange(300).reshape(100,3)
In [484]: b=np.array([8,9])
In [485]: res = np.zeros((100,5),int)
In [486]: res[:,:3]=a
In [487]: res[:,3:]=b
sample timings
In [488]: %%timeit
...: res = np.zeros((100,5),int)
...: res[:,:3]=a
...: res[:,3:]=b
...:
...:
6.11 µs ± 20.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [491]: timeit np.concatenate((a, b.repeat(100).reshape(2,-1).T),1)
7.74 µs ± 15.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [164]: timeit np.concatenate([a, np.ones([a.shape[0],1], dtype=int).dot(np.array([b]))], axis=1)
8.58 µs ± 160 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
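The recipient-array version relies on broadcasting: assigning the 1D array b to a block of columns repeats it on every row. A smaller sketch with the sizes derived from the inputs, checked against an explicit np.tile:

```python
import numpy as np

a = np.arange(12).reshape(4, 3)
b = np.array([8, 9])

n_rows, n_cols = a.shape
res = np.zeros((n_rows, n_cols + b.size), dtype=a.dtype)
res[:, :n_cols] = a     # copy the original columns
res[:, n_cols:] = b     # b is broadcast across all rows

# Same result built by repeating b explicitly.
expected = np.concatenate((a, np.tile(b, (n_rows, 1))), axis=1)

print(res.shape)                      # (4, 5)
print(np.array_equal(res, expected))  # True
```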
How to insert new column using numpy and conditionally add values to it
rgb = np.random.uniform(0, 255, (100, 100, 3)).astype("uint")
# example position with [0,0,0]
rgb[0, 0, :] = 0
# mask where condition is satisfied
msk = np.sum(rgb, axis=2)==0
# Prep the alpha layer
nrows, ncols, _ = rgb.shape
alpha = np.ones((nrows, ncols), dtype="uint") * 255
# Make positions that satisfy condition equal to 0
alpha[msk]= 0
# add extra alpha layer
rgba = np.dstack([rgb, alpha])
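A smaller variant of the same idea, written as a sketch with uint8 data and np.where instead of masked assignment. The image size and the use of randint (so that only the chosen pixel is black) are assumptions made for the example:

```python
import numpy as np

# Random image with values in 1..255, so no pixel is black by accident.
rgb = np.random.randint(1, 256, size=(4, 5, 3), dtype=np.uint8)
rgb[0, 0, :] = 0                            # one known black pixel

# True where all three channels are zero.
mask = (rgb == 0).all(axis=2)

# Alpha is 0 for black pixels and 255 everywhere else.
alpha = np.where(mask, 0, 255).astype(np.uint8)

# Append alpha as a fourth channel along the last axis.
rgba = np.concatenate((rgb, alpha[..., None]), axis=2)

print(rgba.shape)       # (4, 5, 4)
print(rgba[0, 0])       # [0 0 0 0]
```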