Adding a New Column to Matrix Error


As @thelatemail pointed out, the $ operator cannot be used to subset a matrix. This is because a matrix is just a single vector with a dimension attribute. When you used $ to try to add a new column, R converted your matrix to the lowest structure where $ can be used on the vector, which is a list.

The function you want is cbind() (column bind). Suppose I have the matrix m:

(m <- matrix(51:70, 4))
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   51   55   59   63   67
# [2,]   52   56   60   64   68
# [3,]   53   57   61   65   69
# [4,]   54   58   62   66   70

To add a new column from a vector called labels, we can do

labels <- 1:4
cbind(m, newColumn = labels)
#                     newColumn
# [1,] 51 55 59 63 67         1
# [2,] 52 56 60 64 68         2
# [3,] 53 57 61 65 69         3
# [4,] 54 58 62 66 70         4

Error when adding a new column: requires numeric/complex matrix/vector arguments

You can build your dataset as a matrix or a data frame:

Data frame:

df <- data.frame(var1 = c(1:5))

In this case, you can add a column by:

df$var2 <- c(6:10)
df

  var1 var2
1    1    6
2    2    7
3    3    8
4    4    9
5    5   10

Matrix:

mx <- matrix(1:12, 4, 3)

For a matrix, use cbind() instead:

mx <- cbind(mx, 13:16)
mx

     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

The main difference between them, in a few words, is that a matrix can hold only one type of data. For example, every element has to be numeric, or every element has to be character (you can check the class of a value with the function class()). More than one type cannot coexist in the same matrix.

Data frames do not have this restriction. Use a data frame if the columns (variables) can be expected to be of different types (numeric, character, logical, etc.).

Matrices are better when you want to do math operations. Data frames can be more useful if you often refer to columns by name (e.g. df$var2).

You can convert a data frame to a matrix, and the column names of the data frame will be kept in the matrix. Remember one difference, though: with a data frame you can run an operation (e.g. a mean) on the second column with mean(df$var2). With a matrix, you have to use indexing: mean(mx2[, 2]).

mx2 <- as.matrix(df)
mx2
     var1 var2
[1,]    1    6
[2,]    2    7
[3,]    3    8
[4,]    4    9
[5,]    5   10

class(mx2)
# [1] "matrix" "array"   (R >= 4.0.0; earlier versions print only "matrix")

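When converting from a data frame to a matrix with as.matrix(), be aware of coercion: the result is numeric only if every column is numeric (or logical). If any column is character or a factor, the whole matrix becomes character. If you want all the variables of your data frame converted to numeric mode and then bound together as the columns of a matrix, use data.matrix() instead.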

Add new column conditional in Matrix

If I understand you correctly, you can do the same thing with the pandas string-processing methods:

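import pandas as pd

df = pd.DataFrame({'genre': ['Action', 'Drama', 'Drama ',
                             ' Drama', 'Western', 'Other Drama', 10]})

# str.find returns 0 for a match at the start and -1 for no match, so test >= 0;
# non-string entries such as 10 give NaN, which compares as False
df['Drama_or_not'] = df['genre'].str.find('Drama') >= 0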

This should address your error as well:

"argument of type 'float' is not iterable".

This error arises in your fourth line, I imagine, because genres is a float rather than an iterable object (e.g., strings or lists).
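The question's code is not shown here, so the following is only a guess at how the error comes about: a minimal sketch, assuming the check was written with Python's `in` operator and that one of the values was a float such as NaN. The helper name `contains_drama` is hypothetical.

```python
def contains_drama(value):
    # Works for strings and lists, but not for floats
    return 'Drama' in value

ok = contains_drama('Other Drama')   # a string supports "in"

try:
    contains_drama(float('nan'))     # a missing value stored as a float
    message = None
except TypeError as exc:
    # The exact wording differs between Python versions,
    # but it names the offending type, 'float'
    message = str(exc)
```

The pandas `.str` methods avoid this because they return NaN for non-string entries instead of raising.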

You should be careful, though: if you have float values in a column that is meant to hold only strings, you should preferably examine and clean up the data first, so that you understand why this is the case.
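One way to inspect such a column before building the flag is sketched below. It assumes pandas is installed and uses made-up sample data (including a NaN added for illustration); `str.contains` with `na=False` is swapped in here as an alternative to `str.find`, not something taken from the question.

```python
import pandas as pd

df = pd.DataFrame({'genre': ['Action', 'Drama', 'Drama ', ' Drama',
                             'Western', 'Other Drama', 10, float('nan')]})

# Rows whose genre is not a string (here the integer 10 and the NaN float)
is_string = df['genre'].map(lambda v: isinstance(v, str))
non_strings = df[~is_string]

# na=False turns the NaN produced for non-string entries into False
df['Drama_or_not'] = df['genre'].str.contains('Drama', na=False)
```

Looking at `non_strings` first shows which rows need cleaning; the `na=False` argument only decides how those rows are flagged, it does not fix them.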

How to append a vector as a column in R matrix?

Use cbind():

cbind(c(1,2), matrix(1:6, nrow=2))

For bigger data, suppose your matrix is stored as m and you have a vector my_vector that you want to add as a column in front of it. The command would be

new_m <- cbind(my_vector, m)

Make sure the length of your vector matches the number of rows in your matrix.

In case you want to add rows instead of columns, the command is called rbind and is used in exactly the same way.

Add column name to the column of a matrix

test_matrix <- matrix(ncol = 10, nrow = 100)

colnames(test_matrix) <- paste0("Column", seq(ncol(test_matrix)))

> head(test_matrix)
     Column1 Column2 Column3 Column4 Column5 Column6 Column7 Column8 Column9 Column10
[1,]      NA      NA      NA      NA      NA      NA      NA      NA      NA       NA
[2,]      NA      NA      NA      NA      NA      NA      NA      NA      NA       NA
[3,]      NA      NA      NA      NA      NA      NA      NA      NA      NA       NA
[4,]      NA      NA      NA      NA      NA      NA      NA      NA      NA       NA
[5,]      NA      NA      NA      NA      NA      NA      NA      NA      NA       NA
[6,]      NA      NA      NA      NA      NA      NA      NA      NA      NA       NA

How do I add an extra column to a NumPy array?

I think a more straightforward solution, and a faster one to boot, is the following:

import numpy as np
N = 10
a = np.random.rand(N,N)
b = np.zeros((N,N+1))
b[:,:-1] = a

And timings:

In [23]: N = 10

In [24]: a = np.random.rand(N,N)

In [25]: %timeit b = np.hstack((a,np.zeros((a.shape[0],1))))
10000 loops, best of 3: 19.6 us per loop

In [27]: %timeit b = np.zeros((a.shape[0],a.shape[1]+1)); b[:,:-1] = a
100000 loops, best of 3: 5.62 us per loop
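As a quick sanity check on the two approaches timed above, the sketch below confirms that they build the same array. It also shows `np.column_stack`, which is not used in the original answer, for the case where the new column is an actual vector rather than zeros; the name `new_col` is made up for illustration.

```python
import numpy as np

N = 10
a = np.random.rand(N, N)

# Approach 1: stack a column of zeros onto the right of a
b_hstack = np.hstack((a, np.zeros((a.shape[0], 1))))

# Approach 2: preallocate the larger array, then copy a into it
b_prealloc = np.zeros((a.shape[0], a.shape[1] + 1))
b_prealloc[:, :-1] = a

same = np.array_equal(b_hstack, b_prealloc)

# Appending a 1-D vector as the last column
new_col = np.arange(N)
c = np.column_stack((a, new_col))
```

Timings like the ones above depend on array size and NumPy version, so it is worth re-running `%timeit` on your own data before choosing between the two.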

