Convert a Row of a Data Frame to a Simple Vector in R

Convert a row of a data frame to vector

When you extract a single row from a data frame you get a one-row data frame. Convert it to a numeric vector:

as.numeric(df[1,])

As @Roland suggests, unlist(df[1,]) will convert the one-row data frame to a numeric vector without dropping the names. Therefore unname(unlist(df[1,])) is another, slightly more explicit way to get to the same result.

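As @Josh comments below, if you have a not-completely-numeric (alphabetic, factor, mixed ...) data frame, you need as.character(df[1,]) instead. Note that for factor columns this returns the underlying integer codes as strings rather than the labels; as.character(unlist(df[1,])) returns the labels.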
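The three numeric variants above can be compared side by side. This is a minimal sketch that assumes a small all-numeric data frame; the name df and its values are made up for illustration.

```r
# Hypothetical all-numeric data frame (not from the original question)
df <- data.frame(a = c(1.5, 2), b = c(3, 4), c = c(5, 6))

row1 <- df[1, ]                     # still a data frame with one row

v_plain   <- as.numeric(row1)       # numeric vector, names dropped
v_named   <- unlist(row1)           # numeric vector, column names kept
v_unnamed <- unname(unlist(row1))   # names stripped explicitly

v_plain
v_named
v_unnamed
```

Which one to pick mostly depends on whether the column names are still useful after the conversion.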

Convert a dataframe to a vector (by rows)

You can try as.vector(t(test)). Please note that, if you want to do it by columns you should use unlist(test).
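To see the difference between the row-wise and column-wise orderings, here is a small sketch. The data frame test below is a made-up stand-in for the one in the question, and use.names = FALSE is added only to keep the result unnamed.

```r
# Hypothetical 3 x 2 integer data frame standing in for `test`
test <- data.frame(x = 1:3, y = 4:6)

# Row by row: transpose first, then flatten the matrix column-wise
by_rows <- as.vector(t(test))

# Column by column: flatten the list of columns
by_cols <- unlist(test, use.names = FALSE)

by_rows  # 1 4 2 5 3 6
by_cols  # 1 2 3 4 5 6
```

Keep in mind that t() goes through as.matrix(), so a data frame with mixed column types is coerced to a single type (usually character) before flattening.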

Getting a row from a data frame as a vector in R

Data frames created by importing data from an external source will have their character columns converted to factors by default in R versions before 4.0.0. If you do not want this, set stringsAsFactors = FALSE (from R 4.0.0 on, this is already the default).

In this case to extract a row or a column as a vector you need to do something like this:

as.numeric(as.vector(DF[1,]))

or like this

as.character(as.vector(DF[1,]))
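The effect of stringsAsFactors can be checked directly. The sketch below builds the same made-up data twice, with the argument set explicitly so the result does not depend on the R version; the names DF_fac and DF_chr are hypothetical.

```r
# Same hypothetical data, once with factors and once with plain strings
DF_fac <- data.frame(name = c("a", "b"), score = c("10", "20"),
                     stringsAsFactors = TRUE)
DF_chr <- data.frame(name = c("a", "b"), score = c("10", "20"),
                     stringsAsFactors = FALSE)

# Column types differ between the two versions
fac_cols <- vapply(DF_fac, is.factor, logical(1))
chr_cols <- vapply(DF_chr, is.character, logical(1))

# With character columns, a row flattens to a character vector directly
row_chr <- unlist(DF_chr[1, ], use.names = FALSE)
row_chr
```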

Convert data.frame column to a vector?

I'm going to attempt to explain this without making any mistakes, but I'm betting this will attract a clarification or two in the comments.

A data frame is a list. When you subset a data frame using the name of a column and [, what you're getting is a sublist (or a sub data frame). If you want the actual atomic column, you could use [[, or somewhat confusingly (to me) you could do aframe[,2] which returns a vector, not a sublist.

So try running this sequence and maybe things will be clearer:

avector <- as.vector(aframe['a2'])
class(avector)

avector <- aframe[['a2']]
class(avector)

avector <- aframe[,2]
class(avector)
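The question's aframe is not shown, so here is a self-contained sketch with a made-up aframe that makes the difference between the subsetting forms visible. Only the column name a2 is taken from the code above; the values are assumptions.

```r
# Hypothetical data frame; only the column name a2 comes from the answer
aframe <- data.frame(a1 = c(1, 2, 3), a2 = c(4, 5, 6), a3 = c(7, 8, 9))

sub_df <- aframe["a2"]     # single bracket with a name: one-column data frame
col_a  <- aframe[["a2"]]   # double bracket: the atomic column itself
col_b  <- aframe[, 2]      # matrix-style indexing: also the atomic column
col_c  <- aframe$a2        # dollar sign: same as [[ for an exact name

class(sub_df)  # "data.frame"
class(col_a)   # "numeric"
```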

Using R convert data.frame to simple vector

see ?unlist

Given a list structure x, unlist simplifies it to produce a vector
which contains all the atomic components which occur in x.

unlist(v.row)
[1] 177 165 177 177 177 177 145 132 126 132 132 132 126 120 145 167 167 167
167 165 177 177 177 177

EDIT

You can do it with as.vector also, but you need to provide the correct mode:

 as.vector(v.row,mode='numeric')
[1] 177 165 177 177 177 177 145 132 126 132 132 132 126 120 145 167 167
167 167 165 177 177 177 177

Get a row in data.frame as a vector where each element is a string

Use unlist and then as.character

as.character(unlist(test[1, ]))
#[1] "no" "no" "no" "yes" "no" "no" "yes"

test[1, ] is still a data frame, and calling as.character directly on a data frame does not give the expected result. unlist first flattens the row into a vector, and as.character then converts that vector to character.

data

test <- structure(list(A = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "no", class = "factor"), 
T = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "no", class = "factor"),
L = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "no", class = "factor"),
B = structure(c(2L, 1L, 1L, 2L, 1L, 1L), .Label = c("no",
"yes"), class = "factor"), E = structure(c(1L, 1L, 1L, 1L,
1L, 1L), .Label = "no", class = "factor"), X = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = "no", class = "factor"), D = structure(c(2L,
1L, 1L, 2L, 1L, 1L), .Label = c("no", "yes"), class = "factor")),
class = "data.frame", row.names = c("4", "7", "11", "12", "17", "27"))
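For a quicker experiment than the full dput output, the same idea can be tried on a smaller factor data frame. This is a sketch with made-up values; small_test and the vapply variant are illustrative additions, not part of the original answer.

```r
# Hypothetical two-row data frame of factors
small_test <- data.frame(
    A = factor(c("no", "no")),
    B = factor(c("yes", "no")),
    D = factor(c("yes", "no"))
)

# unlist combines the factors, as.character then returns their labels
row_chr <- as.character(unlist(small_test[1, ]))

# Alternative: convert each column separately, one string per column
row_chr_alt <- vapply(small_test[1, ], as.character, character(1),
                      USE.NAMES = FALSE)

row_chr
```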

Convert a row into a combine, c() as a vector in r and then use vectors to calculate the cosine similarity

Another approach would be to use apply over each row, which allows you to set the environment directly:

apply(df, 1, function(x) assign(x[1], tail(x, -1), envir = globalenv()))

However I agree with @danlooo's comment: I can't think of any reason that you would want to do this.

Edit: how to calculate cosine similarity matrix (following comment)

If you want to calculate a cosine similarity matrix it's better to start off with a matrix than to clutter up your global environment, and then have to do a potentially large combination of pairwise calculations.

First get the data into the right format, a numeric matrix with column names which are the first column of your data frame:

data_matrix <- tail(t(df), -1) |>
    sapply(as.numeric) |>
    matrix(
        nrow = ncol(df) - 1,
        ncol = nrow(df),
        dimnames = list(
            seq_len(ncol(df) - 1), # rows
            df[, 1] # columns
        )
    )

data_matrix
# i1 i10 i11
# 1 0.11 0.07 0.114
# 2 0.07 0.08 0.030

Then it is straightforward to calculate the cosine similarity:


library(lsa)
cosine(data_matrix)

# i1 i10 i11
# i1 1.0000000 0.9595950 0.9525148
# i10 0.9595950 1.0000000 0.8283488
# i11 0.9525148 0.8283488 1.0000000
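If installing lsa is not an option, the column-wise cosine similarity can also be sketched in base R. The block below is an illustration, not the method used by lsa::cosine internally; it re-creates data_matrix from the values printed above and divides the matrix of dot products by the outer product of the column norms.

```r
# Values copied from the data_matrix printed above
data_matrix <- matrix(
    c(0.11, 0.07,     # column i1
      0.07, 0.08,     # column i10
      0.114, 0.030),  # column i11
    nrow = 2,
    dimnames = list(c("1", "2"), c("i1", "i10", "i11"))
)

# Euclidean norm of each column
col_norms <- sqrt(colSums(data_matrix^2))

# crossprod(X) is t(X) %*% X, i.e. all pairwise column dot products
cos_sim <- crossprod(data_matrix) / outer(col_norms, col_norms)

round(cos_sim, 4)
```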

convert a row of a data frame to a simple vector in R

Example from mtcars data

mydata <- mtcars
k <- mydata[1, ]
k
#           mpg cyl disp  hp drat   wt  qsec vs am gear carb
# Mazda RX4  21   6  160 110  3.9 2.62 16.46  0  1    4    4
names(k) <- NULL

unlist(c(k))
# [1]  21.00   6.00 160.00 110.00   3.90   2.62  16.46   0.00   1.00   4.00   4.00

Updated as per @Ananda: unlist(mydata[1, ], use.names = FALSE)
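The one-line version can be checked against the built-in mtcars data. A short sketch; the variable names v and v_named are arbitrary.

```r
# First row of mtcars as a plain, unnamed numeric vector
v <- unlist(mtcars[1, ], use.names = FALSE)

# Without use.names = FALSE the column names are kept as element names
v_named <- unlist(mtcars[1, ])

v
names(v_named)
```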

R: avoid turning one-row data frames into a vector when using apply functions

You can solve your problem by using lapply instead of sapply, and then combine the result using do.call as follows

new_df <- as.data.frame(lapply(mydf[,-1,drop=F], function(x) gsub("\\s+","_",x)))
new_df <- do.call(cbind, new_df)
new_df
# value1 value2
#[1,] "A_1" "Z_1"

new_df <- cbind(mydf[,1,drop=F], new_df)
#new_df
# ID value1 value2
#1 A A_1 Z_1
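A more compact variant of the same idea, offered as a sketch rather than as the original answer's method, assigns the lapply result back into the selected columns. Because the target stays a data frame, a single row is not collapsed. The contents of mydf below are assumed from the output shown above.

```r
# Hypothetical one-row input reconstructed from the printed result
mydf <- data.frame(ID = "A", value1 = "A 1", value2 = "Z 1",
                   stringsAsFactors = FALSE)

# Replace every column except the first, keeping the data frame structure
mydf[-1] <- lapply(mydf[-1], function(x) gsub("\\s+", "_", x))

mydf
```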

As for your question about the unpredictable behavior of sapply: the s in sapply stands for simplification, but the simplified result is not guaranteed to have a particular shape. It can be a list, a matrix, or a vector.

According to the documentation of sapply:

sapply is a user-friendly version and wrapper of lapply by default
returning a vector, matrix or, if simplify = "array", an array if
appropriate, by applying simplify2array().

On the simplify argument:

logical or character string; should the result be simplified
to a vector, matrix or higher dimensional array if possible? For
sapply it must be named and not abbreviated. The default value, TRUE,
returns a vector or matrix if appropriate, whereas if simplify =
"array" the result may be an array of “rank” (=length(dim(.))) one
higher than the result of FUN(X[[i]]).

The Details section explains this behavior, which looks similar to what you experienced:

Simplification in sapply is only attempted if X has length greater
than zero and if the return values from all elements of X are all of
the same (positive) length. If the common length is one the result is
a vector, and if greater than one is a matrix with a column
corresponding to each element of X.
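The rule quoted from the documentation can be observed with two made-up inputs that differ only in their number of rows. The helper underscore and both data frames are hypothetical.

```r
# Hypothetical inputs: identical columns, one row versus two rows
one_row  <- data.frame(value1 = "A 1", value2 = "Z 1",
                       stringsAsFactors = FALSE)
two_rows <- data.frame(value1 = c("A 1", "B 2"), value2 = c("Z 1", "Y 2"),
                       stringsAsFactors = FALSE)

underscore <- function(x) gsub("\\s+", "_", x)

res_one <- sapply(one_row, underscore)   # common length 1: named vector
res_two <- sapply(two_rows, underscore)  # common length 2: matrix

is.matrix(res_one)  # FALSE
is.matrix(res_two)  # TRUE
```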

Hadley Wickham also recommends not using sapply:

I recommend that you avoid sapply() because it tries to simplify the
result, so it can return a list, a vector, or a matrix. This makes it
difficult to program with, and it should be avoided in non-interactive
settings

He also recommends not to use apply with a data frame. See Advanced R for further explanation.
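One base R alternative that fixes the shape of the result is vapply, which takes a template for each return value and fails loudly when the function returns something else. The sketch below uses made-up data and a hypothetical helper named underscore.

```r
# Hypothetical inputs with one and two rows
one_row  <- data.frame(value1 = "A 1", value2 = "Z 1",
                       stringsAsFactors = FALSE)
two_rows <- data.frame(value1 = c("A 1", "B 2"), value2 = c("Z 1", "Y 2"),
                       stringsAsFactors = FALSE)

underscore <- function(x) gsub("\\s+", "_", x)

# Each column must yield exactly one string, otherwise vapply stops
res <- vapply(one_row, underscore, character(1))

# With two rows per column the template no longer matches: an error,
# instead of a silent switch from vector to matrix
err <- tryCatch(vapply(two_rows, underscore, character(1)),
                error = function(e) e)

res
inherits(err, "error")
```

The explicit template costs a few extra characters but makes the return type independent of the input size.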


