Convert a Dataframe to a Vector (By Rows)

Convert a row of a data frame to vector

When you extract a single row from a data frame you get a one-row data frame. Convert it to a numeric vector:

as.numeric(df[1,])

As @Roland suggests, unlist(df[1,]) will convert the one-row data frame to a numeric vector without dropping the names. Therefore unname(unlist(df[1,])) is another, slightly more explicit way to get to the same result.

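As @Josh comments below, if your data frame is not completely numeric (character, factor, mixed ...), you need as.character(df[1,]) instead. Be aware that factor columns come out as their integer codes this way, so convert them to character first if you want the labels.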

Convert a dataframe to a vector (by rows)

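You can try as.vector(t(test)). t() turns the data frame into a matrix and transposes it; as.vector() then reads that matrix column by column, which is the original data frame row by row. If you want the values by columns instead, use unlist(test).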

Convert data.frame column to a vector?

I'm going to attempt to explain this without making any mistakes, but I'm betting this will attract a clarification or two in the comments.

A data frame is a list. When you subset a data frame using the name of a column and [, what you're getting is a sublist (or a sub data frame). If you want the actual atomic column, you could use [[, or somewhat confusingly (to me) you could do aframe[,2] which returns a vector, not a sublist.

So try running this sequence and maybe things will be clearer:

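# Hypothetical example data (not from the original answer)
aframe <- data.frame(a1 = 1:3, a2 = c(1.5, 2.5, 3.5))

# [ with a column name keeps the data frame; as.vector() does not change that
avector <- as.vector(aframe['a2'])
class(avector)   # "data.frame"

# [[ extracts the column itself
avector <- aframe[['a2']]
class(avector)   # "numeric"

# [row, column] indexing with a single column also drops to a vector
avector <- aframe[,2]
class(avector)   # "numeric"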

Convert pandas dataframe to vector

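You can use the values attribute of a Series (it is a property, not a method).
It returns a NumPy array. The pandas documentation recommends the to_numpy() method for the same purpose.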

import pandas as pd

df = pd.DataFrame({
    'Col1': ['Place', 'Country'],
    'Col2': ['This', 'That'],
})

vector = df['Col1'].values

print(vector)
print(type(vector))

Output

['Place' 'Country']
<class 'numpy.ndarray'>
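
To flatten the whole data frame rather than a single column, a minimal sketch along the same lines is shown here. It reuses the df from above; the names by_rows and by_cols are made up for illustration. It assumes that to_numpy() followed by ravel() is acceptable for your data: ravel() reads row by row by default, and order='F' reads column by column.

```python
import pandas as pd

df = pd.DataFrame({
    'Col1': ['Place', 'Country'],
    'Col2': ['This', 'That'],
})

# Flatten row by row (NumPy's default C order)
by_rows = df.to_numpy().ravel()

# Flatten column by column (Fortran order)
by_cols = df.to_numpy().ravel(order='F')

print(by_rows)
print(by_cols)
```

With mixed column types the resulting array has dtype object, so you may want to convert the columns to a common type first.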

Convert a row into a combine, c() as a vector in r and then use vectors to calculate the cosine similarity

Another approach would be to use apply over each row, which allows you to set the environment directly:

apply(df, 1, function(x) assign(x[1], tail(x, -1), envir = globalenv()))

However I agree with @danlooo's comment: I can't think of any reason that you would want to do this.

Edit: how to calculate cosine similarity matrix (following comment)

If you want to calculate a cosine similarity matrix it's better to start off with a matrix than to clutter up your global environment, and then have to do a potentially large combination of pairwise calculations.

First get the data into the right format, a numeric matrix with column names which are the first column of your data frame:

data_matrix <- tail(t(df), -1) |>
  sapply(as.numeric) |>
  matrix(
    nrow = ncol(df) - 1,
    ncol = nrow(df),
    dimnames = list(
      seq_len(ncol(df) - 1), # rows
      df[,1] # columns
    )
  )

data_matrix
# i1 i10 i11
# 1 0.11 0.07 0.114
# 2 0.07 0.08 0.030

Then it is straightforward to calculate the cosine similarity:


library(lsa)
cosine(data_matrix)

# i1 i10 i11
# i1 1.0000000 0.9595950 0.9525148
# i10 0.9595950 1.0000000 0.8283488
# i11 0.9525148 0.8283488 1.0000000
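
To see what the cosine() call computes, here is a rough cross-check in plain Python (the language of the pandas example on this page), not part of the original R answer. It assumes the usual definition, the dot product divided by the product of the vector lengths, and takes the columns of data_matrix as printed above. The names cosine_similarity, columns and similarity are made up for illustration.

```python
import math

def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Columns of data_matrix, copied from the printed output above
columns = {
    "i1": [0.11, 0.07],
    "i10": [0.07, 0.08],
    "i11": [0.114, 0.030],
}

# Pairwise similarities, keyed by (column, column)
similarity = {
    (a, b): cosine_similarity(columns[a], columns[b])
    for a in columns
    for b in columns
}

for (a, b), value in similarity.items():
    print(a, b, round(value, 4))
```

The pairwise values agree with the R matrix above to about four decimal places, which suggests that cosine() compares the columns of the matrix with each other.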

Getting a row from a data frame as a vector in R

Before R 4.0.0, data frames created by importing data from an external source had their character columns converted to factors by default. If you do not want this, set stringsAsFactors=FALSE (which has been the default since R 4.0.0).

In this case, to extract a row or a column as a vector, you need to do something like this:

as.numeric(as.vector(DF[1,]))

or like this

as.character(as.vector(DF[1,]))

