Convert a row of a data frame to vector
When you extract a single row from a data frame you get a one-row data frame. Convert it to a numeric vector:
as.numeric(df[1,])
As @Roland suggests, unlist(df[1,]) will convert the one-row data frame to a numeric vector without dropping the names. Therefore unname(unlist(df[1,])) is another, slightly more explicit way to get to the same result.
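As @Josh comments below, if you have a not-completely-numeric (alphabetic, factor, mixed ...) data frame, you need as.character(df[1,]) instead. Note that factor columns are converted through their integer codes this way, so convert them to character first if you want the labels.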
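The options above can be compared side by side. This is a minimal sketch, assuming a small all-numeric data frame; the df built here and the variable names are made up for illustration and do not come from the original answers.

```r
# Hypothetical all-numeric data frame, used only for illustration
df <- data.frame(a = c(1.5, 2.5), b = c(10, 20))

row_df  <- df[1, ]                   # still a data frame with one row
v_num   <- as.numeric(df[1, ])       # plain numeric vector, names dropped
v_named <- unlist(df[1, ])           # numeric vector that keeps the column names
v_plain <- unname(unlist(df[1, ]))   # same values with the names removed

is.data.frame(row_df)
names(v_named)
identical(v_num, v_plain)
```

Which form to use mostly depends on whether the column names are needed downstream.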
Convert data.frame column to a vector?
I'm going to attempt to explain this without making any mistakes, but I'm betting this will attract a clarification or two in the comments.
A data frame is a list. When you subset a data frame using the name of a column and [, what you're getting is a sublist (or a sub data frame). If you want the actual atomic column, you could use [[, or somewhat confusingly (to me) you could do aframe[,2] which returns a vector, not a sublist.
So try running this sequence and maybe things will be clearer:
avector <- as.vector(aframe['a2'])
class(avector)
avector <- aframe[['a2']]
class(avector)
avector <- aframe[,2]
class(avector)
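The answer does not show how aframe was built, so here is a self-contained sketch of the same comparison, assuming a made-up two-column data frame. The column values and the variable names sub_df, col_a and col_b are illustrative only.

```r
# Hypothetical data frame; the original answer does not define aframe
aframe <- data.frame(a1 = 1:3, a2 = c(2.5, 3.5, 4.5))

sub_df <- aframe["a2"]     # single bracket with a name: a one-column data frame
col_a  <- aframe[["a2"]]   # double bracket: the atomic column itself
col_b  <- aframe[, 2]      # matrix-style indexing drops to a vector by default

class(sub_df)              # "data.frame"
class(col_a)               # "numeric"
identical(col_a, col_b)    # TRUE
```

If you want matrix-style indexing to keep the data frame structure, aframe[, 2, drop = FALSE] does that.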
Converting the row of a data.table to a vector
The problem with extracting rows as vectors is that vectors are homogeneous while rows of data frames or data tables are not.
However, you can convert the data to a matrix then extract the row:
> x <- iris[1:10, 1:4]
> as.matrix(x)[1, ]
Sepal.Length  Sepal.Width Petal.Length  Petal.Width
         5.1          3.5          1.4          0.2
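The homogeneity point matters as soon as a non-numeric column is involved. The sketch below uses the built-in iris data to contrast the two cases; the variable names are illustrative.

```r
x_num <- iris[1:3, 1:4]            # numeric columns only
x_mix <- iris[1:3, ]               # also includes the factor column Species

row_num <- as.matrix(x_num)[1, ]   # named numeric vector
row_mix <- as.matrix(x_mix)[1, ]   # named character vector: every value was coerced

typeof(row_num)                    # "double"
typeof(row_mix)                    # "character"
row_mix[["Species"]]               # "setosa"
```

So the matrix route is safe for all-numeric data, while mixed data ends up as character.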
Getting a row from a data frame as a vector in R
Before R 4.0.0, data frames created by importing data from an external source had their character columns converted to factors by default. If you do not want this, set stringsAsFactors=FALSE (this has been the default since R 4.0.0).
In this case, to extract a row or a column as a vector you need to do something like this:
as.numeric(as.vector(DF[1,]))
or like this
as.character(as.vector(DF[1,]))
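To see why factors need care here, consider the following sketch. It assumes a made-up DF whose columns are deliberately created as factors to mimic the older default; DF's contents and the helper variable names are hypothetical.

```r
# Hypothetical data, built with factors on purpose to mimic the pre-4.0.0 default
DF <- data.frame(x = c("10", "20"), y = c("3.5", "4.5"), stringsAsFactors = TRUE)

# Pitfall: converting a factor directly yields its integer level codes
codes <- as.numeric(DF$x)          # 1 2, not 10 20

# Going through character first recovers the values shown in the data frame
row_chr <- vapply(DF[1, ], as.character, character(1))
row_num <- as.numeric(row_chr)     # 10.0 3.5
```

The vapply call converts each column of the one-row data frame separately, so each factor is turned into its label rather than its code.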
Get a row in data.frame as a vector where each element is a string
Use unlist and then as.character:
as.character(unlist(test[1, ]))
#[1] "no" "no" "no" "yes" "no" "no" "yes"
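test[1, ] is still a data frame, and applying as.character directly to a data frame does not give the expected result: factor columns come back as their integer codes rather than their labels. We use unlist to turn the one-row data frame into a vector (here a factor) and then as.character to convert it to character.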
data
test <- structure(list(A = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "no", class = "factor"),
T = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "no", class = "factor"),
L = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "no", class = "factor"),
B = structure(c(2L, 1L, 1L, 2L, 1L, 1L), .Label = c("no",
"yes"), class = "factor"), E = structure(c(1L, 1L, 1L, 1L,
1L, 1L), .Label = "no", class = "factor"), X = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = "no", class = "factor"), D = structure(c(2L,
1L, 1L, 2L, 1L, 1L), .Label = c("no", "yes"), class = "factor")),
class = "data.frame", row.names = c("4", "7", "11", "12", "17", "27"))
Convert a dataframe to a vector (by rows)
You can try as.vector(t(test)). Please note that if you want to do it by columns, you should use unlist(test).
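The difference in ordering is easiest to see on a tiny example. This sketch assumes a made-up 2 x 3 numeric data frame called m; it is not the test data from the question.

```r
# Hypothetical 2 x 3 data frame for illustration
m <- data.frame(x = c(1, 2), y = c(3, 4), z = c(5, 6))

by_rows <- as.vector(t(m))               # row 1 first, then row 2
by_cols <- unlist(m, use.names = FALSE)  # column x first, then y, then z

by_rows   # 1 3 5 2 4 6
by_cols   # 1 2 3 4 5 6
```

Note that t() goes through a matrix, so a data frame with mixed column types would be coerced to a single type first.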
Convert a row into a combine, c() as a vector in r and then use vectors to calculate the cosine similarity
Another approach would be to use apply over each row, which allows you to set the environment directly:
apply(df, 1, function(x) assign(x[1], tail(x, -1), envir = globalenv()))
However I agree with @danlooo's comment: I can't think of any reason that you would want to do this.
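If the goal is just to look rows up by name, a named list is one possible alternative to creating global variables. The sketch below assumes a made-up df whose first column holds the identifiers; the column names id, v1, v2 and the variable row_vectors are hypothetical.

```r
# Hypothetical data frame: first column is an identifier, the rest are numeric
df <- data.frame(
  id = c("i1", "i10", "i11"),
  v1 = c(0.11, 0.07, 0.114),
  v2 = c(0.07, 0.08, 0.030)
)

# One numeric vector per row, stored in a list named by the identifier column
row_vectors <- lapply(seq_len(nrow(df)), function(i) {
  unlist(df[i, -1], use.names = FALSE)
})
names(row_vectors) <- df$id

row_vectors$i10   # 0.07 0.08
```

This keeps the global environment clean and lets you iterate over all rows with lapply or sapply.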
Edit: how to calculate cosine similarity matrix (following comment)
If you want to calculate a cosine similarity matrix it's better to start off with a matrix than to clutter up your global environment, and then have to do a potentially large combination of pairwise calculations.
First get the data into the right format, a numeric matrix with column names which are the first column of your data frame:
data_matrix <- tail(t(df), -1) |>
    sapply(as.numeric) |>
    matrix(
        nrow = ncol(df) - 1,
        ncol = nrow(df),
        dimnames = list(
            seq_len(ncol(df) - 1), # rows
            df[, 1] # columns
        )
    )
data_matrix
# i1 i10 i11
# 1 0.11 0.07 0.114
# 2 0.07 0.08 0.030
Then it is straightforward to calculate the cosine similarity:
library(lsa)
cosine(data_matrix)
# i1 i10 i11
# i1 1.0000000 0.9595950 0.9525148
# i10 0.9595950 1.0000000 0.8283488
# i11 0.9525148 0.8283488 1.0000000
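If you would rather not depend on the lsa package, the same column-wise cosine similarity can be sketched in base R. The matrix values below are copied from the data_matrix printed earlier in this answer; cosine_base is a hypothetical helper name, not a function from any package.

```r
# Values copied from the data_matrix shown earlier (filled column by column)
data_matrix <- matrix(
  c(0.11, 0.07, 0.07, 0.08, 0.114, 0.030),
  nrow = 2,
  dimnames = list(c("1", "2"), c("i1", "i10", "i11"))
)

# Cosine similarity between columns: dot products scaled by the column norms
cosine_base <- function(m) {
  norms <- sqrt(colSums(m^2))
  crossprod(m) / outer(norms, norms)
}

sim <- cosine_base(data_matrix)
round(sim, 4)
```

crossprod(m) computes all pairwise dot products between columns at once, which avoids looping over pairs.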
convert a row of a data frame to a simple vector in R
Example from mtcars
data
mydata <- mtcars
k <- mydata[1, ]
          mpg cyl disp  hp drat   wt  qsec vs am gear carb
Mazda RX4  21   6  160 110  3.9 2.62 16.46  0  1    4    4
names(k) <- NULL
unlist(c(k))
[1]  21.00   6.00 160.00 110.00   3.90   2.62  16.46   0.00   1.00   4.00   4.00
Updated as per @Ananda: unlist(mydata[1, ], use.names = FALSE)
Convert pandas dataframe to vector
You can use the values attribute of a Series, which returns a NumPy array. Series.to_numpy() is the equivalent method that the pandas documentation recommends.
import pandas as pd
df = pd.DataFrame({
'Col1': ['Place', 'Country'],
'Col2': ['This', 'That'],
})
vector = df['Col1'].values
print(vector)
print(type(vector))
Output
['Place' 'Country']
<class 'numpy.ndarray'>