How to use a character variable to refer to a data.frame in R?
You can use get() to fetch each object by its name, i.e.,
p <- lapply(t, function(x) { x <- get(x); x[x$X == "a", ] })
such that
> p
[[1]]
X Y
1 a 5.550481
[[2]]
X Y
1 a 5.365116
[[3]]
X Y
1 a 5.783017
[[4]]
X Y
1 a 2.782952
[[5]]
X Y
1 a 2.123357
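The answer does not show the data frames or the character vector t, so here is a minimal self-contained sketch of the same idea. The objects df_a and df_b and the vector df_names are hypothetical stand-ins, not the asker's data.

```r
# Hypothetical data frames standing in for the objects named in the question
df_a <- data.frame(X = c("a", "b"), Y = c(1.5, 2.5), stringsAsFactors = FALSE)
df_b <- data.frame(X = c("b", "a"), Y = c(3.5, 4.5), stringsAsFactors = FALSE)

# Character vector of object names (called `t` in the answer; renamed here
# so that it does not mask base::t)
df_names <- c("df_a", "df_b")

p <- lapply(df_names, function(nm) {
  d <- get(nm)       # look the object up by its name
  d[d$X == "a", ]    # keep only the rows where X is "a"
})
names(p) <- df_names # label each result with the name it came from
```

Naming the list elements is optional, but it makes it easier to see which result belongs to which data frame.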
Get data frame from character variable
I think you'll need a get in there if the input is a string. Also, depending on your usage of the function, the explicit print might not be necessary:
printdf <- function(dataframe) {
get(dataframe)
# print(get(dataframe))
}
head(printdf("mtcars"))
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
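If the name might be misspelled or might refer to something that is not a data frame, a slightly more defensive variant could look like the sketch below. The function name get_df and its checks are illustrative assumptions, not part of the original answer.

```r
# Hypothetical helper: fetch a data frame by name, failing with a clear message
get_df <- function(name, envir = parent.frame()) {
  stopifnot(is.character(name), length(name) == 1)
  if (!exists(name, envir = envir)) {
    stop("no object named '", name, "' was found")
  }
  obj <- get(name, envir = envir)
  if (!is.data.frame(obj)) {
    stop("'", name, "' exists but is not a data frame")
  }
  obj
}

cars_df <- get_df("mtcars")
```

Calling get_df("mean") or get_df("no_such_object") stops with an error instead of silently returning something unexpected.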
Dataframe from a character vector where variable name and its data were stored jointly
We may use read.dcf from base R
out <- type.convert(as.data.frame(read.dcf(
textConnection(paste(gsub("\\s+\\|\\s+", "\n", foo$vars),
collapse="\n\n")))), as.is = TRUE)
Output:
> out
animal wks site PI GI
1 mouse 12 cage 78 NA
2 dog 32 <NA> NA 0.2
3 cat 8 wild 13 NA
> str(out)
'data.frame': 3 obs. of 5 variables:
$ animal: chr "mouse" "dog" "cat"
$ wks : int 12 32 8
$ site : chr "cage" NA "wild"
$ PI : int 78 NA 13
$ GI : num NA 0.2 NA
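The answer does not show foo. The sketch below assumes that foo$vars holds one string per record, made of "name: value" pairs separated by " | "; the sample values are chosen only to mirror the output above.

```r
# Hypothetical input: each string carries variable names together with values
foo <- data.frame(
  vars = c("animal: mouse | wks: 12 | site: cage | PI: 78",
           "animal: dog | wks: 32 | GI: 0.2",
           "animal: cat | wks: 8 | site: wild | PI: 13"),
  stringsAsFactors = FALSE
)

# Put every pair on its own line and separate records with a blank line,
# which is the layout read.dcf expects
dcf_text <- paste(gsub("\\s+\\|\\s+", "\n", foo$vars), collapse = "\n\n")

con <- textConnection(dcf_text)
mat <- read.dcf(con)   # character matrix, one column per field name
close(con)

# Convert to a data frame and let type.convert pick int/num/chr per column
out <- type.convert(as.data.frame(mat), as.is = TRUE)
```

Fields that are missing from a record come back as NA, which is why dog has no site or PI in the result.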
Create a character variable with data.frame function
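stringsAsFactors is your friend (since R 4.0.0 it defaults to FALSE, so setting it explicitly only matters on older versions). Namely:
df <- data.frame(var1 = c("1", "2", "3", "4"), var2 = c(1, 2, 3, 4), stringsAsFactors = FALSE)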
yielding:
> str(df)
'data.frame': 4 obs. of 2 variables:
$ var1: chr "1" "2" "3" "4"
$ var2: num 1 2 3 4
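For contrast, here is a small sketch of what the two settings produce, and of converting a factor column back to character after the fact. The object names are made up for illustration.

```r
# Same input, two settings of stringsAsFactors
as_factor <- data.frame(var1 = c("1", "2"), stringsAsFactors = TRUE)
as_char   <- data.frame(var1 = c("1", "2"), stringsAsFactors = FALSE)

was_factor <- is.factor(as_factor$var1)  # strings were converted to a factor
is_char    <- is.character(as_char$var1) # strings were kept as character

# If the data frame already exists, convert the column explicitly
as_factor$var1 <- as.character(as_factor$var1)
```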
Using a character vector to refer to a data frame
Use get to return the value of the object whose name is stored as a string:
C <- get(B)
If there is more than one object, use mget to return the values in a named list.
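A minimal sketch of both calls, assuming B holds the name of a data frame A (the second data frame A2 is added here only to illustrate mget):

```r
A  <- data.frame(id = 1:3)
A2 <- data.frame(id = 4:6)

B <- "A"          # the name of the data frame, stored as a string
C <- get(B)       # C is now a copy of A

# Several names at once: mget returns a list named after the objects
dfs <- mget(c("A", "A2"))
```

Here dfs$A and dfs$A2 are the two data frames, so the list can be passed straight to lapply.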
collapse a dataframe in R that contains both numeric and character variables
Is this what you are looking for?
library(dplyr)
data %>%
dplyr::group_by(ag, date) %>%
summarise(across(everything(), ~
if(is.numeric(.x)) mean(.x) else first(.x)))
#> `summarise()` has grouped output by 'ag'. You can override using the `.groups` argument.
#> # A tibble: 12 x 6
#> # Groups: ag [4]
#> ag date num_var1 num_var2 alpha_var1 alpha_var2
#> <chr> <int> <dbl> <dbl> <chr> <chr>
#> 1 A 1 3 22 A Y
#> 2 A 2 11 14 I Q
#> 3 A 3 19 6 Q I
#> 4 B 1 4 21 B X
#> 5 B 2 12 13 J P
#> 6 B 3 20 5 R H
#> 7 C 1 5 20 C W
#> 8 C 2 13 12 K O
#> 9 C 3 21 4 S G
#> 10 D 1 6 19 D V
#> 11 D 2 14 11 L N
#> 12 D 3 22 3 T F
Created on 2022-03-03 by the reprex package (v2.0.1)
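The input data is not shown in the answer. The following sketch uses a small made-up data frame with the same kind of columns (two grouping columns, one numeric, one character) so that the summarise() call can be run end to end; the column values are assumptions.

```r
library(dplyr)

# Hypothetical input with one numeric and one character column per group
data <- data.frame(
  ag         = c("A", "A", "B", "B"),
  date       = c(1L, 1L, 1L, 1L),
  num_var1   = c(1, 3, 2, 6),
  alpha_var1 = c("x", "y", "z", "w"),
  stringsAsFactors = FALSE
)

collapsed <- data %>%
  group_by(ag, date) %>%
  summarise(
    # mean for numeric columns, first value for everything else
    across(everything(), ~ if (is.numeric(.x)) mean(.x) else first(.x)),
    .groups = "drop"   # return an ungrouped tibble and silence the message
  )
```

Each (ag, date) pair collapses to a single row: num_var1 becomes the group mean and alpha_var1 keeps the first value in the group.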
Dynamically select data frame columns using $ and a character value
You can't do that kind of subsetting with $. In the source code (R/src/main/subset.c) it states:
/*The $ subset operator.
We need to be sure to only evaluate the first argument.
The second will be a symbol that needs to be matched, not evaluated.
*/
Second argument? What?! You have to realise that $, like everything else in R (including, for instance, `(`, `+` and `^`), is a function that takes arguments and is evaluated. df$V1 could be rewritten as
`$`(df , V1)
or indeed
`$`(df , "V1")
But...
`$`(df , paste0("V1") )
...for instance will never work, nor will anything else that must first be evaluated in the second argument. The second argument has to be a literal name or string, because it is never evaluated.
Instead use [ (or [[ if you want to extract only a single column as a vector).
For example,
var <- "mpg"
# Doesn't work: looks for a column literally named "var" and returns NULL
mtcars$var
#These both work, but note that what they return is different
# the first is a vector, the second is a data.frame
mtcars[[var]]
mtcars[var]
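The claims above can be checked directly. This is a small sketch using the built-in mtcars data; the variable names on the left-hand side are only for illustration.

```r
var <- "mpg"

by_dollar <- mtcars$var     # no column is called "var", so this is NULL
as_vector <- mtcars[[var]]  # numeric vector with the mpg values
as_df     <- mtcars[var]    # one-column data frame

# Passing a call as the second argument of `$` fails, because the
# argument is matched as written instead of being evaluated
res <- tryCatch(`$`(mtcars, paste0("mpg")), error = function(e) e)
failed <- inherits(res, "error")
```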
You can perform the ordering without loops, using do.call to construct the call to order. Here is a reproducible example:
# set seed for reproducibility
set.seed(123)
df <- data.frame( col1 = sample(5,10,replace=TRUE) , col2 = sample(5,10,replace=TRUE) , col3 = sample(5,10,replace=TRUE) )
# We want to sort by 'col3' then by 'col1'
sort_list <- c("col3","col1")
# Use 'do.call' to call order. Second argument in do.call is a list of arguments
# to pass to the first argument, in this case 'order'.
# Since a data.frame is really a list, we just subset the data.frame
# according to the columns we want to sort in, in that order
df[ do.call( order , df[ , match( sort_list , names(df) ) ] ) , ]
col1 col2 col3
10 3 5 1
9 3 2 2
7 3 2 3
8 5 1 3
6 1 5 4
3 3 4 4
2 4 3 4
5 5 1 4
1 2 5 5
4 5 3 5
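The same idea can be wrapped in a small helper. The function name order_by_names is hypothetical, and dropping the column names before the call is a design choice made here (not something the answer does) so that a column called, say, "decreasing" is not mistaken for an argument of order.

```r
# Hypothetical helper: sort a data frame by columns given as a character vector
order_by_names <- function(df, cols) {
  stopifnot(all(cols %in% names(df)))
  keys <- unname(as.list(df[cols]))   # sort keys, in the requested priority
  df[do.call(order, keys), , drop = FALSE]
}

toy <- data.frame(a = c(2, 1, 2, 1), b = c(1, 2, 0, 3))
sorted <- order_by_names(toy, c("a", "b"))
```

The rows of sorted come out ordered by a first and by b within ties of a, and the original row names are kept so the reordering stays visible.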