Get Data Frame from Character Variable

How to use a character variable to refer to a data.frame in R?

You can use get() to fetch the object named by each string. Assuming t is a character vector of data frame names:

p <- lapply(t, function(x) { x <- get(x); x[x$X == "a", ] })

such that

> p
[[1]]
X Y
1 a 5.550481

[[2]]
X Y
1 a 5.365116

[[3]]
X Y
1 a 5.783017

[[4]]
X Y
1 a 2.782952

[[5]]
X Y
1 a 2.123357
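The call above can be tried end to end with a small sketch. The data frames df1 and df2 and the vector df_names are made up for illustration (the original objects are not shown), and the result is named so each element can be traced back to its source:

```r
# Hypothetical data frames standing in for the ones named in `t`
df1 <- data.frame(X = c("a", "b"), Y = c(1.5, 2.5))
df2 <- data.frame(X = c("b", "a"), Y = c(3.5, 4.5))

# Character vector holding the *names* of the data frames
df_names <- c("df1", "df2")

# Look up each data frame by name and keep the rows where X is "a"
rows_a <- lapply(df_names, function(nm) {
  d <- get(nm)
  d[d$X == "a", ]
})
names(rows_a) <- df_names

rows_a$df2$Y
```

Calling the vector df_names rather than t also avoids masking base R's t() function.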

Get data frame from character variable

I think you'll need a get in there if the input is a string. Also, depending on your usage of the function, the explicit print might not be necessary:

printdf <- function(dataframe) {
  get(dataframe)
  # print(get(dataframe))
}
head(printdf("mtcars"))
# mpg cyl disp hp drat wt qsec vs am gear carb
# Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
# Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
# Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
# Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
# Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
# Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
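If the string might not name a data frame at all, a slightly more defensive lookup can help. This is only a sketch: get_df is a hypothetical helper name, and the error messages are illustrative.

```r
# Hypothetical helper: fetch a data frame by name, failing early if the
# name does not exist or refers to something that is not a data frame
get_df <- function(name, envir = parent.frame()) {
  stopifnot(is.character(name), length(name) == 1)
  if (!exists(name, envir = envir)) {
    stop("no object named '", name, "' found")
  }
  obj <- get(name, envir = envir)
  if (!is.data.frame(obj)) {
    stop("'", name, "' is not a data frame")
  }
  obj
}

head(get_df("mtcars"))
```

Because exists() and get() both search enclosing environments by default, the helper finds built-in datasets such as mtcars as well as objects in the caller's workspace.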

Dataframe from a character vector where variable name and its data were stored jointly

We can use read.dcf from base R:

out <- type.convert(as.data.frame(read.dcf(
    textConnection(paste(gsub("\\s+\\|\\s+", "\n", foo$vars),
        collapse = "\n\n")))), as.is = TRUE)

Output:

> out
animal wks site PI GI
1 mouse 12 cage 78 NA
2 dog 32 <NA> NA 0.2
3 cat 8 wild 13 NA
> str(out)
'data.frame': 3 obs. of 5 variables:
$ animal: chr "mouse" "dog" "cat"
$ wks : int 12 32 8
$ site : chr "cage" NA "wild"
$ PI : int 78 NA 13
$ GI : num NA 0.2 NA
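The input foo is not shown above, so here is a self-contained sketch with a guessed layout: each element of foo$vars is assumed to hold "name: value" pairs separated by " | ". The pipeline is the same as in the answer, just split into steps.

```r
# Assumed input layout (not shown in the original): "name: value" pairs
# separated by " | ", with different fields present in different rows
foo <- data.frame(
  vars = c("animal: mouse | wks: 12 | site: cage | PI: 78",
           "animal: dog | wks: 32 | GI: 0.2",
           "animal: cat | wks: 8 | site: wild | PI: 13"),
  stringsAsFactors = FALSE
)

# One "name: value" pair per line; a blank line separates records (DCF format)
records <- gsub("\\s+\\|\\s+", "\n", foo$vars)
con <- textConnection(paste(records, collapse = "\n\n"))
fields <- read.dcf(con)  # character matrix, NA where a field is missing
close(con)

# Convert each column to its natural type, keeping strings as character
out <- type.convert(as.data.frame(fields, stringsAsFactors = FALSE),
                    as.is = TRUE)
str(out)
```

read.dcf takes the union of all field names it sees, which is why records with different fields still line up in one table.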

Create a character variable with data.frame function

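stringsAsFactors is your friend. Setting it to FALSE keeps character columns as character instead of converting them to factors (FALSE has been the default since R 4.0.0). Namely:

df <- data.frame(var1 = c("1", "2", "3", "4"), var2 = c(1, 2, 3, 4), stringsAsFactors = FALSE)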

yielding:

> str(df)
'data.frame': 4 obs. of 2 variables:
$ var1: chr "1" "2" "3" "4"
$ var2: num 1 2 3 4
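To see what the argument changes, here is a small side-by-side sketch. The object names as_text and as_factor are made up for illustration.

```r
# Same data, built with and without factor conversion
as_text   <- data.frame(var1 = c("1", "2"), var2 = c(1, 2),
                        stringsAsFactors = FALSE)
as_factor <- data.frame(var1 = c("1", "2"), var2 = c(1, 2),
                        stringsAsFactors = TRUE)

class(as_text$var1)    # "character"
class(as_factor$var1)  # "factor"

# A factor column can still be converted back after the fact
as_factor$var1 <- as.character(as_factor$var1)
class(as_factor$var1)  # "character"
```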

Using a character vector to refer to a data frame

Use get to return the value of the object whose name is given as a string:

C <- get(B)

If there is more than one object, use mget to return the values in a list.
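A minimal sketch of both calls, using made-up data frames df_a and df_b (the original objects are not shown):

```r
# Hypothetical objects to look up
df_a <- data.frame(id = 1:2)
df_b <- data.frame(id = 3:5)

# Single name: get() returns the object itself
B <- "df_a"
C <- get(B)

# Several names: mget() returns a named list of objects
all_dfs <- mget(c("df_a", "df_b"))
names(all_dfs)         # "df_a" "df_b"
sapply(all_dfs, nrow)  # df_a: 2, df_b: 3
```

Note that mget() defaults to inherits = FALSE, so unlike get() it only looks in the environment it is called from unless told otherwise.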

Collapse a dataframe in R that contains both numeric and character variables

Is this what you are looking for?

library(dplyr)

data %>%
  dplyr::group_by(ag, date) %>%
  summarise(across(everything(), ~
    if (is.numeric(.x)) mean(.x) else first(.x)))

#> `summarise()` has grouped output by 'ag'. You can override using the `.groups` argument.
#> # A tibble: 12 x 6
#> # Groups: ag [4]
#> ag date num_var1 num_var2 alpha_var1 alpha_var2
#> <chr> <int> <dbl> <dbl> <chr> <chr>
#> 1 A 1 3 22 A Y
#> 2 A 2 11 14 I Q
#> 3 A 3 19 6 Q I
#> 4 B 1 4 21 B X
#> 5 B 2 12 13 J P
#> 6 B 3 20 5 R H
#> 7 C 1 5 20 C W
#> 8 C 2 13 12 K O
#> 9 C 3 21 4 S G
#> 10 D 1 6 19 D V
#> 11 D 2 14 11 L N
#> 12 D 3 22 3 T F

Created on 2022-03-03 by the reprex package (v2.0.1)
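For readers without dplyr, roughly the same collapse can be sketched in base R. This is an alternative technique, not the answer's method: collapse_groups is a hypothetical helper, and the toy data only mimics the column layout above.

```r
# Hypothetical base R helper: one row per group, averaging numeric columns
# and keeping the first value of everything else
collapse_groups <- function(data, keys) {
  groups <- split(data, data[keys], drop = TRUE)
  rows <- lapply(groups, function(g) {
    as.data.frame(
      lapply(g, function(col) if (is.numeric(col)) mean(col) else col[1]),
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# Toy data (made up for illustration)
toy <- data.frame(
  ag = c("A", "A", "B"),
  date = c(1, 1, 1),
  num_var1 = c(2, 4, 10),
  alpha_var1 = c("x", "y", "z"),
  stringsAsFactors = FALSE
)
collapse_groups(toy, c("ag", "date"))
```

The grouping columns pass through unchanged because they are constant within each group.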

Dynamically select data frame columns using $ and a character value

You can't do that kind of subsetting with $. In the source code (R/src/main/subset.c) it states:

/*The $ subset operator.

We need to be sure to only evaluate the first argument.

The second will be a symbol that needs to be matched, not evaluated.

*/

Second argument? What?! You have to realise that $, like everything else in R (including, for instance, (, +, ^, etc.), is a function that takes arguments and is evaluated. df$V1 could be rewritten as

`$`(df , V1)

or indeed

`$`(df , "V1")

But...

`$`(df , paste0("V1") )

...for instance, will never work, nor will anything else that must first be evaluated in the second argument. You may only pass a name or a literal string, neither of which is evaluated.

Instead use [ (or [[ if you want to extract only a single column as a vector).

For example,

var <- "mpg"
# Doesn't work: $ looks for a column literally named "var" and returns NULL
mtcars$var
# These both work, but note that what they return is different:
# the first is a vector, the second is a data.frame
mtcars[[var]]
mtcars[var]
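The same point matters most inside functions, where the column name usually arrives as an argument. A small sketch, with col_mean as a hypothetical helper name:

```r
# Hypothetical helper: mean of a column whose name is passed as a string
col_mean <- function(df, col) {
  stopifnot(is.character(col), length(col) == 1, col %in% names(df))
  mean(df[[col]])  # [[ evaluates `col`; df$col would look for "col"
}

var <- paste0("m", "pg")  # a name computed at run time
col_mean(mtcars, var)

# $ does not evaluate its second argument, so these do not do the lookup:
is.null(mtcars$var)  # TRUE, no column is literally called "var"
failed <- inherits(try(`$`(mtcars, paste0("mpg")), silent = TRUE), "try-error")
failed               # TRUE, a call is not a valid second argument
```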

To order a data frame by columns named in a character vector, you can avoid loops and use do.call to construct the call to order. Here is a reproducible example:

# Set seed for reproducibility
set.seed(123)
df <- data.frame(col1 = sample(5, 10, replace = TRUE), col2 = sample(5, 10, replace = TRUE), col3 = sample(5, 10, replace = TRUE))

# We want to sort by 'col3' then by 'col1'
sort_list <- c("col3","col1")

# Use 'do.call' to call order. The second argument in do.call is a list of
# arguments to pass to the first argument, in this case 'order'.
# Since a data.frame is really a list, we just subset the data.frame
# according to the columns we want to sort on, in that order.
# df[sort_list] stays a list even when only one column is selected.
df[do.call(order, df[sort_list]), ]

col1 col2 col3
10 3 5 1
9 3 2 2
7 3 2 3
8 5 1 3
6 1 5 4
3 3 4 4
2 4 3 4
5 5 1 4
1 2 5 5
4 5 3 5
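The do.call idiom can be wrapped in a small function. This is a sketch only: order_by_names and the scores data are made up for illustration.

```r
# Hypothetical helper: sort a data frame by columns named in a character vector
order_by_names <- function(df, cols) {
  stopifnot(is.character(cols), all(cols %in% names(df)))
  # unname() so a column called e.g. "decreasing" is not mistaken
  # for a named argument of order()
  keys <- unname(as.list(df[cols]))
  df[do.call(order, keys), , drop = FALSE]
}

scores <- data.frame(a = c(2, 1, 2), b = c(1, 5, 0))
order_by_names(scores, c("a", "b"))  # sort by a, ties broken by b
order_by_names(scores, "b")          # a single sort column also works
```

Selecting with df[cols] rather than df[, cols] keeps the keys as a list even for one column, which is what do.call expects.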

