Dplyr Without Hard-Coding the Variable Names

dplyr: how to avoid hard coding variable names when I need them all?

It seems like everything() (now fully exported) should do the trick, but it doesn't. If you're going to run many operations across all your columns, it can be worth building a list column that holds each row as a vector, on which you can then call unique, max, and so on. Here it is assembled with purrr, though you could do the same with apply(df, 1, list) %>% lapply(unlist):

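library(dplyr)
library(purrr)

# store each row as a vector in a list column, then keep rows with 3 distinct values
df1 <- df %>%
  mutate(data = df %>% transpose() %>% map(unlist)) %>%
  rowwise() %>%
  filter(length(unique(data)) == 3)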

df1
# Source: local data frame [210 x 4]
# Groups: <by row>
#
#       X1    X2    X3      data
#    <int> <int> <int>    <list>
# 1      3     2     1 <int [3]>
# 2      4     2     1 <int [3]>
# 3      5     2     1 <int [3]>
# 4      6     2     1 <int [3]>
# 5      7     2     1 <int [3]>
# 6      2     3     1 <int [3]>
# 7      4     3     1 <int [3]>
# 8      5     3     1 <int [3]>
# 9      6     3     1 <int [3]>
# 10     7     3     1 <int [3]>
# ..   ...   ...   ...       ...

df1 %>%
  rowwise() %>%
  filter(max(data) - min(data) == 2) %>%
  ungroup() %>%
  summarise(res = n() / nrow(df1)) %>%
  unlist() %>%
  MASS::as.fractions()  # as.fractions() comes from the MASS package
# res
# 1/7
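In dplyr 1.0.0 and later, the same row-wise checks can probably be written without a helper list column by combining rowwise() with c_across(everything()). The following is a minimal sketch of that alternative, not the approach from the answer above. The input df is an assumption (every ordered triple drawn from 1:7, which is consistent with the 210 rows and the 1/7 shown above), and the names distinct_rows, consecutive and share are made up for illustration.

```r
library(dplyr)

# Assumed input: all ordered triples from 1:7 (the original df is not shown)
df <- expand.grid(X1 = 1:7, X2 = 1:7, X3 = 1:7)

# Keep rows whose values are all different, without naming any column
distinct_rows <- df %>%
  rowwise() %>%
  filter(n_distinct(c_across(everything())) == 3) %>%
  ungroup()

# Of those, keep rows whose range is exactly 2 (three consecutive values)
consecutive <- distinct_rows %>%
  rowwise() %>%
  filter(max(c_across(everything())) - min(c_across(everything())) == 2) %>%
  ungroup()

# Share of distinct-value rows that are consecutive
share <- nrow(consecutive) / nrow(distinct_rows)
```

Because everything() is resolved against whatever columns exist at that point in the pipeline, the same code keeps working if df gains or loses columns.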

arranging columns using dplyr::select without hardcoding

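By the sounds of it, you want to use select_ along with the .dots argument. (select_ has been deprecated since dplyr 0.7; in current versions, select(all_of(col.string)) does the same job.)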

table %>% select_(.dots = col.string)
#   empcnt1 wage1 empcnt2 wage2 empcnt3 wage3 cnty
# 1     200    40     200    40     200    40    1
# 2     300    50     300    50     300    50    2
# 3     400    60     400    60     400    60    3
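For current dplyr versions, a minimal sketch of the same reordering with all_of() might look like the following. The data frame tbl and the way col.string is built are assumptions reconstructed from the output above; the original question's objects are not shown.

```r
library(dplyr)

# Hypothetical table with the columns in an arbitrary starting order
tbl <- data.frame(
  cnty    = 1:3,
  wage1   = c(40, 50, 60), wage2   = c(40, 50, 60), wage3   = c(40, 50, 60),
  empcnt1 = c(200, 300, 400), empcnt2 = c(200, 300, 400), empcnt3 = c(200, 300, 400)
)

# Build the desired order programmatically instead of typing every name
idx <- 1:3
col.string <- c(rbind(paste0("empcnt", idx), paste0("wage", idx)), "cnty")

# all_of() selects by a character vector and errors if a name is missing
reordered <- tbl %>% select(all_of(col.string))
```

all_of() is strict about missing names; any_of() is the lenient counterpart if some columns may be absent.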

piping dplyr mutate with unknown variable name

You don't need enquo there. That's for turning an argument passed to a function into a quosure. Instead, you need to turn a string into a symbol. For that you can use as.name() or rlang::sym().

ff <- function(tt) {

  # find the variable name
  if (any(colnames(tt) == "AD")) {
    vv <- quo(AD)
  } else {
    vv <- colnames(tt) %>% .[. != "f1"]
    vv <- as.name(vv)
  }

  # make the mutate
  tt %>% mutate(!!quo_name(vv) := as.factor(!!vv))
}
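If the column name is kept as a plain string throughout, the .data pronoun avoids building a symbol at all. The following is a sketch under that assumption; to_factor and its skip argument are hypothetical names, and it assumes that exactly one column other than f1 exists when AD is absent.

```r
library(dplyr)

# Hypothetical helper: convert "AD" (or, failing that, the first column
# that is not `skip`) to a factor, using only string column names
to_factor <- function(tt, skip = "f1") {
  name <- if ("AD" %in% colnames(tt)) "AD" else setdiff(colnames(tt), skip)[1]
  # `!!name :=` sets the output name; .data[[name]] looks the column up by string
  tt %>% mutate(!!name := as.factor(.data[[name]]))
}

with_ad    <- to_factor(data.frame(f1 = 1:3, AD = c("a", "b", "a")))
without_ad <- to_factor(data.frame(f1 = 1:3, zz = c("x", "y", "x")))
```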

Using `dplyr::mutate()` to create several new variables from names specified in a vector

You don't need any quosures here because you are not dealing with quoted expressions. Just create vectors with appropriate names and splice them in. Here I use the fact that vectors of length 1 are recycled, but you could also create vectors of full length:

add_columns <- function(df, columns) {
  new <- rep(NA_character_, length(columns))
  names(new) <- columns
  mutate(df, !!!new)
}
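As a usage sketch, here is a variant that splices a named list rather than a named vector, which should behave the same way. The function name add_na_columns and the sample data are made up for illustration.

```r
library(dplyr)

# Hypothetical variant: build a named list of length-1 values and splice it
add_na_columns <- function(df, columns) {
  new <- setNames(as.list(rep(NA_character_, length(columns))), columns)
  # each length-1 value is recycled to the number of rows in df
  mutate(df, !!!new)
}

out <- add_na_columns(data.frame(id = 1:3), c("a", "b"))
```

Each name in columns becomes one new character column filled with NA, so the caller never has to write the column names into the mutate() call.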

Casting unique features in column to variable names and dummy coding original features into variables in R

You can use cSplit_e from my "splitstackshape" package, like this:

library(splitstackshape)
cSplit_e(mydata, "NAMES", sep = ",", type = "character", fill = 0)
#   ID          NAMES NAMES_333 NAMES_4444 NAMES_456 NAMES_765
# 1  1 4444, 333, 456         1          1         1         0
# 2  2            333         1          0         0         0
# 3  3       456, 765         0          0         1         1

If you want to see the underlying function that is called when you use those arguments, you can look at splitstackshape:::charMat, which takes a list generated by strsplit and creates a matrix from it.

Calling the function directly would give you something like this:

splitstackshape:::charMat(
  lapply(strsplit(as.character(mydata$NAMES), ","),
         function(x) gsub("^\\s+|\\s+$", "", x)))  # trim leading/trailing whitespace
#      333 4444 456 765
# [1,]   1    1   1  NA
# [2,]   1   NA  NA  NA
# [3,]  NA   NA   1   1
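To make the idea concrete without relying on the package internals, here is a base R sketch of the same dummy coding. It is not how splitstackshape implements it; mydata is reconstructed from the output shown above, and the names parts, all_values and dummies are made up.

```r
# Data reconstructed from the output shown above (an assumption)
mydata <- data.frame(
  ID = 1:3,
  NAMES = c("4444, 333, 456", "333", "456, 765"),
  stringsAsFactors = FALSE
)

# Split each entry on commas and trim the surrounding whitespace
parts <- lapply(strsplit(as.character(mydata$NAMES), ","), trimws)

# Every distinct value becomes one indicator column
all_values <- sort(unique(unlist(parts)))

# One row per record, one 0/1 column per distinct value
dummies <- t(vapply(parts,
                    function(p) as.integer(all_values %in% p),
                    integer(length(all_values))))
colnames(dummies) <- paste0("NAMES_", all_values)

result <- cbind(mydata, dummies)
```

The column names are derived from the data, so nothing needs to change when new values appear in NAMES.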

How to apply function row-by-row into a data frame using dplyr without hardcoding the column names

A concise base R option would be

dat$score <- do.call(pmax, dat)/rowSums(dat)

In tidyverse we can do

library(tidyverse)
dat %>%
  mutate(score = do.call(pmax, .) / reduce(., `+`))
#   setosa versicolor virginica     score
# 1     50          0         0 1.0000000
# 2      0         11        36 0.7659574
# 3      0         39        14 0.7358491
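In dplyr 1.0.0 and later, across() can stand in for the dot, which also makes it possible to restrict the calculation to a subset of columns. A minimal sketch, with dat reconstructed from the output above (an assumption) and scored as a made-up name:

```r
library(dplyr)

# Data reconstructed from the printed result above
dat <- data.frame(
  setosa     = c(50, 0, 0),
  versicolor = c(0, 11, 39),
  virginica  = c(0, 36, 14)
)

# across(everything()) returns the current columns as a data frame, so it
# can be passed to pmax (via do.call) and to rowSums without naming them
scored <- dat %>%
  mutate(score = do.call(pmax, across(everything())) /
                 rowSums(across(everything())))
```

Swapping everything() for a narrower selection such as where(is.numeric) keeps non-numeric columns out of the row-wise calculation.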

Variable name as argument in function - ggplot R

If you call plot_var_aantal with a string like "people", you need to evaluate vari as a symbol on the right-hand side with !!sym(vari), and on the left-hand side you need to put it into a glue specification, "{vari}". The following code should work:

plot_var_aantal <- function(vari) {
  eval(substitute(vari), MASTERDATA)
  MASTERDATA %>%
    group_by(COMPANY) %>%
    summarize(AMOUNT_COMP = sum(AMOUNT, na.rm = TRUE),
              Type = Type,
              "{vari}" := !!sym(vari)) %>% # changed this line
    filter(!is.na(Type)) %>%
    ggplot(aes(!!sym(vari), AMOUNT_COMP), na.rm = TRUE) + # changed this line
    geom_point() +
    facet_wrap(~Type, nrow = 4)
}

However, without seeing your data it is hard to figure out what COMP_VAR1 = COMP_VAR1 is doing in your dplyr::summarise call. You are not using an aggregating function (like mean or paste(..., collapse = ",")), so the summarise call is probably not summarising at all but returning the data at its original length. Similarly, the line "{vari}" := !!sym(vari) doesn't seem to make sense (although the non-standard evaluation is correct when vari is a string).
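One way the summarise call could be made to aggregate is sketched below. This is an illustration under several assumptions, not the asker's intended logic: the function plot_by_var, the toy data, the choice of mean() as the aggregate for the variable, and grouping by Type as well as COMPANY are all made up here.

```r
library(dplyr)
library(ggplot2)

# Hypothetical rewrite: the data is passed in, and the chosen variable is
# aggregated with mean() so that each group yields exactly one row
plot_by_var <- function(data, vari) {
  data %>%
    filter(!is.na(Type)) %>%
    group_by(COMPANY, Type) %>%
    summarise(AMOUNT_COMP = sum(AMOUNT, na.rm = TRUE),
              "{vari}" := mean(.data[[vari]], na.rm = TRUE),
              .groups = "drop") %>%
    ggplot(aes(.data[[vari]], AMOUNT_COMP)) +
    geom_point() +
    facet_wrap(~Type, nrow = 4)
}

# Toy data, invented for this example
toy <- data.frame(
  COMPANY = c("a", "a", "b", "b"),
  Type    = c("x", "x", "y", "y"),
  AMOUNT  = c(1, 2, 3, NA),
  people  = c(10, 20, 30, 40)
)

p <- plot_by_var(toy, "people")
```

Using .data[[vari]] on the right-hand side and in aes() keeps the variable name a plain string from the function call through to the plot.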


