dplyr: how to avoid hard coding variable names when I need them all?
It really seems like everything() (newly fully exported) should do the trick, but it doesn't. Especially if you're going to be doing a lot of operations on all your columns, it may be worth it to make a list column with a vector of each row, on which you can easily call unique, max, etc. Here it is assembled with purrr, though you could do the same with apply(df, 1, list) %>% lapply(unlist):
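library(dplyr)
library(purrr)
# store each row as a vector in a list column, then keep rows with 3 distinct values
df1 <- df %>%
  mutate(data = df %>% transpose() %>% map(unlist)) %>%
  rowwise() %>%
  filter(length(unique(data)) == 3)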
df1
# Source: local data frame [210 x 4]
# Groups: <by row>
#
# X1 X2 X3 data
# <int> <int> <int> <list>
# 1 3 2 1 <int [3]>
# 2 4 2 1 <int [3]>
# 3 5 2 1 <int [3]>
# 4 6 2 1 <int [3]>
# 5 7 2 1 <int [3]>
# 6 2 3 1 <int [3]>
# 7 4 3 1 <int [3]>
# 8 5 3 1 <int [3]>
# 9 6 3 1 <int [3]>
# 10 7 3 1 <int [3]>
# .. ... ... ... ...
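# share of the remaining rows whose values span a range of exactly 2
df1 %>%
  rowwise() %>%
  filter(max(data) - min(data) == 2) %>%
  ungroup() %>%
  summarise(res = n() / nrow(df1)) %>%
  unlist %>%
  MASS::as.fractions()  # as.fractions() comes from the MASS package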
# res
# 1/7
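In dplyr 1.0.0 and later, c_across() can be combined with everything() inside a rowwise() pipeline, which avoids building the list column by hand. The following is a minimal sketch of that idea, not the original answer's method; the three-row df below is made-up illustration data, not the 210-row data frame from the question.

```r
library(dplyr)

# Hypothetical toy data (assumption, not the data from the question)
df <- data.frame(
  X1 = c(1L, 2L, 3L),
  X2 = c(1L, 3L, 2L),
  X3 = c(2L, 1L, 1L)
)

res <- df %>%
  rowwise() %>%
  # count distinct values across all columns of the current row
  mutate(n_unique = n_distinct(c_across(everything()))) %>%
  ungroup() %>%
  filter(n_unique == 3)

res
```

No column name is hard-coded here: everything() picks up whatever columns df has at that point.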
arranging columns using dplyr::select without hardcoding
By the sounds of it, you want to use select_ along with the .dots argument:
> table %>% select_(.dots = col.string)
empcnt1 wage1 empcnt2 wage2 empcnt3 wage3 cnty
1 200 40 200 40 200 40 1
2 300 50 300 50 300 50 2
3 400 60 400 60 400 60 3
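Note that select_() has been deprecated since dplyr 0.7.0. In current dplyr the same reordering can be written with select() and all_of(). A small sketch, assuming col.string is a character vector of column names; the tbl data frame below is made up for illustration:

```r
library(dplyr)

# Hypothetical data; only the column names mirror the question
tbl <- data.frame(
  cnty    = 1:3,
  empcnt1 = c(200, 300, 400),
  wage1   = c(40, 50, 60)
)

# Desired column order, held in a character vector
col.string <- c("empcnt1", "wage1", "cnty")

res <- tbl %>% select(all_of(col.string))
res
```

all_of() raises an error if any name in col.string is missing from the data, which is usually what you want when the order is computed programmatically.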
piping dplyr mutate with unknown variable name
You don't need enquo there. That's for turning a value passed as a parameter into a quosure. Instead you need to turn a string into a symbol. For that you can use as.name() or rlang::sym():
ff <- function(tt){
  # find the variable name
  if(any(colnames(tt)=="AD")){
    vv <- quo(AD)
  } else {
    vv <- colnames(tt) %>% .[.!="f1"]
    vv <- as.name(vv)
  }
  # make the mutate
  tt %>% mutate(!!quo_name(vv) := as.factor(!!vv))
}
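An alternative that skips symbols altogether is to keep the column name as a string and index the .data pronoun with it. This is a sketch of that variant rather than the answer's own approach; ff2 is a hypothetical name, and it assumes the data has exactly one column besides f1 when AD is absent.

```r
library(dplyr)

ff2 <- function(tt) {
  # pick "AD" if present, otherwise the first column that is not "f1"
  col <- if ("AD" %in% colnames(tt)) "AD" else setdiff(colnames(tt), "f1")[1]
  # a string works on the left of := and .data[[col]] looks the column up by name
  tt %>% mutate(!!col := as.factor(.data[[col]]))
}

with_ad    <- ff2(data.frame(f1 = 1:3, AD = c("a", "b", "a")))
without_ad <- ff2(data.frame(f1 = 1:3, zz = c("x", "y", "x")))
```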
Using `dplyr::mutate()` to create several new variables from names specified in a vector
You don't need any quosures here because you are not dealing with quoted expressions. Just create vectors with appropriate names and splice them in. Here I use the fact that vectors of length 1 are recycled, but you could also create vectors of full length:
add_columns <- function(df, columns){
  new <- rep(NA_character_, length(columns))
  names(new) <- columns
  mutate(df, !!!new)
}
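A quick usage sketch of the function above, with a made-up one-column data frame and made-up column names:

```r
library(dplyr)

add_columns <- function(df, columns) {
  # one NA per requested column, named after that column
  new <- rep(NA_character_, length(columns))
  names(new) <- columns
  # splice the named values in; each length-1 value is recycled to nrow(df)
  mutate(df, !!!new)
}

# Hypothetical input
out <- add_columns(data.frame(x = 1:3), c("a", "b"))
out
```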
Casting unique features in column to variable names and dummy coding original features into variables in R
You can use cSplit_e from my "splitstackshape" package, like this:
library(splitstackshape)
cSplit_e(mydata, "NAMES", sep = ",", type = "character", fill = 0)
# ID NAMES NAMES_333 NAMES_4444 NAMES_456 NAMES_765
# 1 1 4444, 333, 456 1 1 1 0
# 2 2 333 1 0 0 0
# 3 3 456, 765 0 0 1 1
If you want to see the underlying function that is called when you use those arguments, you can look at splitstackshape:::charMat, which takes a list generated by strsplit and creates a matrix from it.
Calling the function directly would give you something like this:
splitstackshape:::charMat(
  lapply(strsplit(as.character(mydata$NAMES), ","),
         function(x) gsub("^\\s+|\\s+$", "", x)))  # trim leading and trailing whitespace
# 333 4444 456 765
# [1,] 1 1 1 NA
# [2,] 1 NA NA NA
# [3,] NA NA 1 1
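To make the idea behind that step concrete, here is a rough base R sketch that builds the same kind of indicator matrix from the strsplit output. It is an illustration only and not how splitstackshape implements charMat; mydata is reconstructed from the output shown above, and it fills absent values with 0 rather than NA.

```r
# Data reconstructed from the printed output above (assumption)
mydata <- data.frame(
  ID = 1:3,
  NAMES = c("4444, 333, 456", "333", "456, 765"),
  stringsAsFactors = FALSE
)

# split each string on commas and trim surrounding whitespace
parts <- lapply(strsplit(mydata$NAMES, ","), trimws)

# every distinct value becomes one indicator column
levels_all <- sort(unique(unlist(parts)))

# one row per original row, 1 where the value is present and 0 otherwise
dummies <- t(vapply(
  parts,
  function(p) as.integer(levels_all %in% p),
  integer(length(levels_all))
))
colnames(dummies) <- paste0("NAMES_", levels_all)

cbind(mydata, dummies)
```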
How to apply function row-by-row into a data frame using dplyr without hardcoding the column names
A concise base R option would be
dat$score <- do.call(pmax, dat)/rowSums(dat)
In tidyverse we can do
library(tidyverse)
dat %>%
  mutate(score = do.call(pmax, .)/reduce(., `+`))
# setosa versicolor virginica score
#1 50 0 0 1.0000000
#2 0 11 36 0.7659574
#3 0 39 14 0.7358491
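With dplyr 1.0.0 or later, the same computation can be written with across(everything()) instead of the . placeholder, which also works with the native pipe. A sketch, with dat rebuilt from the output printed above:

```r
library(dplyr)

# Data reconstructed from the printed output above (assumption)
dat <- data.frame(
  setosa     = c(50, 0, 0),
  versicolor = c(0, 11, 39),
  virginica  = c(0, 36, 14)
)

res <- dat %>%
  mutate(
    # across(everything()) returns all current columns as a data frame,
    # so pmax() and rowSums() see every column without naming any of them
    score = do.call(pmax, across(everything())) / rowSums(across(everything()))
  )

res
```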
Variable name as argument in function - ggplot R
If you call plot_var_aantal with a string like "people", you need to evaluate vari as a symbol on the right-hand side with !!sym(vari), and on the left-hand side you need to put it into a glue specification "{vari}". The following code should work:
plot_var_aantal<-function(vari){
  eval(substitute(vari), MASTERDATA)
  MASTERDATA%>%
    group_by(COMPANY)%>%
    summarize(AMOUNT_COMP = (sum(AMOUNT, na.rm=TRUE)),
              Type=Type,
              "{vari}" := !!sym(vari)) %>% # changed this line
    filter(!is.na(Type)) %>%
    ggplot(aes(!! sym(vari),AMOUNT_COMP ), na.rm = TRUE) + # changed this line
    geom_point()+
    facet_wrap(~Type,nrow=4)
}
However, without seeing your data it is hard to figure out what COMP_VAR1 = COMP_VAR1 is doing in your dplyr::summarise call. You are not using an aggregating function (like mean or paste(..., collapse = ",")), so the whole summarise is probably not summarising but returning data in the original length. Similarly, the line "{vari}" := !!sym(vari) doesn't seem to make sense (although the non-standard evaluation, when vari is a string, is correct).
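For comparison, here is a sketch of how the function might look if the summarise step really aggregates. Everything specific in it is an assumption: the function name plot_by_var, the toy data, the choice of mean() as the aggregate for the variable, and grouping by both COMPANY and Type. It uses the .data pronoun, which accepts a string directly in both dplyr and ggplot2.

```r
library(dplyr)
library(ggplot2)

# Hypothetical function: data is passed in explicitly, vari is a column name as a string
plot_by_var <- function(data, vari) {
  data %>%
    filter(!is.na(Type)) %>%
    group_by(COMPANY, Type) %>%
    summarise(
      AMOUNT_COMP = sum(AMOUNT, na.rm = TRUE),
      # assumption: the variable is averaged per group
      "{vari}" := mean(.data[[vari]], na.rm = TRUE),
      .groups = "drop"
    ) %>%
    ggplot(aes(.data[[vari]], AMOUNT_COMP)) +
    geom_point() +
    facet_wrap(~Type, nrow = 4)
}

# Made-up data for illustration only
toy <- data.frame(
  COMPANY = c("A", "A", "B", "B"),
  Type    = c("x", "x", "y", NA),
  AMOUNT  = c(10, 20, 5, 7),
  people  = c(3, 5, 2, 4)
)

p <- plot_by_var(toy, "people")
```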