dplyr::select() with some variables that may not exist in the data frame?
Another option is select_if:
d2 %>% select_if(names(.) %in% c('taxon', 'model', 'z'))
# # A tibble: 1 x 2
# taxon z
# <dbl> <dbl>
# 1 2 3
select_if is superseded. Use any_of instead:
d2 %>% select(any_of(c('taxon', 'model', 'z')))
# # A tibble: 1 x 2
# taxon z
# <dbl> <dbl>
# 1 2 3
Type ?dplyr::select in R and you will find this:
These helpers select variables from a character vector:
all_of(): Matches variable names in a character vector. All names must be present, otherwise an out-of-bounds error is thrown.
any_of(): Same as all_of(), except that no error is thrown for names that don't exist.
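The difference between the two helpers is easiest to see side by side. The following is a minimal sketch, assuming a one-row tibble shaped like the d2 above (the data and the names wanted, kept and strict are made up for illustration):

```r
library(dplyr)

# Hypothetical one-row data frame; "model" is deliberately absent
d2 <- tibble(taxon = 2, z = 3)
wanted <- c("taxon", "model", "z")

# any_of() silently skips names that are not columns of d2
kept <- d2 %>% select(any_of(wanted))

# all_of() is strict: the missing "model" column raises an error,
# which is caught here so the script keeps running
strict <- tryCatch(
  d2 %>% select(all_of(wanted)),
  error = function(e) "error"
)
```

Here kept holds only the columns that exist, while strict ends up as the placeholder string because all_of() refused the missing name.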
How do I select columns that may or may not exist?
In the devel version of dplyr
df %>%
  select(year, contains("boo"))
# year
#1 2000
#2 2001
#3 2002
#4 2003
#5 2004
#6 2005
#7 2006
#8 2007
#9 2008
#10 2009
#11 2010
gives the expected output.
Otherwise, one option would be to use one_of:
df %>%
  select(one_of("year", "boo"))
It returns a warning message if the column is not available. (one_of() has since been superseded by any_of() and all_of().)
Another option is matches:
df %>%
  select(matches("year|boo"))
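One thing to watch with matches is that it takes a regular expression, so an unanchored pattern also picks up columns whose names merely contain "year" or "boo". A small sketch, assuming a made-up data frame with a near-miss column called yearly_total:

```r
library(dplyr)

# Hypothetical data: "year" exists, "boo" does not,
# and "yearly_total" only contains the text "year"
df <- tibble(year = 2000:2002, yearly_total = c(10, 20, 30))

# Unanchored pattern: matches any name containing "year" or "boo"
loose <- df %>% select(matches("year|boo"))

# Anchored pattern: matches only names that are exactly "year" or "boo"
exact <- df %>% select(matches("^(year|boo)$"))
```

If exact names are what you want, anchoring the pattern (or using any_of with a character vector) is probably the safer choice.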
dplyr::select() to reorder columns which may not exist
We can use intersect to keep only the names in col_order that actually exist in the data:
library(dplyr)
tibby %>%
  select(intersect(col_order, names(.)))
# A tibble: 10 x 3
# a b d
# <dbl> <dbl> <dbl>
# 1 -0.0449 0.935 -0.626
# 2 -0.0162 0.212 0.184
# 3 0.944 0.652 -0.836
# 4 0.821 0.126 1.60
# 5 0.594 0.267 0.330
# 6 0.919 0.386 -0.820
# 7 0.782 0.0134 0.487
# 8 0.0746 0.382 0.738
# 9 -1.99 0.870 0.576
#10 0.620 0.340 -0.305
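To make the behaviour reproducible without the original data, here is a minimal sketch with a made-up tibby and col_order (both hypothetical). It relies on intersect returning the common names in the order of its first argument, so the result follows col_order and not the order of the data:

```r
library(dplyr)

# Hypothetical data whose columns are stored in a different order
tibby <- tibble(d = 1:2, a = 3:4, b = 5:6)

# Desired order; "c" does not exist in tibby
col_order <- c("a", "b", "c", "d")

# intersect() drops "c" and keeps the order given by col_order
reordered <- tibby %>%
  select(intersect(col_order, names(.)))
```

With a recent dplyr, select(any_of(col_order)) should give the same result.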
dplyr: select all variables except for those contained in vector
select(df, -any_of(excluded_vars))
is now the safest way to do this (the code will not break if a variable name that doesn't exist in df is included in excluded_vars).
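A quick sketch of that, assuming a small made-up df and an excluded_vars vector that contains one name that is not a column:

```r
library(dplyr)

# Hypothetical data and exclusion list; "not_a_column" is not in df
df <- tibble(a = 1:3, b = 4:6, c = 7:9)
excluded_vars <- c("b", "not_a_column")

# Drops "b" and ignores the name that does not exist
remaining <- select(df, -any_of(excluded_vars))
```

Swapping any_of for all_of here would raise an error instead, because of the missing name.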
dplyr r : selecting columns whose names are in an external vector
We could use any_of with select:
library(dplyr)
data %>%
  select(any_of(col_names))
-output
a b
1 1 e
2 4 e
3 13 f
4 8 m
5 10 z
6 3 y
...
dplyr::select - Including All Other Columns at End of New Data Frame (or Beginning or Middle)
Update: using dplyr::relocate()
flights %>%
  relocate(carrier, tailnum, year, month, day)
flights %>%
  relocate(carrier, tailnum, year, month, day, .after = last_col())
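relocate accepts the same selection helpers as select, so it can be combined with any_of when some of the columns may be missing. A minimal sketch, assuming a tiny stand-in for flights that has no tailnum column (the data and the names to_front and to_back are hypothetical):

```r
library(dplyr)

# Hypothetical stand-in for flights; there is no tailnum column
df <- tibble(year = 2013, carrier = "UA", day = 1)
col <- c("carrier", "tailnum")

# Move the columns that exist to the front
to_front <- df %>% relocate(any_of(col))

# Move the columns that exist to the end
to_back <- df %>% relocate(any_of(col), .after = last_col())
```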
Old answer
If you want to reorder the columns:
select(flights, carrier, tailnum, year, month, day, everything())
Or in two steps, to select variables provided in a character vector, use one_of("x", "y", "z"):
col <- c("carrier", "tailnum", "year", "month", "day")
select(flights, one_of(col), everything())
select(flights, -one_of(col), one_of(col))
If you want to add the whole data frame again using dplyr:
bind_cols(select(flights, one_of(col)), flights)
bind_cols(flights, select(flights, one_of(col)))
dplyr::select Object not found in self-made function
There are two issues with your function. The first error arises because calendario is not a column of the df passed to the function; it is only created inside the pipeline. Simply remove the df$ when specifying the aesthetics. Second, even after removing the df$, you set the y aesthetic equal to the string stored in the variable dato, i.e. "indice_covid" in your example. That is, for every date you have the same value "indice_covid", which is why you get a flat line. To tell ggplot2 that you want the column of df named by dato, you have to convert the string to a symbol using sym and unquote it with the bang-bang operator !!, i.e. !!sym(dato). Try this:
library(ggplot2)
library(dplyr)
plot_by_reg <- function(df, reg, dato) {
  df %>%
    dplyr::filter(denominazione_regione == reg) %>%
    dplyr::mutate(calendario = format(as.Date(paste(mese, giorno, sep = "-"), format = "%m-%d"), "%m-%d")) %>%
    dplyr::select(c(denominazione_regione, calendario, all_of(dato))) %>%
    #ggplot(aes(x=df$calendario, y=df$dato)) +
    ggplot(aes(x = calendario, y = !!sym(dato))) +
    geom_line(aes(group = 1)) +
    theme_dark()
}
plot_by_reg(df = data.moving, reg = "Toscana", dato = "indice_covid")
Created on 2020-05-25 by the reprex package (v0.3.0)
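The underlying pitfall is that $ treats what follows it as a literal column name and never looks inside a variable. A base R sketch of this, using a made-up two-row data frame (the values and the names by_dollar and by_name are hypothetical):

```r
# Hypothetical data frame with a single numeric column
df <- data.frame(indice_covid = c(1.2, 3.4))
dato <- "indice_covid"

# $ looks for a column literally called "dato", which does not exist
by_dollar <- df$dato

# [[ ]] uses the string stored in dato as the column name
by_name <- df[[dato]]
```

Here by_dollar is NULL while by_name holds the column values, which is why the column name has to be turned into something ggplot2 can evaluate against the data, as !!sym(dato) does.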
How can I select only the dummy variable columns?
You can pass a function (or an rlang-style lambda formula) to select_if and look for columns that only contain values in 0:1.
tribble(
  ~id, ~gender, ~height, ~smoking,
  1, 1, 170, 0,
  2, 0, 150, 0,
  3, 1, 160, 1
) %>%
  select_if(~ all(. %in% 0:1))
# # A tibble: 3 x 2
# gender smoking
# <dbl> <dbl>
# 1 1 0
# 2 0 0
# 3 1 1
If a dummy-variable column may contain NA, you may want to use %in% c(0:1, NA) in the predicate instead.
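Since select_if is superseded in current dplyr, the same predicate can also be written with select and where. A minimal sketch, assuming a made-up data frame in which gender contains an NA (the data and the name dummies are hypothetical):

```r
library(dplyr)

# Hypothetical data; gender has a missing value
df <- tibble(
  id      = c(1, 2, 3),
  gender  = c(1, NA, 1),
  height  = c(170, 150, 160),
  smoking = c(0, 0, 1)
)

# Keep columns whose values are all 0, 1 or NA
dummies <- df %>%
  select(where(~ all(. %in% c(0, 1, NA))))
```

Without NA in the allowed set, the gender column would be dropped here, because NA %in% 0:1 is FALSE.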