Filter dataframe using global variable with the same name as column name
You can do:
df %>% filter(y == .GlobalEnv$y)
or:
df %>% filter(y == .GlobalEnv[["y"]])
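Both of these work in this context, but won't if all this is going on inside a function, where y is a function argument rather than a global variable. The original suggestion for that case is get:
df %>% filter(y == get("y"))
f = function(df, y){df %>% filter(y==get("y"))}
Be aware that in recent dplyr versions get("y") is evaluated inside the data mask, so it may find the column y before the variable; check the result before relying on it.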
Or just use df[df$y==y,] instead of dplyr.
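As a minimal sketch of the function case, the example below compares a column against a same-named function argument in two ways: with the .env pronoun (this assumes a reasonably recent dplyr) and with plain base R indexing. The data frame and the helper names keep_matching and keep_matching_base are hypothetical and only serve as illustration.

```r
library(dplyr)

df <- data.frame(x = 1:3, y = c(1, 2, 2))

# Hypothetical helper: the argument `y` has the same name as the column `y`.
keep_matching <- function(df, y) {
  # Left-hand `y` is the column; `.env$y` is the function argument.
  df %>% filter(y == .env$y)
}

# Base R equivalent: `df$y` is the column, bare `y` is the argument.
keep_matching_base <- function(df, y) {
  df[df$y == y, ]
}

keep_matching(df, 2)       # rows where column y equals 2
keep_matching_base(df, 2)  # same rows
```

Neither version depends on the global environment, so both behave the same at the top level and inside other functions.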
R: How to use a global variable that clashes with column name in a dplyr workflow?
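We can use the unquote operator (!!), which substitutes the value of the outer y into the expression before mutate() evaluates it:
library(dplyr)
y <- 1  # assumed global value, inferred from the output below
data.frame(x = 1, y = 2) %>%
  mutate(xy = x + !!y)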
-output
# x y xy
#1 1 2 2
Or extract directly from the .GlobalEnv
data.frame(x = 1, y = 2) %>%
mutate(xy = x+ .GlobalEnv$y)
-output
# x y xy
#1 1 2 2
Referring to columns and variables with the same name in dplyr filter
This can be achieved with the .env pronoun from rlang (see e.g. this blog post):
library(dplyr)
id = "a"
df <- tibble(
id = c("a", "b", "c"),
value = c(1, 2, 3)
)
df %>%
dplyr::filter(id == .env$id)
#> # A tibble: 1 × 2
#> id value
#> <chr> <dbl>
#> 1 a 1
Search for does-not-contain on a DataFrame in pandas
You can use the invert (~) operator (which acts like a not for boolean data):
new_df = df[~df["col"].str.contains(word)]
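where new_df is the copy returned by the right-hand side.
contains also accepts a regular expression; the pattern is treated as a regex by default, so pass regex=False to match word literally.
If the above throws a ValueError or TypeError, the likely reason is that the column has mixed datatypes or missing values, which produce NaN in the result, so use na=False: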
new_df = df[~df["col"].str.contains(word, na=False)]
Or,
new_df = df[df["col"].str.contains(word) == False]
data.table: column name same as variable name
To use get and avoid .SD, you need to set the environment:
dt <- data.table::data.table(myvar = 1:10)
myvar <- 2
dt[myvar %in% get("myvar", envir = parent.env(environment()))]
#> myvar
#> 1: 2
Using parent.env(environment()) instead of globalenv() is more robust. Consider its use inside a function, where looking in the global environment would not work:
myfun <- function() {
dt <- data.table::data.table(myvar = 1:10)
myvar <- 2
dt[myvar %in% get("myvar", envir = parent.env(environment()))]
}
myfun()
#> myvar
#> 1: 2
Subsetting a data.table with a variable (when varname identical to colname)
If you don't mind doing it in 2 steps, you can build the subset condition outside the scope of your data.table, where V1 on the right-hand side refers to the variable (though it's usually not what you want to do when working with data.table...):
wh_v1 <- my_data_table[, V1]==V1
my_data_table[wh_v1]
# V1 V2
#1: A 1
#2: A 4
filtering with a variable does not give the same results as with a constant - R
The problem is that your data contains a column i. In tidyverse pipes, functions always look for names within the data first, so what patch_sparse %>% filter(period==i) actually does is keep the rows where period is equal to the column i of your data.
So if you want to filter based on an external scalar, make sure the name of the scalar is different from your data's column names, e.g. something like:
filter_i <- 0
patch_sparse %>% filter(period==filter_i)
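The lookup order described above can be checked on a small example. The sketch below uses a made-up stand-in for patch_sparse (the real data is not shown in the question) and compares the masked lookup with an explicit .env$i lookup, which assumes a reasonably recent dplyr.

```r
library(dplyr)

# Hypothetical stand-in for patch_sparse: it has a column named `i`.
patch_sparse <- data.frame(period = c(0, 1, 2), i = c(0, 0, 2))
i <- 0  # external scalar with the same name as the column

# `i` resolves to the column, so rows are compared element-wise: period == i.
masked <- patch_sparse %>% filter(period == i)

# `.env$i` looks the scalar up in the calling environment instead.
scalar <- patch_sparse %>% filter(period == .env$i)

nrow(masked)  # 2: rows where the two columns happen to agree
nrow(scalar)  # 1: only the row with period == 0
```

Renaming the scalar, as suggested above, gives the same result as the .env version without needing the pronoun.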