filter function in dplyr errors: object 'name' not found
It seems you are getting the stats::filter function and not the dplyr one. To make sure you get the right one, use the notation dplyr::filter.
d = data.frame(x=1:10,
name=c("foo","bar","baz","bar","bar","baz","fnord","qar","qux","quux"))
filter(d, !grepl("ar|ux", name))
Error in grepl("ar|ux", name) : object 'name' not found
dplyr::filter(d, !grepl("ar|ux", name))
  x  name
1 1   foo
2 3   baz
3 6   baz
4 7 fnord
You don't even need to call library(dplyr) for this to work, although you do need dplyr installed. The package::function notation works for functions from any package.
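One way to check which filter a bare call would reach is to ask R which namespace the function belongs to. The following is a minimal sketch using only base R and the built-in mtcars data; the variable names which_pkg and msg are illustrative, not part of the original answer.

```r
# Report the package whose `filter` a bare call would use right now.
# "stats" suggests dplyr is not attached; "dplyr" suggests it is.
which_pkg <- environmentName(environment(filter))
print(which_pkg)

# Reproduce the misleading error: stats::filter() evaluates `cyl > 12`
# as an ordinary expression, so the column name is looked up as a variable.
msg <- tryCatch(
  stats::filter(mtcars, cyl > 12),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If which_pkg is "stats", either attach dplyr with library(dplyr) or write dplyr::filter explicitly.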
What is causing 'object not found' error in filter() with the across() function?
Sorry for omitting this in my previous suggestion to you. Unfortunately, your original question was closed before I could post it as an answer:
If you want your function to resemble dplyr, here are a few modifications you can make. Write your function header as function(filename, opp, ...) verbatim. Then replace !is.na(ID) with across(..., ~ !is.na(.x)) verbatim. Now you can call extract_ids() and, just as you would with any dplyr verb, specify any selection of columns from which you want to filter out NAs: extract_ids(filename = "farmers.csv", opp = "farmers", rid, another_column_you_want_without_NAs).
Object Not Found
As MrFlick rightly suggests in their comment, you should wrap ... with c(), so everything you pass in ... is interpreted as the first argument to across(): a single tidy-selection of columns from df:
extract_ids <- function(filename, opp, ...) {
  # ...
  # Filter and select
  df_id <- df %>%
    # This format is preferred for dplyr workflows with pipes (%>%).
    filter(across(c(...), ~ !is.na(.x)) & gc == 1) %>%
    select(...)
  # ...
}
Without this precaution, R interprets rid and cintid as multiple arguments to across(), rather than as simply columns named by the first argument (the tidy-selection).
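In newer dplyr releases (if_all() was added in 1.0.4, to my knowledge), if_all() is the helper intended for this kind of predicate inside filter(), and using across() there is deprecated. The following is a minimal sketch under that assumption; keep_complete() and the toy data frame are hypothetical stand-ins for the asker's function and data.

```r
library(dplyr)

# Hypothetical data standing in for the asker's `df`.
df <- data.frame(
  rid    = c(1, NA, 3, 4),
  cintid = c("a", "b", NA, "d"),
  gc     = c(1, 1, 1, 0)
)

# Hypothetical helper: keep rows where every selected column is non-NA
# and gc equals 1, then keep only the selected columns.
keep_complete <- function(data, ...) {
  data %>%
    filter(if_all(c(...), ~ !is.na(.x)) & gc == 1) %>%
    select(...)
}

result <- keep_complete(df, rid, cintid)
print(result)
```

Only the first row survives here: rows 2 and 3 each contain an NA in a selected column, and row 4 has gc equal to 0.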
Variable Names in the Filepath
To get those variable names within your filepath, use
extract_ids <- function(filename, opp, ...) {
  # ...
  # Expand the '...' into a list of given variable names, which will get pasted.
  path <- c("/Users/stephenpoole/Downloads/", opp, "_", match.call(expand.dots = FALSE)$`...`, ".csv")
  # ...
}
though you might want to consider replacing match.call(expand.dots = FALSE)$`...`, which currently mushes together the variable names:
"/Users/stephenpoole/Downloads/farmers_ridcintid.csv"
In exactly the same place, you might use the expression paste(match.call(expand.dots = FALSE)$`...`, collapse = "-"), which will separate those variable names using -:
"/Users/stephenpoole/Downloads/farmers_rid-cintid.csv"
or any other separator of your choice that gives a valid filename.
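The capture-and-paste step can be tried in isolation. This is a minimal base R sketch; build_path() is a hypothetical helper, and the directory prefix is left out so that only the file name is built.

```r
# Hypothetical helper: build a file name from `opp` and the unevaluated
# names passed through `...`.
build_path <- function(opp, ..., sep = "-") {
  # Capture the arguments in `...` as unevaluated symbols.
  dots <- match.call(expand.dots = FALSE)$`...`
  # Turn each symbol into its name as a string.
  vars <- vapply(dots, deparse, character(1))
  paste0(opp, "_", paste(vars, collapse = sep), ".csv")
}

file_name <- build_path("farmers", rid, cintid)
print(file_name)
```

Because the arguments in ... are never evaluated, rid and cintid do not need to exist as objects when build_path() is called.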
filter does not work properly in dplyr (Object not found)
In this line
df %>% select(Number) %>% filter(Letter == 'a')
The first call to select leaves you with a data table containing only one column (Number), which is exactly why filter is complaining: you threw the Letter column away. In the second call, you filter on Letter first, then throw the column away. So filter is working exactly as it should. There is no "general rule" for this beyond either "do things in a sensible order" or "garbage in, garbage out".
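The effect of the ordering can be seen on a small example. This is a minimal sketch that assumes dplyr is installed; the data frame is made up to match the column names in the question.

```r
library(dplyr)

# Toy data with the two columns named in the question.
df <- data.frame(Number = 1:4, Letter = c("a", "b", "a", "c"))

# Filter first, then drop the column: works.
ok <- df %>% filter(Letter == "a") %>% select(Number)
print(ok)

# Drop the column first, then filter on it: fails, because Letter is gone.
bad <- tryCatch(
  df %>% select(Number) %>% filter(Letter == "a"),
  error = function(e) e
)
print(inherits(bad, "error"))
```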
object not found in filtering
As @Edward mentioned, you are probably using stats::filter. We can reproduce the same error message using the sample mtcars dataset.
stats::filter(mtcars, cyl > 12)
Error in stats::filter(mtcars, cyl > 12) : object 'cyl' not found
Use dplyr::filter instead. You also don't need to specify the four conditions separately; use %in%:
data_filtered <- dplyr::filter(data, !Item %in% c("INFUSION SET (L.E 15)",
"INFUSION SET (L.E 4)","SYRINGE 3ML","CANNULA (22) BLUE"))
which is the same as using subset in base R:
data_filtered <- subset(data, !Item %in% c("INFUSION SET (L.E 15)",
"INFUSION SET (L.E 4)","SYRINGE 3ML","CANNULA (22) BLUE"))
Or
data_filtered <- data[!data$Item %in% c("INFUSION SET (L.E 15)",
"INFUSION SET (L.E 4)","SYRINGE 3ML","CANNULA (22) BLUE"), ]
dplyr::select Object not found in self-made function
There are two issues with your function. First, the error arises because calendario is not a column of the df passed to the function; it only exists in the piped result. Simply remove the df$ when specifying the aesthetics. Second, even after removing the df$, you set the y-aesthetic equal to the string stored in the variable dato, i.e. "indice_covid" in your example. That means every date gets the same value "indice_covid", which is why you get a flat line. To tell ggplot2 that you want the column of the df named by dato, convert the string to a symbol using sym and unquote it with the bang-bang operator !!, i.e. !!sym(dato). Try this:
library(ggplot2)
library(dplyr)
plot_by_reg <- function(df, reg, dato) {
  df %>%
    dplyr::filter(denominazione_regione == reg) %>%
    dplyr::mutate(calendario = format(as.Date(paste(mese, giorno, sep = "-"), format = "%m-%d"), "%m-%d")) %>%
    dplyr::select(c(denominazione_regione, calendario, all_of(dato))) %>%
    #ggplot(aes(x=df$calendario, y=df$dato)) +
    ggplot(aes(x = calendario, y = !!sym(dato))) +
    geom_line(aes(group = 1)) +
    theme_dark()
}
plot_by_reg(df = data.moving, reg = "Toscana", dato = "indice_covid")
Created on 2020-05-25 by the reprex package (v0.3.0)
My dplyr::filter function keeps displaying different errors
From the error message it seems AF2 is a matrix. You can convert it to a data frame, and then your code should work.
AF2_MAF <- AF2 %>% as.data.frame() %>% filter (Frequency>=0.1)
Alternatively, you can also subset the matrix.
AF2_MAF <- AF2[AF2[, "Frequency"] >= 0.1, ]
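Both routes can be compared on a small made-up matrix. This is a minimal base R sketch; the values and the Count column are invented for illustration, and drop = FALSE is an addition of mine that keeps the result a matrix even when only one row matches.

```r
# Hypothetical stand-in for AF2: a numeric matrix with named columns.
AF2 <- matrix(
  c(0.05, 0.20, 0.50,
    10,   20,   30),
  ncol = 2,
  dimnames = list(NULL, c("Frequency", "Count"))
)

# Route 1: subset the matrix directly.
by_matrix <- AF2[AF2[, "Frequency"] >= 0.1, , drop = FALSE]

# Route 2: convert to a data frame first, then filter by column name.
by_df <- subset(as.data.frame(AF2), Frequency >= 0.1)

print(by_matrix)
print(by_df)
```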
Is there something wrong with this Filter function code in R?
You need library(dplyr) after install.packages("dplyr").