How to select non-numeric columns using dplyr::select_if
You can use purrr's negate(), which is included if you use library(tidyverse) rather than just library(dplyr):
library(tidyverse)
iris %>% select_if(negate(is.numeric))
Extract all columns except numeric in R data frame
The purrr package from the tidyverse provides exactly what you want with purrr::keep and purrr::discard:
library(purrr)
x <- iris %>% keep(is.numeric)
With this piece of code, you set a logical test in the keep function, and only the columns that pass the test are kept.
To reverse that operation and get what you want, you can use discard, also from purrr:
x <- iris %>% discard(is.numeric)
You can think of discard as keep, but with !is.numeric.
Alternatively, with dplyr:
x <- iris %>% select_if(~!is.numeric(.))
How to select_if in dplyr, where the logical condition is negated
Negating a predicate function can be done with the dedicated Negate() or purrr::negate() functions (rather than the ! operator, which negates a vector):
library(dplyr)
mtcars %>%
mutate(foo = "bar") %>%
select_if(Negate(is.numeric)) %>%
head()
# foo
# 1 bar
# 2 bar
# 3 bar
# 4 bar
# 5 bar
# 6 bar
Or with purrr::negate() (lower-case), which has slightly different behavior; see the respective help pages:
library(purrr)
library(dplyr)
mtcars %>%
mutate(foo = "bar") %>%
select_if(negate(is.numeric)) %>%
head()
# foo
# 1 bar
# 2 bar
# 3 bar
# 4 bar
# 5 bar
# 6 bar
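To illustrate why the ! operator cannot be applied to the predicate itself, here is a minimal base R sketch (no packages assumed; iris is used only as a convenient example). Negate() returns a new function, while applying ! to is.numeric raises an error. Filter() stands in here for select_if and is not part of the original answer.

```r
# Negate() wraps a predicate and returns a new function
not_numeric <- Negate(is.numeric)
is.function(not_numeric)

# Applying ! directly to a function is an error, because ! expects a vector
bang_result <- tryCatch(!is.numeric, error = function(e) "error")

# Base R stand-in for select_if(): keep the columns where the predicate is TRUE
non_numeric_cols <- Filter(not_numeric, iris)
names(non_numeric_cols)
```

This is also why the formula form ~ !is.numeric(.) works: there, ! is applied to the logical result of calling the predicate, not to the function.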
Selecting only numeric columns from a data frame
EDIT: updated to avoid use of the ill-advised sapply.
Since a data frame is a list we can use the list-apply functions:
nums <- unlist(lapply(x, is.numeric), use.names = FALSE)
Then standard subsetting
x[ , nums]
## don't use sapply, even though it's less code
## nums <- sapply(x, is.numeric)
For more idiomatic modern R, I'd now recommend:
x[ , purrr::map_lgl(x, is.numeric)]
Less code, less dependent on R's particular quirks, more straightforward, and robust to use on database-backed tibbles:
dplyr::select_if(x, is.numeric)
Newer versions of dplyr also support the following syntax:
x %>% dplyr::select(where(is.numeric))
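The where() syntax can also be negated directly, which may be handy for the opposite case of dropping numeric columns. The sketch below assumes dplyr 1.0.0 or later and uses iris purely as an example data set.

```r
library(dplyr)

# Keep only numeric columns (same columns as select_if(is.numeric))
num_cols <- iris %>% select(where(is.numeric))

# Negate the selection helper itself to drop the numeric columns
non_num_cols <- iris %>% select(!where(is.numeric))

names(num_cols)
names(non_num_cols)
```

Here ! is applied to the selection produced by where(), not to the is.numeric function, so no Negate() wrapper is needed.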
How to exclude non-numeric columns in dplyr statement
You can use dplyr's select_if function:
df %>% select_if(is.numeric)
or, as Mislav suggested in the comments, go straight to a summary using summarise_if:
df %>%
group_by(Pop_Size_Group) %>%
summarise_if(is.numeric, mean, na.rm = TRUE)
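In dplyr 1.0.0 and later, the scoped verbs such as summarise_if are superseded by across(). A rough equivalent of the grouped summary might look like the sketch below; it uses iris and its Species column as stand-ins, since the original df and Pop_Size_Group are not shown.

```r
library(dplyr)

# Mean of every numeric column within each group, ignoring NAs
species_means <- iris %>%
  group_by(Species) %>%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))

species_means
```

The grouping column is kept automatically, so the non-numeric columns do not need to be dropped beforehand.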
How do I remove all integer columns from a dataframe in R with dplyr?
Use select_if:
out <- mydata %>%
select_if(Negate(is.integer))
str(out)
#'data.frame': 50 obs. of 2 variables:
# $ Murder: num 13.2 10 8.1 8.8 9 7.9 3.3 5.9 15.4 17.4 ...
# $ Rape : num 21.2 44.5 31 19.5 40.6 38.7 11.1 15.8 31.9 25.8 ...
If we want to exclude more than one type, then use
mydata %>%
select_if(~ !(is.integer(.x) | is.numeric(.x)))
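As a small check of how a predicate that excludes several types behaves, the sketch below builds a hypothetical data frame (mixed and its column names are invented for illustration) and negates the combined test as a whole. Note that is.numeric() is already TRUE for integer columns, so the is.integer() check is shown only to mirror the pattern.

```r
library(dplyr)

# Hypothetical data frame with one column of each type
mixed <- data.frame(
  int_col = 1:3,
  dbl_col = c(1.5, 2.5, 3.5),
  chr_col = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# is.numeric() is TRUE for both the integer and the double column
numeric_flags <- vapply(mixed, is.numeric, logical(1))

# Drop every column that is integer or numeric; the negation wraps the whole test
kept <- mixed %>% select_if(~ !(is.integer(.x) | is.numeric(.x)))
names(kept)
```

The placement of the parentheses matters: negating only the first test would keep any column that is either non-integer or numeric, which is every column.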
Filtering in dplyr based on two non-numeric values
Try changing Variable == "System" & "System (U.S.)" to Variable == "System" | Variable == "System (U.S.)". That should work.
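When a column is compared against several values, %in% is a common shorthand for chaining == with |. A minimal sketch, assuming a hypothetical data frame (measurements and Value are invented; only the Variable column name comes from the question):

```r
library(dplyr)

# Hypothetical data; only the column name Variable follows the question
measurements <- data.frame(
  Variable = c("System", "System (U.S.)", "Other"),
  Value = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# Equivalent to Variable == "System" | Variable == "System (U.S.)"
systems <- measurements %>%
  filter(Variable %in% c("System", "System (U.S.)"))

systems
```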
How to use select() only on columns of a certain type without losing columns of other types?
Perhaps an option could be to create your own custom function and use that as the predicate in the select_if function. Something like this:
check_cond <- function(x) is.character(x) || (is.numeric(x) && sum(x) > 12)
tibbly %>%
select_if(check_cond)
y z
<chr> <dbl>
1 a 9
2 b 8
3 c 7
4 d 6