Extracting columns having greater than certain values in R dataframe
We could use colSums to subset columns in base R:
df[colSums(df > 0.6) > 0]
# Jux Gyno
#1 0.67 0.89
#2 0.11 0.65
#3 0.60 0.67
#4 0.09 0.01
Or with dplyr, using select_if:
library(dplyr)
df %>% select_if(~any(. > 0.6))
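To try the base R approach end to end, here is a minimal sketch on made-up data. The Jux and Gyno values are copied from the output above; the Low column is a hypothetical addition so that there is something to drop.

```r
# Hypothetical data: Jux and Gyno match the output above, Low is invented
df <- data.frame(
  Jux  = c(0.67, 0.11, 0.60, 0.09),
  Gyno = c(0.89, 0.65, 0.67, 0.01),
  Low  = c(0.10, 0.20, 0.30, 0.40)
)

# df > 0.6 is a logical matrix; colSums counts the TRUEs per column
keep <- colSums(df > 0.6) > 0

# Single-bracket indexing with a logical vector selects columns
res <- df[keep]
names(res)
```

If the data may contain NA, colSums(df > 0.6, na.rm = TRUE) keeps a missing value from turning the whole column's count into NA.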
R: Select only Rows where value greater than a certain value and Mapped to another column where value is Yes or No
Turns out it was pretty easy.
x = df[df$Answer == "Yes", ]
x = x[x$Age >= 40, ]
x$Age
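As a sketch, both conditions can also be combined in one step. This assumes a data frame with Age and Answer columns as in the question; the values are invented for illustration. The comma before the closing bracket matters: it makes the logical vector select rows rather than columns.

```r
# Hypothetical data with the two columns named in the question
df <- data.frame(
  Age    = c(25, 40, 52, 61),
  Answer = c("Yes", "Yes", "No", "Yes")
)

# Rows where both tests are TRUE; the empty slot after the comma keeps all columns
x <- df[df$Age >= 40 & df$Answer == "Yes", ]
x$Age
```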
Extract all rows with any value greater than x
You can use filter_all:
library(tidyverse)
df <- mtcars %>%
  select_if(is.numeric) %>%
  cor() %>%
  round(digits = 2) %>%
  as.data.frame() %>%
  filter_all(all_vars(abs(.) > 0.4))
Result:
mpg cyl disp hp drat wt qsec vs am gear carb
1 1.00 -0.85 -0.85 -0.78 0.68 -0.87 0.42 0.66 0.60 0.48 -0.55
2 -0.85 1.00 0.90 0.83 -0.70 0.78 -0.59 -0.81 -0.52 -0.49 0.53
To select columns where all values are greater than 0.4 in absolute value, use select_if:
df <- mtcars %>%
  select_if(is.numeric) %>%
  cor() %>%
  round(digits = 2) %>%
  as.data.frame() %>%
  # funs() is deprecated in current dplyr; a formula does the same job
  select_if(~ all(abs(.) > 0.4))
Result:
mpg cyl
mpg 1.00 -0.85
cyl -0.85 1.00
disp -0.85 0.90
hp -0.78 0.83
drat 0.68 -0.70
wt -0.87 0.78
qsec 0.42 -0.59
vs 0.66 -0.81
am 0.60 -0.52
gear 0.48 -0.49
carb -0.55 0.53
Note: If you want rows or columns with any value greater than 0.4, just switch out all_vars or all with any_vars or any, respectively:
filter_all(any_vars(abs(.) > 0.4))
select_if(~ any(abs(.) > 0.4))
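For comparison, the same filters can be sketched in base R with rowSums and colSums on a logical matrix. This is an alternative to the dplyr calls above, not the original answer's method; the names cors and hit are introduced here for illustration.

```r
# Correlation matrix of the built-in mtcars data, as in the example above
cors <- as.data.frame(round(cor(mtcars), digits = 2))

# Logical matrix: TRUE where the absolute correlation exceeds 0.4
hit <- abs(cors) > 0.4

# Rows / columns where every value passes the test
rows_all <- cors[rowSums(hit) == ncol(cors), ]
cols_all <- cors[, colSums(hit) == nrow(cors), drop = FALSE]

# Rows where at least one value passes the test
rows_any <- cors[rowSums(hit) > 0, ]

rownames(rows_all)
names(cols_all)
```

Because the diagonal of a correlation matrix is 1, the "any" variant keeps every row here; it becomes more useful on data without that property.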
Getting rows whose value are greater than the group mean
You can just group and then filter:
mydf %>%
  group_by(A) %>%
  filter(B > mean(B, na.rm = TRUE)) %>%
  ungroup()
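If you would rather stay in base R, a rough equivalent uses ave to compute the per-group mean alongside each row. The toy mydf below is invented; only the column names A and B come from the answer above.

```r
# Hypothetical data: A is the grouping column, B the value column
mydf <- data.frame(
  A = c("a", "a", "a", "b", "b"),
  B = c(1, 2, 6, 10, 20)
)

# ave() returns a vector as long as B, holding each row's group mean
grp_mean <- ave(mydf$B, mydf$A, FUN = function(v) mean(v, na.rm = TRUE))

# Keep rows whose value exceeds the mean of their own group
above <- mydf[mydf$B > grp_mean, ]
above$B
```

One difference to keep in mind: dplyr's filter drops rows where the condition is NA, while base bracket indexing returns an all-NA row for them, so wrap the condition in which() if B can be missing.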
How to compare variables to a number in a column (dataframe) in R
You shouldn't need to use a loop. You can subset a data frame directly using a logical test on the specified column:
verbs[verbs$LengthOfTheme > 1.609438,]
Select rows of a matrix that meet a condition
This is easier to do if you convert your matrix to a data frame using as.data.frame(). In that case the previous answers (using subset or m$three) will work; on a plain matrix they will not.
To perform the operation on a matrix, you can define a column by name:
m[m[, "three"] == 11,]
Or by number:
m[m[,3] == 11,]
Note that if only one row matches, the result is an integer vector, not a matrix.
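A small sketch of that last point, using an invented integer matrix with the same column names: adding drop = FALSE keeps the result a one-row matrix instead of letting it collapse to a vector.

```r
# Hypothetical 5 x 4 integer matrix; column "three" holds 11..15
m <- matrix(1:20, nrow = 5,
            dimnames = list(NULL, c("one", "two", "three", "four")))

# One matching row: the result is simplified to a named integer vector
v <- m[m[, "three"] == 11, ]
is.matrix(v)

# drop = FALSE preserves the matrix structure (1 row, 4 columns)
k <- m[m[, "three"] == 11, , drop = FALSE]
dim(k)
```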
Extract rows for the first occurrence of a variable in a data frame
t.first <- species[match(unique(species$Taxa), species$Taxa),]
should give you what you're looking for. match returns the indices of the first match in the compared vectors, which give you the rows you need.
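To see this on a concrete case, here is a minimal sketch with an invented species data frame (the Count column and all values are hypothetical). It also shows duplicated, a common base R alternative that should select the same rows.

```r
# Hypothetical data: Taxa repeats, Count is an invented payload column
species <- data.frame(
  Taxa  = c("A", "B", "A", "C", "B"),
  Count = c(5, 3, 8, 1, 9)
)

# match() gives the position of the first occurrence of each unique Taxa
idx <- match(unique(species$Taxa), species$Taxa)
t.first <- species[idx, ]

# Alternative: keep rows whose Taxa has not been seen before
alt <- species[!duplicated(species$Taxa), ]

t.first$Count
```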