Select rows from a data frame based on values in a vector
Have a look at ?"%in%".
dt[dt$fct %in% vc,]
fct X
1 a 2
3 c 3
5 c 5
7 a 7
9 c 9
10 a 1
12 c 2
14 c 4
You could also use is.element (see ?is.element):
dt[is.element(dt$fct, vc),]
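As a self-contained illustration of the two calls above, here is a minimal sketch. The contents of `dt` and `vc` are made up for the example (the original data is not shown in the answer), so treat the values as assumptions.

```r
# Hypothetical data: `dt` and `vc` stand in for the objects used above.
dt <- data.frame(fct = c("a", "b", "c", "d", "a", "c"), X = 1:6)
vc <- c("a", "c")

# Logical vector: TRUE where dt$fct is one of the values in vc
keep <- dt$fct %in% vc

# Row subsetting; note the comma after the row index
by_in <- dt[keep, ]

# is.element(x, table) gives the same logical vector as x %in% table
by_is_element <- dt[is.element(dt$fct, vc), ]

# Both approaches select the same rows
identical(by_in, by_is_element)
```

Which of the two you use is mostly a matter of taste; `%in%` tends to read more naturally inside a subsetting expression.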
Subset rows in a data frame based on a vector of values
This will give you what you want:
eg2011cleaned <- eg2011[!eg2011$ID %in% bg2011missingFromBeg, ]
The error in your second attempt is because you forgot the comma.
For a data frame, object[index] subsets columns. If you want to subset rows and keep all columns, you have to use object[index_rows, index_columns], where index_columns can be left blank, which selects all columns by default.
However, you still need to include the , to indicate that you want a subset of rows instead of a subset of columns.
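The difference the comma makes can be seen in a small sketch. The data frame `df` and the vector `drop_ids` below are hypothetical and only serve to illustrate the indexing rules described above.

```r
# Hypothetical data frame with three rows and three columns
df <- data.frame(ID = c(101, 102, 103),
                 a  = c("x", "y", "z"),
                 b  = c(TRUE, FALSE, TRUE))
drop_ids <- c(102)

# Logical index: TRUE for rows whose ID is NOT in drop_ids
keep <- !(df$ID %in% drop_ids)   # TRUE FALSE TRUE

rows <- df[keep, ]   # with the comma: rows 1 and 3, all columns
cols <- df[keep]     # without the comma: columns 1 and 3 (ID and b), all rows

dim(rows)
dim(cols)
```

Here the logical index happens to be as long as the number of columns, so the call without the comma runs but returns the wrong thing. When the index is longer than the number of columns, R typically stops with an "undefined columns selected" error instead.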
Select rows of data frame based on a vector with duplicated values
Another method of doing the same without a loop:
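sample_df <- data.frame(x = 1:6, y = c(1, 1, 2, 2, 3, 3))
# Row numbers grouped by the value of y
row_names <- split(1:nrow(sample_df), sample_df$y)
select_y <- c(1, 3, 3)
# Look up each requested value by name; duplicated values are repeated
row_num <- unlist(row_names[as.character(select_y)])
ans <- sample_df[row_num, ]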
How to select rows that match the values in a vector
Another solution is to use match:
Data:
set.seed(123)
all <- data.frame(match = sample(LETTERS, 10),
otherStuff = rnorm(10))
index <- data.frame(match = sample(LETTERS, 10),
moreStuff = rnorm(10))
Solution:
all[match(index$match, all$match, nomatch = 0),]
match otherStuff
10 Z -0.5558411
5 W -0.4456620
8 Q 0.4007715
6 A 1.2240818
7 K 0.3598138
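To see how this differs from %in%, here is a smaller sketch with made-up data (`all_df` and `wanted` are hypothetical names, not the objects from the answer). It shows that match returns rows in the order of the lookup vector and that nomatch = 0 drops values without a match.

```r
# Hypothetical data, smaller than the example above
all_df <- data.frame(key = c("A", "B", "C", "D"), value = 1:4)
wanted <- c("D", "X", "B")   # "X" has no match in all_df$key

# match() returns, for each element of `wanted`, its position in all_df$key;
# nomatch = 0 turns misses into 0, which row indexing silently drops
pos <- match(wanted, all_df$key, nomatch = 0)

by_match <- all_df[pos, ]                     # rows in the order of `wanted`
by_in    <- all_df[all_df$key %in% wanted, ]  # rows in their original order

by_match$key
by_in$key
```

So match is the better fit when the order of the lookup vector matters, while %in% keeps the row order of the data frame.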
Select rows based on condition and set values from a vector
You can pass a list of columns to set:
df.loc[df['one']=='b', ['two', 'three']] = vector[df['one']=='b']
print(df)
one two three
0 a 1 1
1 a 1 1
2 b 4 4
Or, if you need a more dynamic solution, select all numeric columns:
df.loc[df['one']=='b', df.select_dtypes(np.number).columns] = vector[df['one']=='b']
Or compare only once and assign the mask to a variable:
m = df['one']=='b'
df.loc[m, df.select_dtypes(np.number).columns] = vector[m]
R select rows in dataframe by external vector as index
It is easier to filter by the gene names if you keep them as a column instead of making them rownames.
The following changes to your code will get you the result you are looking for.
library(tidyverse)
df <-data.frame("Names" = c("TIGIT", "ABCB1", "CD8B", "CD8A", "CD1C", "F2RL1", "LCP1", "LAG3", "ABL1", "CD2", "IL12A", "PSEN2", "CD3G", "CD28", "PSEN1", "ITGA1"),"1S" = c("5", "6", "8", "99", "5", "0", "1", "3", "15", "15", "34", "62", "54", "6", "8", "9"), "1T" = c("6", "4", "6", "9", "5", "11", "33", "7", "8", "24", "34", "62", "66", "4", "78", "44"))
genes_to_select <- c("TIGIT", "CD8B", "CD8A", "CD1C", "F2RL1", "LCP1", "LAG3", "CD2", "PSEN2", "CD3G", "CD28", "PSEN1") # genes I want to select
df <-
df %>%
filter(Names %in% genes_to_select) %>%
column_to_rownames("Names") %>%
mutate(across(everything(), as.numeric)) %>%
as.matrix()
df
#> X1S X1T
#> [1,] 5 6
#> [2,] 8 6
#> [3,] 99 9
#> [4,] 5 5
#> [5,] 0 11
#> [6,] 1 33
#> [7,] 3 7
#> [8,] 15 24
#> [9,] 62 62
#> [10,] 54 66
#> [11,] 6 4
#> [12,] 8 78
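If you would rather not load the tidyverse, the same idea (keep the names as a column, filter with %in%, then build the matrix) can be sketched in base R. The shortened data and the column names `S1` and `T1` below are assumptions chosen for the illustration, not the original data.

```r
# Hypothetical, shortened version of the data above
df <- data.frame(Names = c("TIGIT", "ABCB1", "CD8B"),
                 S1 = c("5", "6", "8"),
                 T1 = c("6", "4", "6"))
genes_to_select <- c("TIGIT", "CD8B")

# Filter rows while the gene names are still an ordinary column
kept <- df[df$Names %in% genes_to_select, ]

# Convert the character columns to numeric; sapply() returns a matrix here
mat <- sapply(kept[c("S1", "T1")], as.numeric)

# Attach the gene names as row names only at the end
rownames(mat) <- kept$Names
mat
```

Setting the row names last keeps the gene names attached to the numeric matrix.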
Select rows from Data Frame based on listed values in R Programming
Does the following work for you?
subset(WC_Grounds, WC_Grounds$Country=="England" | WC_Grounds$Ground %in% WC_Grounds_List)
The || and && operators are “short-circuiting”: as soon as || sees a TRUE on its left-hand side it returns TRUE without evaluating anything else, and they work on a single logical value rather than element by element. You should instead use |, which is vectorized and therefore applies to every value in your dataset.
Here is an example using the abbreviated sample data I added to your question:
WC_Grounds <- data.frame(
Ground = c('Hambantota', 'Benoni', 'Benoni', 'Hambantota', 'Hambantota',
'Pallekele', 'Pallekele'),
Country = c('Bangladesh', 'SouthAfrica', 'Pakistan', 'SriLanka', 'Bangladesh',
'SriLanka', 'Bangladesh')
)
List = c('Hambantota', 'Benoni')
subset(WC_Grounds, WC_Grounds$Country == "SriLanka" | WC_Grounds$Ground %in% List)
#> Ground Country
#> 1 Hambantota Bangladesh
#> 2 Benoni SouthAfrica
#> 3 Benoni Pakistan
#> 4 Hambantota SriLanka
#> 5 Hambantota Bangladesh
#> 6 Pallekele SriLanka
Created on 2021-03-20 by the reprex package (v1.0.0)
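The difference between | and || described above can also be shown in isolation. This is a minimal sketch with made-up vectors (`country`, `ground` and `grounds_list` are hypothetical stand-ins for the columns and list in the question).

```r
country <- c("SriLanka", "Pakistan", "Bangladesh")
ground  <- c("Pallekele", "Benoni", "Pallekele")
grounds_list <- c("Hambantota", "Benoni")

# `|` compares element by element and returns one value per row
elementwise <- country == "SriLanka" | ground %in% grounds_list

# `||` works on single values and stops as soon as the result is known:
# the right-hand side is never evaluated here, so stop() does not fire
short_circuit <- TRUE || stop("never evaluated")

elementwise
short_circuit
```

Inside subset() or [ , ] you want one TRUE/FALSE per row, which is why the vectorized | is the right operator there.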