Removing one table from another in R
We can use anti_join from dplyr, which returns the rows of A that have no match in B on the columns listed in by:
library(dplyr)
anti_join(A, B, by = c('Col1', 'Col2'))
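As a minimal sketch of what that call returns, assuming two small made-up tables (only the names A, B, Col1 and Col2 come from the answer; the values and the Val column are hypothetical):

```r
library(dplyr)

# Hypothetical example tables; only the column names come from the answer.
A <- data.frame(Col1 = c(1, 2, 3, 4),
                Col2 = c("a", "b", "c", "d"),
                Val  = c(10, 20, 30, 40),
                stringsAsFactors = FALSE)
B <- data.frame(Col1 = c(2, 4),
                Col2 = c("b", "x"),
                stringsAsFactors = FALSE)

# Keep the rows of A that have no (Col1, Col2) match in B.
# Row 2 matches on both columns and is dropped;
# row 4 matches on Col1 only, so it stays.
res <- anti_join(A, B, by = c("Col1", "Col2"))
print(res)
```

Note that a row is removed only when every column named in by matches.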
How to delete rows in one table, based on the values of another table
A possible approach with base R:
tab1[tab1$UserID %in% tab2$UserID[!tab2$Admin],]
which gives:
  UserID AssigID Score           TimeStamp TimeOnTask
2  12254   23956    22 2017-11-18 13:16:00        256
3  12644   23956    74 2012-12-17 13:18:00        365
4  11257   23957    45 2012-10-10 13:29:00        102
5  12667   23958    25 2012-11-10 13:40:00        109
What this does:
- tab2$UserID[!tab2$Admin] gives a vector of the user IDs that are not an Admin. The !tab2$Admin part makes sure that only the IDs that are not an Admin are selected.
- With tab1$UserID %in% ... you select only the user IDs from tab1 that are in the vector from the first step. This returns a logical vector, which you then use to subset tab1.
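The two steps can also be run separately to inspect the intermediate values. This is only a sketch on hypothetical miniature versions of tab1 and tab2; the helper names non_admin_ids and keep are made up for illustration:

```r
# Hypothetical miniature versions of tab1 and tab2.
tab1 <- data.frame(UserID = c(1L, 2L, 3L), Score = c(52L, 22L, 74L))
tab2 <- data.frame(UserID = c(1L, 2L, 3L), Admin = c(TRUE, FALSE, FALSE))

# Step 1: IDs of the users that are not admins.
non_admin_ids <- tab2$UserID[!tab2$Admin]

# Step 2: logical vector marking the rows of tab1 whose UserID is in that set.
keep <- tab1$UserID %in% non_admin_ids

# Step 3: subset tab1 with the logical vector.
result <- tab1[keep, ]
print(result)
```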
Used data:
tab1 <- structure(list(UserID = c(14532L, 12254L, 12644L, 11257L, 12667L),
AssigID = c(23956L, 23956L, 23956L, 23957L, 23958L),
Score = c(52L, 22L, 74L, 45L, 25L),
TimeStamp = structure(c(1510402260, 1511007360, 1355746680, 1349868540, 1352551200), class = c("POSIXct", "POSIXt"), tzone = ""),
TimeOnTask = c(401L, 256L, 365L, 102L, 109L)),
.Names = c("UserID", "AssigID", "Score", "TimeStamp", "TimeOnTask"), row.names = c(NA, -5L), class = "data.frame")
tab2 <- structure(list(UserID = c(14532L, 12254L, 12644L, 11257L, 12667L),
Admin = c(TRUE, FALSE, FALSE, FALSE, FALSE)),
.Names = c("UserID", "Admin"), class = "data.frame", row.names = c(NA, -5L))
R: How to remove values from a table that appear in another table?
There's a bunch of ways to do this.
Base R subset solution (as noted by Balter above):
M4M3.new <- M4M3[!(M4M3$gene_id %in% M4F4$gene_id),]
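Set difference solution (this compares whole rows rather than just gene_id; dplyr::setdiff is used here because it operates row-wise on data frames):
M4M3.new <- dplyr::setdiff(M4M3, M4F4)
dplyr solution: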
M4M3.new <- dplyr::anti_join(M4M3,
M4F4,
by = c("gene_id" = "gene_id"))
Edit: all of these appeared to work when tested on the following dataset:
tst1 <- data.frame(gene_id = 1:10,
sample_1 = rep("M4", 10),
sample_2 = c(rep("M3", 6), rep("F4", 4)),
other_values = sample(1:10, 10, replace = TRUE),
other_values2 = rep("OK", 10))
M4M3 <- tst1[tst1$sample_1 == "M4" & tst1$sample_2 == "M3",]
M4F4 <- tst1[tst1$sample_1 == "M4" & tst1$sample_2 == "F4",]
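In the dataset above the two subsets share no gene_id (M4M3 holds IDs 1 to 6, M4F4 holds 7 to 10), so the filter has nothing to remove. A small sketch with overlapping IDs, using hypothetical values, makes the effect of the %in% approach visible:

```r
# Hypothetical tables whose gene_id values overlap.
M4M3 <- data.frame(gene_id = 1:6, sample_2 = "M3")
M4F4 <- data.frame(gene_id = c(2L, 5L, 9L), sample_2 = "F4")

# Drop every row of M4M3 whose gene_id also appears in M4F4.
M4M3.new <- M4M3[!(M4M3$gene_id %in% M4F4$gene_id), ]
print(M4M3.new$gene_id)
```

IDs 2 and 5 are removed; ID 9 exists only in M4F4 and has no effect.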
Remove rows in data.table according to another data.table
Use an anti-join:
dtA[!dtB, on=.(date, company, value)]
This returns all records in dtA that are not found in dtB, matching on the columns given in on.
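A minimal sketch of this anti-join, assuming small made-up tables (the column names come from the call above; the values are hypothetical):

```r
library(data.table)

# Hypothetical example tables.
dtA <- data.table(date    = as.Date(c("2020-01-01", "2020-01-02", "2020-01-03")),
                  company = c("a", "b", "c"),
                  value   = c(1, 2, 3))
dtB <- data.table(date    = as.Date("2020-01-02"),
                  company = "b",
                  value   = 2)

# Rows of dtA with no match in dtB on all three columns.
res <- dtA[!dtB, on = .(date, company, value)]
print(res)
```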
Delete rows that exist in another data frame?
You need the %in% operator. So,
df1[!(df1$name %in% df2$name),]
should give you what you want.
- df1$name %in% df2$name tests whether the values in df1$name are in df2$name.
- The ! operator reverses the result.
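A short sketch of both steps on hypothetical data (the names and the x column are made up; in_df2 is an illustrative helper variable):

```r
# Hypothetical example data frames.
df1 <- data.frame(name = c("ann", "bob", "cy"), x = 1:3,
                  stringsAsFactors = FALSE)
df2 <- data.frame(name = c("bob", "dee"), stringsAsFactors = FALSE)

# TRUE where a name from df1 also occurs in df2.
in_df2 <- df1$name %in% df2$name

# Negate to keep only the rows that do not occur in df2.
result <- df1[!in_df2, ]
print(result)
```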
Removing specific groups in R in data.table
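Just use your group_vector with the %in% operator. As written, this keeps the rows whose group is in group_vector; wrap the condition in !(...) to drop those groups instead.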
data[group %in% group_vector]
   group values
1:  1001     10
2:  2800     23
3:  3230     32
4:  4600     34
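For comparison, a sketch of keeping versus removing the listed groups. The table below is hypothetical: it reuses the four groups from the output and adds one made-up row (group 5000) so that the difference is visible:

```r
library(data.table)

# Hypothetical table: the four groups shown above plus one extra made-up row.
data <- data.table(group  = c(1001, 2800, 3230, 4600, 5000),
                   values = c(10, 23, 32, 34, 50))
group_vector <- c(1001, 2800, 3230, 4600)

# Keep only the groups listed in group_vector.
kept <- data[group %in% group_vector]

# Remove the groups listed in group_vector instead.
removed <- data[!(group %in% group_vector)]
print(removed)
```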
Removing data from one dataframe that exists in another dataframe R
Base R Solution
list_one[!list_one$letters %in% list_two$letters2,]
gives you:
  letters numbers
2       b       2
5       e       5
Explanation:
> list_one$letters %in% list_two$letters2
[1] TRUE FALSE TRUE TRUE FALSE
This gives you a logical vector of the same length as list_one$letters, with TRUE where the value is present in list_two$letters2. ! negates this vector, so you end up with TRUE only where the value is absent from list_two$letters2, and those are the rows that are kept.
If you have questions about how to select rows from a data.frame, enter
?`[.data.frame`
in the console and read the help page.
Remove rows in one dataframe if they are present in another dataframe
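In base R (using %in% rather than -match(), because match() finds only the first occurrence of each ID and returns NA for IDs that are missing from df):
df[!df$ASV %in% df2$ASV,]
or, with dplyr, joining explicitly on ASV:
dplyr::anti_join(df, df2, by = "ASV")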
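One case where negative indexing with match() and filtering with %in% differ is duplicated IDs. The sketch below uses hypothetical data in which "b" occurs twice in df:

```r
# Hypothetical data: the ID "b" occurs twice in df.
df  <- data.frame(ASV = c("a", "b", "b", "c"), n = 1:4,
                  stringsAsFactors = FALSE)
df2 <- data.frame(ASV = "b", stringsAsFactors = FALSE)

# match() returns only the first position of each ID,
# so one "b" row survives the negative index.
by_match <- df[-match(df2$ASV, df$ASV), ]

# %in% flags every matching row, so both "b" rows are removed.
by_in <- df[!df$ASV %in% df2$ASV, ]

print(by_match$ASV)
print(by_in$ASV)
```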