Conditionally Remove Rows from a Database Using R

Remove rows conditionally from a data.table in R

In this scenario, it is not much different from a data.frame:

data <- data[ menuitem != 'coffee' | amount > 0] 
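
To see which rows that condition keeps, here is a minimal sketch on made-up data (the values below are assumptions for illustration). By De Morgan's law, keeping rows where the item is not coffee or the amount is positive is the same as dropping rows that are coffee with an amount of zero or less.

```r
library(data.table)

# Hypothetical sample data; the values are invented for illustration
data <- data.table(
  menuitem = c("coffee", "coffee", "tea", "tea"),
  amount   = c(0, 3, 0, 5)
)

# Keep a row if it is not coffee, or if its amount is positive
kept <- data[menuitem != "coffee" | amount > 0]

# Equivalent formulation: drop rows that are coffee with a non-positive amount
kept_alt <- data[!(menuitem == "coffee" & amount <= 0)]

print(kept)
```

Only the first row (coffee with amount 0) is removed here; tea rows survive regardless of amount.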

Deleting or adding rows by reference is yet to be implemented. You can find more info in this question.

Regarding speed:

1. You can benefit from keys by doing something like:

setkey(data, menuitem)
data <- data[!"coffee"]

which will be faster than data <- data[menuitem != 'coffee']. However, to apply the same filters you asked about in the question, you'll need a rolling join (I've finished my lunch break; I can add something later :-)).
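
As a rough illustration of the keyed approach, the sketch below checks on invented data that the keyed not-join and the plain vector scan keep the same rows. The sample values and the names `by_scan` and `by_key` are assumptions, not part of the original question.

```r
library(data.table)

# Hypothetical sample data
data <- data.table(
  menuitem = c("tea", "coffee", "juice", "coffee"),
  amount   = c(1, 2, 3, 4)
)

# Vector scan: compares every value of menuitem
by_scan <- data[menuitem != "coffee"]

# Keyed not-join: setkey() sorts by menuitem, then !"coffee" excludes the matches
setkey(data, menuitem)
by_key <- data[!"coffee"]

# Both approaches keep the same rows (row order may differ because of the key)
print(by_key)
```

Note that setkey() reorders the table by reference, so the keyed result comes back sorted by menuitem.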

2. Even without a key, data.table is much faster for relatively big tables (the speed is similar for a handful of rows):

library(data.table)
library(microbenchmark)
dt <- data.table(id = sample(letters, 1000000, TRUE), var = rnorm(1000000))
df <- data.frame(id = sample(letters, 1000000, TRUE), var = rnorm(1000000))
microbenchmark(dt[id == "a"], df[df$id == "a", ])
Unit: milliseconds
               expr       min        lq    median        uq       max neval
      dt[id == "a"]  24.42193  25.74296  26.00996  26.35778  27.36355   100
 df[df$id == "a", ] 138.17500 146.46729 147.38646 149.06766 154.10051   100

Delete row from data.frame based on condition

Thanks @Simon for the suggestions. One criterion I wanted was that the code made sense as I "read" it. As I thought more, another criterion was that I wanted to be deliberate about what changes to make. So I incorporated Simon's recommendation to make a separate column and then use dplyr::filter() to exclude the marked rows. Here's what an example segment of code looked like:

#Change pre/post entries
data[data$UserID == 52118254, "Prepost"][2] <- 2

#Mark rows to delete
data$toDelete <- NA #Makes new empty column for marking deletions
data[data$UserID == 52118284,][2, "toDelete"] <- 1 #Marks row for deletion

#Filter to exclude rows (needs library(dplyr)); assign the result to keep it
data <- data %>% filter(is.na(toDelete))
#Optionally add "%>% select(-toDelete)" to remove the extra column

In my context, advantages here are that everything is deliberate rather than automatic and changes are anchored to data rather than row numbers that might change. I'd still welcome any feedback or other ways of achieving this (maybe in a single step).
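
One possible single-step variant, sketched in base R: build a logical mask that picks out the second row for a given UserID and drop it directly, without a helper column. The sample data and the names `is_user`, `drop` and `cleaned` are assumptions; only `UserID` and `Prepost` come from the code above.

```r
# Hypothetical sample data; only the column names UserID and Prepost come from the text
data <- data.frame(
  UserID  = c(52118284, 11111111, 52118284, 52118284),
  Prepost = c(1, 1, 1, 2)
)

# TRUE for every row belonging to the user of interest
is_user <- data$UserID == 52118284

# cumsum() counts the user's rows in order, so == 2 flags the second occurrence
drop <- is_user & cumsum(is_user) == 2

# Keep everything except the flagged row
cleaned <- data[!drop, ]
print(cleaned)
```

The deletion is still anchored to the data (user ID plus occurrence) rather than to an absolute row number.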

Remove Rows From Data Frame where a Row matches a String

Just use the == with the negation symbol (!). If dtfm is the name of your data.frame:

dtfm[!dtfm$C == "Foo", ]

Or, to move the negation in the comparison:

dtfm[dtfm$C != "Foo", ]

Or, even shorter using subset():

subset(dtfm, C!="Foo")
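
These forms behave differently when column C contains missing values, which may matter in practice. The sketch below uses an invented data frame (the `id` column and the values are assumptions) to show the difference, plus an `%in%`-based variant that keeps rows with a missing C.

```r
# Hypothetical data frame with a missing value in column C
dtfm <- data.frame(id = 1:4, C = c("Foo", "Bar", NA, "Baz"))

# Logical indexing: the NA comparison yields NA, which produces an all-NA row
by_index <- dtfm[dtfm$C != "Foo", ]

# subset() treats NA in the condition as FALSE, so the NA row is dropped
by_subset <- subset(dtfm, C != "Foo")

# %in% never returns NA, so the row with missing C is kept intact
by_in <- dtfm[!(dtfm$C %in% "Foo"), ]

print(nrow(by_index))   # 3 rows, one of them entirely NA
print(nrow(by_subset))  # 2 rows
print(nrow(by_in))      # 3 rows, with id 3 preserved
```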

Removing rows from a data frame until a condition is met

Your while-loop doesn't redefine block_2_df, so the condition never changes. This should work:

while (dim(block_2_df)[1] > 1) {
  block_2_df <- remove_fun(block_2_df)
}
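
For a runnable picture of the fix, here is a sketch with a stand-in `remove_fun`. The original function is not shown, so the version below (which simply drops the first row) and the sample data are purely hypothetical.

```r
# Hypothetical stand-in for remove_fun: drops the first row of a data frame
remove_fun <- function(df) {
  df[-1, , drop = FALSE]
}

# Hypothetical sample data
block_2_df <- data.frame(x = 1:5)

# Reassign inside the loop so the condition sees the shrinking data frame
while (nrow(block_2_df) > 1) {
  block_2_df <- remove_fun(block_2_df)
}

print(block_2_df)
```

Without the reassignment, remove_fun() would return a smaller copy that is discarded, and the loop would never terminate.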

Delete rows that exist in another data frame?

You need the %in% operator. So,

df1[!(df1$name %in% df2$name),]

should give you what you want.

  • df1$name %in% df2$name tests whether the values in df1$name are in df2$name
  • The ! operator reverses the result.
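
The two steps above can be sketched on small invented data frames (the `score` column and all values are assumptions):

```r
# Hypothetical sample data
df1 <- data.frame(name = c("ann", "bob", "cat", "dan"), score = c(1, 2, 3, 4))
df2 <- data.frame(name = c("bob", "dan"))

# TRUE where a name in df1 also appears in df2
in_df2 <- df1$name %in% df2$name

# Negate the mask to keep only the rows whose name is absent from df2
result <- df1[!in_df2, ]
print(result)
```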

R: Deleting rows based on a value in a column from a large data set in R

I suggest you learn how to use dplyr, and other packages in the tidyverse. I find them to be an indispensable tool in cleaning data.

Here's how I would use dplyr to filter out both Texas and New York in your data set:

library(dplyr)
customers = filter(customers, State != "TX" & State != "NY")

Alternatively,

customers = filter(customers, !(State %in% c("TX", "NY")))
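
The two forms agree when State has no missing values, but they can differ when it does. The sketch below, on invented data (the `id` column and values are assumptions), shows how each one treats a customer whose State is NA.

```r
library(dplyr)

# Hypothetical sample data, including one customer with a missing State
customers <- data.frame(
  id    = 1:5,
  State = c("TX", "CA", "NY", NA, "WA")
)

# NA != "TX" is NA, and filter() drops rows where the condition is NA
with_neq <- filter(customers, State != "TX" & State != "NY")

# NA %in% c("TX", "NY") is FALSE, so the negation is TRUE and the row is kept
with_in <- filter(customers, !(State %in% c("TX", "NY")))

print(with_neq$id)  # 2 5
print(with_in$id)   # 2 4 5
```

Which behaviour is preferable depends on whether rows with an unknown State should survive the filter.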

How to remove row if it has a NA value in one certain column

The easiest solution is to use is.na():

df[!is.na(df$B), ]

which gives you:

   A B  C
1 NA 2 NA
2  1 2  3
4  1 2  3
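
The input data frame is not shown, so the sketch below assumes one that is consistent with the output above (row 3 has an NA in B). It also contrasts the column-specific filter with na.omit(), which drops a row if any column is missing.

```r
# Assumed input, chosen so that the result matches the output shown above
df <- data.frame(
  A = c(NA, 1, 1, 1),
  B = c(2, 2, NA, 2),
  C = c(NA, 3, 3, 3)
)

# Keep rows where B is present; NAs in other columns do not matter
kept <- df[!is.na(df$B), ]
print(kept)

# For comparison: na.omit() drops a row if ANY column has an NA
strict <- na.omit(df)
print(strict)
```

Row 1 survives the B-only filter despite its NAs in A and C, but na.omit() removes it.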

R- Remove several rows based on a value

Split the data by year and, within each group, keep only the columns that contain no NA values:

by(df, df$Year, function(x) x[!colSums(is.na(x))])
df$Year: 1980
   Year Month stn1
1  1980     1    8
2  1980     2    4
3  1980     3    6
4  1980     4    3
5  1980     5    0
6  1980     6    1
7  1980     7    3
8  1980     8    6
9  1980     9    1
10 1980    10    2
11 1980    11    1
12 1980    12    4
------------------------------------------------------------------
df$Year: 1981
   Year Month stn2
13 1981     1    4
14 1981     2    7
15 1981     3    9
16 1981     4    1
17 1981     5    2
18 1981     6    6
19 1981     7    9
20 1981     8    8
21 1981     9    5
22 1981    10    1
23 1981    11    3
24 1981    12    2
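
The input is not shown, so here is a minimal sketch on a smaller, assumed data frame of the same shape: each station column has values for one year and NA for the other. It illustrates how `!colSums(is.na(x))` selects only the NA-free columns within each year.

```r
# Hypothetical data in the same shape as the output above: each station
# column holds values for one year and NA for the other
df <- data.frame(
  Year  = c(1980, 1980, 1981, 1981),
  Month = c(1, 2, 1, 2),
  stn1  = c(8, 4, NA, NA),
  stn2  = c(NA, NA, 4, 7)
)

# Within each year, colSums(is.na(x)) counts NAs per column;
# negating it gives TRUE only for columns with zero NAs
res <- by(df, df$Year, function(x) x[!colSums(is.na(x))])

print(names(res[["1980"]]))  # "Year" "Month" "stn1"
print(names(res[["1981"]]))  # "Year" "Month" "stn2"
```

Note that this drops columns per group rather than rows; the rows disappear only in the sense that each group no longer carries the other station's NA values.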

