Remove rows conditionally from a data.table in R
In this scenario, data.table is not much different from data.frame:
data <- data[ menuitem != 'coffee' | amount > 0]
Deleting or adding rows by reference is yet to be implemented. You can find more information in this question.
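As a minimal sketch of how that filter behaves, the example below runs it on a small made-up table. The sample values are assumptions for illustration only; the column names follow the snippet above. It also shows the equivalent form that negates the condition describing the rows to drop.

```r
library(data.table)

# Hypothetical order data (values invented for illustration).
data <- data.table(
  menuitem = c("coffee", "coffee", "tea", "tea"),
  amount   = c(0, 2, 0, 3)
)

# Keep a row if it is not coffee, or if its amount is positive.
kept <- data[menuitem != "coffee" | amount > 0]

# Equivalent form (De Morgan): drop rows that are coffee AND have amount <= 0.
kept2 <- data[!(menuitem == "coffee" & amount <= 0)]

print(kept)
```

Only the coffee row with a zero amount is dropped. Since the result is assigned to a new object, the original table is left unchanged.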
Regarding speed:
1. You can benefit from keys by doing something like:
setkey(data, menuitem)
data <- data[!"coffee"]
which will be faster than data <- data[menuitem != 'coffee']. However, to apply the same filters you asked about in the question, you'll need a rolling join (my lunch break is over; I can add something later :-)).
2. Even without a key, data.table is much faster on relatively big tables (the speed is similar for a handful of rows):
library(data.table)
library(microbenchmark)
dt <- data.table(id = sample(letters, 1000000, TRUE), var = rnorm(1000000))
df <- data.frame(id = sample(letters, 1000000, TRUE), var = rnorm(1000000))
microbenchmark(dt[id == "a"], df[df$id == "a", ])
Unit: milliseconds
               expr       min        lq    median        uq       max neval
      dt[id == "a"]  24.42193  25.74296  26.00996  26.35778  27.36355   100
 df[df$id == "a", ] 138.17500 146.46729 147.38646 149.06766 154.10051   100
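To make the keyed approach from point 1 concrete, here is a small sketch comparing the plain vector scan with the keyed not-join. The table contents are made up for illustration; only setkey() and the !"coffee" form come from the answer above.

```r
library(data.table)

# Hypothetical data (values invented for illustration).
data <- data.table(
  menuitem = c("tea", "coffee", "juice", "coffee"),
  amount   = c(1, 2, 3, 0)
)

# Vector scan: compares every value of menuitem against "coffee".
scan_result <- data[menuitem != "coffee"]

# Keyed not-join: setkey() sorts the table by menuitem (by reference),
# then !"coffee" drops the rows whose key matches "coffee".
setkey(data, menuitem)
key_result <- data[!"coffee"]

print(key_result)
```

Both forms keep the same rows. Note that setkey() reorders the table, so the keyed result comes back sorted by menuitem rather than in the original row order.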
DataTable, How to conditionally delete rows
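You could query the dataset and then loop over the selected rows to mark them as deleted.
// Select returns a DataRow[] snapshot, so deleting inside the loop is safe.
var rows = dt.Select("col1 > 5");
foreach (var row in rows)
{
    row.Delete();
}
// Commit the pending deletions.
dt.AcceptChanges();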
... and you could also create some extension methods to make it easier ...
myTable.Delete("col1 > 5");
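// Extension methods must be declared in a static class (the class name here is arbitrary).
public static class DataTableExtensions
{
    // Marks every row matching the filter as deleted; call AcceptChanges() to commit.
    public static DataTable Delete(this DataTable table, string filter)
    {
        table.Select(filter).Delete();
        return table;
    }

    // Marks each row in the sequence as deleted.
    public static void Delete(this IEnumerable<DataRow> rows)
    {
        foreach (var row in rows)
            row.Delete();
    }
}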
Delete rows conditionally in data table
We may use %chin% (or %in%) with negation (!):
library(data.table)
exclude <- c("A", "C", "E")
dt[!customerID %chin% exclude]
-output
customerID V1 V2
<char> <int> <char>
1: B 42 GS
2: B 43 XC
3: B 46 XZ
4: D 34 XZ
5: D 19 RF
6: F 44 ZS
7: G 23 AA
== and != are elementwise operators. They work best when the lhs and rhs have the same length, or when the rhs has length 1 (which is recycled). Otherwise the rhs is recycled along the rows and gives undesirable results: the 1st element of 'exclude' is compared to the 1st element of customerID, the 2nd to the 2nd, the 3rd to the 3rd, then the 1st element again to the 4th element of customerID, and so on.
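The recycling pitfall can be seen on a short vector. This sketch uses base R only, with made-up values; %chin% from data.table would behave like %in% here for character vectors.

```r
# Hypothetical IDs and exclusion list (values invented for illustration).
ids     <- c("A", "A", "B", "B", "C", "C")
exclude <- c("A", "C")

# Elementwise comparison: `exclude` is recycled as A, C, A, C, A, C,
# so each id is compared with only ONE element of `exclude`.
keep_wrong <- ids != exclude

# Set membership: each id is checked against ALL elements of `exclude`.
keep_right <- !(ids %in% exclude)

print(ids[keep_wrong])
print(ids[keep_right])
```

With !=, one "A" and one "C" slip through because they happen to line up with the other element of the recycled vector; with %in%, only the "B" values remain.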
data
dt <- structure(list(customerID = c("A", "A", "B", "B", "B", "C", "C",
"D", "D", "E", "E", "F", "G"), V1 = c(24L, 56L, 42L, 43L, 46L,
42L, 25L, 34L, 19L, 19L, 37L, 44L, 23L), V2 = c("RT", "ES", "GS",
"XC", "XZ", "GE", "WD", "XZ", "RF", "DW", "XS", "ZS", "AA")),
class = c("data.table",
"data.frame"), row.names = c(NA, -13L))
Remove rows from data.table that meet condition
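You can do an anti join: collect the rows flagged by condition, in both column orders, into mDT, then keep only the rows of DT that have no match in it: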
mDT = DT[(condition), !"condition"][, rbind(.SD, rev(.SD), use.names = FALSE)]
DT[!mDT, on=names(mDT)]
# col1 col2 condition
# 1: c c FALSE
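For readers who want to run the idea end to end, here is a self-contained sketch of the same anti-join pattern. The table contents are assumptions made up for illustration (the original question's data is not shown here), and the swapped-column table is built explicitly rather than with rev(.SD).

```r
library(data.table)

# Hypothetical table; column names mirror the output above.
DT <- data.table(
  col1      = c("a", "b", "c", "d"),
  col2      = c("b", "a", "c", "e"),
  condition = c(TRUE, FALSE, FALSE, FALSE)
)

# Flagged pairs, plus the same pairs with the two columns swapped.
flagged <- DT[(condition), .(col1, col2)]
mDT <- rbind(flagged, flagged[, .(col1 = col2, col2 = col1)])

# Anti-join: keep rows of DT that have no match in mDT on both columns.
res <- DT[!mDT, on = c("col1", "col2")]

print(res)
```

Here the flagged pair (a, b) and its mirror (b, a) are both removed, while the unrelated rows are kept in their original order.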
Remove Row from DataTable Depending on Condition
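Using LINQ, you can create a new DataTable containing only the rows to keep. Note that CopyToDataTable() throws an InvalidOperationException when the filtered sequence is empty:
DataTable newDataTable = dt.AsEnumerable()
    .Where(r => !ListLinkedIds.Contains(r.Field<string>("IDCOLUMN")))
    .CopyToDataTable();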