Create Columns from Column of List in Data.Table

R data.table add new column with values from other columns by referencing

With fcase:

cols <- unique(dt$Label)
dt[,newCol:=eval(parse(text=paste('fcase(',paste0("Label=='",cols,"',Col_",cols,collapse=','),')')))][]

    Label Col_A Col_B Col_C newCol
   <char> <num> <num> <num>  <num>
1:      A     2     1     2      2
2:      B     3     4     0      4
3:      C     5     3     4      4
4:      A     0     5     1      0
5:      B     2     2     5      2
6:      C     7     0     6      6
7:      A     6     7     7      6
8:      B     8     5     3      5
9:      C     9     8     0      0
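To see what the one-liner actually evaluates, it can help to build the expression text as a separate step. The following is only a sketch: the `dt` construction is reconstructed from the printed output above (it is not taken from the original question), `expr_text` is a name introduced here for illustration, and the approach assumes every value in Label has a matching Col_<Label> column.

```r
library(data.table)

# Sample data reconstructed from the printed output (an assumption)
dt <- data.table(
  Label = rep(c("A", "B", "C"), 3),
  Col_A = c(2, 3, 5, 0, 2, 7, 6, 8, 9),
  Col_B = c(1, 4, 3, 5, 2, 0, 7, 5, 8),
  Col_C = c(2, 0, 4, 1, 5, 6, 7, 3, 0)
)

cols <- unique(dt$Label)

# Build the fcase() call as text: one "condition, value" pair per label
expr_text <- paste0(
  "fcase(",
  paste0("Label == '", cols, "', Col_", cols, collapse = ", "),
  ")"
)
print(expr_text)

# Parse the text into a call and evaluate it against the columns of dt
dt[, newCol := eval(parse(text = expr_text))]
print(dt)
```

Printing `expr_text` before evaluating it makes it easier to spot a label that has no corresponding column.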

Create columns from column of list in data.table

Try this:

probs <- seq(0.1, 1, by = 0.1)  # assumed: deciles, matching the q10..q100 names below
DT2 <- DT[ , as.list(quantile(x,probs=probs)),by=y]
setnames(DT2, c("y", paste0("q", seq(10, 100, by=10))))

#    y       q10        q20        q30        q40          q50       q60       q70       q80
# 1: b -1.281704 -0.8402934 -0.5251957 -0.2595748 -0.001625739 0.2526686 0.5251940 0.8379979
# 2: c -1.269750 -0.8323597 -0.5133207 -0.2478633  0.003413041 0.2598378 0.5353759 0.8477539
# 3: a -1.281899 -0.8389189 -0.5224092 -0.2573562  0.001186281 0.2542550 0.5244238 0.8401411
#         q90     q100
# 1: 1.284773 3.856234
# 2: 1.283465 4.322815
# 3: 1.273615 3.921410
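For a version that runs on its own, here is a minimal sketch with made-up data. The contents of `DT`, the seed, and the decile `probs` are assumptions chosen to mirror the column names in the output above; the numbers it prints will differ from those shown.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only for reproducibility
# Hypothetical data: a numeric column x and a grouping column y
DT <- data.table(
  x = rnorm(300),
  y = sample(c("a", "b", "c"), 300, replace = TRUE)
)
probs <- seq(0.1, 1, by = 0.1)

# quantile() returns a named vector; as.list() turns each element
# into its own column when used in j
DT2 <- DT[, as.list(quantile(x, probs = probs)), by = y]
setnames(DT2, c("y", paste0("q", seq(10, 100, by = 10))))
print(DT2)
```

Without the `setnames()` call the columns keep the names that `quantile()` produces, such as `10%` and `20%`.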

How to do operations on list columns in an R data.table to output another list column?

Another solution using mapply:

dt[, absvals := mapply(listcol, numericcol, FUN = function(x, y) abs(x-y), SIMPLIFY = FALSE)]

#output
dt
   numericcol        listcol        absvals
1:         42        1,22, 3       41,20,39
2:         42              6             36
3:         42              1             41
4:         42             12             30
5:         42    5,   6,1123   37,  36,1081
6:         42              3             39
7:         42             42              0
8:         42              1             41
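As a runnable sketch of the same idea, the example below rebuilds `dt` from the printed output (an assumption, since the original data definition is not shown) and uses `Map()`, which always returns a list. That keeps `absvals` a list column even when every element of `listcol` happens to have the same length.

```r
library(data.table)

# Data reconstructed from the printed output (an assumption)
dt <- data.table(
  numericcol = rep(42, 8),
  listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)

# Map() pairs each list element with the numeric value from the same row
dt[, absvals := Map(function(x, y) abs(x - y), listcol, numericcol)]
print(dt)
```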

create list from columns of data table expression

Get the data in long format and then aggregate by group.

library(data.table)

dt_long <- melt(dt, measure.vars = c('a', 'b'))
dt_long[, .N, .(variable, value)]

#   variable value N
#1:        a     1 2
#2:        a     2 1
#3:        a     3 1
#4:        a     7 1
#5:        b     4 3
#6:        b     5 1
#7:        b     6 1
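The input table is not shown, so here is a small self-contained sketch with a hypothetical `dt` whose counts match the output above. The row order of the two columns is a guess.

```r
library(data.table)

# Hypothetical input whose counts match the output above
dt <- data.table(a = c(1, 1, 2, 3, 7), b = c(4, 4, 4, 5, 6))

# Long format: one row per (column name, value) pair
dt_long <- melt(dt, measure.vars = c("a", "b"))

# Count how often each value occurs within each original column
counts <- dt_long[, .N, by = .(variable, value)]
print(counts)
```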

In the tidyverse:

library(dplyr)
library(tidyr)

dt %>%
  pivot_longer(cols = everything()) %>%
  count(name, value)

R data table - create a new column where each element is a list of values

If we need a list column in the dataset, wrap the result in list():

DT[, UniqueCats := list(list(sort(unique(Category)))) , by = UserID]
str(DT)
#Classes ‘data.table’ and 'data.frame':  4 obs. of  6 variables:
# $ UserID      : chr  "aaa" "bbb" "aaa" "aaa"
# $ Time        : chr  "7:50" "5:05" "8:40" "10:00"
# $ ArticleID   : chr  "x" "x" "y" "z"
# $ Category    : chr  "sports" "sports" "politics" "sports"
# $ NumOfReading: int  1 1 2 3
# $ UniqueCats  :List of 4
#  ..$ : chr  "politics" "sports"
#  ..$ : chr "sports"
#  ..$ : chr  "politics" "sports"
#  ..$ : chr  "politics" "sports"

We can also create a string column by concatenating the elements with toString (a wrapper around paste(..., collapse = ", ")):

DT[, uniqueCats := toString(sort(unique(Category))), by = UserID]
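Both variants can be tried side by side with the sketch below. The data is taken from the `str()` output above; the column name `UniqueCatsStr` is introduced here only to keep the two results apart.

```r
library(data.table)

# Data taken from the str() output above
DT <- data.table(
  UserID = c("aaa", "bbb", "aaa", "aaa"),
  Time = c("7:50", "5:05", "8:40", "10:00"),
  ArticleID = c("x", "x", "y", "z"),
  Category = c("sports", "sports", "politics", "sports"),
  NumOfReading = c(1L, 1L, 2L, 3L)
)

# List column: each row holds a character vector
DT[, UniqueCats := list(list(sort(unique(Category)))), by = UserID]

# String column: the same values collapsed with ", "
DT[, UniqueCatsStr := toString(sort(unique(Category))), by = UserID]
print(DT)
```

The list column keeps the categories as separate elements for later processing, while the string column is easier to print or export.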

How to create a new column in data.table based on values of other columns

Another option is to use indexing to find the rows that fit the condition and update only those rows:

#for each group of ID and Cycle, 
#find the row indices where Cycle_Date equals the last Positive_Test_Date 
idxDT <- DT[, .I[Cycle_Date==Positive_Test_Date[.N]], .(ID, Cycle)]

#for those row indices, set LH_Date to Cycle_Date
#(NA indices are skipped and all other rows are left as NA, by design in data.table)
DT[idxDT$V1, LH_Date := Cycle_Date]

idxDT looks like this and idxDT$V1 extracts the column V1:

   ID Cycle V1
1:  1     1  2
2:  1     1 NA
3:  1     2  7
4:  1     2 NA
5:  2     1  9
6:  2     1 NA
7:  2     2 14
8:  2     2 NA

.I contains the row index within a data.table. From ?.I:

.I is an integer vector equal to seq_len(nrow(x)). While grouping, it holds for each item in the group, its row location in x. This is useful to subset in j; e.g. DT[, .I[which.max(somecol)], by=grp].
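The idiom quoted from the help page can be tried on a toy table. This is a separate illustration with hypothetical data (`grp`, `somecol`, and `idx` are made-up names), not the cycle data used in this answer.

```r
library(data.table)

# Hypothetical data: two groups with a numeric column
DT <- data.table(
  grp = c("a", "a", "b", "b", "b"),
  somecol = c(3, 9, 4, 1, 8)
)

# Row number in DT (not within the group) of each group's maximum
idx <- DT[, .I[which.max(somecol)], by = grp]
print(idx)

# Use those row numbers to pull the full rows
print(DT[idx$V1])
```

Here `idx$V1` holds rows 2 and 5, the positions of the largest `somecol` in groups a and b.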

DT after the update:

    ID Cycle Cycle_Day Cycle_Date Positive_Test_Date   LH_Date
 1:  1     1         1  3/28/2019               <NA>      <NA>
 2:  1     1         2  3/29/2019               <NA> 3/29/2019
 3:  1     1         3  3/30/2019               <NA>      <NA>
 4:  1     1        NA       <NA>          3/29/2019      <NA>
 5:  1     2         1  4/23/2019               <NA>      <NA>
 6:  1     2         2  4/24/2019               <NA>      <NA>
 7:  1     2         3  4/25/2019               <NA> 4/25/2019
 8:  1     2        NA       <NA>          4/25/2019      <NA>
 9:  2     1         1  3/18/2019               <NA> 3/18/2019
10:  2     1         2  3/19/2019               <NA>      <NA>
11:  2     1         3  3/20/2019               <NA>      <NA>
12:  2     1        NA       <NA>          3/18/2019      <NA>
13:  2     2         1  4/23/2019               <NA>      <NA>
14:  2     2         2  4/24/2019               <NA> 4/24/2019
15:  2     2         3  4/25/2019               <NA>      <NA>
16:  2     2        NA       <NA>          4/24/2019      <NA>

data:

library(data.table)
DT <- fread("ID  Cycle  Cycle_Day Cycle_Date  Positive_Test_Date
1   1      1         3/28/2019   NA
1   1      2         3/29/2019   NA
1   1      3         3/30/2019   NA
1   1      NA        NA          3/29/2019
1   2      1         4/23/2019   NA 
1   2      2         4/24/2019   NA
1   2      3         4/25/2019   NA
1   2      NA        NA          4/25/2019
2   1      1         3/18/2019   NA
2   1      2         3/19/2019   NA
2   1      3         3/20/2019   NA
2   1      NA        NA          3/18/2019
2   2      1         4/23/2019   NA 
2   2      2         4/24/2019   NA
2   2      3         4/25/2019   NA
2   2      NA        NA          4/24/2019")

Using a List to Fetch Columns from a DataTable

You can replace:

Dim arrayOfObjects()() As Object = DT.AsEnumerable().Select(Function(b) {b("x1"), b("x2"), b("x3")}).ToArray()

With:

Dim mystr As String = "x1,x2,x3"

Dim tarCols As String() = mystr.Split({","}, StringSplitOptions.RemoveEmptyEntries)

' Shortcut
' Dim tarCols = { "x1", "x2", "x3" }

Dim arrayOfObjects As Object()() = dt.DefaultView.ToTable(False, tarCols).
    AsEnumerable().Select(Function(x) x.ItemArray).ToArray()

This extracts the values of one or more given DataColumns and builds the jagged array from them.

How to Filter Data Table Rows with condition on column of Type list() in R

You can use the sapply function to check, for each row, whether any of the values in vals is in Product:

vals = c("UG12210","UG10000-WISD")

dt[Period %chin% "2018-Q1" & sapply(Product, function(v) any(vals %chin% v))]

#            Id  Period                      Product
# 1: 1000797366 2018-Q1                 UG10000-WISD
# 2: 1000797366 2018-Q1 NX11100,UG10000-WISD,UG12210
# 3: 1000797366 2018-Q1         UG10000-WISD,UG12210
# 4: 1000797366 2018-Q1         UG10000-WISD,UG12210
# 5: 1000797366 2018-Q1                      UG12210
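A self-contained sketch of the same filter follows. The table is hypothetical (smaller than the one in the output above), and it uses `vapply()` in place of `sapply()` as a stricter alternative that guarantees one logical value per row.

```r
library(data.table)

# Hypothetical data; Product is a list column of character vectors
dt <- data.table(
  Id = 1:4,
  Period = c("2018-Q1", "2018-Q1", "2018-Q2", "2018-Q1"),
  Product = list("UG10000-WISD", c("NX11100", "UG12210"), "UG12210", "NX11100")
)
vals <- c("UG12210", "UG10000-WISD")

# TRUE for rows whose Product vector contains at least one of vals
keep <- vapply(dt$Product, function(v) any(vals %chin% v), logical(1))

res <- dt[Period == "2018-Q1" & keep]
print(res)
```

Computing `keep` as a separate step makes the per-row check easy to inspect before it is combined with the Period condition.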