Data.Table Objects Assigned With := from Within Function Not Printed

As David Arenburg mentions in a comment, the answer can be found here. A bug was fixed in version 1.9.6, but the fix introduced this downside.

To prevent this behaviour, call DT[] at the end of the function, i.e. append [] to the last := call.

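library(data.table)
mydt <- data.table(x = 1:3, y = 5:7)  # assumed sample data, chosen to match the output below

myfunction <- function(dt) {
  # the trailing [] makes the result print when the function is called at the prompt
  dt[, z := y - x][]
}
myfunction(mydt) # prints immediately
#    x y z
# 1: 1 5 4
# 2: 2 6 4
# 3: 3 7 4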

This is described in data.table FAQ 2.23:

Why do I have to type DT sometimes twice after using := to print the result to console?

This is an unfortunate downside to get #869 to work. If a := is used inside a function with no DT[] before the end of the function, then the next time DT is typed at the prompt, nothing will be printed. A repeated DT will print. To avoid this: include a DT[] after the last := in your function. If that is not possible (e.g., it's not a function you can change) then print(DT) and DT[] at the prompt are guaranteed to print. As before, adding an extra [] on the end of := query is a recommended idiom to update and then print; e.g. > DT[,foo:=3L][].
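
It may help to keep in mind that only printing is affected: := modifies the table by reference whether or not anything shows up at the console. A minimal sketch, assuming a hypothetical function add_z() and a sample table DT that are not part of the FAQ:

```r
library(data.table)

# Hypothetical sample data and function, for illustration only
DT <- data.table(x = 1:3, y = 5:7)

add_z <- function(dt) {
  dt[, z := y - x]   # no trailing [] here, so the FAQ caveat applies
}

add_z(DT)            # at the prompt this may print nothing
names(DT)            # the column z has been added by reference regardless
DT[]                 # per the FAQ, DT[] is guaranteed to print
```

So a silent prompt after such a call does not mean the update was lost; checking names(DT) or calling DT[] is enough to confirm it.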

data.table is not displayed on first call after being modified in a function

If you are using v1.9.6, see the corresponding Readme (sec. Bugfixes, 1st entry, https://github.com/Rdatatable/data.table):

if (TRUE) DT[,LHS:=RHS] no longer prints, #869 and #1122. Tests added. To get this to work we've had to live with one downside: if a := is used inside a function with no DT[] before the end of the function, then the next time DT or print(DT) is typed at the prompt, nothing will be printed. A repeated DT or print(DT) will print. To avoid this: include a DT[] after the last := in your function. If that is not possible (e.g., it's not a function you can change) then DT[] at the prompt is guaranteed to print. As before, adding an extra [] on the end of a := query is a recommended idiom to update and then print; e.g. > DT[,foo:=3L][]. Thanks to Jureiss and Jan Gorecki for reporting.

Thus: Does calling DT[] after your function call help?
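
If the function cannot be changed, the Readme entry above suggests forcing the print at the call site. A rough sketch, where third_party_update() is a made-up stand-in for such a function:

```r
library(data.table)

DT <- data.table(x = 1:3)

# Hypothetical function we pretend we cannot edit; it uses := with no DT[] afterwards
third_party_update <- function(dt) {
  dt[, foo := 3L]
  invisible(NULL)
}

third_party_update(DT)
DT[]   # appending [] at the prompt forces the table to print
```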

R function returns nothing instead of data.table object when data.table := is last operation

library(data.table)

data <- data.table(x = 1:3)

test_function_1 <- function(df){
  df[, new_column := 1][]
}

test_function_2 <- function(df){
  df[, new_column := 1][]
  return(df)
}

test_function_3 <- function(df){
  df[, new_column := 1]
  data.table(df)
}

test_function_1(data) # returns the modified data.table
test_function_2(data) # returns the modified data.table
test_function_3(data) # returns the modified data.table

More info: here
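
Note that "returns nothing" in the question title is about printing only. The sketch below, using a hypothetical test_function_0 without the trailing [], suggests that the function still returns the data.table; it is just not shown at the prompt:

```r
library(data.table)

data <- data.table(x = 1:3)

# Hypothetical variant of the functions above, with := as the last operation
test_function_0 <- function(df) {
  df[, new_column := 1]
}

res <- test_function_0(data)  # capture the return value instead of relying on auto-print
is.data.table(res)            # the result is still a data.table
names(res)                    # it contains the new column
```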

An option to not suppress output after := assignment in data.table

One approach in 1.9.6 is to patch the print.data.table S3 method.

Before calling the original method, reset the .global$print value to "" (its default). This undoes the change data.table made to that value just before the generic print method was called, which is how it signals that the result should be returned invisibly (e.g., after an assignment := line).

The effect is that the custom print method for data.table is still called, but data.table no longer tries to modify R's default logic to decide when and when not to print.

Likely a naive solution, as I'm still learning about packages, namespaces, environments, S3 methods, etc.

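library(data.table)
library(R.methodsS3)

# keep a reference to data.table's own print method
print.data.table.orig = get('print.data.table', envir=asNamespace('data.table'))
print.data.table.patch = function(x, ...) {
  # reset the flag that data.table sets to suppress the next print
  .globalRef = get('.global', envir=asNamespace('data.table'))
  .globalRef$print = ""
  print.data.table.orig(x, ...)
}

# register the patched function as the S3 print method for data.table
setMethodS3('print', 'data.table', print.data.table.patch)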


fTbl = data.table(x=1:500000)
fTbl[, x := 5]
        x
     1: 5
     2: 5
     3: 5
     4: 5
     5: 5
    ---
499996: 5
499997: 5
499998: 5
499999: 5
500000: 5

fTbl
        x
     1: 5
     2: 5
     3: 5
     4: 5
     5: 5
    ---
499996: 5
499997: 5
499998: 5
499999: 5
500000: 5

Manipulate data.table objects within user defined function

This problem is a different flavor of the one described in the post "Function on data.table environment errors". It's not exactly a bug, just how dget is designed. For those curious: it happens because dget gives the object base as its parent environment, and the base namespace isn't data.table aware.

If x is a function the associated environment is stripped. Hence scoping information can be lost.

One workaround is to set the function's environment to the global environment:

> environment(foo) <- .GlobalEnv

But I think the best solution here is to use saveRDS to transfer R objects, which is what ?dget recommends.
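
As a rough illustration of both options, the sketch below reads a function back with dget, resets its environment, and then round-trips another function with saveRDS/readRDS. The functions foo and bar, the temporary file paths, and the sample table are all made up for this example:

```r
library(data.table)

dt <- data.table(x = 1:3, y = 5:7)   # hypothetical sample data

# Write a function definition to a file and read it back with dget()
src <- tempfile(fileext = ".R")
writeLines("function(dt) dt[, z := y - x][]", src)
foo <- dget(src)

# Per the answer above, foo is not data.table aware at this point,
# so this call is expected to fail; try() keeps the script running
try(foo(dt), silent = TRUE)

# Workaround 1: point the function at the global environment
environment(foo) <- .GlobalEnv
res1 <- foo(dt)

# Workaround 2: transfer the function with saveRDS()/readRDS() instead
bar <- function(dt) dt[, w := x + y][]
rds <- tempfile(fileext = ".rds")
saveRDS(bar, rds)
bar2 <- readRDS(rds)
res2 <- bar2(dt)
```

The second route avoids the manual environment fix because the function is not rebuilt from parsed source text.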

Data.table is not returned visibly after applying ':='

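We need to add [] after the assignment:

library(data.table)
dt <- data.table(a = 1, b = 2)  # assumed sample data, chosen to match the output below

testf <- function(dt){
  # seq_len() is the idiomatic way to number rows; the trailing [] makes the result print
  dt[, t := seq_len(nrow(dt))][]
}
testf(dt)
#    a b t
# 1: 1 2 1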

