"Long Vectors Not Supported Yet" Error in Rmd But Not in R Script

long vectors not supported yet error in Rmd but not in R Script

I also ran into this today, and fixed it by setting cache.lazy = FALSE in the setup chunk of my .Rmd, which makes knitr load cached objects directly instead of lazy-loading them.

So the first chunk in my R Markdown file looks like this:

library(knitr)
knitr::opts_chunk$set(cache = TRUE, warning = FALSE,
                      message = FALSE, cache.lazy = FALSE)

Large Matrices in R: long vectors not supported yet

A matrix is just an atomic vector with a dimension attribute, which allows R to access it as a matrix. Your matrix is a vector of length 4000*9000000, which is 3.6e+10 elements, well beyond the largest integer value (2^31 - 1, approx. 2.147e+9). Subsetting a long vector (i.e. accessing elements beyond the 2.147e+9 limit) is supported for atomic vectors. Just treat your matrix as a long vector.

Remember that by default R fills matrices column-wise. Assuming the matrix has 4000 rows and 9000000 columns, the value at test[ 2701 , 850000 ] sits at linear index (column - 1) * nrow + row, so we could access it via:

i <- ( 850000 - 1 ) * 4000 + 2701
test[i]
#[1] 1

Note that this really is long vector subsetting, because the index does not fit in an integer:

849999L * 4000L
#[1] NA
#Warning message:
#In 849999L * 4000L : NAs produced by integer overflow
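
The index arithmetic can be wrapped in a small helper. This is only a sketch: linear_index is a hypothetical name, not a base R function, and the 4000-row shape is an assumption taken from the dimensions quoted above. It converts its arguments to double so the product cannot overflow the integer range, and it is checked against ordinary [row, col] indexing on a small matrix.

```r
# Hypothetical helper: column-major linear index of element [row, col]
# in a matrix with `nrow` rows. as.numeric() keeps the arithmetic in
# double precision, so the product cannot overflow the integer range.
linear_index <- function(row, col, nrow) {
  (as.numeric(col) - 1) * as.numeric(nrow) + as.numeric(row)
}

# Sanity check against ordinary matrix indexing on a small matrix.
m <- matrix(seq_len(12), nrow = 3, ncol = 4)
stopifnot(m[linear_index(2, 3, nrow(m))] == m[2, 3])

# Index for the example above (4000 rows assumed). It exceeds
# .Machine$integer.max, so it is only representable as a double.
i <- linear_index(2701L, 850000L, 4000L)
i > .Machine$integer.max
```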

Error during wrapup: long vectors not supported yet: in glm() function

To close this question, I should mention that @Axeman's answer is the only feasible approach for my problem. The underlying issue is that there is not enough memory to handle such a huge design matrix.

Therefore, running the probit regression with the bigglm() function from the biglm package is the only solution I have found so far.

However, because biglm works through the data iteratively in chunks, using factor() variables on the right-hand side is problematic whenever a factor level is not represented in a chunk. In other words, if a factor variable has 5 levels but only 4 of them appear in a given chunk, the estimation fails with an error.

There are several questions and comments about this on Stack Overflow.
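
The effect can be reproduced in base R without biglm. The sketch below uses made-up toy data (the names chunk, g and all_levels are illustrative) to show why a chunk with a missing level yields a design matrix of a different width, and how declaring the full level set up front keeps the width constant. Whether this alone is enough for a particular bigglm() call is an assumption to verify, not something the answer above states.

```r
# Toy data: the factor has 5 possible levels, but this chunk contains only 4.
all_levels <- c("a", "b", "c", "d", "e")
chunk <- data.frame(y = c(1, 0, 1, 0),
                    g = c("a", "b", "c", "d"),
                    stringsAsFactors = FALSE)

# Levels inferred from the chunk alone: the design matrix has one
# column fewer than a chunk containing all 5 levels would produce.
mm_inferred <- model.matrix(y ~ factor(g), data = chunk)

# Levels declared up front: the unused level "e" still gets its own
# (all-zero) column, so every chunk produces the same column layout.
chunk$g <- factor(chunk$g, levels = all_levels)
mm_fixed <- model.matrix(y ~ g, data = chunk)

ncol(mm_inferred)
ncol(mm_fixed)
```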

Rscript Error in if (nx >= 2^31 || ny >= 2^31) stop("long vectors are not supported")

I guess you ran your R script from the command line on larger or different files. The built-in (base) merge function won't merge data frames with 2^31 or more rows. Check the merge.data.frame code:

...
nx <- nrow(x <- as.data.frame(x))
ny <- nrow(y <- as.data.frame(y))
if (nx >= 2^31 || ny >= 2^31)
    stop("long vectors are not supported")
...

Try alternative merge functions, such as the ..._join functions in the dplyr package or the data.table package, which is generally the most efficient option.
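
As a rough sketch of what the switch looks like with data.table (assuming the package is installed; the tables x and y and the key column id are hypothetical stand-ins for the real inputs):

```r
library(data.table)

# Hypothetical toy tables standing in for the real inputs.
x <- data.table(id = c(1L, 2L, 3L), a = c("p", "q", "r"))
y <- data.table(id = c(2L, 3L, 4L), b = c(10, 20, 30))

# For data.table inputs, merge() dispatches to data.table's own method
# rather than merge.data.frame; by default it is an inner join on `by`.
res <- merge(x, y, by = "id")
res
```

This only shows the call pattern on small inputs. Whether a given package copes with inputs as large as the ones that triggered the error depends on its own row limits and on available memory, so check that before relying on it.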


