Memory Allocation "Error: Cannot Allocate Vector of Size 75.1 Mb"

Memory Allocation Error: cannot allocate vector of size 75.1 Mb

R has gotten to the point where the OS cannot allocate it another 75.1 Mb chunk of RAM. That is the size of the memory chunk required to do the next sub-operation; it is not a statement about the total amount of RAM required to complete the entire process. By this point all your available RAM is exhausted, you need more memory to continue, and the OS is unable to make more RAM available to R.
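
As a rough illustration (not from the original answer), you can see how a chunk of that size arises: a numeric vector costs 8 bytes per element, so roughly 9.8 million doubles comes to about a 75 Mb allocation:

x <- numeric(9.8e6)                      # ~9.8 million doubles at 8 bytes each
print(object.size(x), units = "Mb")      # roughly 75 Mb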

Potential solutions to this are manifold. The obvious one is to get hold of a 64-bit machine with more RAM. On 32-bit Windows, any single process can only use a limited amount of RAM (2 GB by default), and in any case Windows retains a chunk of memory for itself, so the RAM available to R will be somewhat less than the 3.4 GB you have. On 64-bit Windows, R will be able to use more RAM, and the maximum amount of RAM you can fit/install is also much higher.

If that is not possible, consider an alternative approach: do your simulations in batches, with the n per batch much smaller than N. That way you draw a much smaller number of simulations, do whatever you need with them, collect the results, and repeat until you have done sufficient simulations. You don't show what N is, but I suspect it is big, so run smaller batches a number of times to reach N overall.
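
A minimal sketch of that batching idea (the function one_sim and the sizes below are placeholders, not from the question):

one_sim <- function(n) mean(rnorm(n))    # stand-in for the real simulation
n_total    <- 1e6                        # total number of simulations wanted
batch_size <- 1e4                        # much smaller chunk held in memory at once
results <- vapply(seq_len(n_total / batch_size),
                  function(i) one_sim(batch_size),
                  numeric(1))            # keep only one summary per batch
length(results)                          # 100 batch-level results, not 1e6 raw draws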

R memory management / cannot allocate vector of size n Mb

Consider whether you really need all of this data explicitly, or whether the matrix can be sparse. There is good support in R for sparse matrices (see, e.g., the Matrix package).
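
For example (a small sketch, not taken from the original answer), a mostly-zero matrix stored with the Matrix package takes a tiny fraction of the memory its dense equivalent would need:

library(Matrix)
## a 10,000 x 10,000 matrix with only 1,000 non-zero entries
m_sparse <- sparseMatrix(i = sample(1e4, 1e3), j = sample(1e4, 1e3),
                         x = runif(1e3), dims = c(1e4, 1e4))
print(object.size(m_sparse), units = "Mb")   # well under 1 Mb
## the dense equivalent would need 1e4 * 1e4 * 8 bytes, i.e. roughly 760 Mb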

Keep all other processes and objects in R to a minimum when you need to make objects of this size. Use gc() to reclaim memory that is no longer used, or, better, create only the objects you need in a single session.
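
For instance (a small sketch), drop large intermediate objects as soon as you are finished with them and let the garbage collector reclaim the space:

big_tmp <- matrix(0, 2000, 2000)   # a large intermediate object (~30 Mb)
## ... use big_tmp ...
rm(big_tmp)                        # drop the reference
gc()                               # run garbage collection and print memory usage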

If the above does not help, get a 64-bit machine with as much RAM as you can afford, and install 64-bit R.

If you cannot do that, there are many online services for remote computing.

If you cannot do that, memory-mapping tools such as the ff package (or bigmemory, as Sascha mentions) will help you build a new solution. In my limited experience ff is the more advanced package, but you should read the High Performance Computing topic on CRAN Task Views.
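
As a rough sketch of the file-backed approach with bigmemory (the file names here are placeholders), the data live on disk and are memory-mapped on demand rather than held entirely in RAM:

library(bigmemory)
x <- big.matrix(nrow = 1e6, ncol = 100, type = "double",
                backingfile = "x.bin", descriptorfile = "x.desc")
x[1:5, 1] <- rnorm(5)              # indexed reads/writes work much like a regular matrix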

mgcv bam() error: cannot allocate vector of size 99.6 Gb

To diagnose more precisely where the problem is, try fitting your model with various terms left out (a sketch of this term-by-term approach follows the list below). There are several terms in the model that could blow up on you:

  • the fixed effects involving center will blow up to 300 columns * 10^6 rows
  • depending on whether year is numeric or a factor, the year*center term could blow up to 600 columns or (nyears * 300) columns
  • it's not clear to me whether bam uses sparse matrices for s(.,bs="re") terms; if not, you'll be in big trouble (2*10^5 columns * 10^6 rows)
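
A hypothetical sketch of that term-by-term diagnosis, reusing the variable names from the question (the data frame is assumed to be called data, as in the calls below):

library(mgcv)
## start with the smooth terms only, then add the potentially explosive terms
## one at a time and watch which addition triggers the allocation error
m0 <- bam(haz ~ s(month, bs = "cc", k = 12) + sex + s(age), data = data)
m1 <- bam(haz ~ s(month, bs = "cc", k = 12) + sex + s(age) + center, data = data)
m2 <- bam(haz ~ s(month, bs = "cc", k = 12) + sex + s(age) + center +
            s(child, bs = "re"), data = data)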

As an order of magnitude: a vector of 10^6 numeric values (one column of your model matrix) takes about 7.6 Mb, so 500 GB / 7.6 MB works out to approximately 65,000 columns ...
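
To check that arithmetic directly (illustration only; 1 GB is treated as 1000 MB here):

print(object.size(numeric(1e6)), units = "Mb")   # one model-matrix column: ~7.6 Mb
500 * 1000 / 7.6                                  # ~65,800 such columns in 500 GB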

Just taking a guess here, but I would try out the gamm4 package. It's not specifically geared for low-memory use, but:

‘gamm4’ is most useful when the random effects are not i.i.d., or when there are large numbers of random coeffecients [sic] (more than several hundred), each applying to only a small proportion of the response data.

I would also make most of the terms into random effects:

## random effects go in gamm4's 'random' argument (lme4-style syntax)
gamm4::gamm4(haz ~ s(month, bs = "cc", k = 12) + sex + s(age),
             random = ~ (1|center) + (1|year) + (1|year:center) + (1|child),
             data = data)

or, if there are not very many years in the data set, treat year as a fixed effect:

## year as a fixed effect; the remaining random effects in 'random'
gamm4::gamm4(haz ~ s(month, bs = "cc", k = 12) + sex + s(age) + year,
             random = ~ (1|center) + (1|year:center) + (1|child),
             data = data)

If there are only a small number of years, then (year|center) might make sense, to assess among-center variation and covariation among years; if there are many years, consider making year a smooth term instead.
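
Hypothetical sketches of those two alternatives, using the variable names from the question (the factor-smooth basis "fs" is one way to let the year effect vary smoothly by center; these are formula fragments only, not complete fits):

## few years: random year effects within center
re_few <- ~ (year | center)

## many years: a smooth over year instead, e.g. one smooth trend per center
f_many <- haz ~ s(month, bs = "cc", k = 12) + sex + s(age) +
  s(year, center, bs = "fs")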

R + ggplot2 - Cannot allocate vector of size 128.0 Mb

Are you sure you have 9 million lines in a 4.5 MB file (or perhaps your file is actually 4.5 GB)? It would have to be very heavily compressed; when I create a file one tenth that size, it comes out at 115 Mb ...

## simulate one tenth as many rows (9e5) and write them gzip-compressed
n <- 9e5
set.seed(1001)
z <- rnorm(n)
z <- cumsum(z) / sum(z)
d <- data.frame(V1 = seq(0, 1, length.out = n), V2 = z)
ff <- gzfile("lgfile2.gz", "w")
write.table(d, file = ff, row.names = FALSE, col.names = FALSE)
close(ff)
file.info("lgfile2.gz")["size"]

It's hard to tell from the information you've given what kind of "duplicate lines" you have in your data set ... unique(dataset) will extract just the unique rows, but that may not be useful. I would probably start by simply thinning the data set by a factor of 100 or 1000:

smdata <- dataset[seq(1, nrow(dataset), by = 1000), ]

and see how it goes from there.

Graphical representations of large data sets are often a challenge. In general you will be better off:

  • summarizing the data somehow before plotting it
  • using a specialized graphical type (density plots, contours, hexagonal binning) that reduces the data (see the sketch after this list)
  • using base graphics, which uses a "draw and forget" model (unless graphics recording is turned on, e.g. in Windows), rather than lattice/ggplot/grid graphics, which save a complete graphical object and then render it
  • using raster or bitmap graphics (PNG etc.), which only record the state of each pixel in the image, rather than vector graphics, which save all objects whether they overlap or not
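
A minimal sketch of the hexagonal-binning option with ggplot2 (this assumes a data frame dataset with columns V1 and V2, as above, and that the hexbin package is installed):

library(ggplot2)
ggplot(dataset, aes(x = V1, y = V2)) +
  geom_hex(bins = 50)              # each hexagon encodes a count, not raw points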

R: Cannot allocate memory greater than x MB

Yes, you should be using 64-bit R if you can.

See the related questions above, and the section on memory limits in the R documentation (e.g. help("Memory-limits")).
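
A quick way to confirm which build you are running:

.Machine$sizeof.pointer   # 8 on 64-bit R, 4 on 32-bit R
R.version$arch            # e.g. "x86_64" for a 64-bit build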


