caught segfault - 'memory not mapped' error in R
This is neither a real explanation of the problem nor a fully satisfactory answer, but after examining the code more closely I found that, in the first example, the segfault appears when acast from the reshape2 package is called. I removed that call because it was not actually needed there; it can be replaced with base R's reshape (from the stats package, as shown in another question): reshape(input, idvar="x", timevar="y", direction="wide")[-1].
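To make the replacement concrete, here is a minimal sketch of that reshape call on made-up data. The data frame `input` and its `value` column are assumptions for illustration only; the original data was not shown.

```r
# Hypothetical long-format data: one row per (x, y) combination.
input <- data.frame(
  x     = rep(c("a", "b"), each = 3),
  y     = rep(1:3, times = 2),
  value = 1:6
)

# Long -> wide with base R: one row per x, one "value.<y>" column per y.
wide <- reshape(input, idvar = "x", timevar = "y", direction = "wide")

# Drop the first column (the id variable x), as in the call quoted above.
wide_values <- wide[-1]
print(wide_values)
```

This only needs base R, so reshape2 does not have to be loaded at all.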
As for the second example, the exact cause is hard to pin down, but a workaround that helped in my case was to reduce the number of cores used for the parallel computation. The cluster has 48 cores; I was already using only 15, because even before this issue R ran out of memory when the code was run on all 48. When I reduced the number to 10, the code started working as before.
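A rough sketch of capping the worker count is shown below. It assumes the parallel package is being used; the original code was not shown, so the cap of 10, the variable names, and the toy task are all illustrative.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1.
n_avail <- detectCores()
if (is.na(n_avail)) n_avail <- 1L

# Cap the number of workers (10 is the value that worked in the case above).
n_workers <- min(10L, n_avail)

cl <- makeCluster(n_workers)
# Toy task standing in for the real computation; always stop the cluster.
res <- tryCatch(
  parLapply(cl, 1:20, function(i) i^2),
  finally = stopCluster(cl)
)
```

Each worker is a separate R process with its own memory, so fewer workers usually means a lower peak memory footprint.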
caught segfault error in R
In case anyone else runs into this or a similar problem in the future: I sent a bug report to the package maintainer, who recommended uninstalling all installed packages and starting over. I took his advice and it worked!
I followed the advice from this posting: http://r.789695.n4.nabble.com/Reset-R-s-library-to-base-packages-only-remove-all-installed-contributed-packages-td3596151.html
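# Matrix of all installed packages and their metadata
ip <- installed.packages()
# Keep only contributed packages, i.e. those not marked "base" or "recommended"
pkgs.to.remove <- ip[!(ip[,"Priority"] %in% c("base", "recommended")), 1]
# Uninstall them one by one
sapply(pkgs.to.remove, remove.packages)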
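Before removing anything, it may be worth recording which packages will be affected so they can be reinstalled later. The following is a dry-run sketch that removes nothing; the file location and variable names are assumptions, not part of the original advice.

```r
# Dry run: list the contributed packages without removing them.
ip <- installed.packages()
is_core <- ip[, "Priority"] %in% c("base", "recommended")
pkgs_to_remove <- unname(ip[!is_core, "Package"])

# Save the names (hypothetical location) so they can be reinstalled later.
backup_file <- tempfile(fileext = ".txt")
writeLines(pkgs_to_remove, backup_file)
message(length(pkgs_to_remove), " packages would be removed; names saved to ",
        backup_file)

# After the cleanup, something like this would reinstall them from CRAN:
# install.packages(readLines(backup_file))
```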
fisher.test crash R with *** caught segfault *** error
I can confirm that this is a bug in R 4.2 and that it is now fixed in the development branch of R (with this commit on 7 May). I wouldn't be surprised if it were ported to a patch-release sometime soon, but that's unknown/up to the R developers. Running your example above doesn't segfault any more, but it does throw an error:
Error in fisher.test(d, simulate.p.value = FALSE) :
FEXACT[f3xact()] error: hash key 5e+09 > INT_MAX, kyy=203, it[i (= nco = 6)]= 0.
Rather set 'simulate.p.value=TRUE'
So this makes your workflow better (you can handle these errors with try()/tryCatch()), but it doesn't necessarily satisfy you if you really want to perform an exact Fisher test on these data. (Exact tests on large tables with large entries are extremely computationally difficult, as they essentially have to do computations over the set of all possible tables with given marginal values.)
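As a sketch of the tryCatch() approach: the wrapper below tries the exact test first and falls back to simulation if an error is thrown. The function name `safe_fisher` and the default `B` are made up for illustration. Note that this only helps on R versions where the failure is an ordinary error; a segfault kills the process and cannot be caught.

```r
# Hypothetical helper: exact test first, Monte Carlo fallback on error.
safe_fisher <- function(tab, B = 10000) {
  tryCatch(
    fisher.test(tab)$p.value,
    error = function(e) {
      message("Exact test failed (", conditionMessage(e), "); simulating instead")
      fisher.test(tab, simulate.p.value = TRUE, B = B)$p.value
    }
  )
}

# Small table where the exact test succeeds directly.
tab <- matrix(c(3, 1, 1, 3), nrow = 2)
p <- safe_fisher(tab)
```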
I don't have any brilliant ideas for detecting the exact conditions that will cause this problem (maybe you can come up with a rough rubric based on the dimensions of the table and the sum of the counts in the table, e.g. if (prod(dim(d)) > 30 && sum(d) > 200) ... ?).
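The rubric above could be wrapped in a small predicate like the following. The thresholds 30 and 200 are just the rough guesses from the text, not tested limits, and the function name `needs_simulation` is hypothetical.

```r
# Hypothetical pre-check: is the table likely too large for the exact test?
needs_simulation <- function(d, max_cells = 30, max_total = 200) {
  prod(dim(d)) > max_cells && sum(d) > max_total
}

small <- matrix(c(3, 1, 1, 3), nrow = 2)  # 4 cells, total count 8
big   <- matrix(10, nrow = 6, ncol = 6)   # 36 cells, total count 360

needs_simulation(small)
needs_simulation(big)
```

The result can then be passed on, e.g. fisher.test(d, simulate.p.value = needs_simulation(d)).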
Setting simulate.p.value=TRUE is the most sensible approach. However, if you expect precise results for extreme tables (e.g. you are working in bioinformatics and are going to apply a huge multiple-comparisons correction to the results), you're going to be disappointed. For example:
dd <- matrix(0, 6, 6)
dd[5,5] <- dd[6,6] <- 100
fisher.test(dd)$p.value
## 2.208761e-59, reported as "< 2.2e-16"
fisher.test(dd, simulate.p.value = TRUE, B = 10000)$p.value
# 9.999e-05
fisher.test(..., simulate.p.value = TRUE) will never return a value smaller than 1/(B+1) (this is what happens if none of the simulated tables are more extreme than the observed table: technically, the p-value ought to be reported as "<= 9.999e-05"). Therefore, you will never (in the lifetime of the universe) be able to calculate a p-value like 1e-59; you'll just be able to set a bound based on how large you're willing to make B.
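To see how slowly that bound shrinks, here is a small arithmetic sketch (no simulation is run; the values of B are arbitrary examples):

```r
# Smallest p-value a simulated test can report for a given B: 1 / (B + 1).
B <- c(1e3, 1e4, 1e5, 1e6)
bounds <- data.frame(B = B, min_p = 1 / (B + 1))
print(bounds)

# Number of replicates needed before the bound could reach a target p-value.
target_p <- 1e-59
B_needed <- 1 / target_p - 1
```

Each tenfold increase in B lowers the floor by only a factor of ten, so B_needed for a target of 1e-59 is on the order of 1e59 simulated tables.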