Why Can't I Get a P-Value Smaller Than 2.2E-16

Why can't I get a p-value smaller than 2.2e-16?

Try something like t.test(a,b)$p.value and see if that gives you the accuracy you need. I believe this has more to do with how the result is printed than with the actual stored value, which should have the necessary precision.
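
A minimal sketch (a and b are hypothetical vectors with a large mean difference):

set.seed(1)
a <- rnorm(50)
b <- rnorm(50, mean = 3)
t.test(a, b)           # the printed summary caps the display at "p-value < 2.2e-16"
t.test(a, b)$p.value   # the stored value, far smaller than 2.2e-16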

How should tiny $p$-values be reported? (and why does R put a minimum on 2.22e-16?)

There's a good reason for it.

The value can be found via noquote(unlist(format(.Machine)))

           double.eps        double.neg.eps           double.xmin 
         2.220446e-16          1.110223e-16         2.225074e-308 
          double.xmax           double.base         double.digits 
        1.797693e+308                     2                    53 
      double.rounding          double.guard     double.ulp.digits 
                    5                     0                   -52 
double.neg.ulp.digits       double.exponent        double.min.exp 
                  -53                    11                 -1022 
       double.max.exp           integer.max           sizeof.long 
                 1024            2147483647                     4 
      sizeof.longlong     sizeof.longdouble        sizeof.pointer 
                    8                    12                     4 

If you look at the help, (?".Machine"):

double.eps  

the smallest positive floating-point number x such that 1 + x != 1. It equals
double.base ^ ulp.digits if either double.base is 2 or double.rounding is 0;
otherwise, it is (double.base ^ double.ulp.digits) / 2. Normally 2.220446e-16.

It's essentially a value below which you can be quite confident the value will be pretty numerically meaningless - in that any smaller value isn't likely to be an accurate calculation of the value we were attempting to compute. (Having studied a little numerical analysis, depending on what computations were performed by the specific procedure, there's a good chance numerical meaninglessness comes in a fair way above that.)

But statistical meaning will have been lost far earlier. Note that p-values depend on assumptions, and the further out into the extreme tail you go the more heavily the true p-value (rather than the nominal value we calculate) will be affected by the mistaken assumptions, in some cases even when they're only a little bit wrong. Since the assumptions are simply not going to be all exactly satisfied, middling p-values may be reasonably accurate (in terms of relative accuracy, perhaps only out by a modest fraction), but extremely tiny p-values may be out by many orders of magnitude.

Which is to say that usual practice (something like the "<0.0001" that you say is common in packages, or the APA rule that Jaap mentions in his answer) is probably not so far from sensible practice, but the approximate point at which things lose meaning beyond saying 'it's very, very small' will of course vary quite a lot depending on circumstances.

This is one reason why I can't suggest a general rule - there can't be a single rule that's even remotely suitable for everyone in all circumstances - change the circumstances a little and the broad grey line marking the change from somewhat meaningful to relatively meaningless will change, sometimes by a long way.

If you were to specify sufficient information about the exact circumstances (e.g. it's a regression, with this much nonlinearity, that amount of variation in this independent variable, this kind and amount of dependence in the error term, that kind of and amount of heteroskedasticity, this shape of error distribution), I could simulate 'true' p-values for you to compare with the nominal p-values, so you could see when they were too different for the nominal value to carry any meaning.
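
For instance, here is a minimal sketch of that kind of comparison (a hypothetical setup, not the specific circumstances above): the nominal p-value from a one-sample t-test versus a Monte Carlo estimate of the 'true' p-value when the errors are actually skewed.

set.seed(123)
n <- 30
x <- rexp(n) - 1 + 0.8                    # skewed errors around a true mean of 0.8
tobs      <- t.test(x)$statistic
nominal_p <- t.test(x)$p.value            # relies on the normality assumption

## null distribution of the t statistic when errors are skewed but mean-zero
tnull  <- replicate(1e4, t.test(rexp(n) - 1)$statistic)
true_p <- mean(abs(tnull) >= abs(tobs))   # Monte Carlo two-sided p-value
c(nominal = nominal_p, simulated = true_p)

With only 1e4 replicates the simulation cannot resolve p-values much below 1e-4, which is itself a reminder of how hard it is to pin down the far tail.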

But that leads us to the second reason why - even if you specified enough information to simulate the true p-values - I still couldn't responsibly state a cut-off for even those circumstances.

What you report depends on people's preferences - yours, and your audience's. Imagine you told me enough about the circumstances for me to decide that I wanted to draw the line at a nominal $p$ of $10^{-6}$.

All well and good, we might think - except your own preference function (what looks right to you, were you to look at the difference between nominal p-values given by stats packages and the ones resulting from simulation when you suppose a particular set of failures of assumptions) might put it at $10^{-5}$, and the editors of the journal you want to submit to might have a blanket rule to cut off at $10^{-4}$, while the next journal might put it at $10^{-3}$, and the next may have no general rule, and the specific editor you got might accept even lower values than I gave ... but one of the referees may then have a specific cut-off!

In the absence of knowledge of their preference functions and rules, and the absence of knowledge of your own utilities, how do I responsibly suggest any general choice of what actions to take?

I can at least tell you the sorts of things that I do (and I don't suggest this is a good choice for you at all):

There are few circumstances (outside of simulating p-values) in which I would make much of a p less than $10^{-6}$ (I may or may not mention the value reported by the package, but I wouldn't make anything of it other than that it was very small; I would usually emphasize the meaninglessness of the exact number). Sometimes I take a value somewhere in the region of $10^{-5}$ to $10^{-4}$ and say that p was much less than that. On occasion I actually do as suggested above - perform some simulations to see how sensitive the p-value is in the far tail to various violations of the assumptions, particularly if there's a specific kind of violation I am worried about.

That's certainly helpful in informing a choice - but I am as likely to discuss the results of the simulation as to use them to choose a cut-off-value, giving others a chance to choose their own.

An alternative to simulation is to look at some procedures that are more robust* to the various potential failures of assumption and see how much difference to the p-value that might make. Their p-values will also not be particularly meaningful, but they do at least give some sense of how much impact there might be. If some are very different from the nominal one, it also gives more of an idea which violations of assumptions to investigate the impact of. Even if you don't report any of those alternatives, it gives a better picture of how meaningful your small p-value is.

* Note that here we don't really need procedures that are robust to gross violations of some assumption; ones that are less affected by relatively mild deviations of the relevant assumption should be fine for this exercise.
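
As a rough illustration of that kind of check (hypothetical skewed data; the two tests are simply a less robust and a more robust procedure for the same comparison):

set.seed(42)
x <- rlnorm(40)                            # skewed samples
y <- rlnorm(40, meanlog = 1)
c(t_test   = t.test(x, y)$p.value,         # assumes approximate normality
  wilcoxon = wilcox.test(x, y)$p.value)    # rank-based, less affected by skewness

If the two p-values differ by orders of magnitude, that is a hint that the far tail of the nominal value should not be taken literally.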

I will say that when/if you do come to do such simulations, even with quite mild violations, in some cases it can be surprising how far even not-that-small p-values can be wrong. That has done more to change the way I personally interpret a p-value than it has shifted the specific cut-offs I might use.

When submitting the results of an actual hypothesis test to a journal, I try to find out if they have any rule. If they don't, I tend to please myself, and then wait for the referees to complain.

How to get precise low p-value in R (from F test)

p-values this low are meaningless anyway. Firstly, most calculations use slight approximations, so the imprecision comes to dominate the result as you tend towards a zero p-value; secondly, and probably more importantly, any tiny deviation of your population from the modelled distribution will overwhelm the accuracy you desire.

Simply quote the p-values as 'p < 0.0001' and be done with it.

extract verbatim p-value from cor.test

Your p-value is not 2.2e-16, it's less than that. That means it's computationally zero. You can skip the zeros, or set zero to the arbitrarily small 2.2e-16.

# pv: a vector of p-values; exact zeros are capped at -log10(2.2e-16)
ifelse(pv == 0, -log(2.2e-16, base = 10), -log(pv, base = 10))

Of course, for those 30 out of 81 cases you do not know what the p-value is, just that it is very small.
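
A self-contained illustration with a hypothetical vector pv, one element of which has underflowed to zero:

pv <- c(0.03, 1e-10, 0)
ifelse(pv == 0, -log(2.2e-16, base = 10), -log(pv, base = 10))
# about 1.52, 10.00 and 15.66: the zero is capped at -log10(2.2e-16)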

Extract the full p.value from chisq.test

rslt = chisq.test(cbind(c(10,20),c(30,40)))

Pearson's Chi-squared test with Yates' continuity correction

data: cbind(c(10, 20), c(30, 40))
X-squared = 0.44643, df = 1, p-value = 0.504

We can always take the chi-squared statistic from the test and calculate the p-value ourselves. Using the example above, you can see they are the same:

rslt$p.value == pchisq(rslt$statistic, rslt$parameter, lower.tail = FALSE)
X-squared 
     TRUE 

Using your example, as the p-value is very small, use the log:

pchisq(32943.9488257678,9,lower.tail=F,log.p=TRUE)
-16440.44
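
Since log.p = TRUE returns the natural log, dividing by log(10) converts the result to base 10, which is usually easier to report:

pchisq(32943.9488257678, 9, lower.tail = FALSE, log.p = TRUE)/log(10)
# about -7140, i.e. p is roughly 10^-7140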

Decimal points - Probability value of 0 in Language R

There are a variety of possible answers -- which one is most useful depends on the context:

  • R is indeed incapable under ordinary circumstances of storing floating-point values closer to zero than .Machine$double.xmin, which varies by platform but is typically (as you discovered) on the order of 1e-308. If you really need to work with numbers this small and can't find a way to work on the log scale directly, you need to search Stack Overflow or the R wiki for methods for dealing with arbitrary/extended precision values (but you probably should try to work on the log scale -- it will be much less of a hassle)
  • in many circumstances R actually computes p values on the (natural) log scale internally, and can if requested return the log values rather than exponentiating them before giving the answer. For example, dnorm(-100,log=TRUE) gives -5000.919. You can convert directly to the log10 scale (without exponentiating and then using log10) by dividing by log(10): dnorm(-100,log=TRUE)/log(10) is about -2172, i.e. a value of roughly 10^-2172, which would be far too small to represent as an ordinary floating-point number. For the p*** (cumulative distribution function) functions, use log.p=TRUE rather than log=TRUE. (This particular point depends heavily on your particular context. Even if you are not using built-in R functions you may be able to find a way to extract results on the log scale.)
  • in some cases R presents p-value results as being <2.2e-16 even when a more precise value is known: (t1 <- t.test(rnorm(10,100),rnorm(10,80)))

prints

....
t = 56.2902, df = 17.904, p-value < 2.2e-16

but you can still extract the precise p-value from the result

> t1$p.value
[1] 1.856174e-18

(in many cases this behaviour is controlled by the format.pval() function)
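
A quick way to see that formatting in action (the exact printed string depends on the digits and eps arguments):

format.pval(1.856174e-18)             # collapses to "< [machine epsilon]"
format.pval(1.856174e-18, eps = 0)    # shows the stored value itself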

An illustration of how all this would work with lm:

d <- data.frame(x=rep(1:5,each=10))
set.seed(101)
d$y <- rnorm(50,mean=d$x,sd=0.0001)
lm1 <- lm(y~x,data=d)

summary(lm1) prints the p-value of the slope as <2.2e-16, but if we use coef(summary(lm1)) (which does not use the p-value formatting), we can see that the value is 9.690173e-203.
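
To extract just that number, index the coefficient matrix directly (standard row and column names from summary.lm):

coef(summary(lm1))["x", "Pr(>|t|)"]
# 9.690173e-203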

A more extreme case:

set.seed(101); d$y <- rnorm(50,mean=d$x,sd=1e-7)
lm2 <- lm(y~x,data=d)
coef(summary(lm2))

shows that the p-value has actually underflowed to zero. However, we can still get an answer on the log scale:

tval <- coef(summary(lm2))["x","t value"]
## log10 of the two-sided p-value: log10(2) plus log10 of the one-sided p
log10(2) + pt(abs(tval), df = 48, lower.tail = FALSE, log.p = TRUE)/log(10)

gives approximately -346, i.e. a p-value of roughly 10^-346 (you can check this approach with the previous example, where the p-value doesn't underflow, and see that you get the same answer as printed in the summary).
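
For instance, redoing the check with the lm1 fit from above:

tval1 <- coef(summary(lm1))["x", "t value"]
10^(log10(2) + pt(abs(tval1), df = 48, lower.tail = FALSE, log.p = TRUE)/log(10))
# roughly 9.69e-203, matching the value from coef(summary(lm1))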

cor.test() p value different than by hand?

Since you have not supplied a Minimal Reproducible Example with actual data, I cannot confirm with your own data, but here is a procedure that shows the manual version is equal to the cor.test p value:

MPMA_cortest <- cor.test(mtcars$hp, mtcars$mpg)

p_manual <- pt(
  q = abs(MPMA_cortest$statistic),
  df = MPMA_cortest$parameter,
  lower.tail = FALSE) * 2

p_manual == MPMA_cortest$p.value
#> t
#> TRUE

Edit: Also note that the cor.test printout only says p-value < 2.2e-16. The two values may well be exactly equal; yours is simply smaller than 2.2e-16, which is why only the inequality is printed.

Calculating exact p-values from a Pearson's correlation test (manually or in R)

Thanks everyone for your suggestions and advice, and sorry for not replying sooner. I've been juggling a few things around until recently. However, I did ask a statistician within my department about this, and he agreed with what r2evans said. If the p-value is smaller than 10^-16, there's little point in reporting an 'exact' value, since the point is that there is strong evidence that the result differs from the null hypothesis.

One case when p-values might be important is when you want to rank by order of significance, but you could get around this by using z-scores to rank instead.
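
For example, a small sketch of ranking by the test statistic rather than by the p-value (hypothetical data; with strong correlations the p-values would underflow to zero, but the statistics still order cleanly):

set.seed(1)
y <- rnorm(1000)
X <- data.frame(a = y + rnorm(1000, sd = 0.05),
                b = y + rnorm(1000, sd = 0.20),
                c = y + rnorm(1000, sd = 1.00))
tstat <- sapply(X, function(x) unname(cor.test(x, y)$statistic))
sort(abs(tstat), decreasing = TRUE)   # a usable ranking even where p-values hit zero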

To address the original question, I defer to this guide, which I found long after posting this question: https://stats.stackexchange.com/questions/315311/how-to-find-p-value-using-estimate-and-standard-error.


