Error using t.test() in R - not enough 'y' observations

All standard variants of the t-test use sample variances in their formulas, and a sample variance cannot be computed from a single observation because the formula divides by n-1, where n is the sample size.
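To make the point above concrete, here is a minimal sketch with made-up numbers (the values and the variable names `y_many` and `msg` are illustrative only). It shows that a single value has no sample variance and that the default Welch test stops when the second sample has only one observation:

```r
# A single observation has no sample variance: the n - 1 denominator is zero.
var(3.2)   # NA

set.seed(1)
y_many <- rnorm(20)   # artificial first sample

# Capture the error message instead of letting the script stop.
msg <- tryCatch(
  t.test(y_many, 3.2),                  # second sample has one observation
  error = function(e) conditionMessage(e)
)
msg
```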

This would probably be the easiest modification, although I cannot test it as you did not provide sample data (you could add the output of dput() for your data to your question):

t <- lapply(1:length(x), function(i){
  if(length(x[[i]][[2]])>1){
    t.test(dat$Value,x[[i]][[2]])
  } else "Only one observation in subset" #or NA or something else
})

Another option would be to modify the indices which are used in lapply:

ind<-which(sapply(x,function(i) length(i[[2]])>1))
t<- lapply(ind, function(i) t.test(dat$Value,x[[i]][[2]]))
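As a rough sketch of this index-based variant, assuming each list element is a two-column matrix whose second column holds the values (the data here are artificial, and `tests` is just an illustrative name), the filtering could look like this. One side effect is that the names of the kept elements carry over to the result:

```r
set.seed(42)
# Artificial data: element "b" has a single row, so it cannot be tested.
x <- list(a = cbind(1:5, rnorm(5)),
          b = cbind(1,   rnorm(1)),
          c = cbind(1:3, rnorm(3)))
y <- rnorm(20)

# Keep only the positions whose matrix has more than one row.
ind <- which(sapply(x, function(m) nrow(m) > 1))

# lapply() over the named index vector keeps the element names.
tests <- lapply(ind, function(i) t.test(y, x[[i]][, 2]))
names(tests)
```

Unlike the first option, the skipped subsets simply do not appear in the result, so the list is shorter than `x`.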

Here's an example of the first case with artificial data:

x<-list(a=cbind(1:5,rnorm(5)),b=cbind(1,rnorm(1)),c=cbind(1:3,rnorm(3)))
y<-rnorm(20)

t <- lapply(1:length(x), function(i){
  if(length(x[[i]][,2])>1){ #note the indexing x[[i]][,2]
    t.test(y,x[[i]][,2])
  } else "Only one observation in subset"
})

t
[[1]]

Welch Two Sample t-test

data: y and x[[i]][, 2]
t = -0.4695, df = 16.019, p-value = 0.645
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.2143180 0.7739393
sample estimates:
mean of x mean of y
0.1863028 0.4064921

[[2]]
[1] "Only one observation in subset"

[[3]]

Welch Two Sample t-test

data: y and x[[i]][, 2]
t = -0.6213, df = 3.081, p-value = 0.5774
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.013287 2.016666
sample estimates:
mean of x mean of y
0.1863028 0.6846135

Welch Two Sample t-test

data: y and x[[i]][, 2]
t = 5.2969, df = 10.261, p-value = 0.0003202
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
3.068071 7.496963
sample estimates:
mean of x mean of y
5.5000000 0.2174829

Error using t.test - not enough 'x' observations

I'm going to go out on a limb and guess that you want to apply the t-test to each row of your data.frame and that the fields are labeled 'case1', 'control1', etc.

methySample <- data.frame(case1=rnorm(10),
                          case2=rnorm(10),
                          control1=rnorm(10),
                          control2=rnorm(10))

# identify the fields that are labeled 'case' and 'control'
caseFields <- grep('case',colnames(methySample), value=TRUE)
controlFields <- grep('control',colnames(methySample), value=TRUE)

# apply the t-test for each row (margin = 1)
apply(methySample,
      1,
      function(x)
        t.test(x[caseFields],
               x[controlFields])$p.value)

If you're still having trouble, this bit of code is equivalent and probably easier to debug:

pValue <- numeric(0)
for(i in seq(nrow(methySample)))
  pValue <- c(pValue,
              t.test(methySample[i,caseFields],
                     methySample[i,controlFields])$p.value)
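If some rows contain missing values, a row can end up with fewer than two usable case or control values, which is one way to hit the "not enough 'x' observations" error. The following is only a sketch under that assumption: the helper `safe_p` is hypothetical (not part of the code above), and the NA is inserted artificially to show a row being skipped rather than stopping the whole run:

```r
set.seed(1)
methySample <- data.frame(case1 = rnorm(10), case2 = rnorm(10),
                          control1 = rnorm(10), control2 = rnorm(10))
methySample[3, "case1"] <- NA   # row 3 now has a single case value

caseFields    <- grep("case",    colnames(methySample), value = TRUE)
controlFields <- grep("control", colnames(methySample), value = TRUE)

# Hypothetical helper: return NA when either group has fewer than
# two non-missing values, otherwise the Welch t-test p-value.
safe_p <- function(row) {
  a <- row[caseFields]
  b <- row[controlFields]
  if (sum(!is.na(a)) < 2 || sum(!is.na(b)) < 2) return(NA_real_)
  t.test(a, b)$p.value
}

pValue <- apply(methySample, 1, safe_p)
pValue
```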

Paired t test in R is giving me this error :not enough 'x' observations

Consider by to iterate across all unique clusters, passing subsetted data frames into a user-defined generalized method. The output becomes a list of t-test results.

# R's logical constant is TRUE (not True); t.test() drops incomplete pairs itself, so na.rm is not needed
proc_ttest <- function(df) t.test(df$columna, df$columnb, paired=TRUE)

a_ttest_list <- by(A, A$cluster, proc_ttest)
b_ttest_list <- by(B, B$cluster, proc_ttest)

# RESULTS
a_ttest_list$`1` # NAME INDEX
b_ttest_list$`1`

a_ttest_list[[2]] # NUMBER INDEX
b_ttest_list[[2]]
...

To return a list with cluster_## names, adjust the cluster column before running by:

A <- transform(A, cluster = paste0("cluster_", cluster))
a_ttest_list <- by(A, A$cluster, proc_ttest)

a_ttest_list$cluster_1
a_ttest_list$cluster_2
a_ttest_list$cluster_3

B <- transform(B, cluster = paste0("cluster_", cluster))
b_ttest_list <- by(B, B$cluster, proc_ttest)

b_ttest_list$cluster_1
b_ttest_list$cluster_2
b_ttest_list$cluster_3
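A paired test needs at least two complete pairs, so a cluster with a single row would still raise "not enough 'x' observations". Below is a sketch of one possible guard. It is an assumption-laden variant, not the method above: it uses split() with lapply() in place of by() so that skipped clusters show up as plain NULL entries, the data frame `A` is artificial, and the name `results` is illustrative:

```r
set.seed(7)
# Artificial data: cluster 2 has only one row.
A <- data.frame(cluster = c(1, 1, 1, 1, 2, 3, 3, 3),
                columna = rnorm(8),
                columnb = rnorm(8))

proc_ttest <- function(df) {
  ok <- complete.cases(df$columna, df$columnb)   # rows with both values present
  if (sum(ok) < 2) return(NULL)                  # too few pairs: skip this cluster
  t.test(df$columna[ok], df$columnb[ok], paired = TRUE)
}

# One data frame per cluster, then one test (or NULL) per cluster.
results <- lapply(split(A, A$cluster), proc_ttest)
names(results)
```

Dropping the skipped clusters afterwards is then a matter of `Filter(Negate(is.null), results)`.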

R Continue t.test in a map-function, although there are not enough observations

What I would do is combine the data.frames in a different format - so that the "A" parts are in one data.frame and "B" parts - in the other:

dfs <- cbind(df1=df1, df2=df2, df3=df3)
dfA <- dfs[,grep("A$", colnames(dfs))]
dfB <- dfs[,grep("B$", colnames(dfs))]

Then everything is a lot easier:

doTtest <- function(x, y) {
  if(any(!is.na(x)) && any(!is.na(y)))
    broom::tidy(t.test(x,y))
  else
    rep(NA, 10)
}
res <- as.data.frame(t(mapply(doTtest, dfA, dfB)))
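For a version that can be tried without broom, here is a small sketch of the same reshaping idea with artificial data. The data frames, the column names `var1A`/`var1B`, and the helper `p_or_na` are all assumptions made for illustration. The guard is also slightly stricter than the one above, since it requires at least two non-missing values on each side:

```r
set.seed(3)
df1 <- data.frame(var1A = rnorm(10), var1B = rnorm(10))
df2 <- data.frame(var1A = rnorm(10), var1B = NA_real_)   # "B" part is all missing

# Named arguments prefix the column names, e.g. "df1.var1A".
dfs <- cbind(df1 = df1, df2 = df2)
dfA <- dfs[, grep("A$", colnames(dfs))]
dfB <- dfs[, grep("B$", colnames(dfs))]

# Hypothetical helper: p-value if both columns have >= 2 values, else NA.
p_or_na <- function(x, y) {
  if (sum(!is.na(x)) >= 2 && sum(!is.na(y)) >= 2) t.test(x, y)$p.value
  else NA_real_
}

# mapply() walks the two data frames column by column, in parallel.
res <- mapply(p_or_na, dfA, dfB)
res
```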

Alternatively, you could use the convenient matrixTests library:

library(matrixTests)
> col_t_welch(dfA, dfB)
obs.x obs.y obs.tot mean.x mean.y mean.diff var.x var.y stderr df statistic pvalue conf.low conf.high alternative mean.null conf.level
df1.var1A 10 10 20 1.5436119 0.7488449 0.79476695 0.2993602 0.5481971 0.2911284 16.57158 2.7299537 0.01449227 0.1793279 1.4102060 two.sided 0 0.95
df1.var2A 10 10 20 2.2205661 2.2320260 -0.01145988 0.4832561 0.5249799 0.3175273 17.96923 -0.0360910 0.97160771 -0.6786419 0.6557222 two.sided 0 0.95
df1.var3A 10 10 20 3.0457651 2.7835908 0.26217424 1.2998193 1.9933106 0.5738580 17.23565 0.4568626 0.65347516 -0.9473005 1.4716490 two.sided 0 0.95
df2.var1A 10 10 20 1.7233471 1.2761199 0.44722715 0.9328694 1.3631385 0.4791668 17.38932 0.9333434 0.36342238 -0.5620050 1.4564593 two.sided 0 0.95
df2.var2A 10 10 20 1.9278754 2.6368740 -0.70899858 1.0966493 0.6907785 0.4227798 17.11741 -1.6769925 0.11170922 -1.6005202 0.1825230 two.sided 0 0.95
df2.var3A 10 10 20 3.1245106 2.9569952 0.16751542 1.0357228 0.8209887 0.4308958 17.76242 0.3887609 0.70207375 -0.7386317 1.0736625 two.sided 0 0.95
df3.var1A 10 0 10 0.6804275 NaN NaN 0.6015624 0.0000000 NaN NaN NA NA NA NA two.sided 0 0.95
df3.var2A 10 10 20 2.0143381 1.9223843 0.09195379 0.7837613 0.7611496 0.3930535 17.99614 0.2339472 0.81766669 -0.7338338 0.9177413 two.sided 0 0.95
df3.var3A 10 10 20 3.0156624 3.2768350 -0.26117263 1.5437758 1.2608029 0.5295827 17.81860 -0.4931668 0.62791751 -1.3745971 0.8522518 two.sided 0 0.95

