Chi Square Analysis Using for Loop in R

Fix a column in for loop while doing Chi-square test

The object out is overwritten on every iteration, so the result the OP got comes from the last iteration only. Instead, initialize a list with the same length as 'x' and store each iteration's output in its own slot:

x <- 1:3
# Pre-allocate one list slot per column to be tested
out <- vector('list', length(x))
for (i in x) {
  # Test column i against the fixed column 4
  test <- chisq.test(df[, i], df[, 4])
  out[[i]] <- data.frame("X" = colnames(df[i]),
                         "Y" = colnames(df[4]),
                         "Chi.Square" = round(test$statistic, 3),
                         "df" = test$parameter,
                         "p.value" = round(test$p.value, 3))
}
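
Once the loop has filled the list, the per-column results can be stacked into a single table. The following is a minimal sketch of that step; the toy data frame (columns a to d) is made up for illustration and is not the OP's data.

```r
# Toy data: column names and values are hypothetical.
set.seed(1)
df <- data.frame(
  a = sample(c("x", "y"), 200, replace = TRUE),
  b = sample(c("x", "y"), 200, replace = TRUE),
  c = sample(c("x", "y"), 200, replace = TRUE),
  d = sample(c("u", "v"), 200, replace = TRUE)
)

x <- 1:3
out <- vector("list", length(x))
for (i in x) {
  test <- chisq.test(df[, i], df[, 4])
  out[[i]] <- data.frame(X = colnames(df)[i],
                         Y = colnames(df)[4],
                         Chi.Square = round(test$statistic, 3),
                         df = test$parameter,
                         p.value = round(test$p.value, 3))
}

# Stack the per-iteration data frames into one table.
res <- do.call(rbind, out)
rownames(res) <- NULL
print(res)
```

Each row of res then holds the test of one column against the fixed column d.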

Chi Square Analysis using for loop in R

A sample of your data would be appreciated, but I think this will work for you. First, create all pairwise combinations of column indices with combn. Then write a function and pass it to an apply-style function to iterate over the combinations. I like to use plyr since it makes it easy to specify the data structure you want back. Also note that you only need to run the chi-square test once for each pair of columns, which should speed things up quite a bit.

library(plyr)

combos <- combn(ncol(Dat), 2)

adply(combos, 2, function(x) {
  test <- chisq.test(Dat[, x[1]], Dat[, x[2]])

  out <- data.frame("Row" = colnames(Dat)[x[1]]
                    , "Column" = colnames(Dat)[x[2]]
                    , "Chi.Square" = round(test$statistic, 3)
                    , "df" = test$parameter
                    , "p.value" = round(test$p.value, 3)
                    )
  return(out)
})
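
If plyr is not installed, the same idea can be sketched with base R only, using lapply over the columns of combos and do.call(rbind, ...) in place of adply. This is a swapped-in alternative, not the answer's own method, and the toy Dat below (columns g1 to g3) is hypothetical.

```r
# Toy data: three categorical columns with made-up levels.
set.seed(42)
Dat <- data.frame(
  g1 = sample(c("a", "b"), 150, replace = TRUE),
  g2 = sample(c("a", "b", "c"), 150, replace = TRUE),
  g3 = sample(c("yes", "no"), 150, replace = TRUE)
)

# Each column of combos is one pair of column indices: (1,2), (1,3), (2,3).
combos <- combn(ncol(Dat), 2)

rows <- lapply(seq_len(ncol(combos)), function(k) {
  x <- combos[, k]
  test <- chisq.test(Dat[, x[1]], Dat[, x[2]])
  data.frame(Row = colnames(Dat)[x[1]],
             Column = colnames(Dat)[x[2]],
             Chi.Square = round(unname(test$statistic), 3),
             df = unname(test$parameter),
             p.value = round(test$p.value, 3))
})

# One row per pair of columns.
res <- do.call(rbind, rows)
print(res)
```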

Using loops to do Chi-Square Test in R

Are you sure that the strings in the vector you lapply over are in the column names of the icu dataset?

It works for me when I download the icu data:

system("wget http://course1.winona.edu/bdeppa/Biostatistics/Data%20Sets/ICU.TXT")
icu <- read.table('ICU.TXT', header=TRUE)

and change status to STA, which is a column in icu. Here is an example for some of your variables:

my.list <- lapply(c("Age", "Sex", "Race", "Ser", "Can"),
                  function(var) {
                    formula <- as.formula(paste("STA ~", var))
                    res.logist <- glm(formula, data = icu, family = binomial)
                    summary(res.logist)
                  })

This gives me a list of summary.glm objects. Example:

lapply(my.list, coefficients)
[[1]]
               Estimate Std. Error   z value     Pr(>|z|)
(Intercept) -3.05851323 0.69608124 -4.393903 1.113337e-05
Age          0.02754261 0.01056416  2.607174 9.129303e-03

[[2]]
              Estimate Std. Error    z value     Pr(>|z|)
(Intercept) -1.4271164  0.2273030 -6.2784758 3.419081e-10
Sex          0.1053605  0.3617088  0.2912855 7.708330e-01

[[3]]
              Estimate Std. Error    z value   Pr(>|z|)
(Intercept) -1.0500583  0.4983146 -2.1072198 0.03509853
Race        -0.2913384  0.4108026 -0.7091933 0.47820450

[[4]]
              Estimate Std. Error   z value     Pr(>|z|)
(Intercept) -0.9465961  0.2310559 -4.096827 0.0000418852
Ser         -0.9469461  0.3681954 -2.571858 0.0101154495

[[5]]
                 Estimate Std. Error       z value     Pr(>|z|)
(Intercept) -1.386294e+00  0.1863390 -7.439638e+00 1.009615e-13
Can          7.523358e-16  0.5892555  1.276756e-15 1.000000e+00

If you want to do a chi-square test:

my.list <- lapply(c("Age", "Sex", "Race", "Ser", "Can"),
                  function(var) chisq.test(icu$STA, icu[, var]))

or a chi-square test for all combinations of variables:

my.list.all <- apply(combn(colnames(icu), 2), 2,
                     function(x) chisq.test(icu[, x[1]], icu[, x[2]]))

Does this work?
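
Each element of such a list is an "htest" object, so the statistic and p-value can be pulled out with sapply. The sketch below illustrates this on a made-up data frame named toy that only borrows a few column names from icu. It does not use the real ICU data.

```r
# Hypothetical stand-in for icu: binary columns with random values.
set.seed(7)
toy <- data.frame(
  STA = sample(0:1, 200, replace = TRUE),
  Sex = sample(0:1, 200, replace = TRUE),
  Ser = sample(0:1, 200, replace = TRUE),
  Can = sample(0:1, 200, replace = TRUE)
)

vars <- c("Sex", "Ser", "Can")
tests <- lapply(vars, function(var) chisq.test(toy$STA, toy[, var]))
names(tests) <- vars

# Extract one number per test from the htest objects.
p.values   <- sapply(tests, function(ht) ht$p.value)
statistics <- sapply(tests, function(ht) unname(ht$statistic))

summary.table <- data.frame(variable = vars,
                            Chi.Square = statistics,
                            p.value = p.values,
                            row.names = NULL)
print(summary.table)
```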

How to loop chi sq test in R

The easiest loop for a beginner is still the for loop:

d <- data.frame(a = c(8, 3, 4), b = c(6, 7, 9))
for (row in 1:nrow(d)) {
  print(row)
  print(chisq.test(c(d[row, 1], d[row, 2])))
}

The same can be done with

apply(d, 1, chisq.test)

The latter is shorter and gives you a list as your result, which is probably better for further evaluation.
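
As a small sketch of that further evaluation, the p-values can be collected from the list and attached to the original rows. Note that each row is passed to chisq.test as a plain vector, so these are goodness-of-fit tests against equal probabilities.

```r
d <- data.frame(a = c(8, 3, 4), b = c(6, 7, 9))

# One "htest" object per row of d.
tests <- apply(d, 1, chisq.test)

# Pull the p-value out of each test and put it next to the counts.
p.values <- sapply(tests, function(ht) ht$p.value)
res <- cbind(d, p.value = round(p.values, 3))
print(res)
```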

Apply loop for chi square test in R

Here is the solution for the question I asked

# transaged is a data.table (the .() syntax in the merge below requires it)
result <- NULL  # must exist before the first rbind()
for (i in 1:max(transaged$flag)) {
  survey <- as.data.frame(rbind(transaged$CHO[transaged$flag == i],
                                transaged$HO[transaged$flag == i]))
  test <- chisq.test(survey)  # run the test once and reuse its components
  result1 <- as.data.frame(cbind(flag = i, ChiSq = test$statistic,
                                 DF = test$parameter, Pvalue = test$p.value))
  result <- rbind(result, result1)
}
finalage <- merge(result, unique(transaged[, .(HO_GROUP_CODE, START_DATE, flag)]), by = 'flag')
finalage$identifier <- 'AGE'
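
The same per-flag computation can also be sketched in base R without growing result inside a loop, by splitting on flag and binding the pieces once at the end. This is an alternative formulation, not the poster's code, and the small transaged below is a hypothetical stand-in with made-up counts.

```r
# Hypothetical data: two flags, three count pairs each.
transaged <- data.frame(
  flag = rep(1:2, each = 3),
  CHO  = c(30, 45, 25, 50, 20, 30),
  HO   = c(35, 40, 25, 20, 50, 30)
)

per.flag <- lapply(split(transaged, transaged$flag), function(g) {
  # 2 x k table: first row CHO counts, second row HO counts.
  test <- chisq.test(rbind(g$CHO, g$HO))
  data.frame(flag   = g$flag[1],
             ChiSq  = unname(test$statistic),
             DF     = unname(test$parameter),
             Pvalue = test$p.value)
})

result <- do.call(rbind, per.flag)
rownames(result) <- NULL
print(result)
```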

Using Chi-Square in R Correctly

The chi-squared test is in fact two different types of tests. One is the goodness-of-fit test, which compares the observed counts of a single variable with an expected distribution. This is the test you are conducting with the question's code.

The other is the test of independence on a cross-tabulation, which is what you are asking for. For that, pass a table with only 2 columns.

CHIS <- lapply(seq_along(Q5_Q8.1)[-1], function(i)
  chisq.test(Q5_Q8.1[c(1, i)]))


Data

Q5_Q8.1 <-
structure(list(`1` = c(368L, 213L, 528L, 910L, 1579L, 961L),
`2` = c(768L, 598L, 2047L, 2953L, 7448L, 4851L), `3` = c(346L,
286L, 1293L, 1764L, 7489L, 6481L), `4` = c(155L, 140L, 501L,
806L, 4259L, 7944L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
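
To see the difference between the two kinds of test, the sketch below runs both on the first two columns of the data above, copied into a small data frame named counts (a name chosen here for illustration). The method component of the result shows which test chisq.test actually ran.

```r
# First two columns of Q5_Q8.1, copied from the data above.
counts <- data.frame(`1` = c(368, 213, 528, 910, 1579, 961),
                     `2` = c(768, 598, 2047, 2953, 7448, 4851),
                     check.names = FALSE)

# A single vector of counts: goodness-of-fit against equal probabilities.
gof <- chisq.test(counts[[1]])

# A two-column table: test of independence between rows and columns.
ind <- chisq.test(counts[c(1, 2)])

print(gof$method)
print(ind$method)
```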

