How to Perform Single Factor Anova in R with Samples Organized by Column

How to perform single factor ANOVA in R with samples organized by column?

You stack them in the long format:

mdat <- stack(mydata)
mdat
   values ind
1       1   a
2       3   a
3       4   a
4       6   a
5       8   a
6       3   b
7       6   b
snipped output

> aov( values ~ ind, mdat)
Call:
   aov(formula = values ~ ind, data = mdat)

Terms:
                 ind Residuals
Sum of Squares  18.2      65.6
Deg. of Freedom    3        16

Residual standard error: 2.024846 
Estimated effects may be unbalanced

Given the warning it might be safer to use lm:

> anova(lm(values ~ ind, mdat))
Analysis of Variance Table

Response: values
          Df Sum Sq Mean Sq F value Pr(>F)
ind        3   18.2  6.0667  1.4797 0.2578
Residuals 16   65.6  4.1000               
> summary(lm(values~ind, mdat))

Call:
lm(formula = values ~ ind, data = mdat)

Residuals:
   Min     1Q Median     3Q    Max 
 -3.40  -1.25   0.00   0.90   3.60 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   4.4000     0.9055   4.859 0.000174 ***
indb          0.8000     1.2806   0.625 0.540978    
indc         -1.2000     1.2806  -0.937 0.362666    
indd         -1.6000     1.2806  -1.249 0.229491    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 2.025 on 16 degrees of freedom
Multiple R-squared: 0.2172, Adjusted R-squared: 0.07041 
F-statistic:  1.48 on 3 and 16 DF,  p-value: 0.2578

And please don't ask me why Excel gives a different answer. Excel has generally been shown to be highly unreliable when it comes to statistics. The onus is on Excel to explain why it doesn't give answers comparable to R.

Edit in response to comments: The Excel Data Analysis Pack ANOVA procedure creates an output but it does not use an Excel function for that process, so when you change the data in the data cells from which it was derived, and then hit F9, or the equivalent menu recalculation command, there will be no change in the output section. This and other sources of user and numerical problems are documented in various pages of David Heiser's efforts at assessing Excel's problems with statistical calculations: http://www.daheiser.info/excel/frontpage.html Heiser started out his efforts which are now at least a decade-long, with the expectation that Microsoft would take responsibility for these errors, but they have consistently ignored his and others' efforts at identifying errors and suggesting better procedures. There was also a 6 section Special Report in the June 2008 issue of "Computational Statistics & Data Analysis" edited by BD McCullough that cover various statistical concerns with Excel.

How to perform single factor ANOVA in R with samples organized by row?

Taking the advice found here, we use stack to produce a dataframe with one value and one indicator variable (ind), then perform the aov:

R1 = c(-5.11,-4.77,-6.64,-6.64,-3.06,3.26,2.34)
R2 = c(-6.64,-6.64,-4.01,0.66,2.7,3.23,3.28)
R3 = c(-2.44,2.49,3.07,-3.65,-6.64,1.64,3.25)

mydata = data.frame(cbind(R1,R2,R3))

mat <- t(mydata)
rownames(mat) <- NULL
colnames(mat) <- letters[seq_len(ncol(mat))]
df <- stack(as.data.frame(mat))

> head(df)
  values ind
1  -5.11   a
2  -6.64   a
3  -2.44   a
4  -4.77   b
5  -6.64   b
6   2.49   b

aov(values ~ ind, data = df)

Call:
   aov(formula = values ~ ind, data = df)

Terms:
                     ind Residuals
Sum of Squares  164.5202  179.6335
Deg. of Freedom        6        14

Residual standard error: 3.582033
Estimated effects may be unbalanced

If we need more information, we can also use anova(lm(...)):

test <- anova(lm(values ~ ind, data = df))
summary(test)

       Df         Sum Sq         Mean Sq         F value          Pr(>F)      
 Min.   : 6   Min.   :164.5   Min.   :12.83   Min.   :2.137   Min.   :0.1134  
 1st Qu.: 8   1st Qu.:168.3   1st Qu.:16.48   1st Qu.:2.137   1st Qu.:0.1134  
 Median :10   Median :172.1   Median :20.13   Median :2.137   Median :0.1134  
 Mean   :10   Mean   :172.1   Mean   :20.13   Mean   :2.137   Mean   :0.1134  
 3rd Qu.:12   3rd Qu.:175.9   3rd Qu.:23.77   3rd Qu.:2.137   3rd Qu.:0.1134  
 Max.   :14   Max.   :179.6   Max.   :27.42   Max.   :2.137   Max.   :0.1134  
                                              NA's   :1       NA's   :1

Edit: The ANOVA won't give you single p-values, but lm will:

summary(lm(values ~ ind, data = df))

Call:
lm(formula = values ~ ind, data = df)

Residuals:
   Min     1Q Median     3Q    Max 
-4.307 -1.797 -0.440  0.550  5.597 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   -4.730      2.068  -2.287   0.0383 *
indb           1.757      2.925   0.601   0.5577  
indc           2.203      2.925   0.753   0.4637  
indd           1.520      2.925   0.520   0.6114  
inde           2.397      2.925   0.819   0.4263  
indf           7.440      2.925   2.544   0.0234 *
indg           7.687      2.925   2.628   0.0199 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.582 on 14 degrees of freedom
Multiple R-squared:  0.478, Adjusted R-squared:  0.2543 
F-statistic: 2.137 on 6 and 14 DF,  p-value: 0.1134

How to do an one way anova with all columns of data fame?

You need first to reshape the data.frame from wide to long format.

The following uses an external package, reshape2, to reshape the data from wide to long format.

molten <- reshape2::melt(D)
head(molten)

model <- lm(value ~ variable, data = molten)
anova(model)

How to perform nested anova by groups using R

If I understood correctly, this example might help you

Libraries

library(tidyverse)
library(broom)

Example

Code

mtcars %>% 
  # Create 2 factors variables
  mutate(
    gear = as.factor(gear),
    cyl = as.factor(cyl)
    ) %>% 
  # Group and nest data by each level of gear
  group_nest(gear) %>% 
  mutate(
    # Apply ANOVA por each level of gear, for drat~cyl
    anova = map(.x = data,.f = ~anova(aov(drat~cyl,data = .x))),
    # Get results of ANOVA
    results = map(anova,tidy)
    ) %>% 
  # "Show" the results for each gear level
  unnest(results)

Output

# A tibble: 6 x 9
  gear                 data anova           term         df sumsq meansq statistic p.value
  <fct> <list<tibble[,10]>> <list>          <chr>     <int> <dbl>  <dbl>     <dbl>   <dbl>
1 3               [15 x 10] <anova [2 x 5]> cyl           2 0.414 0.207      3.91   0.0491
2 3               [15 x 10] <anova [2 x 5]> Residuals    12 0.634 0.0529    NA     NA     
3 4               [12 x 10] <anova [2 x 5]> cyl           1 0.107 0.107      1.10   0.318 
4 4               [12 x 10] <anova [2 x 5]> Residuals    10 0.967 0.0967    NA     NA     
5 5                [5 x 10] <anova [2 x 5]> cyl           2 0.158 0.0790     0.352  0.740 
6 5                [5 x 10] <anova [2 x 5]> Residuals     2 0.449 0.225     NA     NA

How to perform a two-way mixed ANOVA with three groups, three time points and one dependent variable in multiple columns in R

Welcome to SO! It's looks like you're new, but in the future to get great answers quickly, make sure you include sample data in a format that useable (i.e., the output from dput(head(myData))). Check it out: making R reproducible questions.

I know of two approaches to completing within and between ANOVA. The easier to implement version is from the package ez. The package jmv offers a much more complex write-up but you have an immense amount of control, as well. I'm sure there is more to ez's version, but I haven't worked with that package very much.

I created data to somewhat simulate what you are working with.

library(tidyverse)
library(jmv)
library(ez)

set.seed(35)
df1 <- data.frame(subject = factor(rep(1:15, each = 3)),
                  time = factor(rep(c(1:3), 15)),
                  group = factor(rep(1:3, each = 15)),
                  stim_IVis3 = rnorm(45, .5, .15),
                  stim_IVis4 = rnorm(45, .51, .125))

head(df1)
summary(df1)

To use ez, it's a pretty straightforward implementation. Although, I couldn't find an option that allowed for multiple dependent variables. Essentially, you would either need to pivot_long or you can use jmv.
For this method, you don't get the post hoc comparisons; the effect size is the generalized η².

ezANOVA(df1, dv = stim_IVis3, wid = subject, within = time, 
        between = group, detailed = T)

# $ANOVA
#        Effect DFn DFd         SSn       SSd           F            p p<.05        ges
# 1 (Intercept)   1  12 12.58700023 0.3872673 390.0252306 1.616255e-10     * 0.92579702
# 2       group   2  12  0.05853169 0.3872673   0.9068417 4.297644e-01       0.05483656
# 3        time   2  24  0.05417372 0.6215855   1.0458491 3.668654e-01       0.05096178
# 4  group:time   4  24  0.06774352 0.6215855   0.6539102 6.298074e-01       0.06292379
# 
# $`Mauchly's Test for Sphericity`
#       Effect         W         p p<.05
# 3       time 0.9914977 0.9541232      
# 4 group:time 0.9914977 0.9541232      
# 
# $`Sphericity Corrections`
#       Effect       GGe     p[GG] p[GG]<.05      HFe     p[HF] p[HF]<.05
# 3       time 0.9915694 0.3664558           1.187039 0.3668654          
# 4 group:time 0.9915694 0.6286444           1.187039 0.6298074

Now to use jmv, you will need to pivot the data wider. Your within-subject data needs to be in separate columns. Since time is the factor that represents with the within-subject values in the stim... columns, that's what you need to pivot.

df2 <- df1 %>% 
  pivot_wider(id_cols = c(subject, group),
              names_from = time,
              values_from = starts_with("stim"),
              names_vary = "fastest")

head(df2)
# # A tibble: 6 × 8
#   subject group stim_IVis3_1 stim_IVis3_2 stim_IVis3_3 stim_IVis4_1 stim_IVis4_2 stim_IVis4_3
#   <fct>   <fct>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>        <dbl>
# 1 1       1            0.497        0.505        0.601        0.417        0.776        0.660
# 2 2       1            0.584        0.408        0.737        0.454        0.670        0.650
# 3 3       1            0.741        0.399        0.450        0.306        0.665        0.577
# 4 4       1            0.223        0.450        0.350        0.522        0.342        0.417
# 5 5       1            0.717        0.661        0.581        0.601        0.700        0.686
# 6 6       2            0.534        0.780        0.443        0.420        0.565        0.739

Now you're ready for jmv. bs equates to between-subjects; rm equates to repeated measures.

fit = anovaRM(data = df2, 
              ss = "3",                                # type of SS (1, 2, or 3)
              bs = list("group"),                      # between subjects (links with other parameters this way)
              bsTerms = list("group"),
              rm = list(list(label = "stim_IVs",          # within subjects
                             levels = c(names(df2)[3:8]))), # could also write c("stim_IVs3_1", "stim_IVs3_2", "stim_IVs3_3")
              rmCells = list(list(measure = names(df2)[3], # could also write "stim_IVs3_1"
                                  cell = names(df2)[3]),   # the group associated in 'rm' level (I used the column name)
                             list(measure = names(df2)[4],     
                                  cell = names(df2)[4]),
                             list(measure = names(df2)[5],
                                  cell = names(df2)[5]),
                             list(measure = names(df2)[6],
                                  cell = names(df2)[6]),
                             list(measure = names(df2)[7],
                                  cell = names(df2)[7]),
                             list(measure = names(df2)[8],
                                  cell = names(df2)[8])
                             ),    
              rmTerms = list("stim_IVs"),         # groups variable for repeated/within measures
              emMeans = list(list("stim_IVs", "group")),  # all grouping vars in the ANOVA (for the em tables)
              emmPlots = T,                  # show em plots
              emmTables = T,                 # show em tables
              effectSize = c("omega", "partEta"), # multiple options here, see help
              spherTests = T,                # use correction
              spherCorr = c("none", "GG", "HF"),  # multiple options here, see help
              leveneTest = T,                # check homogeneity (p GREATER than .05 is good)
              qq = T,                        # plot normality validation qq plot
              postHoc = list("group",c("group","stim_IVs"),"stim_IVs"),
              postHocCorr = "tukey")         # use TukeyHSD

If you print fit, you're going to get a ton of information. In addition to the content provided by ezANOVA, this provides the post hoc comparisons of each group, time, dependent variable, and the intermix of each; the results of the statistical assumptions: Levene's for homogeneity and a QQ plot of the residuals for normality; and the estimated marginal means table and a plot of those means.

I realize that you have more fields in your data than what I have here. I believe I've provided enough for you to build off of.

I suggest that you determine what you want to learn from the data and choose the algorithm from there.

If you have any questions, let me know.

How to Perform Single Factor Anova in R with Samples Organized by Column