Means from a List of Data Frames in R

Get the mean across a list of data frames by rows

A simple way is to cbind the data frames in the list and calculate the mean of each row with rowMeans:

rowMeans(do.call(cbind, myLs))
#[1] 5 2 1

We can also use bind_cols from dplyr to combine all the dataframes.

rowMeans(dplyr::bind_cols(myLs))
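A minimal, self-contained sketch of the cbind approach, assuming a hypothetical list myLs whose data frames contain only numeric columns and have the same number of rows (the values are invented for illustration):

```r
# Hypothetical input: two data frames with matching rows and numeric columns only
myLs <- list(
  data.frame(a = c(1, 2, 3)),
  data.frame(a = c(3, 4, 5))
)

# Bind the data frames side by side, then average across each row
row_means <- rowMeans(do.call(cbind, myLs))
print(unname(row_means))
```

If any data frame contains a non-numeric column, rowMeans stops with an error, so drop or convert such columns first.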

Calculate mean of each row in a large list of dataframes in R

We may bind the list elements into a single data frame and then compute the mean by group:

library(dplyr)
bind_rows(lst1) %>%
  group_by(id) %>%
  summarise(value_mean = mean(value, na.rm = TRUE), .groups = 'drop')

-output

# A tibble: 3 x 2
  id    value_mean
  <chr>      <dbl>
1 id1         0.25
2 id2         0.25
3 id3         0.5

If the datasets have the same dimensions and the 'id' values are in the same order, extract the 'value' column from each, use Reduce to add them elementwise, and divide by the length of the list:

Reduce(`+`, lapply(lst1, `[[`, "value"))/length(lst1)
[1] 0.25 0.25 0.50
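The ordering assumption matters. The sketch below, using invented data in which the second data frame lists the ids in a different order, compares the positional Reduce approach with a grouped mean written in base R (tapply is swapped in here for the dplyr pipeline so the example needs no packages):

```r
# Hypothetical data: the second data frame lists the same ids in a different order
lst1 <- list(
  data.frame(id = c("id1", "id2", "id3"), value = c(0, 0.5, 1)),
  data.frame(id = c("id3", "id1", "id2"), value = c(0, 0.5, 0))
)

# Positional approach: adds row 1 to row 1, row 2 to row 2, and so on
by_position <- Reduce(`+`, lapply(lst1, `[[`, "value")) / length(lst1)

# Grouped approach (base R): stack the rows, then average within each id
stacked <- do.call(rbind, lst1)
by_id <- tapply(stacked$value, stacked$id, mean)

print(by_position)
print(by_id)
```

Here the two results differ, because the positional version pairs values that belong to different ids. Grouping by id is the safer choice unless the row order is known to match.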

A more efficient option may be dapply/t_list from the collapse package:

library(collapse)
dapply(t_list(dapply(lst1, `[[`, "value")), fmean)
V1 V2 V3
0.25 0.25 0.50

Return a dataframe of averages from a list of dataframes

After the hint from @tom above, the final solution was to combine the list of data frames into a single data frame holding all the data and process it with the tidyverse.

A few small tidy-ups were needed:

  1. An errant character column from the origin of the data
  2. A column with data in both upper and lower case
  3. Avoiding the character columns in the mean calculation
  4. Then putting the character columns and the mean data frame back together to get it back in the correct order.

So...

Change the format to a single data frame and fix the non-numeric column:

library(dplyr)  # provides %>%, bind_rows, group_by, summarise, left_join

myfiles3 <- myfiles2 %>%
  bind_rows() %>%
  transform(EdgeStepL2 = as.numeric(EdgeStepL2))

Ensure the section names are in uppercase to be consistent:

myfiles3$Section <- stringr::str_to_upper(myfiles3$Section)

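Calculate the mean of each numeric column, grouped by Section and Chainage. (funs() is deprecated in current dplyr, so this uses across() instead.)

myfiles4 <- myfiles3 %>%
  group_by(Section, Chainage) %>%
  summarise(across(East:Surf.Det, ~ mean(.x, na.rm = TRUE)), .groups = "drop")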

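# Take the first two columns of the first data frame to recover the original row order
myfiles5 <- data.frame(myfiles2[[1]][1:2])

# Join the means back on; with no `by` argument, left_join matches on all shared column names
myfiles6 <- left_join(myfiles5, myfiles4)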

This is not the simple solution I had hoped for, but here is some advice for the next person to try this.

Look for NAs (everywhere in the data).

Make sure that all the columns you are running the mean (or other function) on are those you can calculate with.
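Both checks can be sketched in base R. The data frame below is a made-up stand-in for the combined data; only the column names Section, Chainage and East come from the steps above:

```r
# Hypothetical data frame standing in for the combined data
df <- data.frame(
  Section  = c("A", "a", "B"),
  Chainage = c(10, 20, 30),
  East     = c(1.5, NA, 2.5),
  stringsAsFactors = FALSE
)

# Count missing values in every column
na_counts <- colSums(is.na(df))

# Flag which columns are numeric and therefore safe to average
numeric_cols <- vapply(df, is.numeric, logical(1))

print(na_counts)
print(names(df)[numeric_cols])
```

Running something like this before the grouped mean shows which columns need converting or excluding.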

Means from a list of data frames in R

You can use lapply and pass indices as follows:

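ids <- seq(3, 54, by = 3)  # every third data frame in the list x
out <- do.call(rbind, lapply(ids, function(idx) {
  # drop the first column and flatten the remaining cells into one vector
  vals <- unlist(x[[idx]][, -1])
  c(mean(vals), var(vals))
}))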
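To try this without the original data, here is a small sketch with a hypothetical list x of six data frames. It assumes the first column is a label and the remaining columns are numeric, and it names the two results for readability:

```r
# Hypothetical list of six data frames; the first column is a label, the rest are numeric
x <- lapply(1:6, function(i) {
  data.frame(label = c("a", "b"), v1 = c(i, i + 1), v2 = c(i + 2, i + 3))
})

ids <- seq(3, 6, by = 3)  # pick every third data frame: 3 and 6

out <- do.call(rbind, lapply(ids, function(idx) {
  vals <- unlist(x[[idx]][, -1])  # all numeric cells as one vector
  c(mean = mean(vals), var = var(vals))
}))
print(out)
```

The result is a matrix with one row per selected data frame and one column per statistic.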

How do I make a list of data frames?

This isn't related to your question, but you want to use = and not <- within the function call. If you use <-, you'll end up creating variables y1 and y2 in whatever environment you're working in:

d1 <- data.frame(y1 <- c(1, 2, 3), y2 <- c(4, 5, 6))
y1
# [1] 1 2 3
y2
# [1] 4 5 6

This won't have the seemingly desired effect of creating column names in the data frame:

d1
#   y1....c.1..2..3. y2....c.4..5..6.
# 1                1                4
# 2                2                5
# 3                3                6

The = operator, on the other hand, will associate your vectors with arguments to data.frame.

As for your question, making a list of data frames is easy:

d1 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6))
d2 <- data.frame(y1 = c(3, 2, 1), y2 = c(6, 5, 4))
my.list <- list(d1, d2)

You access the data frames just like you would access any other list element:

my.list[[1]]
#   y1 y2
# 1  1  4
# 2  2  5
# 3  3  6
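As a small extension, list elements can also be named and then processed together. In this sketch the names "first" and "second" are arbitrary choices for illustration:

```r
d1 <- data.frame(y1 = c(1, 2, 3), y2 = c(4, 5, 6))
d2 <- data.frame(y1 = c(3, 2, 1), y2 = c(6, 5, 4))

# Naming the elements (the names are arbitrary) allows access by name
my.list <- list(first = d1, second = d2)
my.list[["second"]]

# Apply the same function to every data frame in the list
y1_means <- sapply(my.list, function(d) mean(d$y1))
print(y1_means)
```

sapply carries the element names through to the result, which makes the output easier to read than positional indices.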

Calculate mean for each row across a list of dataframes in R

Using base functions, you could extract all the value columns into a matrix and use row means:

rowMeans(sapply(list, "[[", "value"))

For your sample data, you'd also need to convert to numeric (as below), but I'm hoping your real data has numbers, not factors.

rowMeans(sapply(lapply(list, "[[", "value"), function(x) as.numeric(as.character(x))))

This just gives the values (and assumes the rows are in the right order). You can add the sample names with cbind, e.g., cbind(list[[1]][["sample"]], rowMeans(...)).
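The factor conversion is easy to get wrong, so here is a sketch with invented data. The list is called dfs (a hypothetical name) to avoid masking base::list, and the final step uses data.frame rather than cbind, since cbind on character sample names and numeric means would coerce everything to a character matrix:

```r
# Hypothetical list with the values stored as factors
dfs <- list(
  data.frame(sample = c("s1", "s2", "s3"), value = factor(c("10", "5", "20"))),
  data.frame(sample = c("s1", "s2", "s3"), value = factor(c("30", "15", "40")))
)

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(dfs[[1]]$value)

# Converting via character recovers the actual numbers
vals <- sapply(dfs, function(d) as.numeric(as.character(d$value)))

# data.frame() keeps the means numeric alongside the sample names
result <- data.frame(sample = dfs[[1]]$sample, mean = rowMeans(vals))
print(codes)
print(result)
```

The printed codes are not the original values, which is why the as.character step is needed before averaging.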


