Same function over multiple data frames in R
Make a list of data frames then use lapply to apply the function to them all.
df.list <- list(df1,df2,...)
res <- lapply(df.list, function(x) rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE))
# to keep the original data.frame also
res <- lapply(df.list, function(x) cbind(x,"rowmean"=rowMeans(subset(x, select = c(start, stop)), na.rm = TRUE)))
lapply passes each data frame to the function as x, one at a time, and returns a list with one result per data frame.
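To see the pattern above end to end, here is a minimal sketch on toy data. The data frames and their values are made up for illustration; only the column names start and stop come from the answer. It selects the columns with plain indexing instead of subset, which should be equivalent here.

```r
# Two small data frames with numeric columns `start` and `stop` (illustrative values)
df1 <- data.frame(id = 1:3, start = c(1, 2, NA), stop = c(3, 4, 5))
df2 <- data.frame(id = 1:2, start = c(10, 20), stop = c(30, NA))

# A named list keeps track of which result belongs to which data frame
df.list <- list(df1 = df1, df2 = df2)

# Row means of the two columns, ignoring NAs
res <- lapply(df.list, function(x) rowMeans(x[, c("start", "stop")], na.rm = TRUE))

# Same computation, appended to each data frame as a new column
res_full <- lapply(df.list, function(x) {
  x$rowmean <- rowMeans(x[, c("start", "stop")], na.rm = TRUE)
  x
})
```

Naming the list elements is optional, but the names carry over to the result, so res$df1 and res_full$df2 are easy to pick out afterwards.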
How to apply the same function to multiple data frames in R
We can keep the datasets in a list and loop over the list with lapply:
lst1 <- lapply(list(df1, df2), PasteTwoColumn)
If there are many datasets, use mget to get the values of the datasets into a list:
lst1 <- lapply(mget(paste0('df', 1:100)), PasteTwoColumn)
Or, instead of building the names with paste0, we can use ls with a pattern to find them:
lst1 <- lapply(mget(ls(pattern = '^df\\d+$')), PasteTwoColumn)
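As a rough illustration of the mget and ls approach, the sketch below uses a hypothetical stand-in for PasteTwoColumn (the real function is not shown in the question) and keeps the toy data frames in a separate environment called env, which is an assumption made here so the example does not depend on what is in your workspace.

```r
# Hypothetical stand-in for PasteTwoColumn: pastes columns `a` and `b` into `ab`
PasteTwoColumn <- function(df) {
  df$ab <- paste(df$a, df$b, sep = "_")
  df
}

# Put the toy data frames in their own environment rather than the workspace
env <- new.env()
env$df1 <- data.frame(a = c("x", "y"), b = 1:2)
env$df2 <- data.frame(a = "z", b = 3L)
env$other <- "not a data frame"  # not matched by the pattern below

# Collect every object named df<digits> into a named list, then apply the function
nms <- ls(envir = env, pattern = "^df\\d+$")
lst1 <- lapply(mget(nms, envir = env), PasteTwoColumn)
```

Note that ls returns names sorted as strings, so with many objects df10 would come before df2. If the order matters, building the names with paste0 keeps them in numeric order.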
If we need to update the original object, use list2env
list2env(lst1, .GlobalEnv) #not recommended though
If we need to use a for loop:
for (obj in paste0("df", 1:100)) {
  assign(obj, PasteTwoColumn(get(obj)))
}
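If a for loop is preferred but the data frames already live in a list, the loop can update the list in place instead of going through assign and get. This is only a sketch: PasteTwoColumn is again a hypothetical stand-in and the data are made up.

```r
# Hypothetical stand-in for PasteTwoColumn
PasteTwoColumn <- function(df) {
  df$ab <- paste(df$a, df$b, sep = "_")
  df
}

df_list <- list(
  df1 = data.frame(a = c("x", "y"), b = 1:2),
  df2 = data.frame(a = "z", b = 3L)
)

# Overwrite each list element with its processed version
for (nm in names(df_list)) {
  df_list[[nm]] <- PasteTwoColumn(df_list[[nm]])
}
```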
A set of functions over multiple data frames and merge the outputs in R
Basil. Welcome to StackOverflow.
I was wary of lapply when I first started using R, but you should stick with it. It's usually more concise than a for loop and, in most cases, at least as fast. In your particular case, you can put your individual data frames in a list and the code you run on each into a function, say myFunc, which takes the data frame you want to process as its argument.
Then you can simply say
library(dplyr)  # provides bind_rows
allData <- bind_rows(lapply(dataFrameList, myFunc))
Incidentally, your column names make me think your data isn't yet tidy. I'd suggest you spend a little time making it so before you do much else. It will save you a huge amount of effort in the long run.
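For reference, here is a small sketch of the list-plus-function pattern from this answer. myFunc and the data are hypothetical, since the original processing code is not shown. It uses base R's do.call(rbind, ...) in place of dplyr's bind_rows, which is a substitution made here only to keep the example free of package dependencies.

```r
# Hypothetical per-data-frame processing: one summary row per input
myFunc <- function(df) {
  data.frame(n_rows = nrow(df), mean_value = mean(df$value, na.rm = TRUE))
}

dataFrameList <- list(
  a = data.frame(value = c(1, 2, 3)),
  b = data.frame(value = c(10, NA))
)

# Apply myFunc to each element, then stack the one-row results
res_list <- lapply(dataFrameList, myFunc)
allData <- do.call(rbind, res_list)

# Record which input each row came from
allData$source <- names(res_list)
```

With dplyr loaded, bind_rows(res_list, .id = "source") should give a similar result in one step.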
R - Apply function on multiple data frames
I think the reason for your error is that you passed x[1,3], which gets only the value in the first row of the third column. I assume you want to calculate the mean of the same column across all the data.frames, so I made a slight modification to your function so you can pass the data and the name or position of the column:
mean_h_stem <- function(dat, col) { mean(dat[[col]], na.rm = TRUE) }
Column can be selected using an integer:
lapply(df.list, mean_h_stem, 2)
Or a column name, expressed as a string:
lapply(df.list, mean_h_stem, 'col_name')
Passing the second argument like this can feel a little unintuitive, so you can do it in a clearer way:
lapply(df.list, function(x) mean_h_stem(dat = x, col ='col_name'))
This will only work for single columns at a time per your question, but you could easily modify this to do multiple.
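One possible way to extend this to several columns is sketched below. The function name mean_cols, the column names h and w, and the data are all hypothetical.

```r
# Mean of each requested column; drop = FALSE keeps a data frame even for one column
mean_cols <- function(dat, cols) {
  colMeans(dat[, cols, drop = FALSE], na.rm = TRUE)
}

df.list <- list(
  df1 = data.frame(h = c(1, 3), w = c(2, NA)),
  df2 = data.frame(h = c(5, 7), w = c(4, 6))
)

# sapply returns one column per data frame; transpose to get one row per data frame
res <- t(sapply(df.list, mean_cols, cols = c("h", "w")))
```

The result is a matrix with one row per data frame and one column per variable, so res["df2", "h"] looks up a single mean.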
As an aside, to read in the csv files, you could also use lapply with read.csv:
temp <- list.files(pattern = "\\.csv$")  # pattern is a regular expression, not a glob
df.list <- lapply(temp, read.csv)
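It can help to name each list element after the file it came from. The sketch below writes two throwaway CSV files to a temporary directory first, purely so that it runs on its own; the file names and contents are made up.

```r
# Write two small CSV files into a temporary directory (illustrative data only)
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(csv_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5), file.path(csv_dir, "b.csv"), row.names = FALSE)

# full.names = TRUE returns paths that can be passed straight to read.csv
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file, then name each element after its file (without the extension)
df.list <- lapply(files, read.csv)
names(df.list) <- tools::file_path_sans_ext(basename(files))
```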
R loop to apply same function to multiple dataframes
There are many ways you could achieve this. Here's one approach that uses functions from the dplyr package:
library("dplyr")
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df1 = data.frame(Bird_ID = c(1:6,7,7,6,2,1))
df2 = data.frame(Bird_ID = c(1:10,7,7,6,2,1,10,9,3))
# combine the dataframes into a named list, for convenience
df_list <- list(df1 = df1, df2 = df2)
# bind, group, and summarise
bind_rows(df_list, .id = "df_name") %>%
group_by(df_name) %>%
summarise(n_unique = length(unique(Bird_ID)))
#> # A tibble: 2 × 2
#>   df_name n_unique
#>   <chr>      <int>
#> 1 df1            7
#> 2 df2           10
Created on 2021-10-26 by the reprex package (v2.0.1)
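If you would rather not depend on dplyr, the same counts can be sketched in base R with vapply over the named list. This is an alternative to the approach above, not part of the original answer.

```r
df1 <- data.frame(Bird_ID = c(1:6, 7, 7, 6, 2, 1))
df2 <- data.frame(Bird_ID = c(1:10, 7, 7, 6, 2, 1, 10, 9, 3))
df_list <- list(df1 = df1, df2 = df2)

# Count distinct Bird_ID values per data frame; integer(1) enforces one integer each
n_unique <- vapply(df_list, function(d) length(unique(d$Bird_ID)), integer(1))
```

The result is a named integer vector, with one entry per data frame in the list.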
How do I apply a function over multiple data frames, but overwrite them?
You can use the list2env() function.
list2env(data_list, envir = .GlobalEnv)
This assigns each element of the list to an object in the given environment, using the element's name as the object name. The list must therefore be named, and any existing objects with the same names are overwritten.
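A minimal sketch of the round trip: modify the data frames inside a named list, then write them back with list2env. To avoid touching your workspace, this example writes into a fresh environment called target, which is an assumption made for illustration. Passing envir = .GlobalEnv instead would overwrite the objects in the global environment.

```r
# Named list of toy data frames (illustrative data)
data_list <- list(df1 = data.frame(x = 1:2), df2 = data.frame(x = 3:4))

# Apply the same change to every data frame
data_list <- lapply(data_list, function(d) {
  d$x2 <- d$x * 2
  d
})

# Write each element back as an object named after its list name
target <- new.env()
list2env(data_list, envir = target)

ls(target)      # names of the objects that were created
target$df1$x2   # the new column on the first data frame
```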