Importing multiple .xlsx files with multiple sheets in R
Suppose there are two files with two worksheets each:
library(tidyverse)
library(readxl)
list.files("~/data", full.names = TRUE)
#> [1] "/home/rstudio/data/data.xlsx" "/home/rstudio/data/data2.xlsx"
read_excel("/home/rstudio/data/data.xlsx", sheet = 1)
#> # A tibble: 2 x 2
#> a `1`
#> <chr> <dbl>
#> 1 b 2
#> 2 c NA
read_excel("/home/rstudio/data/data.xlsx", sheet = 2)
#> # A tibble: 2 x 2
#> d `4`
#> <chr> <dbl>
#> 1 e 5
#> 2 f 6
expand_grid(
  file = list.files("~/data", full.names = TRUE),
  sheet = seq(2)
) %>%
  transmute(data = file %>% map2(sheet, ~ read_excel(path = .x, sheet = .y))) %>%
  pull(data)
#> [[1]]
#> # A tibble: 2 x 2
#> a `1`
#> <chr> <dbl>
#> 1 b 2
#> 2 c NA
#>
#> [[2]]
#> # A tibble: 2 x 2
#> d `4`
#> <chr> <dbl>
#> 1 e 5
#> 2 f 6
#>
#> [[3]]
#> # A tibble: 2 x 2
#> a `1`
#> <chr> <dbl>
#> 1 b 2
#> 2 c NA
#>
#> [[4]]
#> # A tibble: 2 x 2
#> d `4`
#> <chr> <dbl>
#> 1 e 5
#> 2 f 6
Created on 2021-11-11 by the reprex package (v2.0.1)
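The approach above hard-codes two sheets per file with seq(2). A minimal sketch of a variant that looks up the sheet names of each workbook with excel_sheets() and keeps the file and sheet as columns might look like this. The demo folder, the sheet names first and second, and the column names are assumptions made only to keep the example self-contained, and openxlsx is used here only to create the sample workbooks.

```r
library(readxl)
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical demo data: two workbooks with two sheets each, in a temp folder
demo_dir <- file.path(tempdir(), "xlsx_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (f in c("data.xlsx", "data2.xlsx")) {
  openxlsx::write.xlsx(
    list(first = data.frame(a = c("b", "c"), n = c(2, NA)),
         second = data.frame(a = c("e", "f"), n = c(5, 6))),
    file = file.path(demo_dir, f),
    overwrite = TRUE
  )
}

files <- list.files(demo_dir, pattern = "\\.xlsx$", full.names = TRUE)

# One row per (file, sheet) pair, using the sheet names found in each file
index <- tibble(file = files) %>%
  mutate(sheet = map(file, excel_sheets)) %>%
  unnest(sheet)

# Read every pair and keep file and sheet as identifying columns
all_data <- index %>%
  mutate(data = map2(file, sheet, ~ read_excel(path = .x, sheet = .y))) %>%
  unnest(data)
```

Because the sheet names come from each file, workbooks with different numbers of sheets are handled without changing the code. The final unnest() assumes the sheets have compatible column types.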
Reading all sheets in multiple excel files into R
You could try with readxl
...
I've not tested this for the case of different workbooks with duplicate worksheet names.
There were a number of issues with your code:
- the list.files pattern included a ., which is a reserved character in regular expressions and so needs to be escaped with \\
- As @deschen pointed out, the Excel-related functions are from the openxlsx package
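To illustrate the first point, here is a small sketch comparing an unescaped pattern with an escaped, anchored one. The file names are made up for the example.

```r
# Hypothetical file names, only for illustration
candidates <- c("report.xlsx", "report_xlsx", "report.xlsx.bak", "notes.xls")

# Unescaped: "." matches any character, and nothing anchors the end of the name
loose <- grepl(".xlsx", candidates)

# Escaped and anchored: only names that end in a literal ".xlsx"
strict <- grepl("\\.xlsx$", candidates)

print(data.frame(candidates, loose, strict))
```

The loose pattern also accepts report_xlsx and report.xlsx.bak, while the strict one accepts only report.xlsx.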
library(readxl)
# list every .xlsx file below the working directory
files.list <- list.files(recursive = TRUE, pattern = "\\.xlsx$")
for (i in seq_along(files.list)) {
  sheet_nm <- excel_sheets(files.list[i])
  for (j in seq_along(sheet_nm)) {
    # create one global variable per sheet, named after the sheet;
    # a later sheet with the same name overwrites an earlier one
    assign(x = sheet_nm[j], value = read_xlsx(path = files.list[i], sheet = sheet_nm[j]), envir = .GlobalEnv)
  }
}
Created on 2022-01-31 by the reprex package (v2.0.1)
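If duplicate worksheet names across workbooks are a concern, one possible alternative to assign() is to collect everything in a named list keyed by both file path and sheet name. This is only a sketch: read_all_sheets is a hypothetical helper name, and the "::" separator is an arbitrary choice.

```r
library(readxl)

# Hypothetical helper: read every sheet of every workbook into one named list
read_all_sheets <- function(files) {
  out <- list()
  for (f in files) {
    for (s in excel_sheets(f)) {
      # The key combines file path and sheet name, so equal sheet names
      # in different workbooks do not overwrite each other
      key <- paste(f, s, sep = "::")
      out[[key]] <- read_xlsx(path = f, sheet = s)
    }
  }
  out
}

# Example call (reads whatever .xlsx files exist below the working directory):
# sheets <- read_all_sheets(list.files(recursive = TRUE, pattern = "\\.xlsx$"))
# names(sheets)
```

A list also keeps the global environment clean and can be passed on to lapply() or dplyr::bind_rows() later.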
Read one worksheet from multiple excel files using purrr and readxl and add field
Supposing the two packs.xlsx files are in different subfolders:
library(readxl)
# escape the dot and anchor the end so that only files named packs.xlsx match
filenames <- list.files(pattern = "packs\\.xlsx$", recursive = TRUE)
df <- lapply(filenames, function(fn) {
  # get the sheet detail
  xl <- read_excel(fn, sheet = "summary")
  # add the filename as a field
  xl$filename <- fn
  # function return
  xl
})
# if both summary sheets have the same format, you can combine them into one
fin <- do.call(rbind, df)
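If the summary sheets do not share exactly the same columns, do.call(rbind, df) stops with an error. As a sketch of one alternative, dplyr::bind_rows() matches columns by name and fills missing ones with NA. The two data frames below are made-up stand-ins for the sheets read above.

```r
library(dplyr)

# Hypothetical stand-ins for two "summary" sheets whose columns differ
df <- list(
  data.frame(pack = "A", units = 10, filename = "x/packs.xlsx"),
  data.frame(pack = "B", units = 20, region = "EU", filename = "y/packs.xlsx")
)

# do.call(rbind, df) would fail here because the column sets differ;
# bind_rows() aligns by column name and pads with NA instead
fin <- bind_rows(df)
print(fin)
```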
Read multiple xlsx files with multiple sheets into one R data frame
I would use a nested loop like this to go through each sheet of each file.
It might not be the fastest solution but it is the simplest.
library(xlsx)
# list every .xlsx file below the working directory
files.list <- list.files(recursive = TRUE, pattern = "\\.xlsx$")
for (i in seq_along(files.list)) {
  wb <- loadWorkbook(files.list[i]) # select a file and load the workbook
  sheet <- getSheets(wb) # get the list of sheets
  for (j in seq_along(sheet)) {
    tmp <- read.xlsx(files.list[i], sheetIndex = j, colIndex = c(1:6, 8:10, 12:17, 19),
                     sheetName = NULL, startRow = 4, endRow = NULL,
                     as.data.frame = TRUE, header = FALSE)
    if (i == 1 && j == 1) dataset <- tmp else dataset <- rbind(dataset, tmp) # append to the previous rows
  }
}
You can clean NA values after the loading phase.
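As a rough sketch of that cleaning step in base R, the example below uses a small made-up data frame in place of the combined dataset. Which rows count as "empty" depends on your data, so treat the two rules shown as assumptions.

```r
# Hypothetical stand-in for the combined `dataset` built by the loop
dataset <- data.frame(
  X1 = c("a", NA, "c", NA),
  X2 = c(1, NA, 3, 4)
)

# Drop rows in which every column is NA (for example blank spreadsheet rows)
all_na <- rowSums(is.na(dataset)) == ncol(dataset)
cleaned <- dataset[!all_na, , drop = FALSE]

# Alternatively, keep only rows with no missing values at all
complete_only <- dataset[complete.cases(dataset), , drop = FALSE]
```

Here cleaned keeps three rows (only the fully blank row is removed), while complete_only keeps the two rows without any NA.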
Importing multiple excel sheets into one dataframe adding the sheet name as variable
Simply use bind_rows() from dplyr and set the argument .id = "sheet". The data from each sheet are then row-bound together, and a new column, named by whatever you pass to .id, records the sheet each row came from.
dplyr::bind_rows(
  rio::import_list("path/to/file/test.xlsx", setclass = "tbl"),
  .id = "sheet"
)
Test
Write out an Excel file with 2 sheets named AUS and AUT:
openxlsx::write.xlsx(
  list(AUS = data.frame(x = 1:2, y = 3:4),
       AUT = data.frame(x = 5:6, y = 7:8)),
  file = "test.xlsx"
)
Then read it back and combine the sheets:
dplyr::bind_rows(
  rio::import_list("test.xlsx", setclass = "tbl"),
  .id = "sheet"
)
# # A tibble: 4 × 3
# sheet x y
# <chr> <dbl> <dbl>
# 1 AUS 1 3
# 2 AUS 2 4
# 3 AUT 5 7
# 4 AUT 6 8
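If rio is not available, roughly the same result can be sketched with readxl and purrr: name the vector of sheet names by itself, read each sheet, and let bind_rows() turn the list names into the sheet column. The temporary file path is an assumption made so the example does not write into the working directory.

```r
library(readxl)
library(dplyr)
library(purrr)

# Write the same two-sheet demo workbook to a temporary location
path <- file.path(tempdir(), "test.xlsx")
openxlsx::write.xlsx(
  list(AUS = data.frame(x = 1:2, y = 3:4),
       AUT = data.frame(x = 5:6, y = 7:8)),
  file = path,
  overwrite = TRUE
)

combined <- path %>%
  excel_sheets() %>%                      # character vector of sheet names
  set_names() %>%                         # name each element by its own value
  map(~ read_excel(path, sheet = .x)) %>% # named list with one tibble per sheet
  bind_rows(.id = "sheet")                # list names become the "sheet" column
```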