Loading many files at once?
lapply works, but you have to specify that you want the objects loaded into .GlobalEnv; otherwise they're loaded into the temporary evaluation environment created (and destroyed) by lapply.
lapply(file_names,load,.GlobalEnv)
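To see the difference, here is a minimal sketch that assumes two throwaway .RData files in a temporary directory. The object names a and b and the tmp_dir path are made up for the demonstration.

```r
# Hypothetical demo files: save two small objects, then remove them.
tmp_dir <- tempfile("rdata_demo_")
dir.create(tmp_dir)
a <- 1:3
b <- letters[1:3]
save(a, file = file.path(tmp_dir, "a.RData"))
save(b, file = file.path(tmp_dir, "b.RData"))
rm(a, b)

file_names <- list.files(tmp_dir, pattern = "\\.RData$", full.names = TRUE)

# Without envir, load() restores the objects into lapply's own frame,
# which disappears when lapply returns.
invisible(lapply(file_names, load))
missing_without_envir <- !exists("a", envir = globalenv(), inherits = FALSE)

# With envir = .GlobalEnv the objects end up in the workspace.
invisible(lapply(file_names, load, envir = .GlobalEnv))
present_with_envir <- exists("a", envir = globalenv(), inherits = FALSE) &&
  exists("b", envir = globalenv(), inherits = FALSE)
```

Naming the argument (envir = .GlobalEnv) rather than passing it by position makes the intent easier to read.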
Loading multiple files into R at the same time (with similar file names)
One solution is to parse the file names and assign them as names to elements in a list of data frames. We'll use some sample data that has monthly sales for beer brands across two years, saved as CSV files in two subdirectories, year1 and year2.
We will use lapply() to read the files into a list of data frames, and then use the names() function to name each element by prefixing year<x>. to the file name (excluding .csv).
fileList <- c("year1/beer.csv", "year2/beer.csv")
data <- lapply(fileList, function(x) {
  read.csv(x)
})
# generate data set names to be assigned to elements in the list
fileNameTokens <- strsplit(fileList, "/|[.]")
theNames <- unlist(lapply(fileNameTokens, function(x) {
  paste0(x[1], ".", x[2])
}))
names(data) <- theNames
# print first six rows of file 1 based on named extract
data[["year1.beer"]][1:6, ]
...and the output.
> data[["year1.beer"]][1:6,]
Month Item Sales
1 1 Budweiser 83047
2 2 Budweiser 38374
3 3 Budweiser 47287
4 4 Budweiser 18417
5 5 Budweiser 23981
6 6 Budweiser 55471
>
Next, we'll print the first few rows of the second file.
> # print first six rows of file 2 based on named extract
> data[["year2.beer"]][1:6,]
Month Item Sales
1 1 Budweiser 23847
2 2 Budweiser 33847
3 3 Budweiser 44400
4 4 Budweiser 35333
5 5 Budweiser 18710
6 6 Budweiser 63108
>
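The name-building step can also be written with base path helpers instead of a regular expression. This is a sketch of an alternative, not the approach used above. It assumes the same year<x>/beer.csv layout, recreated in a temporary directory with made-up values.

```r
# Hypothetical sample data in a temporary directory (values are made up).
root <- tempfile("beer_demo_")
for (yr in c("year1", "year2")) {
  dir.create(file.path(root, yr), recursive = TRUE)
  write.csv(
    data.frame(Month = 1:3, Item = "Budweiser", Sales = c(10, 20, 30)),
    file.path(root, yr, "beer.csv"),
    row.names = FALSE
  )
}

fileList <- list.files(root, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)

# Build "<directory>.<file>" names from the path components.
theNames <- paste0(
  basename(dirname(fileList)), ".",
  tools::file_path_sans_ext(basename(fileList))
)

# Read every file and attach the names in one step.
data <- setNames(lapply(fileList, read.csv), theNames)
names(data)
```

Because the names come from dirname() and basename(), this version keeps working when the files sit under a longer path, where splitting on "/" would shift the token positions.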
If one needs to access the data frames directly without relying on the list names, they can be assigned to the global environment from within the lapply() function via the assign() function with pos=1, as noted in the other answer.
# alternate form, assigning directly to the global environment (pos = 1)
data <- lapply(fileList, function(x) {
  # x is the filename, parse into strings to generate data set name
  fileNameTokens <- unlist(strsplit(x, "/|[.]"))
  assign(paste0(fileNameTokens[1], ".", fileNameTokens[2]), read.csv(x), pos = 1)
})
head(year1.beer)
...and the output.
> head(year1.beer)
Month Item Sales
1 1 Budweiser 83047
2 2 Budweiser 38374
3 3 Budweiser 47287
4 4 Budweiser 18417
5 5 Budweiser 23981
6 6 Budweiser 55471
>
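If the named list already exists, list2env() can copy every element into the workspace in one call, which avoids calling assign() inside the loop. A small sketch follows. The two data frames are made-up stand-ins for the files read above.

```r
# Hypothetical named list standing in for the result of lapply() + names().
data <- list(
  year1.beer = data.frame(Month = 1:2, Sales = c(10, 20)),
  year2.beer = data.frame(Month = 1:2, Sales = c(30, 40))
)

# Copy each list element into the global environment under its list name.
invisible(list2env(data, envir = globalenv()))

head(year1.beer)
```

Keeping the list around as well means the same data can still be processed with lapply() later.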
The technique also works with RDS files, provided fileList holds paths to .rds files rather than .csv files, as follows.
data <- lapply(fileList, function(x) {
  # x is the filename, parse into strings to generate data set name
  fileNameTokens <- unlist(strsplit(x, "/|[.]"))
  assign(paste0(fileNameTokens[1], ".", fileNameTokens[2]), readRDS(x), pos = 1)
})
head(year1.beer)
...and the output.
> head(year1.beer)
Month Item Sales
1 1 Budweiser 83047
2 2 Budweiser 38374
3 3 Budweiser 47287
4 4 Budweiser 18417
5 5 Budweiser 23981
6 6 Budweiser 55471
>
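The reason the same pattern carries over is that readRDS(), like read.csv(), returns the object and leaves the naming to the caller. load() instead restores objects under the names they were saved with. A short sketch of the contrast, using a made-up data frame and temporary files:

```r
# Hypothetical data saved in both formats (values are made up).
rds_dir <- tempfile("rds_demo_")
dir.create(rds_dir)
beer <- data.frame(Month = 1:3, Item = "Budweiser", Sales = c(10, 20, 30))
rds_path <- file.path(rds_dir, "beer.rds")
rdata_path <- file.path(rds_dir, "beer.RData")
saveRDS(beer, rds_path)
save(beer, file = rdata_path)

# readRDS() returns the object, so any name can be chosen here.
year1.beer <- readRDS(rds_path)

# load() restores the object under its saved name and returns that name.
scratch <- new.env()
restored_names <- load(rdata_path, envir = scratch)
restored_names
```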
Loading multiple .RData and binding into a single data.frame
You could use get() to return the data from the calling environment or alternatively load them into a new environment and bind them afterwards. Note that .RData files can contain multiple objects but assuming these objects are all conformable, you could do:
library(purrr)
library(dplyr)
df1 <- data.frame(X = 1:10)
df2 <- data.frame(X = 1:10)
save(df1, file = "df1.RData", compress = "xz")
save(df2, file = "df2.RData", compress = "xz")
list.files(pattern = "\\.RData$") %>%
  map_df(~ get(load(file = .x)))
X
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 1
12 2
13 3
14 4
15 5
16 6
17 7
18 8
19 9
20 10
Or:
temp_env <- new.env()
list.files(pattern = "\\.RData$") %>%
  map(~ load(file = .x, envir = temp_env))
bind_rows(as.list(temp_env))
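For files that do hold more than one object, a base R variant can load each file into its own environment and collect everything it contains before binding. This sketch swaps purrr and dplyr for lapply() and rbind(). The helper name load_all and the extra df3 object are hypothetical.

```r
# Hypothetical demo files; the second one holds two data frames.
tmp_dir <- tempfile("bind_demo_")
dir.create(tmp_dir)
df1 <- data.frame(X = 1:10)
df2 <- data.frame(X = 1:10)
df3 <- data.frame(X = 11:12)
save(df1, file = file.path(tmp_dir, "df1.RData"))
save(df2, df3, file = file.path(tmp_dir, "df2.RData"))

# Load one file into a private environment and return all of its objects.
load_all <- function(path) {
  env <- new.env()
  object_names <- load(path, envir = env)  # load() returns the restored names
  mget(object_names, envir = env)
}

files <- list.files(tmp_dir, pattern = "\\.RData$", full.names = TRUE)

# Flatten the per-file lists one level, then stack the data frames.
pieces <- unlist(lapply(files, load_all), recursive = FALSE)
combined <- do.call(rbind, unname(pieces))
nrow(combined)
```

Using a fresh environment per file also prevents an object in one file from silently overwriting a same-named object from another.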
How to load multiple files into one file using Informatica
Use an indirect file load with a list of files to load all the files together. Then use a Sorter on col2 to order the data. Finally, use a target file to store the data.
The whole mapping should look like this:
SQ --> EXP --> SRT(key = col2) --> Target
A few things to note:
- In the session, use indirect file and mention filelist1.txt as the list file name.
- Use ls -1 file* > filelist1.txt in a pre-session command task to create a file list with all required files.
- Expression transformation: convert col2 to INTEGER if it's coming through as a string in the SQ.
- Sorter transformation: use col2 as the key column.
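For comparison with the R answers on this page, the same flow (read every file named in a list file, cast col2 to integer, sort on it, write one target) can be sketched in R. This only illustrates the logic and is not Informatica configuration. The CSV layout, the col1 column, the sample values and the target.csv name are all assumptions.

```r
# Hypothetical source files in a temporary directory (contents are made up).
work_dir <- tempfile("indirect_demo_")
dir.create(work_dir)
write.csv(data.frame(col1 = c("a", "b"), col2 = c("3", "1")),
          file.path(work_dir, "file1.csv"), row.names = FALSE)
write.csv(data.frame(col1 = c("c", "d"), col2 = c("2", "10")),
          file.path(work_dir, "file2.csv"), row.names = FALSE)

# Counterpart of the pre-session command: write the list of source files.
list_file <- file.path(work_dir, "filelist1.txt")
writeLines(list.files(work_dir, pattern = "^file.*\\.csv$", full.names = TRUE), list_file)

# Indirect load: read every file named in the list and stack the rows.
sources <- readLines(list_file)
stacked <- do.call(rbind, lapply(sources, read.csv, colClasses = "character"))

# Expression step: make sure col2 is an integer, then sort on it.
stacked$col2 <- as.integer(stacked$col2)
sorted <- stacked[order(stacked$col2), ]

# Target step: write the combined, sorted rows to a single file.
write.csv(sorted, file.path(work_dir, "target.csv"), row.names = FALSE)
```

The integer cast matters: sorted as strings, "10" would come before "2".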
How to load multiple CSV files into separate objects (data frames) in R based on filename?
Solution for anyone curious...
# count only the primary files; each is assumed to have a matching exfile00<i>.csv
files <- list.files(pattern = "^file00[0-9]+\\.csv$")
for (i in seq_along(files)) {
  # build the names for this iteration, e.g. "file001" and "exfile001"
  file_name <- paste0("file00", i)
  ex_file_name <- paste0("exfile00", i)
  # read each CSV into its own object, named after the file
  assign(file_name, read.csv(file = paste0(file_name, ".csv"), fileEncoding = "UTF-8-BOM"))
  assign(ex_file_name, read.csv(file = paste0(ex_file_name, ".csv"), fileEncoding = "UTF-8-BOM"))
}
Essentially, build the file name within the loop, then pass it to read.csv() on each iteration and assign() the result to an object of the same name, so that later iterations do not overwrite earlier ones.
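Rather than rebuilding each file name from a counter, the names returned by list.files() can be reused directly as object names. A minimal sketch, assuming two made-up CSV files in a temporary directory:

```r
# Hypothetical CSV files (contents are made up).
csv_dir <- tempfile("csv_demo_")
dir.create(csv_dir)
write.csv(data.frame(id = 1:2), file.path(csv_dir, "file001.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:5), file.path(csv_dir, "exfile001.csv"), row.names = FALSE)

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

for (path in files) {
  # Strip the directory and extension; make.names() guards against
  # file names that are not valid R object names.
  object_name <- make.names(tools::file_path_sans_ext(basename(path)))
  # Add fileEncoding = "UTF-8-BOM" here if the files carry a byte-order mark.
  assign(object_name, read.csv(path), envir = globalenv())
}
```

This works for any number of files and does not depend on the numbering being contiguous.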