How can I turn the filename into a variable when reading multiple csvs into R
You can create the object from lapply first.
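# Read every file into a list and name each element after its file
Lapply <- lapply(myFiles, read.csv, header = TRUE)
names(Lapply) <- myFiles
# Add the filename as a column to each data frame
for (i in myFiles)
  Lapply[[i]]$Source <- i
# Stack all data frames into one
do.call(rbind, Lapply)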
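To try the approach above end to end, here is a minimal, self-contained sketch. The temporary directory, the file names a.csv and b.csv, and the column x are made up for illustration; full paths are used for reading so that the example does not depend on the working directory.

```r
# Create two small CSV files in a temporary directory (illustrative data only)
tmp <- tempfile("csvdemo")
dir.create(tmp)
write.csv(data.frame(x = 1:2), file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(tmp, "b.csv"), row.names = FALSE)

# Full paths so read.csv() works regardless of the working directory
myFiles <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read every file and name each list element after its bare filename
Lapply <- lapply(myFiles, read.csv, header = TRUE)
names(Lapply) <- basename(myFiles)

# Add the filename as a column, then stack the data frames
for (i in names(Lapply)) Lapply[[i]]$Source <- i
combined <- do.call(rbind, Lapply)
rownames(combined) <- NULL
print(combined)
```

Naming the list with basename() keeps the Source column short while still reading from the full path.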
How can I turn a part of the filename into a variable when reading multiple text files?
Update: Although the initial answer is correct, the same goal can be achieved in fewer steps by using sapply with simplify=FALSE instead of lapply, because sapply automatically assigns the filenames to the elements in the list:
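library(data.table)
# list.files() returns bare filenames here, so this reads correctly only when "pathname" is the working directory
# pattern is a regular expression, not a glob
files <- list.files("pathname", pattern = "\\.TXT$")
file.list <- sapply(files, read.table, simplify = FALSE)
# idcol stores the list names (the filenames); keep only their first four characters
masterfilesales <- rbindlist(file.list, idcol = "id")[, id := substr(id, 1, 4)]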
Old answer: To achieve what you want, you can use a combination of the setattr function and the idcol parameter of the rbindlist function from the data.table package as follows:
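library(data.table)
# pattern is a regular expression, not a glob
files <- list.files("pathname", pattern = "\\.TXT$")
file.list <- lapply(files, read.table)
# Name the list elements after the files (by reference)
setattr(file.list, "names", files)
masterfilesales <- rbindlist(file.list, idcol = "id")[, id := substr(id, 1, 4)]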
Alternatively, you can set the filenames in base R with:
attr(file.list, "names") <- files
or:
names(file.list) <- files
and bind them together with bind_rows from the dplyr package (which also has an .id parameter to create an id column):
library(dplyr)
masterfilesales <- bind_rows(file.list, .id = "id") %>% mutate(id = substr(id, 1, 4))
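If neither data.table nor dplyr is available, a similar result can be sketched in base R alone. This is only an illustration under assumptions: the temporary directory, the file names 2019_sales.TXT and 2020_sales.TXT, and the sales column are invented, and Map is used here in place of an id-column parameter.

```r
# Two throwaway files whose first four characters act as the id (illustrative names)
tmp <- tempfile("txtdemo")
dir.create(tmp)
write.table(data.frame(sales = c(10, 20)), file.path(tmp, "2019_sales.TXT"), row.names = FALSE)
write.table(data.frame(sales = c(30, 40)), file.path(tmp, "2020_sales.TXT"), row.names = FALSE)

paths <- list.files(tmp, pattern = "\\.TXT$", full.names = TRUE)

# Read each file and name the list elements after the bare filenames
file.list <- lapply(paths, read.table, header = TRUE)
names(file.list) <- basename(paths)

# Prepend an id column holding the first four characters of each filename
with_id <- Map(function(df, nm) cbind(id = substr(nm, 1, 4), df),
               file.list, names(file.list))
masterfilesales <- do.call(rbind, with_id)
rownames(masterfilesales) <- NULL
print(masterfilesales)
```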
How to import multiple .csv files at once?
Something like the following should result in each data frame as a separate element in a single list:
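# pattern is a regular expression: match names ending in ".csv"
temp = list.files(pattern = "\\.csv$")
# read.csv (comma-separated) rather than read.delim (tab-separated)
myfiles = lapply(temp, read.csv)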
This assumes that you have those CSVs in a single directory (your current working directory) and that all of them have the lower-case extension .csv.
If you then want to combine those data frames into a single data frame, see the solutions in other answers using things like do.call(rbind,...), dplyr::bind_rows() or data.table::rbindlist().
If you really want each data frame in a separate object, even though that's often inadvisable, you could do the following with assign:
temp = list.files(pattern = "\\.csv$")
# seq_along() is safe even when no files match
for (i in seq_along(temp)) assign(temp[i], read.csv(temp[i]))
Or, without assign, and to demonstrate (1) how the file name can be cleaned up and (2) how to use list2env, you can try the following:
temp = list.files(pattern = "\\.csv$")
# Strip the extension, make syntactically valid names, and create one object per file
list2env(
  lapply(setNames(temp, make.names(gsub("\\.csv$", "", temp))),
         read.csv), envir = .GlobalEnv)
But again, it's often better to leave them in a single list.
How to load multiple csv files into separate objects (dataframes) in R based on filename?
Solution for anyone curious...
files <- list.files(pattern = "\\.csv$")
for (file in seq_along(files)) {
  # Build the base names, e.g. "file001" and "exfile001"
  file_name <- paste0("file00", file)
  ex_file_name <- paste0("exfile00", file)
  # Read both files; note that these two objects are overwritten on every iteration
  file_object <- read.csv(file = paste0(file_name, ".csv"), fileEncoding = "UTF-8-BOM")
  exfile_object <- read.csv(file = paste0(ex_file_name, ".csv"), fileEncoding = "UTF-8-BOM")
}
Essentially, build the filename within the loop, then pass it to the read.csv function on each iteration.
Read multiple CSV files into separate data frames
Quick draft, untested:
Use list.files() aka dir() to dynamically generate your list of files. This returns a vector, just run along the vector in a for loop. Read the i-th file, then use assign() to place the content into a new variable file_i.
That should do the trick for you.
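The steps above could look roughly like this. It is a sketch rather than a tested recipe for any particular data set: the temporary directory and the files first.csv and second.csv are invented so that the example runs on its own.

```r
# Two throwaway CSV files (names and contents are illustrative only)
tmp <- tempfile("assigndemo")
dir.create(tmp)
write.csv(data.frame(a = 1:3), file.path(tmp, "first.csv"), row.names = FALSE)
write.csv(data.frame(a = 4:6), file.path(tmp, "second.csv"), row.names = FALSE)

# list.files() returns a character vector of matching files
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read the i-th file and store it in a variable named file_i
for (i in seq_along(files)) {
  assign(paste0("file_", i), read.csv(files[i]))
}

print(file_1)
print(file_2)
```

Because list.files() sorts its result alphabetically, file_1 corresponds to first.csv and file_2 to second.csv here.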
Importing multiple .csv files into R and adding a new column with file name
This should do it:
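# full.names = TRUE so read.csv() can find the files from any working directory
file_names <- dir("~/Desktop/data", full.names = TRUE)
# name holds the filename up to its first dot
df <- do.call(rbind, lapply(file_names, function(x) cbind(read.csv(x), name = strsplit(basename(x), '\\.')[[1]][1])))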
Add filename column to table as multiple files are read and bound
I generally use the following approach, based on dplyr/tidyr:
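library(dplyr); library(tidyr); library(readr)
# files: a character vector of file paths, defined beforehand
data = tibble(File = files) %>%
  extract(File, "Site", "([A-Z]{2}-[A-Za-z0-9]{3})", remove = FALSE) %>%
  mutate(Data = lapply(File, read_csv)) %>%
  unnest(Data) %>%
  select(-File)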
Load in multiple CSV files and add suffix to column names in R
Suppose we have the files generated reproducibly in the Note at the end. Then we get the file names in fnames and Map a function Read over them to read in each file and fix the names, returning the fixed-up data frame.
fnames <- Sys.glob("data*.csv")
Read <- function(f) {
  df <- read.csv(f)
  # Suffix every column except the first with the file's base name (extension removed)
  names(df)[-1] <- paste0(names(df[-1]), "_", sub("\\.csv$", "", basename(f)))
  df
}
L <- Map(Read, fnames)
str(L)
giving this named list:
List of 3
$ data1.csv:'data.frame': 2 obs. of 3 variables:
..$ subject_id: int [1:2] 1 2
..$ var1_data1: int [1:2] 55 55
..$ var2_data1: int [1:2] 57 57
$ data2.csv:'data.frame': 2 obs. of 3 variables:
..$ subject_id: int [1:2] 1 2
..$ var1_data2: int [1:2] 55 55
..$ var2_data2: int [1:2] 57 57
$ data3.csv:'data.frame': 2 obs. of 3 variables:
..$ subject_id: int [1:2] 1 2
..$ var1_data3: int [1:2] 55 55
..$ var2_data3: int [1:2] 57 57
Note
Lines <- "subject_id var1 var2
1 55 57
2 55 57"
data1 <- data2 <- data3 <- read.table(text = Lines, header = TRUE)
for(f in c("data1", "data2", "data3")) write.csv(get(f), paste0(f, ".csv"), row.names = FALSE, quote = FALSE)