How to Loop Through a Folder of CSV Files in R

My favourite way to do this is with ldply from the plyr package. It returns a data frame directly, so you don't need a separate rbind step afterwards:

library( plyr )
# list.files() expects a regular expression, so match a literal ".txt" suffix
babynames <- ldply( .data = list.files(pattern="\\.txt$"),
                    .fun = read.csv,
                    header = FALSE,
                    col.names=c("Name", "Gender", "Count") )
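
For comparison, here is a minimal base R sketch of the same import, with the rbind step written out explicitly. The temporary folder, the two sample files and the name babynames_base are made up for illustration; only the column names come from the example above.

```r
# Hypothetical demo data: two small files in a temporary folder
demo_dir <- file.path(tempdir(), "babynames_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Mary,F,7065", "John,M,9655"), file.path(demo_dir, "yob1880.txt"))
writeLines(c("Mary,F,6919", "John,M,8769"), file.path(demo_dir, "yob1881.txt"))

# list.files() takes a regular expression, so match a literal ".txt" suffix
txt_files <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Read each file into a list of data frames
pieces <- lapply(txt_files, read.csv,
                 header = FALSE,
                 col.names = c("Name", "Gender", "Count"))

# Stack the pieces into one data frame (the step ldply does for you)
babynames_base <- do.call(rbind, pieces)
```

With full.names = TRUE the paths work from any working directory, which is probably what you want once the files live outside the current folder.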

As an added benefit, you can parallelise the import very easily, which can make importing large multi-file datasets quite a bit faster (doMC relies on forking, so this works on Unix-like systems but not on Windows):

library( plyr )
library( doMC )
registerDoMC( cores = 4 )
babynames <- ldply( .data = list.files(pattern="\\.txt$"),
                    .fun = read.csv,
                    header = FALSE,
                    col.names=c("Name", "Gender", "Count"),
                    .parallel = TRUE )

To include a Year column in the resulting data frame, change the above slightly: create a function first, then pass that function to ldply in the same way you would pass read.csv:

readFun <- function( filename ) {

  # read in the data
  data <- read.csv( filename,
                    header = FALSE,
                    col.names = c( "Name", "Gender", "Count" ) )

  # add a "Year" column by removing both "yob" and ".txt" from file name
  # (the dot is escaped so it matches a literal "." only)
  data$Year <- gsub( "yob|\\.txt", "", filename )

  return( data )
}

# execute that function across all files, outputting a data frame
doMC::registerDoMC( cores = 4 )
babynames <- plyr::ldply( .data = list.files(pattern="\\.txt$"),
                          .fun = readFun,
                          .parallel = TRUE )

This gives you your data in a concise, tidy (long) format, which is how I'd recommend moving forward from here. While it is possible to then separate each year's data into its own column, it's likely not the best way to go.

Note: gsub returns character strings, so depending on your preference it may be a good idea to convert the Year column to, say, integer class. But that's up to you.
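
As a small sketch of that conversion: babynames_demo below is a toy stand-in for the real result, and its values are only illustrative.

```r
# Toy stand-in for the data frame produced by ldply (illustrative values)
babynames_demo <- data.frame(Name = c("Mary", "John"),
                             Count = c(7065L, 9655L),
                             Year = c("1880", "1881"),
                             stringsAsFactors = FALSE)

# Year starts out as text because it was cut out of the file name
class(babynames_demo$Year)

# Convert to integer so the column sorts and plots numerically
babynames_demo$Year <- as.integer(babynames_demo$Year)
class(babynames_demo$Year)
```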

For loop to read multiple csv files in R from different directories

You are trying to use string interpolation, which R string literals do not support: "Fish[i]" is taken literally rather than being replaced with the value of i.

Look at the output of this:

files <- c(21,22,29,30,34,65,66,69,70,74)

for(i in files) { # Loop over the numeric vector of IDs
  print("F:/Fish[i]/Fish[i].csv")
}

Output:

[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"

Additionally, what is F? If it is a list, you will need to use double square brackets, and paste0 to build each path:

for(i in files) {   # Loop over the numeric vector of IDs
  F[[i]] <- read.csv(paste0("F:/Fish",i,"/Fish", i, ".csv"))
}
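
Another way to build the paths is sprintf, which is vectorised and so needs no loop. The sketch below only constructs the paths and a named list to hold the results; the list name fish and the "Fish21"-style element names are my own choices, not something from the question.

```r
ids <- c(21, 22, 29)  # a few of the IDs from the vector above

# sprintf() substitutes each ID into the template; %d expects whole numbers
paths <- sprintf("F:/Fish%d/Fish%d.csv", ids, ids)
print(paths)

# A named list avoids the empty slots that numeric indices like 21 would create
fish <- vector("list", length(ids))
names(fish) <- paste0("Fish", ids)

# Read only the files that actually exist on this machine
for (k in seq_along(ids)) {
  if (file.exists(paths[k])) fish[[k]] <- read.csv(paths[k])
}
```

Naming the list something other than F also sidesteps the fact that F is R's built-in shorthand for FALSE.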

Loop function for reading csv files and store them in a list

Try:

library(plyr)  # provides rbind.fill

slotted <- lapply(setNames(nm = directory), function(D) {
  alldat <- lapply(list.files(D, pattern="\\.csv$", full.names=TRUE),
                   function(fn) {
                     message(fn)
                     # stringsAsFactors=FALSE is the default as of R 4.0.0
                     read.csv2(fn, stringsAsFactors=FALSE)
                   })
  do.call(rbind.fill, alldat)
})

Loop through CSV files--issue completing task for each individual file

You can give this a try:

library(stringr)
## List CSV files in folder
files <- list.files(pattern = "\\.csv$")

big.df <- vector('list', length(files))

## Run a for loop to complete the same tasks for each file
for (i in seq_along(files)) {
  ## Read table
  tmp <- read.table(files[i], header=FALSE, sep=" ")
  ## Keep certain columns
  tmp1 <- tmp[c(2:5,9,10,12,13)]
  ## Name the remaining columns
  names(tmp1) <-
    c("GMT_Date","GMT_Time","LMT_Date","LMT_Time","Latitude","Longitude","PDOP","2D_3D")
  ## Add column for collar ID
  tmp1$AnimalID <- str_match(files[i], 'Collar(\\d+)_')[,2]
  ## Clean up the data frame by removing records with NAs
  tmp1[tmp1 == "N/A"] <- NA
  tmp2 <- na.omit(tmp1)
  big.df[[i]] <- tmp2
}
final.df <- do.call('rbind', big.df)

It will require the stringr package and assumes your filenames all look like 'GPS_Collar33801_13.csv', etc. It then reads in each file, stores it in a large list, moves to the next file... and when it's done, it mashes them all together in a data.frame called final.df.
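
If you would rather not depend on stringr for this one step, the ID can also be pulled out with base R. This is only a sketch of an alternative, using the example filename mentioned above; collar_id and has_id are names I made up.

```r
fname <- "GPS_Collar33801_13.csv"   # example filename from the text above

# Capture the digits between "Collar" and the following underscore
collar_id <- sub(".*Collar(\\d+)_.*", "\\1", fname)
collar_id

# Caveat: sub() returns its input unchanged when nothing matches,
# whereas str_match() returns NA, so test for a match first if names vary
has_id <- grepl("Collar\\d+_", c(fname, "notes.csv"))
has_id
```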

How to use R to Iterate through Subfolders and bind CSV files of the same ID?

You can use the recursive = TRUE option of list.files. Note that pattern is a regular expression, not a shell wildcard:

lapply(c('1234', '1345', '1456', '1560'), function(x) {
  ## TF: path of the top-level folder (assumed to be defined already)
  sources.files <- list.files(path = TF,
                              recursive = TRUE,
                              pattern = paste0('09061.*', x, '.*\\.csv$'),
                              full.names = TRUE)
  ## read all files with the id and bind them
  dat <- do.call(rbind, lapply(sources.files, read.csv))
  ## write the combined file for this id
  write.csv(dat, paste0('agg', x, '.csv'))
})
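
If you prefer to think in shell-style wildcards, glob2rx from the utils package will translate one into a regular expression that list.files accepts. A quick sketch, with made-up file names to check the result against:

```r
# glob2rx() converts a wildcard pattern into an anchored regular expression
rx <- glob2rx("*09061*1234*.csv")

# Hypothetical file names to test the pattern against
candidates <- c("site_09061_1234_a.csv",
                "site_09061_1345_a.csv",
                "09061_1234.txt")
matches <- grepl(rx, candidates)
matches
```

Only the first name contains both the 09061 marker and the 1234 ID and ends in .csv, so it is the only one that matches.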

Looping through files in R and applying a function

It's easier if you read the file inside the function:

stratindex <- function(file){
  ctd <- read.csv(file)
  x <- ctd$Density..sigma.t..kg.m.3..
  (x[30] - x[1]) / 29
}

Then apply the function to a vector of filenames:

the.files <- list.files()
index <- sapply(the.files, stratindex)
output <- data.frame(File = the.files, StratIndex = index)
write.csv(output)
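
Here is a self-contained sketch of the same idea that you can run without any data of your own. The demo folder, the file cast1.csv and its density values are invented for illustration; vapply is swapped in for sapply so that a non-numeric result fails loudly, and the result goes to a temporary file rather than the console.

```r
# Hypothetical demo file with the density column used above
demo_dir <- file.path(tempdir(), "ctd_demo")
dir.create(demo_dir, showWarnings = FALSE)
ctd_demo <- data.frame(seq(1020, 1023, length.out = 30))
names(ctd_demo) <- "Density..sigma.t..kg.m.3.."
write.csv(ctd_demo, file.path(demo_dir, "cast1.csv"), row.names = FALSE)

stratindex <- function(file) {
  ctd <- read.csv(file)
  x <- ctd$Density..sigma.t..kg.m.3..
  (x[30] - x[1]) / 29
}

# Restrict the listing to CSV files and keep full paths
csv_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# vapply() insists on exactly one numeric value per file
index <- vapply(csv_files, stratindex, numeric(1))
output <- data.frame(File = basename(csv_files), StratIndex = unname(index))

# Write to a file instead of printing to the console
out_file <- tempfile(fileext = ".csv")
write.csv(output, out_file, row.names = FALSE)
```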

loop through batch read folders in setwd(), format dfs & write.csv() to different folder R

This should do the trick:

library(tidyverse)
library(janitor)

new_col_names <- rev(c("detritus","phyto","peri","zoops","amphipods","inverts","leucisids","lns",
"yct5plus","yct4","yct3","yct2","yctyoy","lkt5plus","lkt34","lkt2",
"lkt7mo1yo", "lktyoy","years"))

for (i in 1:30) {

  setwd(paste0("/Users/Drive/MS/Ma/Ec/Effort_variation_Ec/MES", i, "/"))

  # pattern is a regular expression, so match a literal ".csv" suffix
  ecosmpr <- list.files(pattern = "\\.csv$") %>%
    map_df(~read_csv(.x))

  ecosmpr <- ecosmpr %>%
    select(-c(2:13)) %>%
    row_to_names(row_number = 1)

  names(ecosmpr) <- new_col_names

  output_file <-
    paste0("/Users/Drive/MS/Ma/Ec/Effort_variation_Ec/ecosmpr", i, "_partialformat.csv")

  write.csv(ecosmpr, output_file, row.names = FALSE)
}
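
If changing the working directory inside a loop makes you uneasy, the same batch pattern can be written with full paths instead. The following is a base R sketch on a made-up two-folder layout under tempdir(); the folder names mirror the MES1, MES2, ... scheme above, but the file contents and the names root and combined are hypothetical, and the column selection and renaming steps are left out.

```r
# Hypothetical layout: <root>/MES1 and <root>/MES2, each holding one CSV file
root <- file.path(tempdir(), "effort_demo")
for (i in 1:2) {
  run_dir <- file.path(root, paste0("MES", i))
  dir.create(run_dir, recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(years = 1:3, lkt = i * 1:3),
            file.path(run_dir, "out.csv"), row.names = FALSE)
}

combined <- vector("list", 2)
for (i in 1:2) {
  run_dir <- file.path(root, paste0("MES", i))
  # full.names = TRUE returns usable paths, so setwd() is not needed
  csvs <- list.files(run_dir, pattern = "\\.csv$", full.names = TRUE)
  combined[[i]] <- do.call(rbind, lapply(csvs, read.csv))
  # Write each result to the parent folder, outside the input folders
  write.csv(combined[[i]],
            file.path(root, paste0("ecosmpr", i, "_partialformat.csv")),
            row.names = FALSE)
}
```

Keeping the outputs out of the input folders means a second run does not pick up its own results as input.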

