How to loop through a folder of CSV files in R
My favourite way to do this is to use ldply from the plyr package. It has the advantage of returning a data frame, so you don't need a separate rbind step afterwards:
library( plyr )
# list.files() takes a regular expression, not a glob, so match the extension explicitly
babynames <- ldply( .data = list.files(pattern = "\\.txt$"),
                    .fun = read.csv,
                    header = FALSE,
                    col.names = c("Name", "Gender", "Count") )
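For comparison, the two-step approach that ldply saves you from looks roughly like this in base R. This is only a sketch: it writes two small throwaway files to a temporary directory (the file contents are made up for illustration) so that it can run anywhere.

```r
# Create two small example files in a temporary directory (made-up data)
tmp_dir <- file.path(tempdir(), "babynames_demo")
dir.create(tmp_dir, showWarnings = FALSE)
writeLines(c("Mary,F,10", "John,M,12"), file.path(tmp_dir, "yob1880.txt"))
writeLines(c("Anna,F,7", "Will,M,9"), file.path(tmp_dir, "yob1881.txt"))

# Step 1: read every file into a list of data frames
files <- list.files(tmp_dir, pattern = "\\.txt$", full.names = TRUE)
pieces <- lapply(files, read.csv,
                 header = FALSE,
                 col.names = c("Name", "Gender", "Count"))

# Step 2: the explicit rbind step that ldply performs for you
babynames <- do.call(rbind, pieces)
print(babynames)
```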
As an added benefit, you can parallelise the import very easily, which can make importing large multi-file datasets quite a bit faster (doMC forks worker processes, so this applies to Unix-like systems):
library( plyr )
library( doMC )
registerDoMC( cores = 4 )
babynames <- ldply( .data = list.files(pattern = "\\.txt$"),
                    .fun = read.csv,
                    header = FALSE,
                    col.names = c("Name", "Gender", "Count"),
                    .parallel = TRUE )
Changing the above slightly to include a Year column in the resulting data frame, you can create a function first, then execute that function within ldply in the same way you would execute read.csv:
readFun <- function( filename ) {
    # read in the data
    data <- read.csv( filename,
                      header = FALSE,
                      col.names = c( "Name", "Gender", "Count" ) )
    # add a "Year" column by removing both "yob" and ".txt" from the file name
    # (the dot is escaped so it matches a literal "." rather than any character)
    data$Year <- gsub( "yob|\\.txt$", "", filename )
    return( data )
}
# execute that function across all files, outputting a data frame
doMC::registerDoMC( cores = 4 )
babynames <- plyr::ldply( .data = list.files(pattern = "\\.txt$"),
                          .fun = readFun,
                          .parallel = TRUE )
This will give you your data in a concise and tidy form, which is how I'd recommend moving forward from here. While it is possible to then separate each year's data into its own column, it's likely not the best way to go.
Note: depending on your preference, it may be a good idea to convert the Year column to, say, integer class. But that's up to you.
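Since gsub returns character strings, Year starts out as a character column. A minimal sketch of the conversion, using a toy data frame (made-up values) in place of the real ldply result:

```r
# Toy data frame standing in for the ldply result (values are made up)
babynames <- data.frame(Name = c("Mary", "John"),
                        Year = c("1880", "1881"),
                        stringsAsFactors = FALSE)

# Convert the character Year column to integer
babynames$Year <- as.integer(babynames$Year)
str(babynames$Year)
```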
For loop to read multiple csv files in R from different directories
You are trying to use string interpolation, which R string literals do not support: the [i] inside the quotes is just literal text.
Look at the output of this:
files <- c(21,22,29,30,34,65,66,69,70,74)
for(i in files) { # Loop over the numeric vector
    print("F:/Fish[i]/Fish[i].csv")
}
Output:
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
[1] "F:/Fish[i]/Fish[i].csv"
Additionally, what is F? Unless you have assigned something to it, F is just R's built-in shorthand for FALSE. If it is a list, you will need to use double square brackets:
for(i in files) { # Loop over the numeric vector
    F[[i]] <- read.csv(paste0("F:/Fish",i,"/Fish", i, ".csv"))
}
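If you'd rather avoid the loop, the paths can be built in one vectorised step. This is just a sketch: the list name fish_data is my own invention, and the read is restricted to paths that actually exist so the snippet runs even without the F: drive.

```r
# IDs from the question, stored as integers so "%d" can be used
fish_ids <- c(21L, 22L, 29L)

# sprintf() is vectorised: one path per ID
paths <- sprintf("F:/Fish%d/Fish%d.csv", fish_ids, fish_ids)
print(paths)

# Name each path, then read only the files that exist on this machine
# ("fish_data" is a hypothetical name, not from the question)
names(paths) <- paste0("Fish", fish_ids)
fish_data <- lapply(paths[file.exists(paths)], read.csv)
```

Keeping the results in a named list also sidesteps the question of what F is supposed to be.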
Loop function for reading csv files and store them in a list
Try:
library(plyr)  # for rbind.fill
slotted <- lapply(setNames(nm = directory), function(D) {
  alldat <- lapply(list.files(D, pattern="\\.csv$", full.names=TRUE),
                   function(fn) {
                     message(fn)
                     read.csv2(fn, stringsAsFactors=FALSE)
                   })
  # stringsAsFactors=FALSE is the default as of R 4.0.0
  do.call(rbind.fill, alldat)
})
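To see what this returns, here is a self-contained sketch of the same pattern. It assumes directory is a character vector of folder paths (the question doesn't show it), creates two throwaway folders with made-up semicolon-separated files, and swaps rbind.fill for base rbind since all the toy files share the same columns.

```r
# Build two throwaway folders, each holding two small semicolon-separated files
base_dir <- file.path(tempdir(), "slotted_demo")
directory <- file.path(base_dir, c("siteA", "siteB"))
for (d in directory) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  writeLines(c("id;value", "1;2,5"), file.path(d, "part1.csv"))
  writeLines(c("id;value", "2;3,5"), file.path(d, "part2.csv"))
}

# One list element per folder, each a data frame of that folder's files
slotted <- lapply(setNames(nm = directory), function(D) {
  fns <- list.files(D, pattern = "\\.csv$", full.names = TRUE)
  do.call(rbind, lapply(fns, read.csv2, stringsAsFactors = FALSE))
})

names(slotted)       # the folder paths
nrow(slotted[[1]])   # rows combined from both files in the first folder
```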
Loop through CSV files--issue completing task for each individual file
You can give this a try:
library(stringr)
## List CSV files in folder
files <- list.files(pattern = "\\.csv$")
big.df <- vector('list', length(files))
## Run a for loop to complete the same tasks for each file
for (i in seq_along(files)) {
  ## Read table
  tmp <- read.table(files[i], header = FALSE, sep = " ")
  ## Keep certain columns
  tmp1 <- tmp[c(2:5, 9, 10, 12, 13)]
  ## Name the remaining columns
  names(tmp1) <-
    c("GMT_Date","GMT_Time","LMT_Date","LMT_Time","Latitude","Longitude","PDOP","2D_3D")
  ## Add column for collar ID
  tmp1$AnimalID <- str_match(files[i], 'Collar(\\d+)_')[,2]
  ## Clean up the data frame by removing records with NAs
  tmp1[tmp1 == "N/A"] <- NA
  tmp2 <- na.omit(tmp1)
  big.df[[i]] <- tmp2
}
final.df <- do.call('rbind', big.df)
It requires the stringr package and assumes your filenames all look like 'GPS_Collar33801_13.csv', etc. It reads in each file, stores it in a list, moves on to the next file, and when it's done, binds them all together into a data frame called final.df.
Edit: Just fixed the str_match argument.
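If you'd rather not depend on stringr, the collar ID can also be pulled out with base R's sub. A quick sketch on two example names (the second one is made up to follow the same pattern):

```r
# Example file names; the second is hypothetical
files <- c("GPS_Collar33801_13.csv", "GPS_Collar33802_13.csv")

# Keep only the digits between "Collar" and the following underscore
animal_id <- sub(".*Collar(\\d+)_.*", "\\1", files)
print(animal_id)
```

One difference to keep in mind: when a name doesn't match, sub returns it unchanged, whereas str_match gives NA.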
How to use R to Iterate through Subfolders and bind CSV files of the same ID?
You can use the recursive = TRUE option of list.files:
lapply(c('1234', '1345', '1456', '1560'), function(x) {
  ## pattern is a regular expression, not a glob
  sources.files <- list.files(path = TF,
                              recursive = TRUE,
                              pattern = paste0('09061.*', x, '.*\\.csv$'),
                              full.names = TRUE)
  ## read all files with this id and bind them
  dat <- do.call(rbind, lapply(sources.files, read.csv))
  ## write the aggregated file for this id
  write.csv(dat, paste0('agg', x, '.csv'))
})
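Here is a small self-contained sketch of the recursive search, in case you want to check the pattern before pointing it at real data. The folder layout and file names are invented for the demo.

```r
# Throwaway folder tree with made-up files for two IDs
root <- file.path(tempdir(), "recursive_demo")
dir.create(file.path(root, "day1"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "day2"), recursive = TRUE, showWarnings = FALSE)
write.csv(data.frame(id = 1234, value = 1),
          file.path(root, "day1", "x09061_1234.csv"), row.names = FALSE)
write.csv(data.frame(id = 1234, value = 2),
          file.path(root, "day2", "x09061_1234.csv"), row.names = FALSE)
write.csv(data.frame(id = 1345, value = 3),
          file.path(root, "day1", "x09061_1345.csv"), row.names = FALSE)

# recursive = TRUE descends into the subfolders; the pattern is matched
# against each file name, not the full path
hits <- list.files(root, pattern = "09061.*1234.*\\.csv$",
                   recursive = TRUE, full.names = TRUE)

# Bind every file found for this ID into one data frame
dat <- do.call(rbind, lapply(hits, read.csv))
print(dat)
```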
Looping through files in R and applying a function
It's easier if you read the file inside the function:
stratindex <- function(file){
    ctd <- read.csv(file)
    x <- ctd$Density..sigma.t..kg.m.3..
    (x[30] - x[1]) / 29
}
Then apply the function to a vector of filenames:
the.files <- list.files()
index <- sapply(the.files, stratindex)
output <- data.frame(File = the.files, StratIndex = index)
# with no file argument, write.csv prints to the console; pass a file name to save to disk
write.csv(output)
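A runnable sketch of the same idea on generated data, in case it helps to see the shape of the result. The column name density and the helper name slope are stand-ins I made up, and vapply is used instead of sapply so the result is guaranteed to be numeric.

```r
# Two throwaway files, each with 30 rows of made-up density values
demo_dir <- file.path(tempdir(), "ctd_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (k in 1:2) {
  write.csv(data.frame(density = seq(20, by = k, length.out = 30)),
            file.path(demo_dir, paste0("cast", k, ".csv")),
            row.names = FALSE)
}

# Read the file inside the function, then compute the per-row change
# between the first and the 30th value
slope <- function(file) {
  x <- read.csv(file)$density
  (x[30] - x[1]) / 29
}

the.files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
index <- vapply(the.files, slope, numeric(1))
output <- data.frame(File = basename(the.files), StratIndex = unname(index))
print(output)
```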
loop through batch read folders in setwd(), format dfs & write.csv() to different folder R
This should do the trick:
library(tidyverse)
library(janitor)
new_col_names <- rev(c("detritus","phyto","peri","zoops","amphipods","inverts","leucisids","lns",
                       "yct5plus","yct4","yct3","yct2","yctyoy","lkt5plus","lkt34","lkt2",
                       "lkt7mo1yo", "lktyoy","years"))
for (i in 1:30) {
  # move into the i-th input folder
  setwd(paste0("/Users/Drive/MS/Ma/Ec/Effort_variation_Ec/MES", i, "/"))
  # read and row-bind every CSV in it (pattern is a regular expression)
  ecosmpr <- list.files(pattern = "\\.csv$") %>%
    map_df(~read_csv(.x))
  # drop columns 2 to 13 and promote the first row to column names
  ecosmpr <- ecosmpr %>%
    select(-c(2:13)) %>%
    row_to_names(row_number = 1)
  names(ecosmpr) <- new_col_names
  # write the formatted result to a different folder
  output_file <-
    paste0("/Users/Drive/MS/Ma/Ec/Effort_variation_Ec/ecosmpr", i, "_partialformat.csv")
  write.csv(ecosmpr, output_file, row.names = FALSE)
}
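If you'd prefer not to change the working directory inside the loop, the same read-then-write pattern can be done with full paths. A rough base R sketch on throwaway folders (the out folder and the toy data are made up; the formatting steps are left out):

```r
# Throwaway input folders MES1 and MES2, each with two small CSV files
base_dir <- file.path(tempdir(), "setwd_free_demo")
out_dir <- file.path(base_dir, "out")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
for (i in 1:2) {
  in_dir <- file.path(base_dir, paste0("MES", i))
  dir.create(in_dir, showWarnings = FALSE)
  write.csv(data.frame(x = i), file.path(in_dir, "a.csv"), row.names = FALSE)
  write.csv(data.frame(x = i * 10), file.path(in_dir, "b.csv"), row.names = FALSE)
}

# Read from each input folder and write to the output folder, no setwd() needed
for (i in 1:2) {
  in_dir <- file.path(base_dir, paste0("MES", i))
  csvs <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)
  combined <- do.call(rbind, lapply(csvs, read.csv))
  write.csv(combined,
            file.path(out_dir, paste0("ecosmpr", i, "_partialformat.csv")),
            row.names = FALSE)
}

written <- list.files(out_dir)
print(written)
```

Using full.names = TRUE means the working directory is left alone, so a failure halfway through doesn't leave your session sitting in some MES folder.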