Add "Filename" Column to Table as Multiple Files Are Read and Bound

Adding file name column to table as multiple files are read and merged

You can use sub() to strip the extension and get the site name from each filename. It seems the CHROM column is read as numeric in some files, so convert it to character explicitly before binding:

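library(dplyr)
library(purrr)
library(readr)  # provides read_csv()

# Site name = file name without directory and ".csv" extension
sites <- sub('\\.csv$', '', basename(filenames))

ans <- map2_df(filenames, sites, ~read_csv(.x) %>%
                 mutate(CHROM = as.character(CHROM), id = .y))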
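
To see what the sub()/basename() step produces on its own, here is a minimal base R sketch. The paths are made up for illustration and are not from the original question.

```r
# Hypothetical file paths, for illustration only
filenames <- c("data/site_A.csv", "data/site_B.csv")

# basename() drops the directory; sub() removes a trailing ".csv"
sites <- sub("\\.csv$", "", basename(filenames))
sites
```

Each element of sites lines up with the file at the same position in filenames, which is what lets map2_df() pair them.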

Add filename column to table as multiple files are read and bound

I generally use the following approach, based on dplyr/tidyr:

library(dplyr)
library(tidyr)
library(readr)

data = tibble(File = files) %>%
  extract(File, "Site", "([A-Z]{2}-[A-Za-z0-9]{3})", remove = FALSE) %>%
  mutate(Data = lapply(File, read_csv)) %>%
  unnest(Data) %>%
  select(-File)
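
The regular expression passed to extract() can be checked in isolation before running the whole pipeline. The sketch below uses base R and two invented file names; the assumption is that real file names contain a code shaped like two capitals, a hyphen, and three alphanumerics.

```r
# Hypothetical file names, for illustration only
files <- c("US-Ha1_2019.csv", "DE-Tha_2020.csv")

# regexpr() locates the first match in each name; regmatches() pulls it out.
# Note that regmatches() silently drops names that do not match at all.
site <- regmatches(files, regexpr("[A-Z]{2}-[A-Za-z0-9]{3}", files))
site
```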

Insert a column with file name

We can use Map with cbind to add a column holding the filename to each file's table.

Map(cbind, lapply(files, data.table::fread, sep=","), filename = files)
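
As a rough illustration of how Map pairs each table with its name, here is a small sketch with in-memory data frames standing in for the results of fread(). The tables and file names are invented.

```r
# Stand-ins for tables that would normally come from reading files
tables <- list(data.frame(x = 1:2), data.frame(x = 3))
files <- c("a.csv", "b.csv")

# Map calls cbind(tables[[i]], filename = files[i]) for each i;
# the single file name is recycled down the rows of its table
tagged <- Map(cbind, tables, filename = files)
tagged[[1]]
```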

We can also use map2 from the purrr package to do the same, naming the new column explicitly:

library(purrr)
map2(map(files, data.table::fread, sep=","), files, ~cbind(.x, filename = .y))

To use lapply, we can loop over the indices of the filenames instead and use transform to add a new column with the name of the file.

lapply(seq_along(files), function(x) transform(read.csv(files[x]), file = files[x]))
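
A self-contained version of the lapply/transform idea, sketched in base R. It writes two throwaway CSV files to a temporary directory (the data are made up), reads them back, tags each row with its file name, and stacks the results with rbind.

```r
# Write two small CSV files to a temporary directory (made-up data)
tmp <- tempfile("demo_")
dir.create(tmp)
write.csv(data.frame(x = 1:2), file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(tmp, "b.csv"), row.names = FALSE)

files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read each file and tag its rows with the file name (without the directory)
tables <- lapply(files, function(f) transform(read.csv(f), file = basename(f)))

# Stack the per-file tables into one data frame
combined <- do.call(rbind, tables)
combined
```

Looping over the file paths directly, rather than over seq_along(files), is a stylistic choice here; both give the same result.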

Add new column name to a list of data frames from a part of the file name using lapply

Name the list of filenames using setNames(), then use the .id argument in bind_rows(), which adds a column containing list names.

library(tidyverse)
library(readxl)

files <- list.files(path = "Users/Desktop/week", pattern = "\\.xlsx$", full.names = TRUE) %>%
  setNames(nm = .) %>%
  lapply(read_excel, sheet = 4, skip = 39) %>%
  bind_rows(.id = "Week") %>%
  mutate(Week = str_extract(Week, "wk\\d+"))

You could also combine the iteration and row-binding steps using purrr::map_dfr():

files <- list.files(path = "Users/Desktop/week", pattern = "\\.xlsx$", full.names = TRUE) %>%
  setNames(nm = .) %>%
  map_dfr(read_excel, sheet = 4, skip = 39, .id = "Week") %>%
  mutate(Week = str_extract(Week, "wk\\d+"))
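
To show how the list names end up in the Week column, here is a minimal sketch that skips the Excel files and uses in-memory data frames instead. The paths and values are hypothetical, and it assumes dplyr is installed; base R's regmatches() stands in for str_extract().

```r
library(dplyr)

# Hypothetical file paths standing in for the real workbook paths
paths <- c("reports/sales_wk1.xlsx", "reports/sales_wk12.xlsx")

# In-memory stand-ins for what read_excel() would return for each file
tables <- setNames(
  list(data.frame(value = 1:2), data.frame(value = 3:4)),
  paths
)

# .id turns the list names into a column; the regex keeps only "wk<digits>"
combined <- bind_rows(tables, .id = "Week") %>%
  mutate(Week = regmatches(Week, regexpr("wk[0-9]+", Week)))
combined
```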

Read multiple .txt files and add new column identifying file name in R

This should work, if your read.table command is correct:

myData_list <- lapply(files, function(x) {
  # Return NULL instead of stopping if a file cannot be read
  out <- tryCatch(read.table(x, header = FALSE, sep = ','), error = function(e) NULL)
  if (!is.null(out)) {
    out$source_file <- x
  }
  return(out)
})

myData <- data.table::rbindlist(myData_list)

In the past I found that you can spare yourself a lot of headache using data.table::fread instead of read.table. So you could consider this:

myData_list <- lapply(files, function(x) {
  out <- data.table::fread(x, header = FALSE)
  out$source_file <- x
  return(out)
})

myData <- data.table::rbindlist(myData_list)

You can add the tryCatch part back if necessary. Depending on how the files vector looks, basename() might be interesting to use on the column source_file.
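
The two suggestions above (keeping tryCatch and applying basename()) can be combined. The following base R sketch uses temporary files with made-up contents plus one deliberately missing path; read_one is a hypothetical helper name, not something from the original answer.

```r
# Two temporary text files with made-up contents, plus one path that does not exist
tmp <- tempfile("txt_")
dir.create(tmp)
writeLines(c("1,2", "3,4"), file.path(tmp, "a.txt"))
writeLines("5,6", file.path(tmp, "b.txt"))
files <- file.path(tmp, c("a.txt", "b.txt", "missing.txt"))

read_one <- function(path) {
  # Return NULL instead of stopping when a file cannot be read
  out <- tryCatch(
    suppressWarnings(read.table(path, header = FALSE, sep = ",")),
    error = function(e) NULL
  )
  # Store only the file name, not the full path
  if (!is.null(out)) out$source_file <- basename(path)
  out
}

# Drop the NULL left by the unreadable file, then stack the rest
myData_list <- Filter(Negate(is.null), lapply(files, read_one))
myData <- do.call(rbind, myData_list)
myData
```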

Add filename as column header when combining multiple files

You can do the following to add column names to TEST10. This assumes the column name you want for the first column is files3[1].

colnames(TEST10) <- c(files3[1], files3)

In case you want to keep the name of the first column as is, then we add the desired column names before binding WAVELENGTH with TEST9.

colnames(TEST9) <-  files3
TEST10 <- cbind(WAVELENGTH, TEST9)

Then you can write to a csv as usual, keeping the column names as headers in the resulting file.

write.csv(TEST10, file = "TEST10.csv", row.names = FALSE)
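
A small sketch of the second variant with invented data, to show which column names result. The shapes of WAVELENGTH and TEST9 are assumptions (a numeric vector and a matrix with one column per file); the original question's objects may differ.

```r
# Made-up stand-ins for the objects in the question
WAVELENGTH <- c(400, 410, 420)
TEST9 <- matrix(1:6, ncol = 2)
files3 <- c("sample1.csv", "sample2.csv")

# Name the data columns after the files, then put WAVELENGTH in front;
# cbind() labels the first column with the variable name
colnames(TEST9) <- files3
TEST10 <- cbind(WAVELENGTH, TEST9)
colnames(TEST10)
```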

Assign 'filename' column to dataframe based on row ID

I don't have data to test this on, but you can try the following:

library(dplyr)
library(rvest)

mydata <- sapply(filelist, function(x) {
  # Read the page and keep the second table on it
  read_html(x) %>% rvest::html_table(fill = TRUE) %>%
    dplyr::nth(2)
}, simplify = FALSE)

mydata <- bind_rows(mydata, .id = 'company')
mydata$company <- sub('.*_(\\w+)_\\w+', '\\1', mydata$company)

We used sapply with simplify = FALSE to get a named list whose names are the elements of filelist. When we call bind_rows with .id, those names are stored in a new column, company. The regex then extracts the relevant part of each name.
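
The naming behaviour and the regex can be tried without any HTML files. This base R sketch uses two invented names with exactly two underscores each; real file names with extensions or more underscores may need a different pattern.

```r
# Hypothetical names of the form <prefix>_<company>_<suffix>
filelist <- c("page_acme_2020", "page_globex_2021")

# With simplify = FALSE, sapply() returns a list named after filelist
mydata <- sapply(filelist, function(x) data.frame(n = nchar(x)), simplify = FALSE)
names(mydata)

# Keep only the part between the two underscores
company <- sub(".*_(\\w+)_\\w+", "\\1", names(mydata))
company
```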


