Adding file name column to table as multiple files are read and merged
You can use sub to get the filename. It seems the CHROM column is read as numeric in certain files, so we can convert it to character explicitly. Try:
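library(dplyr)
library(purrr)
library(readr)  # provides read_csv()
# Strip the directory and the .csv extension to get one id per file
sites <- sub('\\.csv$', '', basename(filenames))
ans <- map2_df(filenames, sites, ~read_csv(.x) %>%
  mutate(CHROM = as.character(CHROM), id = .y))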
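To see what the sub/basename step produces on its own, here is a minimal sketch in base R. The file paths are made up for illustration; tools::file_path_sans_ext is a base alternative that should give the same result for simple extensions.

```r
# Hypothetical paths, only for illustration
filenames <- c("data/run1/siteA.csv", "data/run2/siteB.csv")

# basename() drops the directory, sub() drops the trailing ".csv"
sites <- sub("\\.csv$", "", basename(filenames))

# Alternative from the tools package that ships with R
sites_alt <- tools::file_path_sans_ext(basename(filenames))

print(sites)
```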
Add filename column to table as multiple files are read and bound
I generally use the following approach, based on dplyr/tidyr:
library(dplyr)
library(tidyr)  # extract(), unnest()
library(readr)  # read_csv()
data = tibble(File = files) %>%
  extract(File, "Site", "([A-Z]{2}-[A-Za-z0-9]{3})", remove = FALSE) %>%
  mutate(Data = lapply(File, read_csv)) %>%
  unnest(Data) %>%
  select(-File)
Insert a column with file name
We can use Map and create a new column with cbind to show the filename for each file.
Map(cbind, lapply(files, data.table::fread, sep=","), filename = files)
We can also use functions from the purrr package to do the same.
library(purrr)
map2(map(files, data.table::fread, sep=","), files, ~cbind(.x, filename = .y))
To use lapply, we can loop over the index of filenames instead and use transform to add a new column with the name of the file.
lapply(seq_along(files), function(x) transform(read.csv(files[x]), file = files[x]))
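The lapply/transform idea can be tried end to end without any packages. The sketch below writes two small CSV files to a temporary directory (the file names and contents are invented for the demo), tags each table with its file name, and stacks the results with do.call(rbind, ...).

```r
# Create two throwaway CSV files in a temporary directory (demo data only)
tmp <- tempfile("csv_demo_")
dir.create(tmp)
write.csv(data.frame(x = 1:2), file.path(tmp, "a.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5), file.path(tmp, "b.csv"), row.names = FALSE)

files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read each file and add a column holding the file name without its directory
tables <- lapply(files, function(f) transform(read.csv(f), file = basename(f)))

# Stack the per-file tables into one data frame
combined <- do.call(rbind, tables)
print(combined)
```

Looping over the file paths directly, rather than over seq_along(files), keeps the function body a little shorter; both forms should give the same result.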
Add new column name to a list of data frames from a part of the file name using lapply
Name the list of filenames using setNames(), then use the .id argument in bind_rows(), which adds a column containing list names.
library(tidyverse)
library(readxl)
# pattern is a regular expression, so match a literal ".xlsx" at the end of the name
files <- list.files(path = "Users/Desktop/week", pattern = "\\.xlsx$", full.names = TRUE) %>%
  setNames(nm = .) %>%
  lapply(read_excel, sheet = 4, skip = 39) %>%
  bind_rows(.id = "Week") %>%
  mutate(Week = str_extract(Week, "wk\\d+"))
You could also combine the iteration and row-binding steps using purrr::map_dfr():
files <- list.files(path = "Users/Desktop/week", pattern = "\\.xlsx$", full.names = TRUE) %>%
  setNames(nm = .) %>%
  map_dfr(read_excel, sheet = 4, skip = 39, .id = "Week") %>%
  mutate(Week = str_extract(Week, "wk\\d+"))
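To check what the "wk\\d+" pattern pulls out of a path before running it on real files, a base R sketch like the following may help. The file names are hypothetical, since the actual naming scheme is not shown here. Note that regmatches() drops non-matching inputs, whereas str_extract() returns NA for them.

```r
# Hypothetical file paths; the real names are an assumption
paths <- c("Users/Desktop/week/sales_wk1.xlsx",
           "Users/Desktop/week/sales_wk12.xlsx")

# Extract the first "wk" followed by one or more digits from each path
week <- regmatches(paths, regexpr("wk\\d+", paths))
print(week)
```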
Read multiple .txt files and add new column identifying file name in R
This should work if your read.table command is correct:
myData_list <- lapply(files, function(x) {
  # Return NULL for files that cannot be read; rbindlist skips NULL entries
  out <- tryCatch(read.table(x, header = FALSE, sep = ','), error = function(e) NULL)
  if (!is.null(out)) {
    out$source_file <- x
  }
  return(out)
})
myData <- data.table::rbindlist(myData_list)
In the past I found that you can spare yourself a lot of headache using data.table::fread instead of read.table. So you could consider this:
myData_list <- lapply(files, function(x) {
  out <- data.table::fread(x, header = FALSE)
  out$source_file <- x
  return(out)
})
myData <- data.table::rbindlist(myData_list)
You can add the tryCatch part back if necessary. Depending on how the files vector looks, basename() might be interesting to use on the column source_file.
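As a rough illustration of how the tryCatch and basename() pieces fit together, here is a base R sketch. The temporary files, their contents, and the deliberately missing third file are all invented for the demo; read_one is a hypothetical helper name, and do.call(rbind, ...) stands in for data.table::rbindlist so the example has no package dependencies.

```r
# Two readable demo files plus one path that does not exist
tmp <- tempfile("txt_demo_")
dir.create(tmp)
good <- file.path(tmp, c("a.txt", "b.txt"))
writeLines(c("1,2", "3,4"), good[1])
writeLines("5,6", good[2])
files <- c(good, file.path(tmp, "missing.txt"))

# Hypothetical helper: read one file, return NULL if reading fails
read_one <- function(x) {
  out <- tryCatch(
    suppressWarnings(read.table(x, header = FALSE, sep = ",")),
    error = function(e) NULL
  )
  if (!is.null(out)) {
    out$source_file <- basename(x)  # keep only the file name, not the full path
  }
  out
}

# Drop failed reads, then stack the remaining tables
parts <- Filter(Negate(is.null), lapply(files, read_one))
myData <- do.call(rbind, parts)
print(myData)
```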
Add filename as column header when combining multiple files
You can do the following to add column names to TEST10. This assumes the column name you want for the first column is files3[1]:
colnames(TEST10) <- c(files3[1], files3)
In case you want to keep the name of the first column as is, then we add the desired column names before binding WAVELENGTH with TEST9.
colnames(TEST9) <- files3
TEST10 <- cbind(WAVELENGTH, TEST9)
Then you can write to a csv as usual, keeping the column names as headers in the resulting file.
write.csv(TEST10, file = "TEST10.csv", row.names = FALSE)
Assign 'filename' column to dataframe based on row ID
I don't have data to test this on, but you can try the following:
library(dplyr)
library(rvest)
mydata <- sapply(filelist, function(x) {
  # Parse the page and keep the second table found in it
  read_html(x) %>% rvest::html_table(fill = TRUE) %>%
    dplyr::nth(2)
}, simplify = FALSE)
mydata <- bind_rows(mydata, .id = 'company')
mydata$company <- sub('.*_(\\w+)_\\w+', '\\1', mydata$company)
We used sapply with simplify = FALSE to get a named list with filelist as names. When we use bind_rows with .id, each name is assigned to a new column, company. Using a regex, we then extract the relevant part of the file name from that column.