How to join data from 2 different csv-files in R?
Use read.csv and then merge.
Load the two csv files into R. (Don't forget to make sure their common variables share the same name!).
df1 <- read.csv(dat1, header = TRUE)
df2 <- read.csv(dat2, header = TRUE)
Merge the data frames by their shared variables and add the argument all.x=TRUE (this is not the default; all.x defaults to FALSE) to ensure all rows are kept from the data frame containing species.
merge(df1, df2, by = c('transect_id', 'year'), all.x = TRUE)
To see this in action using test data:
test <- data.frame(sp = rep(letters[1:10], 2), t = rep(1:3, length.out = 20), y = rep(2000:2008, length.out = 20), AUC = 1:20)
test2 <- data.frame(t = rep(1:3, length.out = 9), y = 2000:2008, ppt = 1:9, temp = 11:19)
merge(test, test2, by = c('t', 'y'), all.x = TRUE)
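To make the effect of all.x easier to see, here is a minimal sketch that runs the same merge with and without it. The data frames species and weather and their values are made up for illustration; only merge() and its by and all.x arguments come from the answer above.

```r
# Hypothetical toy data: three species records, weather for only two transect-years
species <- data.frame(transect_id = c(1, 1, 2),
                      year        = c(2000, 2001, 2000),
                      sp          = c("a", "b", "c"))
weather <- data.frame(transect_id = c(1, 2),
                      year        = c(2000, 2000),
                      ppt         = c(10, 20))

# Without all.x = TRUE: species rows with no match in `weather` are dropped
inner <- merge(species, weather, by = c("transect_id", "year"))

# With all.x = TRUE: every row of `species` is kept; unmatched weather values are NA
left <- merge(species, weather, by = c("transect_id", "year"), all.x = TRUE)

nrow(inner)           # 2
nrow(left)            # 3
sum(is.na(left$ppt))  # 1
```

Comparing nrow() before and after the merge is a quick way to check whether rows were lost.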
How to join two csv files in R and match the data from both tables?
Try this:
final_csv = merge(csv_1, csv_2, by = c("Show.Name", "Episode.Name"))
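A small sketch of what this merge does, assuming the CSV headers were originally "Show Name" and "Episode Name" (read.csv turns spaces into dots by default, which would explain the Show.Name and Episode.Name column names). The rows below are invented for illustration.

```r
# Hypothetical contents of the two files, read from inline text instead of disk
csv_1 <- read.csv(text = "Show Name,Episode Name,Rating\nA,Pilot,8.1\nA,Finale,9.0")
csv_2 <- read.csv(text = "Show Name,Episode Name,Viewers\nA,Pilot,1200\nB,Pilot,300")

# read.csv has replaced the spaces in the headers with dots
names(csv_1)  # "Show.Name" "Episode.Name" "Rating"

# Only rows whose show AND episode appear in both tables are kept
final_csv <- merge(csv_1, csv_2, by = c("Show.Name", "Episode.Name"))
nrow(final_csv)  # 1
```

If unmatched rows should be kept as well, merge() also accepts all = TRUE.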
full join of multiple csv files in R
The syntax of map is map(list, function). read_csv2(paste("~/Documents/data", files, sep = "/")) isn't a function; it's a call that tries to run read_csv2 on all the files at once, but read_csv2 isn't vectorized. Change to purrr-style lambda syntax as below. I'd also strongly recommend specifying the by columns in your join, especially a full join, to make sure you're getting what you think you are and the result doesn't blow up in size.
library(tidyverse)  # provides map, reduce, read_csv2 and full_join
files %>%
  map(~ read_csv2(paste("~/Documents/data", .x, sep = "/"))) %>%
  reduce(full_join, by = c("name", "extension"))
If you need to debug more, give yourself a small example and do it one line at a time. Say, start with files2 <- files[1:2] and just work on reading in the first two files. When that works, move on to the join.
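The step-by-step approach could look something like the sketch below. It is self-contained, so it writes two tiny semicolon-delimited files to a temporary folder. The folder, file names and file contents are all hypothetical; only the name and extension join columns come from the answer.

```r
library(purrr)
library(readr)
library(dplyr)

# Hypothetical data directory with two small semicolon-delimited files
data_dir <- tempfile("data")
dir.create(data_dir)
writeLines(c("name;extension;size", "a;txt;1", "b;csv;2"),
           file.path(data_dir, "f1.csv"))
writeLines(c("name;extension;owner", "a;txt;ann", "c;pdf;bob"),
           file.path(data_dir, "f2.csv"))

files <- list.files(data_dir)

# Step 1: read only the first two files and inspect what came back
files2 <- files[1:2]
tables <- map(files2, ~ read_csv2(file.path(data_dir, .x)))
str(tables, max.level = 1)

# Step 2: once reading works, add the join with explicit key columns
joined <- reduce(tables, full_join, by = c("name", "extension"))
joined
```

Checking nrow(joined) against the inputs at this stage makes it easier to spot a join that multiplies rows unexpectedly.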
Merging CSV files of the same names from different folders into one file
The problem can be split into parts:
- Identify all files in all subfolders of the working directory by using list.files(..., recursive = TRUE).
- Keep only the csv files.
- Import them all into R, for example by mapping read.csv to all paths.
- Join everything into a single dataframe, for example with reduce and bind_rows (assuming that all csvs have the same structure).
- Split this single dataframe according to station code, for example with group_split().
- Write these split dataframes to csv, for example by mapping write.csv.
This way you can avoid using for loops.
library(here)
library(stringr)
library(purrr)
library(dplyr)
# Identify all files
filenames <- list.files(here(), recursive = TRUE, full.names = TRUE)
# Limit to csv files
joined <- filenames[str_detect(filenames, "\\.csv$")] |>
# Read them all in
map(read.csv) |>
# Join them in
reduce(bind_rows)
# Split into one dataframe per station
split_df <- joined |> group_split(station_code)
# Save each dataframe to a separate csv
# walk() is used instead of map() because write.csv is called only for its side effect
walk(split_df, function(df) {
  write.csv(df,
            paste0(df$station_code[1], "_combined.csv"),
            row.names = FALSE)
})
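As a possible variation on the first two steps, list.files() can filter by extension itself through its pattern argument, which takes a regular expression. The sketch below uses base R only. The folder layout, file names and values are invented for illustration.

```r
# Hypothetical folder layout created in a temporary directory
root <- tempfile("stations")
dir.create(file.path(root, "2020"), recursive = TRUE)
dir.create(file.path(root, "2021"))
write.csv(data.frame(station_code = "S1", value = 1),
          file.path(root, "2020", "S1.csv"), row.names = FALSE)
write.csv(data.frame(station_code = "S1", value = 2),
          file.path(root, "2021", "S1.csv"), row.names = FALSE)
writeLines("not a csv", file.path(root, "2020", "notes.txt"))

# "\\.csv$" matches a literal ".csv" at the end of the file name only
csv_paths <- list.files(root, pattern = "\\.csv$",
                        recursive = TRUE, full.names = TRUE)

length(csv_paths)    # 2
basename(csv_paths)  # "S1.csv" "S1.csv"
```

Escaping the dot and anchoring with $ keeps out names such as "my.csv.bak" or "xcsv_notes.txt", which a bare ".csv" pattern would also match.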
Combining multiple csvs with different column names to unique column names
You can try this code -
library(tidyverse)
# directory() is assumed to return a data frame with a `path` column listing the files
directory() %>%
  filter(endsWith(path, ".csv")) %>%
  pull(path) %>%
  map_df(~ {
    .x %>%
      read_csv() %>%
      mutate(Ori_code = names(.)[1],
             Ori_desc = names(.)[2],
             across(everything(), as.character)) %>%
      rename_with(~ c('Gen_code', 'Gen_Desc'), 1:2)
  }) -> result
result
The code inside map_df is what repeats for each file:
- Read the csv.
- Create two new columns, Ori_code and Ori_desc, which hold the names of the first two columns.
- Convert all the columns to character, since columns with mixed data types cannot be merged together into one dataset.
- Rename the first two columns to c('Gen_code', 'Gen_Desc').
- Since we are using map_df, it will combine all the files together into one result.
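To illustrate why the conversion to character matters, here is a minimal sketch with two made-up tibbles standing in for two files. The same column is numeric in one and text in the other; the values are hypothetical.

```r
library(dplyr)

# Hypothetical file contents: Gen_code is numeric in `a` and text in `b`
a <- tibble(Gen_code = c(1, 2))
b <- tibble(Gen_code = c("A1", "B2"))

# Binding directly fails because double and character columns cannot be combined
failed <- inherits(try(bind_rows(a, b), silent = TRUE), "try-error")
failed  # TRUE

# Converting every column to character first lets the bind succeed
combined <- bind_rows(
  mutate(a, across(everything(), as.character)),
  mutate(b, across(everything(), as.character))
)
combined
```

Columns can be converted back afterwards, for example with readr::type_convert(), if numeric types are needed in the combined result.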
merge multiple .csv files - R
You can use:
data_csv <- do.call(rbind, lapply(myfiles, read.csv, sep = ";"))
Or with purrr's map_df:
data_csv <- purrr::map_df(myfiles, read.csv, sep = ";")
If there are a lot of files, you can use data.table functions.
library(data.table)
data_csv <- rbindlist(lapply(myfiles, fread))
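For a quick check of the base R variant, the sketch below builds myfiles from two small semicolon-delimited files in a temporary folder. The folder, file names and contents are hypothetical. Note that rbind expects every file to have the same columns.

```r
# Hypothetical input: two semicolon-delimited files with identical columns
tmp <- tempfile("csvs")
dir.create(tmp)
writeLines(c("id;value", "1;10", "2;20"), file.path(tmp, "a.csv"))
writeLines(c("id;value", "3;30"), file.path(tmp, "b.csv"))

myfiles <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read each file into a data frame, then stack the data frames row-wise
data_csv <- do.call(rbind, lapply(myfiles, read.csv, sep = ";"))

dim(data_csv)        # 3 2
sum(data_csv$value)  # 60
```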