How can I merge two CSV files from command line?
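A basic merge would be:
cat a.csv <(tail -n +2 b.csv) > c.csv
This puts all of b.csv after a.csv.
Edit: I've added the <(tail -n +2 b.csv) process substitution. It skips the header line of b.csv. (The older tail +2 spelling is obsolete and not accepted by every tail implementation.)
Edit 2: a worked example: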
$ cat a.csv
hdr
a
b
c
$ cat b.csv
hdr
e
f
g
$ cat a.csv <(tail -n +2 b.csv)
hdr
a
b
c
e
f
g
IHTH
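The same idea extends to any number of files without process substitution. The following is a minimal sketch, assuming every file starts with exactly one header line; the sample data is the same as in the example above, and the scratch directory is only there to keep the demo self-contained.

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Sample inputs, same contents as a.csv and b.csv above.
printf 'hdr\na\nb\nc\n' > a.csv
printf 'hdr\ne\nf\ng\n' > b.csv

# FNR restarts at 1 for each file, NR never restarts.
# So "FNR == 1 && NR != 1" matches the first line of every file except the first one.
awk 'FNR == 1 && NR != 1 { next } { print }' a.csv b.csv > c.csv

cat c.csv
```

Because awk takes a list of files, the same command also works as awk '...' *.csv when more than two files need to be concatenated.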
Merge two csv files in bash
Use paste:
paste -d, f1.csv f2.csv > out.csv
To ignore last column of first file:
awk -F, 'NF-=1' OFS=, f1.csv | paste -d, - f2.csv > out.csv
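Decrementing NF works in gawk and mawk, but if that is a concern, sed can drop the last column as well. This is a sketch under the assumption that no field contains a quoted comma; the file contents are made up for illustration.

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data.
printf 'a,b,c\n1,2,3\n' > f1.csv
printf 'x,y\n8,9\n' > f2.csv

# Strip the last comma and everything after it on each line of f1.csv,
# then paste the result (read from stdin via "-") next to f2.csv.
sed 's/,[^,]*$//' f1.csv | paste -d, - f2.csv > out.csv

cat out.csv
```

With this input, out.csv contains a,b,x,y on the first line and 1,2,8,9 on the second.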
join two csv files with key value
Here's how to use join in bash:
{
echo "City, Tmin, Tmax, Date, Tmin1, Tmax1"
join -t, <(sort d01.csv) <(sed 1d d02.csv | sort)
} > d03.csv
cat d03.csv
City, Tmin, Tmax, Date, Tmin1, Tmax1
Barcelona, 19.5, 29.5, 20140916, 19.9, 28.5
Lleida, 16.5, 33.5 , 20140916, 17.5, 32.5
Tarragona, 20.4, 31.5 , 20140916, 21.4, 30.5
Note that join only outputs records whose key exists in both files. To get all records, use -a1 -a2 to include unpaired lines from both files, -o to list the fields you want, and -e to give a default value for the missing fields:
join -t, -a1 -a2 -o 0,1.2,1.3,2.2,2.3,2.4 -e '?' <(sort d01.csv) <(sed 1d d02.csv | sort)
Barcelona, 19.5, 29.5, 20140916, 19.9, 28.5
Girona, 17.2, 32.5,?,?,?
Lleida, 16.5, 33.5 , 20140916, 17.5, 32.5
Tarragona, 20.4, 31.5 , 20140916, 21.4, 30.5
Tortosa,?,?, 20140916, 20.5, 30.4
Vic, 17.5, 31.4,?,?,?
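For a version that can be run as is, here is a sketch with small made-up inputs (the original d01.csv and d02.csv are not shown, so the data below is hypothetical). It assumes both files have a header line, and it uses temporary sorted files instead of process substitution so that it also runs in a plain POSIX shell.

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Hypothetical inputs, each with a header line and unsorted rows.
printf 'City,Tmin,Tmax\nLleida,16.5,33.5\nBarcelona,19.5,29.5\nGirona,17.2,32.5\n' > d01.csv
printf 'City,Date,Tmin1\nTortosa,20140916,20.5\nBarcelona,20140916,19.9\n' > d02.csv

# join needs both inputs sorted on the key, using the same collation as join itself.
tail -n +2 d01.csv | LC_ALL=C sort -t, -k1,1 > d01.sorted
tail -n +2 d02.csv | LC_ALL=C sort -t, -k1,1 > d02.sorted

{
  echo 'City,Tmin,Tmax,Date,Tmin1'
  # -a1 -a2: keep unpaired lines from both files
  # -o:      output the key (0), then fields 2-3 of each file
  # -e '?':  placeholder for fields that are missing
  LC_ALL=C join -t, -a1 -a2 -e '?' -o 0,1.2,1.3,2.2,2.3 d01.sorted d02.sorted
} > d03.csv

cat d03.csv
```

Girona and Lleida appear with ? in the last two columns, and Tortosa appears with ? in the Tmin and Tmax columns.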
How to outer-join two CSV files, using shell script?
We suggest a gawk script (gawk is the default awk on many Linux distributions):
script.awk
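# First file: store column 2 and mark the second value as missing.
NR == FNR {
    valsStr = sprintf("%s,%s", $2, "na");
    rowsArr[$1] = valsStr;
}
# Second file, key already seen: keep the first value, fill in the second.
NR != FNR && $1 in rowsArr {
    split(rowsArr[$1], valsArr);
    valsStr = sprintf("%s,%s", valsArr[1], $2);
    rowsArr[$1] = valsStr;
    next;
}
# Second file, new key: mark the first value as missing.
NR != FNR {
    valsStr = sprintf("%s,%s", "na", $2);
    rowsArr[$1] = valsStr;
}
END {
    # Print the header row first, then every other row.
    printf("%s,%s\n", "label", rowsArr["label"]);
    for (rowName in rowsArr) {
        if (rowName == "label") continue;
        printf("%s,%s\n", rowName, rowsArr[rowName]);
    }
}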
Output (row order may vary, because awk's for-in traversal order is unspecified):
awk -F, -f script.awk input.{1,2}.txt
label,Part-A,Part-B
LMN,na,8
ABC,2,na
PQR,6,6
EFG,na,1
XYZ,3,4
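If a stable row order matters, the body can be sorted after the join. The sketch below is a compact variant of the same outer join, not the script above. The input files are reconstructed from the output shown and are therefore an assumption, as is the temporary file name merged.tmp.

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Inputs reconstructed from the output above (assumed, not from the original question).
printf 'label,Part-A\nABC,2\nPQR,6\nXYZ,3\n' > input.1.txt
printf 'label,Part-B\nEFG,1\nLMN,8\nPQR,6\nXYZ,4\n' > input.2.txt

# Collect column 2 of each file by key, then print every key with "na" for gaps.
awk -F, -v OFS=, '
  NR == FNR { a[$1] = $2; keys[$1] = 1; next }   # first file
            { b[$1] = $2; keys[$1] = 1 }         # second file
  END {
    for (k in keys)
      print k, ((k in a) ? a[k] : "na"), ((k in b) ? b[k] : "na")
  }
' input.1.txt input.2.txt > merged.tmp

# Header row first, then the remaining rows in sorted order.
{
  grep '^label,' merged.tmp
  grep -v '^label,' merged.tmp | LC_ALL=C sort
} > merged.csv

cat merged.csv
```

The rows are the same as in the output above, but always in the order ABC, EFG, LMN, PQR, XYZ after the label line.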
How to merge two CSV files with Linux column wise?
Use paste -d , to merge the two files and > to redirect the command output to another file:
$ paste -d , file1.csv file2.csv > output.csv
E.g.:
$ cat file1.csv
A,B
$ cat file2.csv
C,D
$ paste -d , file1.csv file2.csv > output.csv
$ cat output.csv
A,B,C,D
-d , tells paste to use , as the delimiter to join the columns.
> tells the shell to write the output of the paste command to the file output.csv.
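paste matches rows purely by position, so it silently produces ragged rows when the files have different numbers of lines. A small guard, sketched here with hypothetical data, checks the line counts first.

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data with the same number of lines in each file.
printf 'id,name\n1,ann\n2,bob\n' > file1.csv
printf 'score\n90\n85\n' > file2.csv

# Only merge when both files have the same number of lines.
if [ "$(wc -l < file1.csv)" -eq "$(wc -l < file2.csv)" ]; then
  paste -d , file1.csv file2.csv > output.csv
else
  echo "line counts differ, not merging" >&2
fi

cat output.csv
```

Here both files have three lines, so output.csv is written with the columns id,name,score.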
How to merge 2 CSV files based on filename
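Try this:
paste -d, 1234ABC.stats.csv 1234ABC.csv
To loop over multiple files in the local directory:
#!/bin/bash
for statsfile in *.stats.csv; do
    # Strip only the trailing .stats.csv, so a ".stats" elsewhere in the name is left alone.
    datafile="${statsfile%.stats.csv}.csv"
    paste -d, "$statsfile" "$datafile" > "new_${datafile}"
done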
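The loop assumes that every stats file has a partner and that the glob matches at least one file. A more defensive sketch, using made-up file names (9999XYZ is deliberately left without a partner), could look like this:

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample files; 9999XYZ.stats.csv has no matching data file.
printf 'n,mean\n3,1.5\n' > 1234ABC.stats.csv
printf 'id,val\n7,x\n'   > 1234ABC.csv
printf 'n,mean\n1,0\n'   > 9999XYZ.stats.csv

for statsfile in *.stats.csv; do
  # If nothing matches, the glob stays literal; skip it.
  [ -e "$statsfile" ] || continue

  # 1234ABC.stats.csv -> 1234ABC.csv
  datafile="${statsfile%.stats.csv}.csv"

  if [ -f "$datafile" ]; then
    paste -d, "$statsfile" "$datafile" > "new_${datafile}"
  else
    echo "skipping $statsfile: $datafile not found" >&2
  fi
done

cat new_1234ABC.csv
```

Only new_1234ABC.csv is created; the unpaired stats file is reported on stderr and skipped.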
BASH: Joining 2 CSV files based on common field name
The following join command should do the trick (--header is a GNU join option that treats the first line of each file as a header):
join --header -t',' -j 1 file_2.csv file_1.csv
Just make sure that your CSV files are sorted on the join field; having track_id as the first field in each file makes this easy.
Try the command on test data in both files first. Once you're satisfied that it does what you want, run it against the actual data and redirect its output to file_3.csv.
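A minimal sketch of the full workflow, assuming GNU join and made-up test data (the title and plays columns are hypothetical, as is the sort_body helper): sort everything below the header, then join.

```shell
# Work in a scratch directory so the demo does not touch real files.
cd "$(mktemp -d)" || exit 1

# Hypothetical test data, deliberately unsorted.
printf 'track_id,title\n3,c\n1,a\n2,b\n'    > file_1.csv
printf 'track_id,plays\n2,20\n3,30\n1,10\n' > file_2.csv

# Print the header unchanged, then the remaining lines sorted on field 1.
sort_body() {
  head -n 1 "$1"
  tail -n +2 "$1" | LC_ALL=C sort -t, -k1,1
}

sort_body file_1.csv > file_1.sorted.csv
sort_body file_2.csv > file_2.sorted.csv

# --header (GNU join) joins the two header lines without checking their order.
LC_ALL=C join --header -t, -j 1 file_2.sorted.csv file_1.sorted.csv > file_3.csv

cat file_3.csv
```

The result starts with track_id,plays,title, followed by one row per track_id in sorted order.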