Linux Combine Two Files by Column

$ awk -v OFS='\t' '
NR==1   { print $0, "Remark1", "Remark2"; next }
NR==FNR { a[$1]=$0; next }
$1 in a { print a[$1], $2, $3 }
' Test1.txt Test2.txt
ID     Name  Telephone  Remark1 Remark2
1      John     011     Test1   Test2
2      Sam      013     Test3   Test4
3      Jena     014     Test5   Test6
4      Peter    015     Test7   Test8
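
To see how the pieces of the command above fit together, here is a minimal self-contained sketch. The contents of Test1.txt and Test2.txt are assumptions reconstructed from the output shown (first two rows only), and the temporary directory and the result variable are there only so the example can run anywhere.

```shell
# Hypothetical tab-separated inputs, reconstructed from the output above.
tmpdir=$(mktemp -d)
printf 'ID\tName\tTelephone\n1\tJohn\t011\n2\tSam\t013\n' > "$tmpdir/Test1.txt"
printf 'ID\tRemark1\tRemark2\n1\tTest1\tTest2\n2\tTest3\tTest4\n' > "$tmpdir/Test2.txt"

result=$(awk -v OFS='\t' '
NR==1   { print $0, "Remark1", "Remark2"; next }  # header line of the first file
NR==FNR { a[$1]=$0; next }                        # rest of the first file: store each line by ID
$1 in a { print a[$1], $2, $3 }                   # second file: append its columns to the stored line
' "$tmpdir/Test1.txt" "$tmpdir/Test2.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

NR==FNR is true only while awk reads the first file, so the third rule runs only for lines of the second file. The header of Test2.txt is skipped because its first field ("ID") was never stored in the array.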

Merge two files column-wise, interleaving the columns of the second file with the columns of the first file

You can use a loop in awk, for example:

paste file_A file_B | awk '{ 
    half = NF/2; 
    for(i = 1; i < half; i++)
    {
        printf("%s %s ", $i, $(i+half));
    }
    printf("%s %s\n", $half, $NF);
}'

The same interleaving with a while loop:

paste file_A file_B | awk '{ 
    i = 1; j = NF/2 + 1;
    while(j < NF)
    {
        printf("%s %s ", $i, $j);
        i++; j++;
    }
    printf("%s %s\n", $i, $j);
}'

Both versions assume that file_A and file_B have the same number of columns, so that the number of fields in awk's input is even.
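
As a rough illustration of the interleaving, the sketch below runs a slightly more compact variant of the loop on two small sample files. The file contents and the result variable are made up for the example, and the single for loop with a conditional separator is a stylistic choice of this sketch, not the original answer's code.

```shell
# Hypothetical sample files with three columns each.
tmpdir=$(mktemp -d)
printf 'a1 a2 a3\nb1 b2 b3\n' > "$tmpdir/file_A"
printf 'x1 x2 x3\ny1 y2 y3\n' > "$tmpdir/file_B"

result=$(paste "$tmpdir/file_A" "$tmpdir/file_B" | awk '{
    half = NF / 2
    # Print column i of file_A followed by column i of file_B;
    # end the line after the last pair instead of adding a space.
    for (i = 1; i <= half; i++)
        printf("%s %s%s", $i, $(i + half), (i < half ? " " : "\n"))
}')

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With these inputs the first output line is "a1 x1 a2 x2 a3 x3".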

How to merge two .txt files in Unix based on one common column

Thanks for adding your own attempts to solve the problem - it makes troubleshooting a lot easier.

This answer is a bit convoluted, but here is a potential solution (GNU join):

join -t $'\t' -1 2 -2 1 <(head -n 1 File1.txt && tail -n +2 File1.txt | sort -k2,2 ) <(head -n 1 File2.txt && tail -n +2 File2.txt | sort -k1,1)

#Sam_ID Sub_ID  v1  code    V3  V4
#2253734    1878372 SAMN06396112    20481   NA  DNA
#2275341    1884646 SAMN06432785    20483   NA  DNA
#2277481    1860945 SAMN06407597    20488   NA  DNA

Explanation:

join takes a single character as its separator, so you can't pass the two-character string "\t", but you can pass a literal tab with $'\t' (as far as I know)
-1 2 and -2 1 mean "for the first file, use the second field" and "for the second file, use the first field" as the join key
in each process substitution (<(...)), the file is sorted by the Sam_ID column, with the header excluded from the sort (per "Is there a way to ignore header lines in a UNIX sort?")
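
The header-preserving sort followed by the join can be sketched on a tiny data set as follows. The sample rows, the sort_body helper and the intermediate .sorted files are all hypothetical; temporary files are used here instead of <(...) only so the sketch also runs in shells without process substitution.

```shell
# Hypothetical tab-separated inputs; File1 is keyed on column 2, File2 on column 1.
tmpdir=$(mktemp -d)
tab=$(printf '\t')
printf 'Sub_ID\tSam_ID\tv1\ns1\t30\tx\ns2\t10\ty\n' > "$tmpdir/File1.txt"
printf 'Sam_ID\tcode\n10\tA\n30\tB\n' > "$tmpdir/File2.txt"

# Hypothetical helper: print the header unchanged, then the remaining
# lines sorted on the given key field. Usage: sort_body FILE KEYFIELD
sort_body() {
    head -n 1 "$1"
    tail -n +2 "$1" | sort -t "$tab" -k "$2,$2"
}

sort_body "$tmpdir/File1.txt" 2 > "$tmpdir/File1.sorted"
sort_body "$tmpdir/File2.txt" 1 > "$tmpdir/File2.sorted"

# Join field 2 of the first file with field 1 of the second file.
result=$(join -t "$tab" -1 2 -2 1 "$tmpdir/File1.sorted" "$tmpdir/File2.sorted")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The join field is printed first, followed by the remaining fields of the first file and then those of the second, which is why Sam_ID comes before Sub_ID in the output.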

Edit

To specify the order of the columns in the output (for example, to put Sub_ID before Sam_ID), you can use the -o option:

join -t $'\t' -1 2 -2 1 -o 1.1,1.2,1.3,2.2,2.3,2.4 <(head -n 1 File1.txt && tail -n +2 File1.txt | sort -k2,2 ) <(head -n 1 File2.txt && tail -n +2 File2.txt | sort -k1,1)

#Sub_ID Sam_ID  v1  code    V3  V4
#1878372    2253734 SAMN06396112    20481   NA  DNA
#1884646    2275341 SAMN06432785    20483   NA  DNA
#1860945    2277481 SAMN06407597    20488   NA  DNA

Merging two files with unequal lengths based on two keys in linux

Your approach is correct, but when printing you need to use A[$2,$3]. You are using A[$1,$2], which does not exist in array A, because the 1st and 2nd columns of file1 have to be compared with the 2nd and 3rd columns of file2. That is why only the current line of file2 is printed to your file3.

awk 'NR==FNR{a[$1,$2]=$3;next} (($2,$3) in a) {print $0, a[$2,$3]}' file1 file2

Also see "Why we shouldn't use variables in capital letters" (thanks to James for providing the link).
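
A small worked example may make the two-key lookup easier to follow. The file contents below are invented for illustration: file1 carries the key in columns 1 and 2, and file2 carries it in columns 2 and 3.

```shell
# Hypothetical inputs: file2 has one line (k3 c) with no match in file1.
tmpdir=$(mktemp -d)
printf 'k1 a 100\nk2 b 200\n' > "$tmpdir/file1"
printf 'x k1 a\ny k3 c\nz k2 b\n' > "$tmpdir/file2"

result=$(awk '
NR==FNR { a[$1,$2] = $3; next }            # file1: index by columns 1 and 2
(($2,$3) in a) { print $0, a[$2,$3] }      # file2: look up columns 2 and 3
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Only the lines of file2 whose key pair exists in file1 are printed, each with the stored third column of file1 appended.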

How to merge two CSV files with Linux column wise?

Use paste -d , to merge the two files and > to redirect the command output to another file:

$ paste -d , file1.csv file2.csv > output.csv

E.g.:

$ cat file1.csv
A,B

$ cat file2.csv
C,D

$ paste -d , file1.csv file2.csv > output.csv

$ cat output.csv
A,B,C,D

-d , tells paste to use , as the delimiter to join the columns.

> tells the shell to write the output of the paste command to the file output.csv
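
The example above uses one-line files. As a hedged sketch of what to expect when the files differ in length, the snippet below pastes a two-line file with a one-line file; the file contents are made up for illustration.

```shell
# Hypothetical inputs: file1.csv has two rows, file2.csv only one.
tmpdir=$(mktemp -d)
printf 'A,B\nE,F\n' > "$tmpdir/file1.csv"
printf 'C,D\n' > "$tmpdir/file2.csv"

paste -d , "$tmpdir/file1.csv" "$tmpdir/file2.csv" > "$tmpdir/output.csv"

# Once file2.csv runs out, paste treats it as empty, so the second row
# still gets the delimiter and ends with a trailing comma.
result=$(cat "$tmpdir/output.csv")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

paste matches lines purely by position, so this only produces meaningful rows when both files list their records in the same order.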

How to merge two files based on one column and print both matching and non-matching?

Assuming your real files are sorted like your samples are:

$ join -o 0,1.2,2.2 -e0 -a1 -a2 tmptest1.txt tmptest2.txt
aaa 231 222
bbb 132 0
ccc 111 0
ddd 0 132

If not sorted and using bash, zsh, ksh93 or another shell that understands <(command) redirection:

join -o 0,1.2,2.2 -e0 -a1 -a2 <(sort tmptest1.txt) <(sort tmptest2.txt)
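
For reference, here is a self-contained sketch of the same command. The input files are assumptions reconstructed from the output shown above and deliberately left unsorted; sorted copies are written to temporary files so the sketch does not depend on <(command) support.

```shell
# Hypothetical unsorted inputs, reconstructed from the sample output.
tmpdir=$(mktemp -d)
printf 'ccc 111\naaa 231\nbbb 132\n' > "$tmpdir/tmptest1.txt"
printf 'ddd 132\naaa 222\n' > "$tmpdir/tmptest2.txt"

sort "$tmpdir/tmptest1.txt" > "$tmpdir/sorted1.txt"
sort "$tmpdir/tmptest2.txt" > "$tmpdir/sorted2.txt"

# -a1 -a2     also print lines that have no partner in the other file
# -o 0,1.2,2.2  print the join key, then field 2 of each file
# -e0         fill fields that are missing from one side with 0
result=$(join -o 0,1.2,2.2 -e0 -a1 -a2 "$tmpdir/sorted1.txt" "$tmpdir/sorted2.txt")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```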

Combining two columns from different files by common strings

Using GNU awk:

awk 'NR==FNR { map[$1]=$2;next } { map1[$1]=$2 } END { PROCINFO["sorted_in"]="@ind_str_asc";for (i in map) { print i"\t"map[i]"\t"map1[i] } }' file-1 file2

Explanation:

awk 'NR==FNR { 
               map[$1]=$2;                                  # First file only: build an array called map, indexed by the first whitespace-separated field, with the second field as the value
               next 
             } 
             { 
               map1[$1]=$2                                  # Second file: build a second array called map1, indexed by the first field, with the second field as the value
             } 
         END { 
               PROCINFO["sorted_in"]="@ind_str_asc";         # Set the index ordering
               for (i in map) { 
                 print i"\t"map[i]"\t"map1[i]                # Loop through the map array and print the values along with the values in map1.
               } 
              }' file-1 file2
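
PROCINFO["sorted_in"] is specific to GNU awk. Where gawk is not available, one possible substitute (not part of the original answer) is to let awk print in arbitrary order and pipe the result through sort. The sample data below is invented, and LC_ALL=C is an assumption made so that the ordering is plain byte order.

```shell
# Hypothetical inputs: key in column 1, value in column 2.
tmpdir=$(mktemp -d)
printf 'pear 3\napple 1\n' > "$tmpdir/file-1"
printf 'apple red\npear green\n' > "$tmpdir/file2"

result=$(awk '
NR==FNR { map[$1] = $2; next }     # first file: key -> value
        { map1[$1] = $2 }          # second file: key -> value
END     { for (i in map) print i "\t" map[i] "\t" map1[i] }
' "$tmpdir/file-1" "$tmpdir/file2" | LC_ALL=C sort -t "$(printf '\t')" -k1,1)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

As in the gawk version, the loop runs over the keys of the first file only, so keys that appear only in the second file are not printed.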
