How to Combine Two Variable Column-By-Column in Bash

paste -d '' <(echo "$VAR1") <(echo "$VAR2")
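As a minimal sketch of the command above, assuming bash (for process substitution) and two made-up variables that hold the same number of lines: paste pairs line N of the first input with line N of the second. The sketch uses `printf '%s\n'` rather than echo only so that unusual data is not interpreted as echo options, and it spells the empty delimiter as `\0`, which is the POSIX form that GNU paste also accepts.

```shell
# Hypothetical sample data: one value per line in each variable.
VAR1=$'a\nb\nc'
VAR2=$'1\n2\n3'

# '\0' means "empty delimiter": the two columns are glued together.
joined=$(paste -d '\0' <(printf '%s\n' "$VAR1") <(printf '%s\n' "$VAR2"))

# A one-character delimiter (here a space) separates the columns instead.
spaced=$(paste -d ' ' <(printf '%s\n' "$VAR1") <(printf '%s\n' "$VAR2"))

printf '%s\n' "$joined" "$spaced"
```

If the variables differ in length, paste still prints a line for every input line and leaves the missing side empty.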

Linux Shell, Awk : Merge 2 Variable Data by Column

Using awk you can do this:

awk 'NR==FNR {a[FNR]=$1; next} {print $0, a[FNR]}' <(echo "$rpart") <(echo "$lpart")

This prints:

"2017-07-03 13:39:5", "-39dB", "7c:e9:d3:f1:61:55" "qhy2"
"2017-07-03 13:39:5", "-39dB", "7c:e9:d3:f1:61:55" "qhy2"
"2017-07-03 13:39:5", "-39dB", "7c:e9:d3:f1:61:55" "Apple
"2017-07-03 13:39:5", "-37dB", "7c:e9:d3:f1:61:55" "qhy2"
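The two-pass idiom in that awk command can be tried on a small scale. The sketch below assumes bash, and the contents of lpart and rpart are invented stand-ins, not the original poster's data. While awk reads the first input, NR equals FNR, so it only stores field 1 of each line by line number; for the second input it prints each line followed by the stored value.

```shell
# Hypothetical sample data standing in for the real variables.
lpart=$'"t1", "-39dB"\n"t2", "-37dB"'
rpart=$'"qhy2"\n"Apple"'

# First input (NR==FNR): remember field 1 of every line.
# Second input: print the whole line, then the remembered value.
merged=$(awk 'NR==FNR {a[FNR]=$1; next} {print $0, a[FNR]}' \
    <(printf '%s\n' "$rpart") <(printf '%s\n' "$lpart"))

printf '%s\n' "$merged"
```

Because only `$1` is stored, a right-hand value that contains whitespace would be cut at the first space; storing `$0` instead would keep the whole line.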

How do I concatenate each line of 2 variables in bash?

paste -d' ' <(echo "$NUMS") <(echo "$TITLES")

Concatenation of two columns from the same file

For a generalized approach, define a function that prints one column and call it once per column:

$ f() { awk -v c="$1" '{print $c}' file; }; f 1; f 2

a
b
c
d
e
f

If the file is tab-delimited, cut (the inverse operation of paste) is perhaps simpler:

$ cut -f1 file.t; cut -f2 file.t

Create new column that merges two columns

Using awk you can add a column containing values of column 1 and 4 as:

awk '{print $4"_"$1, $0}' filename

The comma in the print statement inserts the value of the output field separator variable, OFS (a single space by default).

On piping the output to column -t:

mir_seq                                     seq                          name            freq  mir              start  end  mism  add    t5   t3     s5        s3        DB     ambiguity
hsa-miR-143-3p_TGAGAAGAAGCACTGTAGCTCTT      TGAGAAGAAGCACTGTAGCTCTT      seq_100006_x2   2     hsa-miR-143-3p   61     81   6AT   u-TT   0    0      AGTCTGAG  GCTCAGGA  miRNA  1
hsa-miR-10a-5p_GACCCTGTAGATCCGAATTTGTA      GACCCTGTAGATCCGAATTTGTA      seq_100012_x2   2     hsa-miR-10a-5p   22     43   1GT   u-A    0    u-G    TATATACC  TGTGTAAG  miRNA  1
hsa-miR-10a-5p_GACCCTGTAGATCCGAATTTGTG      GACCCTGTAGATCCGAATTTGTG      seq_100013_x35  35    hsa-miR-10a-5p   22     44   1GT   0      0    0      TATATACC  TGTGTAAG  miRNA  1
hsa-miR-1296-5p_TTAGGGCCCTGGCTCCATCT        TTAGGGCCCTGGCTCCATCT         seq_100019_x13  13    hsa-miR-1296-5p  16     35   0     0      0    u-CC   TGGGTTAG  CTCCTTTA  miRNA  1
hsa-miR-887-3p_GTGAACGGGCGCCATCCCGAGGCTT    GTGAACGGGCGCCATCCCGAGGCTT    seq_100029_x2   2     hsa-miR-887-3p   48     72   0     0      0    d-CTT  TGGAGTGA  GAGGCTTT  miRNA  1
hsa-miR-10a-5p_ACCCGGTAGATCCGAATTTGTG       ACCCGGTAGATCCGAATTTGTG       seq_10002_x5    5     hsa-miR-10a-5p   23     44   5GT   0      d-T  0      TATATACC  TGTGTAAG  miRNA  1
hsa-miR-191-5p_CAACGGAATCCCAAAAGCAGCTGAAAA  CAACGGAATCCCAAAAGCAGCTGAAAA  seq_100031_x3   3     hsa-miR-191-5p   16     39   24AT  u-AAA  0    d-T    CGGGCAAC  GCTGTTGT  miRNA  1
hsa-miR-454-3p_TAGTGCAATATTGCTTATAGGGTAT    TAGTGCAATATTGCTTATAGGGTAT    seq_100033_x2   2     hsa-miR-454-3p   64     86   0     u-AT   0    0      TGAGTAGT  GGGTTTTG  miRNA  1
hsa-miR-191-5p_CAACGGAATCCGAAAAGCAGCTG      CAACGGAATCCGAAAAGCAGCTG      seq_100037_x16  16    hsa-miR-191-5p   16     38   12GC  0      0    0      CGGGCAAC  GCTGTTGT  miRNA  1
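To illustrate how the comma in the print statement relates to OFS, here is a small sketch on invented four-field data (the sample variable is hypothetical). Setting OFS changes only the separator that the comma inserts; `$0` is printed as it was read.

```shell
# Hypothetical sample with four whitespace-separated fields per line.
sample=$'seq name freq mir\nTGAG seq_1 2 hsa-miR-143-3p'

# Default OFS: a single space between the new column and the original line.
default_out=$(printf '%s\n' "$sample" | awk '{print $4"_"$1, $0}')

# OFS set to a tab: only the separator produced by the comma changes.
tab_out=$(printf '%s\n' "$sample" | awk -v OFS='\t' '{print $4"_"$1, $0}')

printf '%s\n' "$default_out" "$tab_out"
```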

Since awk has no in-place editing option, you either need gawk for that or, with plain awk, write the output to a temporary file and then move or rename it over the original file.
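The temporary-file route can be sketched as follows. The file name and its contents are made up for illustration, and the sketch works in a scratch directory created with mktemp so that no real data is touched.

```shell
# Scratch directory and a hypothetical input file.
workdir=$(mktemp -d)
file="$workdir/sample.txt"
printf '%s\n' 'seq name freq mir' 'TGAG seq_1 2 hsa-miR-143-3p' > "$file"

# Write to a temporary file first; replace the original only if awk succeeded.
tmp=$(mktemp "$workdir/tmp.XXXXXX")
awk '{print $4"_"$1, $0}' "$file" > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
printf '%s\n' "$result"

# Clean up the scratch directory.
rm -rf "$workdir"
```

Chaining with `&&` means a failed awk run leaves the original file as it was.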

To run the command on multiple files:

for i in Miraligner_*.txt.mirna; do
    awk '{print $4"_"$1, $0}' "$i" | column -t;
done

If you are using gawk and want in-place editing, use gawk -i inplace (available in gawk 4.1.0 and later).

Using perl:

perl -ane 'print "$F[3]_$F[0] $_";' filename | column -t

To write the result back to the file, use the -i option:

perl -ane 'print "$F[3]_$F[0] $_";' -i filename

Separate all the input fields and the appended column (field) with \t:

perl -ane '$"="\t"; print "$F[3]_$F[0] @F\n";' -i filename

To store the output in the file itself in proper tabular form:

for i in Miraligner_*.txt.mirna; do
    awk '{print $4"_"$1, $0}' "$i" | column -t > temp && mv temp "$i";
done

This writes the output to your file aligned in proper columns, and it does not need any in-place editing option.

Thanks to @EdMorton for correcting my mistakes.

Unix: How to combine separate columns into one column

You can put string literals inside the awk print command.

Here's an example:

$ cat a
1 2 3 [AUTORESTART] Mar 17 21:21:32 GMT 2022
$ cat a | awk '{print $4 "," $6 " " $7 " " $8 " " $9 " " $10}'
[AUTORESTART],17 21:21:32 GMT 2022

This prints the 4th column, then a literal comma, then the 6th column, then a literal space, and so on through the 10th column (which is empty here, because the sample line has only nine fields).

You can then redirect the output to a CSV file:

$ cat a | awk '{print $4 "," $6 " " $7 " " $8 " " $9 " " $10}' > mycsv.csv
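Because `$10` is empty for the sample line, the command above ends each output line with a trailing space. As a hedged alternative, the same line can be built with awk's printf, which names exactly the fields that exist. The sketch reuses the sample line from the example; the variable names are only for illustration.

```shell
# The sample line from the example above, held in a variable.
line='1 2 3 [AUTORESTART] Mar 17 21:21:32 GMT 2022'

# printf gives full control over separators and avoids the trailing space.
csv=$(printf '%s\n' "$line" |
    awk '{printf "%s,%s %s %s %s\n", $4, $6, $7, $8, $9}')

printf '%s\n' "$csv"
```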

Combine multiple grep variables in one column-wise file

In this answer the variable names are shortened to ini and align.

First, we extract the sample name and count from grep's output. Since we have to do this multiple times, we define the function

e() { sed -E 's,^.*/(.*)_R1.*:(.*)$,\1\t\2,'; }

Then we join the extracted data into one file. Lines with the same sample name will be combined. Note that join expects both inputs to be sorted on the sample name.

join -t $'\t' <(e <<< "$ini") <(e <<< "$align")
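A small end-to-end sketch of this step, assuming bash and GNU sed (for `\t` in the replacement). The paths and counts in ini and align are invented, shaped like `grep -c` output with one `path:count` line per sample. An explicit sort is added because join expects both inputs sorted on the join field.

```shell
# Same helper as above: keep the sample name and the count, tab-separated.
e() { sed -E 's,^.*/(.*)_R1.*:(.*)$,\1\t\2,'; }

# Hypothetical inputs; note that ini is deliberately out of order.
ini=$'reads/A2_R1.fastq:14801\nreads/A1_R1.fastq:13175'
align=$'aligned/A1_R1.fastq:12589\naligned/A2_R1.fastq:13934'

# Sort each extracted list on the sample name, then join on it.
table=$(join -t $'\t' \
    <(e <<< "$ini"   | sort -t $'\t' -k1,1) \
    <(e <<< "$align" | sort -t $'\t' -k1,1))

printf '%s\n' "$table"
```

If the grep output is already in sample-name order, the sort calls are redundant but harmless.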

Now we nearly have the expected output. We only have to add the header and draw lines for the table.

join ... | column -to " | " -N Sample,ini,align

This will print

Sample                                  | ini   | align
V3_F357_N_V4_R805_1_A1_bach1_GTATCGTCGT | 13175 | 12589
V3_F357_N_V4_R805_1_A2_bach2_GAGTGATCGT | 14801 | 13934
V3_F357_N_V4_R805_1_A3_bach3_TGAGCGTGCT | 13475 | 12981
V3_F357_N_V4_R805_1_A4_bach4_TGTGTGCATG | 13424 | 12896
V3_F357_N_V4_R805_1_A5_bach5_TGTGCTCGCA | 12053 | 11617

Adding a horizontal line after the header is left as an exercise for the reader :)

This approach also works with more than two number columns. The join and -N parts have to be extended. join can only work with two files, requiring us to use an unwieldy workaround ...

e() { sed -E 's,^.*/(.*)_R1.*:(.*)$,\1\t\2,'; }
join -t $'\t' <(e <<< "$var1") <(e <<< "$var2") |
join -t $'\t' - <(e <<< "$var3") | ... | join -t $'\t' - <(e <<< "$varN") |
column -to " | " -N Sample,Col1,Col2,...,ColN

... so it would be easier to add another helper function

e() { sed -E 's,^.*/(.*)_R1.*:(.*)$,\1\t\2,'; }
j2() { join -t $'\t' <(e <<< "$1") <(e <<< "$2"); }
j() { join -t $'\t' - <(e <<< "$1"); }
j2 "$var1" "$var2" | j "$var3" | ... | j "$varN" |
column -to " | " -N Sample,Col1,Col2,...,ColN

Alternatively, if all inputs contain the same samples in the same order, join can be replaced with one single paste command.
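A possible shape for that paste variant, again assuming bash and GNU sed and using invented inputs: the sample name and count are kept from the first input, and only the count column is taken from each further input with cut.

```shell
# Same helper as above: keep the sample name and the count, tab-separated.
e() { sed -E 's,^.*/(.*)_R1.*:(.*)$,\1\t\2,'; }

# Hypothetical inputs listing the same samples in the same order.
ini=$'reads/A1_R1.fastq:13175\nreads/A2_R1.fastq:14801'
align=$'aligned/A1_R1.fastq:12589\naligned/A2_R1.fastq:13934'

# Name and count from the first input, count only from the second.
table=$(paste <(e <<< "$ini") <(e <<< "$align" | cut -f2))

printf '%s\n' "$table"
```

Unlike join, paste does no matching at all, so this is only safe when the order of the samples really is identical in every input.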
