Using Awk to Align Columns in Text File

Using awk to align columns in text file?

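A trick for right-aligning with column is to wrap it in rev: the first rev reverses every line, column -t left-aligns the reversed fields, and the second rev restores the text, so the padding ends up on the left of each field. The header line is printed separately with head so that it is not reformatted: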

$ head -1 file; tail -n+2 file | rev | column -t | rev
testing speed of encryption
test   0   (64  bit  key,    16  byte  blocks):  2250265  operations  in  1  seconds  (36004240  bytes)
test   1  (128  bit  key,    64  byte  blocks):   879149  operations  in  1  seconds  (56265536  bytes)
test   2  (128  bit  key,   256  byte  blocks):   258978  operations  in  1  seconds  (66298368  bytes)
test   3  (128  bit  key,  1024  byte  blocks):    68218  operations  in  1  seconds  (69855232  bytes)
test   4  (128  bit  key,  8192  byte  blocks):     8614  operations  in  1  seconds  (70565888  bytes)
test  10  (256  bit  key,    16  byte  blocks):  1790881  operations  in  1  seconds   (3654096  bytes)
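
Where rev or column is not available, a similar right-aligned layout can be sketched in Python. This is only an illustration and not part of the answer above: right_align is a hypothetical helper name, fields are assumed to be separated by whitespace, and every column is padded to its widest entry.

```python
def right_align(lines, gap="  "):
    """Right-align whitespace-separated fields, column by column."""
    rows = [line.split() for line in lines]
    ncols = max((len(r) for r in rows), default=0)

    # First pass: find the widest cell in each column.
    widths = [0] * ncols
    for r in rows:
        for i, cell in enumerate(r):
            widths[i] = max(widths[i], len(cell))

    # Second pass: pad each cell on the left and join with a fixed gap.
    return [gap.join(cell.rjust(widths[i]) for i, cell in enumerate(r))
            for r in rows]


if __name__ == "__main__":
    sample = ["test 0 (64 bit key, 16 byte blocks): 2250265",
              "test 10 (256 bit key, 16 byte blocks): 1790881"]
    for out in right_align(sample):
        print(out)
```

Rows with fewer fields than the widest row are simply shorter; no attempt is made to anchor them to the right edge.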

Right Align Columns in Text File with Sed

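You can use printf with an explicit width for each field. A leading - in the width (as in %-15s) left-aligns the field; without it the field is right-aligned: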

awk '{printf "%-15s%3d%10s%2s%15s    %-5d\n", $1, $2, $3, $4, $5, $6}' file
apple            1    33.413 C            cat    10
banana           2    21.564 B          horse    356
cherry           3    43.223 D            cow    32
pear             4    26.432 A           goat    22
raspberry        5    72.639 C          eagle    4
watermelon       6    54.436 A            fox    976
pumpkin          7    42.654 B          mouse    1
peanut           8    36.451 B            dog    56
orange           9    57.333 C       elephant    32
coconut         10    10.445 A           frog    3
blueberry       11    46.435 B          camel    446

Feel free to adjust widths to tweak the output.
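
If you would rather not tune the widths by hand, they can be measured from the data. The following is a minimal Python sketch of that idea, not the awk answer itself: build_format is a hypothetical helper, the alignment string ("l" or "r" per column) is an assumption of this example, and all fields are treated as strings.

```python
def build_format(rows, align):
    """Build a printf-style format string sized to the widest cell per column.

    rows  -- list of rows, each a list of strings
    align -- one character per column: "l" (left) or "r" (right)
    """
    parts = []
    for i, a in enumerate(align):
        width = max(len(str(r[i])) for r in rows)
        flag = "-" if a == "l" else ""   # "-" means left-align, as in awk's printf
        parts.append("%" + flag + str(width) + "s")
    return " ".join(parts)


if __name__ == "__main__":
    lines = ["apple 1 33.413 C cat 10",
             "watermelon 6 54.436 A fox 976"]
    rows = [line.split() for line in lines]
    fmt = build_format(rows, "lrrlrr")
    for r in rows:
        print(fmt % tuple(r))
```

The resulting string uses the same %-Ns / %Ns notation as the awk command above, so the measured widths can be copied back into the awk format if needed.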

How to align text in a file to look like a table in bash based on pattern text?

It looks like the fields can be split on runs of multiple spaces, so you can try FS = " *\047 *|  +" (\047 is the octal escape for a single quote). With this separator, your expected lines (using NR==1 as the template) split into the eXXX columns (from $2 to $(NF-2)) and, if it exists, a regular column at $(NF-1). Both $1 and $NF are always empty.

$ cat t17.1.awk
BEGIN{ FS = " *\047 *|  +"; OFS = "\t"; }

# on the first line, set up the total N = NF
# the keys and value lengths for the 'eXXX' cols 
# to sort and format fields for all rows
NR == 1 {
    N = NF
    for (i=2; i < N-1; i++) {
        n1 = split($i, a, " ")
        e_cols[i] = a[n1]
        e_lens[i] = length($i)
    }
    # the field-length of the regular column which is non eXXX-cols
    len_last = length($(NF-1))
}

{
    printf "\047 "
    # hash the e-key for field from '2' to 'NF-1'
    # include NF-1 in case the last regular column is missing
    for (i=2; i < NF; i++) {
        n1 = split($i, a, " ")
        hash[a[n1]] = $i
    }

    # print the eXXX-cols based on the order as in NR==1
    for (i=2; i < N-1; i++) {
        printf("%*s%s", e_lens[i], hash[e_cols[i]], OFS)
    }

    # print the regular column at $(NF-1) or EMPTY if it is an eXXX-cols
    printf("%*s\047\n", len_last, match($(NF-1),/ e[0-9]+$/)?"":$(NF-1))

    # reset the hash
    delete hash
}

Run the above script and you will get the following result. (Note: I appended one extra row in which an eXXX column, + 14411.7647 e123, is the last field before the trailing '.)

$ awk -f t17.1.awk file.txt 
' 14411.7647 e0 - 2647.0588 e3  + 7352.9412 e12 + 14411.7647 e123       21828.2063'
'               - 2647.0588 e3  + 7352.9412 e12                          7814.9002'
' 14411.7647 e0                                 + 14411.7647 e123       20381.3131'
' 14411.7647 e0                                 + 14411.7647 e123       20381.3131'
'     0.0000 e0                                     + 0.0000 e123       1.9293e-12'
'                                                                       14411.7647'
'                                               + 14411.7647 e123                 '

Note:

You might need gawk for "%*s" to work in printf(). If it does not work, try a fixed width instead, for example: printf("%18s%s", hash[e_cols[i]], OFS)
Some values in the e-cols might be longer than the corresponding value at NR==1. To fix this, you can manually specify an array of lengths or just use a fixed width.
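
One way to avoid depending on the widths found at NR==1 is to scan all rows first and keep the longest value seen for each e-key. The sketch below shows that two-pass idea in Python rather than awk; key_widths and format_row are hypothetical names, and each row is assumed to be already split into terms whose last word is the e-key.

```python
def key_widths(rows):
    """First pass: longest term seen for each e-key across all rows."""
    widths = {}
    for terms in rows:
        for term in terms:
            key = term.split()[-1]            # e.g. "e3" from "- 2647.0588 e3"
            widths[key] = max(widths.get(key, 0), len(term))
    return widths


def format_row(terms, keys, widths):
    """Second pass: print terms in a fixed key order, blank where missing."""
    by_key = {t.split()[-1]: t for t in terms}
    # "%*s" takes the width from the argument list, like printf's %*s.
    return " ".join("%*s" % (widths[k], by_key.get(k, "")) for k in keys)


if __name__ == "__main__":
    rows = [["14411.7647 e0", "- 2647.0588 e3"],
            ["0.0000 e0"]]
    widths = key_widths(rows)
    for terms in rows:
        print("'" + format_row(terms, ["e0", "e3"], widths) + "'")
```

The key order is passed in explicitly here; in the awk script it comes from the order of the columns on the first line.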

How to align text file with awk in python?

You can use formatted output in Python for this list. We just need to split each line on runs of two or more spaces to get the individual fields.

import re

dihedrals=['na-2e-na-cd   4    1.200       180.000           2.000', 'Pd-2e-na-cd   4    1.200       180.000           2.000', 'Pd-2e-na-ca  4    1.200       180.000           2.000', 'Pd-4n-na-hn   4    4.800         0.000           2.000', 'na-4n-cc-cc   4    4.200       180.000           2.000', 'na-2e-na-ca   4    1.200       180.000           2.000', 'Pd-2e-na-ca   4    1.200       180.000           2.000', 'cc-4n-na-hn   4    4.800         0.000           2.000', 'Pd-4n-na-cd   4    4.800         0.000           2.000', 'Pd-2e-na-cc   4    1.200       180.000           2.000', 'X -4n-na-X   2    3.400       180.000           2.000', 'Pd-4n-cc-h4   4    4.200       180.000           2.000', 'Pd-4n-cc-cc   4    4.200       180.000           2.000', 'na-2e-na-cd  4    1.200       180.000           2.000', 'na-2e-na-cc  4    1.200       180.000           2.000', 'cc-4n-na-cd   4    4.800         0.000           2.000', 'na-2e-na-ca  4    1.200       180.000           2.000', 'Pd-2e-na-cc  4    1.200       180.000           2.000', 'na-2e-na-cc   4    1.200       180.000           2.000', 'Pd-2e-na-cd  4    1.200       180.000           2.000', 'na-4n-cc-h4   4    4.200       180.000           2.000']
for line in dihedrals:
    # split on 2+ spaces so that 'X -4n-na-X' stays a single field
    a = re.split(' {2,}', line)
    print("%-11s  %2s   %8s   %12s  %12s" % (a[0], a[1], a[2], a[3], a[4]))

Output:

na-2e-na-cd   4      1.200        180.000         2.000
Pd-2e-na-cd   4      1.200        180.000         2.000
Pd-2e-na-ca   4      1.200        180.000         2.000
Pd-4n-na-hn   4      4.800          0.000         2.000
na-4n-cc-cc   4      4.200        180.000         2.000
na-2e-na-ca   4      1.200        180.000         2.000
Pd-2e-na-ca   4      1.200        180.000         2.000
cc-4n-na-hn   4      4.800          0.000         2.000
Pd-4n-na-cd   4      4.800          0.000         2.000
Pd-2e-na-cc   4      1.200        180.000         2.000
X -4n-na-X    2      3.400        180.000         2.000
Pd-4n-cc-h4   4      4.200        180.000         2.000
Pd-4n-cc-cc   4      4.200        180.000         2.000
na-2e-na-cd   4      1.200        180.000         2.000
na-2e-na-cc   4      1.200        180.000         2.000
cc-4n-na-cd   4      4.800          0.000         2.000
na-2e-na-ca   4      1.200        180.000         2.000
Pd-2e-na-cc   4      1.200        180.000         2.000
na-2e-na-cc   4      1.200        180.000         2.000
Pd-2e-na-cd   4      1.200        180.000         2.000
na-4n-cc-h4   4      4.200        180.000         2.000

A GNU awk (gawk) solution, which reads the list literal on standard input and treats each quoted entry as a record, would be:

... |
awk -F ' {2,}' -v RS=', *|\\]' '
gsub(/dihedrals=\[|\047/, "") {
   printf( "%-11s  %2s   %8s   %12s  %12s\n", $1, $2, $3, $4, $5)
}'

AWK: new column not aligning properly

The text on some lines in your first column extends past the first tabstop and, on other lines, it doesn't.
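
To see why, it helps to work out where the text after the tab actually lands. The following Python sketch does this with str.expandtabs; the tab stop width of 8 is an assumption (a common terminal default), and next_tab_stop is a hypothetical helper name.

```python
def next_tab_stop(first_column, tabsize=8):
    """Return the 0-based screen column where text after one tab would start."""
    return len((first_column + "\t").expandtabs(tabsize))


if __name__ == "__main__":
    for value in ("192.168.50.25", "192.168.111.145", "Unknown"):
        print(value, "->", next_tab_stop(value))
```

With these inputs the two IP addresses push the next field to one tab stop and the shorter "Unknown" leaves it at an earlier one, which is exactly the misalignment described above.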

If you want things lined up in a visually nice way, try column -t:

$ awk '{print $0, "2018-10-22"}' file | column -t 
192.168.50.25    2018-10-22
192.168.111.145  2018-10-22
Unknown          2018-10-22

The above lines up columns based on any whitespace. If you want only tabs to separate columns (meaning that a column can include blanks), then try:

$ awk '{print $0, "2018-10-22"}' OFS='\t' file | column -s$'\t' -t
192.168.50.25    2018-10-22
192.168.111.145  2018-10-22
Unknown          2018-10-22
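
For comparison, the tab-only variant can be approximated in a few lines of Python. This is a sketch under stated assumptions, not a reimplementation of column: align_tsv is a hypothetical name, fields are split on single tab characters only, and columns are joined with two spaces.

```python
def align_tsv(lines, gap="  "):
    """Left-align tab-separated fields; blanks inside a field are preserved."""
    rows = [line.split("\t") for line in lines]
    ncols = max((len(r) for r in rows), default=0)

    # Widest cell per column.
    widths = [0] * ncols
    for r in rows:
        for i, cell in enumerate(r):
            widths[i] = max(widths[i], len(cell))

    # Pad every cell, then drop trailing padding on the last column.
    return [gap.join(cell.ljust(widths[i]) for i, cell in enumerate(r)).rstrip()
            for r in rows]


if __name__ == "__main__":
    data = ["192.168.50.25\t2018-10-22",
            "not known\t2018-10-22"]
    for out in align_tsv(data):
        print(out)
```

Because only tabs separate fields, a value such as "not known" stays in one column instead of being split in two.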
