Using Awk to Align Columns in Text File

Using awk to align columns in text file?

column -t left-aligns every column. A trick to right-align instead is to reverse each line with rev, align, and then reverse back:

$ head -1 file; tail -n+2 file | rev | column -t | rev
testing speed of encryption
test   0   (64  bit  key,    16  byte  blocks):  2250265  operations  in  1  seconds  (36004240  bytes)
test   1  (128  bit  key,    64  byte  blocks):   879149  operations  in  1  seconds  (56265536  bytes)
test   2  (128  bit  key,   256  byte  blocks):   258978  operations  in  1  seconds  (66298368  bytes)
test   3  (128  bit  key,  1024  byte  blocks):    68218  operations  in  1  seconds  (69855232  bytes)
test   4  (128  bit  key,  8192  byte  blocks):     8614  operations  in  1  seconds  (70565888  bytes)
test  10  (256  bit  key,    16  byte  blocks):  1790881  operations  in  1  seconds   (3654096  bytes)
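If rev or column is not available, the same effect can be approximated in awk alone. The following is only a sketch under assumptions: right_align is a made-up helper name, the sample input is invented, every data row is assumed to have the same number of fields, and the whole input is buffered in memory.

```shell
# right_align: keep the first line as-is, right-align every column of the rest.
# Hypothetical helper name; reads stdin or the files given as arguments.
right_align() {
    awk '
    NR == 1 { print; next }                 # header passes through untouched
    {
        rows++
        nf[rows] = NF
        for (i = 1; i <= NF; i++) {
            cell[rows, i] = $i
            if (length($i) > width[i]) width[i] = length($i)
        }
    }
    END {
        for (r = 1; r <= rows; r++) {
            line = ""
            for (i = 1; i <= nf[r]; i++)
                line = line (i > 1 ? "  " : "") sprintf("%" width[i] "s", cell[r, i])
            print line
        }
    }' "$@"
}

# Invented sample input, for illustration only
printf 'speed test\ntest 0 2250265 ops\ntest 10 8614 ops\n' | right_align
```

The two-space gap between columns mirrors what column -t prints by default.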

Right Align Columns in Text File with Sed

You can use printf with an explicit width for each field, like this:

awk '{printf "%-15s%3d%10s%2s%15s    %-5d\n", $1, $2, $3, $4, $5, $6}' file
apple            1    33.413 C            cat    10
banana           2    21.564 B          horse    356
cherry           3    43.223 D            cow    32
pear             4    26.432 A           goat    22
raspberry        5    72.639 C          eagle    4
watermelon       6    54.436 A            fox    976
pumpkin          7    42.654 B          mouse    1
peanut           8    36.451 B            dog    56
orange           9    57.333 C       elephant    32
coconut         10    10.445 A           frog    3
blueberry       11    46.435 B          camel    446

Feel free to adjust widths to tweak the output.
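If you would rather not tune the widths by hand, one option is to read the file twice and let awk measure the columns first. This is a rough sketch, not part of the answer above: auto_align is a hypothetical name, the sample rows are invented, and the rule "numeric columns go right, everything else goes left" is an assumption.

```shell
# auto_align FILE: pass 1 measures each column, pass 2 prints with those widths.
# Hypothetical helper; numeric columns are right-aligned, others left-aligned.
auto_align() {
    awk '
    NR == FNR {                              # first pass: measure
        for (i = 1; i <= NF; i++) {
            if (length($i) > width[i]) width[i] = length($i)
            if ($i !~ /^-?[0-9]+(\.[0-9]+)?$/) text[i] = 1
        }
        next
    }
    {                                        # second pass: print
        line = ""
        for (i = 1; i <= NF; i++) {
            flag = (i in text) ? "-" : ""
            line = line (i > 1 ? "  " : "") sprintf("%" flag width[i] "s", $i)
        }
        sub(/ +$/, "", line)                 # drop padding after the last column
        print line
    }' "$1" "$1"
}

# Invented sample data, for illustration only
tmp=$(mktemp)
printf 'apple 1 33.413 C\nwatermelon 10 4.5 A\n' > "$tmp"
auto_align "$tmp"
rm -f "$tmp"
```

Because the file name is passed twice, this needs a regular file rather than a pipe.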

How to align text in a file to look like a table in bash based on pattern text?

It looks like the fields are separated by runs of multiple spaces, so you can try FS = " *\047 *|  +" (a single quote with optional spaces around it, or two or more spaces). With this FS, the expected lines (modeled on NR==1) split into eXXX columns from $2 to $(NF-2), plus a regular column, if present, at $(NF-1). Both $1 and $NF are always empty.
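To see how such a separator splits a line, it can help to print every field. The sketch below assumes the second alternative of FS is meant to match two or more spaces; show_fields is a hypothetical helper and the sample line is made up, not taken from the original file.

```shell
# show_fields: print NF and each field as split by the FS discussed above.
# Hypothetical helper; reads lines from stdin.
show_fields() {
    awk 'BEGIN { FS = " *\047 *|  +" }
         { printf "NF=%d\n", NF
           for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

# Made-up sample line, for illustration only
printf "%s\n" "'  14411.7647 e0    - 2647.0588 e3    21828.2063'" | show_fields
```

For this sample, NF is 5: $1 and $5 are empty, $2 and $3 hold the eXXX columns, and $4 holds the regular column.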

$ cat t17.1.awk
BEGIN{ FS = " *\047 *|  +"; OFS = "\t"; }

# on the first line, record the total field count N = NF,
# plus the key and value length of each 'eXXX' column,
# used to order and format the fields of all rows
NR == 1 {
    N = NF
    for (i=2; i < N-1; i++) {
        n1 = split($i, a, " ")
        e_cols[i] = a[n1]
        e_lens[i] = length($i)
    }
    # the field length of the regular (non-eXXX) column
    len_last = length($(NF-1))
}

{
    printf "\047 "
    # hash the fields from 2 to NF-1 by their e-key;
    # NF-1 is included in case the last regular column is missing
    for (i=2; i < NF; i++) {
        n1 = split($i, a, " ")
        hash[a[n1]] = $i
    }

    # print the eXXX columns in the same order as in NR==1
    for (i=2; i < N-1; i++) {
        printf("%*s%s", e_lens[i], hash[e_cols[i]], OFS)
    }

    # print the regular column at $(NF-1), or EMPTY if that field is an eXXX column
    printf("%*s\047\n", len_last, match($(NF-1),/ e[0-9]+$/)?"":$(NF-1))

    # reset the hash
    delete hash
}

Run the above script and you will get the following result. (Note: I appended one extra row in which an eXXX column, + 14411.7647 e123, sits at the end of the line, right before the trailing '.)

$ awk -f t17.1.awk file.txt 
' 14411.7647 e0 - 2647.0588 e3 + 7352.9412 e12 + 14411.7647 e123 21828.2063'
' - 2647.0588 e3 + 7352.9412 e12 7814.9002'
' 14411.7647 e0 + 14411.7647 e123 20381.3131'
' 14411.7647 e0 + 14411.7647 e123 20381.3131'
' 0.0000 e0 + 0.0000 e123 1.9293e-12'
' 14411.7647'
' + 14411.7647 e123 '

Notes:

  • You might need gawk for "%*s" to work in printf(). If it does not work, try a fixed width instead, for example: printf("%18s%s", hash[e_cols[i]], OFS)

  • Some values in the eXXX columns may be longer than the corresponding field on the first line (NR==1). To handle this, specify an array of widths manually or just use a fixed width.
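Both notes can be combined into one small sketch: pass the widths in by hand and build each format string by concatenation, which avoids "%*s" entirely. The helper name pad_cols, the comma-separated widths argument, and the sample rows are all assumptions made for illustration; it works on plain whitespace-separated fields rather than on the eXXX layout above.

```shell
# pad_cols: right-align whitespace-separated fields to caller-supplied widths.
# "widths" is a hypothetical comma-separated list, one entry per column.
pad_cols() {
    awk -v widths="$1" '
    BEGIN { n = split(widths, w, ",") }
    {
        line = ""
        for (i = 1; i <= NF; i++) {
            # build "%18s"-style formats by concatenation instead of "%*s"
            fmt = (i <= n) ? "%" w[i] "s" : "%s"
            line = line (i > 1 ? " " : "") sprintf(fmt, $i)
        }
        print line
    }'
}

# Invented sample rows, for illustration only
printf '14411.7647 e0\n0.0000 e123\n' | pad_cols 12,5
```

Fields beyond the supplied width list are printed unpadded.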

How to align text file with awk in python?

You can use Python's formatted output for this list. We just need to split each line on runs of two or more spaces to get the individual fields.

import re

# fields are separated by 2+ spaces; a single space may occur inside a field (see 'X -4n-na-X')
dihedrals = [
    'na-2e-na-cd  4  1.200  180.000  2.000', 'Pd-2e-na-cd  4  1.200  180.000  2.000', 'Pd-2e-na-ca  4  1.200  180.000  2.000',
    'Pd-4n-na-hn  4  4.800  0.000  2.000', 'na-4n-cc-cc  4  4.200  180.000  2.000', 'na-2e-na-ca  4  1.200  180.000  2.000',
    'Pd-2e-na-ca  4  1.200  180.000  2.000', 'cc-4n-na-hn  4  4.800  0.000  2.000', 'Pd-4n-na-cd  4  4.800  0.000  2.000',
    'Pd-2e-na-cc  4  1.200  180.000  2.000', 'X -4n-na-X  2  3.400  180.000  2.000', 'Pd-4n-cc-h4  4  4.200  180.000  2.000',
    'Pd-4n-cc-cc  4  4.200  180.000  2.000', 'na-2e-na-cd  4  1.200  180.000  2.000', 'na-2e-na-cc  4  1.200  180.000  2.000',
    'cc-4n-na-cd  4  4.800  0.000  2.000', 'na-2e-na-ca  4  1.200  180.000  2.000', 'Pd-2e-na-cc  4  1.200  180.000  2.000',
    'na-2e-na-cc  4  1.200  180.000  2.000', 'Pd-2e-na-cd  4  1.200  180.000  2.000', 'na-4n-cc-h4  4  4.200  180.000  2.000',
]
for i in dihedrals:
    a = re.split(' {2,}', i)
    print("%-11s %2s %8s %12s %12s" % (a[0], a[1], a[2], a[3], a[4]))

Output:

na-2e-na-cd  4    1.200      180.000        2.000
Pd-2e-na-cd  4    1.200      180.000        2.000
Pd-2e-na-ca  4    1.200      180.000        2.000
Pd-4n-na-hn  4    4.800        0.000        2.000
na-4n-cc-cc  4    4.200      180.000        2.000
na-2e-na-ca  4    1.200      180.000        2.000
Pd-2e-na-ca  4    1.200      180.000        2.000
cc-4n-na-hn  4    4.800        0.000        2.000
Pd-4n-na-cd  4    4.800        0.000        2.000
Pd-2e-na-cc  4    1.200      180.000        2.000
X -4n-na-X   2    3.400      180.000        2.000
Pd-4n-cc-h4  4    4.200      180.000        2.000
Pd-4n-cc-cc  4    4.200      180.000        2.000
na-2e-na-cd  4    1.200      180.000        2.000
na-2e-na-cc  4    1.200      180.000        2.000
cc-4n-na-cd  4    4.800        0.000        2.000
na-2e-na-ca  4    1.200      180.000        2.000
Pd-2e-na-cc  4    1.200      180.000        2.000
na-2e-na-cc  4    1.200      180.000        2.000
Pd-2e-na-cd  4    1.200      180.000        2.000
na-4n-cc-h4  4    4.200      180.000        2.000

A gnu-awk solution would be:

... |
awk -F ' {2,}' -v RS=', *|\\]' '
    gsub(/dihedrals=\[|\047/, "") {
        printf( "%-11s %2s %8s %12s %12s\n", $1, $2, $3, $4, $5)
    }'

AWK: new column not aligning properly

The text on some lines in your first column extends past the first tabstop and, on other lines, it doesn't.
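The effect is easy to check by computing where a tab after each value would land. This is a small sketch that assumes the common default of a tab stop every 8 columns; tab_column is a hypothetical helper, and the sample values are the three first-column entries shown in the output further down.

```shell
# tab_column: for each input line, print its length, the 0-based screen column
# where text after a following tab would start (assuming 8-column tab stops),
# and the line itself.
tab_column() {
    awk '{ printf "%d %d %s\n", length($0), (int(length($0) / 8) + 1) * 8, $0 }'
}

printf '192.168.50.25\n192.168.111.145\nUnknown\n' | tab_column
```

Under that assumption, the two addresses (13 and 15 characters) push the next field to column 16, while Unknown (7 characters) leaves it at column 8, which is why the new column does not line up.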

If you want things lined up in a visually nice way, try column -t:

$ awk '{print $0, "2018-10-22"}' file | column -t 
192.168.50.25    2018-10-22
192.168.111.145  2018-10-22
Unknown          2018-10-22

The above lines up columns based on any whitespace. If you want only tabs to separate columns (meaning that a column can include blanks), then try:

$ awk '{print $0, "2018-10-22"}' OFS='\t' file | column -s$'\t' -t
192.168.50.25    2018-10-22
192.168.111.145  2018-10-22
Unknown          2018-10-22

