Awk Command to Create Sha2 of Individual Column and Paste into New File

Try this:

awk -F"|" -v var="10" '
    NR==1;
    NR>1{
        "echo "$2"|sha256sum" | getline shaoutput; 
        split(shaoutput, sha, " "); 
        print var, $2, $3, $4, $5, sha[1]
    }' OFS="|" file

Output:

10|0001001010000026316|531849|1150|101|2e16abd9f3e3e368210b11faa5bfebdb6e001034b58cc9ad1c689dfd1f7eeacd

I prefer to use NR==1 and NR>1 because the two cases are easier to read that way.

NR==1; on its own is enough: a pattern with no action prints the record by default, so there is no need to add {print}.

For NR>1, I use sha256sum to generate the hash, as awk does not have a built-in function for that (to my knowledge). I save the command's output in the shaoutput variable, use split to keep only the hash and drop the file-name marker that sha256sum prints after it, then print what is needed.

I prefer not to store the output delimiter inside the var variable.
Separating the print arguments with commas makes awk join them with the OFS variable.
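
The two points above (a bare pattern prints the record, and commas in print are replaced by OFS) can be seen in isolation with a small sketch. The function name show_ofs and the inline sample data are made up for illustration and are not part of the original question.

```shell
# show_ofs: hypothetical demo; reads |-separated data on stdin.
show_ofs() {
    awk -F'|' -v var="10" '
        NR==1;                      # pattern only: the header is printed unchanged
        NR>1 { print var, $3, $2 }  # commas: the values are joined with OFS
    ' OFS='|' -
}

# Invented sample input: one header line and one data line.
printf 'h1|h2|h3\na|b|c\n' | show_ofs
```

The trailing - tells awk to read standard input after the OFS assignment has been applied.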

Edited

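As suggested by Ed Morton, here is an improved solution. It wraps the value passed to echo in single quotes (\047 is the octal escape for a single quote), checks the return value of getline, and closes the command after use: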

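awk -v var="10" '
    BEGIN{
        FS=OFS="|"
    }
    NR==1;
    NR>1{
        shaoutput="";
        # \047 is a single quote; it protects the value from the shell
        cmd="echo \047" $2 "\047 | sha256sum" ;
        if ( (cmd | getline line) > 0 ){
            shaoutput=line
        }
        # close even if getline failed, so the same command can run again
        close(cmd)
        split(shaoutput, sha, " ");
        print var, $2, $3, $4, $5, sha[1];
    }' file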
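
One detail worth knowing: echo appends a newline, so the digest produced by the commands above covers the field value plus that newline. If the hash of the bare value is wanted instead, a variant along the following lines may work. It is only a sketch: the function name sha_col2 and the sample input are invented here, and it assumes the field contains no single quote (the same limitation as the \047 quoting above).

```shell
# sha_col2: hypothetical helper; reads |-separated data on stdin and
# appends the SHA-256 of field 2, hashed without a trailing newline.
sha_col2() {
    awk '
        BEGIN { FS = OFS = "|" }
        NR == 1                      # header passes through unchanged
        NR > 1 {
            # printf "%s" (unlike echo) adds no newline to the hashed data
            cmd = "printf \"%s\" \047" $2 "\047 | sha256sum"
            sha = ""
            if ((cmd | getline line) > 0) {
                split(line, parts, " ")
                sha = parts[1]
            }
            close(cmd)               # needed so a repeated value is hashed again
            print $0, sha
        }'
}

# Invented sample input.
printf 'id|value\n1|abc\n' | sha_col2
```

Whether the newline should be part of the hashed data depends on what the consumer of the file expects, so the choice between echo and printf here is a judgment call rather than a correction.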

AWK write to new column based on if else of other column

You can use:

awk -F, 'NR>1 {$0 = $0 FS (($4 >= 0.7) ? 1 : 0)} 1' test_file.csv
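
The command above leaves the header line as it is, so the header ends up one column short. If a name for the new column is wanted too, a sketch like this one could be used. The function name add_flag, the header name flag, and the sample rows are all assumptions made for illustration.

```shell
# add_flag: hypothetical wrapper; appends a 0/1 column derived from field 4.
add_flag() {
    awk -F, '
        NR == 1 { $0 = $0 FS "flag" }                  # "flag" is an invented header name
        NR > 1  { $0 = $0 FS (($4 >= 0.7) ? 1 : 0) }   # 1 when field 4 is at least 0.7
        1                                              # print every (modified) record
    '
}

# Invented sample input.
printf 'id,name,grp,score\n1,a,x,0.9\n2,b,y,0.3\n' | add_flag
```

Appending to $0 with FS keeps the rest of the line byte-for-byte as it was, which is why no OFS needs to be set.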

Copy a row from multiple files and paste as column in a new file

As Deepu mentions in his good answer, saying FNR==4 suffices to print the fourth line of every file:

awk 'FNR==4' files*

With this you get something like

5  5  7  1
0  0  1  1
4  3  4  0

And now you just need to transpose the array. For this I created a little script some time ago, that I named transpose (very good at names, I know):

transpose () {
awk '{for (i=1; i<=NF; i++) a[i,NR]=$i; max=(max<NF?NF:max)}
        END {for (i=1; i<=max; i++)
            {for (j=1; j<=NR; j++) 
                printf "%s%s", a[i,j], (j<NR?OFS:ORS)
            }
        }'
}

All together, you just need to say:

$ awk 'FNR==4' f* | transpose
5 0 4
5 0 3
7 1 4
1 1 0

Note that you can set the input and output field separators if you wish to keep the format (I guess the fields are tab-separated right now).
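
As a sketch of that last remark, assuming the data really is tab-separated, the separators can be passed to awk with -F and -v OFS. The name transpose_tab is made up here; the body follows the same logic as transpose.

```shell
# transpose_tab: hypothetical variant of transpose for tab-separated data.
transpose_tab() {
    awk -F'\t' -v OFS='\t' '
        { for (i = 1; i <= NF; i++) a[i,NR] = $i; if (NF > max) max = NF }
        END {
            for (i = 1; i <= max; i++)
                for (j = 1; j <= NR; j++)
                    printf "%s%s", a[i,j], (j < NR ? OFS : ORS)
        }'
}

# Sample rows separated by tabs.
printf '5\t5\t7\t1\n0\t0\t1\t1\n4\t3\t4\t0\n' | transpose_tab
```

With the tab as input separator, fields that contain spaces stay in one piece, which the default whitespace splitting would not guarantee.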

Awk separate column output

$ cat tst.awk
BEGIN {
    numRows = 4
    OFS = "\t"
}
{
    rowNr = (NR - 1 ) % numRows + 1
    if ( rowNr == 1 ) {
        numCols++
    }
    val[rowNr,numCols] = $0
}
END {
    for (rowNr=1; rowNr<=numRows; rowNr++) {
        for (colNr=1; colNr<=numCols; colNr++) {
            printf "%s%s", val[rowNr,colNr], (colNr<numCols ? OFS : ORS)
        }
    }
}
$
$ awk -f tst.awk file
1 a,b   5 o,s
2 r,i   6 y
3 w
4 r,t

Combining multiple awk output statements into one line

Consider an input file data with header line like this (based closely on your minimal example):

Col1 Col2 Col3 Col4
 20 0  5  F001
  4 2  3  F002
 12 4  8  F003
100 10 29 O001

You want the output to contain a column 5 that is the value of $3 - $2 + 1 (column 3 minus column 2 plus 1) and a column 6 that is the value of column 1 divided by column 5 (with 1 decimal place in the output). You also want a file name that is based on a variable fname passed to the script but that has a unique value for each line. Finally, you only want lines where column 4 matches F followed by 3 digits, and you want to skip the first line. That can all be written directly in awk:

awk -v fname=C '
NR == 1                     { next }
$4 ~ /^F[0-9][0-9][0-9]$/   { c5 = $3 - $2 + 1
                              c6 = sprintf("%.1f", $1 / c5)
                              print $0, c5, c6, fname NR
                            }' data

You could write that on one line too:

awk -v fname=C 'NR==1{next} $4~/^F[0-9][0-9][0-9]$/ { c5=$3-$2+1; print $0,c5,sprintf("%.1f",$1/c5), fname NR }' data

The output is:

 20 0  5  F001 6 3.3 C2
  4 2  3  F002 2 2.0 C3
 12 4  8  F003 5 2.4 C4

Clearly, you could change the file name so that the counter starts from 0 or 1 by using counter++ or ++counter respectively in place of the NR in the print statement, and you could format it with leading zeros or whatever else you want with sprintf() again. If you want to drop the first line of each file, rather than just the first file, change the NR == 1 condition to FNR == 1 instead.
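
A sketch of that variation, with a counter that starts at 1 and is padded to three digits, might look like this. The wrapper name number_rows, the padding width, and the sample input are choices made here for illustration only.

```shell
# number_rows: hypothetical wrapper around the command above; the name suffix
# comes from its own counter (starting at 1), zero-padded to three digits.
number_rows() {
    awk -v fname=C '
        NR == 1 { next }
        $4 ~ /^F[0-9][0-9][0-9]$/ {
            c5 = $3 - $2 + 1
            print $0, c5, sprintf("%.1f", $1 / c5), sprintf("%s%03d", fname, ++counter)
        }'
}

# Invented sample input; the O001 line is filtered out and does not use up a number.
printf 'Col1 Col2 Col3 Col4\n20 0 5 F001\n100 10 29 O001\n4 2 3 F002\n' | number_rows
```

Unlike NR, the counter only advances for lines that are actually printed, so the generated names have no gaps.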

Note that this does not need the preprocessing provided by cat foo.txt | tail -n +2.

How to use awk in bash -c so that the ' ' in awk ' ' doesn't break the bash -c ' ' bash?

In your second command, you used backtick "quotes" for xargs -0 bash -c `...` bash. Those behave like $(...) so the command string was executed before find | xargs even started.

And in that command string, bash replaced $1 before awk even started.

Command strings with nested quotes are easier to write in multiple steps using one helper variable for each level of quoting, but since you are using bash, you can export a function instead, which makes things trivial.
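
The helper-variable approach mentioned above could be sketched as follows. The variable names awk_prog and cmd are made up, and the sketch assumes the awk program itself contains no single quote.

```shell
# Level 1: the awk program, written once with plain single quotes.
awk_prog='{print $1}'
# Level 2: the command string for bash -c. $awk_prog expands here, but the
# $1 inside its value is not expanded a second time.
cmd="awk '$awk_prog'"

# Invented sample input; bash -c receives: awk '{print $1}'
printf 'first second\n' | bash -c "$cmd"
```

Each variable holds exactly one level of quoting, so no quote has to be escaped by hand.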

Your command correctly wrapped

f() {
  paste -d ";" \
    <(md5sum "$@" | awk '{print $1}') \
    <(sha1sum "$@" | awk '{print $1}') \
    <(sha256sum "$@" | awk '{print $1}') \
    <(du -lh "$@" | awk '{print $1}')
}
export -f f
find / -type f -not \( -path '/dev/*' -or -path '/proc/*' -or -path '/sys/devices/*' \) -print0 |
xargs -0 bash -c 'f "$@"' bash

Slightly improved and adapted to your needs

As you wanted, we can print all fields for all files in a single line by replacing (tr) each \n with a ;. The paths are not quoted in any way. If they contain a ; or a line break, parsing the result could be difficult. If you need some form of quoting, try printf %q or sed [-z] 's/.../.../g'.

f() {
  paste -d ";" \
    <(md5sum "$@" | awk NF=1) \
    <(sha1sum "$@" | awk NF=1) \
    <(sha256sum "$@" | awk NF=1) \
    <(du -lh "$@" | awk NF=1) \
    <(printf '%s\n' "$@") |
  tr '\n' ';'
}
export -f f
find / \( -path /dev -o -path /proc -o -path /sys/devices \) -prune -o \
  -type f -exec bash -c 'f "$@"' bash +
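
For the quoting suggestion made earlier, a minimal sketch with the %q format of the bash printf builtin could look like this. The helper name quote_paths and the sample file names are hypothetical; inside f, something similar would replace the plain printf '%s\n' "$@" line.

```shell
# quote_paths: hypothetical helper; prints each argument shell-quoted,
# one per line, using the %q format of the bash printf builtin.
quote_paths() {
    bash -c 'printf "%q\n" "$@"' bash "$@"
}

# Invented file names; the ; in the second one is escaped in the output.
quote_paths 'plain.txt' 'with;semicolon.txt'
```

The quoted form is meant to be read back by a shell, so a consumer that is not a shell would need its own unescaping step.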

How to take a hash of a line in CSV and add it as a last column

while IFS= read -r line
do
  # quote "$line" so whitespace and glob characters reach openssl unchanged
  printf '%s,%s\n' "$line" "$(printf '%s' "$line" | openssl dgst -sha256 -hmac "SECRET" | cut -d' ' -f2)"
done < test.csv

This produces the output:

Unix,10,A,f9be1a25bec3e55418e4f6a75a6bdceecb6d6d17af911d8b4ef478431edc68d2
Linux,30,B,659c957414b20e098c299a5769f0c05b225b7fef007cd0e71e0355f7bc8afe5c
Solaris,40,C,3189a15aa81b86277e8e910eeb17a2d6a4e52fbdcbf326034d7691471788b9b7
Fedora,20,D,14a0ae4fb2a3bd2209f60969d75bee5ca243921f02be8ffc0f37f2ea9354f0b2
Ubuntu,50,E,dc635842ca6f904ca658ec71b5d9205221664688eaa028917663ab9760e823c3
