Use awk/bash to add 1 to all columns but first
You can do this with one awk call:
awk 'BEGIN{OFS="\t"} FNR==1{print; next} {for (i=2;i<=NF;i++)$i=$i+1}1' infile > outfile
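For example, with a hypothetical tab-separated infile whose first row is a header (the `FNR==1{print; next}` part passes the header through untouched):

```shell
# Hypothetical sample input: a label column followed by numeric columns.
printf 'id\ta\tb\nx\t1\t2\ny\t3\t4\n' > infile
awk 'BEGIN{OFS="\t"} FNR==1{print; next} {for (i=2;i<=NF;i++)$i=$i+1}1' infile
# id    a   b
# x     2   3
# y     4   5
```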
AWK: Help on transforming data table
Using GNU awk for arrays of arrays:
$ cat tst.awk
BEGIN { OFS="\t" }
{
    sub(/-.*/,"",$1)
    minYear = ( NR==1 || $1 < minYear ? $1 : minYear )
    maxYear = ( NR==1 || $1 > maxYear ? $1 : maxYear )
    items[$2][$3]
    vals[$1][$2][$3] += $4
    typeTots[$1][$2] += $4
    yearTots[$1] += $4
}
END {
    printf "%s", OFS
    for ( year=minYear; year<=maxYear; year++ ) {
        printf "%s%s", OFS, year
    }
    print ""
    for ( type in items ) {
        itemCnt = 0
        for ( item in items[type] ) {
            printf "%s%s%s", (itemCnt++ ? "" : type), OFS, item
            for ( year=minYear; year<=maxYear; year++ ) {
                printf "%s%0.2f", OFS, vals[year][type][item]
            }
            print ""
        }
        printf "Subt%s", OFS
        for ( year=minYear; year<=maxYear; year++ ) {
            printf "%s%0.2f", OFS, typeTots[year][type]
        }
        print ORS
    }
    printf "Total%s", OFS
    for ( year=minYear; year<=maxYear; year++ ) {
        printf "%s%0.2f", OFS, yearTots[year]
    }
    print ""
}
$ awk -f tst.awk in.txt
                2020    2021    2022
alcohol beer    0.00    6.00    12.00
        smirnov 26.99   0.00    0.00
Subt            26.99   6.00    12.00

fruit   orange  8.40    0.00    4.30
        mango   0.00    6.99    7.20
        banana  3.40    0.00    0.00
Subt            11.80   6.99    11.50

Total           38.79   12.99   23.50
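The input file in.txt is not shown above; from the way the script reads it, it is presumably whitespace-separated records of date, type, item, and amount. A reconstruction that reproduces the output above:

```shell
# Assumed shape of in.txt (reconstructed to match the output shown).
cat <<'EOF' > in.txt
2020-03-01 fruit orange 8.40
2020-06-15 fruit banana 3.40
2020-09-01 alcohol smirnov 26.99
2021-02-10 fruit mango 6.99
2021-05-05 alcohol beer 6.00
2022-01-20 fruit orange 4.30
2022-03-14 fruit mango 7.20
2022-07-07 alcohol beer 12.00
EOF
```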
or if you really want specific date ranges instead of just the year in the header:
$ cat tst.awk
BEGIN { OFS="\t" }
{
    sub(/-.*/,"",$1)
    minYear = ( NR==1 || $1 < minYear ? $1 : minYear )
    maxYear = ( NR==1 || $1 > maxYear ? $1 : maxYear )
    items[$2][$3]
    vals[$1][$2][$3] += $4
    typeTots[$1][$2] += $4
    yearTots[$1] += $4
}
END {
    printf "%s", OFS
    for ( year=minYear; year<=maxYear; year++ ) {
        printf "%s%s-01-01", OFS, year
    }
    print ""
    printf "%s", OFS
    for ( year=minYear; year<=maxYear; year++ ) {
        printf "%s-%s-12-31", OFS, year
    }
    print ""
    for ( type in items ) {
        itemCnt = 0
        for ( item in items[type] ) {
            printf "%s%s%s", (itemCnt++ ? "" : type), OFS, item
            for ( year=minYear; year<=maxYear; year++ ) {
                printf "%s%0.2f", OFS, vals[year][type][item]
            }
            print ""
        }
        printf "Subt%s", OFS
        for ( year=minYear; year<=maxYear; year++ ) {
            printf "%s%0.2f", OFS, typeTots[year][type]
        }
        print ORS
    }
    printf "Total%s", OFS
    for ( year=minYear; year<=maxYear; year++ ) {
        printf "%s%0.2f", OFS, yearTots[year]
    }
    print ""
}
$ awk -f tst.awk in.txt | column -s$'\t' -t
                 2020-01-01   2021-01-01   2022-01-01
                 -2020-12-31  -2021-12-31  -2022-12-31
alcohol  beer    0.00         6.00         12.00
         smirnov 26.99        0.00         0.00
Subt             26.99        6.00         12.00

fruit    orange  8.40         0.00         4.30
         mango   0.00         6.99         7.20
         banana  3.40         0.00         0.00
Subt             11.80        6.99         11.50

Total            38.79        12.99        23.50
awk: find minimum and maximum in column
Awk guesses the type: the string "10" is less than the string "4" because the character "1" sorts before "4". Force a numeric comparison by adding zero:
# seed from the first record rather than a hard-coded value
# (a=1000 / a=0 fails if every value is above 1000 or below 0),
# and add zero to force the comparison to be numeric
min=$(awk 'NR==1{a=$1+0} $1+0<a{a=$1+0} END{print a}' mydata.dat)
max=$(awk 'NR==1{a=$1+0} $1+0>a{a=$1+0} END{print a}' mydata.dat)
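To see the pitfall, seed the comparison with a string (hypothetical data):

```shell
printf '9\n10\n' > mydata.dat
# string comparison: "10" < "1000" is true ("10" is a prefix), so the "min" is wrong
awk 'BEGIN{a="1000"} {if ($1<a) a=$1} END{print a}' mydata.dat     # prints 10
# adding zero forces a numeric comparison
awk 'BEGIN{a="1000"} {if ($1<0+a) a=$1} END{print a}' mydata.dat   # prints 9
```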
Extract specific columns from delimited file using Awk
I don't know if it's possible to do ranges in awk. You could do a for loop, but you would have to add handling to filter out the columns you don't want. It's probably easier to do this:
awk -F, '{OFS=",";print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$20,$21,$22,$23,$24,$25,$30,$33}' infile.csv > outfile.csv
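A for loop over the field ranges does work, though; a sketch covering the ranges 1-10, 20-25, and 30-33 on the same comma-separated input:

```shell
# Loop over every field and keep only those in the wanted ranges (sketch).
awk -F, 'BEGIN{OFS=","}
{
    out = ""
    for (i = 1; i <= NF; i++)
        if (i <= 10 || (i >= 20 && i <= 25) || (i >= 30 && i <= 33))
            out = out (out == "" ? "" : OFS) $i
    print out
}' infile.csv > outfile.csv
```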
Something else to consider, and it is faster and more concise:
cut -d "," -f1-10,20-25,30-33 infile.csv > outfile.csv
As to the second part of your question, I would probably write a script in perl that knows how to handle header rows, parsing the column names from stdin or a file and then doing the filtering. It's probably a tool I would want to have for other things. I am not sure about doing it in a one-liner, although I am sure it can be done.
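It doesn't strictly need perl, though: awk can map header names to positions on the first row and then select by name. A sketch, where the cols value and the file layout are hypothetical:

```shell
# Select columns by header name rather than by position (hypothetical layout).
awk -F, -v cols='name,city' '
NR==1 {
    n = split(cols, want, ",")        # names we want, in output order
    for (i = 1; i <= NF; i++) idx[$i] = i   # header name -> column number
}
{
    out = ""
    for (j = 1; j <= n; j++)
        out = out (j > 1 ? FS : "") $(idx[want[j]])
    print out
}' infile.csv
```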
awk sql parsing
Rather than using awk: if you have access to the database, why not generate the SQL you need from the catalog?
SELECT 'ALTER TABLE "' || TABSCHEMA || '"."' || TABNAME || '" '
|| LISTAGG('ALTER COLUMN "' || COLNAME || '" DROP GENERATED', ' ') || ';'
FROM SYSCAT.COLUMNS
WHERE GENERATED = 'A' AND IDENTITY = 'N'
GROUP BY
TABSCHEMA, TABNAME
which produces this against your test table
ALTER TABLE "SCHEMA"."TABLE" ALTER COLUMN "COL6" DROP GENERATED ALTER COLUMN "COL7" DROP GENERATED;
Simple.
Check if two files match in 2 column values and print those lines to a new output file
This should work:
$ awk 'NR==FNR{a[$4,$5]=$0;next}(($2,$5) in a)' file2 file1
Output:
CHR BP BETA SE P PHENOTYPE FDR CATEGORY SNP
10 110408937 3.386e+00 1.333e+00 1.112e-02 1 1 Medication rs113627704
Explained:
$ awk '
NR==FNR {          # process file2 first; the output we want comes from file1
    a[$4,$5]=$0    # use the desired fields, 4th and 5th, as the hash key
    next           # move on to the next record
}                  # records of file1 are processed below this point
(($2,$5) in a)     # output file1 records whose 2nd and 5th fields are in the hash
' file2 file1      # mind the file order
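A reproducible toy version, with hypothetical file contents; note that the header line survives here because the two files' header fields happen to match as a key, which is one way the header in the output above can come about:

```shell
# Hypothetical minimal inputs; only the key columns ($4,$5 of file2, $2,$5 of file1) matter.
printf 'x y z BP P\nw x y 110408937 1.112e-02\n' > file2
printf 'CHR BP BETA SE P\n10 110408937 3.386e+00 1.333e+00 1.112e-02\n10 999 1.0 1.0 0.5\n' > file1
awk 'NR==FNR{a[$4,$5]=$0;next}(($2,$5) in a)' file2 file1
# CHR BP BETA SE P
# 10 110408937 3.386e+00 1.333e+00 1.112e-02
```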