Like a vlookup but in bash to match filenames in a directory against a ref file and return full description
I would try something like this:
sed 's/,/.csv/;s/$/.csv/' filemapping.dat | xargs -n2 mv
Either cd
to tmp
beforehand, or modify the sed
command to include the path name.
The sed
commands simply replace the comma and the line end with the string ".csv".
vlookup style query in nested for loop (
There's a join
utility for this. In particular, given your example:
join -o 1.1,1.1,1.3,1.4,1.5,1.6,2.2 -1 6 -2 1 test.data ref.txt |
join -o 1.1,1.1,1.3,1.4,1.5,1.6,1.7,2.2 -j 1 - ref2.txt
This produces your proposed output. The -o <list>
option specifies each field you want printed in the format <filenumber>.<fieldnumber>
. -1 <n>
and -2 <n>
specify which field respectively in each file you want to match, and -j <n>
is a shortcut you can use when the field number is the same in both files.
implementing Excel-vlookup-like function with awk
here is your script with fixes: set field separators; save the score value for each id; print the value from lookup, if missing NaN
$ awk 'BEGIN {FS=OFS=","}
FNR==NR {a[$1]=$2; next}
{print $1, (($1 in a)?a[$1]:"NAN")}' file1 file2
id,score
1,16
3,12
5,13
7,NAN
11,8
13,32
17,37
19,NAN
23,74
29,7
31,70
37,NAN
41,83
Using awk for vlookup functionality in bash script
Could you please try following with awk
, written and tested with GNU awk
with shown samples.
awk '
BEGIN{
FS=OFS="|"
}
FNR==NR{
arr[$1]=$2
next
}
($1 in arr){
$1=($1 OFS arr[$1])
}
1
' file2.csv file1.csv
Explanation: Adding detailed explanation for above.
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section from here of this program.
FS=OFS="|" ##Setting | as field separator and output field separator.
}
FNR==NR{ ##Checking condition if FNR==NR which will be TRUE when file2.csv is being read.
arr[$1]=$2 ##Creating arr with index of 1st field and value of 2nd field.
next ##next will skip all further statements from here.
}
($1 in arr){ ##checking condition if $1 is present in arr then do following.
$1=($1 OFS arr[$1]) ##Saving current $1 OFS and value of arr with index of $1 in $1.
}
1 ##1 will print the current line.
' file2.csv file1.csv ##Mentioning Input_file names here.
Vlookup using awk command
standard join operation with awk
$ awk 'BEGIN {FS=OFS=","}
NR==FNR {a[$2]=$3 OFS $4; next}
{print $1, (($1 in a)?a[$1]:"NA" OFS "NA")} file2 file1
substring variation (not tested)
$ awk 'BEGIN {FS=OFS=","}
NR==FNR {a[substr($2,1,7)]=$3 OFS $4; next}
{key=substr($1,1,7);
print $1, ((key in a)?a[key]:"NA" OFS "NA")} file2 file1
CLI method for vlookup like search
$ awk '
NR==1 { print; next }
{ key = ($2>$3 ? $2 FS $3 : $3 FS $2) }
seen[key]++ { print orig[key] $0; delete orig[key]; next }
{ orig[key] = $0 ORS }
' file
call_start_time called_no calling_no
43284.85326 1111111111 2222222222
43284.83205 2222222222 1111111111
Why using awk in bash like vlookup in Excel give empty output file?
Awk classic. Hash the thousands of lines gene file to a hash (a
) to not waste all the memory and lookup $1
from billions of lines transcriptome file:
$ awk '
# { sub(/\r$/,"") } # uncomment to remove Windows style line-endings.
NR==FNR{a[$1] # hash $1 of genes file to a
next
}
($1 in a) { # lookup from transcriptome
print
}' genes transcriptome # mind the order
>TRINITY_DN261_c0_g1_i1 GATATTTATCCGAATATTCAATATGAT
>TRINITY_DN220_c0_g1_i1 GGGAGATAATAACAATGATAACACACAAAATTCCAATG
Related Topics
How to Run a Linux Command That Still Runs After I Close My Putty Ssh Session
Perl Fork() Exec() , Child Process Gone Wild
Permission Issues, Not Able to Run Script as Root
Some Details on Arm Linux Boot
"Thread Apply All Bt Full" Gives Blank in Gdb
Printing Grep Results to File and Terminal
Tomcat6 -> How to Put a Project into the Root Folder
How to Cross Compile R Packages for MACos from a Linux Environment
If Adding a Command That Repeats Every 10 Minutes to Crontab, When Does the First Job Run
Combine Results of Column One Then Sum Column 2 to List Total for Each Entry in Column One
How to Count Most Occuring Sequence of 3 Letters Within a Word with a Bash Script
How to Export a Modified Kernel Header
How to Compare 2 Lists of Ranges in Bash
Erge Text Files Ordered by Numerical Filenames in Bash
Kubernetes Can't Start Due to Too Many Open Files in System