Find Value from One CSV in Another One (Like Vlookup) in Bash (Linux)

Like a vlookup but in bash to match filenames in a directory against a ref file and return full description

I would try something like this:

sed 's/,/.csv/;s/$/.csv/' filemapping.dat | xargs -n2 mv

Either cd to tmp beforehand, or modify the sed command to include the path name.

The sed commands simply replace the comma and the line end with the string ".csv".

vlookup style query in nested for loop (

There's a join utility for this. In particular, given your example:

join -o 1.1,1.1,1.3,1.4,1.5,1.6,2.2 -1 6 -2 1 test.data ref.txt |
join -o 1.1,1.1,1.3,1.4,1.5,1.6,1.7,2.2 -j 1 - ref2.txt

This produces your proposed output. The -o <list> option specifies each field you want printed in the format <filenumber>.<fieldnumber>. -1 <n> and -2 <n> specify which field respectively in each file you want to match, and -j <n> is a shortcut you can use when the field number is the same in both files.

implementing Excel-vlookup-like function with awk

here is your script with fixes: set field separators; save the score value for each id; print the value from lookup, if missing NaN

$ awk 'BEGIN   {FS=OFS=","} 
FNR==NR {a[$1]=$2; next}
{print $1, (($1 in a)?a[$1]:"NAN")}' file1 file2

id,score
1,16
3,12
5,13
7,NAN
11,8
13,32
17,37
19,NAN
23,74
29,7
31,70
37,NAN
41,83

Using awk for vlookup functionality in bash script

Could you please try following with awk, written and tested with GNU awk with shown samples.

awk '
BEGIN{
FS=OFS="|"
}
FNR==NR{
arr[$1]=$2
next
}
($1 in arr){
$1=($1 OFS arr[$1])
}
1
' file2.csv file1.csv

Explanation: Adding detailed explanation for above.

awk '                  ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section from here of this program.
FS=OFS="|" ##Setting | as field separator and output field separator.
}
FNR==NR{ ##Checking condition if FNR==NR which will be TRUE when file2.csv is being read.
arr[$1]=$2 ##Creating arr with index of 1st field and value of 2nd field.
next ##next will skip all further statements from here.
}
($1 in arr){ ##checking condition if $1 is present in arr then do following.
$1=($1 OFS arr[$1]) ##Saving current $1 OFS and value of arr with index of $1 in $1.
}
1 ##1 will print the current line.
' file2.csv file1.csv ##Mentioning Input_file names here.

Vlookup using awk command

standard join operation with awk

$ awk 'BEGIN   {FS=OFS=","} 
NR==FNR {a[$2]=$3 OFS $4; next}
{print $1, (($1 in a)?a[$1]:"NA" OFS "NA")} file2 file1

substring variation (not tested)

$ awk 'BEGIN   {FS=OFS=","} 
NR==FNR {a[substr($2,1,7)]=$3 OFS $4; next}
{key=substr($1,1,7);
print $1, ((key in a)?a[key]:"NA" OFS "NA")} file2 file1

CLI method for vlookup like search

$ awk '
NR==1 { print; next }
{ key = ($2>$3 ? $2 FS $3 : $3 FS $2) }
seen[key]++ { print orig[key] $0; delete orig[key]; next }
{ orig[key] = $0 ORS }
' file
call_start_time called_no calling_no
43284.85326 1111111111 2222222222
43284.83205 2222222222 1111111111

Why using awk in bash like vlookup in Excel give empty output file?

Awk classic. Hash the thousands of lines gene file to a hash (a) to not waste all the memory and lookup $1 from billions of lines transcriptome file:

$ awk '
# { sub(/\r$/,"") } # uncomment to remove Windows style line-endings.
NR==FNR{a[$1] # hash $1 of genes file to a
next
}
($1 in a) { # lookup from transcriptome
print
}' genes transcriptome # mind the order
>TRINITY_DN261_c0_g1_i1 GATATTTATCCGAATATTCAATATGAT
>TRINITY_DN220_c0_g1_i1 GGGAGATAATAACAATGATAACACACAAAATTCCAATG


Related Topics



Leave a reply



Submit