Filtering Rows Based On Number of Columns with AWK
You need to use the NF
(number of fields) variable to control the actions, such as in the following transcript:
$ echo '0333 foo
> bar
> 23243 qux' | awk 'NF==2{print}{}'
0333 foo
23243 qux
This will print the line if the number of fields is two, otherwise it will do nothing. The reason I have the (seemingly) strange construct NF==2{print}{}
is because some implementations of awk
will print by default if no rules are matched for a line. The empty command {}
guarantees that this will not happen.
If you're lucky enough to have one of those that doesn't do this, you can get away with:
awk 'NF==2'
but the first solution above will work in both cases.
Filter out rows from column A based on values in column B
With an array. I assume that there are no duplicates in the first column.
awk -F ',' 'NR>1{
array[$1]++; array[$2]--
}
END{
for(i in array){ if(array[i]==1){ print i } }
}' file
As one line:
awk -F ',' 'NR>1{ array[$1]++; array[$2]-- } END{for(i in array){ if(array[i]==1){ print i } } }' file
Output:
esther@example.com
daisy@example.com
bill@example.com
Using awk command to filter out specific columns from a huge file
I think you may be looking for something more along the lines of:
gzcat filename.csv.gz |
awk -F, '{print $19,$26,$27}' |
gzip > filename_FILTERED.csv.gz
How to use Awk to filter rows using a column value under double quotes
Just match the literal "
with an escape character. This is the straight-forward filter to match the literal "AA"
on the first column. Since awk
works on a pattern { action }
basis, the condition match to see if first column is "AA"
can be done directly without needing to use explicit { print }
If the condition is met for that line, awk
is left with a condition as awk 1 file
on which case the line is printed.
awk -v FS=, '$1=="\"AA\""' file
Also, you can avoid escapes, by putting the match string in a variable under single-quotes and let it match the variable
awk -v FS=, -v m='"AA"' '$1==m' file
Filter a file using a column value greater than a number (awk not working)
You need to tell your awk
to coerce $8
into a number by computing $8+0
. It is recommended that you ensure you have GNU awk installed to avoid issues. Also, you may probably use dos2unix
before working on the files to normalize the line endings.
The whole command can be written as
awk -F"," '/^LQN/ && $8+0 >= 10 {print $1, $8}' df_TPM.csv
See the online awk
demo.
NOTE: To only count these lines, use
The whole command can be written as
awk -F, '/^LQN/ && $8+0 >= 10 {cnt++} END{print cnt}' df_TPM.csv
To find the lines that do not start with LQN
, just add the negation operator !
before /^LQN/
:
awk -F, '!/^LQN/ && $8+0 >= 10 {cnt++} END{print cnt}' df_TPM.csv
Details
-F","
(=-F,
) - set the field separator to a comma/^LQN/ && $8+0 >= 10
- if the current line starts withLQN
and the eighth field is equal or larger than10
!/^LQN/ && $8+0 >= 10
- if the current line does not start withLQN
and the eighth field is equal or larger than10
{print $1, $8}
- print Field 1 and 8{cnt++}
- increment thecnt
variableEND{print cnt}
- print thecnt
variable once theawk
finishes processing lines.
Related Topics
How to Limit File Size on Commit
Dependency Walker Equivalent for Linux
Keep Meteor Running on Amazon Ec2
What's the Practical Limit on the Size of Single Packet Transmitted Over Domain Socket
Microsecond Accurate (Or Better) Process Timing in Linux
How to Read Only First N Bytes from the Http Server Using Linux Command
Docker MACvlan Network, Unable to Access Internet
How to Search an Image for Subimages Using Linux Console
Jenkins Path to Git Windows Master/Linux Slave
Increase of Virtual Memory Without Increse of Vmsize
How to Save the Output of This Awk Command to File
How to Check Hz in the Terminal