Count occurrences of character per line/field on Unix
To count occurrences of a character per line, you can do:
awk -F'|' 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"") "\t" NR}' file
count lineNum
4 1
3 2
6 3
To count occurrences of a character per field/column, you can do:
column 2:
awk -F'|' -v fld=2 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"",$fld) "\t" NR}' file
count lineNum
1 1
0 2
1 3
column 3:
awk -F'|' -v fld=3 'BEGIN{print "count", "lineNum"}{print gsub(/t/,"",$fld) "\t" NR}' file
count lineNum
2 1
1 2
4 3
The gsub() function's return value is the number of substitutions made, so we print that as the count.
NR holds the current line number, so we print it next to the count.
To count occurrences in a particular field, we create a variable fld and set it to the number of the field we want counts from.
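The two commands above can be folded into one small shell function. This is only a sketch: the name count_in_field and its argument order are made up for illustration, and it assumes the character is not a regex metacharacter (gsub treats its first argument as a regular expression). It also counts on a copy of the field, so the record itself is not modified.

```shell
# count_in_field CHAR FIELD [FILE...]   (hypothetical helper, not from the original answer)
# Prints "count<TAB>lineNum" for every input line.
# FIELD=0 counts over the whole line; any other number selects that '|'-separated field.
count_in_field() {
    ch=$1
    fld=$2
    shift 2
    awk -F'|' -v ch="$ch" -v fld="$fld" '{
        s = $fld               # copy the field (or $0) so the record stays untouched
        n = gsub(ch, "", s)    # gsub returns the number of substitutions it made
        print n "\t" NR
    }' "$@"
}
```

For example, printf 'tat|xtx|ttt\n' | count_in_field t 2 prints 1 and the line number 1, separated by a tab, because only the second field (xtx) is examined.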
UNIX - Count occurrences of character per line between two fields and add new column with result
You can use awk to work with row- and column-based data:
awk '{c=0; for(i=7; i<=NF; i++) if ($i==2) c++; if (c<2) c++; print $0, c}' file
ACS_D132 ACS_D132 0 0 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
ACS_D140 ACS_D140 0 0 2 2 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 1 2
ACS_D141 ACS_D141 0 0 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2
ACS_D147 ACS_D147 0 0 2 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2
ACS_D155 ACS_D155 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
ACS_D196 ACS_D196 0 0 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
ACS_D221 ACS_D221 0 0 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Counting number of character occurrences per line
Unfortunately, every line in your sample data has six semicolons, which means they should all be printed. However, here is a one-line Perl solution:
$ perl -ne'print if tr/;// != 5' aaa.csv
AAAA;BBBB;CCCCCCCC;DD;EEEEEEEE;FF;
AAA1;BBBBB;CCCC;DD;EEEEEEEE;FFFFF;
AAA3;BB;CCCC;DDDDDDDDD;EEEEEEE;FF;
Count occurrences of a char in a string using Bash
I would use the following awk command:
string="text,text,text,text"
char=","
awk -F"${char}" '{print NF-1}' <<< "${string}"
This splits the string on $char and prints the number of resulting fields minus 1.
If your shell does not support the <<< operator, use echo:
echo "${string}" | awk -F"${char}" '{print NF-1}'
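One edge case with the NF-1 approach: for an empty string awk sees no fields at all, so it prints -1 rather than 0. As an alternative, here is a sketch using tr and wc instead of awk. This is a different technique from the answer above, the function name count_char is made up for illustration, and it assumes a single ordinary ASCII character (wc -c counts bytes, and tr treats its argument as a set).

```shell
# count_char STRING CHAR   (hypothetical helper)
# Prints how many times CHAR occurs in STRING.
count_char() {
    # tr -cd deletes every character that is NOT in the set, so only the matches remain;
    # wc -c counts them. The $(( ... )) wrapper strips the padding some wc versions emit.
    echo $(( $(printf '%s' "$1" | tr -cd "$2" | wc -c) ))
}
```

With the values from the example, count_char "text,text,text,text" "," prints 3, and an empty string gives 0.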
unix - breakdown of how many lines with number of character occurrences
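#!/usr/bin/env perl
use strict; use warnings;

# The string to count is the first argument; input comes from the remaining files or STDIN.
my $seq = shift @ARGV;
die "usage: $0 SEQ [FILE...]\n" unless defined $seq;

my %freq;    # occurrences per line => number of lines with that many
while ( my $line = <> ) {
    last unless $line =~ /\S/;    # stop at the first blank line
    # Assigning the match list to () in scalar context yields the number of matches;
    # \Q...\E makes $seq match literally.
    my $occurrences = () = $line =~ /(\Q$seq\E)/g;
    $freq{ $occurrences } += 1;
}

# Print the breakdown, highest occurrence count first.
for my $occurrences ( sort { $b <=> $a } keys %freq ) {
    print "$occurrences:\t$freq{$occurrences}\n";
}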
If you want short, you can always use:
#!/usr/bin/env perl
$x=shift;/\S/&&++$f{$a=()=/(\Q$x\E)/g}while<>
;print"$_:\t$f{$_}\n"for sort{$b<=>$a}keys%f;
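or, perl -e '$x=shift;/\S/&&++$f{$a=()=/(\Q$x\E)/g}while<>;print"$_:\t$f{$_}\n"for sort{$b<=>$a}keys%f' SEQ inputfile
(where SEQ is the string to count; shift takes the first argument, so it must come before the file name), but now I am getting silly.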
How can I use the UNIX shell to count the number of times a letter appears in a text file?
grep -o char filename | wc -l
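The -o option makes grep print every match on its own line, which is why wc -l gives the number of occurrences; grep -c would count matching lines instead. A minimal sketch wrapping this in a function follows. The name count_in_file is hypothetical, -F is added so the character is matched literally, and -o is assumed to be available (GNU, BSD and BusyBox grep have it, but it is not in POSIX).

```shell
# count_in_file CHAR FILE   (hypothetical helper)
# Prints the total number of times CHAR appears in FILE.
count_in_file() {
    # -o: one output line per match; -F: treat CHAR as a fixed string, not a regex.
    # $(( ... )) strips the padding some wc versions put in front of the number.
    echo $(( $(grep -oF -- "$1" "$2" | wc -l) ))
}
```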
Awk: count occurrence of each character for every column and write it in define order
This awk script produces the output that you want:
$ awk 'BEGIN{c["H"];c["G"];c["I"];c["B"];c["b"];c["T"];c["0"]}
{for(i=1;i<=NF;++i)++a[i,$i]}
END{for(i=1;i<=NF;++i){
printf "%s ",i;
for(j in c)printf "%s=%d ",j,a[i,j];print ""}}' file.txt
1 B=0 G=0 T=0 H=0 b=0 I=0 0=5
2 B=1 G=1 T=1 H=1 b=0 I=0 0=1
3 B=1 G=2 T=1 H=0 b=1 I=0 0=0
Initialise the array c in the BEGIN block so that it contains a key for every character. Loop through every field in each line. Increment the value of the array a whose key comprises the field number and the character in the field. Once every record has been processed, loop through the fields and the keys of the array c, printing the counts in the array a.
The keys in an array are not ordered, so when you use a for (x in y) loop, you cannot rely on a specific ordering of the output. If you would like to print the keys in a certain order, you have to specify that yourself. For example, you could do something like this:
$ awk '{for(i=1;i<=NF;++i)++a[i,$i]}
END{for(i=1;i<=NF;++i){
printf "%s ",i
printf "H=%d ", a[i,"H"]
printf "G=%d ", a[i,"G"]
printf "I=%d ", a[i,"I"]
printf "B=%d ", a[i,"B"]
printf "b=%d ", a[i,"b"]
printf "T=%d ", a[i,"T"]
printf "0=%d\n", a[i,"0"]
}}' file.txt
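Instead of writing one printf per character, the order can also be kept in an indexed array built with split and walked with a numeric loop. The following is only a sketch of that idea: the function name count_by_column and the convention of passing the order as one space-separated string are assumptions for illustration. It also records the largest NF seen rather than relying on the NF of the last line.

```shell
# count_by_column "ORDER" [FILE...]   (hypothetical helper)
# ORDER is a space-separated list of the characters, in the order they should be printed.
count_by_column() {
    order=$1
    shift
    awk -v order="$order" '
        BEGIN { n = split(order, key, " ") }   # key[1..n] keeps the requested order
        {
            for (i = 1; i <= NF; ++i) ++a[i, $i]
            if (NF > nf) nf = NF               # remember the widest line
        }
        END {
            for (i = 1; i <= nf; ++i) {
                printf "%s", i
                for (j = 1; j <= n; ++j) printf " %s=%d", key[j], a[i, key[j]]
                print ""
            }
        }' "$@"
}
```

For the order in the question this would be called as count_by_column "H G I B b T 0" file.txt.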
Unix awk - count of occurrences for each unique value
I think you need a better sample input file, but I guess this is what you're looking for:
$ awk -F' \\| ' -v OFS=, '{k=substr($3,1,1); ks[k]; c[k,length($3)]++}
END {for(k in ks) print k": "c[k,6],c[k,10],c[k,15]}' file
A: 1,,
B: 1,,
a: 2,,
b: 2,,
Note that since all lengths are 6, I printed that count instead of 8. With the right data you should be able to get the output you expect. Note, however, that the order is not preserved.