Count Total Number of Pattern Between Two Pattern (Using Sed If Possible) in Linux

A very cryptic perl answer:

perl -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ge'

The tr function returns the number of characters transliterated, so mapping = to itself counts the = characters inside each pair of braces without changing them.


With the new requirements, we can make a couple of small changes:

perl -0777 -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ges'
  • -0777 reads the entire file/stream into a single string
  • the s flag on s/// lets . match newline characters as well, so a {...} group may span several lines.

How to iteratively find number of lines between two patterns?

awk '$0=="123" {if (n) print NR-1-n; n=NR}' file

This uses the line number of matched lines to print the number of lines between them.
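
A small sketch of that arithmetic, assuming a made-up input in which the marker line 123 appears three times:

```shell
# Hypothetical input: "123" on lines 1, 4 and 6.
gaps=$(printf '%s\n' 123 a b 123 c 123 |
  awk '$0=="123" {if (n) print NR-1-n; n=NR}')

# One number per pair of consecutive markers: NR-1-n lines lie between them.
printf '%s\n' "$gaps"
```

Nothing is printed for the first marker because n is still unset at that point; each later marker reports the distance to the previous one.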

Count the number of occurrences of a string using sed?

I don't think sed would be appropriate, unless you use it in a pipeline to convert your file so that the word you need appears on separate lines, and then use grep -c to count the occurrences.

I like Jonathan's idea of using tr to convert spaces to newlines. The beauty of this method is that successive spaces get converted to multiple blank lines but it doesn't matter because grep will be able to count just the lines with the single word 'title'.
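
The pipeline described above might look like the sketch below. The sample text is hypothetical, and the -x option (match whole lines only) is an addition made here so that words such as subtitle are not counted.

```shell
# Hypothetical text: "title" three times, plus "subtitle" and a double space.
text='title one  title
subtitle title'

# Put each space-separated word on its own line, then count the lines
# that consist of exactly the word "title".
title_count=$(printf '%s\n' "$text" | tr ' ' '\n' | grep -cx 'title')

printf '%s\n' "$title_count"
```

The double space produces an empty line after tr, which grep simply does not count.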

How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?

Print lines between PAT1 and PAT2

$ awk '/PAT1/,/PAT2/' file
PAT1
3 - first block
4
PAT2
PAT1
7 - second block
PAT2
PAT1
10 - third block

Or, using variables:

awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' file

How does this work?

  • /PAT1/ and /PAT2/ each match lines containing the respective text.
  • /PAT1/{flag=1} sets the flag when the text PAT1 is found in a line.
  • /PAT2/{flag=0} unsets the flag when the text PAT2 is found in a line.
  • flag is a pattern with the default action, which is to print $0: if flag is 1, the line is printed. This prints every line from an occurrence of PAT1 up to the next PAT2. It also prints the lines from the last PAT1 to the end of the file if no PAT2 follows.
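
The sketch below compares the range form with the flag form. The input is an assumption: it was chosen to be consistent with the output shown above, since the original file is not reproduced here.

```shell
# Hypothetical sample input, consistent with the output shown above.
sample=$(printf '%s\n' 1 2 PAT1 '3 - first block' 4 PAT2 5 6 \
  PAT1 '7 - second block' PAT2 8 9 PAT1 '10 - third block')

range_out=$(printf '%s\n' "$sample" | awk '/PAT1/,/PAT2/')
flag_out=$(printf '%s\n' "$sample" | awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}')

# On this input both forms select the same lines,
# including the third block that has no closing PAT2.
if [ "$range_out" = "$flag_out" ]; then
  echo "same output"
fi
printf '%s\n' "$flag_out"
```

The flag form is longer, but it is the one that can be rearranged to include or exclude the marker lines.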

Print lines between PAT1 and PAT2 - not including PAT1 and PAT2

$ awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' file
3 - first block
4
7 - second block
10 - third block

This uses next to skip the line that contains PAT1 so that it is not printed.

This call to next can be dropped by reshuffling the blocks: awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}' file.
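
A quick way to convince yourself that the two spellings agree is to run both on the same input. The short input below is hypothetical.

```shell
# Hypothetical input: one closed block and one unterminated block.
input=$(printf '%s\n' x PAT1 a b PAT2 y PAT1 c)

# Variant with next: skip the PAT1 line before flag is tested.
with_next=$(printf '%s\n' "$input" | awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag')

# Reordered variant: flag is tested before PAT1 can set it.
reordered=$(printf '%s\n' "$input" | awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}')

printf '%s\n' "$with_next"
```

Both variants print a, b and c here, and neither prints a marker line.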

Print lines between PAT1 and PAT2 - including PAT1

$ awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' file
PAT1
3 - first block
4
PAT1
7 - second block
PAT1
10 - third block

Placing flag at the very end means it is evaluated after both assignments, so the current line is printed when it matches PAT1 and suppressed when it matches PAT2.

Print lines between PAT1 and PAT2 - including PAT2

$ awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' file
3 - first block
4
PAT2
7 - second block
PAT2
10 - third block

Placing flag at the very beginning means it is evaluated with the value left by the previous line, so the closing pattern is printed but the starting one is not.
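
The effect of where flag sits can be seen side by side in the sketch below, which runs the "including PAT1" and "including PAT2" programs on one hypothetical input.

```shell
# Hypothetical input: one closed block and one unterminated block.
input=$(printf '%s\n' x PAT1 a b PAT2 y PAT1 c)

# flag tested last: it already reflects the current line's marker.
incl_start=$(printf '%s\n' "$input" | awk '/PAT1/{flag=1} /PAT2/{flag=0} flag')

# flag tested first: it still holds the state left by the previous line.
incl_end=$(printf '%s\n' "$input" | awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}')

printf 'including PAT1:\n%s\n' "$incl_start"
printf 'including PAT2:\n%s\n' "$incl_end"
```

The first program keeps the PAT1 lines and drops PAT2; the second does the opposite.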

Print lines between PAT1 and PAT2 - excluding lines from the last PAT1 to the end of file if no other PAT2 occurs

This is based on a solution by Ed Morton.

awk 'flag{
if (/PAT2/)
{printf "%s", buf; flag=0; buf=""}
else
buf = buf $0 ORS
}
/PAT1/ {flag=1}' file

As a one-liner:

$ awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}' file
3 - first block
4
7 - second block

# note the lack of third block, since no other PAT2 happens after it

This keeps the selected lines in a buffer that starts filling once PAT1 is found and keeps growing with each following line until PAT2 is found. At that point, it prints the stored content and empties the buffer. A block that never reaches PAT2 stays in the buffer and is never printed.
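
To make the difference visible, the sketch below runs the buffered program and the plain exclusive program on the same hypothetical input, whose last block has no closing PAT2.

```shell
# Hypothetical input: one closed block (a, b) and one unterminated block (c).
input=$(printf '%s\n' x PAT1 a b PAT2 y PAT1 c)

# Buffered version: lines are released only when PAT2 is seen.
closed_only=$(printf '%s\n' "$input" |
  awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}')

# Plain exclusive version, for comparison: it also prints the open block.
all_blocks=$(printf '%s\n' "$input" |
  awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag')

printf 'buffered:\n%s\n' "$closed_only"
printf 'unbuffered:\n%s\n' "$all_blocks"
```

Only the unbuffered version prints c, because that line is never followed by PAT2.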

using sed to print between two patterns

dtpwmbp:~ pwadas$ echo "Alas poor Yorik, I knew him well" | sed -e 's/^.*poor //g;s/ well.*$//g'
Yorik, I knew him
dtpwmbp:~ pwadas$ echo "Alas poor Yorik, I knew him well" | awk '{sub(/.*poor /,"");sub(/ well.*/,"");print;}'
Yorik, I knew him

Usage with file input:

dtpwmbp:~ pwadas$ echo "Alas poor Yorik, I knew him well" > infile
dtpwmbp:~ pwadas$ cat infile
Alas poor Yorik, I knew him well
dtpwmbp:~ pwadas$ cat infile | sed -e 's/^.*poor //g;s/ well.*$//g'
Yorik, I knew him
dtpwmbp:~ pwadas$ sed -e 's/^.*poor //g;s/ well.*$//g' < infile
Yorik, I knew him
dtpwmbp:~ pwadas$ cat infile | awk '{sub(/.*poor /,"");sub(/ well.*/,"");print;}'
Yorik, I knew him
dtpwmbp:~ pwadas$ awk '{sub(/.*poor /,"");sub(/ well.*/,"");print;}' < infile
Yorik, I knew him

How to use sed to identify patterns in multiple lines

One possible sed solution:

sed -r 's/^[[:digit:]]+\. /# /g' <inputfile>
  • -r : treat the search pattern as an extended regex (a GNU sed option; -E is the more widely supported spelling)
  • s/^[[:digit:]]+\. /# /g : look for lines that start with 1 or more digits followed by a period and a space, and if found replace that prefix with a # followed by a space
  • leave all other lines as they are (ie, don't make any changes)

For example:

$ cat datfile
1. numberedlist
2. one
3. two
where in the world is waldo
10. pickles
15. jam
# I'm just a comment
sky blue
100. bash
101. ksh
102. csh
72.don't touch this
# rubber ducky

And a test run of our sed script:

$ sed -r 's/^[[:digit:]]+\. /# /g' datfile
# numberedlist
# one
# two
where in the world is waldo
# pickles
# jam
# I'm just a comment
sky blue
# bash
# ksh
# csh
72.don't touch this
# rubber ducky

Using sed to delete all lines between two matching patterns

Use this sed command to achieve that:

sed '/^#/,/^\$/{/^#/!{/^\$/!d}}' file.txt

On macOS (BSD sed), add a semicolon before each closing brace to avoid the "extra characters at the end of d command" error:

sed '/^#/,/^\$/{/^#/!{/^\$/!d;};}' file.txt

OUTPUT

# ID 1
$ description 1
blah blah
# ID 2
$ description 2
blah blah
blah blah

Explanation:

  • /^#/,/^\$/ matches every range of lines that runs from a line starting with # to the next line starting with $. ^ anchors the match at the start of the line. $ is a special character, so it needs to be escaped.
  • /^#/! means do the following if the line does not start with #
  • /^\$/! means do the following if the line does not start with $
  • d means delete

So overall, it first selects all the lines from ^# to ^\$, then, within those lines, finds the ones that match neither ^# nor ^\$ and deletes them using d.
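
The input file is not shown above, so the sketch below uses a hypothetical one that would lead to the listed output. It uses the semicolon form of the command, on the assumption that it is accepted by both GNU and BSD sed.

```shell
# Hypothetical input: "junk" lines sit between each "#" line and the next "$" line.
input='# ID 1
junk
$ description 1
blah blah
# ID 2
junk
more junk
$ description 2
blah blah
blah blah'

# Inside each #..$ range, delete every line that is neither the # nor the $ line.
cleaned=$(printf '%s\n' "$input" | sed '/^#/,/^\$/{/^#/!{/^\$/!d;};}')

printf '%s\n' "$cleaned"
```

Lines after a $ line are outside the range, which is why the blah blah lines survive.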

How to select lines between two marker patterns which may occur multiple times with awk/sed

Use awk with a flag to trigger the print when necessary:

$ awk '/abc/{flag=1;next}/mno/{flag=0}flag' file
def1
ghi1
jkl1
def2
ghi2
jkl2

How does this work?

  • /abc/ matches lines having this text, as well as /mno/ does.
  • /abc/{flag=1;next} sets the flag when the text abc is found. Then, it skips the line.
  • /mno/{flag=0} unsets the flag when the text mno is found.
  • The final flag is a pattern with the default action, which is to print $0: if flag is equal 1 the line is printed.
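
For a runnable version, the sketch below applies the command to a hypothetical input that would produce the output shown above; the original file is not reproduced here, so the extra pqr line is an assumption.

```shell
# Hypothetical input: two abc..mno blocks separated by an unrelated line.
input=$(printf '%s\n' abc def1 ghi1 jkl1 mno pqr abc def2 ghi2 jkl2 mno)

# Print only the lines strictly between abc and mno, for every block.
between=$(printf '%s\n' "$input" | awk '/abc/{flag=1;next}/mno/{flag=0}flag')

printf '%s\n' "$between"
```

The pqr line is dropped because flag is 0 between the two blocks.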

For a more detailed description and examples, including the cases where the marker lines are printed or omitted, see "How to select lines between two patterns?".

How to use sed/grep to extract text between two words?

sed -e 's/Here\(.*\)String/\1/'
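
A short sketch of what this substitution does, using made-up sample strings. Note that it replaces only the matched part, so any text before Here or after String stays on the line, and the spaces around the captured text are kept.

```shell
# Hypothetical input where the whole line is matched.
extracted=$(echo 'Here is a String' | sed -e 's/Here\(.*\)String/\1/')

# Hypothetical input with text outside the two words: it is left in place.
with_context=$(echo 'say Here is a String now' | sed -e 's/Here\(.*\)String/\1/')

# Brackets make the surrounding spaces visible.
printf '[%s]\n' "$extracted"
printf '[%s]\n' "$with_context"
```

If only the text between the two words is wanted, the pattern would need to cover the rest of the line as well, for example by adding .* on both sides.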

