Count total number of patterns between two patterns (using sed if possible) in Linux
A very cryptic perl answer:
perl -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ge'
The tr function returns the number of characters transliterated.
With the new requirements, we can make a couple of small changes:
perl -0777 -nE 's/\{(.*?)\}/ say ($1 =~ tr{=}{=}) /ges'
-0777 reads the entire file/stream into a single string.
The s flag to the s/// function allows . to match newlines like any plain character.
How to iteratively find number of lines between two patterns?
awk '$0=="123" {if (n) print NR-1-n; n=NR}' file
This uses the line number of matched lines to print the number of lines between them.
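A small sketch of that command on hypothetical input, where the marker line 123 appears three times with two lines and then one line in between:

```shell
# Hypothetical input: markers on lines 1, 4 and 6.
gaps=$(printf '123\na\nb\n123\nc\n123\n' |
  awk '$0=="123" {if (n) print NR-1-n; n=NR}')
# One number per pair of consecutive markers.
echo "$gaps"
```

Nothing is printed for the first marker because n is still unset at that point; each later marker prints its distance from the previous one.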
Count the number of occurrences of a string using sed?
I don't think sed would be appropriate, unless you use it in a pipeline to convert your file so that the word you need appears on separate lines, and then use grep -c to count the occurrences.
I like Jonathan's idea of using tr to convert spaces to newlines. The beauty of this method is that successive spaces get converted to multiple blank lines but it doesn't matter because grep will be able to count just the lines with the single word 'title'.
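The pipeline described above might look like the following sketch. The input text is made up, and the -x option (whole-line match) is an addition here so that words such as subtitle are not counted.

```shell
# Hypothetical input; note the double space and the word "subtitle".
# tr puts one word per line (repeated spaces become blank lines),
# grep -c counts matching lines, and -x requires the whole line to match.
count=$(printf 'the title of  the title page\nsubtitle and title\n' |
  tr ' ' '\n' |
  grep -cx 'title')
echo "$count"
```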
How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?
Print lines between PAT1 and PAT2
$ awk '/PAT1/,/PAT2/' file
PAT1
3 - first block
4
PAT2
PAT1
7 - second block
PAT2
PAT1
10 - third block
Or, using variables:
awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' file
How does this work?
- /PAT1/ matches lines having this text, as well as /PAT2/ does.
- /PAT1/{flag=1} sets the flag when the text PAT1 is found in a line.
- /PAT2/{flag=0} unsets the flag when the text PAT2 is found in a line.
- flag is a pattern with the default action, which is to print $0: if flag is equal to 1 the line is printed. This way, it will print all those lines occurring from the time PAT1 occurs and up to the point where the next PAT2 is seen. This will also print the lines from the last match of PAT1 up to the end of the file.
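To check that the range form and the flag form select the same lines, here is a minimal sketch. The sample file is an assumption, reconstructed from the outputs shown in this answer.

```shell
# Assumed sample file, reconstructed from the outputs shown above.
sample=$(mktemp)
printf '%s\n' 1 2 PAT1 '3 - first block' 4 PAT2 5 6 \
  PAT1 '7 - second block' PAT2 8 9 PAT1 '10 - third block' > "$sample"

range_out=$(awk '/PAT1/,/PAT2/' "$sample")
flag_out=$(awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' "$sample")

# Both forms give the same lines, including the unterminated third block.
[ "$range_out" = "$flag_out" ] && echo "same output"
echo "$flag_out" | tail -n 2
rm -f "$sample"
```

The two forms can differ when a single line matches both PAT1 and PAT2, which this input avoids.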
Print lines between PAT1 and PAT2 - not including PAT1 and PAT2
$ awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' file
3 - first block
4
7 - second block
10 - third block
This uses next to skip the line that contains PAT1 in order to avoid this being printed.
This call to next can be dropped by reshuffling the blocks: awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}' file.
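A quick way to convince yourself that the reshuffled form behaves like the next form is to run both on the same input. The input below is hypothetical, modelled on the outputs above.

```shell
# Assumed sample input with two complete PAT1..PAT2 blocks.
input=$(printf '%s\n' 1 PAT1 '3 - first block' 4 PAT2 5 PAT1 '7 - second block' PAT2)

with_next=$(printf '%s\n' "$input" | awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag')
reshuffled=$(printf '%s\n' "$input" | awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}')

echo "$reshuffled"
```

The reshuffled form works because each line is tested against flag after PAT2 has had a chance to clear it and before PAT1 can set it, so neither marker line is printed.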
Print lines between PAT1 and PAT2 - including PAT1
$ awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' file
PAT1
3 - first block
4
PAT1
7 - second block
PAT1
10 - third block
By placing flag at the very end, it triggers the action that was set on either PAT1 or PAT2: to print on PAT1, not to print on PAT2.
Print lines between PAT1 and PAT2 - including PAT2
$ awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' file
3 - first block
4
PAT2
7 - second block
PAT2
10 - third block
By placing flag at the very beginning, it triggers the action that was set previously and hence prints the closing pattern but not the starting one.
Print lines between PAT1 and PAT2 - excluding lines from the last PAT1 to the end of file if no other PAT2 occurs
This is based on a solution by Ed Morton.
awk 'flag{
    if (/PAT2/)
        {printf "%s", buf; flag=0; buf=""}
    else
        buf = buf $0 ORS
}
/PAT1/ {flag=1}' file
As a one-liner:
$ awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}' file
3 - first block
4
7 - second block
# note the lack of third block, since no other PAT2 happens after it
This keeps all the selected lines in a buffer that gets populated from the moment PAT1 is found. Then, it keeps being filled with the following lines until PAT2 is found. At that point, it prints the stored content and empties the buffer.
using sed to print between two patterns
dtpwmbp:~ pwadas$ echo "Alas poor Yorik, I knew him well" | sed -e 's/^.*poor //g;s/ well.*$//g'
Yorik, I knew him
dtpwmbp:~ pwadas$ echo "Alas poor Yorik, I knew him well" | awk '{sub(/.*poor /,"");sub(/ well.*/,"");print;}'
Yorik, I knew him
Usage with file input:
dtpwmbp:~ pwadas$ echo "Alas poor Yorik, I knew him well" > infile
dtpwmbp:~ pwadas$ cat infile
Alas poor Yorik, I knew him well
dtpwmbp:~ pwadas$ cat infile | sed -e 's/^.*poor //g;s/ well.*$//g'
Yorik, I knew him
dtpwmbp:~ pwadas$ sed -e 's/^.*poor //g;s/ well.*$//g' < infile
Yorik, I knew him
dtpwmbp:~ pwadas$ cat infile | awk '{sub(/.*poor /,"");sub(/ well.*/,"");print;}'
Yorik, I knew him
dtpwmbp:~ pwadas$ awk '{sub(/.*poor /,"");sub(/ well.*/,"");print;}' < infile
Yorik, I knew him
How to use sed to identify patterns in multiple lines
One possible sed solution:
sed -r 's/^[[:digit:]]+\. /# /g' <inputfile>
-r: treat the search pattern as an extended regex
s/^[[:digit:]]+\. /# /g: look for lines that start with 1 or more digits followed by a period and a space, and if found replace with a # followed by a space
Leave all other lines as they are (i.e., don't make any changes)
For example:
$ cat datfile
1. numberedlist
2. one
3. two
where in the world is waldo
10. pickles
15. jam
# I'm just a comment
sky blue
100. bash
101. ksh
102. csh
72.don't touch this
# rubber ducky
And a test run of our sed script:
$ sed -r 's/^[[:digit:]]+\. /# /g' datfile
# numberedlist
# one
# two
where in the world is waldo
# pickles
# jam
# I'm just a comment
sky blue
# bash
# ksh
# csh
72.don't touch this
# rubber ducky
Using sed to delete all lines between two matching patterns
Use this sed command to achieve that:
sed '/^#/,/^\$/{/^#/!{/^\$/!d}}' file.txt
Mac users (to prevent the "extra characters at the end of d command" error) need to add semicolons before the closing brackets:
sed '/^#/,/^\$/{/^#/!{/^\$/!d;};}' file.txt
OUTPUT
# ID 1
$ description 1
blah blah
# ID 2
$ description 2
blah blah
blah blah
Explanation:
- /^#/,/^\$/ will match all the text between lines starting with # to lines starting with $. ^ is used for start of line character. $ is a special character so needs to be escaped.
- /^#/! means do following if start of line is not #
- /^\$/! means do following if start of line is not $
- d means delete
So overall it is first matching all the lines from ^# to ^\$, then from those matched lines finding lines that don't match ^# and don't match ^\$, and deleting them using d.
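Since the answer shows only the output, here is a sketch with an assumed input that would produce it. The filler lines are invented for illustration. The semicolon form of the command is used because it is accepted by both GNU and BSD sed.

```shell
# Hypothetical input: filler lines sit between each "# ID" line and the
# following "$ description" line.
cleaned=$(printf '%s\n' '# ID 1' 'filler a' '$ description 1' 'blah blah' \
  '# ID 2' 'filler b' 'filler c' '$ description 2' 'blah blah' 'blah blah' |
  sed '/^#/,/^\$/{/^#/!{/^\$/!d;};}')
echo "$cleaned"
```

The blah blah lines survive because they come after the $ line and therefore fall outside the /^#/,/^\$/ range.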
How to select lines between two marker patterns which may occur multiple times with awk/sed
Use awk with a flag to trigger the print when necessary:
$ awk '/abc/{flag=1;next}/mno/{flag=0}flag' file
def1
ghi1
jkl1
def2
ghi2
jkl2
How does this work?
- /abc/ matches lines having this text, as well as /mno/ does.
- /abc/{flag=1;next} sets the flag when the text abc is found. Then, it skips the line.
- /mno/{flag=0} unsets the flag when the text mno is found.
- The final flag is a pattern with the default action, which is to print $0: if flag is equal to 1 the line is printed.
For a more detailed description and examples, together with cases when the patterns are either shown or not, see How to select lines between two patterns?.
How to use sed/grep to extract text between two words?
sed -e 's/Here\(.*\)String/\1/'
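A short sketch of how this behaves, using made-up input lines. The second command is a variation added here, not part of the original answer: it also discards whatever surrounds the two words.

```shell
# Hypothetical input line containing the two marker words.
between=$(echo 'Here is a String' | sed -e 's/Here\(.*\)String/\1/')
# Brackets make the surrounding spaces visible.
echo "[$between]"

# Variation (an assumption about what is often wanted): also drop the text
# before "Here" and after "String".
trimmed=$(echo 'say Here is a String now' | sed -e 's/.*Here\(.*\)String.*/\1/')
echo "[$trimmed]"
```

Because .* is greedy, the capture extends to the last String on the line, and the spaces next to the marker words are part of the result.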