Extract Lines Between Two Patterns from a File

How to print lines between two patterns, inclusive or exclusive (in sed, AWK or Perl)?


Print lines between PAT1 and PAT2

$ awk '/PAT1/,/PAT2/' file
PAT1
3 - first block
4
PAT2
PAT1
7 - second block
PAT2
PAT1
10 - third block

Or, using variables:

awk '/PAT1/{flag=1} flag; /PAT2/{flag=0}' file

How does this work?

  • /PAT1/ matches lines containing this text, and /PAT2/ does the same for PAT2.
  • /PAT1/{flag=1} sets the flag when the text PAT1 is found in a line.
  • /PAT2/{flag=0} unsets the flag when the text PAT2 is found in a line.
  • flag is a pattern with the default action, which is to print $0: if flag is equal to 1, the line is printed. This prints every line from the point where PAT1 occurs until the next PAT2 is seen. It also prints the lines from the last match of PAT1 up to the end of the file when no closing PAT2 follows.
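
The flag logic in the bullets above can be sketched in Python. This is only an illustration: the function name between_inclusive and the sample lines are assumptions (the sample is chosen to be consistent with the output shown earlier), and plain substring tests stand in for awk's regular expressions.

```python
def between_inclusive(lines, start="PAT1", end="PAT2"):
    """Yield each line from a start match through the next end match."""
    flag = False
    for line in lines:
        if start in line:   # /PAT1/{flag=1}
            flag = True
        if flag:            # flag  (default action: print the line)
            yield line
        if end in line:     # /PAT2/{flag=0}
            flag = False


# Hypothetical sample input, consistent with the awk output above.
sample = [
    "1", "2", "PAT1", "3 - first block", "4", "PAT2", "5", "6",
    "PAT1", "7 - second block", "PAT2", "8", "9",
    "PAT1", "10 - third block",
]

for line in between_inclusive(sample):
    print(line)
```

As in the awk version, the last block is printed even though no closing PAT2 follows it.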

Print lines between PAT1 and PAT2 - not including PAT1 and PAT2

$ awk '/PAT1/{flag=1; next} /PAT2/{flag=0} flag' file
3 - first block
4
7 - second block
10 - third block

This uses next to skip the line that contains PAT1 in order to avoid this being printed.

This call to next can be dropped by reshuffling the blocks: awk '/PAT2/{flag=0} flag; /PAT1/{flag=1}' file.

Print lines between PAT1 and PAT2 - including PAT1

$ awk '/PAT1/{flag=1} /PAT2/{flag=0} flag' file
PAT1
3 - first block
4
PAT1
7 - second block
PAT1
10 - third block

By placing flag at the very end, it triggers the action that was set on either PAT1 or PAT2: to print on PAT1, not to print on PAT2.

Print lines between PAT1 and PAT2 - including PAT2

$ awk 'flag; /PAT1/{flag=1} /PAT2/{flag=0}' file
3 - first block
4
PAT2
7 - second block
PAT2
10 - third block

By placing flag at the very beginning, it triggers the action that was set on the previous line, and hence prints the closing pattern but not the starting one.
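
The four variants above differ only in whether each flag update runs before or after the print test. A minimal Python sketch of that idea follows. The names between, keep_start and keep_end are made up for illustration, the sample lines are an assumption consistent with the outputs shown, and substring tests replace awk's regular expressions.

```python
def between(lines, start="PAT1", end="PAT2", keep_start=True, keep_end=True):
    """Yield lines between start and end, optionally keeping the markers."""
    flag = False
    for line in lines:
        # Updates placed before the print test affect the current line.
        if keep_start and start in line:
            flag = True
        if not keep_end and end in line:
            flag = False
        if flag:
            yield line
        # Updates placed after the print test take effect from the next line.
        if not keep_start and start in line:
            flag = True
        if keep_end and end in line:
            flag = False


# Hypothetical sample input, consistent with the awk outputs above.
sample = [
    "1", "2", "PAT1", "3 - first block", "4", "PAT2", "5", "6",
    "PAT1", "7 - second block", "PAT2", "8", "9",
    "PAT1", "10 - third block",
]

print(list(between(sample, keep_start=False, keep_end=False)))
```

Setting the marker's update before the print test keeps the marker line; moving it after the test drops it.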

Print lines between PAT1 and PAT2 - excluding lines from the last PAT1 to the end of file if no other PAT2 occurs

This is based on a solution by Ed Morton.

awk 'flag{
    if (/PAT2/)
        {printf "%s", buf; flag=0; buf=""}
    else
        buf = buf $0 ORS
}
/PAT1/ {flag=1}' file

As a one-liner:

$ awk 'flag{ if (/PAT2/){printf "%s", buf; flag=0; buf=""} else buf = buf $0 ORS}; /PAT1/{flag=1}' file
3 - first block
4
7 - second block

# note the lack of third block, since no other PAT2 happens after it

This keeps all the selected lines in a buffer that starts being populated once PAT1 is found. It keeps being filled with the following lines until PAT2 is found. At that point, it prints the stored content and empties the buffer. A block that never reaches a PAT2 is therefore never printed.
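
The buffering approach can be sketched in Python as follows. The function name complete_blocks and the sample lines are assumptions for illustration (the sample is chosen to match the output shown above), and substring tests stand in for awk's regular expressions.

```python
def complete_blocks(lines, start="PAT1", end="PAT2"):
    """Yield lines strictly between start and end, only for closed blocks."""
    flag = False
    buf = []
    for line in lines:
        if flag:
            if end in line:
                # The block is closed: release what was collected.
                yield from buf
                flag = False
                buf = []
            else:
                buf.append(line)
        if start in line:
            flag = True
    # Anything still in buf here belongs to an unterminated block
    # and is deliberately discarded.


# Hypothetical sample input, consistent with the awk output above.
sample = [
    "1", "2", "PAT1", "3 - first block", "4", "PAT2", "5", "6",
    "PAT1", "7 - second block", "PAT2", "8", "9",
    "PAT1", "10 - third block",
]

print(list(complete_blocks(sample)))
```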

Extract lines between two patterns from a file

This can be an approach:

$ awk '/pattern1/ {p=1}; p; /pattern2/ {p=0}' file
********************************* Results *********************************
SUCCEEDED
...
...
some text
***************************************************************************
  • When a line matches pattern1, the variable p is set to 1.
  • Lines are printed only while p==1. This is what the bare p condition does: if it is true, awk performs its default action, print $0; otherwise nothing is printed.
  • When a line matches pattern2, p is set back to 0. Because this check comes after the p condition, the first line containing pattern2 is still printed.

If you want an exact match of the lines:

$ awk '$0=="pattern1" {p=1}; p; $0=="pattern2" {p=0}' file
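
To illustrate why the exact comparison can matter, here is a small Python sketch. The function name between_exact and the sample lines are hypothetical, and the lines are assumed to have their trailing newlines already stripped.

```python
def between_exact(lines, start="pattern1", end="pattern2"):
    """Yield lines from a line equal to start through a line equal to end."""
    p = False
    for line in lines:
        if line == start:   # $0=="pattern1" {p=1}
            p = True
        if p:               # p  (default action: print the line)
            yield line
        if line == end:     # $0=="pattern2" {p=0}
            p = False


# Hypothetical input: the first line merely contains the marker text.
sample = ["pattern10 is not a marker", "pattern1", "some text",
          "pattern2", "trailing"]

print(list(between_exact(sample)))
```

With a substring or regex test, the first line would already switch printing on; the equality test ignores it.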

Test

$ cat a
***************************************************************************
text line # n-2
pattern1
********************************* Results *********************************
SUCCEEDED
...
...
some text
***************************************************************************
pattern2
text line # m+2
pattern2
***************************************************************************
$ awk '/pattern1/ {p=1}; p; /pattern2/ {p=0}' a
pattern1
********************************* Results *********************************
SUCCEEDED
...
...
some text
***************************************************************************
pattern2

Extract lines between two patterns from a file, not including the end-pattern matches

Like this:

sed -n '/begin/,/end/{/end/!p}'

This prints all lines in the range from begin to end, except the line containing end itself.

How to print lines between two patterns in a file using sed or AWK?

You may use this sed:

sed -n '/MULTIPLE-RESOURCES/,/^###$/ { /###$/!p; }' file

### MULTIPLE-RESOURCES

#### Viewing Resource Information

> kubectl get svc, po
> kubectl get deploy, no
> kubectl get all
> kubectl get all --all-namespaces

## KUBECTL

Extract lines between two patterns and remove in between lines with if condition

Use sed or Perl:

sed '/001.*start/,/001.*end/!d;/002.*start/,/002.*end/d'

perl -ne 'print if /001.*start/ .. /001.*end/
and not /002.*start/ .. /002.*end/'

Using a negative look-ahead assertion makes the excluded tag dynamic, so that any tag other than 001 is excluded:

perl -ne 'print if /001.*start/ .. /001.*end/
and not /text \[(?!001).*start/ .. /text \[(?!001).*end/'
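
For readers less familiar with Perl's range operator, the following Python sketch models the first Perl command above (fixed tags 001 and 002). The FlipFlop class is a rough, hypothetical model of the scalar `..` operator, and the sample lines are invented, since the original input is not shown here.

```python
import re


class FlipFlop:
    """Rough model of Perl's scalar `..` operator with regex operands."""

    def __init__(self, start, end):
        self.start = re.compile(start)
        self.end = re.compile(end)
        self.active = False

    def __call__(self, line):
        if not self.active:
            if not self.start.search(line):
                return False
            self.active = True
        # With two dots, the end test may fire on the opening line too.
        if self.end.search(line):
            self.active = False
        return True  # the closing line still belongs to the range


outer = FlipFlop(r"001.*start", r"001.*end")
inner = FlipFlop(r"002.*start", r"002.*end")

# Invented sample input for illustration only.
sample = [
    "text [001] start", "a", "text [002] start", "b",
    "text [002] end", "c", "text [001] end", "d",
]

# `and` short-circuits, so inner is only consulted inside the outer range.
kept = [line for line in sample if outer(line) and not inner(line)]
print(kept)
```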

Extract lines between specific start/end pattern from text file

You can work with a boolean mode flag such as extract_on, which signals whether the current line lies between start and stop and should be extracted.
Line matching can be done with the re.match function, which returns either a match object or None.

import re

pattern_start = re.compile(r"^vsi ipcbb")
pattern_stop = re.compile(r"^vsi ipcbb-ipran")

i = 0
extract_on = False  # True while between the start and stop lines
extracts = []
with open(r'readline.txt', 'rt') as myfile:
    for line in myfile:
        i += 1  # line counting starts with 1
        if pattern_start.match(line):
            extract_on = True
        # the stop line also matches pattern_start, so test it second
        if pattern_stop.search(line):
            extract_on = False
        if extract_on:
            extracts.append((i, line.rstrip('\n')))

for line in extracts:
    print(line)

Given your input, it will ignore the first 4 lines, extract the middle 5, and ignore the last 5.
The printout of the extracted lines, including their position in the file, is:

(5, 'vsi ipcbb-RAC_YBPNM01H-00 static')
(6, ' description *** M-ipcbb-RAC_YBPNM01H(via RAG_MBSPM01H&RAG_YBPNM01H) ***')
(7, ' tnl-policy TE')
(8, ' diffserv-mode pipe af1 green')
(9, '#')

The XLS writing is left out here; it is assumed to work as expected.

How to print a range of lines between two patterns only if another pattern is included in this range?

You can use

awk 'flag{
    buf = buf $0 ORS;
    if (/PatternEnd/ && buf ~ /PatternInside/)
        {printf "%s", buf; flag=0; buf=""}
}
/PatternStart/{buf = $0 ORS; flag=1}' file

Here, /PatternStart/{buf = $0 ORS; flag=1} finds the line that matches PatternStart, starts collecting output in buf, and sets the flag. While the flag is set, subsequent lines are appended to buf. Once a line matches PatternEnd and PatternInside has a match somewhere in buf, the buffer is printed, buf is cleared and the flag is reset.

See the online demo that yields

PatternStart
line1
line2
PatternInside
line3
line4
PatternEnd
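
The same buffering idea can be sketched in Python. The function name blocks_containing and the sample input are assumptions made for illustration, and substring tests replace awk's regular expressions.

```python
def blocks_containing(lines, start, end, inside):
    """Yield start..end blocks, but only those that contain `inside`."""
    flag = False
    buf = []
    for line in lines:
        if flag:
            buf.append(line)
            if end in line and any(inside in kept for kept in buf):
                yield from buf
                flag = False
                buf = []
        if start in line:
            # A new start line restarts the buffer, dropping any block
            # that ended without containing `inside`.
            buf = [line]
            flag = True


# Invented sample: the first block lacks PatternInside and is dropped.
sample = [
    "PatternStart", "x", "PatternEnd", "junk",
    "PatternStart", "line1", "line2", "PatternInside",
    "line3", "line4", "PatternEnd", "tail",
]

result = list(blocks_containing(
    sample, "PatternStart", "PatternEnd", "PatternInside"))
print("\n".join(result))
```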

Print all lines between two patterns, exclusive, first instance only (in sed, AWK or Perl)

With awk (assumes that PATTERN1 and PATTERN2 always occur in pairs and that neither of them occurs inside a pair)

$ cat ip.txt
aaa
PATTERN1
bbb
ccc
ddd
PATTERN2
eee
fff
PATTERN1
ggg
hhh
iii
PATTERN2
jjj

$ awk '/PATTERN2/{exit} f; /PATTERN1/{f=1}' ip.txt
bbb
ccc
ddd
  • /PATTERN1/{f=1} sets the flag if /PATTERN1/ is matched
  • /PATTERN2/{exit} exits if /PATTERN2/ is matched
  • f; prints the input line if the flag is set



Generic solution, where the block required can be specified

$ awk -v b=1 '/PATTERN2/ && c==b{exit} c==b; /PATTERN1/{c++}' ip.txt
bbb
ccc
ddd
$ awk -v b=2 '/PATTERN2/ && c==b{exit} c==b; /PATTERN1/{c++}' ip.txt
ggg
hhh
iii
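
The counter-based selection of the n-th block can be sketched in Python as follows. The function name nth_block is an assumption for illustration, the input list mirrors ip.txt from above, and substring tests replace awk's regular expressions.

```python
def nth_block(lines, n, start="PATTERN1", end="PATTERN2"):
    """Return the lines strictly inside the n-th start..end block."""
    count = 0
    out = []
    for line in lines:
        if end in line and count == n:   # /PATTERN2/ && c==b {exit}
            break
        if count == n:                   # c==b  (print the line)
            out.append(line)
        if start in line:                # /PATTERN1/{c++}
            count += 1
    return out


# Same content as ip.txt above.
ip = [
    "aaa", "PATTERN1", "bbb", "ccc", "ddd", "PATTERN2", "eee", "fff",
    "PATTERN1", "ggg", "hhh", "iii", "PATTERN2", "jjj",
]

print(nth_block(ip, 1))
print(nth_block(ip, 2))
```

The break mirrors awk's exit: once the requested block is closed, the rest of the input is not read.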

