Reading lines between two strings in text file using python
Just rearrange your if statements. Think about the order in which they run and when the if flag check is evaluated. You can also use elif so that only one of the three conditions executes per iteration, but make sure the elif flag line is the last condition.
With the way your example is set up, it checks whether the line starts with START and then sets the flag. Immediately afterwards it checks whether the flag is set, so it prints the START line. Likewise, it prints every line first and only afterwards checks whether that line was END, so the END line is printed too.
After rearranging, if the line starts with START, there is no statement below that check that prints the line. Similarly, the code checks whether it should stop before it prints the END line.
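data = []
flag = False
with open('/tmp/test.txt', 'r') as f:
    for line in f:
        if line.strip().endswith('END'):
            flag = False   # stop before the END line is stored
        if flag:
            data.append(line)
        if line.startswith('START'):
            flag = True    # start storing from the next line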
The elif version is probably the better way to go: it saves a few if checks, and only one branch can run per iteration, so a line that changes the flag is never printed.
data = []
flag = False
with open('/tmp/test.txt', 'r') as f:
    for line in f:
        if line.startswith('START'):
            flag = True
        elif line.strip().endswith('END'):
            flag = False
        elif flag:
            data.append(line)
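To try the elif ordering without a file on disk, here is a minimal sketch that wraps the same logic in a function. The function name lines_between and the sample lines are assumptions made for illustration; they are not part of the original question.

```python
def lines_between(lines, start="START", end="END"):
    """Collect the lines strictly between a start marker and an end marker."""
    data = []
    flag = False
    for line in lines:
        if line.startswith(start):
            flag = True            # begin collecting on the next line
        elif line.strip().endswith(end):
            flag = False           # stop before the END line is stored
        elif flag:
            data.append(line)
    return data


# Hypothetical sample input standing in for the file contents.
sample = ["header\n", "START\n", "a\n", "b\n", "END\n", "footer\n"]
print(lines_between(sample))  # ['a\n', 'b\n']
```

Because any iterable of strings works, an open file object can be passed in place of the list.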
Read lines between two keywords
This will work for you:
$ awk '/\*System_Power/{f=1;next}/\*System_Terminate/{f=0}f' infile
1
1.2
1.8
2
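For comparison, the awk one-liner can be read as a flag loop. The sketch below is a rough Python rendering of it, not a replacement for the awk command. The sample input is hypothetical, shaped only to reproduce the output shown above, and the name between_markers is an assumption.

```python
import re

START = re.compile(r"\*System_Power")
STOP = re.compile(r"\*System_Terminate")


def between_markers(lines):
    """Yield lines after a *System_Power line, up to the next *System_Terminate line."""
    f = False
    for line in lines:
        if START.search(line):
            f = True
            continue        # like awk's `next`: skip the marker line itself
        if STOP.search(line):
            f = False
        if f:               # like the bare `f` pattern: print while the flag is set
            yield line


# Hypothetical input; the real infile is not shown in the answer.
sample = ["*System_Power", "1", "1.2", "1.8", "2", "*System_Terminate", "3"]
print(list(between_markers(sample)))  # ['1', '1.2', '1.8', '2']
```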
Getting strings in between two keywords from a file in python
Try escaping your outermost parentheses pair.
navigated_pages = re.findall(r'EVENT\(X(.*?)\) ',data,re.DOTALL|re.MULTILINE)
This appears to make it match properly, at least for my little sample input:
>>> s = "EVENT(X_HELLO) ... EVENT(X_HOW_ARE_YOU_DOING_TODAY)... EVENT(this one shouldn't appear because it doesn't start with X)"
>>> re.findall(r"EVENT\(X(.*?)\)", s)
['_HELLO', '_HOW_ARE_YOU_DOING_TODAY']
If you want the starting X too, move the inner opening parenthesis one character to the left. Don't worry, the *? still applies only to the dot, so the match stays non-greedy.
>>> re.findall(r"EVENT\((X.*?)\)", s)
['X_HELLO', 'X_HOW_ARE_YOU_DOING_TODAY']
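To see why the non-greedy *? matters here, the following small sketch compares it with a greedy * on a shortened version of the sample string. The variable names lazy and greedy are only for illustration.

```python
import re

s = "EVENT(X_HELLO) ... EVENT(X_HOW_ARE_YOU_DOING_TODAY)..."

# Non-greedy: each match stops at the first closing parenthesis.
lazy = re.findall(r"EVENT\(X(.*?)\)", s)

# Greedy: a single match runs on to the last closing parenthesis in the string.
greedy = re.findall(r"EVENT\(X(.*)\)", s)

print(lazy)    # ['_HELLO', '_HOW_ARE_YOU_DOING_TODAY']
print(greedy)  # ['_HELLO) ... EVENT(X_HOW_ARE_YOU_DOING_TODAY']
```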
Print multiple lines between two specific lines (keywords) from a text file
You can use a boolean flag to track whether a line should be printed:
printing = False
with open('drama.txt', 'r') as f, open('newfile.txt', 'w') as newfile:
    for line in f:
        if line.startswith('Characters:'):
            printing = True
            continue  # go to next line
        elif line.startswith('First scene'):
            printing = False
            break  # quit file reading
        if printing:
            # line already ends with '\n', so suppress print's own newline
            print(line, end='', file=newfile)
Python read specific lines of text between two strings
One slight modification which looks like it should cover your problem:
with open("filename.txt") as f:
    flist = f.readlines()

parsing = False
for line in flist:
    if line.startswith("\t**** Report 1"):
        parsing = True
    elif line.startswith("\t**** Report 2"):
        parsing = False
    if parsing:
        pass  # Do stuff with data
If you want to avoid parsing the "**** Report 1" line itself, simply put the start condition after the if parsing block, i.e.
with open("filename.txt") as f:
    flist = f.readlines()

parsing = False
for line in flist:
    if line.startswith("\t**** Report 2"):
        parsing = False
    if parsing:
        pass  # Do stuff with data
    if line.startswith("\t**** Report 1"):
        parsing = True
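The two orderings above differ only in where the start check sits relative to the if parsing block. As a minimal sketch, both can be folded into one function with a switch; the names collect and include_start and the sample report lines are hypothetical.

```python
def collect(lines, start, stop, include_start=True):
    """Collect lines from a start marker up to, but not including, a stop marker."""
    parsing = False
    out = []
    for line in lines:
        if line.startswith(stop):
            parsing = False
        # Start check before the body: the marker line itself is collected.
        if include_start and line.startswith(start):
            parsing = True
        if parsing:
            out.append(line)
        # Start check after the body: the marker line is skipped.
        if not include_start and line.startswith(start):
            parsing = True
    return out


# Hypothetical report lines for demonstration.
report = ["\t**** Report 1\n", "alpha\n", "beta\n", "\t**** Report 2\n", "gamma\n"]
with_header = collect(report, "\t**** Report 1", "\t**** Report 2")
without_header = collect(report, "\t**** Report 1", "\t**** Report 2", include_start=False)
print(with_header)     # ['\t**** Report 1\n', 'alpha\n', 'beta\n']
print(without_header)  # ['alpha\n', 'beta\n']
```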
Grab text between two lines
If your text file is huge, don't read it into memory. Don't look for indexes either, just process it line by line:
using System.IO;

bool writing = false;
using var sw = File.CreateText(@"C:\some\path\to.txt");
foreach (var line in File.ReadLines(...)) { // don't use ReadAllLines, use ReadLines - it's incremental and burns little memory
    if (!writing && line.Contains("target1")) {
        writing = true; // start writing
        continue;       // don't write this line
    }
    if (writing) {
        if (line.Contains("target2"))
            break;      // exit loop without writing this line
        sw.WriteLine(line);
    }
}
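The same streaming idea carries over to Python, where iterating over a file object also reads one line at a time. This is only a sketch under stated assumptions: the function name copy_between, the returned line count, and the temporary demo files are all made up for illustration.

```python
import os
import tempfile


def copy_between(src_path, dst_path, target1, target2):
    """Stream src line by line, writing the lines found between the two targets."""
    writing = False
    written = 0
    with open(src_path, "r") as src, open(dst_path, "w") as dst:
        for line in src:          # file objects iterate lazily, one line at a time
            if not writing and target1 in line:
                writing = True    # start writing from the next line
                continue
            if writing:
                if target2 in line:
                    break         # stop without writing this line
                dst.write(line)
                written += 1
    return written


# Tiny demonstration on temporary files (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    dst = os.path.join(tmp, "out.txt")
    with open(src, "w") as f:
        f.write("skip\ntarget1\nkeep 1\nkeep 2\ntarget2\nskip\n")
    count = copy_between(src, dst, "target1", "target2")
    with open(dst) as f:
        result = f.read()

print(count, repr(result))  # 2 'keep 1\nkeep 2\n'
```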
How do I extract two different keywords from two different lines in a file in bash shell?
If there are just two such lines in the file, one with type: and another with mounts:, and they come in a set order, you can use
awk '/type:|mounts:/{gsub(/https?:\/\/|:.*/, "", $2); a = (length(a)==0 ? "" : a " ") $2} END{print a}' file
If a line contains type: or mounts:, the http:// or https:// prefix and all text from the next : onward are removed from Field 2. The value is then either assigned to a or appended to a after a space, and once the end of the file is reached, the value of a is printed.
Details:
/type:|mounts:/ - find lines containing type: or mounts:
gsub(/https?:\/\/|:.*/, "", $2) - remove http://, https://, or : and the rest of the string from the Field 2 value
a = (length(a)==0 ? "" : a " ") $2 - if a is not empty, assign a + space + Field 2 value to a; if it is empty, just assign the Field 2 value to a
END{print a} - at the end of the file processing, print the value of a.
See the online demo:
#!/bin/bash
s='name: linuxVol
id: 6
type: Linux
dir excludes: .snapshot*
~snapshot*
.zfs
.isilon
.ifsvar
.lustre
inode: 915720
free_space: 35.6TiB (auto)
total_capacity: 95.0TiB (auto)
number_of_files: 5,789,643
number_of_dirs: 520,710
mounts: h''ttps://server1.example.com:30002: /mnt/tube'
awk '/type:|mounts:/{gsub(/https?:\/\/|:.*/, "", $2); a = (length(a)==0 ? "" : a " ") $2} END{print a}' <<< "$s"
Output:
Linux server1.example.com
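For readers who want the same extraction outside awk, here is a rough Python sketch of the logic described above. It is an illustration, not the answer's method: the function name extract_type_and_host is hypothetical and the sample text is a shortened version of the demo input.

```python
import re


def extract_type_and_host(text):
    """Return the cleaned second field of the type: and mounts: lines, space-separated."""
    values = []
    for line in text.splitlines():
        if re.search(r"type:|mounts:", line):
            fields = line.split()          # whitespace split, like awk's default
            if len(fields) >= 2:
                # Drop the URL scheme, then everything from the next colon onward.
                values.append(re.sub(r"https?://|:.*", "", fields[1]))
    return " ".join(values)


# Shortened, hypothetical version of the demo input.
sample = (
    "name: linuxVol\n"
    "type: Linux\n"
    "inode: 915720\n"
    "mounts: https://server1.example.com:30002: /mnt/tube\n"
)
print(extract_type_and_host(sample))  # Linux server1.example.com
```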