How to search for a string in text files?
The reason why you always got True has already been given, so I'll just offer another suggestion:
If your file is not too large, you can read it into a string and just use that (easier and often faster than reading and checking line by line):
with open('example.txt') as f:
    if 'blabla' in f.read():
        print("true")
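For comparison, here is a minimal sketch of the line-by-line approach mentioned above, which may be preferable when the file is too large to read at once. The function name file_contains and the UTF-8 default are assumptions made for illustration, not part of the original answer. Note that a search string spanning a line break would not be found this way.

```python
def file_contains(path, needle, encoding="utf-8"):
    """Return True as soon as any single line of the file contains needle."""
    with open(path, encoding=encoding, errors="replace") as f:
        # Iterating over the file object yields one line at a time,
        # so the whole file is never held in memory at once.
        return any(needle in line for line in f)
```

Because any() stops at the first hit, the rest of the file is not read once a match is found.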
Another trick: you can alleviate the possible memory problems by using mmap.mmap() to create a "string-like" object that uses the underlying file (instead of reading the whole file into memory):
# Python 2 version (in Python 3, find() needs a bytes argument)
import mmap

with open('example.txt') as f:
    s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if s.find('blabla') != -1:
        print('true')
NOTE: in Python 3, mmaps behave like bytearray objects rather than strings, so the subsequence you look for with find() has to be a bytes object rather than a string as well, e.g. s.find(b'blabla'):
#!/usr/bin/env python3
import mmap

with open('example.txt', 'rb', 0) as file, \
     mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    if s.find(b'blabla') != -1:
        print('true')
You could also use regular expressions on mmap, e.g., for a case-insensitive search: if re.search(br'(?i)blabla', s):
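The regular-expression variant can be sketched as a small helper. The name mmap_contains_ci is hypothetical, and the empty-file check is an added assumption: mmap refuses to map a zero-length file, so such a file is treated here as "no match". With a bytes pattern, re.IGNORECASE only folds ASCII letters.

```python
import mmap
import os
import re

def mmap_contains_ci(path, word):
    """Case-insensitive search for word (a bytes object) in a memory-mapped file."""
    # re.escape makes the word match literally, even if it contains regex metacharacters.
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    with open(path, "rb") as f:
        # An empty file cannot be memory-mapped, so report "not found" directly.
        if os.fstat(f.fileno()).st_size == 0:
            return False
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as s:
            return pattern.search(s) is not None
```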
How to find all files containing specific text (string) on Linux?
Do the following:
grep -rnw '/path/to/somewhere/' -e 'pattern'
- -r or -R is recursive,
- -n is line number, and
- -w stands for match the whole word.
- -l (lower-case L) can be added to just give the file name of matching files.
- -e is the pattern used during the search.
Along with these, the --exclude, --include, and --exclude-dir flags can be used for efficient searching:
- This will only search through those files which have .c or .h extensions:
grep --include=\*.{c,h} -rnw '/path/to/somewhere/' -e "pattern"
- This will exclude searching all the files ending with .o extension:
grep --exclude=\*.o -rnw '/path/to/somewhere/' -e "pattern"
- For directories it's possible to exclude one or more directories using the --exclude-dir parameter. For example, this will exclude the dirs dir1/, dir2/ and all of them matching *.dst/:
grep --exclude-dir={dir1,dir2,*.dst} -rnw '/path/to/somewhere/' -e "pattern"
This works very well for me, to achieve almost the same purpose as yours.
For more options, see man grep.
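If grep is not available, a roughly similar search can be sketched in Python. This is only an approximation under stated assumptions: the function name grep_whole_word and its parameters are hypothetical, the \b word boundary is close to but not identical to grep's -w, and exclude_dirs matches exact directory names rather than glob patterns such as *.dst.

```python
import os
import re

def grep_whole_word(root, word, extensions=None, exclude_dirs=()):
    """Yield (path, line_number, line) for whole-word matches under root."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in exclude_dirs]
        for name in filenames:
            # Optional filter, similar in spirit to --include=\*.{c,h}.
            if extensions and not name.endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as f:
                    for line_number, line in enumerate(f, start=1):
                        if pattern.search(line):
                            yield path, line_number, line.rstrip("\n")
            except OSError:
                continue  # unreadable file: skip it
```

A call such as grep_whole_word('/path/to/somewhere/', 'pattern', extensions=('.c', '.h'), exclude_dirs=('dir1', 'dir2')) would then yield one tuple per matching line.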
Trying to search through multiple text files for a specific string. I then want the program to print out the text file that it found the string in
Try saving the file contents in a variable:
if os.path.isfile(cur_path):
    with open(cur_path, 'r') as file:
        contents = file.read()
    username = input("Enter a username: ")
    if username in contents:
        ...  # handle the match here, e.g. print(cur_path)
After the first file.read(), the file object's position is at the end and nothing will be read in subsequent calls, unless you go back to the start with file.seek(0).
See Methods of File Objects for more details.
As an aside, pathlib might be a better fit than os.path.
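Both points can be illustrated with a short sketch. The function names read_twice and files_containing are made up for this example, and the UTF-8 encoding is an assumption about the files being searched.

```python
from pathlib import Path

def read_twice(path):
    """Show that a second read() returns '' until seek(0) rewinds the file."""
    with open(path) as f:
        first = f.read()   # whole file; the position is now at the end
        second = f.read()  # nothing left to read, so this is ''
        f.seek(0)          # go back to the start
        third = f.read()   # whole file again
    return first, second, third

def files_containing(directory, text):
    """Return the names of regular files in directory whose contents include text."""
    matches = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():  # skip sub-directories
            contents = path.read_text(encoding="utf-8", errors="replace")
            if text in contents:
                matches.append(path.name)
    return matches
```

Reading each file once into a variable, as files_containing does, avoids the repeated read() problem entirely.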
Tools to search for strings inside files without indexing
Original Answer
Windows Grep does this really well.
Edit: Windows Grep is no longer being maintained or made available by the developer. An alternate download link is here: Windows Grep - alternate
Current Answer
Visual Studio Code has excellent search and replace capabilities across files. It is extremely fast, supports regex and live preview before replacement.
Python: Search a string from multiple Text files
Here is the full code that solved the issue:
import glob
import os
import tarfile

string_to_search = input("Enter the string you want to search : ")

# all_files holds all the files in the current directory
all_files = [f for f in os.listdir('.') if os.path.isfile(f)]
for current_file in all_files:
    if current_file.endswith(".tgz") or current_file.endswith("tar.gz"):
        # file_name contains only the name, with the last extension removed
        file_name = os.path.splitext(current_file)[0]
        os.makedirs(file_name)  # make a directory with the file name
        output_file_path = file_name  # path to store the files after extraction
        with tarfile.open(current_file, "r:gz") as tar:
            tar.extractall(output_file_path)  # extract the current file

        # ---- The following code finds the string in all the files in a directory
        # os.path.join builds the same nvram2\logs path without hard-coded backslashes
        path1 = os.path.join(output_file_path, 'nvram2', 'logs')
        for my_file1 in glob.glob(os.path.join(path1, "*")):
            if os.path.isfile(my_file1):  # to discard folders
                with open(my_file1, errors='ignore') as my_file2:
                    # start=1 so that reported line numbers count from 1
                    for line_no, line in enumerate(my_file2, start=1):
                        if string_to_search in line:
                            print(string_to_search + " is found in " + my_file1 + "; Line Number = " + str(line_no))
I got help from this answer. The path and "file not found" issue was resolved by joining the directory with the filename.
Python - Search Text File For Any String In a List
You could do this with a for loop as below. The issue with your code is that it does not know what x is. You can define it inside the loop, so that x equals a value from the KeyWord list on each run of the loop.
KeyWord = ['word', 'word1', 'word3']

with open('Textfile.txt', 'r') as f:
    read_data = f.read()

for x in KeyWord:
    if x in read_data:
        print('True')
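If you also want to know which keywords were found, rather than printing True once per hit, a variation along these lines may help. The helper name keywords_found is hypothetical. Keep in mind that in is a substring test, so 'word' also counts as found when the file only contains 'word1'.

```python
def keywords_found(path, keywords):
    """Return the keywords that occur anywhere in the file, in list order."""
    with open(path, encoding="utf-8") as f:
        read_data = f.read()
    # Each keyword is checked as a plain substring of the file contents.
    return [x for x in keywords if x in read_data]
```

An empty result list means none of the keywords occurred, so the list can be used directly in an if statement.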
Search through multiple files/dirs for string, then print content of text file
Here is a solution that needs Python 3.4+ because of pathlib:
from pathlib import Path

path = Path('/some/dir')
search_string = 'string'

for o in path.rglob('*.txt'):
    if o.is_file():
        text = o.read_text()
        if search_string in text:
            print(o)
            print(text)
The code above will look for all *.txt files in path and its sub-directories, read the content of each file into text, search for search_string in text and, if it matches, print the file name and its contents.
Search for several strings in text files and, if they exist, remove a file
You can use regular expressions (regex) to find all groups with this structure in the file.
import os
import re

file_path = "example.txt"
delete_file_path = "delete_me"
delete_file_ending = ".txt"
# add a proper regex here to match all your required strings properly
pattern = re.compile("12.*(?=[90|25|30]).*(?=40).*(?=20)")

with open(file_path) as file:
    text = file.read()

# text mode already translates line endings to "\n" when reading
paragraphs = text.split("\n")
paragraph_tokens = [re.findall(pattern, paragraph) for paragraph in paragraphs]
for i in range(len(paragraph_tokens)):
    if paragraph_tokens[i]:
        os.remove(delete_file_path + str(i) + delete_file_ending)
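You could also use re.match if you only want to know whether the pattern matches at all, but then you would change the if condition a little, since re.match returns a match object (or None) rather than a list. Note that re.match only tests the start of the string, whereas re.search scans all of it.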
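A small sketch of that yes/no check follows. The pattern 12.*40 is a simplified placeholder chosen for illustration, not the regex your data actually needs, and has_match is a hypothetical helper name.

```python
import re

# Placeholder pattern: "12" followed later on the same line by "40".
pattern = re.compile(r"12.*40")

def has_match(paragraph):
    """Return True if the pattern occurs anywhere in paragraph."""
    # search() returns a match object or None, so compare against None.
    return pattern.search(paragraph) is not None
```

With this, the condition in the loop could become if has_match(paragraph): instead of testing the list returned by re.findall.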