How to Search For a String in Text Files

How to search for a string in text files?

The reason why you always got True has already been given, so I'll just offer another suggestion:

If your file is not too large, you can read it into a string, and just use that (easier and often faster than reading and checking line per line):

with open('example.txt') as f:
    if 'blabla' in f.read():
        print("true")

Another trick: you can alleviate the possible memory problems by using mmap.mmap() to create a "string-like" object that uses the underlying file (instead of reading the whole file in memory):

import mmap

with open('example.txt') as f:
    s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if s.find('blabla') != -1:
        print('true')

NOTE: in Python 3, mmaps behave like bytearray objects rather than strings, so the subsequence you look for with find() has to be a bytes object rather than a string as well, e.g. s.find(b'blabla'):

#!/usr/bin/env python3
import mmap

with open('example.txt', 'rb', 0) as file, \
     mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    if s.find(b'blabla') != -1:
        print('true')

You could also use regular expressions on the mmap, e.g. for a case-insensitive search: if re.search(br'(?i)blabla', s):
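For example, a minimal sketch of that regex variant on the Python 3 mmap (same placeholder file and pattern as above):

#!/usr/bin/env python3
import mmap
import re

with open('example.txt', 'rb', 0) as file, \
     mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    # bytes pattern; (?i) makes the search case-insensitive
    if re.search(br'(?i)blabla', s):
        print('true')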

How to find all files containing specific text (string) on Linux?

Do the following:

grep -rnw '/path/to/somewhere/' -e 'pattern'
  • -r or -R is recursive,
  • -n is line number, and
  • -w stands for match the whole word.
  • -l (lower-case L) can be added to just give the file name of matching files.
  • -e is the pattern used during the search.

Along with these, the --exclude, --include, and --exclude-dir flags can be used for efficient searching:

  • This will only search through files that have a .c or .h extension:
grep --include=\*.{c,h} -rnw '/path/to/somewhere/' -e "pattern"
  • This will exclude all files ending with the .o extension from the search:
grep --exclude=\*.o -rnw '/path/to/somewhere/' -e "pattern"
  • For directories, it's possible to exclude one or more of them using the --exclude-dir parameter. For example, this will exclude the directories dir1/, dir2/ and all directories matching *.dst/:
grep --exclude-dir={dir1,dir2,*.dst} -rnw '/path/to/somewhere/' -e "pattern"

This works very well for me, achieving almost the same purpose as yours.

For more options, see man grep.
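If you would rather stay in Python, here is a rough sketch that mimics grep -rnw with an --include-style filter; the path, extension and pattern below are placeholders, not part of the original answer:

#!/usr/bin/env python3
import re
from pathlib import Path

pattern = re.compile(r'\bpattern\b')  # \b...\b roughly corresponds to grep -w

for path in Path('/path/to/somewhere/').rglob('*.c'):  # like --include=\*.c
    if not path.is_file():
        continue
    try:
        text = path.read_text(errors='ignore')
    except OSError:
        continue  # skip unreadable files
    for line_no, line in enumerate(text.splitlines(), 1):
        if pattern.search(line):
            print(f"{path}:{line_no}:{line}")  # file:line-number:line, like grep -n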

Trying to search through multiple text files for a specific string. I then want the program to print out the text file that it found the string in

Try saving the file contents in a variable:

if os.path.isfile(cur_path):
    with open(cur_path, 'r') as file:
        contents = file.read()
    username = input("Enter a username: ")
    if username in contents:
        print(cur_path)  # print the file in which the username was found

After the first file.read(), the file object's position is at the end and nothing will be read in subsequent calls, unless you go back to the start with file.seek(0).
See Methods of File Objects for more details.
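A tiny sketch of that behaviour (using the same example.txt placeholder), showing why a second read() comes back empty:

with open('example.txt') as f:
    first = f.read()      # reads the whole file; the position is now at the end
    again = f.read()      # returns '' because nothing is left to read
    f.seek(0)             # rewind to the start
    once_more = f.read()  # the full contents are available again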

As an aside, pathlib might be a better fit than os.path.
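For instance, a minimal sketch of the same check using pathlib (the path is just a placeholder):

from pathlib import Path

cur_path = 'example.txt'                # placeholder path
username = input("Enter a username: ")

path = Path(cur_path)
if path.is_file() and username in path.read_text():
    print(path)  # the file in which the username was found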

Tools to search for strings inside files without indexing

Original Answer

Windows Grep does this really well.

Edit: Windows Grep is no longer being maintained or made available by the developer. An alternate download link is here: Windows Grep - alternate

Current Answer

Visual Studio Code has excellent search-and-replace capabilities across files. It is extremely fast, supports regex, and shows a live preview before replacement.


Python: Search a string from multiple Text files

Here is the full code that solved the issue:

import os, tarfile, glob

string_to_search = input("Enter the string you want to search : ")

# all_files holds all the files in the current directory
all_files = [f for f in os.listdir('.') if os.path.isfile(f)]
for current_file in all_files:
    if current_file.endswith(".tgz") or current_file.endswith("tar.gz"):
        tar = tarfile.open(current_file, "r:gz")
        # file_name contains only the name, with the extension removed
        file_name = os.path.splitext(current_file)[0]
        os.makedirs(file_name)  # make a directory with the file name
        output_file_path = file_name  # path to store the files after extraction
        tar.extractall(output_file_path)  # extract the current file
        tar.close()

        # ---- Following code finds the string in all the files in a directory
        path1 = output_file_path + r'\nvram2\logs'
        all_files = glob.glob(os.path.join(path1, "*"))
        for my_file1 in glob.glob(os.path.join(path1, "*")):
            if os.path.isfile(my_file1):  # to discard folders
                with open(my_file1, errors='ignore') as my_file2:
                    for line_no, line in enumerate(my_file2):
                        if string_to_search in line:
                            print(string_to_search + " is found in " + my_file1 + "; Line Number = " + str(line_no))

Got help from this answer. The path and file-not-found issue was resolved by the suggestion there: "Joining the directory with the filename solves it."

Python - Search Text File For Any String In a List

You could do this with a for loop, as below. The issue with your code is that it does not know what x is. Defining x as the loop variable makes it equal to one value from the KeyWord list on each pass of the loop.

KeyWord = ['word', 'word1', 'word3']
with open('Textfile.txt', 'r') as f:
    read_data = f.read()
for x in KeyWord:
    if x in read_data:
        print('True')
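As a small variant with the same assumptions (Textfile.txt and the KeyWord list), any() collapses the loop when you only need to know whether at least one keyword occurs:

KeyWord = ['word', 'word1', 'word3']
with open('Textfile.txt', 'r') as f:
    read_data = f.read()

# True if at least one keyword occurs anywhere in the file
if any(x in read_data for x in KeyWord):
    print('True')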

Search through multiple files/dirs for string, then print content of text file

Here is a solution for Python 3.5+ using pathlib (Path.read_text() was added in 3.5):

from pathlib import Path

path = Path('/some/dir')

search_string = 'string'

for o in path.rglob('*.txt'):
    if o.is_file():
        text = o.read_text()
        if search_string in text:
            print(o)
            print(text)

The code above looks for all *.txt files in path and its sub-directories, reads the content of each file into text, searches for search_string in text and, if it matches, prints the file name and the file contents.

Search for several strings in text files and if they exist remove a file

You can use regular expressions (regex) to find all groups with this structure in the file.

import os
import re

file_path = "example.txt"
delete_file_path = "delete_me"
delete_file_ending = ".txt"
pattern = re.compile("12.*(?=[90|25|30]).*(?=40).*(?=20)")  # add a proper regex here to match all your required strings properly

with open(file_path) as file:
    text = file.read()
    paragraphs = text.split(os.linesep)
    paragraph_tokens = [re.findall(pattern, paragraph) for paragraph in paragraphs]

for i in range(len(paragraph_tokens)):
    if paragraph_tokens[i]:
        os.remove(delete_file_path + str(i) + delete_file_ending)

You could also use re.match if you only want to know whether the pattern occurs, but then you would need to change the if condition a little, since re.match returns a match object (or None) rather than a list. (Note that re.match only matches at the start of a string; re.search looks anywhere.)
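A minimal sketch of that variant, reusing the names from the snippet above (pattern is still the placeholder regex), with re.search so the pattern can occur anywhere in the paragraph:

for i, paragraph in enumerate(paragraphs):
    # pattern.search() returns a match object if the pattern occurs, otherwise None
    if pattern.search(paragraph):
        os.remove(delete_file_path + str(i) + delete_file_ending)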


