Find a File in Python

Find a file in Python

os.walk is the answer. This will find the first match:

import os

def find(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)

And this will find all matches:

def find_all(name, path):
    result = []
    for root, dirs, files in os.walk(path):
        if name in files:
            result.append(os.path.join(root, name))
    return result

And this will match a pattern:

import os, fnmatch

def find(pattern, path):
    result = []
    for root, dirs, files in os.walk(path):
        for name in files:
            if fnmatch.fnmatch(name, pattern):
                result.append(os.path.join(root, name))
    return result

find('*.txt', '/path/to/dir')
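
If the tree is large and you only need the first few hits, a generator avoids building the whole list. Here is a minimal sketch, assuming a hypothetical helper name iter_matches (it is not part of the answer above). It uses fnmatch.filter to match all names in a directory at once:

```python
import fnmatch
import os


def iter_matches(pattern, path):
    """Lazily yield the full path of every file under `path` matching `pattern`.

    `iter_matches` is a hypothetical name used for illustration.
    """
    for root, dirs, files in os.walk(path):
        # fnmatch.filter returns the subset of names that match the pattern
        for name in fnmatch.filter(files, pattern):
            yield os.path.join(root, name)
```

For example, next(iter_matches('*.txt', '/path/to/dir'), None) gives the first match, or None if nothing matches.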

Find all files in a directory with extension .txt in Python

You can use glob:

import glob, os
os.chdir("/mydir")
for file in glob.glob("*.txt"):
    print(file)

or simply os.listdir:

import os
for file in os.listdir("/mydir"):
    if file.endswith(".txt"):
        print(os.path.join("/mydir", file))

or, if you want to traverse the directory tree recursively, use os.walk:

import os
for root, dirs, files in os.walk("/mydir"):
    for file in files:
        if file.endswith(".txt"):
            print(os.path.join(root, file))
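
The same two cases (one directory, or the whole tree) can also be written with pathlib, which avoids os.chdir. This is a sketch only; txt_files and its recursive flag are hypothetical names chosen for illustration:

```python
from pathlib import Path


def txt_files(directory, recursive=False):
    """Return sorted paths of .txt files in `directory` (hypothetical helper)."""
    base = Path(directory)
    # rglob("*.txt") is equivalent to glob("**/*.txt") and descends into subdirectories
    matches = base.rglob("*.txt") if recursive else base.glob("*.txt")
    # keep regular files only, in case a directory happens to be named "something.txt"
    return sorted(p for p in matches if p.is_file())
```

Calling txt_files("/mydir") looks only at the top level, while txt_files("/mydir", recursive=True) walks every subdirectory.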

Python: search for a file in current directory and all its parents

This is not the most elegant implementation, but it works.

Use os.listdir to get the list of files and folders in the current directory, then look for your file in that list.

If the file is there, the loop breaks. If not, the search moves to the parent directory, obtained with os.path.dirname, and lists that instead.

The parent of "/" is returned as "/", so when cur_dir == parent_dir the root has been reached and the loop breaks.

import os
import os.path

file_name = "test.txt"  # file to be searched
cur_dir = os.getcwd()  # directory where the search starts; can be replaced with any path

while True:
    file_list = os.listdir(cur_dir)
    parent_dir = os.path.dirname(cur_dir)
    if file_name in file_list:
        print("File exists in:", cur_dir)
        break
    else:
        if cur_dir == parent_dir:  # reached the root directory
            print("File not found")
            break
        else:
            cur_dir = parent_dir
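
The same upward search can be wrapped in a reusable function. Below is a minimal sketch using pathlib, where find_upwards is a hypothetical name. Note one deliberate difference from the loop above: it only accepts regular files, whereas the listdir check also matches a directory with that name.

```python
from pathlib import Path


def find_upwards(file_name, start="."):
    """Return the first directory, from `start` up to the root, containing `file_name`.

    Returns None if no such directory exists. `find_upwards` is a hypothetical name.
    """
    current = Path(start).resolve()
    # Path.parents lists every ancestor, ending at the filesystem root
    for directory in (current, *current.parents):
        if (directory / file_name).is_file():
            return directory
    return None
```

For example, find_upwards("test.txt") starts from the current working directory, like the loop above.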

How to search for a string in text files?

The reason why you always got True has already been given, so I'll just offer another suggestion:

If your file is not too large, you can read it into a string and just use that (easier and often faster than reading and checking line by line):

with open('example.txt') as f:
    if 'blabla' in f.read():
        print("true")

Another trick: you can alleviate the possible memory problems by using mmap.mmap() to create a "string-like" object that uses the underlying file (instead of reading the whole file in memory):

import mmap

with open('example.txt') as f:
    s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    if s.find('blabla') != -1:
        print('true')

NOTE: in Python 3, mmaps behave like bytearray objects rather than strings, so the subsequence you look for with find() has to be a bytes object rather than a string as well, e.g. s.find(b'blabla'):

#!/usr/bin/env python3
import mmap

with open('example.txt', 'rb', 0) as file, \
     mmap.mmap(file.fileno(), 0, access=mmap.ACCESS_READ) as s:
    if s.find(b'blabla') != -1:
        print('true')

You could also use regular expressions on mmap e.g., case-insensitive search: if re.search(br'(?i)blabla', s):
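
Spelled out in full, the regular-expression variant might look like the sketch below. The helper name file_contains is hypothetical, and treating an empty file as "no match" is an assumption (mmap raises ValueError when asked to map an empty file):

```python
import mmap
import re


def file_contains(path, pattern, flags=re.IGNORECASE):
    """Return True if the bytes regex `pattern` matches anywhere in the file.

    `file_contains` is a hypothetical helper; `pattern` must be bytes, e.g. rb'blabla'.
    """
    with open(path, 'rb') as f:
        try:
            s = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        except ValueError:
            # mmap cannot map an empty file; assumption: report that as "no match"
            return False
        with s:
            # re.search works on the mmap directly because the pattern is bytes
            return re.search(pattern, s, flags) is not None
```

Passing flags=0 makes the search case-sensitive again.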

Find files in a directory containing desired string in Python

You are searching for the string in the filename rather than in the file's contents. Use open(filename, 'r').read() to search the contents:

import os

user_input = input('What is the name of your directory')
directory = os.listdir(user_input)

searchstring = input('What word are you trying to find?')

for fname in directory:
    if os.path.isfile(user_input + os.sep + fname):
        # Full path
        f = open(user_input + os.sep + fname, 'r')

        if searchstring in f.read():
            print('found string in file %s' % fname)
        else:
            print('string not found')
        f.close()

We use user_input + os.sep + fname to get the full path.

os.listdir returns the names of both files and directories, so we use os.path.isfile to keep only the files.
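
The same logic can be packaged as a function that returns the matching names instead of printing. This is a sketch under stated assumptions: files_containing is a hypothetical name, os.path.join replaces the manual os.sep concatenation, and errors='ignore' is an added choice so that non-text files do not abort the scan.

```python
import os


def files_containing(directory, searchstring):
    """Return names of regular files in `directory` whose text contains `searchstring`.

    `files_containing` is a hypothetical helper name.
    """
    matches = []
    for fname in os.listdir(directory):
        full_path = os.path.join(directory, fname)
        if not os.path.isfile(full_path):
            continue  # skip subdirectories
        # errors='ignore' is an assumption: undecodable bytes are dropped rather than raising
        with open(full_path, 'r', errors='ignore') as f:
            if searchstring in f.read():
                matches.append(fname)
    return matches
```

The with statement closes each file even if reading fails, which the explicit f.close() above does not guarantee.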

Find files in a directory with a partial string match

Having the files in /mydir as follows

mydir
├── apple1.json.gz
├── apple2.json.gz
├── banana1.json.gz
├── melon1.json.gz
└── melon2.json.gz

you could either do

import glob
import os

os.chdir('/mydir')
for file in glob.glob('apple*.json.gz'):
    print(file)

or

import glob

for file in glob.glob('/mydir/apple*.json.gz'):
    print(file)

Changing directories will not affect glob.glob('/absolute/path').

Find file in directory with the highest number in the filename

I'll try to solve it only using filenames, not dates.

You have to convert the number to an integer before applying the criterion; otherwise alphanumeric sorting applies to the whole filename.

Proof of concept:

import re
list_of_files = ["file1", "file100", "file4", "file7"]

def extract_number(f):
    s = re.findall(r"\d+$", f)
    return (int(s[0]) if s else -1, f)

print(max(list_of_files, key=extract_number))

result: file100

  • the key function extracts the digits found at the end of the filename and converts them to an integer, and if nothing is found returns -1
  • you don't need to sort to find the max, just pass the key to max directly
  • if 2 files have the same index, use full filename to break tie (which explains the tuple key)
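
To apply the proof of concept to a real directory, one option is the sketch below. It assumes the number sits at the end of the name before any extension (for example file100.txt), which goes beyond the original example; trailing_number and highest_numbered are hypothetical names.

```python
import re
from pathlib import Path


def trailing_number(path):
    """Sort key: (number at the end of the stem, or -1 if none; file name as tie-breaker)."""
    match = re.search(r"\d+$", path.stem)  # stem is the name without its last extension
    return (int(match.group()) if match else -1, path.name)


def highest_numbered(directory):
    """Return the file in `directory` with the highest trailing number, or None if empty."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    return max(files, key=trailing_number, default=None)
```

The default=None argument keeps max from raising ValueError when the directory contains no files.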

How to find files in multilevel subdirectories

The answer of @JonathanDavidArndt is good but quite outdated. Since Python 3.5, you can use pathlib.Path.glob to search for a pattern in any subdirectory.

For instance:

import pathlib

destination_root = r"C:\Projects\NED_1m"
pattern = "**/*_meta*.xml"

master_list = list(pathlib.Path(destination_root).glob(pattern))
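
If you prefer plain strings over Path objects, glob.glob accepts the same "**" pattern when called with recursive=True (available since Python 3.5). A minimal sketch, where find_recursive is a hypothetical helper name:

```python
import glob
import os


def find_recursive(root, pattern):
    """Return paths under `root` matching `pattern`, e.g. '**/*_meta*.xml'.

    `find_recursive` is a hypothetical helper name.
    """
    # recursive=True makes '**' match any number of nested directories, including none
    return sorted(glob.glob(os.path.join(root, pattern), recursive=True))
```

With the values above, find_recursive(destination_root, pattern) would return the same files as master_list, as strings rather than Path objects.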

