How to Use Glob() to Find Files Recursively

How to use glob() to find files recursively?

pathlib.Path.rglob

Use pathlib.Path.rglob from the pathlib module, which was introduced in Python 3.4.

from pathlib import Path

for path in Path('src').rglob('*.c'):
    print(path.name)

If you don't want to use pathlib, you can use glob.glob('**/*.c', recursive=True). Don't forget the recursive keyword parameter, and be aware that it can take an inordinate amount of time on large directories.
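As a minimal sketch of what the recursive keyword changes, the snippet below builds a throwaway tree (the file names are made up for illustration) and runs the same pattern with and without recursive=True:

```python
import glob
import os
import tempfile

# Build a small throwaway tree so the example is self-contained.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src", "lib"))
    for rel in ("src/main.c", "src/lib/util.c", "src/lib/notes.txt"):
        open(os.path.join(root, *rel.split("/")), "w").close()

    pattern = os.path.join(root, "src", "**", "*.c")

    # recursive=True lets '**' match zero or more directory levels.
    found = sorted(os.path.basename(p) for p in glob.glob(pattern, recursive=True))

    # Without it, '**' behaves like a single '*' (exactly one directory level).
    shallow = sorted(os.path.basename(p) for p in glob.glob(pattern))

print(found)    # ['main.c', 'util.c']
print(shallow)  # ['util.c']
```

If the tree is large, glob.iglob accepts the same arguments and yields matches lazily instead of building the whole list first.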

To match files whose names begin with a dot (.), such as hidden files on Unix-based systems, use the os.walk solution below; glob does not match such names by default.
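To illustrate the dotfile difference, here is a small sketch on a temporary tree (the names .hidden.c and .cache are invented for the example). It compares a recursive glob with an os.walk plus fnmatch.filter scan:

```python
import fnmatch
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, ".cache"))
    for rel in ("visible.c", ".hidden.c", ".cache/inner.c"):
        open(os.path.join(root, *rel.split("/")), "w").close()

    # glob skips names starting with a dot, including dot-directories under '**'.
    via_glob = sorted(
        os.path.relpath(p, root).replace(os.sep, "/")
        for p in glob.glob(os.path.join(root, "**", "*.c"), recursive=True)
    )

    # os.walk visits every entry, and fnmatch does not treat a leading dot specially.
    via_walk = sorted(
        os.path.relpath(os.path.join(dirpath, name), root).replace(os.sep, "/")
        for dirpath, _dirnames, filenames in os.walk(root)
        for name in fnmatch.filter(filenames, "*.c")
    )

print(via_glob)  # ['visible.c']
print(via_walk)  # ['.cache/inner.c', '.hidden.c', 'visible.c']
```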

os.walk

For older Python versions, use os.walk to recursively walk a directory and fnmatch.filter to match against a simple expression:

import fnmatch
import os

matches = []
for root, dirnames, filenames in os.walk('src'):
    for filename in fnmatch.filter(filenames, '*.c'):
        matches.append(os.path.join(root, filename))

How can I search sub-folders using glob.glob module?

In Python 3.5 and newer, use the new recursive **/ functionality:

import glob
configfiles = glob.glob('C:/Users/sam/Desktop/file1/**/*.txt', recursive=True)

When recursive is set, ** followed by a path separator matches 0 or more subdirectories.

In earlier Python versions, glob.glob() cannot list files in subdirectories recursively.

In that case I'd use os.walk() combined with fnmatch.filter() instead:

import os
import fnmatch

path = 'C:/Users/sam/Desktop/file1'

configfiles = [os.path.join(dirpath, f)
    for dirpath, dirnames, files in os.walk(path)
    for f in fnmatch.filter(files, '*.txt')]

This walks your directories recursively and returns the full pathnames of all matching .txt files (absolute here, because path is absolute). In this specific case fnmatch.filter() may be overkill; you could also use a .endswith() test:

import os

path = 'C:/Users/sam/Desktop/file1'

configfiles = [os.path.join(dirpath, f)
    for dirpath, dirnames, files in os.walk(path)
    for f in files if f.endswith('.txt')]
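One reason the .endswith() variant can be handy is that str.endswith also accepts a tuple, so several extensions can be checked in one pass. The helper name find_by_suffix and the sample files below are hypothetical, just a sketch of the idea:

```python
import os
import tempfile


def find_by_suffix(top, suffixes):
    """Return paths under top whose file name ends with one of suffixes."""
    suffixes = tuple(suffixes)  # str.endswith accepts a tuple of candidates
    return [
        os.path.join(dirpath, name)
        for dirpath, _dirnames, files in os.walk(top)
        for name in files
        if name.endswith(suffixes)
    ]


with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "sub"))
    for rel in ("a.txt", "b.cfg", "sub/c.txt", "sub/d.log"):
        open(os.path.join(top, *rel.split("/")), "w").close()

    names = sorted(os.path.basename(p) for p in find_by_suffix(top, [".txt", ".cfg"]))

print(names)  # ['a.txt', 'b.cfg', 'c.txt']
```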

Clean way to glob for files recursively in python

Using pathlib:

from pathlib import Path
Path('/to/myDir').glob('**/*.c')

As for why glob didn't work for you:

glob.glob('myDir/**/*.c', recursive=True)
             ^
             |___ you had a lower d here?

Make sure you're running it from within the parent of myDir and that your Python version is 3.5+.
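The point about the working directory can be checked with a short sketch. A relative pattern is resolved against the current directory, so the same call finds files from the parent of myDir and nothing from anywhere else. The two temporary directories here are stand-ins for illustration:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as parent, tempfile.TemporaryDirectory() as elsewhere:
    os.makedirs(os.path.join(parent, "myDir", "sub"))
    open(os.path.join(parent, "myDir", "sub", "a.c"), "w").close()

    previous = os.getcwd()
    try:
        os.chdir(parent)  # the parent of myDir: the relative pattern resolves
        from_parent = glob.glob("myDir/**/*.c", recursive=True)

        os.chdir(elsewhere)  # no myDir here: glob returns [] without raising
        from_elsewhere = glob.glob("myDir/**/*.c", recursive=True)
    finally:
        os.chdir(previous)  # restore the working directory before cleanup

print(len(from_parent))  # 1
print(from_elsewhere)    # []
```

Note that glob does not raise when the directory is missing; an empty result is the only symptom.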

Using glob to find all zip files recursively in three sub folder

The string with commas in it is ... just a string. If you want to perform three globs, you need something like

zip_file = []
for dir in {"comm", "nmr", "nmh"}:
    zip_file.extend(glob.glob(os.path.join(inputpath, dir, "*.zip"), recursive=True))

As noted by @Barmar in the comments, if you want to look for zip files anywhere within these folders, the pattern needs to be os.path.join(inputpath, dir, "**/*.zip"); recursive=True only has an effect when the pattern contains **. If not, perhaps edit your question to provide an example of the structure you want to traverse.
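A self-contained sketch of the recursive variant follows. Here inputpath is a temporary directory and the file layout is invented for the example; only the folder names comm, nmr and nmh come from the question:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as inputpath:
    # Hypothetical layout: one zip at the top of comm, one nested in nmr,
    # no zip in nmh, and one zip in a folder that should be ignored.
    for rel in ("comm/a.zip", "nmr/deep/b.zip", "nmh/c.txt", "other/d.zip"):
        full = os.path.join(inputpath, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()

    zip_files = []
    for sub in ("comm", "nmr", "nmh"):
        # One glob per folder; '**' plus recursive=True searches at any depth.
        pattern = os.path.join(inputpath, sub, "**", "*.zip")
        zip_files.extend(glob.glob(pattern, recursive=True))

    names = sorted(os.path.basename(p) for p in zip_files)

print(names)  # ['a.zip', 'b.zip']
```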

Python - Glob to recursively dig through directories

Use **/* as a pattern:

>>> from pprint import pprint as pp
>>> import pathlib as pl
>>>
>>>
>>> p = pl.Path(".")
>>>
>>> old_way = list(p.glob("**/")) # Your way
>>> pp(old_way)
[WindowsPath('.'),
 WindowsPath('dir0'),
 WindowsPath('dir1'),
 WindowsPath('dir1/dir10')]
>>>
>>> new_way = list(p.glob("**/*"))
>>> pp(new_way)
[WindowsPath('dir0'),
 WindowsPath('dir1'),
 WindowsPath('file0.txt'),
 WindowsPath('dir0/file00.txt'),
 WindowsPath('dir1/dir10'),
 WindowsPath('dir1/file10.txt')]
>>>
>>> newer_way = [p] + list(p.glob("**/*"))  # Prepend globbed dir
>>> pp(newer_way)
[WindowsPath('.'),
 WindowsPath('dir0'),
 WindowsPath('dir1'),
 WindowsPath('file0.txt'),
 WindowsPath('dir0/file00.txt'),
 WindowsPath('dir1/dir10'),
 WindowsPath('dir1/file10.txt')]

See the Python documentation for pathlib's Path.glob(pattern) for reference.
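Since "**/*" returns directories and files together, a common follow-up is to separate them. The sketch below recreates the same tree in a temporary directory (an assumption made so it runs anywhere) and filters with is_file() and is_dir():

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "dir0").mkdir()
    (root / "dir1" / "dir10").mkdir(parents=True)
    for rel in ("file0.txt", "dir0/file00.txt", "dir1/file10.txt"):
        (root / rel).touch()

    # '**/*' yields every entry below root; filter by type afterwards.
    files_only = sorted(
        p.relative_to(root).as_posix() for p in root.glob("**/*") if p.is_file()
    )
    dirs_only = sorted(
        p.relative_to(root).as_posix() for p in root.glob("**/*") if p.is_dir()
    )

print(files_only)  # ['dir0/file00.txt', 'dir1/file10.txt', 'file0.txt']
print(dirs_only)   # ['dir0', 'dir1', 'dir1/dir10']
```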

How to access files in multiple sub-directories using glob

Instead of specifying .jpg within the path itself, specify it in the glob() pattern. So, you can try this:

import glob
file = '/content/drive/My Drive/Jacob_Images/Substrate_Training/B_Substrate/Originals'
for f in glob.glob(file + "/*.jpg"):
    print(f)

Or you can use the code below, which is adapted from another post:

for i in glob.glob('/home/studio-lab-user/sagemaker-studiolab-notebooks/Misc/tsv/**/*.csv', recursive=True):
    print(i)

Your code also works. I am certain the issue is with the path you are providing.
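Because glob returns an empty list for a wrong path instead of raising, one way to surface a path problem early is to check the directory first. The helper list_images below is a hypothetical name, sketched under the assumption that a missing directory should be treated as an error:

```python
import glob
import os
import tempfile


def list_images(directory, pattern="*.jpg"):
    """Glob for pattern inside directory, failing loudly if it does not exist."""
    if not os.path.isdir(directory):
        raise FileNotFoundError(f"not a directory: {directory!r}")
    return sorted(glob.glob(os.path.join(directory, pattern)))


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.jpg"), "w").close()
    found = [os.path.basename(p) for p in list_images(tmp)]

    try:
        # This subfolder was never created, so the check should trip.
        list_images(os.path.join(tmp, "Originals"))
        raised = False
    except FileNotFoundError:
        raised = True

print(found)   # ['a.jpg']
print(raised)  # True
```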



