Python Glob Multiple Filetypes

Python glob multiple filetypes

Maybe there is a better way, but how about:

import glob
types = ('*.pdf', '*.cpp') # the tuple of file types
files_grabbed = []
for files in types:
    files_grabbed.extend(glob.glob(files))

# files_grabbed is the list of pdf and cpp files

Perhaps there is another way, so wait in case someone else comes up with a better answer.
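One possible variation on the loop above, as a minimal sketch: wrap it in a function that also removes duplicates (two patterns can match the same file) and sorts the result. The name glob_many and its root parameter are made up for this illustration and are not part of the glob module.

```python
import glob
import os


def glob_many(patterns, root="."):
    """Return sorted, de-duplicated paths under root matching any pattern.

    Hypothetical helper: it simply calls glob.glob once per pattern.
    """
    matches = set()
    for pattern in patterns:
        # os.path.join lets the same patterns be reused for any directory.
        matches.update(glob.glob(os.path.join(root, pattern)))
    return sorted(matches)


# Example call: all pdf and cpp files in the current directory.
files_grabbed = glob_many(("*.pdf", "*.cpp"))
```

Using a set costs nothing noticeable here and avoids listing a file twice when patterns overlap, for example '*.pdf' together with 'report.*'.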

pathlib.Path().glob() and multiple file extensions

If you need to use pathlib.Path.glob(), call it once per pattern:

from pathlib import Path
def get_files(extensions):
    all_files = []
    for ext in extensions:
        all_files.extend(Path('.').glob(ext))
    return all_files

files = get_files(('*.txt', '*.py', '*.cfg'))
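If globbing itself is not a requirement, another option is to list the directory once and filter on Path.suffix. This is only a sketch: files_with_suffixes is a hypothetical name, and lowercasing the suffix (so that '.PY' also matches '.py') is a choice made for this example, not something the original answer asks for.

```python
from pathlib import Path


def files_with_suffixes(directory, suffixes):
    """Return files directly inside directory whose suffix is in suffixes.

    Hypothetical helper; suffixes are compared case-insensitively.
    """
    wanted = {s.lower() for s in suffixes}
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in wanted
    )


# Note that suffixes include the leading dot, unlike glob patterns.
files = files_with_suffixes(".", (".txt", ".py", ".cfg"))
```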

Use one glob.glob instead of multiple glob.glob

You cannot use brace-style patterns such as glob.glob('*.{JPG,png,...}'). If you take a look at the source code, you will see:

def glob(pathname):
    """Return a list of paths matching a pathname pattern.
    ....
    """
    return list(iglob(pathname))

If you then look at the source of iglob, you will see:

def iglob(pathname):
    ....
    ....
    dirname, basename = os.path.split(pathname)

# The pattern is only split into a directory part and a file part here.
# Braces are never expanded, so '{JPG,png}' is matched as literal text.

Therefore, glob only supports simple patterns: *, ? and [...] character ranges.
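If you still want to write a single brace pattern, one workaround is to expand the braces yourself and then call glob.glob for each resulting simple pattern. The sketch below assumes flat (non-nested) brace groups with comma-separated options; expand_braces and glob_braces are hypothetical helpers, not functions provided by the glob module.

```python
import fnmatch
import glob
import re


def expand_braces(pattern):
    """Expand {a,b,...} groups into a list of plain patterns.

    Hypothetical helper: handles flat groups only, not nested braces.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that later groups in the tail are expanded as well.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


def glob_braces(pattern):
    """Call glob.glob once for every expansion of the brace pattern."""
    files = []
    for simple in expand_braces(pattern):
        files.extend(glob.glob(simple))
    return files


# fnmatch, which glob uses for matching, treats braces as literal characters.
print(fnmatch.fnmatchcase("photo.png", "*.{JPG,png}"))  # False
print(expand_braces("*.{JPG,png}"))  # ['*.JPG', '*.png']
```

This still calls glob.glob once per extension internally; it only changes how the patterns are written.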

Use multiple file extensions for glob to find files

You could use os.walk, which looks in subdirectories as well.

import os

for root, dirs, files in os.walk("path/to/directory"):
    for file in files:
        if file.endswith((".py", ".json")): # The arg can be a tuple of suffixes to look for
            print(os.path.join(root, file))
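The same walk can be wrapped in a function that returns the paths instead of printing them, which is usually easier to reuse. This is a rough sketch: find_by_suffix is a hypothetical name, and the case-insensitive comparison is an assumption added here rather than part of the answer above.

```python
import os


def find_by_suffix(top, suffixes):
    """Return paths under top (recursively) whose names end with a suffix.

    Hypothetical helper; names are lowercased before comparison.
    """
    suffixes = tuple(s.lower() for s in suffixes)  # str.endswith needs a tuple
    found = []
    for root, _dirs, files in os.walk(top):
        for name in files:
            if name.lower().endswith(suffixes):
                found.append(os.path.join(root, name))
    return found


# Example call: every .py and .json file below the current directory.
paths = find_by_suffix(".", (".py", ".json"))
```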

Glob Multiple File Types

You're right, glob doesn't accept this kind of pattern. You need to call it once for each extension:

extern crate glob;

use glob::glob;

fn main() {
    for file_name_result in glob("example/**/*.json")
        .unwrap()
        .chain(glob("example/**/*.jsonc").unwrap())
    {
        match file_name_result {
            Ok(file_path) => {
                println!("Found:{}", file_path.display());
            }
            Err(e) => {
                eprintln!("ERROR: {}", e);
            }
        };
    }
}

Search multiple patterns using glob only once

glob understands shell-style path globbing, so you can simply do:

files1 = glob.glob('*type[12]*/*')

or if you needed to expand to more numbers, something like this (for 1 through 6):

files1 = glob.glob('*type[1-6]*/*')

Calling glob() only once will be faster, because each call has to read the current directory and each of its subdirectories (on a Unix system, via the readdir() function), and those reads are repeated for every call. The OS may cache the directory contents so they need not be read from disk again, but the call is still repeated, and glob() still has to compare every filename against the pattern.

That said, practically speaking, the performance difference isn't likely to be noticeable unless you have thousands of files and subdirectories.
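When the patterns cannot be merged into one character class, a similar single-pass effect can be had by listing the directory once and testing each name against several patterns with fnmatch. The following is only a sketch for a single, non-recursive directory; scan_once is a hypothetical name and not part of the glob module.

```python
import fnmatch
import os


def scan_once(directory, patterns):
    """Read directory once and keep files whose name matches any pattern.

    Hypothetical helper; does not descend into subdirectories.
    """
    with os.scandir(directory) as entries:
        return sorted(
            entry.path
            for entry in entries
            if entry.is_file()
            and any(fnmatch.fnmatch(entry.name, p) for p in patterns)
        )


# Example call: one directory read, two patterns.
files1 = scan_once(".", ("*.pdf", "*.cpp"))
```

As noted above, the saving is unlikely to matter for small directories, so this is mostly worth considering when there are many patterns or very many files.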


