Regular expression usage in glob.glob?
The easiest way would be to filter the glob results yourself. Here is how to do it using a simple list comprehension:
import glob
res = [f for f in glob.glob("*.txt") if "abc" in f or "123" in f or "a1b" in f]
for f in res:
    print(f)
You could also use a regexp and no glob:
import os
import re
res = [f for f in os.listdir(path) if re.search(r'(abc|123|a1b).*\.txt$', f)]
for f in res:
    print(f)
(By the way, naming a variable list is a bad idea since list is a Python type...)
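The regexp-only approach can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name regex_listdir is hypothetical, and the temporary directory with its sample file names exists only to make the example self-contained.

```python
import os
import re
import tempfile

def regex_listdir(directory, pattern):
    """Return the sorted names in `directory` that match `pattern` via re.search."""
    rx = re.compile(pattern)
    return sorted(name for name in os.listdir(directory) if rx.search(name))

# Demo on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["abc.txt", "x123.txt", "other.txt", "abc.log"]:
        open(os.path.join(tmp, name), "w").close()
    matches = regex_listdir(tmp, r"(abc|123|a1b).*\.txt$")
    print(matches)
```

Compiling the pattern once keeps the comprehension cheap when the directory is large.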
Regular expression and Python glob
glob does not support the regex alternation pipe symbol (|) that you used. It's better to use a regex pattern (re) to build your desired file list in one line and then iterate over it. You have 3 ranges, so with glob alone you would need 3 for loops. One way to do it using the regex you mentioned is as follows:
import re
import glob
dest_dir = "/tmp/folder3/"
for file in [f for f in glob.glob("/tmp/source/*.jpg") if re.search(r'([0-9]|[1-9][0-9]|[1-9][0-9][0-9]|1000)\.jpg', f)]:
    #shutil.copy(file, dest_dir)
    print(file)
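One caveat: re.search is unanchored, so a name such as 2000.jpg also passes the filter above, because its tail 0.jpg matches. If the file names are assumed to be purely numeric (an assumption, since the original question is not shown here), a stricter check could look like the sketch below. The helper name is_numbered_jpg and the sample paths are hypothetical.

```python
import os
import re

# Integers 0..1000 without leading zeros, followed by ".jpg".
NUMBERED_JPG = re.compile(r"(?:[0-9]|[1-9][0-9]{1,2}|1000)\.jpg")

def is_numbered_jpg(path):
    """True if the base name of `path` is exactly <0..1000>.jpg."""
    return NUMBERED_JPG.fullmatch(os.path.basename(path)) is not None

# Made-up sample paths for illustration.
paths = [
    "/tmp/source/7.jpg",
    "/tmp/source/1000.jpg",
    "/tmp/source/2000.jpg",
    "/tmp/source/007.jpg",
    "/tmp/source/a5.jpg",
]
selected = [p for p in paths if is_numbered_jpg(p)]
print(selected)
```

Matching against the base name with fullmatch means neither the directory part nor extra leading characters can satisfy the pattern by accident.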
Regular expression, glob, Python
For this specific case, glob already supports what you need (see fnmatch docs for glob wildcards). You can just do:
for filename in glob.glob("pc[23456]??.txt"):
If you need to be extra specific that the two trailing characters are numbers (some files might have non-numeric characters there), you can replace the ?s with [0123456789], but otherwise, I find the ? a little less distracting.
In a more complicated scenario, you might be forced to resort to regular expressions, and you could do so here with:
import os
import re
for filename in filter(re.compile(r'^pc_[2-6]\d\d\.txt$').match, os.listdir('.')):
but given that glob-style wildcards work well enough, you don't need to break out the big guns just yet.
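The difference between the ? wildcard and an explicit digit class can be checked without touching the filesystem, since glob uses fnmatch-style matching on names. A small sketch with made-up file names:

```python
import fnmatch

# Hypothetical file names for illustration.
names = ["pc200.txt", "pc2ab.txt", "pc199.txt", "pc650.txt", "pc700.txt"]

# "?" matches any single character, so "pc2ab.txt" slips through.
loose = [n for n in names if fnmatch.fnmatchcase(n, "pc[23456]??.txt")]

# An explicit digit class only accepts numeric characters.
strict = [n for n in names if fnmatch.fnmatchcase(n, "pc[23456][0-9][0-9].txt")]

print(loose)
print(strict)
```

fnmatchcase is used here so the comparison is case-sensitive on every platform.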
Finding file name using regex from glob
regex solution:
import os
import re
res=[i for i in os.listdir(BASEDIR) if re.match(r'test\.[a-zA-Z0-9]{8}\.js',i)]
print(res)
NOTE: the result contains only the file names; you can use os.path.join(BASEDIR, res[i]) to get the full path.
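Building the full paths with os.path.join can be folded into the same comprehension. The sketch below also swaps re.match for re.fullmatch, which is a change from the answer above: re.match only anchors the start, so a name like test.a1b2c3d4.json would otherwise pass. The helper name matching_paths, the directory name "build", and the sample names are hypothetical.

```python
import os
import re

PATTERN = re.compile(r"test\.[a-zA-Z0-9]{8}\.js")

def matching_paths(basedir, names):
    """Join `basedir` onto every name that matches PATTERN in full."""
    return [os.path.join(basedir, n) for n in names if PATTERN.fullmatch(n)]

# Made-up names; in practice these would come from os.listdir(basedir).
names = ["test.a1b2c3d4.js", "test.a1b2c3d4.json", "test.short.js"]
result = matching_paths("build", names)
print(result)
```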
Python Glob regex file search with for single result from multiple matches
glob accepts Unix wildcards, not regexes. Those are less powerful but what you're asking can still be achieved. This:
glob.glob("/path/to/file/*[!0-9]3.txt")
matches the files whose names end in 3.txt with a non-digit character immediately before the 3.
For other cases, you can use a list comprehension and regex:
[x for x in glob.glob("/path/to/file/*") if re.match(some_regex,os.path.basename(x))]
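To see how the two approaches differ, both can be run against a list of names. This is a sketch with made-up file names. The regex shown is one possible choice for some_regex, not the one from the original question, and it uses re.search because the pattern is anchored at the end of the name only.

```python
import fnmatch
import re

# Hypothetical base names for illustration.
names = ["a3.txt", "13.txt", "3.txt", "x23.txt", "file_3.txt"]

# Wildcard version: [!0-9] must consume one non-digit character before the 3.
by_glob = [n for n in names if fnmatch.fnmatchcase(n, "*[!0-9]3.txt")]

# Regex version: a negative lookbehind also accepts a bare "3.txt".
by_regex = [n for n in names if re.search(r"(?<![0-9])3\.txt$", n)]

print(by_glob)
print(by_regex)
```

The wildcard form needs an actual character before the 3, so a file named exactly 3.txt is only picked up by the regex form.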
Filesystem independent way of using glob.glob and regular expressions with unicode filenames in Python
I'm assuming you want to match unicode equivalent filenames, e.g. you expect an input pattern of u'\xE9*' to match both filenames u'\xE9qui' and u'e\u0301qui' on any operating system, i.e. character-level pattern matching.
You have to understand that this is not the default on Linux, where bytes are taken as bytes, and where not every filename is a valid unicode string in the current system encoding (although Python 3 uses the 'surrogateescape' error handler to represent these as str anyway).
With that in mind, this is my solution:
import fnmatch
import os
import sys
import unicodedata

def myglob(pattern, directory=u'.'):
    pattern = unicodedata.normalize('NFC', pattern)
    results = []
    enc = sys.getfilesystemencoding()
    for name in os.listdir(directory):
        if isinstance(name, bytes):
            try:
                name = name.decode(enc)
            except UnicodeDecodeError:
                # Filenames that are not proper unicode won't match any pattern
                continue
        if fnmatch.filter([unicodedata.normalize('NFC', name)], pattern):
            results.append(name)
    return results
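The effect of the normalization step can be seen in isolation, without a directory listing. A small sketch using the two example filenames from the text above:

```python
import fnmatch
import unicodedata

composed = "\xe9qui"        # "é" as a single code point
decomposed = "e\u0301qui"   # "e" followed by a combining acute accent
pattern = unicodedata.normalize("NFC", "\xe9*")

# Without normalization the decomposed name starts with a plain "e" and fails.
raw_match = fnmatch.fnmatchcase(decomposed, pattern)

# After NFC normalization both spellings compare equal and match.
nfc_match = fnmatch.fnmatchcase(unicodedata.normalize("NFC", decomposed), pattern)

print(raw_match, nfc_match)
```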
how can I use a particular regex with glob
I did the following:
import glob
import re
local = glob.glob('/Users/tp/Downloads/example/*/[A-Z]*-circle.txt')
for filePath in local:
    # Escape the dot so it only matches a literal "."
    matches = re.findall(r"[A-Z]{2,4}-circle\.txt", filePath)
    if matches:
        print(filePath)
and that worked!