Get a filtered list of files in a directory
import glob
jpgFilenamesList = glob.glob('145592*.jpg')
See glob in the Python documentation.
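The same pattern also works against a specific directory; a minimal self-contained sketch (the sample filenames and temporary directory below are illustrative only):

```python
import glob
import os
import tempfile

# create a throwaway directory with a couple of sample files
tmp = tempfile.mkdtemp()
for name in ('145592_a.jpg', '145592_b.jpg', 'other.jpg'):
    open(os.path.join(tmp, name), 'w').close()

# only files matching the prefix pattern are returned
matches = sorted(glob.glob(os.path.join(tmp, '145592*.jpg')))
print([os.path.basename(m) for m in matches])  # ['145592_a.jpg', '145592_b.jpg']
```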
Filter a list of files from a directory in Python
You can use os:
import datetime
import os

thedate = datetime.datetime.now()
# keep only filenames containing today's date (mydir is the directory to scan)
filelist = [f for f in os.listdir(mydir) if thedate.strftime("%Y-%m-%d") in f]
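If you want shell-style wildcards instead of a plain substring test, fnmatch.filter pairs well with os.listdir; a small sketch (the sample names and temporary directory are assumptions for the demo):

```python
import fnmatch
import os
import tempfile

# illustrative directory populated with sample files
mydir = tempfile.mkdtemp()
for name in ('2024-01-01_report.txt', '2024-01-02_report.txt', 'notes.md'):
    open(os.path.join(mydir, name), 'w').close()

# fnmatch.filter applies a shell-style wildcard to the whole listing
filelist = sorted(fnmatch.filter(os.listdir(mydir), '*_report.txt'))
print(filelist)  # ['2024-01-01_report.txt', '2024-01-02_report.txt']
```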
Getting a list of filtered filenames from directory and sub-directories into an array
Try this:
'Return a collection of file objects given a starting folder and a file pattern
' e.g. "*.txt"
'Pass False for the last parameter if you don't want to check subfolders
Function GetMatches(startFolder As String, filePattern As String, _
        Optional subFolders As Boolean = True) As Collection
    Dim fso, fldr, f, subFldr
    Dim colFiles As New Collection
    Dim colSub As New Collection

    Set fso = CreateObject("scripting.filesystemobject")
    colSub.Add startFolder

    Do While colSub.Count > 0
        Set fldr = fso.GetFolder(colSub(1))
        colSub.Remove 1
        For Each f In fldr.Files
            'check filename pattern
            If UCase(f.Name) Like UCase(filePattern) Then colFiles.Add f
        Next f
        If subFolders Then
            For Each subFldr In fldr.SubFolders
                colSub.Add subFldr.Path
            Next subFldr
        End If
    Loop

    Set GetMatches = colFiles
End Function
Example usage:
Dim colFiles As Collection, f, wb As Workbook

Set colFiles = GetMatches("C:\something\", "*RENS_RES*.xlsx")
For Each f In colFiles
    Set wb = Workbooks.Open(f.Path)
    'work with wb
    wb.Close False
Next f
List files in a directory - and filter results
For filtering - or globbing - I assume I can manually do this in my code
Do not reinvent the wheel: glob, fnmatch, and wordexp are standard.
How can I list all "*.txt"
The expansion of *.txt (globbing) is done by your shell as part of filename expansion; the matching names are split into separate arguments before the ls command is executed. Expanding globbing expressions is not part of ls, and supporting globbing in ls would be a non-standard extension.
To list files in a directory, use scandir (with alphasort). A good example of scandir is in the Linux man page for scandir:
#define _DEFAULT_SOURCE
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    struct dirent **namelist;
    int n;

    n = scandir(".", &namelist, NULL, alphasort);
    if (n == -1) {
        perror("scandir");
        exit(EXIT_FAILURE);
    }

    while (n--) {
        printf("%s\n", namelist[n]->d_name);
        free(namelist[n]);
    }
    free(namelist);

    exit(EXIT_SUCCESS);
}
An ls program would iterate over each argument and check whether it is a file; if it is, list it with permissions (one call to stat plus formatting). If an argument is a directory, it would use scandir and then stat each file, plus formatting. Note that the output format of ls is actually specified in some cases (but still leaves much freedom to implementations).
How can I list all "*.txt" files in a directory in good old C?
There's another example in the man page for glob. The code is quite simple:
glob_t globbuf = {0};
/* check the return value: gl_pathv may be NULL when nothing matched */
if (glob("*.txt", 0, NULL, &globbuf) == 0) {
    for (char **file = globbuf.gl_pathv; *file != NULL; ++file) {
        printf("%s\n", *file);
    }
}
globfree(&globbuf);
The POSIX specifications and the Linux man-pages project are great for studying; the man pages have a SEE ALSO section that is great for discovering related functions. Also see the GNU manual.
Filtering specific files in a folder based on a list
You can easily do it with the following function:
def filter_list(strings, substrings):
    # keep entries that contain any of the given substrings
    return [s for s in strings if any(sub in s for sub in substrings)]
ls_1 = ['C:/A/results/fie_d_t_group_Jack.xlsx',
'C:/A/results/fie_d_t_group_Bill.xlsx',
'C:/A/results/fie_d_t_group_Cort.xlsx',
'C:/A/results/fie_d_t_group_Niel.xlsx',
'C:/A/results/fie_d_t_group_Van.xlsx',
'C:/A/results/fie_d_t_group_Dick.xlsx',
'C:/A/results/fie_d_t_group_Nick.xlsx']
ls_2 = ["Jack", "Bill", "Cort", "Nick"]
filter_list(ls_1, ls_2)
# ['C:/A/results/fie_d_t_group_Jack.xlsx',
# 'C:/A/results/fie_d_t_group_Bill.xlsx',
# 'C:/A/results/fie_d_t_group_Cort.xlsx',
# 'C:/A/results/fie_d_t_group_Nick.xlsx']
If you want to save them to another location, now that you have the list of files you want, you can move them using the shutil.move() function.
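A minimal sketch of that move step (the source files, destination folder, and name list here are illustrative):

```python
import os
import shutil
import tempfile

# illustrative source and destination folders with sample files
src_dir = tempfile.mkdtemp()
dest_dir = tempfile.mkdtemp()
for name in ('fie_d_t_group_Jack.xlsx', 'fie_d_t_group_Van.xlsx'):
    open(os.path.join(src_dir, name), 'w').close()

# select the files whose names contain any wanted substring, then move them
wanted = ['Jack']
to_move = [f for f in os.listdir(src_dir) if any(w in f for w in wanted)]
for f in to_move:
    shutil.move(os.path.join(src_dir, f), os.path.join(dest_dir, f))

print(sorted(os.listdir(dest_dir)))  # ['fie_d_t_group_Jack.xlsx']
```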
Filtered list of files with grep, forward that list of files to grep further
Suggesting an awk script that performs an AND operation on regexps (actually any logical expression of one or more regexps).
awk '/regExp-pattern-1/ && /regExp-pattern-2/ {print FILENAME}' RS="&@&@&@&@" file-1 file-2 ...
The advantage of this approach: every file is scanned only once, and each file is scanned as a single record, because RS is set to a string unlikely to occur in the input (grep, by contrast, scans line by line). The disadvantages: regexp patterns in awk are case-sensitive, and awk takes a list of files only (no recursive folder traversal).
As for pure grep, you can nest grep commands.
grep -il "regExp-pattern-2" $(grep -irl "regExp-pattern-1" folder-path)