Get a Filtered List of Files in a Directory


import glob

jpgFilenamesList = glob.glob('145592*.jpg')

See glob in the Python documentation.
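A small sketch of the same idea with pathlib, which wraps the same glob rules and can also reach into subdirectories via rglob (the function name here is just for illustration):

```python
from pathlib import Path

def jpgs_under(folder):
    """Return sorted .jpg file names under folder, including subdirectories."""
    return sorted(p.name for p in Path(folder).rglob('*.jpg'))
```

Use `Path(folder).glob('*.jpg')` instead of `rglob` to stay in a single folder.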

Filter list of files from directory in python

You can use the os module:

import datetime
import os

thedate = datetime.datetime.now()
# keep only names containing today's date, e.g. "2024-05-01"
filelist = [f for f in os.listdir(mydir) if thedate.strftime("%Y-%m-%d") in f]
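If the filter is a wildcard pattern rather than a fixed substring, fnmatch applies glob-style rules to an existing list of names. A sketch (the function name is illustrative):

```python
import fnmatch
import os

def matching_files(directory, pattern):
    """Glob-style filtering of a directory listing, e.g. pattern '2024-*.csv'."""
    return fnmatch.filter(os.listdir(directory), pattern)
```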

Getting a list of filtered filenames from directory and sub-directories into an array

Try this:

'Return a collection of file objects given a starting folder and a file pattern
' e.g. "*.txt"
'Pass False for the last parameter if you don't want to check subfolders
Function GetMatches(startFolder As String, filePattern As String, _
                    Optional subFolders As Boolean = True) As Collection

    Dim fso, fldr, f, subFldr
    Dim colFiles As New Collection
    Dim colSub As New Collection

    Set fso = CreateObject("scripting.filesystemobject")

    colSub.Add startFolder

    Do While colSub.Count > 0

        Set fldr = fso.getfolder(colSub(1))
        colSub.Remove 1

        For Each f In fldr.Files
            'check filename pattern
            If UCase(f.Name) Like UCase(filePattern) Then colFiles.Add f
        Next f

        If subFolders Then
            For Each subFldr In fldr.subFolders
                colSub.Add subFldr.Path
            Next subFldr
        End If

    Loop

    Set GetMatches = colFiles

End Function

Example usage:

Dim colFiles As Collection, f, wb As Workbook
Set colFiles = GetMatches("C:\something\", "*RENS_RES*.xlsx")
For Each f In colFiles
    Set wb = Workbooks.Open(f.Path)
    'work with wb
    wb.Close False
Next f
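For comparison, a rough Python equivalent of GetMatches (not part of the original answer): os.walk handles the recursion and fnmatch handles the wildcard test. Note that fnmatch is only case-insensitive on platforms where the filesystem is, unlike the UCase comparison above.

```python
import fnmatch
import os

def get_matches(start_folder, file_pattern, sub_folders=True):
    """Return full paths under start_folder matching file_pattern."""
    matches = []
    for root, dirs, files in os.walk(start_folder):
        matches.extend(os.path.join(root, f)
                       for f in fnmatch.filter(files, file_pattern))
        if not sub_folders:
            break  # os.walk yields the top folder first; stop there
    return matches
```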

list files in a directory - and filter results

For filtering - or globbing - I assume I can do this manually in my code.

Do not reinvent the wheel - glob, fnmatch, and wordexp are standard.

How can I list all "*.txt" files in a directory?

The expansion of the *.txt glob is done by your shell as part of filename expansion; the matches are split into separate arguments before the ls command executes. Expanding glob expressions is not part of ls, and supporting globbing in ls would be a non-standard extension.

To list files in a directory, use scandir (with alphasort). A complete example is in the Linux man page for scandir:

#define _DEFAULT_SOURCE
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    struct dirent **namelist;
    int n;

    n = scandir(".", &namelist, NULL, alphasort);
    if (n == -1) {
        perror("scandir");
        exit(EXIT_FAILURE);
    }

    while (n--) {
        printf("%s\n", namelist[n]->d_name);
        free(namelist[n]);
    }
    free(namelist);

    exit(EXIT_SUCCESS);
}

An ls implementation would iterate over each argument: if it is a file, list it with permissions (one call to stat plus formatting); if it is a directory, use scandir and then stat each file, plus formatting. Note that the output format of ls is actually specified in some cases (but still leaves much freedom to implementations).
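The same scan-then-inspect pattern exists in Python as os.scandir, whose entries carry cached file-type information. A sketch of the listing step only (the function name is illustrative):

```python
import os

def list_sorted(directory='.'):
    """Names in directory sorted like alphasort, each with a directory flag."""
    entries = sorted(os.scandir(directory), key=lambda e: e.name)
    return [(e.name, e.is_dir()) for e in entries]
```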

How can I list all "*.txt" files in a directory in good old C?

There's another example in the man page for glob. The code is quite simple:

#include <glob.h>
#include <stdio.h>

glob_t globbuf = {0};
if (glob("*.txt", 0, NULL, &globbuf) == 0) {   /* nonzero means no match or error */
    for (char **file = globbuf.gl_pathv; *file != NULL; ++file) {
        printf("%s\n", *file);
    }
}
globfree(&globbuf);

The POSIX specifications and the linux-man-pages project are great for studying; the Linux man pages have a SEE ALSO section that is great for discovering related functions. Also see the GNU C Library manual.

Filtering specific files in a folder based on a list

You can easily do it with the following function:

def filter_list(strings, substrs):
    """Keep the strings that contain any of the given substrings."""
    return [st for st in strings if any(sub in st for sub in substrs)]

ls_1 = ['C:/A/results/fie_d_t_group_Jack.xlsx',
'C:/A/results/fie_d_t_group_Bill.xlsx',
'C:/A/results/fie_d_t_group_Cort.xlsx',
'C:/A/results/fie_d_t_group_Niel.xlsx',
'C:/A/results/fie_d_t_group_Van.xlsx',
'C:/A/results/fie_d_t_group_Dick.xlsx',
'C:/A/results/fie_d_t_group_Nick.xlsx']
ls_2 = ["Jack", "Bill", "Cort", "Nick"]

filter_list(ls_1, ls_2)
# ['C:/A/results/fie_d_t_group_Jack.xlsx',
# 'C:/A/results/fie_d_t_group_Bill.xlsx',
# 'C:/A/results/fie_d_t_group_Cort.xlsx',
# 'C:/A/results/fie_d_t_group_Nick.xlsx']

If you want to save them to another location, now that you have the list of files you want to save, you can simply move them with the shutil.move() function.
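A sketch of that last step, assuming a destination folder (the function name and dest_dir are illustrative):

```python
import os
import shutil

def move_all(paths, dest_dir):
    """Move each file in paths into dest_dir; return the new locations."""
    os.makedirs(dest_dir, exist_ok=True)
    return [shutil.move(p, os.path.join(dest_dir, os.path.basename(p)))
            for p in paths]
```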

Filtered list of files with grep, forward that list of files to grep further

Suggesting an awk script that performs an AND operation on RegExp patterns (actually any logical expression of one or more RegExp):

 awk '/regExp-pattern-1/ && /regExp-pattern-2/ {print FILENAME}' RS="&@&@&@&@" files-1 file-2 ...

The advantage of this approach: every file is scanned only once, and each file is scanned as a single record (grep scans line by line).

The disadvantages: RegExp patterns in awk are case sensitive, and awk takes only a list of files (no recursive folder traversal).
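The same scan-once idea can be sketched in Python: read each file as a single string and require every pattern to match (patterns here are plain regular expressions; the function name is illustrative, and re.IGNORECASE lifts awk's case-sensitivity limitation):

```python
import re

def files_matching_all(paths, patterns):
    """Return the paths whose entire contents match every regex in patterns."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    hits = []
    for path in paths:
        with open(path, errors='replace') as fh:
            text = fh.read()  # one record per file, like the awk RS trick
        if all(rx.search(text) for rx in compiled):
            hits.append(path)
    return hits
```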

As for pure grep

Suggesting nested grep commands:

 grep -il "regExp-pattern-2" $(grep -irl "regExp-pattern-1" folder-path)

