os.walk Without Hidden Folders

os.walk without hidden folders

No, there is no option to os.walk() that'll skip those. You'll need to do so yourself (which is easy enough):

for root, dirs, files in os.walk(path):
    files = [f for f in files if not f[0] == '.']
    dirs[:] = [d for d in dirs if not d[0] == '.']
    # use files and dirs

Note the dirs[:] = slice assignment; os.walk recursively traverses the subdirectories listed in dirs. Because the slice assignment replaces the contents of dirs in place with only the entries that satisfy a criterion (e.g., directories whose names don't begin with .), os.walk() will not visit directories that fail to meet it. A plain dirs = [...] would only rebind the local name and leave the list that os.walk holds untouched.

This only works if you leave the topdown keyword argument set to True (the default). From the documentation of os.walk():

When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.
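As a minimal sketch of the pruning described above, the generator below wraps os.walk and yields only non-hidden entries. The name walk_visible is hypothetical (it is not part of os), and "hidden" is assumed to mean nothing more than "starts with a dot":

```python
import os


def walk_visible(top):
    """Yield (root, dirs, files) like os.walk, skipping dot-prefixed names.

    walk_visible is an illustrative name; "hidden" here only means
    "starts with a dot", which is a Unix convention.
    """
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment mutates the list os.walk holds, so pruned
        # directories are never descended into.
        dirs[:] = [d for d in dirs if not d.startswith('.')]
        files = [f for f in files if not f.startswith('.')]
        yield root, dirs, files
```

Because the same dirs list is passed through, a caller can still prune it further before the walk resumes.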

os.walk exclude .svn folders

Try this:

for root, subFolders, files in os.walk(rootdir):
    if '.svn' in subFolders:
        subFolders.remove('.svn')

And then continue processing.
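The same in-place removal extends to several directory names at once. The following is a sketch only; EXCLUDED_DIRS and walk_without are hypothetical names, and the set contents are just examples:

```python
import os

# Hypothetical set of directory names to prune; adjust as needed.
EXCLUDED_DIRS = frozenset({'.svn', '.git', '.hg'})


def walk_without(rootdir, excluded=EXCLUDED_DIRS):
    """Walk rootdir, never descending into directories named in excluded."""
    for root, sub_folders, files in os.walk(rootdir):
        # Edit the list in place so os.walk sees the pruned version.
        sub_folders[:] = [d for d in sub_folders if d not in excluded]
        yield root, sub_folders, files
```

Matching is by exact directory name at every level, so a nested a/.svn is skipped as well as a top-level .svn.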

Exclude hidden files in Python

In your original code, there are multiple boolean args creating different paths. As far as I can tell, your extension == '.' path was the only one where check_attributes was being called, so that might have been the issue. I decided to take a crack at rewriting it in two phases: (1) get the files, either recursively or not, then (2) filter the files with the args provided. Here's what I came up with:

import argparse
import os
import win32api
import win32con


def count_files(args):
    files = []

    # Get the files differently based on whether recursive or not.
    if args.recursive:
        # Note here I changed how you're iterating. os.walk yields tuples, so
        # you can unpack each tuple in your for. current_dir is the current dir it's in
        # while walking and found_files are all the files in that dir
        for current_dir, dirs, found_files in os.walk(top=args.path, topdown=True):
            files += [os.path.join(current_dir, found_file) for found_file in found_files]
    else:
        # Note the os.path.join with the dir each file is in. It's important to store the
        # full path of each file, not just its name.
        files += [os.path.join(args.path, found_file) for found_file in os.listdir(args.path)
                  if os.path.isfile(os.path.join(args.path, found_file))]

    filtered_files = []
    for found_file in files:
        print(found_file)
        if not args.hidden and (win32api.GetFileAttributes(found_file) & win32con.FILE_ATTRIBUTE_HIDDEN):
            continue  # hidden == False and file has hidden attribute, go to next one

        if args.extension and not found_file.endswith(args.extension):
            continue  # File doesn't end in provided extension

        filtered_files.append(found_file)

    print(f'Length: {len(filtered_files)}')
    return len(filtered_files)


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Count files in a directory.')
    # Note that I took advantage of some other argparse features here like
    # required vs optional arguments and boolean types
    parser.add_argument('path')
    parser.add_argument('--recursive', action='store_true', default=False)
    parser.add_argument('--hidden', action='store_true', default=False)
    parser.add_argument('--extension', type=str)

    args = parser.parse_args()
    count_files(args)
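The win32api calls above require pywin32 and only run on Windows. If that dependency is unwanted, a best-effort check can probably be built from the standard library alone. The sketch below is an assumption-laden alternative, not the approach used in the answer: is_hidden is a hypothetical helper that treats dot-prefixed names as hidden on every platform and, where os.stat() exposes st_file_attributes (Windows only), also tests the hidden bit.

```python
import os
import stat


def is_hidden(path):
    """Best-effort hidden check (a sketch, not a standard API).

    Dot-prefixed names count as hidden everywhere; on Windows the
    FILE_ATTRIBUTE_HIDDEN bit from os.stat() is checked as well.
    """
    name = os.path.basename(os.path.normpath(path))
    if name.startswith('.') and name not in ('.', '..'):
        return True
    # st_file_attributes exists only on Windows; default to 0 elsewhere.
    attributes = getattr(os.stat(path), 'st_file_attributes', 0)
    return bool(attributes & stat.FILE_ATTRIBUTE_HIDDEN)
```

With such a helper, the filtering loop could call is_hidden(found_file) in place of the win32api expression.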

How to ignore hidden files using os.listdir()?

You can write one yourself:

import os

def listdir_nohidden(path):
    for f in os.listdir(path):
        if not f.startswith('.'):
            yield f

Or you can use a glob:

import glob
import os

def listdir_nohidden(path):
    return glob.glob(os.path.join(path, '*'))

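Either of these will ignore all filenames beginning with '.'. Note that the two are not interchangeable: the generator yields bare names, as os.listdir() does, while the glob version returns each name joined with path.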

How does one ignore a directory like .snapshot?

What about checking for '.snapshot' or '.git' in the path and continuing if found?

...
for path, dirs, files in os.walk(datasource):
    if '.snapshot' in path or '.git' in path:
        continue
...
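Two caveats apply to the substring test. It also matches directories whose names merely contain the text (for example my.github), and continue only skips processing: os.walk still descends into everything under .git. Pruning dirs in place avoids the descent; where only the test needs tightening, a component-wise check is one option. In this sketch, SKIP_NAMES and has_skipped_component are hypothetical names:

```python
import os

SKIP_NAMES = frozenset({'.snapshot', '.git'})  # hypothetical names to skip


def has_skipped_component(path, skip_names=SKIP_NAMES):
    """Return True if any whole component of path is in skip_names."""
    parts = os.path.normpath(path).split(os.sep)
    return any(part in skip_names for part in parts)
```

Inside the loop, `if has_skipped_component(path): continue` would then replace the substring test.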

Ignore hidden files while recursively scanning directories

You can just replace if not recentry.path.startswith('.'): with if not recentry.name.startswith('.'):, so that it will ignore your .DS_Store file.
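The original scanning code is not shown here, so the following is only a sketch of what a recursive os.scandir() walk that tests entry.name might look like; scan_visible is a hypothetical name:

```python
import os


def scan_visible(path):
    """Recursively yield paths of non-hidden files under path.

    entry.name is the bare file name, while entry.path includes the
    directory prefix, so entry.name is the place to look for a dot.
    """
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.name.startswith('.'):
                continue  # skips .DS_Store, .git, and similar
            if entry.is_dir(follow_symlinks=False):
                yield from scan_visible(entry.path)
            else:
                yield entry.path
```

Hidden directories are skipped before recursion, so nothing beneath them is visited.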

Filtering os.walk() dirs and files

This solution uses fnmatch.translate to convert glob patterns to regular expressions (it assumes that includes is used only for files):

import fnmatch
import os
import os.path
import re

includes = ['*.doc', '*.odt'] # for files only
excludes = ['/home/paulo-freitas/Documents'] # for dirs and files

# transform glob patterns to regular expressions
includes = r'|'.join([fnmatch.translate(x) for x in includes])
excludes = r'|'.join([fnmatch.translate(x) for x in excludes]) or r'$.'

for root, dirs, files in os.walk('/home/paulo-freitas'):

    # exclude dirs
    dirs[:] = [os.path.join(root, d) for d in dirs]
    dirs[:] = [d for d in dirs if not re.match(excludes, d)]

    # exclude/include files
    files = [os.path.join(root, f) for f in files]
    files = [f for f in files if not re.match(excludes, f)]
    files = [f for f in files if re.match(includes, f)]

    for fname in files:
        print(fname)
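To see why matching against full paths works, it may help to try the translated patterns on their own. In fnmatch, * also matches path separators, and the translated expression is anchored at the end, so '*.doc' matches a full path ending in .doc but not one ending in .docx. The sketch below assumes a recent Python 3; is_included is a hypothetical helper name:

```python
import fnmatch
import re

# Build one alternation from several glob patterns, as above.
patterns = ['*.doc', '*.odt']
combined = re.compile('|'.join(fnmatch.translate(p) for p in patterns))


def is_included(path):
    """Return True if the whole path matches any of the glob patterns."""
    return combined.match(path) is not None
```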

