How to Find All Immediate Sub-Directories of The Current Directory on Linux

How to only find files in a given directory, and ignore subdirectories using bash

If you just want to limit the find to the first level you can do:

 find /dev -maxdepth 1 -name 'abc-*'

... or if you particularly want to exclude the .udev directory, you can do:

 find /dev -name '.udev' -prune -o -name 'abc-*' -print

Getting a list of all subdirectories in the current directory

Do you mean immediate subdirectories, or every directory right down the tree?

Either way, you could use os.walk to do this:

os.walk(directory)

will yield a 3-tuple for each directory it visits (including the starting one). The first entry in the 3-tuple is the directory's path, so

[x[0] for x in os.walk(directory)]

should give you all of the subdirectories, recursively.

Note that the second entry in the tuple is the list of child directories of the entry in the first position, so you could use this instead, but it's not likely to save you much.
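A minimal sketch of that alternative, assuming directory is the path being walked; it enumerates the same directories, minus the top-level one:

import os

directory = '.'  # example starting point
subdirs = [os.path.join(root, d)
           for root, dirs, files in os.walk(directory)
           for d in dirs]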

However, you could use it just to give you the immediate child directories:

next(os.walk('.'))[1]

Or see the other solutions already posted, using os.listdir and os.path.isdir, including those at "How to get all of the immediate subdirectories in Python".

Execute command in all immediate subdirectories

You can use this simple loop, which iterates over the subdirectories one level deep and executes a command in each:

for d in ./*/ ; do (cd "$d" && ls -al); done

The parentheses in (cd "$d" && ls -al) open a subshell to run the commands. Since it is a child shell, the parent shell (the one from which you run this loop) keeps its current directory and other environment variables unchanged.

Wrapping it in a function in a proper zsh script as

#!/bin/zsh

function runCommand() {
    # eval lets a command passed as a single string be re-split and run;
    # the subshell keeps the caller's working directory unchanged.
    for d in ./*/ ; do
        ( cd "$d" && eval "$@" )
    done
}

runCommand "ls -al"

should work just fine for you.
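For comparison, the same one-level-deep pattern sketched in Python: subprocess.run's cwd argument plays the role of the subshell's cd, so the parent process's working directory is left untouched.

import os
import subprocess

for entry in os.scandir('.'):
    if entry.is_dir():
        # Each child process gets its own working directory.
        subprocess.run(['ls', '-al'], cwd=entry.path)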

Find the name of subdirectories and process files in each

You should be able to do this with a single find command that embeds a shell command:

find /PROD -type d -execdir sh -c 'for f in *.json; do /tmp/test.py "$f"; done' \;

Note: -execdir is not POSIX-compliant, but the BSD (OSX) and GNU (Linux) versions of find support it; see below for a POSIX alternative.

  • The approach is to let find match directories, and then, in each matched directory, execute a shell with a file-processing loop (sh -c '<shellCmd>').
  • If not all subdirectories are guaranteed to have *.json files, change the shell command to for f in *.json; do [ -f "$f" ] && /tmp/test.py "$f"; done

Update: Two more considerations; tip of the hat to kenorb's answer:

  • By default, find processes the entire subtree of the input directory. To limit matching to immediate subdirectories, use -maxdepth 1[1]:

    find /PROD -maxdepth 1 -type d ...
  • As stated, -execdir - which runs the command passed to it in the directory currently being processed - is not POSIX compliant; you can work around this by using -exec instead and by including a cd command with the directory path at hand ({}) in the shell command:

    find /PROD -type d -exec sh -c 'cd "{}" && for f in *.json; do /tmp/test.py "$f"; done' \;

[1] Strictly speaking, you can place the -maxdepth option anywhere after the input file paths on the find command line - as an option, it is not positional. However, GNU find will issue a warning unless you place it before tests (such as -type) and actions (such as -exec).

How to get all of the immediate subdirectories in Python

I did some speed testing on various functions to return the full path to all current subdirectories.

tl;dr:
Always use scandir:

list_subfolders_with_paths = [f.path for f in os.scandir(path) if f.is_dir()]

Bonus: with scandir you can also get just the folder names by using f.name instead of f.path.
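For example (a minimal sketch; path is any directory, here the current one, and subfolder_names is just an illustrative name):

import os

path = '.'  # example path
# Folder names only, without the leading path.
subfolder_names = [f.name for f in os.scandir(path) if f.is_dir()]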

This (as well as all other functions below) will not use natural sorting. This means results will be sorted like this: 1, 10, 2. To get natural sorting (1, 2, 10), please have a look at https://stackoverflow.com/a/48030307/2441026
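If you only need the 1, 2, 10 ordering without an extra dependency, a common natural-sort key looks like this (a sketch of the usual idiom, not taken verbatim from the linked answer):

import re

def natural_key(s):
    # Split into digit and non-digit runs so numeric parts compare as integers.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', s)]

print(sorted(['1', '10', '2'], key=natural_key))  # ['1', '2', '10']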

Results:
scandir is ~3x faster than walk, ~32x faster than listdir (with filter), ~35x faster than pathlib, ~36x faster than plain listdir, and ~37x (!) faster than glob.

Scandir:           0.977
Walk:              3.011
Listdir (filter): 31.288
Pathlib:          34.075
Listdir:          35.501
Glob:             36.277

Tested with W7x64, Python 3.8.1. Folder with 440 subfolders.

In case you wonder whether listdir could be sped up by not calling os.path.join() twice: yes, but the difference is basically nonexistent.
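For reference, a sketch of that join-once variant (b_join_once is a hypothetical name; it is not part of the benchmark below):

import os

path = '.'  # any directory with subfolders

def b_join_once():
    # Compute os.path.join once per entry instead of twice.
    joined = (os.path.join(path, f) for f in os.listdir(path))
    return [p for p in joined if os.path.isdir(p)]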

Code:

import os
import pathlib
import timeit
import glob

path = r"<example_path>"

def a():
    list_subfolders_with_paths = [f.path for f in os.scandir(path) if f.is_dir()]
    # print(len(list_subfolders_with_paths))

def b():
    list_subfolders_with_paths = [os.path.join(path, f) for f in os.listdir(path) if os.path.isdir(os.path.join(path, f))]
    # print(len(list_subfolders_with_paths))

def c():
    list_subfolders_with_paths = []
    for root, dirs, files in os.walk(path):
        for dir in dirs:
            list_subfolders_with_paths.append(os.path.join(root, dir))
        break
    # print(len(list_subfolders_with_paths))

def d():
    list_subfolders_with_paths = glob.glob(path + '/*/')
    # print(len(list_subfolders_with_paths))

def e():
    list_subfolders_with_paths = list(filter(os.path.isdir, [os.path.join(path, f) for f in os.listdir(path)]))
    # print(len(list(list_subfolders_with_paths)))

def f():
    p = pathlib.Path(path)
    list_subfolders_with_paths = [x for x in p.iterdir() if x.is_dir()]
    # print(len(list_subfolders_with_paths))

print(f"Scandir: {timeit.timeit(a, number=1000):.3f}")
print(f"Listdir: {timeit.timeit(b, number=1000):.3f}")
print(f"Walk: {timeit.timeit(c, number=1000):.3f}")
print(f"Glob: {timeit.timeit(d, number=1000):.3f}")
print(f"Listdir (filter): {timeit.timeit(e, number=1000):.3f}")
print(f"Pathlib: {timeit.timeit(f, number=1000):.3f}")

How to get list of subdirectories names

I usually check for directories while assembling the list in one go. Assuming there is a directory called foo that I would like to check for sub-directories:

import os
output = [dI for dI in os.listdir('foo') if os.path.isdir(os.path.join('foo', dI))]

How to get a list of all folders in an entire drive

os.walk yields three-tuples for each directory traversed, in the form (currentdir, containeddirs, containedfiles). This listcomp:

[x[0] for x in os.walk(directory)]

just ignores the contents of each directory and just accumulates the directories it enumerates. It would be slightly nicer/more self-documenting if written with unpacking (using _ for stuff you don't care about), e.g.:

dirs = [curdir for curdir, _, _ in os.walk(directory)]

but they're both equivalent. To make it list for the entire drive, just provide the root of the drive as the directory argument to os.walk, e.g. for Windows:

c_drive_dirs = [curdir for curdir, _, _ in os.walk('C:\\')]

or for non-Windows:

alldirs = [curdir for curdir, _, _ in os.walk('/')]
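One caveat when walking an entire drive: os.walk silently skips directories it cannot read. If you want to see what was skipped, it accepts an onerror callback; a minimal sketch:

import os

def log_error(err):
    # err is the OSError raised for the unreadable directory.
    print(f"skipped: {err.filename} ({err.strerror})")

alldirs = [curdir for curdir, _, _ in os.walk('/', onerror=log_error)]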

How to count number of files in each directory?

Assuming you have GNU find, let it find the directories and let bash do the rest:

find . -type d -print0 | while IFS= read -r -d '' dir; do
    files=("$dir"/*)   # non-hidden entries; in an empty directory the unmatched pattern itself counts as 1 unless nullglob is set
    printf "%5d files in directory %s\n" "${#files[@]}" "$dir"
done
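If you would rather do this from Python, a rough equivalent built on os.walk (note: unlike the glob above, this also counts hidden files and leaves subdirectories out of the count):

import os

for dirpath, dirnames, filenames in os.walk('.'):
    # os.walk already separates files from directories for us.
    print(f"{len(filenames):5d} files in directory {dirpath}")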

