Number of Subdirectories in a Directory

How to count number of files in each directory?

Assuming you have GNU find, let it find the directories and let bash do the rest:

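shopt -s nullglob   # an empty directory expands to zero entries, not the literal pattern
find . -type d -print0 | while IFS= read -r -d '' dir; do
    files=("$dir"/*)   # non-hidden entries (files and subdirectories) directly inside $dir
    printf "%5d files in directory %s\n" "${#files[@]}" "$dir"
done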
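
For comparison, here is a rough Python sketch of the same per-directory count. It is not part of the original answer; the function name count_files_per_directory is made up for illustration. It relies on os.walk, which reports non-directory entries (hidden ones included) separately from sub-directories, so the numbers can differ from the glob-based loop above.

```python
import os

def count_files_per_directory(root):
    """Return {directory_path: number of non-directory entries directly inside}."""
    counts = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # filenames lists the non-directory entries of dirpath, hidden ones included
        counts[dirpath] = len(filenames)
    return counts

if __name__ == "__main__":
    for directory, n in sorted(count_files_per_directory(".").items()):
        print("%5d files in directory %s" % (n, directory))
```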

Count number of subdirectories within subdirectories

You need to use some kind of recursion for that task. What about a sub-routine that loops through the sub-directories and calls itself for each one? What I mean is the following:

@echo off
rem // Define constants here:
set "_PATH=%~1" & rem // (path of the root directory to process)
rem // Define global variables here:
set /A "$DEPTH=0" & rem // (variable to determine the greatest depth)

rem // Initialise variables:
set /A "DEEP=0" & rem // (depth of the current directory branch)
rem // Call recursive sub-routine, avoid empty argument:
if defined _PATH (call :SUB "%_PATH%") else (call :SUB ".")
rem // Return found depth:
echo %$DEPTH%
exit /B

:SUB <root_path>
rem // Loop through all sub-directories of the given one:
for /D %%D in ("%~1\*") do (
    rem // For each sub-directory increment depth counter:
    set /A "DEEP+=1"
    rem // For each sub-directory recursively call the sub-routine:
    call :SUB "%%~fD"
)
rem // Check whether current branch has the deepest directory hierarchy:
if %$DEPTH% lss %DEEP% set /A "$DEPTH=DEEP"
rem // Decrement depth counter before returning from sub-routine:
set /A "DEEP-=1"
exit /B
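
The same recursive idea can be sketched in Python, in case the control flow is easier to follow there. This is only an illustration, not a translation of the batch script line by line; the function name max_depth is an assumption of this sketch.

```python
import os

def max_depth(path):
    """Return the number of directory levels below path (0 if it has no sub-directories)."""
    deepest = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # follow_symlinks=False keeps symlink loops from causing endless recursion
            if entry.is_dir(follow_symlinks=False):
                # one level for the sub-directory itself, plus whatever lies below it
                deepest = max(deepest, 1 + max_depth(entry.path))
    return deepest

if __name__ == "__main__":
    print(max_depth("."))
```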

As an alternative, though with somewhat worse performance, you could count the backslashes (\) in the resolved paths of all sub-directories, take the greatest count, and subtract the count for the root directory from it, like this:

@echo off
rem // Define constants here:
set "_PATH=%~1" & rem // (path of the root directory to process)
rem // Define global variables here:
set /A "$DEPTH=0" & rem // (variable to determine the greatest depth)

rem // Change to root directory:
pushd "%_PATH%" || exit /B 1
rem // Resolve root directory:
call :SUB "."
rem // Store total depth of root directory:
set /A "CROOT=$DEPTH, $DEPTH=0"
rem // Process all sub-directories recursively:
for /D /R %%D in ("*") do (
    rem // Determine greatest depth relative to root:
    call :SUB "%%~fD" -%CROOT%
)
rem // Change back to original directory:
popd
rem // Return found depth:
echo %$DEPTH%
exit /B

:SUB <val_path> [<val_offset>]
rem // Resolve provided sub-directory:
set "ITEM=%~f1" & if not defined ITEM set "ITEM=."
rem // Initialise variables, apply count offset:
set "COUNT=%~2" & set /A "COUNT+=0"
rem // Count number of backslashes in provided path:
for %%C in ("%ITEM:\=" "%") do (
    set /A "COUNT+=1"
)
rem // Check whether current branch has the deepest directory hierarchy:
if %$DEPTH% lss %COUNT% set /A "$DEPTH=COUNT"
exit /B
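
The separator-counting variant might look like this in Python. Treat it as a sketch under stated assumptions: max_depth_by_separators is a hypothetical name, and os.walk stands in for the for /D /R loop.

```python
import os

def max_depth_by_separators(root):
    """Greatest depth below root, derived from the number of path separators."""
    root = os.path.abspath(root)
    # separator count of the root itself; subtracted to get a relative depth
    base = root.rstrip(os.sep).count(os.sep)
    deepest = 0
    for dirpath, dirnames, filenames in os.walk(root):
        depth = dirpath.rstrip(os.sep).count(os.sep) - base
        deepest = max(deepest, depth)
    return deepest

if __name__ == "__main__":
    print(max_depth_by_separators("."))
```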

Count the number of folders in a directory and subdirectories

I think you want something like:

import os

files = folders = 0

for _, dirnames, filenames in os.walk(path):
  # ^ this idiom means "we won't be using this value"
    files += len(filenames)
    folders += len(dirnames)

print("{:,} files, {:,} folders".format(files, folders))

Note that this only iterates over os.walk once, which will make it much quicker on paths containing lots of files and directories. Running it on my Python directory gives me:

30,183 files, 2,074 folders

which exactly matches what the Windows folder properties view tells me.


Note that your current code calculates the same number twice because the only change is renaming one of the returned values from the call to os.walk:

folder_counter = sum([len(folder) for r, d, folder in os.walk(path)])
#                         ^ here            ^ and here
file_counter = sum([len(files) for r, d, files in os.walk(path)])
#                       ^ vs. here       ^ and here

Despite that name change, you're counting the same value (i.e. in both it's the third of the three returned values that you're using)! Python functions do not know what names (if any at all; you could do print(list(os.walk(path))), for example) the values they return will be assigned to, and their behaviour certainly won't change because of it. Per the documentation, os.walk yields a three-tuple (dirpath, dirnames, filenames) for each directory it visits, and the names you choose for those values, whether:

for foo, bar, baz in os.walk(...):

or:

for all_three in os.walk(...):

won't change that.
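
To see this concretely, here is a small self-contained experiment. The temporary directory, the file names and the variable name misnamed are invented for the demonstration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as path:
    # Build a tiny tree: one sub-directory and three files
    os.mkdir(os.path.join(path, "sub"))
    for name in ("a.txt", "b.txt", "c.txt"):
        open(os.path.join(path, name), "w").close()

    # The third position is always filenames, whatever we call it
    misnamed = sum(len(folder) for r, d, folder in os.walk(path))
    file_counter = sum(len(files) for r, d, files in os.walk(path))
    # The second position is dirnames; this is what actually counts folders
    folder_counter = sum(len(d) for r, d, files in os.walk(path))

print(misnamed, file_counter, folder_counter)  # prints: 3 3 1
```

The first two numbers agree because both expressions read the third tuple element; only the last one reads the second element.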

Number of subdirectories in a directory?

The command to use is:
hdfs dfs -ls -R /path/to/mydir/ | grep "^d" | wc -l

On a large directory tree, however, this may fail with the error java.lang.OutOfMemoryError: Java heap space. To avoid it, increase the Java heap space available to the client and run the same command again:

export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Xmx5g"

and then

hdfs dfs -ls -R /path/to/mydir/ | grep "^d" | wc -l    # for all sub-directories

OR

hdfs dfs -ls /path/to/mydir/ | grep "^d" | wc -l    # for maxdepth=1

Count number of subdirectories ignoring parent directories

You could try something like this:

find /path/to/root_dir -maxdepth 2 -mindepth 2 -type d | wc -l

Here we explicitly limit the depth of find to exactly 2 (no more, no less) and list all directories. wc -l counts the number of lines in the output of the find command.

Note:

If your folder names contain newlines (or other unusual characters or encodings), counting lines with wc -l can yield incorrect results.

Taking inspiration from this question, you can instead print a character for each folder found, and count the resulting number of characters.

You could use the following snippet (-printf requires GNU find):

find /path/to/root_dir -maxdepth 2 -mindepth 2 -type d -printf '.' | wc -c
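
If Python is available, a sketch like the following sidesteps the output-parsing problem entirely, since it never turns names into text lines. The function name count_dirs_at_depth is hypothetical; like find -type d, it does not count symlinks to directories.

```python
import os

def count_dirs_at_depth(root, depth=2):
    """Count directories exactly `depth` levels below root."""
    if depth == 0:
        return 1  # root itself is the single directory at depth 0
    total = 0
    with os.scandir(root) as entries:
        for entry in entries:
            # skip files and symlinks; descend one level into real directories
            if entry.is_dir(follow_symlinks=False):
                total += count_dirs_at_depth(entry.path, depth - 1)
    return total

if __name__ == "__main__":
    print(count_dirs_at_depth(".", depth=2))
```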

Count the number of subdirectories or files in Python

Cue os.walk!

How about something like this:

import os

n_dirs = 0
n_files = 0
for root, dirs, files in os.walk(directory, topdown=False):
    n_files += len(files)
    n_dirs += len(dirs)

Maybe not the most elegant solution, but it should get the job done. :)

Count number of subdirectories inside each subdirectory of git repo using simple git commands or shell scripting

ls at best gives you the number of non-hidden entries directly inside the directory. If these include a plain file, an entry containing spaces, an entry whose name starts with a period, or a directory that itself has subdirectories, your count will be wrong.

I would instead do:

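shopt -s nullglob
for topdir in tier[1-3]
do
    if [[ -d $topdir ]]
    then
        # NUL-delimited read keeps names with spaces or newlines intact (needs bash 4.4+);
        # -mindepth 1 excludes $topdir itself from the count
        mapfile -d '' -t a < <(find "$topdir" -mindepth 1 -type d -print0)
        echo "Number of subdirectories below $topdir is ${#a[@]}"
    fi
done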

The purpose of setting nullglob is only to avoid an error message if no tier directory exists.

UPDATE: If you are not interested in subdirectories further down, you could replace the find with:

shopt -s dotglob
a=("$topdir"/*/)

The dotglob ensures that you are not missing hidden subdirectories, and the final slash in the glob-pattern ensures that the expansion happens only for entries which denote a directory.
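
For readers who prefer Python over shell globbing, counting only the immediate subdirectories could be sketched as follows. The name count_immediate_subdirs is an assumption of this example; os.scandir lists hidden entries too, and is_dir() follows symlinks by default, which is close to what the trailing-slash glob matches.

```python
import os

def count_immediate_subdirs(topdir):
    """Count directories directly inside topdir, hidden ones included."""
    with os.scandir(topdir) as entries:
        return sum(1 for entry in entries if entry.is_dir())

if __name__ == "__main__":
    # tier1..tier3 mirror the directory names used in the shell loop above
    for topdir in ("tier1", "tier2", "tier3"):
        if os.path.isdir(topdir):
            print("Number of subdirectories in %s is %d"
                  % (topdir, count_immediate_subdirs(topdir)))
```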


