How to count the number of files in each directory?
Assuming you have GNU find, let it find the directories and let bash do the rest:
shopt -s nullglob  # so empty directories report 0, not 1
find . -type d -print0 | while IFS= read -d '' -r dir; do
    files=("$dir"/*)
    printf "%5d files in directory %s\n" "${#files[@]}" "$dir"
done
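Note that the glob above counts every directory entry, including subdirectories. If you only want regular files, a variant of the same loop (still assuming GNU find, for -printf) might look like this:

```shell
# Count only regular files in each directory (subdirectories excluded);
# -printf '.' emits one character per file, so wc -c is newline-safe
find . -type d -print0 | while IFS= read -d '' -r dir; do
    n=$(find "$dir" -maxdepth 1 -type f -printf '.' | wc -c)
    printf '%5d files in directory %s\n' "$n" "$dir"
done
```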
Count number of subdirectories within subdirectories
You need to use some kind of recursion for that task. What about a sub-routine that loops through the sub-directories and calls itself for each one? What I mean is the following:
@echo off
rem // Define constants here:
set "_PATH=%~1" & rem // (path of the root directory to process)
rem // Define global variables here:
set /A "$DEPTH=0" & rem // (variable to determine the greatest depth)
rem // Initialise variables:
set /A "DEEP=0" & rem // (depth of the current directory branch)
rem // Call recursive sub-routine, avoid empty argument:
if defined _PATH (call :SUB "%_PATH%") else (call :SUB ".")
rem // Return found depth:
echo %$DEPTH%
exit /B

:SUB <root_path>
rem // Loop through all sub-directories of the given one:
for /D %%D in ("%~1\*") do (
    rem // For each sub-directory increment depth counter:
    set /A "DEEP+=1"
    rem // For each sub-directory recursively call the sub-routine:
    call :SUB "%%~fD"
)
rem // Check whether current branch has the deepest directory hierarchy:
if %$DEPTH% lss %DEEP% set /A "$DEPTH=DEEP"
rem // Decrement depth counter before returning from sub-routine:
set /A "DEEP-=1"
exit /B
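For comparison, the same recursive approach can be sketched as a bash function (an illustrative translation, not part of the batch script above):

```shell
#!/usr/bin/env bash
# Recursively determine the greatest directory depth below a root (sketch)
max_depth() {
    local dir=$1 deep=0 d sub
    for sub in "$dir"/*/; do
        [[ -d $sub ]] || continue          # skip if the glob matched nothing
        d=$(( $(max_depth "$sub") + 1 ))   # depth via this subdirectory
        (( d > deep )) && deep=$d
    done
    echo "$deep"
}
max_depth "${1:-.}"
```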
Just as an alternative idea, though with somewhat worse performance, you could also count the number of backslashes (\) in the resolved paths of all sub-directories, take the greatest count, and subtract the root directory's count from it, like this:
@echo off
rem // Define constants here:
set "_PATH=%~1" & rem // (path of the root directory to process)
rem // Define global variables here:
set /A "$DEPTH=0" & rem // (variable to determine the greatest depth)
rem // Change to root directory:
pushd "%_PATH%" || exit /B 1
rem // Resolve root directory:
call :SUB "."
rem // Store total depth of root directory:
set /A "CROOT=$DEPTH, $DEPTH=0"
rem // Process all sub-directories recursively:
for /D /R %%D in ("*") do (
    rem // Determine greatest depth relative to root:
    call :SUB "%%~fD" -%CROOT%
)
rem // Change back to original directory:
popd
rem // Return found depth:
echo %$DEPTH%
exit /B

:SUB <val_path> [<val_offset>]
rem // Resolve provided sub-directory:
set "ITEM=%~f1" & if not defined ITEM set "ITEM=."
rem // Initialise variables, apply count offset:
set "COUNT=%~2" & set /A "COUNT+=0"
rem // Count number of backslashes in provided path:
for %%C in ("%ITEM:\=" "%") do (
    set /A "COUNT+=1"
)
rem // Check whether current branch has the deepest directory hierarchy:
if %$DEPTH% lss %COUNT% set /A "$DEPTH=COUNT"
exit /B
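The same separator-counting idea translates to Unix shells by counting '/' characters instead; a sketch using find and awk (the root's own depth, read from find's first output line, serves as the offset):

```shell
# Greatest directory depth below a root, by counting '/' path separators.
# Assumes find prints the root first (pre-order) and no name contains a newline.
root="${1:-.}"
find "$root" -type d | awk -F/ '
    NR == 1         { base = NF }         # first line: the root itself
    NF - base > max { max = NF - base }   # deeper than anything seen so far
    END             { print max }
'
```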
Count the number of folders in a directory and subdirectories
I think you want something like:
import os

files = folders = 0
for _, dirnames, filenames in os.walk(path):
    # ^ this idiom means "we won't be using this value"
    files += len(filenames)
    folders += len(dirnames)
print("{:,} files, {:,} folders".format(files, folders))
Note that this only iterates over os.walk
once, which will make it much quicker on paths containing lots of files and directories. Running it on my Python directory gives me:
30,183 files, 2,074 folders
which exactly matches what the Windows folder properties view tells me.
Note that your current code calculates the same number twice, because the only change is renaming one of the returned values from the call to os.walk:
folder_counter = sum([len(folder) for r, d, folder in os.walk(path)])
#                         ^ here            ^ and here
file_counter = sum([len(files) for r, d, files in os.walk(path)])
#                       ^ vs. here       ^ and here
Despite that name change, you're counting the same value (in both cases it's the third of the three returned values that you're using)! Python functions do not know what names, if any at all, the values they return will be assigned to (you could do print(list(os.walk(path))), for example), and their behaviour certainly won't change because of it. Per the documentation, os.walk yields a three-tuple (dirpath, dirnames, filenames), and the names you use for that, e.g. whether:
for foo, bar, baz in os.walk(...):
or:
for all_three in os.walk(...):
won't change that.
Number of subdirectories in a directory?
The command to use is: hdfs dfs -ls -R /path/to/mydir/ | grep "^d" | wc -l
However, on a large directory tree this may fail with java.lang.OutOfMemoryError: Java heap space. To avoid that error, increase the Java client heap size and run the same command:
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Xmx5g"
and then:
hdfs dfs -ls -R /path/to/mydir/ | grep "^d" | wc -l    # for all sub-directories
or:
hdfs dfs -ls /path/to/mydir/ | grep "^d" | wc -l       # for maxdepth=1
Count number of subdirectories ignoring parent directories
You could try something like this :
find /path/to/root_dir -maxdepth 2 -mindepth 2 -type d | wc -l
Here we explicitly limit the depth of find to 2 (no more, no less) and list all directories; wc -l then counts the number of lines in find's output.
Note:
If your folder names contain newline characters, using wc -l will over-count, since each embedded newline produces an extra line of output.
Taking inspiration from this question, you can instead print a character for each folder found, and count the resulting number of characters.
You could use the following snippet :
find /path/to/root_dir -maxdepth 2 -mindepth 2 -type d -printf '.' | wc -c
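To see the difference, here is a contrived example with a directory whose name contains a newline (the "weird\nname" directory is hypothetical, created just for the demonstration):

```shell
tmp=$(mktemp -d)
mkdir "$tmp/a" "$tmp/b" "$tmp/$(printf 'weird\nname')"
# Three directories, but the embedded newline adds an extra line:
find "$tmp" -mindepth 1 -maxdepth 1 -type d | wc -l              # prints 4
# One dot per directory is immune to odd names:
find "$tmp" -mindepth 1 -maxdepth 1 -type d -printf '.' | wc -c  # prints 3
```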
Count the number of subdirectories or files in Python
Cue os.walk!
How about something like this:
import os

n_dirs = 0
n_files = 0
for root, dirs, files in os.walk(directory, topdown=False):
    n_files += len(files)
    n_dirs += len(dirs)
Maybe not the most elegant solution, but it should get the job done. :)
Count number of subdirectories inside each subdirectory of git repo using simple git commands or shell scripting
ls
at best gives you the number of non-hidden entries directly inside the directory. If among them you have a plain file, an entry whose name contains spaces, an entry whose name starts with a period, or a subdirectory that itself contains subdirectories, your count will be wrong.
I would instead do a
shopt -s nullglob
for topdir in tier[1-3]
do
    if [[ -d $topdir ]]
    then
        mapfile -t a < <(find "$topdir" -mindepth 1 -type d)
        echo "Number of subdirectories below $topdir is ${#a[@]}"
    fi
done
The purpose of setting nullglob
is only to avoid an error message if no tier
directory exists.
UPDATE: If you are not interested in subdirectories further down, you could, instead of the find, use
shopt -s dotglob
a=("$topdir"/*/)
echo "Number of subdirectories below $topdir is ${#a[@]}"
The dotglob ensures that you are not missing hidden subdirectories, and the final slash in the glob pattern ensures that the expansion matches only entries which denote a directory.