List Files Over a Specific Size in Current Directory and All Subdirectories


find . -size +10k -exec ls -lh {} \+

The first part of this is identical to @sputnick's answer, and successfully finds all files in the directory over 10k (the suffix is a lowercase k, not K). My addition, the second part, then executes ls -lh, which lists the files in long format (-l) with human-readable sizes (-h). Omit the -h if you prefer. Of course the {} is the file itself, and the \+ is simply an alternative to \;

In practice, \; would run ls once per file, like:

ls -l found.file; ls -l found.file.2; ls -l found.file.3

whereas \+ passes all the files to a single ls invocation, like:

ls -l found.file found.file.2 found.file.3

more on \; vs + with find

Additionally, you may want the listing ordered by size, which is relatively easy to accomplish. I would add the -s option to ls, so ls -ls, and then pipe it to sort -n to sort numerically by the block count that -s prints in the first column,

which would become:

find . -size +10k -exec ls -ls {} \+ | sort -n

or, for reverse order (largest first), add an -r to sort:

find . -size +10k -exec ls -ls {} \+ | sort -nr

Finally, your title says find the biggest file in the directory. You can do that by then piping the output to tail:

find . -size +10k -exec ls -ls {} \+ | sort -n | tail -1
This would find you the largest file in the directory and its subdirectories.

Note you could also sort files by size by using ls -S, and avoid the need for sort. But since -S puts the largest file first, you would need to use head instead of tail, so:

find . -size +10k -exec ls -lS {} \+ | head -1

The benefit of doing it with -S and not sort is, one, you don't have to type sort -n and, two, you can also use -h, the human-readable size option, which is one of my favorites to use but is not available in older versions of ls. For example, we have an old CentOS 4 server at work that doesn't have -h.
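
For comparison, here is a rough Python sketch of the same idea (find files over a size threshold, then sort largest first). It is not part of the original answer; the function names files_over and largest_first are made up for illustration, and the 10 KiB default simply mirrors -size +10k above.

```python
import os
import stat


def files_over(root, min_bytes):
    """Yield (size, path) for regular files under root larger than min_bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size > min_bytes:
                yield st.st_size, path


def largest_first(root, min_bytes=10 * 1024):
    """Return matching (size, path) pairs sorted from largest to smallest."""
    return sorted(files_over(root, min_bytes), reverse=True)
```

Calling largest_first(".") gives the whole listing, and taking the first element of that list gives the single biggest file, much like the sort -n | tail -1 pipeline.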

List all folders and subfolders in a given structure with filesize

A PowerShell solution that builds on montonero's helpful answer and improves the following aspects:

  • control over the recursion depth
  • improved performance
  • better integration with other cmdlets for composable functionality

Sample calls, based on function Get-DirectorySize defined below:

# Get the size of the current directory (only).
Get-DirectorySize

# As requested by the OP:
# Recursively report the sizes of all subdirectories in the current directory.
Get-DirectorySize -Recurse -ExcludeSelf

# Get the size of all child directories and sort them by size, from largest
# to smallest, showing only the 5 largest ones:
Get-DirectorySize -Depth 1 -ExcludeSelf |
Sort-Object Size -Descending |
Select-Object -First 5

Sample output from the last command:

FullName                 FriendlySize   Size
--------                 ------------   ----
C:\Users\jdoe\AppData    3.27gb         3514782772
C:\Users\jdoe\Desktop    801.40mb       840326199
C:\Users\jdoe\.nuget     778.97mb       816814396
C:\Users\jdoe\.vscode    449.12mb       470931418
C:\Users\jdoe\Projects   104.07mb       109127742

Note that property .FriendlySize contains a friendly, auto-scaled string representation of the size, whereas .Size is a number ([long]) containing the actual byte count, which is what facilitates further programmatic processing.
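
To illustrate the auto-scaling idea outside PowerShell, here is a small Python sketch. It is only an approximation of what Get-DirectorySize does for .FriendlySize: the name friendly_size is hypothetical, and the left-padding that the PowerShell function applies is left out.

```python
def friendly_size(num_bytes, decimal_places=2):
    """Scale a byte count to the largest binary unit that keeps the value >= 1."""
    units = ["b", "kb", "mb", "gb", "tb"]
    size = float(num_bytes)
    index = 0
    # Divide by 1024 until the value fits the unit or we run out of units.
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    if index == 0:
        return f"{int(size)}b"  # plain bytes: no decimal places
    return f"{size:,.{decimal_places}f}{units[index]}"
```

With the byte counts from the sample output, friendly_size(3514782772) gives "3.27gb" and friendly_size(840326199) gives "801.40mb".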

Note: Adding properties to the output objects purely to facilitate friendly display is done here for implementation convenience only. The proper PowerShell way would be to instead define formatting instructions based on the output object type - see the docs.

Caveats (apply to the linked answer too):

  • Only logical sizes are reported, i.e., the actual bytes needed by the file data, which differs from the size on disk, which is typically larger, due to files occupying fixed-size blocks; conversely, compressed and sparse files occupy less disk space.

  • The implementation of the recursion (with -Recurse and / or -Depth) is inefficient, because the subtree of each directory encountered is scanned in full; this is helped somewhat by the filesystem cache.
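
As an aside on the second caveat, the repeated subtree scans can be avoided by walking bottom-up and reusing the totals already computed for child directories. The following Python sketch shows that idea only; it is not a port of Get-DirectorySize, and the name directory_sizes is made up for illustration.

```python
import os


def directory_sizes(root):
    """Return {directory path: logical size in bytes, subdirectories included}.

    Each directory is visited exactly once: os.walk(topdown=False) yields
    children before their parents, so a parent can add up its children's totals.
    """
    sizes = {}
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        total = 0
        for name in filenames:
            try:
                # lstat: a symlink counts as the link itself, not its target
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # unreadable or vanished file; skip it
        for name in dirnames:
            # Unreadable or symlinked directories were never walked: count as 0.
            total += sizes.get(os.path.join(dirpath, name), 0)
        sizes[dirpath] = total
    return sizes
```

Sorting directory_sizes(".").items() by value then gives a largest-to-smallest listing comparable to the Sort-Object call above.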



Get-DirectorySize source code

Note: Requires Windows PowerShell v3+; also compatible with PowerShell Core.

function Get-DirectorySize
{

  param(
    [Parameter(ValueFromPipeline)] [Alias('PSPath')]
    [string] $LiteralPath = '.',
    [switch] $Recurse,
    [switch] $ExcludeSelf,
    [int] $Depth = -1,
    [int] $__ThisDepth = 0 # internal use only
  )

  process {

    # Resolve to a full filesystem path, if necessary
    $fullName = if ($__ThisDepth) { $LiteralPath } else { Convert-Path -ErrorAction Stop -LiteralPath $LiteralPath }

    if ($ExcludeSelf) { # Exclude the input dir. itself; implies -Recurse

      $Recurse = $True
      $ExcludeSelf = $False

    } else { # Process this dir.

      # Calculate this dir's total logical size.
      # Note: [System.IO.DirectoryInfo].EnumerateFiles() would be faster,
      # but cannot handle inaccessible directories.
      $size = [Linq.Enumerable]::Sum(
        [long[]] (Get-ChildItem -Force -Recurse -File -LiteralPath $fullName).ForEach('Length')
      )

      # Create a friendly representation of the size.
      $decimalPlaces = 2
      $padWidth = 8
      $scaledSize = switch ([double] $size) {
        {$_ -ge 1tb } { $_ / 1tb; $suffix='tb'; break }
        {$_ -ge 1gb } { $_ / 1gb; $suffix='gb'; break }
        {$_ -ge 1mb } { $_ / 1mb; $suffix='mb'; break }
        {$_ -ge 1kb } { $_ / 1kb; $suffix='kb'; break }
        default { $_; $suffix='b'; $decimalPlaces = 0; break }
      }

      # Construct and output an object representing the dir. at hand.
      [pscustomobject] @{
        FullName = $fullName
        FriendlySize = ("{0:N${decimalPlaces}}${suffix}" -f $scaledSize).PadLeft($padWidth, ' ')
        Size = $size
      }

    }

    # Recurse, if requested.
    if ($Recurse -or $Depth -ge 1) {
      if ($Depth -lt 0 -or (++$__ThisDepth) -le $Depth) {
        # Note: This top-down recursion is inefficient, because any given directory's
        # subtree is processed in full.
        Get-ChildItem -Force -Directory -LiteralPath $fullName |
          ForEach-Object { Get-DirectorySize -LiteralPath $_.FullName -Recurse -Depth $Depth -__ThisDepth $__ThisDepth }
      }
    }

  }

}

Here's the comment-based help for the function; if you add the function to, say, your $PROFILE, place the help directly above the function or just inside the function body in order to get support for -? and automatic integration with Get-Help.

<#
.SYNOPSIS
Gets the logical size of directories in bytes.

.DESCRIPTION
Given a literal directory path, output that directory's logical size, i.e.,
the sum of all files contained in the directory, including hidden ones.

NOTE:
* The logical size is distinct from the size on disk, given that files
are stored in fixed-size blocks. Furthermore, files can be compressed
or sparse.
Thus, the size of regular files on disk is typically greater than
their logical size; conversely, compressed and sparse files require less
disk space.
Finally, the list of child items maintained by the filesystem for each
directory requires disk space too.

* Wildcard expressions aren't directly supported, but you may pipe in
output from Get-ChildItem / Get-Item; if files rather than directories
happen to be among the input objects, their size is reported as-is.

CAVEATS:
* Can take a long time to run with large directory trees, especially with
-Recurse.
* Recursion is implemented inefficiently.

.PARAMETER LiteralPath
The literal path of a directory. May be provided via the pipeline.

.PARAMETER Recurse
Calculates the logical size not only of the input directory itself, but of
all subdirectories in its subtree too.
To limit the recursion depth, use -Depth.

.PARAMETER Depth
Limits the recursion depth to the specified number of levels. Implies -Recurse.
Note that 0 means no recursion. Use just -Recurse in order not to limit the
recursion.

.PARAMETER ExcludeSelf
Excludes the target directory itself from the size calculation.
Implies -Recurse. Since -Depth implies -Recurse, you could use -ExcludeSelf
-Depth 1 to report only the sizes of the immediate subdirectories.

.OUTPUTS
[pscustomobject] instances with properties FullName, Size, and FriendlySize.

.EXAMPLE
Get-DirectorySize

Gets the logical size of the current directory.

.EXAMPLE
Get-DirectorySize -Recurse

Gets the logical size of the current directory and all its subdirectories.

.EXAMPLE
Get-DirectorySize /path/to -ExcludeSelf -Depth 1 | Sort-Object Size

Gets the logical size of all child directories in /path/to without including
/path/to itself, and sorts the result by size (largest last).
#>

Using ls to list directories and their total sizes

Try something like:

du -sh *

short version of:

du --summarize --human-readable *

Explanation:

du: Disk Usage

-s: Display a summary for each specified file. (Equivalent to -d 0)

-h: "Human-readable" output. Use unit suffixes: Byte, Kibibyte (KiB), Mebibyte (MiB), Gibibyte (GiB), Tebibyte (TiB) and Pebibyte (PiB). (BASE2)

Python list directory, subdirectory, and files

Use os.path.join to concatenate the directory and file name:

import os

root = "."  # assumption: start from the current directory; use any directory

for path, subdirs, files in os.walk(root):
    for name in files:
        print(os.path.join(path, name))

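Note the usage of path and not root in the concatenation: path is the directory currently being walked, whereas root is only the starting directory, so using root would produce wrong paths for files in subdirectories.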


In Python 3.4, the pathlib module was added for easier path manipulations. So the equivalent to os.path.join would be:

pathlib.PurePath(path, name)

The advantage of pathlib is that you can use a variety of useful methods on paths. If you use the concrete Path variant you can also do actual OS calls through them, like checking whether the path exists, deleting the path, opening the file it points to and much more.
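
As a small illustration of the concrete Path variant, the walk above could be sketched like this. The function name list_files is made up for the example; Path.rglob and Path.is_file are regular pathlib methods.

```python
from pathlib import Path


def list_files(root):
    """Return every file under root (recursively) as a sorted list of Path objects."""
    # rglob("*") yields files and directories alike, so filter with is_file().
    return sorted(p for p in Path(root).rglob("*") if p.is_file())
```

Each result is a Path, so attributes such as p.name and p.suffix and calls such as p.stat().st_size are available without further string handling.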

How do I get the size of sub directory from a directory in python?

To print the size of each immediate subdirectory and the total size for the parent directory similar to du -bcs */ command:

#!/usr/bin/env python3.6
"""Usage: du-bcs <parent-dir>"""
import os
import sys

if len(sys.argv) != 2:
    sys.exit(__doc__)  # print usage

parent_dir = sys.argv[1]
total = 0
for entry in os.scandir(parent_dir):
    if entry.is_dir(follow_symlinks=False):  # directory
        size = get_tree_size_scandir(entry)
        # print the size of each immediate subdirectory
        print(size, entry.name, sep='\t')
    elif entry.is_file(follow_symlinks=False):  # regular file
        size = entry.stat(follow_symlinks=False).st_size
    else:
        continue
    total += size
print(total, parent_dir, sep='\t')  # print the total size for the parent dir

where get_tree_size_scandir() is defined in a linked answer (text in Russian; code in Python, C, C++, bash).

The size of a directory here is the apparent size of all regular files in it and its subdirectories recursively. It doesn't count the size for the directory entries themselves or the actual disk usage for the files. Related: why is the output of du often so different from du -b.
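
Since get_tree_size_scandir() itself is not shown here, the following is a minimal stand-in that matches the description above (apparent size of regular files, recursively, without following symlinks). It is an assumption about what such a helper could look like, not the code from the linked answer.

```python
import os


def get_tree_size_scandir(path):
    """Return the total apparent size in bytes of regular files under path.

    path may be a string or an os.DirEntry (both are accepted by os.scandir).
    Symlinks are neither followed nor counted.
    """
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                total += get_tree_size_scandir(entry.path)  # recurse into subdirectory
            elif entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total
```

This sketch does no error handling, so an unreadable subdirectory raises PermissionError rather than being skipped.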

List files with path and file size only in Command Line

Get-ChildItem -Recurse | select FullName,Length | Format-Table -HideTableHeaders | Out-File filelist.txt

How do I list all the files in a directory and subdirectories in reverse chronological order?

Try this one:

find . -type f -printf "%T@ %p\n" | sort -nr | cut -d\  -f2-
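
If GNU find's -printf is not available, roughly the same listing can be produced in Python. This is only a sketch of the same approach (collect modification times, sort newest first, drop the timestamps); the function name files_newest_first is made up for illustration.

```python
import os
import stat


def files_newest_first(root):
    """Return paths of regular files under root, most recently modified first."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # like find -type f: symlinks are not followed
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                entries.append((st.st_mtime, path))
    entries.sort(reverse=True)  # largest timestamp (newest) first
    return [path for _mtime, path in entries]
```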

How to list the size of each file and directory and sort by descending size in Bash?

Simply navigate to the directory and run the following command:

du -a --max-depth=1 | sort -n

Or add -h for human-readable sizes and -r to print the biggest directories/files first:

du -a -h --max-depth=1 | sort -hr

