When running ls -l, why does the file size of a directory not match the output of du?

size vs ls -la vs du -h: which one shows the correct size?

They are all correct; they just measure different things.

  • ls shows the size of the file: when you open and read it, that is how many bytes you will get
  • du shows actual disk usage, which can be smaller than the file size due to holes (sparse files)
  • size shows the size of the runtime image of an object file or executable, which is not directly related to the size of the file on disk (.bss occupies no bytes in the file no matter how large it is, the file may contain debugging information that is not part of the runtime image, and so on)

If you want to know how much RAM/ROM an executable will take excluding dynamic memory allocation, size gives you the information you need.
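A sparse file makes the ls/du distinction concrete. A throwaway demonstration (truncate is from GNU coreutils; the file name is made up):

```shell
# Create a 100 MiB sparse file: the size is recorded in the inode,
# but no data blocks are allocated for the hole.
truncate -s 100M sparse.bin

ls -l sparse.bin                  # reports 104857600 bytes (the apparent size)
du -h sparse.bin                  # reports (close to) 0: only allocated blocks count
du -h --apparent-size sparse.bin  # matches ls again

rm sparse.bin
```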

Using ls to list directories and their total sizes

Try something like:

du -sh *

short version of:

du --summarize --human-readable *

Explanation:

du: Disk Usage

-s: Display a summary for each specified file. (Equivalent to -d 0)

-h: "Human-readable" output. Uses unit suffixes: byte, kibibyte (KiB), mebibyte (MiB), gibibyte (GiB), tebibyte (TiB) and pebibyte (PiB) (base 2).
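A quick check that -s really is shorthand for -d 0, using a throwaway directory (the names here are made up):

```shell
mkdir -p proj/sub
head -c 4096 /dev/zero > proj/file.bin

du -sh proj       # one summary line for proj
du -h -d 0 proj   # identical output: -s is equivalent to -d 0

rm -r proj
```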

Using find and du to display folder size

I have a proposal in two steps.

Step 1: get a find command that works for you and identify the columns you are going to need. Try:

cd /srv/nexus-data/storage
find . -name '*.nupkg' -mtime +175 -type f -exec ls -l {} \;

I prefer to cd into the directory you want first, so the output paths stay short and you avoid all the subdirectories that you are not interested in.

Step 2: given that output, locate the columns that contain the size of the file (in my case, #5) and the name of the file (in my case, #9). Then pipe the find through the following:

find . -name '*.nupkg' -mtime +175 -type f -exec ls -l {} \; | awk '
{
    split($9, a, "/");
    sum[a[2]] += $5;
}

END {
    for (i in sum) {
        print i, sum[i];
    }
}
'

If necessary, change $5 and $9 to the columns holding the size and the file name, respectively (awk counts whitespace-separated columns, $1 being the first and so on). awk processes every line, splitting the name on the / character so that array a holds the path components and a[2] is the first-level subdirectory you want summed. The END block prints the accumulated sums along with the subdirectories they belong to.

The awk script may be turned into a one-liner, but I leave it in its expanded form for clarity.
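The split/sum mechanic can be checked in isolation by feeding awk a few fake ls -l style lines (the package names below are made up):

```shell
# $5 is the size column, $9 the path; a[2] is the first-level subdirectory
printf '%s\n' \
  '-rw-r--r-- 1 u g 100 Jan 1 00:00 ./libA/a.nupkg' \
  '-rw-r--r-- 1 u g 250 Jan 1 00:00 ./libA/b.nupkg' \
  '-rw-r--r-- 1 u g  40 Jan 1 00:00 ./libB/c.nupkg' |
awk '{ split($9, a, "/"); sum[a[2]] += $5 } END { for (i in sum) print i, sum[i] }'
# prints "libA 350" and "libB 40" (the for-in loop order is unspecified)
```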

EDIT: Using @CharlesDuffy's excellent suggestion, and assuming you have GNU find (and no filenames with newlines in them), you could use:

find . -name '*.nupkg' -mtime +175 -type f -printf '%s\t%h\n' | awk '
{
    split($2, a, "/");
    sum[a[2]] += $1;
}

END {
    for (i in sum) {
        print i, sum[i];
    }
}
'

Finally, just for the sake of completeness, this variant copes with newlines in file names too:

find . -name '*.nupkg' -mtime +175 -type f -printf '%s\t%h\0' | awk '
BEGIN {
    RS = "\0";
    FS = "\t";   # split on the tab only, so spaces and newlines in names survive
}

{
    split($2, a, "/");
    sum[a[2]] += $1;
}

END {
    for (i in sum) {
        print i, sum[i];
    }
}
'

Linux command to get the size of files and directories in a particular folder?

Use the ls command for files and the du command for directories.

Checking File Sizes

ls -l filename   # size of the specified file
ls -l *          # sizes of all files in the current directory
ls -al           # sizes of all files, including hidden ones, in the current directory
ls -al dir/      # sizes of all files, including hidden ones, in the 'dir' directory

ls does not report the total size of a directory's contents; for a directory it only shows the size of the directory entry itself (the structure holding the list of file names). Therefore, we use du for this purpose.
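You can see the difference with a throwaway directory (names here are made up):

```shell
mkdir bigdir
head -c 1048576 /dev/zero > bigdir/payload.bin   # put 1 MiB inside

ls -ld bigdir    # size column: the directory entry (often 4096), not the contents
du -sh bigdir    # ~1.0M: the contents' actual disk usage

rm -r bigdir
```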

Checking Directory sizes

du -sh directory_name   # summarized (-s) size of the directory in human-readable (-h) format
du -bsh *               # apparent (-b) summarized (-s) sizes of everything in the current directory, human-readable (-h)

Including the -h option in any of the above commands (e.g. ls -lh * or du -sh) will give you sizes in human-readable format (K, M, G, ...).

For more information, see man ls and man du.

Hadoop HDFS directory size shown as 0

It is working as designed. hadoop fs -ls reports directories with size 0 because computing a directory's total size means recursively summing everything under it, and Hadoop is built for very large files. Think of it from the point of view of someone who only wants to check whether a directory exists: it would not be good to wait a long time just because Hadoop is adding up the size of the folder. When you actually want the total, ask for it explicitly with hadoop fs -du -s -h <path>.

git - show filenames and filesize of changed files?

Run this:

git diff --name-only

You should get:

path/to/changed/file.1
path/to/changed/file.2
path/to/changed/file.3

Since this command shows changed files relative to the project's root directory, you must change directory to the root of the project (e.g. where the .gitignore file lives).

You can also get the size of each changed file by running:

git diff --name-only  | xargs du -hs

A sample output is:

 5.0K    path/to/changed/file.1
15.0K    path/to/changed/file.2
14.0K    path/to/changed/file.3

If you want the name first and the size second, run:

git diff --name-only  | xargs du -hs | awk '{ for (i=NF; i>1; i--) printf("%s ",$i); print $1; }'

And the output:

path/to/changed/file.1  5.0K
path/to/changed/file.2 15.0K
path/to/changed/file.3 14.0K
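One caveat: if any changed path contains spaces, the plain xargs pipeline splits it apart. A NUL-delimited sketch using git's -z flag together with xargs -0 avoids that:

```shell
# NUL-delimited file names survive spaces (and even newlines) intact
git diff --name-only -z | xargs -0 du -hs
```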

How to list the size of each file and directory and sort by descending size in Bash?

Simply navigate to the directory and run the following command:

du -a --max-depth=1 | sort -n

Or add -h to du for human-readable sizes and -r to print bigger directories/files first (sort then needs its own -h so it understands the K/M/G suffixes):

du -a -h --max-depth=1 | sort -hr
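To see why sort needs -h once du prints suffixed sizes, compare on a small hypothetical list:

```shell
# sort -h understands the K/M/G suffixes that du -h emits;
# a plain numeric sort would order these lines by digits alone
printf '%s\n' '1.5K a' '900 b' '2.0M c' '512K d' | sort -hr
# prints: 2.0M c, 512K d, 1.5K a, 900 b (one per line)
```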

