size vs ls -la vs du -h: which one is the correct size?

They are all correct; they just measure different things.

  • ls shows size of the file (when you open and read it, that's how many bytes you will get)
  • du shows actual disk usage, which can be smaller than the file size due to holes, or larger because space is allocated in whole blocks
  • size shows the size of the runtime image of an object/executable which is not directly related to the size of the file (bss uses no bytes in the file no matter how large, the file may contain debugging information that is not part of the runtime image, etc.)

If you want to know how much RAM/ROM an executable will take excluding dynamic memory allocation, size gives you the information you need.
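As a rough illustration, the three measurements can be taken side by side. This sketch assumes a POSIX shell and inspects whatever binary command -v ls resolves to (an arbitrary choice for the demo). size comes from binutils and may not be installed, so that step is guarded.

```shell
# Pick any executable to inspect; ls itself is an arbitrary choice.
bin=$(command -v ls)

# Logical size: the number of bytes you get by reading the file.
bytes=$(wc -c < "$bin" | tr -d ' ')
echo "file size:  $bytes bytes"

# Disk usage: space allocated on the filesystem, in 1024-byte units.
# -L follows a symlink in case the command name is only a link.
kib=$(du -kL "$bin" | awk '{print $1}')
echo "disk usage: $kib KiB"

# Runtime image: text/data/bss totals, only if size is installed
# and recognizes the file format.
if command -v size >/dev/null 2>&1; then
    size "$bin" || true
fi
```

The exact numbers depend on the system, so no sample output is shown.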

Finding the size of a file using ls and du: what is the difference?

du shows how much disk space the file uses; ls shows how big the file is. These two values can differ in either direction.

Most files do not completely fill the blocks of the filesystem, so they take up more space than their size. A file with a single byte still takes up at least one full block (typically 512 or 1024 bytes; many current filesystems use 4096).

Files with holes can take up less space than their size. As an example, consider a file with a single byte at position 183738475 (randomly typed numbers). That file can be stored on disk using a single block: whenever the kernel queries the filesystem for bytes other than the single byte in the file, the filesystem reports them as being zero, and there is no need to store anything. (Not all filesystems work this way.) The size of the file is still 183738475 bytes, so ls will report that, while du will report how many blocks are used by the filesystem. du -h will report the number of blocks used times the block size, converted to a human-readable format.

Keep in mind that the actual numbers will vary depending on your filesystem. For example:

$ echo > foo; ls -l foo |awk '{print $5}'; du foo; du -h foo
1
8 foo
4.0K foo

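This file is one byte in size (the newline written by echo) but consumes 8 blocks on disk. du counts in 512-byte units here, so those 8 blocks consume 4K. (GNU du defaults to 1024-byte units and would print 4 instead of 8.) My filesystem has been optimized for large files, and small files waste a lot of space.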
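The opposite case, a file with a hole, can be reproduced with a short sketch. It assumes a POSIX shell with mktemp available and a filesystem that supports sparse files. The temporary path and variable names are made up for the demo.

```shell
# Work in a throwaway directory (mktemp is assumed to be available).
dir=$(mktemp -d)
f="$dir/sparse"

# Write one byte so that it becomes byte number 183738475 of the file;
# dd seeks past the first 183738474 bytes without writing them.
printf 'x' | dd of="$f" bs=1 seek=183738474 2>/dev/null

# Logical size, as shown in the size column of ls -l.
size_bytes=$(ls -l "$f" | awk '{print $5}')

# Allocated space in 1024-byte units, as shown by du -k.
used_kib=$(du -k "$f" | awk '{print $1}')

echo "ls: $size_bytes bytes, du: $used_kib KiB"
rm -r "$dir"
```

ls reports 183738475 bytes. The du figure depends on the filesystem: with sparse-file support it should be only a few KiB, otherwise it will be close to the full size.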

Using ls to list directories and their total sizes

Try something like:

du -sh *

This is the short version of:

du --summarize --human-readable *

Explanation:

du: Disk Usage

-s: Display a summary for each specified file. (Equivalent to -d 0)

-h: "Human-readable" output. Use unit suffixes: Byte, Kibibyte (KiB), Mebibyte (MiB), Gibibyte (GiB), Tebibyte (TiB) and Pebibyte (PiB), i.e. powers of 1024.
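One thing to watch for with the * glob: it does not match names that start with a dot, so hidden directories are left out of the listing. A minimal sketch of the difference, using a throwaway directory with made-up names (visible, .hidden) and assuming mktemp is available:

```shell
dir=$(mktemp -d)
mkdir "$dir/visible" "$dir/.hidden"
echo data > "$dir/visible/a"
echo data > "$dir/.hidden/b"

cd "$dir" || exit 1

# * does not match names starting with a dot, so .hidden is skipped.
plain=$(du -sk -- * | wc -l | tr -d ' ')

# Adding .[!.]* picks up dot-entries (but not . and ..).
with_dot=$(du -sk -- * .[!.]* | wc -l | tr -d ' ')

echo "entries without dotfiles: $plain, with dotfiles: $with_dot"
cd - >/dev/null && rm -r "$dir"
```

The same pattern works with -sh in place of -sk. If the directory has no dot-entries, the unmatched pattern is passed to du literally and du reports an error for it.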

How can I sort du -h output by size?

As of GNU coreutils 7.5, released in August 2009, sort accepts a -h option, which compares numbers with unit suffixes of the kind produced by du -h:

du -hs * | sort -h

If you are using a sort that does not support -h, you can install GNU Coreutils. E.g. on an older Mac OS X:

brew install coreutils
du -hs * | gsort -h

From the sort manual:

-h, --human-numeric-sort compare human readable numbers (e.g., 2K 1G)
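If installing GNU coreutils is not an option, one possible fallback is to have du print every size in the same unit with -k and sort numerically. The sketch below uses made-up sample lines in the shape of du -k output, so it runs the same way everywhere.

```shell
# Sample lines in the shape of "du -k" output: size in KiB, a tab, a name.
# The names and sizes are made up for the demonstration.
sample=$(printf '1048576\tvideos\n4\tnotes\n2048\tphotos\n')

# Plain numeric sort works because every size uses the same unit.
sorted=$(printf '%s\n' "$sample" | sort -n)
printf '%s\n' "$sorted"

# On a real directory the same idea would be: du -sk -- * | sort -n
smallest=$(printf '%s\n' "$sorted" | head -n 1 | cut -f 2)
largest=$(printf '%s\n' "$sorted" | tail -n 1 | cut -f 2)
echo "smallest: $smallest, largest: $largest"
```

The trade-off is that the sizes are shown in KiB rather than with human-readable suffixes.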

Why doesn't total from ls -l add up to total file sizes listed?

You can find the definition of that line in the ls documentation for your platform. For coreutils ls (the one found on a lot of Linux systems), the information can be found via info coreutils ls:

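For each directory that is listed, preface the files with a line
`total BLOCKS', where BLOCKS is the total disk allocation for all
files in that directory.

In other words, the total line counts allocated disk blocks, not the sum of the byte sizes shown in the size column, so the two are not expected to match.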
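The difference can be seen by summing the size column and comparing it with the total line. This is a sketch with two made-up files in a throwaway directory; it assumes mktemp is available and that ls -l prints a total line for the directory.

```shell
dir=$(mktemp -d)
printf 'a' > "$dir/one"            # 1 byte
printf 'hello world' > "$dir/two"  # 11 bytes

# Force the C locale so the first line reads "total N" in English.
listing=$(LC_ALL=C ls -l "$dir")
printf '%s\n' "$listing"

# Sum of the byte sizes in column 5, skipping the "total" line.
sum_bytes=$(printf '%s\n' "$listing" | awk '$1 != "total" { s += $5 } END { print s }')

# The block count from the "total" line; its unit depends on the ls in use.
total_blocks=$(printf '%s\n' "$listing" | awk '$1 == "total" { print $2 }')

echo "sum of sizes: $sum_bytes bytes, total line: $total_blocks blocks"
rm -r "$dir"
```

The sizes add up to 12 bytes, while the total line reports however many blocks the filesystem allocated for the two files.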

Investigating the size of an extremely small C program

du reports the disk space used by a file whereas ls reports the actual size of a file. Typically the size reported by du is significantly larger for small files.

You can significantly reduce the size of the binary by changing compile and linking options and stripping out unnecessary sections.

$ cat test.c
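void _start() {
    /* Linux exit(0) through the legacy int 0x80 interface:
       eax = 1 selects sys_exit, ebx = 0 is the exit status. */
    asm("movl $1,%eax;"
        "xorl %ebx,%ebx;"
        "int $0x80");
}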

$ gcc -s -nostdlib test.c -o test
$ ./test
$ ls -l test
-rwxrwxr-x 1 fpm fpm 8840 Dec 9 04:09 test

$ readelf -W --section-headers test
There are 7 section headers, starting at offset 0x20c8:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .note.gnu.build-id NOTE            0000000000400190 000190 000024 00   A  0   0  4
  [ 2] .text             PROGBITS        0000000000401000 001000 000010 00  AX  0   0  1
  [ 3] .eh_frame_hdr     PROGBITS        0000000000402000 002000 000014 00   A  0   0  4
  [ 4] .eh_frame         PROGBITS        0000000000402018 002018 000038 00   A  0   0  8
  [ 5] .comment          PROGBITS        0000000000000000 002050 00002e 01  MS  0   0  1
  [ 6] .shstrtab         STRTAB          0000000000000000 00207e 000045 00      0   0  1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
$

$ gcc -s -nostdlib -Wl,--nmagic test.c -o test
$ ls -l test
-rwxrwxr-x 1 fpm fpm 984 Dec 9 16:55 test
$ strip -R .comment -R .note.gnu.build-id test
$ strip -R .eh_frame_hdr -R .eh_frame test
$ ls -l test
-rwxrwxr-x 1 fpm fpm 520 Dec 9 17:03 test
$

Note that clang can produce a significantly smaller binary than gcc by default in this particular instance. However, after compiling with clang and stripping unnecessary sections, the final size of the binary is 736 bytes, which is bigger than the 520 bytes possible with gcc -s -nostdlib -Wl,--nmagic test.c -o test.

$ clang -static -nostdlib -flto -fuse-ld=lld -o test test.c
$ ls -l test
-rwxrwxr-x 1 fpm fpm 1344 Dec 9 04:15 test
$

$ readelf -W --section-headers test
There are 9 section headers, starting at offset 0x300:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .note.gnu.build-id NOTE            0000000000200190 000190 000018 00   A  0   0  4
  [ 2] .eh_frame_hdr     PROGBITS        00000000002001a8 0001a8 000014 00   A  0   0  4
  [ 3] .eh_frame         PROGBITS        00000000002001c0 0001c0 00003c 00   A  0   0  8
  [ 4] .text             PROGBITS        0000000000201200 000200 00000f 00  AX  0   0 16
  [ 5] .comment          PROGBITS        0000000000000000 00020f 000040 01  MS  0   0  1
  [ 6] .symtab           SYMTAB          0000000000000000 000250 000048 18      8   2  8
  [ 7] .shstrtab         STRTAB          0000000000000000 000298 000055 00      0   0  1
  [ 8] .strtab           STRTAB          0000000000000000 0002ed 000012 00      0   0  1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
$

$ strip -R .eh_frame_hdr -R .eh_frame test
$ strip -R .comment -R .note.gnu.build-id test
strip: test: warning: empty loadable segment detected at vaddr=0x200000, is this intentional?
$ ls -l test
-rwxrwxr-x 1 fpm fpm 736 Dec 9 04:19 test
$ readelf -W --section-headers test
There are 3 section headers, starting at offset 0x220:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        0000000000201200 000200 00000f 00  AX  0   0 16
  [ 2] .shstrtab         STRTAB          0000000000000000 00020f 000011 00      0   0  1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
$

.text is your code; .shstrtab is the section header string table. The ELF header contains an e_shstrndx member, which is the index of the .shstrtab section in the section header table. Each section header in turn has an sh_name field, which is a byte offset into .shstrtab; the section's name is the string stored at that offset.

Rsync - destination is smaller, even though files are the same

The sizes that du and ls report are different: du reports the amount of space actually allocated on the filesystem, while ls reports the logical file size.

There are several questions on various StackExchange sites about this.

Why does du report different sizes on your two machines? Because they are either using different filesystems or they are configured differently. It all boils down to how each filesystem allocates blocks, since allocated blocks are what du reports.
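To check that two copies really hold the same data, compare their contents rather than their du figures. The sketch below builds two byte-identical files that may occupy different amounts of disk. The file names dense and sparse are made up, and it assumes mktemp is available and that the filesystem supports sparse files.

```shell
dir=$(mktemp -d)

# dense: 1 MiB of zero bytes that are actually written out.
dd if=/dev/zero of="$dir/dense" bs=1024 count=1024 2>/dev/null

# sparse: the same 1 MiB of zeros, but only the last byte is written,
# so a filesystem with sparse-file support can leave a hole before it.
printf '\0' | dd of="$dir/sparse" bs=1 seek=1048575 2>/dev/null

# Content comparison: cmp -s exits 0 when the files are byte-identical.
if cmp -s "$dir/dense" "$dir/sparse"; then same=yes; else same=no; fi

# Allocated space may still differ between the two copies.
dense_kib=$(du -k "$dir/dense" | awk '{print $1}')
sparse_kib=$(du -k "$dir/sparse" | awk '{print $1}')

echo "identical: $same, dense: $dense_kib KiB, sparse: $sparse_kib KiB"
rm -r "$dir"
```

cmp reports the files as identical even when du shows different usage for them, which is the situation described for the rsync source and destination.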


