File Size in Human Readable Format

File size in human readable format

GNU Coreutils includes a little-known tool for numeric conversion called numfmt, which does what you need:

$ numfmt --to=iec-i --suffix=B --format="%.3f" 4953205820
4.614GiB

I think that suits your needs well, and isn’t as large or hackish as the other answers.

If you want a more powerful solution, look at my other answer.

Get human readable version of file size?

Addressing the above "too small a task to require a library" issue by a straightforward implementation (using f-strings, so Python 3.6+):

def sizeof_fmt(num, suffix="B"):
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f}Yi{suffix}"

Supports:

  • all currently known binary prefixes
  • negative and positive numbers
  • numbers larger than 1000 Yobibytes
  • arbitrary units (maybe you like to count in Gibibits!)

Example:

>>> sizeof_fmt(168963795964)
'157.4GiB'
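The cases listed under "Supports" can be exercised directly. The sketch below repeats the function so that it runs on its own; the sample inputs are arbitrary values chosen for illustration.

```python
def sizeof_fmt(num, suffix="B"):
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f}Yi{suffix}"


# Negative sizes keep their sign.
print(sizeof_fmt(-1536))                # -1.5KiB
# A different suffix gives other units, e.g. bits instead of bytes.
print(sizeof_fmt(1024**3, suffix="b"))  # 1.0Gib
# Values beyond the last unit in the loop fall through to Yi.
print(sizeof_fmt(2000 * 1024**8))       # 2000.0YiB
```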

by Fred Cirera

human readable file size

Try something like this:

function humanFileSize($size,$unit="") {
    if( (!$unit && $size >= 1<<30) || $unit == "GB")
        return number_format($size/(1<<30),2)."GB";
    if( (!$unit && $size >= 1<<20) || $unit == "MB")
        return number_format($size/(1<<20),2)."MB";
    if( (!$unit && $size >= 1<<10) || $unit == "KB")
        return number_format($size/(1<<10),2)."KB";
    return number_format($size)." bytes";
}
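To see how the optional unit argument behaves, here is a rough Python port of the function above. The name human_file_size is made up for this sketch, and Python's "," format option stands in for PHP's number_format, so rounding of exact ties may differ.

```python
def human_file_size(size, unit=""):
    """Format size in bytes; unit may force "GB", "MB" or "KB"."""
    if (not unit and size >= 1 << 30) or unit == "GB":
        return f"{size / (1 << 30):,.2f}GB"
    if (not unit and size >= 1 << 20) or unit == "MB":
        return f"{size / (1 << 20):,.2f}MB"
    if (not unit and size >= 1 << 10) or unit == "KB":
        return f"{size / (1 << 10):,.2f}KB"
    # Below 1024 bytes (and no forced unit): whole bytes with thousands separators.
    return f"{size:,.0f} bytes"


print(human_file_size(1536))               # 1.50KB
print(human_file_size(3 * 2**30, "MB"))    # 3,072.00MB
print(human_file_size(500))                # 500 bytes
```

Note that the function divides by powers of 1024 while labelling the result KB, MB or GB.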

Converting file size in bytes to human-readable string

It depends on whether you want to use the binary or decimal convention.

RAM, for instance, is always measured in binary, so to express 1551859712 as ~1.4GiB would be correct.

On the other hand, hard disk manufacturers like to use decimal, so they would call it ~1.6GB.

And just to be confusing, floppy disks use a mixture of the two systems - their 1MB is actually 1024000 bytes.
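The arithmetic behind those three figures, as a small sketch (the helper names are made up for illustration):

```python
def to_gib(size_bytes):
    """Binary convention: 1 GiB = 1024**3 bytes."""
    return size_bytes / 1024**3


def to_gb(size_bytes):
    """Decimal convention: 1 GB = 1000**3 bytes."""
    return size_bytes / 1000**3


# The mixed floppy-disk "MB": 1000 * 1024 bytes.
FLOPPY_MB = 1000 * 1024

size = 1551859712
print(f"{to_gib(size):.1f} GiB")  # 1.4 GiB
print(f"{to_gb(size):.1f} GB")    # 1.6 GB
print(FLOPPY_MB)                  # 1024000
```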

Can I use stat to show human readable size of file?

As far as I know, the stat program cannot display human readable sizes by itself. But you can always pipe it to another program that does it, such as numfmt:

stat -c %s /path/to/file | numfmt --to=iec

Applied to your example, it would be:

filelist=$(ls -p | grep -v /)
filesize=$(
    stat -c "%s %n" $filelist | sort -nr -k1 | while read filesize filename; do
        printf '%s : %s\n' "$(numfmt --to=iec <<< $filesize)" "$filename"
    done
)

Please note I added the -k1 option when calling sort because I assume you want to sort using the size, not the name.

numfmt has the advantage that you can choose how you want to display the human readable size. I suggested --to=iec because this is the most common for file sizes, but you may want to use other conversions. Please refer to the numfmt man page.

As a last note, I would advise against building the file list from a $() capture, because it breaks when a filename contains a space character. You could use find to list the files and get their sizes at the same time, e.g.:

find . -mindepth 1 -maxdepth 1 -not -type d -printf '%s %f\n' |
    sort -nr -k1 |
    while read filesize filename
    do
        printf '%s : %s\n' "$(numfmt --to=iec <<< $filesize)" "$filename"
    done
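If a script is an option, the same listing can be sketched in Python, where filenames with spaces need no special care. The function names below are made up for this sketch, and the iec helper is only a rough stand-in whose output is not identical to numfmt --to=iec.

```python
import os


def iec(size):
    """Rough binary-prefix formatter; not byte-for-byte what numfmt prints."""
    for unit in ["", "K", "M", "G", "T", "P", "E"]:
        if size < 1024:
            return f"{size:.1f}{unit}" if unit else f"{size}"
        size /= 1024
    return f"{size:.1f}Z"


def list_by_size(directory="."):
    """Return (size, name) pairs for non-directory entries, largest first."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            # Mirror `-not -type d`: skip directories, do not follow symlinks.
            if not entry.is_dir(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                entries.append((size, entry.name))
    return sorted(entries, reverse=True)


if __name__ == "__main__":
    for size, name in list_by_size("."):
        print(f"{iec(size)} : {name}")
```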

How to convert file size to human readable and print with other columns?

Just use -h --si

  -h, --human-readable       with -l and -s, print sizes like 1K 234M 2G etc.
      --si                   likewise, but use powers of 1000 not 1024

So the command would be

ls -lh --si | tail -n +2

If you don't use ls, and the command you intend to run has no option similar to ls's -h --si, numfmt has a --field option to specify which columns you want to format. For example:


$ df | LC_ALL=en_US.UTF-8 numfmt --header --field 2-4 --to=si
Filesystem 1K-blocks Used Available Use% Mounted on
udev 66M 0 66M 0% /dev
tmpfs 14M 7.2K 14M 1% /run
/dev/mapper/vg0-lv--0 4.1G 3.7G 416M 90% /
tmpfs 5.2K 4 5.2K 1% /run/lock
/dev/nvme2n1p1 524K 5.4K 518K 2% /boot/efi

Unfortunately, although numfmt does try to preserve the column alignment, it fails if the line lengths vary widely after inserting group separators, as you can see above. So sometimes you may still need to realign the table with column:

df | LC_ALL=en_US.UTF-8 numfmt --header --field 2-4 --to=si | column -t -R 2,3,4,5

The -R 2,3,4,5 option is for right alignment, but some versions of column, like the default one in Ubuntu, don't support it, so you may need to remove it.


Alternatively you can also use awk to format only the columns you want, for example column 5 in case of ls


$ ls -l demo* | awk -v K=1e3 -v M=1e6 -v G=1e9 'func format(v) {
    if (v > G) return v/G "G"; else if (v > M) return v/M "M";
    else if (v > K) return v/K "K"; else return v
} { $5 = format($5); print $0 }' | column -t

-rw-rw-r-- 1 ph ph 280K Jun 18 09:23 demo1
-rw-rw-r-- 1 ph ph 2.8M Jun 18 09:24 demo2
-rw-rw-r-- 1 ph ph 28M Jun 18 09:23 demo3
-rw-rw-r-- 1 ph ph 2.8G Jun 18 09:30 demo4
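For readers less familiar with awk, the same idea can be sketched in Python: split each line into fields, rewrite one field, and join the fields again. The function names are made up for this sketch, and the "%.6g" style formatting is an assumption meant to approximate awk's default number output.

```python
def format_si(value, K=1e3, M=1e6, G=1e9):
    """Apply the same thresholds as the awk format() function."""
    if value > G:
        return f"{value / G:.6g}G"
    if value > M:
        return f"{value / M:.6g}M"
    if value > K:
        return f"{value / K:.6g}K"
    return f"{value:.6g}"


def format_column(line, column=5):
    """Rewrite one whitespace-separated field (1-based, like awk's $5)."""
    fields = line.split()
    fields[column - 1] = format_si(float(fields[column - 1]))
    # Joining with single spaces loses the alignment, as assigning to $5 does in awk.
    return " ".join(fields)


print(format_column("-rw-rw-r-- 1 ph ph 280000 Jun 18 09:23 demo1"))
# -rw-rw-r-- 1 ph ph 280K Jun 18 09:23 demo1
```

Like the awk version, this collapses runs of whitespace, so a filename containing several consecutive spaces would be altered.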

And column 2, 3, 4 in case of df


# M=1000 and G=1000000 because df output is 1K-block, not bytes
$ df | awk -v M=1000 -v G=1000000 'func format(v) {
    if (v > G) return v/G "G"; else if (v > M) return v/M "M"; else return v
}
{
    # Format only columns 2, 3 and 4, ignore header
    if (NR > 1) { $2 = format($2); $3 = format($3); $4 = format($4) }
    print $0
}' OFS="\t" | column -t

Filesystem 1K-blocks Used Available Use% Mounted on
udev 65.8273G 0 65.8273G 0% /dev
tmpfs 13.1772G 7M 13.1702G 1% /run
/dev/mapper/vg0-lv--0 4073.78G 3619.05G 415.651G 90% /
tmpfs 65.8861G 0 65.8861G 0% /dev/shm
tmpfs 5.12M 4 5.116M 1% /run/lock
tmpfs 65.8861G 0 65.8861G 0% /sys/fs/cgroup
/dev/nvme2n1p2 999.32M 363.412M 567.096M 40% /boot

How do I get a human-readable file size in bytes abbreviation using .NET?

This may not be the most efficient or optimized way to do it, but it's easier to read if you are not familiar with log maths, and should be fast enough for most scenarios.

string[] sizes = { "B", "KB", "MB", "GB", "TB" };
double len = new FileInfo(filename).Length;
int order = 0;
while (len >= 1024 && order < sizes.Length - 1) {
    order++;
    len = len/1024;
}

// Adjust the format string to your preferences. For example "{0:0.#}{1}" would
// show a single decimal place, and no space.
string result = String.Format("{0:0.##} {1}", len, sizes[order]);
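For comparison, here is a sketch of the logarithm-based variant the answer alludes to, written in Python. It is not the answer's code: it swaps in int.bit_length as an exact integer log2 to avoid floating-point log pitfalls, assumes a non-negative integer byte count, and uses a made-up function name.

```python
SIZES = ["B", "KB", "MB", "GB", "TB"]


def human_size(length):
    """Pick the unit from the integer log2 instead of a division loop."""
    # (bit_length - 1) is floor(log2(length)); every 10 bits is one factor of 1024.
    order = (length.bit_length() - 1) // 10 if length > 0 else 0
    # Clamp to the largest known unit, as the loop condition does.
    order = min(order, len(SIZES) - 1)
    value = length / 1024**order
    # Up to two decimals with trailing zeros dropped, similar to "{0:0.##}".
    text = f"{value:.2f}".rstrip("0").rstrip(".")
    return f"{text} {SIZES[order]}"


print(human_size(1536))        # 1.5 KB
print(human_size(1551859712))  # 1.45 GB
```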

command to print out large files, sorted, with sizes in human readable format


find ... | sort -rn | cut -d\  -f2 | xargs df -h

for instance :) or

find "$dir" -type f -size +"$size" -print0 | xargs -0 ls -1hsS
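A Python sketch of the same search, for cases where a script is preferable to a pipeline. The function name is made up, and the threshold here is in plain bytes, whereas find's -size counts 512-byte blocks unless a unit suffix is given.

```python
import os


def large_files(directory, min_bytes):
    """Return (size, path) for files larger than min_bytes, largest first."""
    found = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            # Skip symlinks; other special files are not filtered in this sketch,
            # so this only approximates `-type f`.
            if os.path.islink(path):
                continue
            size = os.path.getsize(path)
            if size > min_bytes:
                found.append((size, path))
    return sorted(found, reverse=True)


if __name__ == "__main__":
    for size, path in large_files(".", 1024 * 1024):
        print(size, path)
```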

(with a little inspiration borrowed from olibre).

How can I convert byte size into a human-readable format in Java?


Fun fact: The original snippet posted here was the most copied Java snippet of all time on Stack Overflow, and it was flawed. It was fixed, but it got messy.

Full story in this article: The most copied Stack Overflow snippet of all time is flawed!

Source: Formatting byte size to human readable format | Programming.Guide

SI (1 k = 1,000)

// Needs: import java.text.CharacterIterator; import java.text.StringCharacterIterator;
public static String humanReadableByteCountSI(long bytes) {
    if (-1000 < bytes && bytes < 1000) {
        return bytes + " B";
    }
    CharacterIterator ci = new StringCharacterIterator("kMGTPE");
    while (bytes <= -999_950 || bytes >= 999_950) {
        bytes /= 1000;
        ci.next();
    }
    return String.format("%.1f %cB", bytes / 1000.0, ci.current());
}

Binary (1 Ki = 1,024)

// Needs: import java.text.CharacterIterator; import java.text.StringCharacterIterator;
public static String humanReadableByteCountBin(long bytes) {
    long absB = bytes == Long.MIN_VALUE ? Long.MAX_VALUE : Math.abs(bytes);
    if (absB < 1024) {
        return bytes + " B";
    }
    long value = absB;
    CharacterIterator ci = new StringCharacterIterator("KMGTPE");
    for (int i = 40; i >= 0 && absB > 0xfffccccccccccccL >> i; i -= 10) {
        value >>= 10;
        ci.next();
    }
    value *= Long.signum(bytes);
    return String.format("%.1f %ciB", value / 1024.0, ci.current());
}

Example output:

                              SI     BINARY

                   0:        0 B        0 B
                  27:       27 B       27 B
                 999:      999 B      999 B
                1000:     1.0 kB     1000 B
                1023:     1.0 kB     1023 B
                1024:     1.0 kB    1.0 KiB
                1728:     1.7 kB    1.7 KiB
              110592:   110.6 kB  108.0 KiB
             7077888:     7.1 MB    6.8 MiB
           452984832:   453.0 MB  432.0 MiB
         28991029248:    29.0 GB   27.0 GiB
       1855425871872:     1.9 TB    1.7 TiB
 9223372036854775807:     9.2 EB    8.0 EiB   (Long.MAX_VALUE)
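The 999_950 threshold in the SI version appears to exist so that values which would otherwise print as "1000.0" are promoted to the next unit. A Python port makes this easy to try at the boundary. This is a sketch with a made-up function name, and Python's formatting of exact ties may differ from Java's String.format.

```python
def human_readable_si(n):
    """Python port of humanReadableByteCountSI; n is an integer byte count."""
    if -1000 < n < 1000:
        return f"{n} B"
    prefixes = "kMGTPE"
    i = 0
    while n <= -999_950 or n >= 999_950:
        # Java's long division truncates toward zero while Python's // floors,
        # so divide the magnitude and restore the sign.
        n = (abs(n) // 1000) * (1 if n > 0 else -1)
        i += 1
    return f"{n / 1000.0:.1f} {prefixes[i]}B"


print(human_readable_si(999_949))  # 999.9 kB
print(human_readable_si(999_950))  # 1.0 MB
```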

Human readable size units (file sizes) for scala code (like Duration)

I've just stumbled upon squants. As stated on their own site:

Squants is a framework of data types and a domain specific language
(DSL) for representing Quantities, their Units of Measure, and their
Dimensional relationships. The API supports typesafe dimensional
analysis, improved domain models and more. All types are immutable and
thread-safe.

With squants you can do:

10.kib
10.kibibytes
50.mib
100.gib

Although I didn't like that the unit symbols are all lowercase (i.e. gib instead of GiB).


