Get Human Readable Version of File Size

Get human readable version of file size?

Addressing the above "too small a task to require a library" issue by a straightforward implementation (using f-strings, so Python 3.6+):

def sizeof_fmt(num, suffix="B"):
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f}Yi{suffix}"

Supports:

  • all currently known binary prefixes
  • negative and positive numbers
  • numbers larger than 1000 Yobibytes
  • arbitrary units (maybe you like to count in Gibibits!)

Example:

>>> sizeof_fmt(168963795964)
'157.4GiB'

by Fred Cirera
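As a quick check of the "Supports" list above, here is a small self-contained sketch that repeats the function and exercises a negative value, a custom suffix, and a value past the last looped prefix. The sample inputs are illustrative and not from the original answer.

```python
def sizeof_fmt(num, suffix="B"):
    """Format num using binary (IEC) prefixes, as in the answer above."""
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    # Anything that survives the loop is expressed in Yi, however large.
    return f"{num:.1f}Yi{suffix}"


print(sizeof_fmt(-1536))                    # negative input
print(sizeof_fmt(5 * 2**30, suffix="bit"))  # arbitrary unit suffix
print(sizeof_fmt(1024**9))                  # more than 1000 YiB
```

Because the final return sits outside the loop, values beyond the Zi range are still reported in Yi rather than raising an error.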

File size in human readable format

GNU Coreutils contains an apparently rather unknown little tool called numfmt for numeric conversion, that does what you need:

$ numfmt --to=iec-i --suffix=B --format="%.3f" 4953205820
4.614GiB

I think that suits your needs well, and isn’t as large or hackish as the other answers.

If you want a more powerful solution, look at my other answer.

Converting file size in bytes to human-readable string

It depends on whether you want to use the binary or decimal convention.

RAM, for instance, is always measured in binary, so to express 1551859712 as ~1.4GiB would be correct.

On the other hand, hard disk manufacturers like to use decimal, so they would call it ~1.6GB.

And just to be confusing, floppy disks use a mixture of the two systems - their 1MB is actually 1024000 bytes.
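To make the difference concrete, here is a minimal Python sketch that formats the same byte count under both conventions. The function name format_size and its binary flag are hypothetical, chosen only for this illustration.

```python
def format_size(nbytes, binary=True):
    """Format a byte count with binary (1024) or decimal (1000) prefixes."""
    base = 1024 if binary else 1000
    units = (["B", "KiB", "MiB", "GiB", "TiB"] if binary
             else ["B", "kB", "MB", "GB", "TB"])
    value = float(nbytes)
    for unit in units[:-1]:
        if abs(value) < base:
            return f"{value:.1f} {unit}"
        value /= base
    # Fall through: report everything else in the largest unit.
    return f"{value:.1f} {units[-1]}"


print(format_size(1551859712, binary=True))   # the RAM-style reading
print(format_size(1551859712, binary=False))  # the disk-vendor reading

# The floppy "MB" mixes both systems: 1000 * 1024 bytes.
floppy_bytes = 1440 * 1024
print(floppy_bytes / (1000 * 1024))           # the familiar "1.44 MB"
```

The first two calls give 1.4 GiB and 1.6 GB for the same number of bytes, which is the discrepancy described above.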

How do I get a human-readable file size in bytes abbreviation using .NET?

This may not be the most efficient or optimized way to do it, but it's easier to read if you are not familiar with log maths, and should be fast enough for most scenarios.

string[] sizes = { "B", "KB", "MB", "GB", "TB" };
double len = new FileInfo(filename).Length;
int order = 0;
while (len >= 1024 && order < sizes.Length - 1) {
    order++;
    len = len / 1024;
}

// Adjust the format string to your preferences. For example "{0:0.#}{1}" would
// show a single decimal place, and no space.
string result = String.Format("{0:0.##} {1}", len, sizes[order]);
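For comparison, the "log maths" alternative mentioned above can be sketched as follows. This is a Python illustration rather than .NET code, and it is an assumption of this sketch (not part of the answer) to use the integer bit_length method instead of a floating-point logarithm, which avoids rounding trouble at exact powers of 1024. The name human_size_log is hypothetical.

```python
SIZES = ["B", "KB", "MB", "GB", "TB"]


def human_size_log(nbytes):
    """Pick the unit directly from floor(log2(n)) // 10 instead of looping."""
    if nbytes < 0:
        raise ValueError("nbytes must be non-negative")
    # bit_length() - 1 equals floor(log2(n)) for n > 0; every 10 bits is one
    # step of 1024.
    order = 0 if nbytes == 0 else (nbytes.bit_length() - 1) // 10
    order = min(order, len(SIZES) - 1)  # clamp to the largest known unit
    return f"{nbytes / 1024 ** order:.2f} {SIZES[order]}"


print(human_size_log(1023))
print(human_size_log(1536))
print(human_size_log(1024 ** 5))
```

Unlike the "0.##" format string above, this sketch always prints two decimal places.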

human readable file size

Try something like this:

function humanFileSize($size, $unit = "") {
    if ((!$unit && $size >= 1<<30) || $unit == "GB")
        return number_format($size / (1<<30), 2) . "GB";
    if ((!$unit && $size >= 1<<20) || $unit == "MB")
        return number_format($size / (1<<20), 2) . "MB";
    if ((!$unit && $size >= 1<<10) || $unit == "KB")
        return number_format($size / (1<<10), 2) . "KB";
    return number_format($size) . " bytes";
}

Parse human-readable filesizes into bytes

Here's a slightly prettier version. There's probably no module for this, so just define the function inline. It's very small and readable.

units = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

# Alternative unit definitions, notably used by Windows:
# units = {"B": 1, "KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}

def parse_size(size):
    number, unit = [string.strip() for string in size.split()]
    return int(float(number)*units[unit])

example_strings = ["10.43 KB", "11 GB", "343.1 MB"]

for example_string in example_strings:
    print(parse_size(example_string))

10430
11000000000
343100000

(Note that different places use slightly different conventions for the definitions of KB, MB, etc -- either using powers of 10**3 = 1000 or powers of 2**10 = 1024. If your context is Windows, you will want to use the latter. If your context is Mac OS, you will want to use the former.)
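If the input may lack the space between number and unit, or may use the explicit binary names (KiB, MiB, ...), a slightly more tolerant variant could look like the sketch below. The regular expression, the UNITS table, and the choice to round rather than truncate are assumptions of this sketch, not part of the answer above.

```python
import re

# Decimal names use powers of 1000, IEC names use powers of 1024.
UNITS = {
    "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
    "KIB": 2**10, "MIB": 2**20, "GIB": 2**30, "TIB": 2**40,
}

_SIZE_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]+)\s*$")


def parse_size(text):
    """Parse strings such as '10.43 KB' or '11GiB' into a byte count."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.groups()
    try:
        factor = UNITS[unit.upper()]  # unit lookup is case-insensitive
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None
    return round(float(number) * factor)


for example in ["10.43 KB", "11GiB", "343.1 MiB", "1.5kb"]:
    print(example, "->", parse_size(example))
```

Keeping KB and KiB as separate keys lets the caller state which convention is meant instead of switching the whole table.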

Python libraries to calculate human readable filesize from bytes?

This isn't really hard to implement yourself:

suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
def humansize(nbytes):
    i = 0
    while nbytes >= 1024 and i < len(suffixes)-1:
        nbytes /= 1024.
        i += 1
    f = ('%.2f' % nbytes).rstrip('0').rstrip('.')
    return '%s %s' % (f, suffixes[i])

Examples:

>>> humansize(131)
'131 B'
>>> humansize(1049)
'1.02 KB'
>>> humansize(58812)
'57.43 KB'
>>> humansize(68819826)
'65.63 MB'
>>> humansize(39756861649)
'37.03 GB'
>>> humansize(18754875155724)
'17.06 TB'

How to convert file size to human readable and print with other columns?

Just use -h --si

  -h, --human-readable       with -l and -s, print sizes like 1K 234M 2G etc.
      --si                   likewise, but use powers of 1000 not 1024

So the command would be

ls -lh --si | tail -n +2

If you don't use ls and the command you intend to run doesn't have an option similar to -h --si in ls then numfmt already has the --field option to specify which column you want to format. For example


$ df | LC_ALL=en_US.UTF-8 numfmt --header --field 2-4 --to=si
Filesystem 1K-blocks Used Available Use% Mounted on
udev 66M 0 66M 0% /dev
tmpfs 14M 7.2K 14M 1% /run
/dev/mapper/vg0-lv--0 4.1G 3.7G 416M 90% /
tmpfs 5.2K 4 5.2K 1% /run/lock
/dev/nvme2n1p1 524K 5.4K 518K 2% /boot/efi

Unfortunately, although numfmt tries to preserve the column alignment, it fails when the line lengths vary widely after the numbers are reformatted (for example, after inserting group separators), as you can see above. So sometimes you might still need to realign the table with column:

df | LC_ALL=en_US.UTF-8 numfmt --header --field 2-4 --to=si | column -t -R 2,3,4,5

The -R 2,3,4,5 option right-aligns those columns, but some versions of column, such as the default one in Ubuntu, don't support it, so you may need to remove it.


Alternatively, you can use awk to format only the columns you want, for example column 5 in the case of ls:


$ ls -l demo* | awk -v K=1e3 -v M=1e6 -v G=1e9 'func format(v) {
    if (v > G) return v/G "G"; else if (v > M) return v/M "M";
    else if (v > K) return v/K "K"; else return v
} { $5 = format($5); print $0 }' | column -t

-rw-rw-r--  1  ph  ph  280K  Jun  18  09:23  demo1
-rw-rw-r--  1  ph  ph  2.8M  Jun  18  09:24  demo2
-rw-rw-r--  1  ph  ph  28M   Jun  18  09:23  demo3
-rw-rw-r--  1  ph  ph  2.8G  Jun  18  09:30  demo4

And columns 2, 3 and 4 in the case of df:


# M=1000 and G=1000000 because df output is 1K-block, not bytes
$ df | awk -v M=1000 -v G=1000000 'func format(v) {
    if (v > G) return v/G "G"; else if (v > M) return v/M "M"; else return v
}
{
    # Format only columns 2, 3 and 4, ignore header
    if (NR > 1) { $2 = format($2); $3 = format($3); $4 = format($4) }
    print $0
}' OFS="\t" | column -t

Filesystem             1K-blocks  Used      Available  Use%  Mounted on
udev                   65.8273G   0         65.8273G   0%    /dev
tmpfs                  13.1772G   7M        13.1702G   1%    /run
/dev/mapper/vg0-lv--0  4073.78G   3619.05G  415.651G   90%   /
tmpfs                  65.8861G   0         65.8861G   0%    /dev/shm
tmpfs                  5.12M      4         5.116M     1%    /run/lock
tmpfs                  65.8861G   0         65.8861G   0%    /sys/fs/cgroup
/dev/nvme2n1p2         999.32M    363.412M  567.096M   40%   /boot
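If awk is not to your taste, the same column-wise idea can be sketched in Python. Everything here (the names si and humanize_columns, the multiplier parameter, and the sample df line) is made up for illustration. The sketch assumes whitespace-separated input and 1-based column numbers, and it emits tab-separated lines that can still be piped through column -t.

```python
def si(value):
    """Format a number with SI suffixes (powers of 1000)."""
    value = float(value)
    for unit in ("", "K", "M", "G", "T"):
        if abs(value) < 1000:
            # Values below 1000 are printed without a suffix or decimals.
            return f"{value:.1f}{unit}" if unit else f"{value:g}"
        value /= 1000
    return f"{value:.1f}P"


def humanize_columns(lines, columns, header=True, multiplier=1):
    """Rewrite the given 1-based columns of each line in SI notation."""
    result = []
    for index, line in enumerate(lines):
        parts = line.split()
        if not (header and index == 0):  # leave the header row untouched
            for column in columns:
                i = column - 1
                if i < len(parts) and parts[i].isdigit():
                    parts[i] = si(int(parts[i]) * multiplier)
        result.append("\t".join(parts))
    return result


sample = [
    "Filesystem 1K-blocks Used Available Use% Mounted on",
    "udev 65827300 0 65827300 0% /dev",  # hypothetical df line
]
# df reports 1K-blocks by default, so scale by 1024 to get bytes first.
for row in humanize_columns(sample, columns=(2, 3, 4), multiplier=1024):
    print(row)
```

Scaling to bytes before formatting avoids the shifted M and G thresholds that the awk version needs for df.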

How can I convert byte size into a human-readable format in Java?

Fun fact: The original snippet posted here was the most copied Java snippet of all time on Stack Overflow, and it was flawed. It was fixed, but it got messy.

Full story in this article: The most copied Stack Overflow snippet of all time is flawed!

Source: Formatting byte size to human readable format | Programming.Guide

SI (1 k = 1,000)

import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

public static String humanReadableByteCountSI(long bytes) {
    if (-1000 < bytes && bytes < 1000) {
        return bytes + " B";
    }
    CharacterIterator ci = new StringCharacterIterator("kMGTPE");
    while (bytes <= -999_950 || bytes >= 999_950) {
        bytes /= 1000;
        ci.next();
    }
    return String.format("%.1f %cB", bytes / 1000.0, ci.current());
}

Binary (1 Ki = 1,024)

public static String humanReadableByteCountBin(long bytes) {
    long absB = bytes == Long.MIN_VALUE ? Long.MAX_VALUE : Math.abs(bytes);
    if (absB < 1024) {
        return bytes + " B";
    }
    long value = absB;
    CharacterIterator ci = new StringCharacterIterator("KMGTPE");
    for (int i = 40; i >= 0 && absB > 0xfffccccccccccccL >> i; i -= 10) {
        value >>= 10;
        ci.next();
    }
    value *= Long.signum(bytes);
    return String.format("%.1f %ciB", value / 1024.0, ci.current());
}

Example output:

                            SI     BINARY

                  0:       0 B        0 B
                 27:      27 B       27 B
                999:     999 B      999 B
               1000:    1.0 kB     1000 B
               1023:    1.0 kB     1023 B
               1024:    1.0 kB    1.0 KiB
               1728:    1.7 kB    1.7 KiB
             110592:  110.6 kB  108.0 KiB
            7077888:    7.1 MB    6.8 MiB
          452984832:  453.0 MB  432.0 MiB
        28991029248:   29.0 GB   27.0 GiB
      1855425871872:    1.9 TB    1.7 TiB
9223372036854775807:    9.2 EB    8.0 EiB  (Long.MAX_VALUE)
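To see how the SI variant behaves around its 999_950 threshold, here is a rough Python port. It is a sketch, not the author's code: the name human_readable_si is hypothetical, it assumes the input fits in a signed 64-bit range as a Java long would, and Python may round exact ties (such as 1250 bytes) differently from Java's String.format. The threshold appears to be chosen so that a value which would round up to "1000.0" is moved to the next prefix instead.

```python
def human_readable_si(nbytes):
    """Rough port of humanReadableByteCountSI for 64-bit-sized integers."""
    if -1000 < nbytes < 1000:
        return f"{nbytes} B"
    prefixes = "kMGTPE"
    index = 0
    while nbytes <= -999_950 or nbytes >= 999_950:
        # Java's long division truncates toward zero, while Python's //
        # floors, so negative values go through their absolute value.
        nbytes = nbytes // 1000 if nbytes >= 0 else -(-nbytes // 1000)
        index += 1
    return f"{nbytes / 1000.0:.1f} {prefixes[index]}B"


for n in (999, 1000, 999_949, 999_950, 7077888, 9223372036854775807):
    print(n, "->", human_readable_si(n))
```

Just below the threshold, 999_949 still prints as 999.9 kB, while 999_950 already switches to 1.0 MB.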

byte to human readable size with npm package filesize

SI units are (mostly) decimal, and so are their prefixes:

  • kilo (k): 10^3 = 1,000
  • mega (M): 10^6 = 1,000,000
  • giga (G): 10^9 = 1,000,000,000

When binary digital computers were developed, they needed new prefixes. Agreement about the values was soon reached, but it wasn't easy to find catchy names. Unfortunately, the names that eventually spread were the SI ones, so we ended up with a nice confusion:

  • kilo (K): 2^10 = 1,024
  • mega (M): 2^20 = 1,048,576
  • giga (G): 2^30 = 1,073,741,824

Then, someone invented some new names that were arguably not as bad as previous ones, but it was too late and almost nobody uses them:

  • kibi (Ki): 2^10 = 1,024
  • mebi (Mi): 2^20 = 1,048,576
  • gibi (Gi): 2^30 = 1,073,741,824

In computers almost everything is a power of 2 so decimal-based units are usually avoided because they are never round.

In your example, using base 2 and base 10 prefixes renders this:

  • 10485760000 / 2^30 = 9.765625 GiB
  • 10485760000 / 10^9 = 10.48576 GB

The value you want is probably the first one, given that it's a file size.


