Get Human Readable Version of File Size

Get human readable version of file size?

Addressing the above "too small a task to require a library" issue by a straightforward implementation (using f-strings, so Python 3.6+):

def sizeof_fmt(num, suffix="B"):
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f}Yi{suffix}"

Supports:

all currently known binary prefixes
negative and positive numbers
numbers larger than 1000 Yobibytes
arbitrary units (maybe you like to count in Gibibits!)

Example:

>>> sizeof_fmt(168963795964)
'157.4GiB'

by Fred Cirera

File size in human readable format

GNU Coreutils contains an apparently rather unknown little tool called numfmt for numeric conversion, that does what you need:

$ numfmt --to=iec-i --suffix=B --format="%.3f" 4953205820
4.614GiB

I think that suits your needs well, and isn’t as large or hackish as the other answers.

If you want a more powerful solution, look at my other answer.

Converting file size in bytes to human-readable string

It depends on whether you want to use the binary or decimal convention.

RAM, for instance, is always measured in binary, so to express 1551859712 as ~1.4GiB would be correct.

On the other hand, hard disk manufacturers like to use decimal, so they would call it ~1.6GB.

And just to be confusing, floppy disks use a mixture of the two systems - their 1MB is actually 1024000 bytes.
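The three figures above can be checked with a few lines of Python. This is only an illustrative sketch; the constant names (GIB, GB, FLOPPY_MB) are made up for this example and do not come from any library.

```python
# Hypothetical constant names, chosen for this illustration only.
GIB = 2**30               # binary convention: 1 GiB = 1,073,741,824 bytes
GB = 10**9                # decimal convention: 1 GB = 1,000,000,000 bytes
FLOPPY_MB = 1000 * 1024   # the mixed "megabyte" used for floppy disks

ram_bytes = 1551859712    # the byte count quoted in the text

print(f"{ram_bytes / GIB:.1f} GiB")  # binary convention
print(f"{ram_bytes / GB:.1f} GB")    # decimal convention
print(FLOPPY_MB)                     # bytes in one floppy "MB"
```

The same byte count reads as 1.4 under the binary convention and 1.6 under the decimal one, which is why it matters to state the convention you are using.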

How do I get a human-readable file size in bytes abbreviation using .NET?

This may not be the most efficient or optimized way to do it, but it's easier to read if you are not familiar with log maths, and should be fast enough for most scenarios.

string[] sizes = { "B", "KB", "MB", "GB", "TB" };
double len = new FileInfo(filename).Length;
int order = 0;
while (len >= 1024 && order < sizes.Length - 1) {
    order++;
    len = len/1024;
}

// Adjust the format string to your preferences. For example "{0:0.#}{1}" would
// show a single decimal place, and no space.
string result = String.Format("{0:0.##} {1}", len, sizes[order]);
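For comparison, here is a rough sketch of the "log maths" alternative mentioned above, written in Python rather than C#. It is not the original author's code: the names format_size_log and SIZES are hypothetical, and rejecting negative lengths is an assumption of this sketch. It uses int.bit_length() as an exact integer log2, which avoids the rounding surprises of a floating-point logarithm at exact powers of 1024.

```python
SIZES = ["B", "KB", "MB", "GB", "TB"]  # same unit list as the loop version


def format_size_log(length: int) -> str:
    """Pick the unit with a logarithm instead of a division loop."""
    if length < 0:
        raise ValueError("length must be non-negative")  # assumption of this sketch
    # For a positive int, bit_length() - 1 equals floor(log2(length));
    # integer-dividing by 10 turns that into floor(log1024(length)).
    order = 0 if length == 0 else (length.bit_length() - 1) // 10
    order = min(order, len(SIZES) - 1)  # clamp to the largest known unit
    value = length / 1024**order
    # Mimic the "0.##" format: at most two decimals, no trailing zeros.
    text = f"{value:.2f}".rstrip("0").rstrip(".")
    return f"{text} {SIZES[order]}"


print(format_size_log(1536))  # 1.5 KB
```

Both approaches pick the same unit; the loop is arguably easier to follow, while the logarithm avoids repeated division.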

human readable file size

Try something like this:

function humanFileSize($size,$unit="") {
  if( (!$unit && $size >= 1<<30) || $unit == "GB")
    return number_format($size/(1<<30),2)."GB";
  if( (!$unit && $size >= 1<<20) || $unit == "MB")
    return number_format($size/(1<<20),2)."MB";
  if( (!$unit && $size >= 1<<10) || $unit == "KB")
    return number_format($size/(1<<10),2)."KB";
  return number_format($size)." bytes";
}

Parse human-readable filesizes into bytes

Here's a slightly prettier version. There's probably no module for this, so just define the function inline; it's very small and readable.

units = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

# Alternative unit definitions, notably used by Windows:
# units = {"B": 1, "KB": 2**10, "MB": 2**20, "GB": 2**30, "TB": 2**40}

def parse_size(size):
    # split() already discards surrounding whitespace, so no strip() is needed
    number, unit = size.split()
    return int(float(number)*units[unit])

example_strings = ["10.43 KB", "11 GB", "343.1 MB"]

for example_string in example_strings:
    print(parse_size(example_string))

Output with the decimal definitions active above:

10430
11000000000
343100000

(Note that different places use slightly different conventions for the definitions of KB, MB, etc -- either using powers of 10**3 = 1000 or powers of 2**10 = 1024. If your context is Windows, you will want to use the latter. If your context is Mac OS, you will want to use the former.)
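If you need to handle both conventions, one possible approach is to build the unit table from a flag. The sketch below is an assumption-laden variation, not part of the answer above: the names unit_table, parse_size_units and binary_kb are invented here, and accepting IEC suffixes such as KiB as always binary is a design choice of this sketch. It also uses round() instead of int() so that values like 343.1 MB are not truncated.

```python
import re

# Number, optional whitespace, then an alphabetic unit (e.g. "10.43 KB", "1.5GiB").
_SIZE_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*")


def unit_table(binary_kb: bool) -> dict:
    """Map upper-cased unit names to byte multipliers."""
    table = {"B": 1}
    base = 1024 if binary_kb else 1000
    for power, letter in enumerate("KMGT", start=1):
        table[f"{letter}IB"] = 1024**power  # KiB, MiB, ... are always binary
        table[f"{letter}B"] = base**power   # KB, MB, ... depend on the context
    return table


def parse_size_units(text: str, binary_kb: bool = False) -> int:
    """Parse a string such as '11 GB' into a byte count."""
    match = _SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"cannot parse size: {text!r}")
    number, unit = match.group(1), match.group(2).upper()
    table = unit_table(binary_kb)
    if unit not in table:
        raise ValueError(f"unknown unit: {unit!r}")
    return round(float(number) * table[unit])


print(parse_size_units("10.43 KB"))                # decimal convention
print(parse_size_units("11 GB", binary_kb=True))   # binary convention
```

Passing binary_kb=True corresponds to the Windows-style reading of KB, MB and so on; the default follows the decimal reading.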

Python libraries to calculate human readable filesize from bytes?

This isn't really hard to implement yourself:

suffixes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
def humansize(nbytes):
    i = 0
    while nbytes >= 1024 and i < len(suffixes)-1:
        nbytes /= 1024.
        i += 1
    f = ('%.2f' % nbytes).rstrip('0').rstrip('.')
    return '%s %s' % (f, suffixes[i])

Examples:

>>> humansize(131)
'131 B'
>>> humansize(1049)
'1.02 KB'
>>> humansize(58812)
'57.43 KB'
>>> humansize(68819826)
'65.63 MB'
>>> humansize(39756861649)
'37.03 GB'
>>> humansize(18754875155724)
'17.06 TB'

How to convert file size to human readable and print with other columns?

Just use -h --si

  -h, --human-readable       with -l and -s, print sizes like 1K 234M 2G etc.
      --si                   likewise, but use powers of 1000 not 1024

So the command would be

ls -lh --si | tail -n +2

If you aren't using ls and the command you intend to run has no option similar to ls's -h --si, numfmt has the --field option to specify which columns you want to format. For example


$ df | LC_ALL=en_US.UTF-8 numfmt --header --field 2-4 --to=si
Filesystem              1K-blocks       Used  Available Use% Mounted on
udev                          66M          0        66M   0% /dev
tmpfs                         14M       7.2K        14M   1% /run
/dev/mapper/vg0-lv--0        4.1G       3.7G       416M  90% /
tmpfs                        5.2K          4       5.2K   1% /run/lock
/dev/nvme2n1p1               524K       5.4K       518K   2% /boot/efi

Unfortunately, although numfmt tries to preserve the column alignment, it fails when the line lengths vary widely after the values are converted, as you can see above. So sometimes you might still need to reformat the table with column

df | LC_ALL=en_US.UTF-8 numfmt --header --field 2-4 --to=si | column -t -R 2,3,4,5

The -R 2,3,4,5 option is for right alignment, but some versions of column, such as the default one in Ubuntu, don't support it, so there you need to remove that option.

Alternatively you can also use awk to format only the columns you want, for example column 5 in case of ls


$ ls -l demo* | awk -v K=1e3 -v M=1e6 -v G=1e9 'func format(v) {
  if (v > G) return v/G "G"; else if (v > M) return v/M "M";
  else if (v > K) return v/K "K"; else return v
} { $5 = format($5); print $0 }' | column -t
-rw-rw-r--  1  ph  ph  280K  Jun  18  09:23  demo1
-rw-rw-r--  1  ph  ph  2.8M  Jun  18  09:24  demo2
-rw-rw-r--  1  ph  ph  28M   Jun  18  09:23  demo3
-rw-rw-r--  1  ph  ph  2.8G  Jun  18  09:30  demo4

And columns 2, 3 and 4 in the case of df


# M=1000 and G=1000000 because df output is 1K-block, not bytes
$ df | awk -v M=1000 -v G=1000000 'func format(v) {
  if (v > G) return v/G "G"; else if (v > M) return v/M "M"; else return v
}
{
  # Format only columns 2, 3 and 4, ignore header
  if (NR > 1) { $2 = format($2); $3 = format($3); $4 = format($4) }
  print $0
}' OFS="\t" | column -t
Filesystem              1K-blocks  Used      Available  Use%  Mounted                 on
udev                    65.8273G   0         65.8273G   0%    /dev
tmpfs                   13.1772G   7M        13.1702G   1%    /run
/dev/mapper/vg0-lv--0   4073.78G   3619.05G  415.651G   90%   /
tmpfs                   65.8861G   0         65.8861G   0%    /dev/shm
tmpfs                   5.12M      4         5.116M     1%    /run/lock
tmpfs                   65.8861G   0         65.8861G   0%    /sys/fs/cgroup
/dev/nvme2n1p2          999.32M    363.412M  567.096M   40%   /boot

How can I convert byte size into a human-readable format in Java?

Fun fact: The original snippet posted here was the most copied Java snippet of all time on Stack Overflow, and it was flawed. It was fixed, but it got messy.
Full story in this article: The most copied Stack Overflow snippet of all time is flawed!

Source: Formatting byte size to human readable format | Programming.Guide

SI (1 k = 1,000)

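// Both methods below need these imports:
// import java.text.CharacterIterator;
// import java.text.StringCharacterIterator;
public static String humanReadableByteCountSI(long bytes) {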
    if (-1000 < bytes && bytes < 1000) {
        return bytes + " B";
    }
    CharacterIterator ci = new StringCharacterIterator("kMGTPE");
    while (bytes <= -999_950 || bytes >= 999_950) {
        bytes /= 1000;
        ci.next();
    }
    return String.format("%.1f %cB", bytes / 1000.0, ci.current());
}

Binary (1 Ki = 1,024)

public static String humanReadableByteCountBin(long bytes) {
    long absB = bytes == Long.MIN_VALUE ? Long.MAX_VALUE : Math.abs(bytes);
    if (absB < 1024) {
        return bytes + " B";
    }
    long value = absB;
    CharacterIterator ci = new StringCharacterIterator("KMGTPE");
    for (int i = 40; i >= 0 && absB > 0xfffccccccccccccL >> i; i -= 10) {
        value >>= 10;
        ci.next();
    }
    value *= Long.signum(bytes);
    return String.format("%.1f %ciB", value / 1024.0, ci.current());
}

Example output:

                             SI     BINARY

                  0:        0 B        0 B
                 27:       27 B       27 B
                999:      999 B      999 B
               1000:     1.0 kB     1000 B
               1023:     1.0 kB     1023 B
               1024:     1.0 kB    1.0 KiB
               1728:     1.7 kB    1.7 KiB
             110592:   110.6 kB  108.0 KiB
            7077888:     7.1 MB    6.8 MiB
          452984832:   453.0 MB  432.0 MiB
        28991029248:    29.0 GB   27.0 GiB
      1855425871872:     1.9 TB    1.7 TiB
9223372036854775807:     9.2 EB    8.0 EiB   (Long.MAX_VALUE)

byte to human readable size with npm package filesize

The SI units are (mostly) based on decimal fractions, and so are their prefixes:

kilo (k): 10³ = 1,000
mega (M): 10⁶ = 1,000,000
giga (G): 10⁹ = 1,000,000,000

When digital base 2 computers were developed, new prefixes were invented for them. Agreement about the values was soon reached, but it wasn't easy to find catchy names. Unfortunately, the names that eventually spread were the SI ones, so we ended up with a nice confusion:

kilo (K): 2¹⁰ = 1,024
mega (M): 2²⁰ = 1,048,576
giga (G): 2³⁰ = 1,073,741,824

Then, someone invented some new names that were arguably not as bad as previous ones, but it was too late and almost nobody uses them:

kibi (Ki): 2¹⁰ = 1,024
mebi (Mi): 2²⁰ = 1,048,576
gibi (Gi): 2³⁰ = 1,073,741,824

In computers almost everything is a power of 2 so decimal-based units are usually avoided because they are never round.

In your example, using base 2 and base 10 prefixes renders this:

10485760000 / 2³⁰ = 9.765625 GiB
10485760000 / 10⁹ = 10.48576 GB

The value you want is probably the first one, given that it's a file size.
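The arithmetic above, and how far the two families of prefixes drift apart, can be reproduced with a short Python sketch. The helper name prefix_gap_percent is made up for this illustration.

```python
def prefix_gap_percent(power: int) -> float:
    """How much larger 1024**power is than 1000**power, in percent."""
    return (1024**power / 1000**power - 1) * 100


size = 10485760000  # the byte count from the example above

print(size / 2**30, "GiB")  # 9.765625 GiB
print(size / 10**9, "GB")   # 10.48576 GB

# The gap between the binary and decimal readings grows with each prefix.
for name, power in [("kilo/kibi", 1), ("mega/mebi", 2), ("giga/gibi", 3)]:
    print(f"{name}: binary is {prefix_gap_percent(power):.1f}% larger")
```

The gap is about 2.4% at the kilo level and about 7.4% at the giga level, so the ambiguity becomes more noticeable as sizes grow.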
