Why Does os.path.getsize() Return a Negative Number for a 10 GB File

Why does os.path.getsize() return a negative number for a 10 GB file?

Your Linux kernel obviously has large file support, since ls -l works correctly. Thus, it's your Python installation that is lacking the support. (Are you using your distribution's Python package? What distribution is it?)

The documentation on POSIX large file support in Python states that Python should typically use large file support when it is available on Linux. It also suggests configuring Python with the following command line:

CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" \
./configure

And finally, quoting the man page of the stat system call:

This can occur when an application compiled on a 32-bit platform without -D_FILE_OFFSET_BITS=64 calls stat() on a file whose size exceeds (1<<31)-1 bits.

(I believe the last word should be "bytes".)
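To see why the result comes out negative, here is a minimal sketch that mimics truncating a byte count to a signed 32-bit integer, which is roughly what happens when the size no longer fits in a 32-bit off_t. The helper name wrap_int32 is made up for illustration and is not part of Python or libc; the exact value a broken build reports may differ.

```python
def wrap_int32(n):
    """Interpret the low 32 bits of n as a signed 32-bit integer (illustrative helper)."""
    n &= 0xFFFFFFFF                      # keep only the low 32 bits
    return n - (1 << 32) if n >= (1 << 31) else n


print(wrap_int32(2**31 - 1))      # 2147483647, the largest size that still fits
print(wrap_int32(3 * 1024**3))    # -1073741824 for a 3 GiB file
print(wrap_int32(10 * 1024**3))   # -2147483648 for a 10 GiB file
```

Anything larger than (1<<31)-1 bytes either wraps like this or makes stat() fail with an overflow error, depending on how the library was built.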

os.path.getsize() returns negative filesize for large files (for 3GB file size)

Clearly, something is wrong with your Linux distribution's build of Python. Rather than fix the actual problem, it might be easier to just work around it:

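import os
import subprocess

def getsize_workaround(filename):
    size = os.path.getsize(filename)
    if size < 0:
        # Fall back to ls. Passing the arguments as a list avoids shell
        # quoting problems with unusual filenames.
        out = subprocess.check_output(["ls", "-l", "--", filename])
        # The size is the fifth whitespace-separated field of ls -l output.
        size = int(out.split()[4])
    return size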

How to check whether a file is empty or not

>>> import os
>>> os.stat("file").st_size == 0
True

Get human readable version of file size?

Addressing the above "too small a task to require a library" issue by a straightforward implementation (using f-strings, so Python 3.6+):

def sizeof_fmt(num, suffix="B"):
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f}Yi{suffix}"

Supports:

  • all currently known binary prefixes
  • negative and positive numbers
  • numbers larger than 1000 Yobibytes
  • arbitrary units (maybe you like to count in Gibibits!)

Example:

>>> sizeof_fmt(168963795964)
'157.4GiB'

by Fred Cirera
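A few more calls may help illustrate the "Supports" list above. This sketch repeats the function so it runs on its own; the inputs are arbitrary examples chosen here, and the "b" suffix for bits is just one possible convention.

```python
def sizeof_fmt(num, suffix="B"):
    # Divide by 1024 until the value drops below one unit step.
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    # Anything left over is expressed in yobi-units, however large.
    return f"{num:.1f}Yi{suffix}"


print(sizeof_fmt(0))                       # 0.0B
print(sizeof_fmt(-1536))                   # -1.5KiB (negative numbers)
print(sizeof_fmt(8 * 1024**3, suffix="b")) # 8.0Gib (counting in gibibits)
print(sizeof_fmt(1024**9))                 # 1024.0YiB (more than 1000 yobibytes)
```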

Missing bytes in Python when writing binary files?

If you actually compare the two binary files (on Unix you can use the cmp command), you will see that they are identical.

EDIT: As correctly pointed out by John in his answer, the difference in byte size is due to not closing the file before measuring its length. The correct line in the code should be copyFile.close() [invoking the method] instead of copyFile.close [which is the method object].
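One way to avoid forgetting the parentheses on close() is to let a with block close the files. The following is only a sketch of that idea, not the asker's original code; the function name copy_file and the chunk size are assumptions made for illustration.

```python
import os


def copy_file(src, dst, chunk_size=64 * 1024):
    """Copy src to dst and return the size of dst in bytes (illustrative helper)."""
    # Both files are flushed and closed when the with block exits,
    # even if an exception is raised while copying.
    with open(src, "rb") as source, open(dst, "wb") as target:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            target.write(chunk)
    # Measure only after the files are closed, so buffered data is on disk.
    return os.path.getsize(dst)
```

Because the size is measured after the with block, the copy should report the same length as the source.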

How can I get a file's size in C?

You need to seek to the end of the file and then ask for the position:

fseek(fp, 0L, SEEK_END);
sz = ftell(fp);

You can then seek back, e.g.:

fseek(fp, 0L, SEEK_SET);

or, if you want to go back to the beginning:

rewind(fp);
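For comparison with the Python snippets earlier on this page, the same seek-to-end idea can be sketched in Python. The function name size_by_seeking is made up here, and the sketch assumes a seekable file object opened in binary mode.

```python
import os


def size_by_seeking(fp):
    """Return the size of a seekable binary file object without moving its position."""
    pos = fp.tell()                   # remember where we were
    size = fp.seek(0, os.SEEK_END)    # seek() returns the new absolute position
    fp.seek(pos, os.SEEK_SET)         # restore the original position
    return size
```

Unlike the C fragment, this restores the previous position rather than rewinding to the start.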

