Why does os.path.getsize() return a negative number for a 10gb file?
Your Linux kernel obviously has large file support, since ls -l works correctly. Thus, it's your Python installation that is lacking the support. (Are you using your distribution's Python package? What distribution is it?)
The documentation on POSIX large file support in Python states that Python should typically make use of large file support if it is available on Linux. It also suggests configuring Python with the following command line:
CFLAGS='-D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64' OPT="-g -O2 $CFLAGS" \
./configure
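Before rebuilding, it may be worth checking how the interpreter you already have was built. The sketch below asks Python for the width of the C off_t type recorded at build time. It assumes a CPython build on a Unix-like system where the SIZEOF_OFF_T configuration variable is available; the helper name off_t_bytes is made up for illustration, and on other platforms the lookup may simply return None.

```python
import sysconfig

def off_t_bytes():
    """Return the build-time size of off_t in bytes, or None if unknown."""
    # SIZEOF_OFF_T comes from the interpreter's build configuration;
    # it is assumed (not guaranteed) to be present on typical Unix builds.
    value = sysconfig.get_config_var("SIZEOF_OFF_T")
    return int(value) if value is not None else None

size = off_t_bytes()
if size is None:
    print("off_t size not recorded for this build")
elif size >= 8:
    print("64-bit file offsets: large files should be reported correctly")
else:
    print("32-bit file offsets: sizes of 2 GiB or more may overflow")
```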
And finally, quoting the man page of the stat system call:
This can occur when an application compiled on a 32-bit platform without -D_FILE_OFFSET_BITS=64 calls stat() on a file whose size exceeds (1<<31)-1 bits.
(I believe the last word should be "bytes".)
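To see why a size above (1<<31)-1 can show up as a negative number, here is a small sketch of what happens when a byte count is squeezed into a signed 32-bit integer. It only illustrates the arithmetic of two's-complement truncation; it is not a claim about the exact code path in any particular Python build, and the helper name as_signed_32 is made up for this example.

```python
def as_signed_32(n):
    """Interpret the low 32 bits of n as a signed two's-complement integer."""
    return (n + 2**31) % 2**32 - 2**31

GIB = 1024**3
print(as_signed_32(2 * GIB - 1))  # 2147483647, the largest representable size
print(as_signed_32(3 * GIB))      # -1073741824
print(as_signed_32(10 * GIB))     # -2147483648
```

A file of exactly 3 GiB or 10 GiB is assumed here; a real file of "about 10gb" would wrap to a different value, which need not be negative.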
os.path.getsize() returns negative filesize for large files (for 3GB file size)
Clearly, something is wrong with your Linux distribution's build of Python. Rather than fix the actual problem, it might be easier to just work around it:
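import os
import subprocess

def getsize_workaround(filename):
    size = os.path.getsize(filename)
    if size < 0:
        # Fall back to the size field (5th column) of `ls -l`.
        # An argument list is used instead of a shell string so that
        # spaces or quotes in the filename cannot break the command.
        out = subprocess.check_output(["ls", "-l", "--", filename])
        size = int(out.split()[4])
    return size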
How to check whether a file is empty or not
>>> import os
>>> os.stat("file").st_size == 0
True
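The same check can be wrapped in a small function. This is only a sketch: the name is_empty and the temporary file used to exercise it are made up for illustration, and the function raises the usual OSError if the path does not exist.

```python
import os
import tempfile

def is_empty(path):
    """Return True if the file at path has a size of zero bytes."""
    return os.stat(path).st_size == 0

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.bin")
    open(path, "wb").close()        # create an empty file
    print(is_empty(path))           # True
    with open(path, "wb") as f:
        f.write(b"x")               # now the file holds one byte
    print(is_empty(path))           # False
```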
Get human readable version of file size?
Addressing the above "too small a task to require a library" issue by a straightforward implementation (using f-strings, so Python 3.6+):
def sizeof_fmt(num, suffix="B"):
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f}Yi{suffix}"
Supports:
- all currently known binary prefixes
- negative and positive numbers
- numbers larger than 1000 Yobibytes
- arbitrary units (maybe you like to count in Gibibits!)
Example:
>>> sizeof_fmt(168963795964)
'157.4GiB'
by Fred Cirera
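The properties listed under "Supports" can be checked with a few calls. The function is repeated here only so the snippet runs on its own; the "bit" suffix is just an illustrative choice of unit.

```python
def sizeof_fmt(num, suffix="B"):
    for unit in ["", "Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi"]:
        if abs(num) < 1024.0:
            return f"{num:3.1f}{unit}{suffix}"
        num /= 1024.0
    return f"{num:.1f}Yi{suffix}"

print(sizeof_fmt(0))                      # 0.0B
print(sizeof_fmt(-2048))                  # -2.0KiB (negative numbers)
print(sizeof_fmt(2**30, suffix="bit"))    # 1.0Gibit (arbitrary units)
print(sizeof_fmt(2000 * 1024**8))         # 2000.0YiB (beyond 1000 Yobibytes)
```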
Missing bytes in Python when writing binary files?
If you actually compare the two binary files (on Unix, use the cmp command), you will see that they are identical.
EDIT: As correctly pointed out by John in his answer, the difference in byte size is due to not closing the file before measuring its length. The correct line in the code should be copyFile.close() [invoking the method] instead of copyFile.close [which is the method object].
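The effect can be reproduced with a short sketch. The helper name size_before_and_after_close and the temporary file are made up for illustration; the original code's copyFile is replaced by a plain local variable. With Python's default buffering, the first size is typically smaller than the second because the written bytes are still sitting in the buffer, though the exact value depends on the buffer size and the amount of data.

```python
import os
import tempfile

def size_before_and_after_close(data):
    """Write data to a temp file; return its on-disk size before and after close()."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        f = open(path, "wb")
        f.write(data)
        f.close            # no parentheses: the method is looked up, not called
        before = os.path.getsize(path)
        f.close()          # flushes the buffer and actually closes the file
        after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)

before, after = size_before_and_after_close(b"x" * 100)
print(before, after)
```

Opening the file in a with block avoids the problem altogether, since the file is closed when the block exits.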
How can I get a file's size in C?
You need to seek to the end of the file and then ask for the position:
fseek(fp, 0L, SEEK_END);
sz = ftell(fp);
You can then seek back, e.g.:
fseek(fp, 0L, SEEK_SET);
or (if seeking to go to the beginning)
rewind(fp);