Reading binary file and looping over each byte
Python 2.4 and Earlier
f = open("myfile", "rb")
try:
    byte = f.read(1)
    while byte != "":
        # Do stuff with byte.
        byte = f.read(1)
finally:
    f.close()
Python 2.5-2.7
with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte != "":
        # Do stuff with byte.
        byte = f.read(1)
Note that the with statement is not available in versions of Python below 2.5. In 2.5 itself you need to enable it with a __future__ import:
from __future__ import with_statement
In 2.6 and later this import is not needed.
Python 3
In Python 3, it's a bit different. Reading from a file opened in binary mode no longer returns str characters but bytes objects, so the loop condition has to compare against b"" instead of "":
with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte != b"":
        # Do stuff with byte.
        byte = f.read(1)
Or, as benhoyt says, skip the inequality test and rely on the fact that b"" evaluates to false. This makes the code compatible between 2.6 and 3.x without any changes. It also saves you from changing the condition if you switch from binary mode to text mode or the reverse.
with open("myfile", "rb") as f:
    byte = f.read(1)
    while byte:
        # Do stuff with byte.
        byte = f.read(1)
Python 3.8
From Python 3.8 on, the := operator (assignment expression) lets you write the loop above more compactly:
with open("myfile", "rb") as f:
    while (byte := f.read(1)):
        # Do stuff with byte.
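To see the loop above end to end, here is a minimal sketch that writes a small throwaway file and collects every byte from it. The helper name read_bytes_one_at_a_time and the sample data are made up for illustration, and Python 3.8 or later is assumed because of the := operator.

```python
import os
import tempfile

def read_bytes_one_at_a_time(path):
    """Return a list of the length-1 bytes objects read from path."""
    collected = []
    with open(path, "rb") as f:
        # f.read(1) returns b"" at end of file, which is falsy and ends the loop.
        while (byte := f.read(1)):
            collected.append(byte)
    return collected

# Write a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00\x7a\xff")
    sample_path = tmp.name

try:
    result = read_bytes_one_at_a_time(sample_path)
finally:
    os.remove(sample_path)

print(result)  # [b'\x00', b'z', b'\xff']
```

Note that the zero byte b"\x00" does not stop the loop: it is a non-empty bytes object, so it is still truthy. Only the empty b"" returned at end of file is falsy.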
First byte skipped when reading binary file in python
The problem is that you read the first byte before the loop and then read another byte as soon as you enter the loop, so the first byte is skipped without ever being processed.
You should change it to:
f = open("GoldenFPGA.bit", "rb")
count = 0
print("#ifndef __CL_NX_BITSTREAM_HEADER_H")
print("#define __CL_NX_BITSTREAM_HEADER_H")
print("const uint8_t cl_nx_bitstream[] = ")
print("{")
print(" 0x7A, 0x00, 0x00, 0x00,")
print(" ", end='')
try:
    # Read the first byte once, before the loop.
    byte = f.read(1)
    while byte:
        # Do stuff with byte.
        print("0x" + byte.hex() + ", ", end='')
        count = count + 1
        if count % 8 == 0:
            print("\n ", end='')
        # Read the next byte only after the current one has been handled.
        byte = f.read(1)
finally:
    f.close()
print("\n};")
print("#endif")
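If the file fits in memory, one way to sidestep this kind of off-by-one entirely is to read all the bytes first and format them in a second step. The following is only a sketch of that idea: format_c_array and its per_line parameter are hypothetical names, not part of the original script, and the sample data stands in for the contents of the .bit file.

```python
def format_c_array(data, per_line=8):
    """Format bytes as comma-separated C hex literals, per_line per row."""
    rows = []
    for start in range(0, len(data), per_line):
        row = data[start:start + per_line]
        # Iterating a bytes object in Python 3 yields integers 0..255.
        rows.append("    " + ", ".join("0x%02x" % value for value in row) + ",")
    return "\n".join(rows)

# Stand-in for open("GoldenFPGA.bit", "rb").read()
sample = bytes(range(10))
formatted = format_c_array(sample)
print(formatted)
```

Because the formatter works on a complete bytes object, there is no read call inside the loop that could accidentally consume a byte before it is printed.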
Reading a binary file with python
Read the binary file content like this:
with open(fileName, mode='rb') as file: # b is important -> binary
    fileContent = file.read()
Then "unpack" the binary data using struct.unpack (this requires import struct):
The start bytes: struct.unpack("iiiii", fileContent[:20])
The body: skip the 20 heading bytes and the 4 trailing bytes (24 bytes in total). The remaining part forms the body. Integer-divide its length by 4 to get the number of 4-byte integers it contains, then multiply the string 'i' by that count to build the format string for unpack:
struct.unpack("i" * ((len(fileContent) - 24) // 4), fileContent[20:-4])
The end bytes: struct.unpack("i", fileContent[-4:])
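As a worked example of the three unpack calls, here is a sketch that builds a synthetic buffer with the same layout (five header integers, a body of integers, one trailing integer) and splits it again. The values are invented, file_content stands in for fileContent, and the "i" format is assumed to be 4 bytes wide, which struct.calcsize confirms at run time on the current platform.

```python
import struct

INT_SIZE = struct.calcsize("i")  # 4 on most platforms

# Synthetic stand-in for fileContent: 5 header ints, 3 body ints, 1 trailing int.
header_values = (1, 2, 3, 4, 5)
body_values = (10, 20, 30)
trailer_value = 99
file_content = struct.pack("i" * 9, *header_values, *body_values, trailer_value)

header_len = 5 * INT_SIZE

# The start bytes.
header = struct.unpack("iiiii", file_content[:header_len])

# The body: everything between the header and the last integer.
body_count = (len(file_content) - header_len - INT_SIZE) // INT_SIZE
body = struct.unpack("i" * body_count, file_content[header_len:-INT_SIZE])

# The end bytes: unpack always returns a tuple, even for a single value.
(trailer,) = struct.unpack("i", file_content[-INT_SIZE:])

print(header, body, trailer)
```

If the file was written on a different machine, an explicit byte-order prefix such as "<i" or ">i" may be needed instead of the native "i".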
What is the idiomatic way to iterate over a binary file?
I don't know of any built-in way to do this, but a wrapper function is easy enough to write:
def read_in_chunks(infile, chunk_size=1024*64):
    while True:
        chunk = infile.read(chunk_size)
        if chunk:
            yield chunk
        else:
            # The chunk was empty, which means we're at the end
            # of the file
            return
Then at the interactive prompt (open the file with 'rb' if you want binary chunks):
>>> from chunks import read_in_chunks
>>> infile = open('quicklisp.lisp', 'rb')
>>> for chunk in read_in_chunks(infile):
...     print(chunk)
...
<contents of quicklisp.lisp in chunks>
Of course, you can easily adapt this to use a with block:
with open('quicklisp.lisp', 'rb') as infile:
    for chunk in read_in_chunks(infile):
        print(chunk)
And you can eliminate the if statement like this.
def read_in_chunks(infile, chunk_size=1024*64):
    chunk = infile.read(chunk_size)
    while chunk:
        yield chunk
        chunk = infile.read(chunk_size)
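A quick way to check the generator without a real file is to feed it an in-memory stream. The sketch below uses io.BytesIO with a deliberately tiny chunk size; the sample data is made up. It also shows, as an alternative that is not part of the answer above, the two-argument form of the built-in iter, which calls a function repeatedly until it returns a sentinel value.

```python
import functools
import io

def read_in_chunks(infile, chunk_size=1024 * 64):
    """Yield successive chunks from infile until read() returns nothing."""
    chunk = infile.read(chunk_size)
    while chunk:
        yield chunk
        chunk = infile.read(chunk_size)

stream = io.BytesIO(b"abcdefghij")
chunks = list(read_in_chunks(stream, chunk_size=4))
print(chunks)  # [b'abcd', b'efgh', b'ij']

# Alternative: iter(callable, sentinel) stops when read(4) returns b"".
stream.seek(0)
chunks_via_iter = list(iter(functools.partial(stream.read, 4), b""))
print(chunks_via_iter)  # [b'abcd', b'efgh', b'ij']
```

The last chunk is shorter than chunk_size whenever the data length is not an exact multiple of it, so code that consumes the chunks should not assume a fixed length.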