Iterate over individual bytes in Python 3
If you are concerned about the performance of this code and an int as a byte is not a suitable interface in your case, then you should probably reconsider the data structures you use, e.g., use str objects instead.
You could slice the bytes object to get 1-length bytes objects:
L = [bytes_obj[i:i+1] for i in range(len(bytes_obj))]
There is PEP 0467 -- Minor API improvements for binary sequences, which proposes a bytes.iterbytes() method. It is only a proposal, so the call below does not work in released Python versions:
>>> list(b'123'.iterbytes())
[b'1', b'2', b'3']
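As a minimal sketch of what works today, the following compares plain iteration, slicing, and rebuilding 1-length bytes objects from ints. The sample value and the variable names are illustrative only.

```python
# Illustrative sample data; any bytes object behaves the same way.
bytes_obj = b'123'

# Plain iteration yields ints in Python 3.
as_ints = list(bytes_obj)

# Slicing yields 1-length bytes objects.
as_slices = [bytes_obj[i:i+1] for i in range(len(bytes_obj))]

# Wrapping each int in a one-element list rebuilds a 1-length bytes object.
as_rebuilt = [bytes([n]) for n in bytes_obj]

print(as_ints)     # [49, 50, 51]
print(as_slices)   # [b'1', b'2', b'3']
print(as_rebuilt)  # [b'1', b'2', b'3']
```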
Iterate over individual bytes then save them to a file without altering the content
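The problem is the use of bytes(x): when x is an int, bytes(x) creates a bytes object of x zero bytes rather than the single byte with value x. Changing it to x.to_bytes(1, 'big') solves the problem.
Use the code below to see the difference: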
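a = b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'
a.decode('utf-8')  # 'τoρνoς'
with open('./save.txt', 'wb') as save_file:
    for i in a:
        # Write this form to the file, not the others.
        save_file.write(i.to_bytes(1, 'big'))
        print(i.to_bytes(1, 'big'))  # the single byte, e.g. b'\xcf'
        print(i)  # the int value of the byte, e.g. 207
        print(bytes(i))  # i zero bytes, which is not the original content
        print('----')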
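To check that the content survives unchanged, a small round-trip sketch can help. It assumes a temporary file path (hypothetical, created only for this example) and compares what was written with the original bytes.

```python
import os
import tempfile

a = b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'

x = a[0]                         # 207, the int value of the first byte
one_byte = x.to_bytes(1, 'big')  # b'\xcf', the byte itself
zero_filled = bytes(x)           # 207 zero bytes, not the byte itself

# Write byte by byte to a temporary file (illustrative path), then read it back.
path = os.path.join(tempfile.mkdtemp(), 'save.txt')
with open(path, 'wb') as save_file:
    for i in a:
        save_file.write(i.to_bytes(1, 'big'))

with open(path, 'rb') as saved:
    round_trip = saved.read()

print(one_byte, len(zero_filled), round_trip == a)
```

If the loop is only there to copy the data, writing the whole object at once with save_file.write(a) gives the same file.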
How to iterate over a bytes object in Python?
You do receive bytes, yes, but you then have the requests library decode it (via the response.text attribute, which automatically decodes the data), which you then re-encode yourself:
response = filter_url(user).text.encode('utf-8')
Apart from just using the response.content attribute instead to avoid the decode -> encode round-trip, you should really just decode the data as JSON:
data = filter_url(user).json()
Now data is a list of dictionaries, and your perform_count() function can operate on that directly.
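The same idea can be sketched without a network call. The payload below is a hypothetical stand-in for response.content, and the "count" field is invented for illustration. The standard json module is used here in place of the requests .json() helper.

```python
import json

# Hypothetical payload standing in for response.content (raw bytes).
raw = b'[{"name": "alice", "count": 2}, {"name": "bob", "count": 5}]'

# Iterating over the raw bytes yields ints, which is rarely what you want.
first_items = list(raw[:3])

# Decoding as JSON gives Python objects; json.loads accepts bytes in Python 3.6+.
data = json.loads(raw)

total = sum(item["count"] for item in data)
print(first_items)  # [91, 123, 34]
print(total)        # 7
```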
Why does a bytes object convert to an int when you get only one byte?
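The bytes() function returns a bytes object. So, b in your example is an immutable sequence of bytes. Each element of the sequence is an unsigned int in the range 0 to 255 and is accessible via index, which is why indexing a single position gives you an int rather than a bytes object.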
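A short sketch of the difference between indexing and slicing, using an illustrative value for b:

```python
b = bytes([72, 105])  # b'Hi'

element = b[0]  # indexing returns an int
piece = b[0:1]  # slicing returns a bytes object of length 1

print(type(element).__name__, element)  # int 72
print(type(piece).__name__, piece)      # bytes b'H'
```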
Iterate over bytes with findall
The key is to look at the result of your findall(), which is just going to be:
[b'\x03LQ', b'\x03LQ', b'\x03LQ', ...]
You're only telling it to find a static string, so that's all it's going to return. To make the results useful, you can tell it to instead capture what comes after the given string. Here's an example that will grab everything after the given string until the next \x03 byte:
findall(rb'\x03LQ([^\x03]*)', data)
The parens tell findall() what part of the match you want, and [^\x03]* means "match any number of bytes that are not \x03". The result from your example should be:
[b'\x00\x00\x00\\\\Media\\Render_Drive\\mediafiles\\mxf\\k70255.2\\a08.56d829a7_56d82956d829a0.mxf\n',
b'\x00\x00\x00\\\\Media\\Render_Drive\\mediafiles\\mxf\\k70255.2\\a07.56d829a6_56d82956d829a0.mxf']
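A self-contained sketch of both patterns, assuming findall() is re.findall and using made-up data (two short records, each introduced by the \x03LQ marker):

```python
import re

# Hypothetical data: two records, each introduced by the marker b'\x03LQ'.
data = b'\x03LQ\x00first.mxf\n\x03LQ\x00second.mxf'

# Without a group, findall returns only the static marker itself.
markers = re.findall(rb'\x03LQ', data)

# With a capture group, findall returns what follows each marker.
payloads = re.findall(rb'\x03LQ([^\x03]*)', data)

print(markers)   # [b'\x03LQ', b'\x03LQ']
print(payloads)  # [b'\x00first.mxf\n', b'\x00second.mxf']
```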