Iterate Over Individual Bytes in Python 3

If you are concerned about the performance of this code, and an int per byte is not a suitable interface in your case, you should probably reconsider the data structures you use, e.g., use str objects instead.

You could slice the bytes object to get 1-length bytes objects:

L = [bytes_obj[i:i+1] for i in range(len(bytes_obj))]
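To make the difference concrete, here is a minimal sketch that contrasts plain iteration with the slicing approach. The sample value of bytes_obj is an assumption chosen only for illustration.

```python
bytes_obj = b'abc'  # sample data, assumed for illustration

# Iterating a bytes object directly yields ints in range(256).
as_ints = list(bytes_obj)

# Slicing with a width of 1 keeps each element as a bytes object.
as_slices = [bytes_obj[i:i+1] for i in range(len(bytes_obj))]

print(as_ints)    # [97, 98, 99]
print(as_slices)  # [b'a', b'b', b'c']
```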

PEP 467 (Minor API improvements for binary sequences) proposes a bytes.iterbytes() method for this. As a proposal, it may not be available in the Python version you run:

>>> list(b'123'.iterbytes())
[b'1', b'2', b'3']
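If that method is not available, one possible stand-in is struct.iter_unpack with the 'c' format, which yields each byte as a length-1 bytes object. The helper name iter_single_bytes is hypothetical and not part of any library; this is only a sketch of one way to get the same result.

```python
import struct

def iter_single_bytes(data):
    """Yield each byte of `data` as a length-1 bytes object (hypothetical helper)."""
    # The 'c' format unpacks one char at a time; each item is a 1-tuple.
    for (chunk,) in struct.iter_unpack('c', data):
        yield chunk

result = list(iter_single_bytes(b'123'))
print(result)  # [b'1', b'2', b'3']
```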

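Iterate over individual bytes then save them into a file without altering the content

The problem is the use of bytes(x). For an int x, bytes(x) creates a zero-filled bytes object of length x rather than a single byte with that value. Changing it to x.to_bytes(1, 'big') solves the problem.

The code below shows the difference: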

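a = b'\xcf\x84o\xcf\x81\xce\xbdo\xcf\x82'
a.decode('utf-8')  # 'τoρνoς'

with open('./save.txt', 'wb') as save_file:
    for i in a:
        # Write this form to the file, not the others.
        save_file.write(i.to_bytes(1, 'big'))
        print(i.to_bytes(1, 'big'))  # a single byte with value i
        print(i)                     # the int value itself
        print(bytes(i))              # i zero bytes, which is not what you want
        print('----')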
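As a smaller check of the same point, the sketch below compares the three conversions for a single value. The value 207 is simply the first byte (0xcf) of the string above; bytes([value]) is shown as an additional spelling that also produces one byte.

```python
value = 207  # 0xcf, the first byte of the sample string

zero_filled = bytes(value)           # 207 zero bytes, not the byte 0xcf
single_a = value.to_bytes(1, 'big')  # one byte with the value 207
single_b = bytes([value])            # same result, built from a one-item list

print(len(zero_filled))    # 207
print(single_a, single_b)  # b'\xcf' b'\xcf'
```

If nothing has to change per byte, writing the whole bytes object with one write() call produces the same file content.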

How to iterate over a bytes object in Python?

You do receive bytes, but you then have the requests library decode them (via the response.text attribute, which decodes the data automatically) and re-encode the result yourself:

response = filter_url(user).text.encode('utf-8')

Apart from just using the response.content attribute instead to avoid the decode -> encode round-trip, you should really just decode the data as JSON:

data = filter_url(user).json()

Now data is a list of dictionaries, and your perform_count() function can operate on that directly.
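The sketch below illustrates the idea without a network call. The raw payload is a hypothetical stand-in for response.content, and its field names are invented for the example; json.loads from the standard library accepts bytes directly.

```python
import json

# Hypothetical payload standing in for response.content (raw bytes).
raw = b'[{"user": "alice", "count": 2}, {"user": "bob", "count": 5}]'

# Iterating over the raw bytes yields ints, which is rarely useful here.
first_item = next(iter(raw))  # 91, the code for '['

# json.loads accepts bytes and returns ordinary Python objects.
data = json.loads(raw)

for entry in data:
    print(entry["user"], entry["count"])
```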

Why does a bytes object convert to an int when you get only one byte?

The bytes() function returns a bytes object, so b in your example is an immutable sequence of bytes. Each element of the sequence is an integer in the range 0 to 255, which is why indexing a single position gives you an int.
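A short sketch of the behaviour follows. The value assigned to b here is an assumption, since the original example is not shown.

```python
b = bytes([72, 105])  # assumed stand-in for the b in the question

element = b[0]  # indexing returns the element itself, an int
piece = b[0:1]  # slicing returns a new bytes object of length 1

print(b)                                # b'Hi'
print(element, type(element).__name__)  # 72 int
print(piece, type(piece).__name__)      # b'H' bytes
```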

Iterate over bytes with findall

The key is to look at the result of your findall(), which is just going to be:

[b'\x03LQ', b'\x03LQ', b'\x03LQ', ...]

You're only telling it to find a static string, so that's all it's going to return. To make the results useful, you can tell it to instead capture what comes after the given string. Here's an example that will grab everything after the given string until the next \x03 byte:

findall(rb'\x03LQ([^\x03]*)', data)

The parens tell findall() what part of the match you want, and [^\x03]* means "match any number of bytes that are not \x03". The result from your example should be:

[b'\x00\x00\x00\\\\Media\\Render_Drive\\mediafiles\\mxf\\k70255.2\\a08.56d829a7_56d82956d829a0.mxf\n', 
b'\x00\x00\x00\\\\Media\\Render_Drive\\mediafiles\\mxf\\k70255.2\\a07.56d829a6_56d82956d829a0.mxf']
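The effect of the capture group can be seen on a small input. The data below is hypothetical and much shorter than the real file contents; only the \x03LQ marker is taken from the example above.

```python
import re

# Hypothetical data: two records, each starting with the marker b'\x03LQ'.
data = b'\x03LQfirst.mxf\x03LQsecond.mxf'

# Without a group, findall returns only the literal marker.
markers = re.findall(rb'\x03LQ', data)

# With a group, findall returns what the group captured.
payloads = re.findall(rb'\x03LQ([^\x03]*)', data)

print(markers)   # [b'\x03LQ', b'\x03LQ']
print(payloads)  # [b'first.mxf', b'second.mxf']
```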

