How to Convert 'Binary String' to Normal String in Python3

How to convert 'binary string' to normal string in Python3?

Decode it.

>>> b'a string'.decode('ascii')
'a string'

To get bytes from string, encode it.

>>> 'a string'.encode('ascii')
b'a string'
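
ASCII only covers code points 0 to 127, so the same pair of calls needs a wider encoding once the text contains other characters. A minimal sketch, assuming UTF-8 is the encoding you want (the sample text is made up for illustration):

```python
text = "café"

# UTF-8 can represent any Unicode character; "é" becomes two bytes.
data = text.encode("utf-8")
print(data)                  # b'caf\xc3\xa9'
print(data.decode("utf-8"))  # café

# ASCII cannot represent "é", so encoding with it fails.
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    print("ascii failed:", exc.reason)
```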

How to convert a binary String to Original String

str_data = 'Hi'
# Pad every character to 7 bits so the string can be split evenly (ASCII only)
binarystr = ''.join(format(ord(x), '07b') for x in str_data)
String = ''
for i in range(0, len(binarystr), 7):
    String += chr(int(binarystr[i:i+7], 2))
print(String)

Binary to String/Text in Python

It looks like you are trying to decode ASCII characters from a binary string representation (bit string) of each character.

You can take each block of eight characters (a byte), convert that to an integer, and then convert that to a character with chr():

>>> X = "0110100001101001"
>>> print(chr(int(X[:8], 2)))
h
>>> print(chr(int(X[8:], 2)))
i

Assuming that the values encoded in the string are ASCII this will give you the characters. You can generalise it like this:

def decode_binary_string(s):
    return ''.join(chr(int(s[i*8:i*8+8], 2)) for i in range(len(s)//8))

>>> decode_binary_string(X)
'hi'
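
An alternative that avoids slicing is to parse the whole bit string as one integer and let int.to_bytes() split it into bytes. This is only a sketch, and it assumes the length of the bit string is a multiple of eight:

```python
X = "0110100001101001"

# Parse the whole bit string as one integer, then split it into bytes.
# len(X) // 8 fixes the byte count, so leading zero bits are preserved.
n_bytes = len(X) // 8
data = int(X, 2).to_bytes(n_bytes, "big")
print(data)                  # b'hi'
print(data.decode("ascii"))  # hi
```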

If you want to keep the raw bytes in their original encoding, you don't need to decode any further. Usually, though, you would convert the incoming data into a Python str. In Python 3, build a bytes object from the 8-bit groups and decode it:

def decode_binary_string(s, encoding='UTF-8'):
    byte_string = bytes(int(s[i*8:i*8+8], 2) for i in range(len(s)//8))
    return byte_string.decode(encoding)
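
The reverse direction works the same way. Below is a minimal Python 3 sketch of a round trip through a bit string with 8 bits per byte, assuming UTF-8. The names text_to_bits and bits_to_text are illustrative and do not come from the answer above:

```python
def text_to_bits(text, encoding="utf-8"):
    """Encode text to bytes, then write each byte as exactly 8 bits."""
    return "".join(format(byte, "08b") for byte in text.encode(encoding))


def bits_to_text(bits, encoding="utf-8"):
    """Inverse: read 8 bits per byte, then decode the bytes to text."""
    if len(bits) % 8:
        raise ValueError("bit string length must be a multiple of 8")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)


bits = text_to_bits("é")
print(bits)                # 1100001110101001
print(bits_to_text(bits))  # é
```

Because every byte is padded to eight bits, multi-byte characters survive the round trip without any separator.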

Converting binary to string

You need to set the base for int:

''.join(chr(int(val, 2)) for val in res.split(' '))

Output:

'This is a string'
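
The variable res comes from the original question and is not shown here. A self-contained sketch, assuming res holds 8-bit groups separated by single spaces (the sample value is made up):

```python
# Hypothetical input: 8-bit groups separated by single spaces.
res = "01010100 01101000 01101001 01110011"

# Without a base, int() reads the digits as decimal, which is the wrong value.
print(int("01010100"))     # 1010100
print(int("01010100", 2))  # 84

text = "".join(chr(int(val, 2)) for val in res.split(" "))
print(text)  # This
```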

Convert base-2 binary number string to int

You use the built-in int() function, and pass it the base of the input number, i.e. 2 for a binary number:

>>> int('11111111', 2)
255

The base argument is described in the int() documentation for both Python 2 and Python 3.
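
A few related calls, shown as a quick sketch: int() also accepts a "0b" prefix, bin() and format() go in the other direction, and invalid digits raise ValueError.

```python
n = int("11111111", 2)
print(n)                 # 255

# A "0b" prefix is accepted when the base is 2, or inferred with base 0.
print(int("0b1010", 2))  # 10
print(int("0b1010", 0))  # 10

# Going the other way: bin() adds the prefix, format() lets you pad.
print(bin(n))            # 0b11111111
print(format(5, "08b"))  # 00000101

# Any digit other than 0 or 1 raises ValueError in base 2.
try:
    int("102", 2)
except ValueError:
    print("not a binary number")
```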

Binary data gets written as string literal - how to convert it back to bytes?

Assuming your original value is of type str, you have the following raw string. Each \x9c in it is four literal characters (backslash, x, 9, c), not an actual escape code representing one byte:

s = r"b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'"

If you remove the leading b' and the trailing ', you can use the latin1 encoding to convert to bytes. latin1 is a 1:1 mapping of Unicode code points to byte values, because the first 256 Unicode code points represent the latin1 character set:

>>> b = s[2:-1].encode('latin1')
>>> b
b'x\\x9c\\xabV*HL\\xd1\\xcd\\xccK\\xcbW\\xb2RPJ\\xcb\\xcfOJ,R\\xaa\\x05\\x00T\\x83\\x07b'

This is now a byte string, but it still contains literal escape codes. Decode it with the unicode_escape codec to translate those escapes back to a str of the actual code points:

>>> s2 = b.decode('unicode_escape')
>>> s2
'x\x9c«V*HLÑÍÌKËW²RPJËÏOJ,Rª\x05\x00T\x83\x07b'

This is now a Unicode string, with code points, but we still need a byte string. Encode with latin1 again:

>>> b2 = s2.encode('latin1')
>>> b2
b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'

In one step:

>>> s = r"b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'"
>>> b = s[2:-1].encode('latin1').decode('unicode_escape').encode('latin1')
>>> b
b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'

It appears this sample data is a zlib-compressed JSON string:

>>> import zlib,json
>>> json.loads(zlib.decompress(b))
{'pad-info': 'foobar'}
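
If the string is exactly the repr() of a bytes object, as it appears to be here, ast.literal_eval is a shorter route to the same result. This is a sketch under that assumption and not part of the answer above:

```python
import ast

s = r"b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'"

# literal_eval parses Python literals only (it does not run arbitrary code),
# and that includes bytes literals such as b'...'.
b = ast.literal_eval(s)
print(type(b))  # <class 'bytes'>
print(b[:2])    # b'x\x9c'
```

Unlike slicing off the quotes by hand, this raises an exception if the input is not a valid Python literal.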

Convert bytes to a string

Decode the bytes object to produce a string:

>>> b"abcde".decode("utf-8") 
'abcde'

The above example assumes that the bytes object is in UTF-8, because it is a common encoding. However, you should use the encoding your data is actually in!

Python: Converting a Binary String to a Text File

You need some way of identifying character boundaries. If you limit this to a fixed bit length, such as 8 bits, you can pad each character's binary to that width, and then you know where every character ends. If you don't want to do that, you need some other way.

Here is a method that doesn't care about the input — it handles spaces, emojis, etc. It does this by separating the characters in the binary with a space:

test_str = "Dies ist eine binäre Übersetzung. 🐻"
Binary = ' '.join(format(ord(i), 'b') for i in test_str)

print("original:")
print(test_str)

print("\nThe string after Binary conversion : \n" + Binary)

text = "".join(chr(int(s, 2)) for s in Binary.split())
print(f'\nString after conversion back to text:\n{text}')

This prints:

original:

Dies ist eine binäre Übersetzung. 🐻

The string after Binary conversion :

1000100 1101001 1100101 1110011
100000 1101001 1110011 1110100 100000 1100101 1101001 1101110 1100101
100000 1100010 1101001 1101110 11100100 1110010 1100101 100000
11011100 1100010 1100101 1110010 1110011 1100101 1110100 1111010
1110101 1101110 1100111 101110 100000 11111010000111011

String after conversion back to text:

Dies ist eine binäre Übersetzung. 🐻

Notice the last group, which is the emoji, and how long its binary is. Those bits could be a bear emoji or a couple of ASCII characters. Without a separator, there's no way to know.
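
The ambiguity is easy to reproduce with plain ASCII. In this sketch (to_bits is an illustrative helper name), two different strings produce the same unseparated bit string:

```python
def to_bits(text):
    # Variable width: each character uses only as many bits as it needs.
    return "".join(format(ord(ch), "b") for ch in text)


print(to_bits("!a"))  # 1000011100001
print(to_bits("C!"))  # 1000011100001
```

"!" needs 6 bits and "a" needs 7, while "C" needs 7 and "!" needs 6, so the same 13 bits can be split either way.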


