How to convert 'binary string' to normal string in Python3?
Decode it.
>>> b'a string'.decode('ascii')
'a string'
To get bytes from string, encode it.
>>> 'a string'.encode('ascii')
b'a string'
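ASCII only covers code points below 128, so the same calls fail on text such as 'ä'. A minimal sketch of that limit, with UTF-8 as an assumed alternative codec (the sample string is made up for illustration):

```python
text = 'binär'  # sample string, chosen only for illustration

# ASCII cannot represent 'ä', so encoding raises UnicodeEncodeError
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    print('ascii failed:', exc.reason)

# UTF-8 can encode any str; 'ä' becomes the two bytes 0xC3 0xA4
data = text.encode('utf-8')
print(data)                   # b'bin\xc3\xa4r'
print(data.decode('utf-8'))   # binär
```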
How to convert a binary String to Original String
str_data = 'Hi'
# Pad every character to 7 bits so the fixed-width slicing below stays aligned
binarystr = ''.join(format(ord(x), '07b') for x in str_data)
String = ''
for i in range(0, len(binarystr), 7):
    String += chr(int(binarystr[i:i+7], 2))
print(String)
Binary to String/Text in Python
It looks like you are trying to decode ASCII characters from a binary string representation (bit string) of each character.
You can take each block of eight characters (a byte), convert that to an integer, and then convert that to a character with chr():
>>> X = "0110100001101001"
>>> print(chr(int(X[:8], 2)))
h
>>> print(chr(int(X[8:], 2)))
i
Assuming that the values encoded in the string are ASCII this will give you the characters. You can generalise it like this:
def decode_binary_string(s):
    return ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))
>>> decode_binary_string(X)
'hi'
If you want to keep it in the original encoding you don't need to decode any further. Usually you would convert the incoming string into a Python unicode string and that can be done like this (Python 2):
def decode_binary_string(s, encoding='UTF-8'):
    byte_string = ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))
    return byte_string.decode(encoding)
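In Python 3, str has no decode method, so the Python 2 version would fail there. A rough Python 3 sketch (the function name decode_binary_string_py3 is hypothetical) collects the 8-bit blocks into a bytes object first and then decodes it:

```python
def decode_binary_string_py3(s, encoding='UTF-8'):
    # Ignore a trailing partial block, like the len(s)//8 version does
    usable = len(s) - len(s) % 8
    # Turn each block of eight '0'/'1' characters into one byte value
    byte_values = bytes(int(s[i:i + 8], 2) for i in range(0, usable, 8))
    # Interpret the raw bytes in the given encoding
    return byte_values.decode(encoding)

print(decode_binary_string_py3("0110100001101001"))  # hi
```

Because the bytes are decoded as a whole, multi-byte UTF-8 sequences are handled too, which the per-character chr() approach cannot do.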
Converting binary to string
You need to set the base for int(), where res holds the space-separated binary values:
''.join(chr(int(val, 2)) for val in res.split(' '))
Output:
'This is a string'
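Since res comes from the original question, here is a self-contained sketch that assumes res was built by writing each character's code point in binary and joining the results with spaces:

```python
original = 'This is a string'

# Assumed construction of res: one binary number per character, space-separated
res = ' '.join(format(ord(ch), 'b') for ch in original)

# Parse each space-separated token as a base-2 integer and map it back to a character
decoded = ''.join(chr(int(val, 2)) for val in res.split(' '))
print(decoded)  # This is a string
```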
Convert base-2 binary number string to int
You use the built-in int() function, and pass it the base of the input number, i.e. 2 for a binary number:
>>> int('11111111', 2)
255
Here is documentation for Python 2, and for Python 3.
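A few related built-ins may help when going back and forth. This is only a sketch; the sample values are arbitrary:

```python
n = int('11111111', 2)
print(n)                    # 255

# int() also accepts an optional '0b' prefix when the base is 2
print(int('0b1010', 2))     # 10

# bin() and format() go the other way, from int to a binary string
print(bin(n))               # 0b11111111
print(format(5, '08b'))     # 00000101

# Digits other than 0 and 1 raise ValueError
try:
    int('102', 2)
except ValueError:
    print('not a valid base-2 number')
```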
Binary data gets written as string literal - how to convert it back to bytes?
Assuming type str for your original string, you have the following raw string (each escape is four literal characters, not an actual escape code representing one byte):
s = r"b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'"
If you remove the leading b' and the trailing ', you can use the latin1 encoding to convert to bytes. latin1 is a 1:1 mapping of Unicode code points to byte values, because the first 256 Unicode code points represent the latin1 character set:
>>> b = s[2:-1].encode('latin1')
>>> b
b'x\\x9c\\xabV*HL\\xd1\\xcd\\xccK\\xcbW\\xb2RPJ\\xcb\\xcfOJ,R\\xaa\\x05\\x00T\\x83\\x07b'
This is now a byte string, but contains literal escape codes. Now apply the unicode_escape encoding to translate back to a str of the actual code points:
>>> s2 = b.decode('unicode_escape')
>>> s2
'x\x9c«V*HLÑÍÌKËW²RPJËÏOJ,Rª\x05\x00T\x83\x07b'
This is now a Unicode string, with code points, but we still need a byte string. Encode with latin1 again:
>>> b2 = s2.encode('latin1')
>>> b2
b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'
In one step:
>>> s = r"b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'"
>>> b = s[2:-1].encode('latin1').decode('unicode_escape').encode('latin1')
>>> b
b'x\x9c\xabV*HL\xd1\xcd\xccK\xcbW\xb2RPJ\xcb\xcfOJ,R\xaa\x05\x00T\x83\x07b'
It appears this sample data is a zlib-compressed JSON string:
>>> import zlib,json
>>> json.loads(zlib.decompress(b))
{'pad-info': 'foobar'}
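As an alternative to the encode/decode chain, ast.literal_eval can parse the text of a bytes literal directly. This is a different technique from the one above, sketched on made-up sample bytes; it assumes the string holds a complete, well-formed b'...' literal:

```python
import ast

original = b'x\x9c\x00\xff pad'   # made-up sample bytes
text = repr(original)             # a str that looks like a bytes literal
print(text)

# literal_eval only evaluates Python literals, it does not run arbitrary code
recovered = ast.literal_eval(text)
print(recovered == original)      # True

# The latin1 / unicode_escape chain gives the same result on this sample
chained = text[2:-1].encode('latin1').decode('unicode_escape').encode('latin1')
print(chained == original)        # True
```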
Convert bytes to a string
Decode the bytes object to produce a string:
>>> b"abcde".decode("utf-8")
'abcde'
The above example assumes that the bytes object is in UTF-8, because it is a common encoding. However, you should use the encoding your data is actually in!
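To illustrate why the encoding matters, here is a small sketch of what can happen when the codec does not match the data (the sample bytes are chosen for illustration):

```python
data = 'ä'.encode('utf-8')        # b'\xc3\xa4'

print(data.decode('utf-8'))       # ä
# Decoding with the wrong codec succeeds silently but yields the wrong text
print(data.decode('latin1'))      # Ã¤

# Bytes that are invalid in the chosen codec raise UnicodeDecodeError ...
try:
    b'\xff'.decode('utf-8')
except UnicodeDecodeError:
    print('invalid UTF-8')

# ... unless an error handler is given; 'replace' substitutes U+FFFD
print(b'\xff'.decode('utf-8', errors='replace') == '\ufffd')   # True
```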
Python: Converting a Binary String to a Text File
You need some way of identifying character borders. If you limit this to a set bit length — like only 8-bit, you can pad the binary and then you'll know the character size. If you don't want to do this you need some other way.
Here is a method that doesn't care about the input — it handles spaces, emojis, etc. It does this by separating the characters in the binary with a space:
test_str = "Dies ist eine binäre Übersetzung. 🐻"
Binary = ' '.join(format(ord(i), 'b') for i in test_str)
print("original:")
print(test_str)
print("\nThe string after Binary conversion : \n" + Binary)
text = "".join(chr(int(s, 2)) for s in Binary.split())
print(f'\nString after conversion back to text:\n{text}')
This prints:
original:
Dies ist eine binäre Übersetzung. 🐻

The string after Binary conversion :
1000100 1101001 1100101 1110011
100000 1101001 1110011 1110100 100000 1100101 1101001 1101110 1100101
100000 1100010 1101001 1101110 11100100 1110010 1100101 100000
11011100 1100010 1100101 1110010 1110011 1100101 1110100 1111010
1110101 1101110 1100111 101110 100000 11111010000111011

String after conversion back to text:
Dies ist eine binäre Übersetzung. 🐻
Notice the last character for the emoji and how long the binary is. That could be a bear emoji or a couple of ASCII characters. Without a separator, there's no way to know.
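The fixed-width option mentioned at the start can be sketched by encoding the text to UTF-8 first and writing every byte as exactly 8 bits, so no separator is needed. The helper names text_to_bits and bits_to_text are hypothetical:

```python
def text_to_bits(text, encoding='utf-8'):
    # Every byte becomes exactly 8 binary digits, zero-padded on the left
    return ''.join(format(byte, '08b') for byte in text.encode(encoding))

def bits_to_text(bits, encoding='utf-8'):
    # Read the bit string back in 8-character blocks, one byte per block
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode(encoding)

sample = 'Übersetzung \U0001F43B'     # ends with the bear emoji
bits = text_to_bits(sample)
print(len(bits) % 8 == 0)             # True
print(bits_to_text(bits) == sample)   # True
```

The trade-off is that the bits now describe UTF-8 bytes rather than code points, so the reader of the bit string has to know which encoding was used.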