How to .decode('string-escape') in Python 3

How do I .decode('string-escape') in Python 3?

If you want str-to-str decoding of escape sequences, so both input and output are Unicode:

def string_escape(s, encoding='utf-8'):
    return (s.encode('latin1')         # To bytes, required by 'unicode-escape'
             .decode('unicode-escape') # Perform the actual octal-escaping decode
             .encode('latin1')         # 1:1 mapping back to bytes
             .decode(encoding))        # Decode original encoding

Testing:

>>> string_escape('\\123omething special')
'Something special'

>>> string_escape(r's\000u\000p\000p\000o\000r\000t\000@'
...               r'\000p\000s\000i\000l\000o\000c\000.\000c\000o\000m\000',
...               'utf-16-le')
'support@psiloc.com'
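
To see why the two Latin-1 steps matter, here is a small self-contained sketch that feeds the same function a string whose escapes spell out UTF-8 bytes. The sample input is made up for illustration, and the function only works when the input itself can be encoded as Latin-1.

```python
def string_escape(s, encoding='utf-8'):
    """Decode backslash escapes in a str, treating them as bytes in `encoding`."""
    return (s.encode('latin1')
             .decode('unicode-escape')
             .encode('latin1')
             .decode(encoding))

# Hypothetical sample: the UTF-8 bytes of 'é' written as \x escapes.
escaped = r'caf\xc3\xa9'

# A plain unicode-escape decode turns each \xNN into its own code point.
naive = escaped.encode('latin1').decode('unicode-escape')

# The extra latin1 -> utf-8 round trip reassembles the two bytes into 'é'.
fixed = string_escape(escaped)

print(repr(naive))
print(repr(fixed))
```

Here naive holds two separate characters (U+00C3 and U+00A9), while fixed holds the single character 'é'.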

Using decode('unicode_escape') on a string in Python 3

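decode() applies to bytes, which you can create by encoding your string.

I would encode (using the default, UTF-8) and then decode with unicode-escape. Note that unicode-escape reads the bytes as Latin-1, so this is only reliable when the string is pure ASCII.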

>>> s = "Hello\\nWorld"
>>> s.encode()
b'Hello\\nWorld'
>>> s.encode().decode("unicode-escape")
'Hello\nWorld'
>>> print(s.encode().decode("unicode-escape"))
Hello
World
>>>
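
One caveat with the approach above: because unicode-escape reads its input as Latin-1, non-ASCII characters that were encoded as UTF-8 come out garbled. A possible workaround, sketched here with a hypothetical helper named decode_escapes, is to encode with Latin-1 and the backslashreplace error handler before decoding. The sample text is invented for illustration.

```python
def decode_escapes(s):
    """Hypothetical helper: process backslash escapes without mangling non-ASCII text."""
    # Characters outside Latin-1 become \uXXXX escapes, which the decoder restores.
    return s.encode('latin-1', 'backslashreplace').decode('unicode-escape')

text = 'café €5\\nnext line'

# Default encode() produces UTF-8 bytes, which unicode-escape misreads as Latin-1.
garbled = text.encode().decode('unicode-escape')
clean = decode_escapes(text)

print(repr(garbled))
print(repr(clean))
```

Both versions still treat every backslash in the input as the start of an escape, so this is only appropriate when that is what you want.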

Process escape sequences in a string in Python

The correct thing to do is to use the 'string_escape' codec (Python 2) or the 'unicode_escape' codec (Python 3) to decode the string.

>>> myString = "spam\\neggs"
>>> decoded_string = bytes(myString, "utf-8").decode("unicode_escape")  # Python 3
>>> # decoded_string = myString.decode('string_escape')  # Python 2 equivalent
>>> print(decoded_string)
spam
eggs

Don't use the ast module or eval for this. Using the string codecs is much safer.

How to encode Python 3 string using \u escape code?

You can use unicode_escape:

>>> thai_string = '\u0e2a\u0e35\u0e40'
>>> thai_string.encode('unicode_escape')
b'\\u0e2a\\u0e35\\u0e40'

Note that encode() will always return a byte string (bytes) and the unicode_escape encoding is intended to:

Produce a string that is suitable as Unicode literal in Python source code

Decoding escaped unicode in Python 3 from a non-ascii string

I was still very new to Python when I asked this question. Now I understand that these fallback mechanisms are just meant for handling unexpected errors, not something to save and restore data. If you really need a simple and reliable way to encode single unicode characters in ASCII, have a look at the quote and unquote functions from the urllib.parse module.
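
As a rough illustration of that suggestion, the sketch below round-trips a made-up sample string through quote and unquote. Both functions use UTF-8 by default.

```python
from urllib.parse import quote, unquote

original = 'café สี'

# quote() percent-encodes the UTF-8 bytes of every non-ASCII character
# (and, by default, the space as well).
ascii_form = quote(original)

# unquote() reverses the percent-encoding and decodes the bytes as UTF-8.
restored = unquote(ascii_form)

print(ascii_form)
print(restored)
```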

How to un-escape a backslash-escaped string?

>>> print '"Hello,\\nworld!"'.decode('string_escape')  # Python 2 only
"Hello,
world!"

Decoding string in python3

Thanks, @Mark Tolonen, for the help with the regex. With your version, the 'u' was still left in the name next to the decoded symbol, so I handled that edge case by:

  1. Finding each substring made of 'u' followed by 4 hex digits.
  2. Turning that substring into a Unicode escape by replacing 'u' with '\u'.
  3. Decoding the result with unicode-escape.

The code below works:

import re

def convert(s):
    # return re.sub(r'[0-9A-F]{4}', lambda m: chr(int(m.group(), 16)), s)
    escaped = re.sub(r'u[0-9A-F]{4}', lambda m: m.group().replace('u', '\\u'), s)
    return escaped.encode('utf-8').decode('unicode-escape')

Input:

str1 = 'Sabrau00AE Family Size Roasted Pine Nut Hummus - 17 oz'

Code:

str2 = convert(str1)
print(str2)
print(type(str2))

Output:

Sabra® Family Size Roasted Pine Nut Hummus - 17 oz
<class 'str'>
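
An alternative sketch that skips the encode/decode round trip: capture the four hex digits after 'u' and convert them with chr(). The name convert_direct is hypothetical. Like the code above, it assumes that any 'u' followed by four uppercase hex digits is an escape, which could misfire on ordinary text that happens to match.

```python
import re

def convert_direct(s):
    # Replace 'u' + 4 uppercase hex digits with the character at that code point.
    return re.sub(r'u([0-9A-F]{4})', lambda m: chr(int(m.group(1), 16)), s)

str1 = 'Sabrau00AE Family Size Roasted Pine Nut Hummus - 17 oz'
print(convert_direct(str1))
```

Because the string never passes through unicode-escape, any real non-ASCII characters already present in the input are left untouched.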

Revert unicode escape encoding in string (Python3)

Since you're first encoding with unicode_escape, then decoding with utf-8, the inverse operation would be the inverse of the individual operations, in reverse order:

>>> x = '\n'
>>> y = x.encode('unicode_escape').decode('utf-8')
>>> y.encode('utf-8').decode('unicode_escape')
'\n'

(The socks/shoes principle: You first put on your socks and then your shoes; to undo that you first remove your shoes and then your socks.)
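
The same idea can be wrapped in a pair of helpers. This is a minimal sketch with hypothetical names (to_escaped, from_escaped); it uses 'ascii' instead of 'utf-8' for the middle step, which is equivalent here because unicode_escape output contains only ASCII bytes.

```python
def to_escaped(s):
    # Encode to escape sequences, then view the ASCII-only bytes as a str.
    return s.encode('unicode_escape').decode('ascii')

def from_escaped(s):
    # The inverse steps, applied in reverse order.
    return s.encode('ascii').decode('unicode_escape')

sample = 'line one\nสี \\ done'
escaped = to_escaped(sample)

print(escaped)
print(from_escaped(escaped) == sample)
```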

Python 3 - decode escaped string

trouble_string = '{\"N\": \"Centr\\u00e1lna nervov\\u00e1 s\\u00fastava\"}'
result = trouble_string.encode().decode("unicode-escape")

Quote from docs:

unicode_escape - Produce a string that is suitable as Unicode literal in Python source code.
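
Since the sample string looks like JSON, a JSON parser may be a better fit than unicode-escape, because it decodes the \uXXXX escapes and leaves any real non-ASCII characters intact. This is only a sketch under the assumption that the input really is valid JSON.

```python
import json

trouble_string = '{\"N\": \"Centr\\u00e1lna nervov\\u00e1 s\\u00fastava\"}'

# json.loads decodes the \uXXXX escapes while parsing the object.
data = json.loads(trouble_string)
print(data['N'])
```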


