Process escape sequences in a string in Python
The correct thing to do is to decode the string with an escape codec: 'string_escape' in Python 2, 'unicode_escape' in Python 3.
>>> myString = "spam\\neggs"
>>> decoded_string = bytes(myString, "utf-8").decode("unicode_escape") # python3
>>> decoded_string = myString.decode('string_escape') # python2
>>> print(decoded_string)
spam
eggs
Don't use the AST or eval. Using the string codecs is much safer.
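One caveat with the Python 3 line above: encoding to UTF-8 and then decoding with unicode_escape treats every byte as a Latin-1 character, so non-ASCII text gets mangled. Below is a minimal sketch of a commonly used workaround. The helper name decode_escapes is made up for illustration, and it assumes the input is ordinary text containing backslash escapes.

```python
def decode_escapes(text):
    """Interpret backslash escapes in text while leaving non-ASCII characters intact."""
    # Latin-1 maps code points 0-255 to single bytes; anything above that is
    # rewritten as a \uXXXX or \UXXXXXXXX escape by backslashreplace, which
    # unicode_escape then turns back into the original character.
    return text.encode("latin-1", "backslashreplace").decode("unicode_escape")


# The UTF-8 route splits the accented letter into two characters.
naive = bytes("caf\u00e9\\n", "utf-8").decode("unicode_escape")
print(repr(naive))

# The Latin-1 route keeps it as one character.
print(repr(decode_escapes("caf\u00e9\\n")))
```

For pure ASCII input the two approaches give the same result, so the difference only matters once accented or other non-ASCII characters show up.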
How do you input escape sequences in Python?
The input function returns exactly what the user typed. The \-escaping convention is something that happens in Python string literals: it is not a universal convention that applies to data stored in variables. If it were, then you could never store in a string variable the two characters \ followed by n, because they would be interpreted as ASCII 10 (a newline).
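To make the distinction concrete, here is a small sketch. The variable typed stands in for what input() would return if the user typed one\ntwo; it is an assumption for illustration, not part of the original code.

```python
typed = "one\\ntwo"      # what input() returns: backslash and n are two separate characters
literal = "one\ntwo"     # a string literal: the compiler turns \n into one newline character

print(len(typed))        # 8
print(len(literal))      # 7
print(ord("\n"))         # 10, the ASCII line feed
print(typed == literal)  # False
```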
You can do what you want this way:
import ast
import shlex
a=input("Input: ")
print(ast.literal_eval(shlex.quote(a)))
If in response to the Input: prompt you type one\ntwo, then this code will print
one
two
This works by turning the contents of a, which is one\ntwo, back into a quoted string that looks like 'one\ntwo' and then evaluating it as if it were a string literal. That brings the \-escaping convention back into play.
But it is very roundabout. Are you sure you want users of your program feeding it control characters?
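One thing to watch for, sketched below: shlex.quote only wraps its argument in quotes when the text contains characters that are unsafe in a shell (a backslash counts), so input without any such character reaches ast.literal_eval unquoted and is rejected. The helper name unescape_via_literal is hypothetical.

```python
import ast
import shlex


def unescape_via_literal(text):
    """Same approach as above: quote the text, then parse it as a string literal."""
    return ast.literal_eval(shlex.quote(text))


print(shlex.quote("one\\ntwo"))                 # quoted, because of the backslash
print(repr(unescape_via_literal("one\\ntwo")))

print(shlex.quote("hello"))                     # returned unchanged, no quotes added
try:
    unescape_via_literal("hello")
except ValueError as exc:
    print("rejected:", exc)
```

If plain input has to be accepted as well, a codec-based decode avoids the quoting step altogether.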
Why print returns \\, not a escape character \ in Python
Referring to String and Bytes literals, when Python sees a backslash in a string literal while compiling the program, it looks at the next character to see how the following characters are to be escaped. In the first case the following character is U, so Python knows it's a Unicode escape. In the final case, it sees {, realizes there is no such escape, and just emits the backslash and that { character.
In print('\{}'.format('U0001F602')) there are two different string literals, '\{}' and 'U0001F602'. That the first string will be processed at runtime with .format doesn't make the result a string literal at all; it's a composite value.
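A short sketch of the difference, and of two ways to get the character when the code point arrives as data rather than as part of a literal. The variable names are illustrative only.

```python
code = "0001F602"

# The backslash in the template is an ordinary character by the time
# .format runs, so the result is ten characters, not one emoji.
composed = "\\{}".format("U" + code)
print(len(composed))          # 10

# Option 1: compute the character from its code point.
from_int = chr(int(code, 16))

# Option 2: ask the unicode_escape codec to interpret the composed text.
from_codec = composed.encode("ascii").decode("unicode_escape")

print(from_int == from_codec == "\U0001F602")   # True
```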
How do I convert a string to an escape sequence in Python?
What you're trying to do is interpret the escape sequences in the original string, to get the corresponding character(s). Don't compute them yourself; call a decode() method. In Python 3 you'll only find it on bytes objects (not str), so you need to convert to a bytes object and back:
>>> bytes("\\xf0\\xfa", "utf-8").decode("unicode_escape")
'ðú'
How to un-escape a backslash-escaped string?
>>> print '"Hello,\\nworld!"'.decode('string_escape')
"Hello,
world!"
Python - How do I split a string that includes an escape character as a delimiter?
Convert your string to a raw string literal by writing it as r'string'.
Try this:
MyString = r'A\x92\xa4\xbf'
delim = '\\' + 'x' #OR simply: delim = '\\x'
MyList = MyString.split(delim)
print(MyList)
Output:
['A', '92', 'a4', 'bf']
As far as I know, this technique works for other escape sequences too, not just \x: set the delimiter to the doubled-backslash form, such as \\x. Working sample: https://repl.it/@stupidlylogical/RawStringPython
Works because:
Python raw string treats backslash (\) as a literal character. This is
useful when we want to have a string that contains backslash and don't
want it to be treated as an escape character.
Explanation:
When an 'r' or 'R' prefix is present, a character following a
backslash is included in the string without change, and all
backslashes are left in the string.
More: https://docs.python.org/2/reference/lexical_analysis.html#string-literals
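A sketch contrasting the raw and non-raw literals may help. It also shows that the r prefix only affects how a literal is written, so it cannot be applied after the fact to a string that already holds the decoded characters; the variable names are illustrative.

```python
raw = r'A\x92\xa4\xbf'       # 13 characters: backslashes kept as-is
cooked = 'A\x92\xa4\xbf'     # 4 characters: each \xNN became one character

print(len(raw), len(cooked))   # 13 4
print(raw.split('\\x'))        # ['A', '92', 'a4', 'bf']
print(cooked.split('\\x'))     # no backslash in the data, so nothing to split on

# For data that already holds the characters, format the code points instead.
print([format(ord(ch), '02x') for ch in cooked[1:]])   # ['92', 'a4', 'bf']
```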
How to format escape sequences inside a function
Since you aren't working with string literals, don't use escape sequences in the function.
def vhf(c):
    # c already holds the actual character (here a newline), not backslash + n
    print("...I want this %s escape sequence" % (c,))

vhf('\n')
How do I .decode('string-escape') in Python 3?
If you want str-to-str decoding of escape sequences, so both input and output are Unicode:
def string_escape(s, encoding='utf-8'):
    return (s.encode('latin1')         # To bytes, required by 'unicode-escape'
             .decode('unicode-escape') # Perform the actual octal-escaping decode
             .encode('latin1')         # 1:1 mapping back to bytes
             .decode(encoding))        # Decode original encoding
Testing:
>>> string_escape('\\123omething special')
'Something special'
>>> string_escape(r's\000u\000p\000p\000o\000r\000t\000@'
r'\000p\000s\000i\000l\000o\000c\000.\000c\000o\000m\000',
'utf-16-le')
'support@psiloc.com'