How to Un-Escape a Backslash-Escaped String

How to un-escape a backslash-escaped string?

>>> print '"Hello,\\nworld!"'.decode('string_escape')
"Hello,
world!"

How to store backslash unescaped string in a variable?

'hello\\world' is not doubly escaped. When Python shows the internal representation (the "repr") of a string, it escapes backslashes so that you, the person viewing this representation, know that \\ represents an actual, single backslash character inside the string, and not an escape sequence for another character.

When you call print, the string is rendered through another method, which is meant for program output, i.e. for the users of the program to consume. In this representation, "\\" is rendered as a single "\", and other sequences, such as "\n", "\t", "\b", are rendered as the real characters they represent ("\x0a", "\x09" and "\x08" in this case, or "LINE FEED", "TAB" and "BACKSPACE").

The former is rendered by Python through a call to the object's __repr__ method, and it is what any Python interactive environment uses to show the result of expressions. The latter rendering, used by print, takes place by calling the object's __str__ method instead. In code, instead of calling these methods directly, one should call the built-ins repr(...) and str(...) respectively.
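As a minimal sketch of that distinction, the hypothetical Label class below (invented for this example, not part of any library) defines both methods so you can see which one each built-in picks up:

```python
class Label:
    """Hypothetical class used only to show which method each built-in calls."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        # Developer-facing view: reuse the repr of the wrapped string.
        return f"Label({self.text!r})"

    def __str__(self):
        # User-facing view: the text itself, with no escaping added.
        return self.text


label = Label("Hello\\world!")  # the text holds one real backslash
print(repr(label))  # goes through __repr__
print(str(label))   # goes through __str__
print(label)        # print uses str() as well
```

The repr line shows the backslash doubled, while the two str-based lines show it once.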

Also, by using f-strings it is easy to interpolate the desired view of an object in another text-snippet. If you want the "str" view, just place the object as an expression between {} inside the f-string. If the internal representation is desired, before the closing }, include the !r sequence:

In [192]: a = "Hello\world!"                                                                                             

In [193]: a
Out[193]: 'Hello\\world!'

In [194]: print(a)
Hello\world!

In [195]: print(repr(a))
'Hello\\world!'

In [196]: print(f"*{a}*{a!r}*")
*Hello\world!*'Hello\\world!'*

As you can see, even when you type a single "\", if the character following it does not form a known escape sequence, the "\" is taken alone, but shown as "\\" in the repr, because we humans are under no obligation to know by heart which escape sequences are valid and which are not. On the other hand, typing a single "\" to mean a backslash in a string literal is quite dangerous, as there is a good chance of creating an unintended character. In Python 3.8 (in beta at the time of writing), this even yields a syntax warning:

Python 3.8.0b2+ (heads/3.8:028f1d2479, Jul 17 2019, 22:42:16) 
[GCC 9.1.1 20190503 (Red Hat 9.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = "hello\world!"
<stdin>:1: SyntaxWarning: invalid escape sequence \w

The way to avoid this warning is either to always type a doubled backslash (\\) or to use the r prefix for the string:

>>> a = r"hello\world!"
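A quick way to convince yourself that the two spellings are interchangeable is to compare them. This is only an illustrative sketch; the variable names are made up:

```python
doubled = "hello\\world!"  # escaped backslash in a normal literal
raw = r"hello\world!"      # raw literal: the backslash is taken as-is

print(doubled == raw)      # both spellings build the same string
print(raw.count("\\"))     # number of real backslash characters in it
```

Both literals produce a string with exactly one backslash; only the source-code spelling differs.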

Unescaping backslashes in a string

Assuming I have this super simple YAML file

lane: _L(\d\d\d)[_.]

and load it with PyYAML like this:

import yaml
import re

with open('test.yaml', 'rb') as stream:
    data = yaml.safe_load(stream)

lane_pattern = data['lane']
print(lane_pattern)

lane_expr = re.compile(data['lane'])
print(lane_expr)

Then the result is exactly as one would expect:

_L(\d\d\d)[_.]
re.compile('_L(\\d\\d\\d)[_.]')

There is no double escaping of strings going on when YAML is parsed, so there is nothing for you to unescape.
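To see that the pattern is usable as loaded, here is a small sketch that compiles the same text and applies it. The pattern is written inline as a raw string so the example runs without PyYAML, and the file name is a hypothetical one invented for the example:

```python
import re

# Same text as the value in the YAML file; the raw prefix keeps the
# backslashes exactly as they appear there.
lane_pattern = r'_L(\d\d\d)[_.]'
lane_expr = re.compile(lane_pattern)

# Hypothetical file name, made up for this example.
match = lane_expr.search('sample_S1_L001_R1.fastq')
lane = match.group(1) if match else None
print(lane)
```

The \d sequences reach the regex engine intact, so the three digits after _L are captured.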

Remove escape character from string

The character '\a' is the ASCII BEL character, chr(7).

To do the conversion in Python 2:

from __future__ import print_function
a = '\\a'
c = a.decode('string-escape')
print(repr(a), repr(c))

output

'\\a' '\x07'

And for future reference, in Python 3:

a = '\\a'
b = bytes(a, encoding='ascii')
c = b.decode('unicode-escape')
print(repr(a), repr(c))

This gives identical output to the above snippet.

In Python 3, if you were working with bytes objects you'd do something like this:

a = b'\\a'
c = bytes(a.decode('unicode-escape'), 'ascii')
print(repr(a), repr(c))

output

b'\\a' b'\x07'

As Antti Haapala mentions, this simple strategy for Python 3 won't work if the source string contains Unicode characters too. In that case, please see his answer for a more robust solution.
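One possible approach for strings that mix backslash escapes with non-ASCII characters is sketched below. It is not necessarily the solution from the answer referenced above, and the unescape helper name is made up for this example. It relies on the standard 'backslashreplace' error handler and the 'unicode-escape' codec:

```python
def unescape(text):
    """Decode backslash escapes in text while keeping non-ASCII characters."""
    # Characters outside Latin-1 are turned into \uXXXX style escapes by
    # 'backslashreplace'; Latin-1 characters become single bytes.
    raw = text.encode('latin-1', errors='backslashreplace')
    # 'unicode-escape' reads bytes as Latin-1 and decodes every escape,
    # including the ones produced by the step above.
    return raw.decode('unicode-escape')


# Source text: an accented letter, a literal backslash-a, a snowman,
# and a literal backslash-n.
source = 'caf\u00e9 \\a \u2603 \\n'
result = unescape(source)
print(repr(source), repr(result))
```

Note that this decodes every escape sequence present in the input, so it is only appropriate when the whole string is known to be backslash-escaped.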

How to remove backslash escaping from a javascript var?

You can replace a backslash followed by a quote with just a quote via a regular expression and the String#replace function:

var x = "<div class=\\\"abcdef\\\">";
x = x.replace(/\\"/g, '"');
document.body.appendChild(
  document.createTextNode("After: " + x)
);

Convert backslash-escaped characters to literals, within a string

Try using Regex.Unescape

using System.Text.RegularExpressions;
...

string result=Regex.Unescape(@"this\x20is a\ntest");

This results in:

this is a 
test

https://dotnetfiddle.net/y2f5GE

It might not always work as expected; please read the docs for details.

Swift string literal - remove escaping back slashes

You may be confusing how the IDE displays strings and the actual strings that would be printed. Setting breakpoints and looking at the debugger will show you C-escaped strings. If you print those values, the escaping will be gone. For example, if I create the literal string you're describing:

let string = #"[{"crawlLogic":{"startURL":"https://somesite.com/start","someParam":"\r\n\r\n/** Retrieves element's text either by class name""#

The debugger will print this value's debugDescription which will include escaping (such as \" and \\r, note the double-backslash):

print(string.debugDescription)
"[{\"crawlLogic\":{\"startURL\":\"https://somesite.com/start\",\"someParam\":\"\\r\\n\\r\\n/** Retrieves element\'s text either by class name\""

But those extra escapes aren't actually in the string:

print(string)
[{"crawlLogic":{"startURL":"https://somesite.com/start","someParam":"\r\n\r\n/** Retrieves element's text either by class name"

If the debugDescription has \r rather than \\r, then that's indicating you have a carriage return (UTF-8: 0x0d) in your string, not \r (UTF-8: 0x5c 0x72). If you need to convert carriage return + newline into \r\n, then you can do that with replacingOccurrences(of:with:), which returns a new string.

string.replacingOccurrences(of: "\r\n", with: #"\r\n"#)

This says to replace the string "carriage return, newline" with "backslash r backslash n."

(But I would first investigate why the string is in the wrong form to start with. I'm guessing you're constructing it with "\r\n" at some point when you meant #"\r\n"# or "\\r\\n". Or perhaps you're unescaping a string you were given when you shouldn't. Escaping and unescaping strings is very tricky. Try to build the string to hold the characters you want in the first place rather than trying to fix it later.)

If you continue to have trouble, I recommend converting your string into UTF-8 (Array(string.utf8)) and looking at the actual bytes. This will remove all ambiguity about what is in the string.
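The same byte-level check is easy to sketch in Python, in case you want to compare against a reference outside Swift. The variable names here are illustrative only:

```python
carriage_return = "\r"  # one character: CARRIAGE RETURN
backslash_r = r"\r"     # two characters: a backslash, then the letter r

for name, value in [("carriage return", carriage_return),
                    ("backslash r", backslash_r)]:
    # List each UTF-8 byte in hex so nothing is hidden by escaping.
    print(name, [hex(b) for b in value.encode("utf-8")])
```

The first value encodes to the single byte 0x0d and the second to the two bytes 0x5c 0x72, matching the values quoted above.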

Unescaping backslash in Swift

I don't think it is possible to do this automatically, however, as there are only a few escaped characters in Swift, you can put them into an array, loop through them, and then replace all instances with the unescaped version. Here's a String extension I made that does this:

import Foundation

extension String {
    var unescaped: String {
        let entities = ["\0", "\t", "\n", "\r", "\"", "\'", "\\"]
        var current = self
        for entity in entities {
            // debugDescription gives the escaped form wrapped in quotes; strip the quotes.
            let descriptionCharacters = entity.debugDescription.dropFirst().dropLast()
            let description = String(descriptionCharacters)
            current = current.replacingOccurrences(of: description, with: entity)
        }
        return current
    }
}

To use it, simply access the property. For example,

print("Hello,\\nWorld!".unescaped) 

will print

Hello,
World!

Unescaping escaped characters in a string using Python 3.2

To prevent special treatment of \ in a literal string you could use r prefix:

s = r'\n'
print(s)
# -> \n

If you have a string that contains a newline symbol (ord(s) == 10) and you would like to convert it to a form suitable as a Python literal:

s = '\n'
s = s.encode('unicode-escape').decode()
print(s)
# -> \n
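The conversion can also be reversed. The following sketch escapes a string and then unescapes it again; it assumes the escaped text is pure ASCII, which is always the case for the output of 'unicode-escape':

```python
original = 'tab\there\nnew line'

# Escape: real control characters become backslash sequences.
escaped = original.encode('unicode-escape').decode('ascii')

# Unescape: reverse the step above (safe because escaped is pure ASCII).
restored = escaped.encode('ascii').decode('unicode-escape')

print(escaped)
print(restored == original)
```

The escaped form prints on a single line, and the round trip gives back the original string.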

