Using Backslash in Python (Not to Escape)

How can I put an actual backslash in a string literal (not use it for an escape sequence)?

To answer your question directly, put r in front of the string.

final = path + r'\xulrunner.exe ' + path + r'\application.ini'
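
To see why the r prefix matters for these particular filenames, here is a small sketch (the variable names are only for illustration). In a regular literal, \a is the escape for the BEL control character, so '\application.ini' silently loses its backslash. '\xulrunner.exe' would not even compile, because \x must be followed by two hex digits.

```python
# Regular literal: \a is a recognized escape (BEL, code point 7)
cooked = '\application.ini'
# Raw literal: the backslash stays a backslash
raw = r'\application.ini'

print(len(cooked), repr(cooked[0]))  # one character shorter; starts with BEL
print(len(raw), repr(raw[0]))        # starts with a real backslash
```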

But a better solution would be os.path.join:

final = os.path.join(path, 'xulrunner.exe') + ' ' + \
os.path.join(path, 'application.ini')

(the backslash there is escaping a newline, but you could put the whole thing on one line if you want)

I will mention that you can also use forward slashes in file paths on Windows. Python does not rewrite the string itself, but the Windows file APIs accept / as a separator, and os.path.normpath will convert it to a backslash if you need that. So

final = path + '/xulrunner.exe ' + path + '/application.ini'

should work. But it's still preferable to use os.path.join because that makes it clear what you're trying to do.
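
As a sketch of os.path.join in action, and of how os.path.normpath turns forward slashes into backslashes on Windows, the example below uses ntpath, the Windows implementation behind os.path, so that it prints the same result on any operating system. The directory C:\app is a made-up value for illustration, and the parentheses replace the backslash line continuation.

```python
import ntpath  # the Windows implementation behind os.path; importable on any OS

path = r'C:\app'  # hypothetical install directory, for illustration only

final = (
    ntpath.join(path, 'xulrunner.exe')
    + ' '
    + ntpath.join(path, 'application.ini')
)
print(final)

# normpath converts forward slashes to backslashes in the Windows flavor
print(ntpath.normpath('C:/app/xulrunner.exe'))
```

On Windows itself, os.path.join and os.path.normpath behave the same way as the ntpath calls shown here.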

Ignoring backslash character in Python

You don't have fwd in b. You have wd, preceded by ASCII codepoint 0C, the FORM FEED character. That's the value Python puts there when you use a \f escape sequence in a regular string literal.

To include a literal backslash, either double it or use a raw string literal:

b = '\\fwd'
b = r'\fwd'

Now a in b works:

>>> 'fwd' in '\\fwd'
True
>>> 'fwd' in r'\fwd'
True

See the String literals documentation:

Unless an 'r' or 'R' prefix is present, escape sequences in strings are interpreted according to rules similar to those used by Standard C. The recognized escape sequences are:

[...]

\f ASCII Formfeed (FF)
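
A quick way to confirm what the literal really contains is to inspect its length and first code point. This is a minimal sketch; the name fixed is only illustrative.

```python
b = '\fwd'  # \f is one character: FORM FEED (0x0C)

print(len(b))          # 3 characters, not 4
print(hex(ord(b[0])))  # code point of the first character
print('fwd' in b)      # False: there is no letter f in the string

fixed = r'\fwd'        # raw literal keeps the backslash and the f
print(len(fixed), 'fwd' in fixed)
```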

How can I print a single backslash?

You need to escape your backslash by preceding it with, yes, another backslash:

print("\\")

And for versions prior to Python 3:

print "\\"

The \ character is called an escape character: it changes how the character following it is interpreted. For example, n by itself is simply a letter, but when you precede it with a backslash, it becomes \n, which is the newline character.

As you can probably guess, \ also needs to be escaped so it doesn't function like an escape character. You have to... escape the escape, essentially.

See the Python 3 documentation for string literals.
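
As a small illustration (the variable name is arbitrary), the two characters typed in the source produce a one-character string:

```python
backslash = "\\"       # two characters in the source, one in the string

print(backslash)        # a single backslash
print(len(backslash))   # 1
print(ord(backslash))   # 92, the code point of the backslash

# \n likewise stands for a single newline character
print("a\nb" == "a" + chr(10) + "b")
```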

Quoting backslashes in Python string literals

You're being misled by the output -- the second approach you're taking actually does what you want, you just aren't believing it. :)

>>> foo = 'baz "\\"'
>>> foo
'baz "\\"'
>>> print(foo)
baz "\"

Incidentally, there's another string form which might be a bit clearer:

>>> print(r'baz "\"')
baz "\"
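
The difference comes down to repr() versus str(): the interactive prompt echoes the repr, while print() writes the characters themselves. A minimal sketch of that distinction:

```python
foo = 'baz "\\"'

print(repr(foo))        # what the prompt echoes: the backslash appears doubled
print(str(foo))         # what print() writes: the actual characters
print(foo.count('\\'))  # number of backslashes really stored in the string
```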

How to ignore backslashes as escape characters in Python?

Prefix the string with r (for "raw") and the backslashes will be interpreted literally, without escape processing:

>>> # Your original
>>> print('''
... /\\/\\/\\/\\/\\
... \\/\\/\\/\\/\\/
... ''')

/\/\/\/\/\
\/\/\/\/\/

>>> # as a raw string instead
>>> print(r'''
... /\\/\\/\\/\\/\\
... \\/\\/\\/\\/\\/
... ''')

/\\/\\/\\/\\/\\
\\/\\/\\/\\/\\/

These are often used for regular expressions, where it gets tedious to double every backslash. There are a few other prefix letters as well, including f (for format strings, which act differently), b (a literal bytes object, instead of a string), and u, which used to designate Unicode strings in Python 2 and is still accepted but has no effect in Python 3.
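
The following sketch shows the prefixes side by side. The sample text and names are invented for illustration.

```python
import re

# Both literals describe the same pattern; the raw one needs no doubling
same = r'\d+\.\d+' == '\\d+\\.\\d+'
print(same)

match = re.search(r'\d+\.\d+', 'version 3.12 released')
print(match.group())

name = 'world'
print(f'hello {name}')    # f: expressions in braces are evaluated
print(type(b'\\'))        # b: a bytes object
print(u'text' == 'text')  # u: no effect in Python 3
print(rb'\n' == b'\\n')   # prefixes can be combined, e.g. rb
```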

How does Python interpret backslash in string?

From your follow-up comment:

What puzzled me is in my example, it doesn't escape. Single backslash produces double backslashes. Double backslashes produce Double backslashes. Triple backslashes produce quadruple backslashes.....

To be clear: your first output is a string with one backslash in it. Python displays two backslashes in its representation of the string.

When you input the string with a single backslash, Python does not treat the sequence \] in the input as any special escape sequence, and therefore the \ is turned into an actual backslash in the actual string, and the ] into a closing square bracket. Quoting from the documentation linked by Klaus D.:

Unlike Standard C, all unrecognized escape sequences are left in the string unchanged, i.e., the backslash is left in the result. (This behavior is useful when debugging: if an escape sequence is mistyped, the resulting output is more easily recognized as broken.)

When you input the string with a double backslash, the sequence \\ is an escape sequence for a single backslash, and then the ] is just a ].

Either way, when Python displays the string back to you, it uses \\ for the single actual backslash, because it does not look ahead to determine that a single backslash would work - the backslash always gets escaped.
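
This can be checked directly. The sketch below parses the single-backslash literal from source text with ast.literal_eval, because recent Python versions emit a warning for unrecognized escapes such as \] (a DeprecationWarning since 3.6, a SyntaxWarning since 3.12). The warning is silenced here only to keep the demonstration quiet.

```python
import ast
import warnings

# Parse the literal '\]' from source text, ignoring the escape warning
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    single = ast.literal_eval(r"'\]'")

double = '\\]'  # \\ is the recognized escape for one backslash

print(single == double)  # both spellings give the same string
print(len(single))       # two characters: a backslash and ]
print(repr(single))      # the repr always doubles the backslash
```

Because of that warning, new code is better off writing '\\]' or r'\]' explicitly.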


To go into a little more detail: Python doesn't care about how you specified the string in the first place - it has a specific "normalized" form that depends only on what the string actually contains. We can see this by playing around with the different ways to quote a string:

>>> 'foo'
'foo'
>>> "foo"
'foo'
>>> r'foo'
'foo'
>>> """foo"""
'foo'

The normalized form will use double quotes if that avoids escape sequences for single quotes:

>>> '\'\'\''
"'''"

But it will switch back to single quotes if the string contains both types of quote:

>>> '\'"'
'\'"'
>>> "'\"'
'\'"'

(Exercise: how many characters are actually in this string, and what are they? How many backslashes does the string contain?)


It contains two characters - a single-quote and a double-quote - and no backslashes.
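
The answer to the exercise can be verified with a short sketch (s1 and s2 are just illustrative names):

```python
s1 = '\'"'  # single-quoted literal, escaping the single quote
s2 = "'\""  # double-quoted literal, escaping the double quote

print(s1 == s2)              # same contents
print(len(s1), list(s1))     # two characters: ' and "
print(s1.count('\\'))        # no backslashes stored in the string
print(repr(s1) == repr(s2))  # identical normalized form
```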

How to escape with a backslash only the characters that are not already escaped

Use re.sub() to match every occurrence of the characters you want to escape that isn't already prefixed by a backslash \.

  1. Initialize CHARS_TO_ESCAPE with all the characters that you want to escape, e.g. @[. There is no need to prefix them with a backslash \ at this point; just list the characters themselves (unless you want to escape the backslash itself, in which case you need to write it as \\, since Python string literals also use it as an escape character).
  2. Since we will be using regex, we have to escape the characters in CHARS_TO_ESCAPE that are special in regex patterns, such as [, ], (, ), {, }, -, ^, etc. We can use re.escape() for this.
  3. Construct a regex pattern that matches every occurrence of a character in CHARS_TO_ESCAPE that isn't preceded by a backslash \. Here we used (?<!\\)(@|\[).
    • (?<!\\) - Negative lookbehind: match only if the previous character is not a backslash.
    • (@|\[) - Capture group 1, which is any of the characters in CHARS_TO_ESCAPE. Notice that [ is prefixed here with \. This is not your escape character but the regex escape character (confusingly, both are the backslash \).
  4. Replace each match (those that aren't already prefixed with a backslash \) with a backslash-prefixed version via \\\1, where group 1 is as described in the previous step.
import re

CHARS_TO_ESCAPE = "@["  # Add here all characters that you want to escape
# Equivalent to: CHARS_TO_ESCAPE_RE = r"(?<!\\)(@|\[)"
CHARS_TO_ESCAPE_RE = (
    r"(?<!\\)("
    + r"|".join(map(re.escape, CHARS_TO_ESCAPE))
    + r")"
)
print(f"{CHARS_TO_ESCAPE_RE=}")

# Raw string, so the backslashes already in the text are kept as-is
text = r"@gmail\.com\> \@hotmail.com @yahoomail.com test1[ test2\["
text = re.sub(CHARS_TO_ESCAPE_RE, r"\\\1", text)
print(text)

Output:

CHARS_TO_ESCAPE_RE='(?<!\\\\)(@|\\[)'
\@gmail\.com\> \@hotmail.com \@yahoomail.com test1\[ test2\[
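
The same steps can be wrapped in a reusable function. This is only a sketch: escape_unescaped is a hypothetical helper name, not something from the re module, and it assumes chars is non-empty.

```python
import re

def escape_unescaped(text, chars):
    """Prefix each character from chars with a backslash unless one is already there."""
    # Build (?<!\\)(c1|c2|...) with every character regex-escaped
    pattern = r"(?<!\\)(" + "|".join(map(re.escape, chars)) + r")"
    return re.sub(pattern, r"\\\1", text)

once = escape_unescaped(r"a@b \@c [", "@[")
twice = escape_unescaped(once, "@[")
print(once)
print(once == twice)  # running it again changes nothing
```

One limitation of the lookbehind: it only inspects a single preceding character, so in input like \\@ (an escaped backslash followed by @) the @ is treated as already escaped.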

