Why Do Backslashes Appear Twice

Why do backslashes appear twice?

What you are seeing is the representation of my_string created by its __repr__() method. If you print it, you can see that you've actually got single backslashes, just as you intended:

>>> print(my_string)
why\does\it\happen?

The string below has three characters in it, not four:

>>> 'a\\b'
'a\\b'
>>> len('a\\b')
3

You can get the standard representation of a string (or any other object) with the repr() built-in function:

>>> print(repr(my_string))
'why\\does\\it\\happen?'

Python represents backslashes in strings as \\ because the backslash is an escape character - for instance, \n represents a newline, and \t represents a tab.

This can sometimes get you into trouble:

>>> print("this\text\is\not\what\it\seems")
this ext\is
ot\what\it\seems

Because of this, there needs to be a way to tell Python you really want the two characters \n rather than a newline, and you do that by escaping the backslash itself, with another one:

>>> print("this\\text\is\what\you\\need")
this\text\is\what\you\need
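
The three ways of spelling such a string can be compared side by side. This is only a minimal sketch: the variable names are made up for illustration, and the text is shortened so that every escape in the first literal is a valid one.

```python
# Illustrative names only; none of these appear in the original answer.
cooked = "this\text\not"      # \t becomes a tab, \n becomes a newline
escaped = "this\\text\\not"   # each \\ becomes one literal backslash
raw = r"this\text\not"        # raw literal: backslashes are kept as typed

print(len(cooked))     # 11 characters: the tab and the newline count as one each
print(len(escaped))    # 13 characters, including two backslashes
print(escaped == raw)  # True: two spellings of the same string
```

Both the doubled-backslash form and the raw form produce the same string object contents; which one to use is mostly a matter of readability.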

When Python returns the representation of a string, it plays safe, escaping all backslashes (even if they wouldn't otherwise be part of an escape sequence), and that's what you're seeing. However, the string itself contains only single backslashes.
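
A quick way to convince yourself is to count the backslashes in the string and in its representation. The sketch below assumes a value for my_string, since the original question's assignment is not shown here.

```python
# Assumed value; the original definition of my_string is not shown.
my_string = "why\\does\\it\\happen?"

shown_by_repr = repr(my_string)  # what the interactive prompt echoes
shown_by_print = str(my_string)  # what print() writes

print(shown_by_repr)
print(shown_by_print)
print(my_string.count("\\"))      # 3 backslashes in the actual string
print(shown_by_repr.count("\\"))  # 6 backslashes in the representation
```

The representation is itself a string that happens to contain the quotes and the doubled backslashes; the original string is unchanged.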

More information about Python's string literals can be found under "String and Bytes literals" in the Python documentation.

Why is an additional backslash '\' being added to substring in list when re.split is used?

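This has nothing to do with re.split. A backslash usually starts an escape sequence, so to write a literal \ in an ordinary string you need to double it, and the representation of a string shows it doubled for the same reason.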

Consider your original string:

In [15]: s = r'/dir/hello\/hell/dir2/hello\end'

In [16]: s
Out[16]: '/dir/hello\\/hell/dir2/hello\\end'

In [17]: len(s)
Out[17]: 31

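The extra backslashes are not counted by len() because they are not part of the string. They appear only in the representation, to show that each \ is a literal backslash rather than the start of some other escape sequence; \\ is itself the escape sequence for a single backslash.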
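
To see that re.split itself adds nothing, you can split the string and inspect the pieces. The pattern used in the question is not shown here, so the sketch below assumes a split on every forward slash.

```python
import re

s = r'/dir/hello\/hell/dir2/hello\end'

# Assumed pattern: split on every forward slash.
parts = re.split('/', s)

# Printing a list shows the repr() of each element, so backslashes look doubled.
print(parts)

# Printing each element on its own shows the real contents and length.
for part in parts:
    print(part, len(part))
```

The list display uses the representation of every element, which is why the doubled backslashes show up there even though each element contains a single one.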

How to get rid of double backslash in python windows file path string?

The double backslash is not wrong; it is simply how Python represents the string to the user. In each double backslash \\, the first one escapes the second to denote one actual backslash. If a = r'raw s\tring' and b = 'raw s\\tring' (no 'r' prefix and an explicit double backslash), then both are represented as 'raw s\\tring'.

>>> a = r'raw s\tring'
>>> b = 'raw s\\tring'
>>> a
'raw s\\tring'
>>> b
'raw s\\tring'

To clarify, when you print the string you see it as it will actually be used, for example in a path, with just one backslash:

>>> print(a)
raw s\tring
>>> print(b)
raw s\tring

And in this printed string case, the \t doesn't imply a tab, it's a backslash \ followed by the letter 't'.

Otherwise, in a string with no 'r' prefix, a single backslash escapes the character after it, so the \t here is read as a tab:

>>> t = 'not raw s\tring'  # here '\t' = tab
>>> t
'not raw s\tring'
>>> print(t) # will print a tab (and no letter 't' in 's\tring')
not raw s ring

So in the PDF path+name:

>>> item = 'xyz'
>>> PDF = r'C:\Users\user\Desktop\File_%s.pdf' % item
>>> PDF # the representation of the string, also in error messages
'C:\\Users\\user\\Desktop\\File_xyz.pdf'
>>> print(PDF) # "as used"
C:\Users\user\Desktop\File_xyz.pdf
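
If you would rather not type backslashes at all, one option is to build the path from parts. The following is a sketch rather than the original answer's method: it uses pathlib.PureWindowsPath from the standard library, and the names raw_path, escaped_path and built_path are illustrative.

```python
from pathlib import PureWindowsPath

item = 'xyz'

raw_path = r'C:\Users\user\Desktop\File_%s.pdf' % item
escaped_path = 'C:\\Users\\user\\Desktop\\File_%s.pdf' % item
# PureWindowsPath applies Windows path rules on any operating system
# and converts forward slashes to backslashes.
built_path = str(PureWindowsPath('C:/Users/user/Desktop') / ('File_%s.pdf' % item))

print(raw_path == escaped_path == built_path)  # True
print(raw_path.count('\\'))                    # 4 single backslashes
```

All three spellings yield the same string, with one backslash between each path component.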

More info about escape sequences is in the escape sequence table in the Python documentation. Also see __str__ vs __repr__.

Why can String.raw handle double backslashes but regular escaping can't?

That’s exactly what String.raw is for: it is a JavaScript template literal tag that does not interpret escape sequences. In an ordinary string, a backslash has a special meaning, so you need to double it to get one actual backslash. With String.raw, (most) special characters lose their special meaning, so two backslashes are actually two backslashes. It’s used precisely when you need a string with many special characters and don’t want to worry too much about escaping them correctly.
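
String.raw belongs to JavaScript; the closest Python counterpart is the r'' raw string literal. The sketch below is in Python, like the other examples on this page, and is meant as an analogy rather than a demonstration of String.raw itself. The variable names are illustrative.

```python
cooked = '\\\\'  # four typed backslashes, escapes processed: two remain
raw = r'\\'      # two typed backslashes, kept as typed: two remain
single = '\\'    # two typed backslashes, escapes processed: one remains

print(cooked == raw)  # True
print(len(raw))       # 2
print(len(single))    # 1
```

One caveat on the Python side: a raw literal cannot end in an odd number of backslashes, so r'C:\dir\' is a syntax error.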


