Python How to Remove Escape Characters from a String

How to remove all the escape sequences from a list of strings?

Something like this?

>>> from ast import literal_eval
>>> s = r'Hello,\nworld!'
>>> print(literal_eval("'%s'" % s))
Hello,
world!

Edit: ok, that's not what you want. What you want can't be done in general, because, as @Sven Marnach explained, strings don't actually contain escape sequences. Those are just notation in string literals.
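To make that point concrete, here is a small sketch (Python 3; the variable names are only for illustration) showing that an escape sequence is notation for a single character, while a raw literal keeps the backslash as a character of its own:

```python
# In a normal literal, '\n' is ONE character: a newline.
newline = '\n'
# In a raw literal, the backslash and the 'n' are TWO separate characters.
raw = r'\n'

print(len(newline), repr(newline))  # 1 '\n'
print(len(raw), repr(raw))          # 2 '\\n'
```

So the first string contains nothing to "remove" except the newline itself, whereas the second really does contain a backslash.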

You can filter out all strings containing non-ASCII characters from your list with the following (Python 2, where str is a byte string and has a decode method):

def is_ascii(s):
    try:
        s.decode('ascii')
        return True
    except UnicodeDecodeError:
        return False

[s for s in ['william', 'short', '\x80', 'twitter', '\xaa',
             '\xe2', 'video', 'guy', 'ray']
 if is_ascii(s)]
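The snippet above relies on Python 2 byte strings. In Python 3, str has no decode method, so a rough equivalent, assuming Python 3.7 or newer where str.isascii() is available, might look like this (the names words and ascii_only are illustrative):

```python
words = ['william', 'short', '\x80', 'twitter', '\xaa',
         '\xe2', 'video', 'guy', 'ray']

# str.isascii() is True when every character is in the range U+0000-U+007F.
ascii_only = [s for s in words if s.isascii()]
print(ascii_only)  # ['william', 'short', 'twitter', 'video', 'guy', 'ray']
```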

How to remove escape characters from string in python?

It seems you have a unicode string, like this one in Python 2.x:

inp_str = u'\xd7\nRecord has been added successfully, record id: 92'

If by removing escape characters you mean removing special (non-ASCII) characters, here is one way to keep only the ASCII characters without using a regex or hard-coding anything:

inp_str = u'\xd7\nRecord has been added successfully, record id: 92'
print inp_str.encode('ascii',errors='ignore').strip('\n')

Result: Record has been added successfully, record id: 92

I encode first because the value is already a unicode string. While encoding to ASCII, any character outside the ASCII range is ignored. After that you just strip the '\n'.
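For Python 3, where encode returns bytes, the same idea needs one extra decode step. A minimal sketch, assuming the same input string (the names ascii_str and cleaned are illustrative):

```python
inp_str = '\xd7\nRecord has been added successfully, record id: 92'

# Drop everything outside ASCII, then turn the bytes back into a str.
ascii_str = inp_str.encode('ascii', errors='ignore').decode('ascii')
# Remove the leading newline that is left over.
cleaned = ascii_str.strip('\n')
print(cleaned)  # Record has been added successfully, record id: 92
```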

Hope this helps you :)

How do I remove escape character (\) from a list in python?

You can convert the string to bytes and then use the bytes.decode method with unicode_escape as the encoding to un-escape a given string:

cmd = [bytes(s, 'utf-8').decode('unicode_escape') for s in cmd]
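As a quick illustration of what that line does, here is a sketch with a made-up cmd list (the sample values are hypothetical and ASCII-only):

```python
# Hypothetical input: each item contains a literal backslash followed by a letter.
cmd = ['col1\\tcol2', 'done\\n']

# Re-interpret the backslash sequences as real escape sequences.
cmd = [bytes(s, 'utf-8').decode('unicode_escape') for s in cmd]
print(cmd)  # ['col1\tcol2', 'done\n']
```

After the conversion, the items contain a real tab and a real newline instead of two-character backslash sequences.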

Python how to remove escape characters from a string

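Maybe the re module is the way to go. Note that this pattern removes every character that is not an ASCII letter or digit, including spaces and punctuation: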

>>> s = 'test\x06\x06\x06\x06'
>>> s1 = 'test2\x04\x04\x04\x04'
>>> import re
>>> re.sub('[^A-Za-z0-9]+', '', s)
'test'
>>> re.sub('[^A-Za-z0-9]+', '', s1)
'test2'
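If spaces and punctuation should survive, one alternative is to drop only control characters. The sketch below is an assumption about what is wanted, not part of the original answer: it uses the standard unicodedata module, and the helper name strip_control_chars is made up. Be aware that it also removes '\n' and '\t', since those are control characters too.

```python
import unicodedata

def strip_control_chars(text):
    """Return text without characters in Unicode category Cc (control)."""
    return ''.join(ch for ch in text if unicodedata.category(ch) != 'Cc')

print(repr(strip_control_chars('test\x06\x06\x06\x06')))  # 'test'
print(repr(strip_control_chars('hello, world!\x04')))     # 'hello, world!'
```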

Remove escape character from string

The character '\a' is the ASCII BEL character, chr(7).

To do the conversion in Python 2:

from __future__ import print_function
a = '\\a'
c = a.decode('string-escape')
print(repr(a), repr(c))

output

'\\a' '\x07'

And for future reference, in Python 3:

a = '\\a'
b = bytes(a, encoding='ascii')
c = b.decode('unicode-escape')
print(repr(a), repr(c))

This gives identical output to the above snippet.

In Python 3, if you were working with bytes objects you'd do something like this:

a = b'\\a'
c = bytes(a.decode('unicode-escape'), 'ascii')
print(repr(a), repr(c))

output

b'\\a' b'\x07'

As Antti Haapala mentions, this simple strategy for Python 3 won't work if the source string contains non-ASCII Unicode characters too. In that case, please see his answer for a more robust solution.
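One commonly used workaround, which is not necessarily the approach taken in that answer, is to encode with latin-1 and the backslashreplace error handler before decoding. The following is a sketch under that assumption; the function name unescape is made up, and malformed escape sequences in the input may still raise an error or trigger a warning.

```python
def unescape(text):
    """Interpret backslash escapes in text, keeping non-ASCII characters intact."""
    # Characters above U+00FF are written out as \uXXXX / \UXXXXXXXX escapes;
    # everything else becomes a single Latin-1 byte, which unicode_escape
    # reads back as the same code point.
    return text.encode('latin-1', 'backslashreplace').decode('unicode_escape')

print(repr(unescape('caf\xe9\\a')))   # 'café\x07'
print(repr(unescape('\u5fcd\\n')))    # '忍\n'
```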

Remove escaped characters like new line, tabs, carriage returns, etc. inside a string

While, for example, \n is an escape sequence, \\n is not: it is an escaped backslash followed by the letter n. This is why you are left with strings like \\n \\\\n \\t\\\\t \\r\\\\r after sentence.split().

This will return the desired output:

result=" ".join(word for word in sentence.split() if not word.startswith("\\"))

It breaks the sentence down into words, stripping any leading or trailing whitespace, and keeps only the words that do not start with a backslash. Remember that things like \\n are not escape sequences but the representation of the literal two-character string \n.
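Here is how that one-liner behaves on a made-up input (the sample sentence is hypothetical, since the original one is not shown):

```python
# Hypothetical sentence containing literal backslash sequences as separate words.
sentence = 'I ate \\n an apple \\t\\t today'

# Keep only the words that do not start with a backslash.
result = " ".join(word for word in sentence.split()
                  if not word.startswith("\\"))
print(result)  # I ate an apple today
```

Note that a backslash sequence glued to a real word, such as 'apple\\n', would be kept as-is, because that word does not start with a backslash.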

Btw I wouldn't call your attempt "brute force", as string functions like split(), strip(), join(), replace() etc. are intended for solving exactly this type of problem.

Remove the escape character and get part of a string

The string looks like JSON after unicode-escape decoding:

>>> s = '{"type":"2","question_id":"...","text":"\\u5fcd \\u8b93\\u5c0d\\u65b9"}'
>>> s.encode().decode('unicode-escape') # `encode` is not needed in python 2.x
'{"type":"2","question_id":"對於經營一段感情,妳覺得最重要的關鍵是什麼呢?","text":"忍 讓對方"}'

You can use json.loads to deserialize the JSON:

>>> import json
>>> print(json.loads(s.encode().decode('unicode-escape'))['text'])
忍 讓對方
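Since \uXXXX is itself valid JSON string syntax, json.loads can also interpret those escapes directly, without the unicode-escape step. A minimal sketch, using a shortened, hypothetical version of the string with the elided question_id field left out:

```python
import json

# Shortened sample: only the fields whose values are fully known.
s = '{"type":"2","text":"\\u5fcd \\u8b93\\u5c0d\\u65b9"}'

# json.loads decodes the \uXXXX escapes inside JSON strings on its own.
data = json.loads(s)
print(data['text'])  # 忍 讓對方
```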

How to remove the escaping character (backslash "\") from a pandas DataFrame

You can try replace -


>>> import pandas as pd
>>>
>>> val = [r"ALTRAN CONSULTING & \NENGINEERING GMBH",r"NANOVO KERESKEDELMI KFT \KENYSZERTORLES ALATT"]
>>>
>>> d = {'name':val}
>>>
>>> df = pd.DataFrame(d)
>>> df['name'] = df['name'].replace(to_replace= r'\\', value= '', regex=True)
>>> df
                                           name
0         ALTRAN CONSULTING & NENGINEERING GMBH
1  NANOVO KERESKEDELMI KFT KENYSZERTORLES ALATT
>>>


