Remove specific characters from a string in Python
Strings in Python are immutable (can't be changed). Because of this, the effect of line.replace(...) is just to create a new string, rather than changing the old one. You need to rebind (assign) it to line in order to have that variable take the new value, with those characters removed.
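As a minimal sketch of the rebinding described above (the helper name remove_chars and the sample text are made up for illustration, and it assumes the original attempt looped over the unwanted characters):

```python
def remove_chars(line, unwanted):
    """Return a copy of line with every character in unwanted removed."""
    for ch in unwanted:
        # str.replace returns a new string and leaves the original untouched,
        # so the result has to be assigned back to the name.
        line = line.replace(ch, "")
    return line


original = "Hello, World! #python @home $5"
cleaned = remove_chars(original, "!@#$")
print(cleaned)   # Hello, World python home 5
print(original)  # Hello, World! #python @home $5
```

The original string is still intact after the call, which is the immutability at work: only the name inside the function was rebound.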
Also, the way you are doing it is going to be relatively slow. It is also likely to confuse experienced Python programmers, who will see a doubly-nested structure and think for a moment that something more complicated is going on.
In Python 2.6 and newer Python 2.x versions *, you can instead use str.translate (see the Python 3 answer below):
line = line.translate(None, '!@#$')
or regular expression replacement with re.sub
import re
line = re.sub('[!@#$]', '', line)
The characters enclosed in brackets constitute a character class. Any characters in line which are in that class are replaced with the second parameter to sub: an empty string.
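If the characters to strip are only known at runtime, the character class can be built from them. The following is a sketch, not part of the original answer; strip_chars is a hypothetical helper name, and re.escape is used so that characters such as ^, - and ] are taken literally inside the brackets:

```python
import re


def strip_chars(line, unwanted):
    """Remove every character listed in unwanted from line using re.sub."""
    if not unwanted:
        # An empty class "[]" is not a valid pattern, so return early.
        return line
    # re.escape neutralises characters such as ^, - and ] that would
    # otherwise change the meaning of the character class.
    pattern = "[" + re.escape(unwanted) + "]"
    return re.sub(pattern, "", line)


print(strip_chars("a-b^c]d!", "-^]!"))  # abcd
```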
Python 3 answer
In Python 3, strings are Unicode. You'll have to translate a little differently. kevpie mentions this in a comment on one of the answers, and it's noted in the documentation for str.translate.
When calling the translate method of a Unicode string, you cannot pass the second parameter that we used above. You also can't pass None as the first parameter. Instead, you pass a translation table (usually a dictionary) as the only parameter. This table maps the ordinal values of characters (i.e. the result of calling ord on them) to the ordinal values of the characters which should replace them, or (usefully for us) to None to indicate that they should be deleted.
So to do the above dance with a Unicode string you would call something like
translation_table = dict.fromkeys(map(ord, '!@#$'), None)
unicode_line = unicode_line.translate(translation_table)
Here dict.fromkeys and map are used to succinctly generate a dictionary containing
{ord('!'): None, ord('@'): None, ...}
Even simpler, as another answer puts it, create the translation table in place:
unicode_line = unicode_line.translate({ord(c): None for c in '!@#$'})
Or, as brought up by Joseph Lee, create the same translation table with str.maketrans:
unicode_line = unicode_line.translate(str.maketrans('', '', '!@#$'))
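As a quick check in Python 3, the three ways of building the table shown above produce the same dictionary. The variable names and the sample string here are illustrative only:

```python
chars = "!@#$"

# Three ways of building the same deletion table.
table_fromkeys = dict.fromkeys(map(ord, chars), None)
table_comprehension = {ord(c): None for c in chars}
table_maketrans = str.maketrans("", "", chars)

# Each maps the ordinal of an unwanted character to None.
assert table_fromkeys == table_comprehension == table_maketrans

unicode_line = "p@ss! w#rd$"
result = unicode_line.translate(table_maketrans)
print(result)  # pss wrd
```

Since the tables are equal, the choice between them is mostly a matter of readability; building the table once and reusing it avoids recreating it for every string.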
* For compatibility with earlier Pythons, you can create a "null" translation table to pass in place of None:
import string
line = line.translate(string.maketrans('', ''), '!@#$')
Here string.maketrans is used to create a translation table, which is just a string containing the characters with ordinal values 0 to 255.
How to remove special characters from a string?
That depends on what you define as special characters, but try replaceAll(...):
String result = yourString.replaceAll("[-+.^:,]","");
Note that the ^ character must not be the first one in the list, since you'd then either have to escape it or it would mean "any but these characters".
Another note: the - character needs to be the first or last one on the list, otherwise you'd have to escape it or it would define a range (e.g. :-, would mean "all characters in the range : to ,").
So, in order to keep consistency and not depend on character positioning, you might want to escape all those characters that have a special meaning in regular expressions (the following list is not complete, so be aware of other characters like (, {, $ etc.):
String result = yourString.replaceAll("[\\-\\+\\.\\^:,]","");
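If you want to get rid of all punctuation and symbols, try this regex: [\p{P}\p{S}] (keep in mind that in Java strings you'd have to escape backslashes: "[\\p{P}\\p{S}]"). The brackets matter: without them, the pattern would only match a punctuation character immediately followed by a symbol.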
A third way could be something like this, if you can exactly define what should be left in your string:
String result = yourString.replaceAll("[^\\w\\s]","");
This means: replace everything that is not a word character (a-z in any case, 0-9 or _) or whitespace.
Edit: please note that there are a couple of other patterns that might prove helpful. However, I can't explain them all, so have a look at the reference section of regular-expressions.info.
Here's a less restrictive alternative to the "define allowed characters" approach, as suggested by Ray:
String result = yourString.replaceAll("[^\\p{L}\\p{Z}]","");
The regex matches everything that is not a letter in any language and not a separator (whitespace, linebreak etc.). Note that you can't use [\P{L}\P{Z}] (upper case P means not having that property), since that would mean "everything that is not a letter or not whitespace", which matches almost everything, since letters are not whitespace and vice versa.
Additional information on Unicode
Some Unicode characters seem to cause problems because they can be encoded in different ways (as a single code point or as a combination of code points). Please refer to regular-expressions.info for more information.
Remove specific char from String
Use:
NewString = OldString.replaceAll("char", "");
For the example in your comment, use:
NewString = OldString.replaceAll("d", "");
For removing Arabic characters, please see the following related questions:
how could i remove arabic punctuation form a String in java
removing characters of a specific unicode range from a string
Removing certain characters from a string
I think the code below will help you.
String input = "Just to clarify, I will have strings of varying "
+ "lengths. I want to strip characters from it, the exact "
+ "ones to be determined at runtime, and return the "
+ "resulting string.";
String regx = ",.";
char[] ca = regx.toCharArray();
for (char c : ca) {
    input = input.replace("" + c, "");
}
System.out.println(input);
How to remove single character from a String by index
You can also use the StringBuilder class, which is mutable.
StringBuilder sb = new StringBuilder(inputString);
It has the method deleteCharAt(), along with many other mutator methods.
Just delete the characters that you need to delete and then get the result as follows:
String resultString = sb.toString();
This avoids creation of unnecessary string objects.
Remove specific characters from String List - Python
This can be done much more simply by reading the file directly and building a string from its contents while filtering out the unwanted characters.
For example, here is the 'file1.txt' file with the content:
Hello how are you? Very good!
Then we can do the following:
def main():
    characters = '!?¿-.:;'
    with open('file1.txt') as f:
        aux = ''.join(c for c in f.read() if c not in characters)
    # print(aux) # Hello how are you Very good
As we can see, aux is the file's content without the unwanted characters, and it can easily be edited to match the desired output format.
For example, if we want a list of words, we can do this:
def main():
    characters = '!?¿-.:;'
    with open('file1.txt') as f:
        aux = ''.join(c for c in f.read() if c not in characters)
    aux = aux.split()
    # print(aux) # ['Hello', 'how', 'are', 'you', 'Very', 'good']
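If the strings are already in a list rather than in a file, the same filtering can be applied per element. This is a sketch under that assumption; clean_lines is a hypothetical helper, and it swaps the generator expression for str.translate with a table from str.maketrans:

```python
def clean_lines(lines, characters="!?¿-.:;"):
    """Return a new list with the unwanted characters removed from each string."""
    # Build the deletion table once and reuse it for every element.
    table = str.maketrans("", "", characters)
    return [line.translate(table) for line in lines]


lines = ["Hello how are you?", "Very good!"]
cleaned = clean_lines(lines)
print(cleaned)  # ['Hello how are you', 'Very good']
```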
How to remove all characters before a specific character in Java?
You can use .substring():
String s = "the text=text";
String s1 = s.substring(s.indexOf("=") + 1);
s1 = s1.trim();
Then s1 contains everything after = in the original string, with leading and trailing whitespace removed.
.trim() returns a new string without the whitespace before the first non-whitespace character (leading) and after the last one (trailing). Because strings are immutable, the result has to be assigned back to s1; calling s1.trim() on its own has no effect.
How to remove certain characters from a string? [Python]
Since strings are immutable, call replace and assign the result back to cool:
cool = "cool°"
cool = cool.replace("°","")
cool
'cool'