Remove All Special Characters with Regexp

Regex remove all special characters except numbers?

Use the global flag:

var name = name.replace(/[^a-zA-Z ]/g, "");

If you don't want to remove numbers, add them to the class:

var name = name.replace(/[^a-zA-Z0-9 ]/g, "");

Remove all special characters with RegExp

var desired = stringToReplace.replace(/[^\w\s]/gi, '')

As was mentioned in the comments it's easier to do this as a whitelist - replace the characters which aren't in your safelist.

The caret (^) at the start of the set [...] negates it. The flags gi mean global and case-insensitive (the i is redundant here, since \w already covers both cases, but I wanted to mention it). The safelist in this example is word characters (\w: letters, digits, and underscores) and whitespace (\s).
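
For comparison, here is a minimal Python sketch of the same safelist idea. The helper name keep_word_and_space is hypothetical. One assumption to flag: in Python 3, \w also matches non-ASCII letters and digits by default, so re.ASCII is passed to restrict it to [A-Za-z0-9_], which is closer to what \w means in JavaScript.

```python
import re

# Anything that is NOT a word character (\w) or whitespace (\s) is removed.
# re.ASCII restricts \w to [A-Za-z0-9_]; without it, Python 3 would also
# keep accented letters and other Unicode word characters.
NOT_WORD_OR_SPACE = re.compile(r"[^\w\s]", re.ASCII)

def keep_word_and_space(text: str) -> str:
    """Hypothetical helper: drop every character outside the safelist."""
    return NOT_WORD_OR_SPACE.sub("", text)

print(keep_word_and_space("Hello, World! #42_ok"))  # Hello World 42_ok
```

There is no g flag in Python: re.sub replaces every match unless a count is given.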

Regex to remove all special characters from string?

It really depends on your definition of special characters. I find that a whitelist rather than a blacklist is the best approach in most situations:

tmp = Regex.Replace(n, "[^0-9a-zA-Z]+", "");

You should be careful with your current approach because the following two items will be converted to the same string and will therefore be indistinguishable:

"TRA-12:123"
"TRA-121:23"

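The collision is easy to reproduce. The following Python sketch (strip_non_alnum is a hypothetical helper, not part of the original answer) applies the same pattern to both strings, then shows one possible mitigation, assuming a placeholder character is acceptable in the output.

```python
import re

def strip_non_alnum(text: str, replacement: str = "") -> str:
    """Hypothetical helper mirroring Regex.Replace(n, "[^0-9a-zA-Z]+", ...)."""
    return re.sub(r"[^0-9a-zA-Z]+", replacement, text)

a, b = "TRA-12:123", "TRA-121:23"

# Deleting the separators makes both inputs identical.
print(strip_non_alnum(a), strip_non_alnum(b))            # TRA12123 TRA12123

# Replacing each run with a placeholder keeps them apart.
print(strip_non_alnum(a, "_"), strip_non_alnum(b, "_"))  # TRA_12_123 TRA_121_23
```
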
How to remove all special characters except for underscore between two words in java?

The RegEx pattern, [^\p{Alnum}_]|^_|_$ meets your requirement.

import java.util.stream.Stream;

public class Main {
    public static void main(String[] args) {
        Stream.of(
                "OIL~",
                "_OIL_GAS",
                "OIL_GAS_",
                "_OIL_GAS_",
                "*OIL_GAS$",
                "OIL_GAS"
        ).forEach(s -> System.out.println(s.replaceAll("[^\\p{Alnum}_]|^_|_$", "")));
    }
}

Output:

OIL
OIL_GAS
OIL_GAS
OIL_GAS
OIL_GAS
OIL_GAS

Note: as the documentation states, \p{Alnum} is by default the same as [A-Za-z0-9].
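
The same three-part pattern can be sketched in Python for comparison. Python's built-in re module has no \p{Alnum}, so this version spells out the ASCII class; the names PATTERN and clean are made up for the illustration.

```python
import re

# ASCII stand-in for Java's [^\p{Alnum}_]|^_|_$
#   [^A-Za-z0-9_]  any character that is not alphanumeric or an underscore
#   ^_             an underscore at the very start
#   _$             an underscore at the very end
PATTERN = re.compile(r"[^A-Za-z0-9_]|^_|_$")

def clean(token: str) -> str:
    """Hypothetical helper: apply the pattern and delete every match."""
    return PATTERN.sub("", token)

samples = ["OIL~", "_OIL_GAS", "OIL_GAS_", "_OIL_GAS_", "*OIL_GAS$", "OIL_GAS"]
for s in samples:
    print(clean(s))
```

Only a single leading or trailing underscore is removed, because ^ and $ are evaluated against the original string, not the partially cleaned one.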

Python Regex - remove all . and special characters EXCEPT the decimal point

Use a capturing group to capture the decimal numbers, while the other alternative matches special characters (anything that is neither whitespace nor a word character).

In the replacement, refer back to the capturing group so that only the captured characters are kept: each match is replaced by the captured decimal number if there is one, and by nothing otherwise.

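import re

s = 'What? The Census Says It’s Counted 99.9 Percent of Households. Don’t Be Fooled.'
# Group 1 captures digit-dot-digit; the second alternative matches any special character.
rgx = re.compile(r'(\d\.\d)|[^\s\w]')
# Keep group 1 when it matched; otherwise replace the match with an empty string.
rgx.sub(lambda x: x.group(1) or '', s)
# 'What The Census Says Its Counted 99.9 Percent of Households Dont Be Fooled'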

OR

Match every dot that is not between two digits, together with every other character that is neither whitespace nor a word character, and replace the matches with an empty string.

re.sub(r'(?<!\d)\.|\.(?!\d)|[^\s\w.]', '', s)
# 'What The Census Says Its Counted 99.9 Percent of Households Dont Be Fooled'
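
To see how a lookaround-based pattern treats edge cases, here is a small self-contained sketch. It assumes the rule "keep a dot only when there is a digit on both sides"; the helper name strip_keep_decimals is hypothetical.

```python
import re

# A dot is removed when it lacks a digit on either side:
#   (?<!\d)\.   dot not preceded by a digit
#   \.(?!\d)    dot not followed by a digit
# [^\s\w.] removes every other character that is not whitespace,
# a word character, or a dot.
PATTERN = re.compile(r"(?<!\d)\.|\.(?!\d)|[^\s\w.]")

def strip_keep_decimals(text: str) -> str:
    """Hypothetical helper: strip special characters but keep decimal points."""
    return PATTERN.sub("", text)

print(strip_keep_decimals("Pi is 3.14, right? Costs $9.50. End."))
# Pi is 3.14 right Costs 9.50 End
```

Note that the dot after 9.50 is removed even though a digit precedes it, because no digit follows it.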

Regex to remove all special characters except periods

You can use

"""[\p{P}\p{S}&&[^.]]+""".toRegex()

The [\p{P}\p{S}&&[^.]]+ pattern matches one or more (+) punctuation (\p{P}) or symbol (\p{S}) characters other than dots (&&[^.] intersects the class with "anything but a dot", which works as character class subtraction).

See a Kotlin demo:

println("a-b)h.".replace("""[\p{P}\p{S}&&[^.]]+""".toRegex(), ""))
// => abh.
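
Python's built-in re module supports neither \p{P} nor class intersection, so a rough equivalent has to take a different route. The sketch below filters on Unicode general categories via unicodedata instead of a regex; this is a swapped-in technique, not a translation of the Kotlin pattern, and the helper name is hypothetical.

```python
import unicodedata

def strip_punct_and_symbols(text: str, keep: str = ".") -> str:
    """Hypothetical helper: drop Unicode punctuation (P*) and symbols (S*),
    except for the characters listed in `keep`."""
    return "".join(
        ch for ch in text
        # category() returns e.g. "Pd" (dash) or "Sm" (math symbol);
        # the first letter is the major class.
        if ch in keep or unicodedata.category(ch)[0] not in "PS"
    )

print(strip_punct_and_symbols("a-b)h."))  # abh.
```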

C# how to remove all special characters EXCEPT underscore

You can put your special characters in a Regex pattern, then remove all special characters from your text by using the Replace method.

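// Verbatim string (@"...") so backslashes reach the regex engine unchanged; the hyphen goes last so it is literal.
var regex = new Regex(@"[!@#$%^&*()+=/\\{}\[\]|:;""'<>,.?~`-]");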

var result = regex.Replace("Some!D#Text_With%Special$Character", string.Empty);

The result would be "SomeDText_WithSpecialCharacter".
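
A blacklist like this is easy to get wrong by hand, because ], ^, - and \ have special meaning inside a character class. One way to sidestep that, sketched here in Python, is to keep the characters in a plain string and let re.escape build the class. The names SPECIALS and strip_specials are hypothetical, and the character list is an assumption modeled on the pattern above.

```python
import re

# Characters to strip; the underscore is deliberately absent so it survives.
SPECIALS = "!@#$%^&*()-+=/\\{}[]|:;\"'<>,.?~`"

# re.escape backslash-escapes ], ^, - and \ so they are safe inside [...].
PATTERN = re.compile("[" + re.escape(SPECIALS) + "]")

def strip_specials(text: str) -> str:
    """Hypothetical helper: delete every character listed in SPECIALS."""
    return PATTERN.sub("", text)

print(strip_specials("Some!D#Text_With%Special$Character"))
# SomeDText_WithSpecialCharacter
```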

Replacing multiple special characters in oracle

As per the regular expression operators and metasymbols documentation:

  • Put ] as the first character of the (negated) character group;
  • - as the last; and
  • Do not put . immediately after [ or it can be matched as the start of a collating element [..] if there is a second . later in the expression.

Also:

  • Double up the single quote (to escape it, so it does not terminate the string literal); and
  • Include the non-special characters a-zA-Z0-9 in the (negated) character group too, otherwise they will be matched.

Which gives you the regular expression:

SELECT emp_address,
       REGEXP_REPLACE(
         emp_address,
         '^[^][,.$''\*&!%^{}?a-zA-Z0-9-]|[^][,.$''\*&!%^{}?a-zA-Z0-9-]$'
       ) AS simplified_emp_address
FROM   table_name

Which, for the sample data:

CREATE TABLE table_name (emp_address) AS
SELECT '"test1"' FROM DUAL UNION ALL
SELECT '$test2$' FROM DUAL UNION ALL
SELECT '[test3]' FROM DUAL UNION ALL
SELECT 'test4' FROM DUAL UNION ALL
SELECT '|test5|' FROM DUAL;

Outputs:

EMP_ADDRESS  SIMPLIFIED_EMP_ADDRESS
"test1"      test1
$test2$      $test2$
[test3]      [test3]
test4        test4
|test5|      test5
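
The behavior can be cross-checked outside the database. This Python sketch uses the same character group, with one difference that is an assumption of the port: Python's regex syntax treats \ as an escape inside a class, so ], [ and \ are backslash-escaped here rather than positioned. KEEP, PATTERN and simplify are hypothetical names.

```python
import re

# Same character group as the Oracle pattern, written with Python escapes.
# The hyphen stays last so it is literal.
KEEP = r"\]\[,.$'\\*&!%^{}?a-zA-Z0-9-"

# Match one disallowed character at the start OR one at the end.
PATTERN = re.compile("^[^" + KEEP + "]|[^" + KEEP + "]$")

def simplify(address: str) -> str:
    """Hypothetical helper: strip a disallowed leading/trailing character."""
    return PATTERN.sub("", address)

for row in ['"test1"', "$test2$", "[test3]", "test4", "|test5|"]:
    print(row, "->", simplify(row))
```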

Regex remove special characters in filename except extension

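You can match any character other than word and dot characters with [^\w.], plus any dot that is not the last dot in the string (that is, not followed by one or more non-dot characters up to the end of the string), and replace each run of such characters with a hyphen: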

filename = filename.replace(/(?:\.(?![^.]+$)|[^\w.])+/g, "-");

Details

  • (?: - start of a non-capturing group:

    • \.(?![^.]+$) - any dot not followed with 1+ non-dot chars at the end of the string
    • | - or
    • [^\w.] - any char other than a word char and a dot char
  • )+ - end of the group, repeat 1 or more times.
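
The breakdown above carries over to Python unchanged, since the pattern uses only constructs that Python's re module also supports. A minimal sketch, with a hypothetical helper name:

```python
import re

# \.(?![^.]+$)  a dot that is NOT the final, extension-introducing dot
# [^\w.]        any character that is neither a word character nor a dot
# (?:...)+      collapse each run of such characters into one match
PATTERN = re.compile(r"(?:\.(?![^.]+$)|[^\w.])+")

def slugify_filename(filename: str) -> str:
    """Hypothetical helper: replace each matched run with a single hyphen."""
    return PATTERN.sub("-", filename)

print(slugify_filename("a b.c.jpg"))  # a-b-c.jpg
```

A run that ends right before the extension still produces a hyphen there, so "x (y).jpg" becomes "x-y-.jpg".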

Another solution (if extensions are always present): split out the extension, run your simpler regex on the first chunk then join back:

var filename = "manuel fernandex – Index Prot.bla.otype 5 (pepito grillo).jpg";
var ext = filename.substr(filename.lastIndexOf('.') + 1);
var name = filename.substr(0, filename.lastIndexOf('.'));
console.log(name.replace(/\W+/g, "-") + "." + ext);
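
The split-and-rejoin idea can be sketched in Python with os.path.splitext, which returns the extension including its leading dot. The helper name is hypothetical, and the .strip("-") call is an addition not present in the snippet above: it trims a hyphen left over when the name starts or ends with a special character.

```python
import os
import re

def sanitize_filename(filename: str) -> str:
    """Hypothetical helper: clean the name part and leave the extension alone."""
    stem, ext = os.path.splitext(filename)  # ext keeps its leading dot
    # Collapse every run of non-word characters into a single hyphen,
    # then trim hyphens from both ends (an extra step, see the note above).
    stem = re.sub(r"\W+", "-", stem).strip("-")
    return stem + ext

print(sanitize_filename("report (final).pdf"))  # report-final.pdf
```

Like the original, this assumes an extension is present; for a name such as "archive.tar.gz" only the last suffix (.gz) is treated as the extension.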

