List of All Special Characters That Need to Be Escaped in a Regex

List of all characters that should be escaped before put in to RegEx?

Take a look at PHP.JS's implementation of PHP's preg_quote function, that should do what you need:

http://phpjs.org/functions/preg_quote:491

The special regular expression characters are: . \ + * ? [ ^ ] $ ( ) { } = ! < > | : -

List of all special characters that need to be escaped in a regex

You can look at the javadoc of the Pattern class: http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html

You need to escape any char listed there if you want the regular char and not the special meaning.

As a maybe simpler solution, you can put the template between \Q and \E - everything between them is considered as escaped.

What special characters must be escaped in regular expressions?

Which characters you must and which you mustn't escape indeed depends on the regex flavor you're working with.

For PCRE, and most other so-called Perl-compatible flavors, escape these outside character classes:

.^$*+?()[{\|

and these inside character classes:

^-]\

For POSIX extended regexes (ERE), escape these outside character classes (same as PCRE):

.^$*+?()[{\|

Escaping any other characters is an error with POSIX ERE.

Inside character classes, the backslash is a literal character in POSIX regular expressions. You cannot use it to escape anything. You have to use "clever placement" if you want to include character class metacharacters as literals. Put the ^ anywhere except at the start, the ] at the start, and the - at the start or the end of the character class to match these literally, e.g.:

[]^-]

In POSIX basic regular expressions (BRE), these are metacharacters that you need to escape to suppress their meaning:

.^$*[\

Escaping parentheses and curly brackets in BREs gives them the special meaning their unescaped versions have in EREs. Some implementations (e.g. GNU) also give special meaning to other characters when escaped, such as \? and +. Escaping a character other than .^$*(){} is normally an error with BREs.

Inside character classes, BREs follow the same rule as EREs.

If all this makes your head spin, grab a copy of RegexBuddy. On the Create tab, click Insert Token, and then Literal. RegexBuddy will add escapes as needed.

Which special characters must be escaped when using Python regex module re?

After the opening square bracket the only special characters are -, ^ and ]:

[|\^&+\-%*/=!>]

You can find the list of special characters here:

  • Regular Expression Syntax
  • python-regex-cheatsheet

What characters need to be escaped in .NET Regex?

I don't know the complete set of characters - but I wouldn't rely on the knowledge anyway, and I wouldn't put it into code. Instead, I would use Regex.Escape whenever I wanted some literal text that I wasn't sure about:

// Don't actually do this to check containment... it's just a little example.
public bool RegexContains(string haystack, string needle)
{
Regex regex = new Regex("^.*" + Regex.Escape(needle) + ".*$");
return regex.IsMatch(haystack);
}

Escaping special characters in Java Regular Expressions

Is there any method in Java or any open source library for escaping (not quoting) a special character (meta-character), in order to use it as a regular expression?

If you are looking for a way to create constants that you can use in your regex patterns, then just prepending them with "\\" should work but there is no nice Pattern.escape('.') function to help with this.

So if you are trying to match "\\d" (the string \d instead of a decimal character) then you would do:

// this will match on \d as opposed to a decimal character
String matchBackslashD = "\\\\d";
// as opposed to
String matchDecimalDigit = "\\d";

The 4 slashes in the Java string turn into 2 slashes in the regex pattern. 2 backslashes in a regex pattern matches the backslash itself. Prepending any special character with backslash turns it into a normal character instead of a special one.

matchPeriod = "\\.";
matchPlus = "\\+";
matchParens = "\\(\\)";
...

In your post you use the Pattern.quote(string) method. This method wraps your pattern between "\\Q" and "\\E" so you can match a string even if it happens to have a special regex character in it (+, ., \\d, etc.)

Regex pattern including all special characters

Please don't do that... little Unicode BABY ANGELs like this one are dying! ◕◡◕ (← these are not images) (nor is the arrow!)

And you are killing 20 years of DOS :-) (the last smiley is called WHITE SMILING FACE... Now it's at 263A... But in ancient times it was ALT-1)

and his friend

BLACK SMILING FACE... Now it's at 263B... But in ancient times it was ALT-2

Try a negative match:

Pattern regex = Pattern.compile("[^A-Za-z0-9]");

(this will ok only A-Z "standard" letters and "standard" 0-9 digits.)



Related Topics



Leave a reply



Submit