What is the use of Pattern.quote method?
\Q means "start of literal text" (i.e. regex "open quote")
\E means "end of literal text" (i.e. regex "close quote")
Calling the Pattern.quote() method wraps the string in \Q...\E, which turns the text into a regex literal. For example, Pattern.quote(".*") would match a dot and then an asterisk:
System.out.println("foo".matches(".*")); // true
System.out.println("foo".matches(Pattern.quote(".*"))); // false
System.out.println(".*".matches(Pattern.quote(".*"))); // true
The method's purpose is to spare the programmer from having to remember the special terms \Q and \E, and to add a bit of readability to the code; regex is hard enough to read already. Compare:
someString.matches(Pattern.quote(someLiteral));
someString.matches("\\Q" + someLiteral + "\\E");
Referring to the javadoc:
Returns a literal pattern String for the specified String.
This method produces a String that can be used to create a Pattern that would match the string s as if it were a literal pattern.
Metacharacters or escape sequences in the input sequence will be given no special meaning.
Difference between Pattern.quote() and its String concatenation equivalent?
The statement in the answer that:
Calling the Pattern.quote() method wraps the string in \Q...\E, which turns the text into a regex literal.
is, strictly speaking, not correct, because that would give weird results if \Q and \E are already in the original string.
If you call, for instance, Pattern.quote("\\Q[r.e.g.e.x]\\E"), it will produce "\\Q\\Q[r.e.g.e.x]\\E\\\\E\\Q\\E".
As a result, simply wrapping the string in "\\Q" and "\\E" is incorrect (for some edge cases, I admit). You had better use Pattern.quote if you want to be safe.
Wrapping with "\\Q" and "\\E" yourself will be a bit faster (since you save a method call, an indexOf(..) and an if statement in the case where there is no "\\E"), but it is usually better to use library methods, since they tend to contain fewer bugs, and any bugs that do exist are eventually fixed.
You can find the source code here:
public static String quote(String s) {
    int slashEIndex = s.indexOf("\\E");
    if (slashEIndex == -1)
        return "\\Q" + s + "\\E";
    StringBuilder sb = new StringBuilder(s.length() * 2);
    sb.append("\\Q");
    slashEIndex = 0;
    int current = 0;
    while ((slashEIndex = s.indexOf("\\E", current)) != -1) {
        sb.append(s.substring(current, slashEIndex));
        current = slashEIndex + 2;
        sb.append("\\E\\\\E\\Q");
    }
    sb.append(s.substring(current, s.length()));
    sb.append("\\E");
    return sb.toString();
}
So as long as there is no "\\E" in the string, simple wrapping is fine. Otherwise, every "\\E" has to be replaced with "\\E\\\\E\\Q".
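To make the edge case concrete, here is a minimal sketch comparing hand-rolled wrapping with Pattern.quote. The class name, the naiveQuote helper and the sample literal are made up for illustration; the literal deliberately contains both \E and \Q.

```java
import java.util.regex.Pattern;

class QuoteEdgeCase {
    // Hypothetical helper: the manual "\\Q" + s + "\\E" wrapping discussed above.
    static String naiveQuote(String literal) {
        return "\\Q" + literal + "\\E";
    }

    public static void main(String[] args) {
        // The literal text is: a\E.\Qb
        String literal = "a\\E.\\Qb";

        String naive = naiveQuote(literal);   // regex: \Qa\E.\Qb\E
        String safe = Pattern.quote(literal); // regex: \Qa\E\\E\Q.\Qb\E

        // Naive wrapping: the embedded \E closes the quote early,
        // so the '.' is interpreted as "any character".
        System.out.println("aXb".matches(naive));   // true (unintended)
        System.out.println(literal.matches(naive)); // false

        // Pattern.quote splits around the embedded \E, so only the literal matches.
        System.out.println("aXb".matches(safe));    // false
        System.out.println(literal.matches(safe));  // true
    }
}
```

Inputs without "\\E" behave identically under both approaches, which is why the difference is easy to miss in testing.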
What is the equivalent of Pattern.quote() for MessageFormat?
There isn’t any method for it, but enclosing the entire text in ASCII single-quote characters will accomplish the same thing. You can do the ' → '' substitution as you’ve described, then surround the text with '. From the MessageFormat documentation:
For example, pattern string "'{''}'" is interpreted as a sequence of '{ (start of quoting and a left curly brace), '' (a single quote), and }' (a right curly brace and end of quoting), not '{' and '}' (quoted left and right curly braces): representing string "{'}", not "{}".
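The substitution described above could be sketched as a small helper. The class and method names are hypothetical (nothing like this ships with the JDK), and the empty-string check is an added assumption: a bare '' would otherwise be read as one literal single quote.

```java
import java.text.MessageFormat;

class MessageFormatQuote {
    // Hypothetical helper: makes MessageFormat treat `text` as literal text.
    static String quote(String text) {
        if (text.isEmpty()) {
            return ""; // "''" alone would be interpreted as a single quote character
        }
        // Double every single quote, then wrap the whole text in single quotes.
        return "'" + text.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        String literal = "{0} isn't a placeholder";
        // Only the trailing {0} outside the quoted part is substituted.
        String pattern = quote(literal) + " / {0}";
        System.out.println(MessageFormat.format(pattern, "arg"));
        // prints: {0} isn't a placeholder / arg
    }
}
```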
Java Pattern.quote
The expression ".*/live/.*" matches paths with the pattern you describe. You can create a Pattern with that.
Alternatively, as Peter said, you could simply ask path.contains("/live/");
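As a rough illustration, both variants could look like this. The class name and the sample paths are invented for the example.

```java
import java.util.regex.Pattern;

class LivePathCheck {
    // Compiled once so the regex is not re-parsed on every call.
    private static final Pattern LIVE = Pattern.compile(".*/live/.*");

    static boolean viaRegex(String path) {
        return LIVE.matcher(path).matches();
    }

    static boolean viaContains(String path) {
        return path.contains("/live/");
    }

    public static void main(String[] args) {
        // Hypothetical sample paths.
        System.out.println(viaRegex("/srv/live/index.html"));     // true
        System.out.println(viaContains("/srv/live/index.html"));  // true
        System.out.println(viaRegex("/srv/alive/index.html"));    // false
        System.out.println(viaContains("/srv/alive/index.html")); // false
    }
}
```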
Pattern.quote adds \\Q and \\E to the string java
Adding \Q and \E is exactly what Pattern.quote() does! Why would you not want that?
If you need to quote only some characters of that string, then you must do so manually.
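One possible way to do that manually is to backslash-escape only chosen characters. The helper below is a hypothetical sketch: it assumes the characters to escape are punctuation (a backslash before a letter has its own meaning in Java regex) and that the input is not already escaped.

```java
class PartialEscape {
    // Hypothetical helper: prefixes a backslash to each character of `regex`
    // that appears in `charsToEscape`; everything else is copied unchanged.
    static String escapeOnly(String regex, String charsToEscape) {
        StringBuilder sb = new StringBuilder(regex.length() + 8);
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (charsToEscape.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Make the dot literal but keep '*' as a quantifier: a\.b*
        String regex = escapeOnly("a.b*", ".");
        System.out.println("a.bbb".matches(regex)); // true
        System.out.println("axb".matches(regex));   // false
    }
}
```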
Does including Pattern.LITERAL flag as part of the Pattern.compile(String regex, int flags) method in Java mitigate String regex injection?
Check the Pattern.LITERAL documentation:
When this flag is specified then the input string that specifies the pattern is treated as a sequence of literal characters. Metacharacters or escape sequences in the input sequence will be given no special meaning.
So, this flag turns any pattern into plain text: \s will match the literal text \s, not any whitespace.
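A small sketch of that behaviour, with made-up class and method names:

```java
import java.util.regex.Pattern;

class LiteralFlagDemo {
    // With LITERAL, the two characters '\' and 's' are matched as-is.
    static boolean foundAsLiteral(String input) {
        return Pattern.compile("\\s", Pattern.LITERAL).matcher(input).find();
    }

    // Without the flag, \s is the whitespace character class.
    static boolean foundAsRegex(String input) {
        return Pattern.compile("\\s").matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(foundAsRegex("a b"));       // true: a space is whitespace
        System.out.println(foundAsLiteral("a b"));     // false: no backslash-s text present
        System.out.println(foundAsLiteral("a\\sb"));   // true: input contains the text \s
    }
}
```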
What you need to make sure of is:
- Try to write patterns where each subsequent part cannot match the same text as the preceding part to avoid excessive backtracking
- Escape the user-written literal parts of the pattern using Pattern.quote.
In your case, you can use
Pattern patternCheck = Pattern.compile("check\\s+test\\s+([\\w\\s-]+)cd(\\s+" + Pattern.quote(variable1) + "|\\s+abc\\s+" + Pattern.quote(variable2) + ")\\s+to\\s+(abc|xyz)\\s+test\\s+ab\\s+xyz", Pattern.CASE_INSENSITIVE);
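The same idea in a reduced, runnable form. The "id" prefix, the class name and the sample inputs are hypothetical; the point is only that the user-supplied part is quoted while the surrounding regex stays active.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class QuotedUserInput {
    // Hypothetical pattern: the keyword "id", whitespace, then the user's text taken literally.
    static Pattern build(String userInput) {
        return Pattern.compile("id\\s+" + Pattern.quote(userInput), Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern p = build("a.c");
        System.out.println(p.matcher("ID a.c").matches()); // true
        System.out.println(p.matcher("id abc").matches()); // false: '.' is not a wildcard here

        // Without quoting, a lone "(" from the user makes the whole pattern invalid.
        try {
            Pattern.compile("id\\s+" + "(");
        } catch (PatternSyntaxException e) {
            System.out.println("unquoted input broke the pattern");
        }

        // With quoting, the same input is matched as plain text.
        System.out.println(build("(").matcher("id (").matches()); // true
    }
}
```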
Does using Pattern.LITERAL mean the same as Pattern.quote?
Given the question as is, the answer is no, because setting x = Pattern.LITERAL leads to quoting s twice in the second expression. With double quoting and s = "A", the String "A" won't be matched, but the String "\\QA\\E" will. However,
Pattern.compile(s, x | Pattern.LITERAL)
seems to be equivalent to
Pattern.compile(Pattern.quote(s), x & ~Pattern.LITERAL)
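A quick way to check both claims on a few inputs is sketched below. The class and method names are invented, and spot checks like these illustrate the equivalence rather than prove it.

```java
import java.util.regex.Pattern;

class LiteralVsQuote {
    static Pattern viaFlag(String s, int x) {
        return Pattern.compile(s, x | Pattern.LITERAL);
    }

    static Pattern viaQuote(String s, int x) {
        return Pattern.compile(Pattern.quote(s), x & ~Pattern.LITERAL);
    }

    public static void main(String[] args) {
        int x = Pattern.CASE_INSENSITIVE;
        for (String input : new String[] {"a.b", "A.B", "axb"}) {
            // Both columns agree: true, true, false.
            System.out.println(input + ": "
                    + viaFlag("a.b", x).matcher(input).matches() + " / "
                    + viaQuote("a.b", x).matcher(input).matches());
        }

        // Quoting twice: Pattern.quote plus LITERAL matches the quoted text itself.
        Pattern twice = Pattern.compile(Pattern.quote("A"), Pattern.LITERAL);
        System.out.println(twice.matcher("A").matches());        // false
        System.out.println(twice.matcher("\\QA\\E").matches());  // true
    }
}
```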
When replacing backslashes with Java regex, why does the Pattern class not recognize single backslashes?
The problem is that \ is also used as the escape character within the regular expression. To match a single \ you need the regular expression \\, which must be written as the Java string literal "\\\\". Ugly, I know, but that's how it is.
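For illustration, a sketch that replaces backslashes in a made-up Windows-style path. It also shows String.replace, which takes plain text rather than a regex and so avoids the double escaping.

```java
class BackslashReplace {
    // Regex version: the regex \\ (one escaped backslash) is the Java literal "\\\\".
    static String withRegex(String s) {
        return s.replaceAll("\\\\", "/");
    }

    // Non-regex version: String.replace treats its arguments as literal text.
    static String withLiteral(String s) {
        return s.replace("\\", "/");
    }

    public static void main(String[] args) {
        String path = "C:\\dir\\file.txt"; // the actual text is C:\dir\file.txt
        System.out.println(withRegex(path));   // C:/dir/file.txt
        System.out.println(withLiteral(path)); // C:/dir/file.txt
    }
}
```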