String.replaceAll(regex) makes the same replacement twice
This is not an anomaly: .* can match anything.
You ask to replace all occurrences:
- the first occurrence matches the whole string, so the regex engine starts from the end of the input for the next match;
- but .* also matches an empty string! It therefore matches an empty string at the end of the input, and replaces it with "a".
Using .+ instead will not exhibit this problem, since this regex cannot match an empty string (it requires at least one character to match).
Or, use .replaceFirst() to only replace the first occurrence:
"test".replaceFirst(".*", "a")
       ^^^^^^^^^^^^
Now, why .* behaves the way it does and does not match more than twice (it theoretically could) is an interesting thing to consider. See below:
# Before first run
regex: |.*
input: |whatever
# After first run
regex: .*|
input: whatever|
# Before second run
regex: |.*
input: whatever|
# After second run: since .* can match an empty string, it is satisfied...
regex: .*|
input: whatever|
# However, this means the regex engine matched an empty input.
# Most regex engines, in this situation, will shift
# one character further in the input.
# So, before third run, the situation is:
regex: |.*
input: whatever<|ExhaustionOfInput>
# Nothing can ever match here: out
Note that, as @A.H. notes in the comments, not all regex engines behave this way. GNU sed, for instance, will consider that it has exhausted the input after the first match.
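The trace above can be checked directly in Java. The following is a minimal sketch; the class name EmptyMatchDemo and the helper matchSpans are hypothetical names chosen for illustration. It lists each match as a half-open [start,end) span so that the empty match at the end of the input becomes visible.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    // Collects every match of the regex as "[start,end)" so empty matches show up.
    static List<String> matchSpans(String regex, String input) {
        List<String> spans = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            spans.add("[" + m.start() + "," + m.end() + ")");
        }
        return spans;
    }

    public static void main(String[] args) {
        System.out.println(matchSpans(".*", "test"));        // [[0,4), [4,4)]
        System.out.println(matchSpans(".+", "test"));        // [[0,4)]
        System.out.println("test".replaceAll(".*", "a"));    // aa
        System.out.println("test".replaceAll(".+", "a"));    // a
        System.out.println("test".replaceFirst(".*", "a"));  // a
    }
}
```

The second span, [4,4), is the empty match at the end of the input that produces the extra replacement.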
Why does my Regex.Replace string contain the replacement value twice?
There are actually 2 matches in your Regex. You defined your match like this:
string match = "(.*)";
It means match zero or more characters, so you get 2 matches: your text, and then an empty string at the end of the input. To fix it, change the pattern to
string match = "(.+)";
It means match one or more characters; in that case you will only get a single match.
How can I more efficiently call replaceAll twice on a single string
Because of the first line, output is basically the equivalent of
Pattern.compile("(\\r|\\n|\\t)").matcher(obj).replaceAll("")
Because of that, you can replace the variable output in the second line with Pattern.compile("(\\r|\\n|\\t)").matcher(obj).replaceAll(""). Then the line would become
Pattern.compile("[^\\p{Print}]").matcher(Pattern.compile("(\\r|\\n|\\t)").matcher(obj).replaceAll("")).replaceAll(replacement);
However, this does not really improve performance, and has a negative impact on readability. Unless you have a really good reason, it would be best to just use the first two lines.
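If the two replacements run many times, one option that keeps the two-step form readable is to compile each pattern once and reuse it. This is only a sketch under that assumption: the class name Sanitizer, the constant names, and the method sanitize are hypothetical, and the two patterns are the ones quoted above.

```java
import java.util.regex.Pattern;

class Sanitizer {
    // Compiled once and reused; Pattern instances are immutable and safe to share.
    private static final Pattern LINE_BREAKS_AND_TABS = Pattern.compile("(\\r|\\n|\\t)");
    private static final Pattern NON_PRINTABLE = Pattern.compile("[^\\p{Print}]");

    // Step 1 removes CR, LF and TAB; step 2 replaces any remaining non-printable character.
    static String sanitize(String obj, String replacement) {
        String output = LINE_BREAKS_AND_TABS.matcher(obj).replaceAll("");
        return NON_PRINTABLE.matcher(output).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String raw = "a\tb" + (char) 7 + "c\n"; // contains a tab, a BEL character and a newline
        System.out.println(sanitize(raw, "?")); // ab?c
    }
}
```

Whether this is measurably faster depends on how often the code runs; it mainly avoids recompiling the same patterns on every call.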
Why does String.replaceAll( .* , REPLACEMENT ) give unexpected behavior in Java 8?
// specify start and end of line
String regexStr = "^.*$";
String replacementStr = "REPLACEMENT";
String initialStr = "hello";
String finalStr = initialStr.replaceAll(regexStr, replacementStr);
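To see why the anchors help, the anchored and unanchored patterns can be compared side by side. This is a small sketch; the class and method names are hypothetical. With ^ in the pattern, the empty match at the end of the input is no longer possible, because ^ only matches at the start of the input (when MULTILINE is not set).

```java
class AnchoredReplaceDemo {
    // Unanchored: matches the whole input, then an empty string at the end.
    static String replaceUnanchored(String s) {
        return s.replaceAll(".*", "REPLACEMENT");
    }

    // Anchored: the second, empty match is rejected because ^ cannot match at the end.
    static String replaceAnchored(String s) {
        return s.replaceAll("^.*$", "REPLACEMENT");
    }

    public static void main(String[] args) {
        System.out.println(replaceUnanchored("hello")); // REPLACEMENTREPLACEMENT
        System.out.println(replaceAnchored("hello"));   // REPLACEMENT
    }
}
```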
Replace multiple characters in one replace call
If you want to replace multiple characters, you can call String.prototype.replace() with the replacement argument being a function that gets called for each match. All you need is an object representing the character mapping that you will use in that function.
For example, if you want a replaced with x, b with y, and c with z, you can do something like this:
const chars = {'a':'x','b':'y','c':'z'};
let s = '234abc567bbbbac';
s = s.replace(/[abc]/g, m => chars[m]);
console.log(s);
Output: 234xyz567yyyyxz
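For readers working in Java rather than JavaScript, roughly the same idea can be sketched with Matcher.replaceAll(Function), which is available from Java 9. The class name CharMapReplace and the method translate are hypothetical; the mapping mirrors the chars object above.

```java
import java.util.Map;
import java.util.regex.Pattern;

class CharMapReplace {
    // Same mapping as the JavaScript object: a -> x, b -> y, c -> z.
    private static final Map<String, String> CHARS = Map.of("a", "x", "b", "y", "c", "z");
    private static final Pattern ABC = Pattern.compile("[abc]");

    // The function is called once per match and returns that match's replacement.
    static String translate(String s) {
        return ABC.matcher(s).replaceAll(m -> CHARS.get(m.group()));
    }

    public static void main(String[] args) {
        System.out.println(translate("234abc567bbbbac")); // 234xyz567yyyyxz
    }
}
```

Note that the returned string is still interpreted as a replacement string, so a mapping value containing $ or \ would need Matcher.quoteReplacement.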
replaceAll() works once, but not twice?
Because . means "any character" in a regex, you need to escape it. An unescaped . matches every character, so replaceAll replaces everything in the string. For a literal replacement, you can simply use replace instead of a costly regex:
book.replace(",", "");
or remove both , and . in a single step:
book.replaceAll("[.,]", "");
[.,]: [] denotes a character class, which here matches either a comma or a dot.
If you prefer to use replace and remove the characters one at a time, you can chain the replace calls:
String book = "The .bo..ok of ,eli..";
book = book.replace(",", "").replace(".", ""); // The book of eli
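The difference between these calls can be compared in one place. The following is a minimal sketch; the class name DotReplaceDemo is hypothetical, and the input is the string from the example above. Since Java strings are immutable, each call returns a new string and leaves the original untouched.

```java
class DotReplaceDemo {
    static final String BOOK = "The .bo..ok of ,eli..";

    public static void main(String[] args) {
        // Unescaped dot as a regex: every character matches, so nothing is left.
        System.out.println("[" + BOOK.replaceAll(".", "") + "]"); // []

        // Literal replacement: only real dots are removed.
        System.out.println(BOOK.replace(".", ""));                // The book of ,eli

        // Escaped dot as a regex: same result as the literal replacement.
        System.out.println(BOOK.replaceAll("\\.", ""));           // The book of ,eli

        // Character class: dots and commas are removed in one pass.
        System.out.println(BOOK.replaceAll("[.,]", ""));          // The book of eli
    }
}
```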
How to replace a String twice with oldValues having same text within?
You should use a regex, like this: (VB, tested)
Regex.Replace(str, "(Chairman\s+)?(Joe\s+)?Smith", _
"<a href='smithbio.html'>$0</a>")
$0 stands for the entire match; it is one of several expressions that can be included in the replacement string.
If you only know the names at runtime, you should make sure to call Regex.Escape.
String.replaceAll single backslashes with double backslashes
String#replaceAll() interprets its first argument as a regular expression. The \ is an escape character in both String literals and regex. You need to double-escape it for regex:
string.replaceAll("\\\\", "\\\\\\\\");
But you don't necessarily need regex for this, simply because you want an exact character-by-character replacement and you don't need patterns here. So String#replace() should suffice:
string.replace("\\", "\\\\");
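The two approaches can be compared in a short, self-contained sketch. The class name BackslashDoubling and its method names are hypothetical. The third variant uses Pattern.quote and Matcher.quoteReplacement, which is one way to avoid counting backslashes by hand when a regex call is required.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BackslashDoubling {
    // Regex form: "\\\\" is the regex \\ (one literal backslash),
    // "\\\\\\\\" is the replacement \\\\ (two literal backslashes).
    static String viaRegex(String s) {
        return s.replaceAll("\\\\", "\\\\\\\\");
    }

    // Literal form: no regex, so only Java string-literal escaping applies.
    static String viaLiteral(String s) {
        return s.replace("\\", "\\\\");
    }

    // Regex form with both sides quoted, so neither is interpreted specially.
    static String viaQuoting(String s) {
        return s.replaceAll(Pattern.quote("\\"), Matcher.quoteReplacement("\\\\"));
    }

    public static void main(String[] args) {
        String path = "C:\\temp\\file.txt";   // the actual text is C:\temp\file.txt
        System.out.println(viaRegex(path));   // C:\\temp\\file.txt
        System.out.println(viaLiteral(path)); // C:\\temp\\file.txt
        System.out.println(viaQuoting(path)); // C:\\temp\\file.txt
    }
}
```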
Update: as per the comments, you appear to want to use the string in a JavaScript context. You'd perhaps better use StringEscapeUtils#escapeEcmaScript() instead to cover more characters.
Python + Regex + Replace pattern with multiple copies of that pattern
You should use single-quotes, raw strings, and re.sub:
import re

string = r'\\\" asdf \" \ \ \\"'
new_string = re.sub(r'(\\+)"', r'\1\1"', string)
print(new_string)
Output:
\\\\\\" asdf \\" \ \ \\\\"
The Pattern
To explain the pattern, first let's remove the parentheses; they don't affect what's matched, and we'll put them back later. The pattern r'\\+"' means "one or more backslashes followed by a double-quote". Even though it's a raw string, we still have to escape the backslash because backslashes have special meaning in regular expressions; that's why it's r'\\+"' instead of r'\+"'.
The Parentheses
The parentheses around the \\+ in the actual pattern just mean "capture the part of the match inside these parentheses". This will put the substring of all backslashes in this match into a capture group. We're going to use this capture group in the replacement string.
The Replacement String
The replacement string, r'\1\1"', just means "two copies of the first capture group followed by a double-quote" (in this case there's only one capture group, but there can be more). The replacement string has a double-quote because the match had one; since the entire match is replaced by the replacement string, leaving the double-quote out would remove the double-quotes from the result.