Regular Expression to Match a Backslash Followed by a Quote

Regular expression to match a backslash followed by a quote

If you don't need regex features such as predefined character classes (\d) or quantifiers, use replace, which takes literal text, instead of replaceAll, which expects a regex:

str = str.replace("\\\"","\"");

Both methods replace every occurrence of the target, but replace treats the target literally.


BUT if you really must use regex, you are looking for

str = str.replaceAll("\\\\\"", "\"");

\ is a special character in regex (it is used, for instance, to create \d, the character class representing digits). To make the regex engine treat \ as an ordinary character, you need to place another \ before it to turn off its special meaning (that is, you need to escape it). So the regex for the backslash itself is \\, and the full regex is \\".

But to write a string literal representing the text \\, so that you can pass it to the regex engine, you need four backslashes ("\\\\"), because \ is also a special character in String literals (the parts of code written as "..."): it is used, for instance, in \t to represent a tab.
That is why you need to escape \ there as well.

In short you need to escape \ twice:

  • in regex \\
  • and then in String literal "\\\\"
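The two approaches above can be compared side by side. The following is a minimal sketch; the class and method names (BackslashQuoteDemo, unescapeLiteral, unescapeRegex) are made up for illustration and are not part of the original answer.

```java
class BackslashQuoteDemo {
    // Literal replacement: the target is the two characters \ and "
    static String unescapeLiteral(String s) {
        return s.replace("\\\"", "\"");
    }

    // Regex replacement: the regex is \\" (an escaped backslash, then a quote),
    // which needs four backslashes inside a Java string literal
    static String unescapeRegex(String s) {
        return s.replaceAll("\\\\\"", "\"");
    }

    public static void main(String[] args) {
        String input = "say \\\"hi\\\"";             // the text: say \"hi\"
        System.out.println(unescapeLiteral(input));  // say "hi"
        System.out.println(unescapeRegex(input));    // say "hi"
    }
}
```

Both calls produce the same text here; the literal version is simply easier to read because it has one escaping layer fewer.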

Regular expression for matching double-quote but not backslash-double-quote

You can use a negative look-behind: (?<!\\)".

(?<!reg1)reg2 means that reg2 must not be preceded by reg1. Note that reg1 is not part of the match.

Now in Java code, your regex will look slightly different, since you need to escape the double quote and both backslashes:

String regex = "(?<!\\\\)\"";
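As a quick check of the look-behind, here is a small sketch that counts the quotes not preceded by a backslash. The class and method names are hypothetical, chosen only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnescapedQuoteDemo {
    // Regex (?<!\\)" : a quote that is not immediately preceded by a backslash
    private static final Pattern UNESCAPED_QUOTE = Pattern.compile("(?<!\\\\)\"");

    static int countUnescapedQuotes(String s) {
        Matcher m = UNESCAPED_QUOTE.matcher(s);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "\"a \\\" b\"";                    // the text: "a \" b"
        System.out.println(countUnescapedQuotes(input));  // 2
    }
}
```

Note that this pattern only looks at one preceding character, so a quote that follows an escaped backslash (\\") is also skipped.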

regex for string with backslash for escape

You can use this regex to match single- or double-quoted strings, ignoring all escaped quotes:

(["'])([^\\]*?(?:\\.[^\\]*?)*)\1


RegEx Breakup:

  • (["']): Match single or double quote and capture it in group #1
  • (: Start Capturing group #2

    • [^\\]*?: Match 0 or more characters that are not a \
    • (?:: Start non-capturing group

      • \\: Match a \
      • .: Followed by any character that is escaped
      • [^\\]*?: Followed by 0 or more of any non-\ characters
    • )*: End non-capturing group. Match 0 or more of this non-capturing group
  • ): End capturing group #2
  • \1: Match the closing quote, which must be the same character that was captured in group #1
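The breakdown above can be tried out in Java. This is only a sketch: the class name QuotedStringDemo and the helper quotedBodies are invented for illustration, and the pattern is the one from the answer with the backslashes and the double quote escaped for a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedStringDemo {
    // Regex: (["'])([^\\]*?(?:\\.[^\\]*?)*)\1
    private static final Pattern QUOTED =
        Pattern.compile("([\"'])([^\\\\]*?(?:\\\\.[^\\\\]*?)*)\\1");

    // Collects the content of group #2 (the text between the quotes) for every match
    static List<String> quotedBodies(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = QUOTED.matcher(s);
        while (m.find()) {
            out.add(m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        // the text: say "he said \"hi\"" and 'it\'s'
        String input = "say \"he said \\\"hi\\\"\" and 'it\\'s'";
        for (String body : quotedBodies(input)) {
            System.out.println(body);
        }
        // prints:
        // he said \"hi\"
        // it\'s
    }
}
```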

Regex match double quote which does not follow slash character

Use this regex:

[^\\]?"(.*?[^\\])"

Explanation:

[^\\]?     match an optional single character that is not a backslash
"(.*?      match the opening quote, then as little as possible (non-greedy)
[^\\])"    match a non-backslash character followed by the closing quote

This regex will match the least content between an opening quote and the first closing quote that is not preceded by a backslash.
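A small Java sketch of this pattern follows; the class and method names are hypothetical. It returns the content of capture group 1, or null when nothing matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LeastQuotedContentDemo {
    // Regex: [^\\]?"(.*?[^\\])"
    private static final Pattern QUOTED = Pattern.compile("[^\\\\]?\"(.*?[^\\\\])\"");

    // Returns the text between the quotes of the first match, or null if none
    static String firstQuotedBody(String s) {
        Matcher m = QUOTED.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "x \"a \\\"b\\\" c\" y";       // the text: x "a \"b\" c" y
        System.out.println(firstQuotedBody(input));   // a \"b\" c
    }
}
```

Because the group ends with [^\\], the pattern needs at least one character between the quotes, so an empty string "" is not matched as a quoted string on its own.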


Get numbers between quotes and backslash with regex in C#

If you needed to match a digit sequence between two double quotes, you could use var resultfinal = Regex.Match(input, @"(?<="")[0-9]+(?="")")?.Value;. However, your original string contains escaped quotation marks, so you need to extract a digit sequence in between two \" substrings.

You can use

var input = "\\\"26201\\\",7\\0\\0";
var firstPattern = @"(?<=\\"")[0-9]+(?=\\"")";
var resultfinal = Regex.Match(input, firstPattern)?.Value;
Console.WriteLine($"final result: '{resultfinal}'");
// => final result: '26201'

The pattern is (?<=\\")[0-9]+(?=\\"). Details:

  • (?<=\\") - a positive lookbehind that requires a \" substring to occur immediately to the left of the current location
  • [0-9]+ - one or more ASCII digits (note \d+ matches any one or more Unicode digits including Hindi, Persian etc. digits unless the RegexOptions.ECMAScript option is used)
  • (?=\\") - a positive lookahead that requires a \" substring to occur immediately to the right of the current location.
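For comparison with the Java answers earlier on this page, the same lookaround pattern can be sketched in Java. This is a port for illustration only, not part of the original C# answer, and the class and method names are made up. Java has no verbatim strings, so each regex backslash is doubled and the quote is escaped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitsBetweenEscapedQuotes {
    // Regex: (?<=\\")[0-9]+(?=\\")
    private static final Pattern DIGITS =
        Pattern.compile("(?<=\\\\\")[0-9]+(?=\\\\\")");

    // Returns the first digit run enclosed in \" ... \", or an empty string if none
    static String extract(String input) {
        Matcher m = DIGITS.matcher(input);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String input = "\\\"26201\\\",7\\0\\0";   // the text: \"26201\",7\0\0
        System.out.println("final result: '" + extract(input) + "'");
        // final result: '26201'
    }
}
```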

Detecting a double-quote-enclosed string with double-quote and backslash escaping, in a Perl Compatible Regular Expression

Here you go:

"(?:\\.|[^"])*"


For each character in the string, match either a backslash followed by anything, or a character that is not a quote.

And if you need something optimized, here's an alternative:

"(?>[^\\"]++|\\.)*+"


It uses an atomic group and possessive quantifiers to avoid backtracking.
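Java's java.util.regex is not PCRE, but it also supports atomic groups and possessive quantifiers, so both patterns can be tried there. The sketch below uses hypothetical names (loose for the first pattern, strict for the second). It also shows one practical difference: on an unterminated string the first pattern can backtrack and let [^"] consume the backslash, while the second cannot.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedLiteralDemo {
    // Regex: "(?:\\.|[^"])*"
    private static final Pattern LOOSE = Pattern.compile("\"(?:\\\\.|[^\"])*\"");
    // Regex: "(?>[^\\"]++|\\.)*+"
    private static final Pattern STRICT = Pattern.compile("\"(?>[^\\\\\"]++|\\\\.)*+\"");

    // Returns the first match of the given pattern, or null if there is none
    private static String first(Pattern p, String s) {
        Matcher m = p.matcher(s);
        return m.find() ? m.group() : null;
    }

    static String loose(String s)  { return first(LOOSE, s); }
    static String strict(String s) { return first(STRICT, s); }

    public static void main(String[] args) {
        String ok = "\"a \\\"b\\\" c\"";       // the text: "a \"b\" c"
        String open = "\"abc\\\"";             // the text: "abc\"  (never closed)
        System.out.println(loose(ok));         // "a \"b\" c"
        System.out.println(strict(ok));        // "a \"b\" c"
        System.out.println(loose(open));       // "abc\"
        System.out.println(strict(open));      // null
    }
}
```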

Why escaping double quote with single and triple backslashes in a Java regular expression yields identical results

To define a " char in a string literal in Java, you need to escape it for the string parsing engine, like "\"".

The " char is not a special regex metacharacter, so you needn't escape this character for the regex engine. However, you may do it:

A backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct.

To write a regex escape, a literal backslash is used, and in a Java string literal that backslash is written as a double backslash, "\\":

It is therefore necessary to double backslashes in string literals that represent regular expressions to protect them from interpretation by the Java bytecode compiler.

So, both "\"" (a literal " string) and "\\\"" (a literal \" string) form a regex pattern that matches a single " char.
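This can be confirmed with a short sketch; the class and method names are hypothetical. Pattern.matches checks whether the whole input matches the given regex.

```java
import java.util.regex.Pattern;

class QuoteEscapeDemo {
    // True if the given regex matches the one-character string "
    static boolean matchesSingleQuoteChar(String regex) {
        return Pattern.matches(regex, "\"");
    }

    public static void main(String[] args) {
        System.out.println(matchesSingleQuoteChar("\""));     // regex is "   -> true
        System.out.println(matchesSingleQuoteChar("\\\""));   // regex is \"  -> true
    }
}
```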

Add backslash before single and double quote

If perl is okay:

perl -pe 's/"{3}(*SKIP)(*F)|[\x27"]/\\$&/g'
  • "{3}(*SKIP)(*F) don't change triple double quotes
    • use (\x27{3}|"{3})(*SKIP)(*F) if you shouldn't change triple single/double quotes
  • |[\x27"] match single or double quotes
  • \\$& prefix \ to the matched portion

With sed, you can replace the triple quotes with a newline character (since a newline cannot be present in the pattern space in the default line-by-line mode), then escape the single and double quote characters, and then change the newline characters back to triple quotes.

# assuming only triple double quotes are present
sed 's/"""/\n/g; s/[\x27"]/\\&/g; s/\n/"""/g'
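If neither perl nor sed is available, the same idea can be sketched in Java. This is an assumption-laden port rather than part of the original answer: java.util.regex has no (*SKIP)(*F), so the sketch instead matches triple double quotes first and puts them back unchanged, and prefixes a backslash to any other single or double quote. The class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteEscaper {
    // Try """ first so that triple double quotes are consumed as one unit
    private static final Pattern TRIPLE_OR_QUOTE = Pattern.compile("\"{3}|['\"]");

    static String escapeQuotes(String s) {
        Matcher m = TRIPLE_OR_QUOTE.matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String hit = m.group();
            // Keep """ as is; prefix a backslash to a lone ' or "
            String replacement = hit.length() == 3 ? hit : "\\" + hit;
            // quoteReplacement stops \ and $ from being treated specially
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "\"\"\"doc\"\"\" it's \"x\"";   // the text: """doc""" it's "x"
        System.out.println(escapeQuotes(input));       // """doc""" it\'s \"x\"
    }
}
```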

