Regex match unescaped quotes
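You can use this (it relies on \K, which PCRE and Perl support but JavaScript and Python's re module do not):
(?<!\\)(?:\\{2})*\K"
(?<!\\) - a negative lookbehind that checks there is no backslash immediately before
(?:\\{2})* - matches any even number of backslashes, including zero
\K - drops everything matched so far (here, the backslashes) from the match result, so only the quote is returned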
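Where \K is not available, one possible workaround is to capture the quote and read its position from the group. The following is a minimal Python sketch of that idea; the names UNESCAPED_QUOTE and unescaped_quote_positions are hypothetical and not part of the original answer.

```python
import re

# Same idea as the pattern above, but without \K (Python's re lacks it):
# the quote is captured in group 1 so its position can be read directly.
UNESCAPED_QUOTE = re.compile(r'(?<!\\)(?:\\\\)*(")')

def unescaped_quote_positions(text):
    """Return the index of every double quote not escaped by a backslash."""
    return [m.start(1) for m in UNESCAPED_QUOTE.finditer(text)]

# a"b\"c\\"d : the quote after b is escaped, the one after \\ is not
print(unescaped_quote_positions(r'a"b\"c\\"d'))  # [1, 8]
```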
Regex non-escaped quotation marks
It seems you want to replace those unescaped quotes, and for that you need neither \K nor lookbehinds. Replace the lookbehind with an equivalent alternation group, capture the part you need to keep with a capturing group, and restore it with a backreference in the replacement.
s.replace(/((?:^|[^\\])(?:\\{2})*)"/g, "$1'")
Details
((?:^|[^\\])(?:\\{2})*) - Group 1 (its value can be accessed with the $1 placeholder in the replacement pattern):
    (?:^|[^\\]) - either the start of the string or any char other than \
    (?:\\{2})* - 0+ occurrences of a double backslash
" - a double quote.
JS demo:
var rx = /((?:^|[^\\])(?:\\{2})*)"/g;
var s = "hello\"there\\\"boo\\\\\\\\\"elephant";
console.log("String:", s);
console.log("Result:", s.replace(rx, "$1'"));
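For comparison, here is a rough Python port of the same replacement. It also shows one limitation of the capture-group form: because (?:^|[^\\]) consumes a character, the second of two adjacent unescaped quotes is skipped. The lookbehind variant and the names CAPTURE_RX, LOOKBEHIND_RX and replace_unescaped are assumptions added for illustration, not part of the original answer.

```python
import re

# Direct port of the JS pattern: group 1 holds what must be kept.
CAPTURE_RX = re.compile(r'((?:^|[^\\])(?:\\{2})*)"')
# Assumed alternative for engines with lookbehind: nothing before the
# backslashes is consumed, so adjacent quotes are both found.
LOOKBEHIND_RX = re.compile(r'(?<!\\)((?:\\{2})*)"')

def replace_unescaped(text, rx=CAPTURE_RX):
    """Replace each unescaped double quote with a single quote."""
    return rx.sub(r"\1'", text)

s = "hello\"there\\\"boo\\\\\\\\\"elephant"
print(replace_unescaped(s))                    # hello'there\"boo\\\\'elephant
print(replace_unescaped('""'))                 # '"  (second quote missed)
print(replace_unescaped('""', LOOKBEHIND_RX))  # ''
```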
regex to match anything except an unescaped quote
Anything that is escaped has to be matched with an escape that is not itself escaped:
(?<!\\)(?:\\\\)*\\ followed by the escaped character
Furthermore, since escapes can be escaped, you have to match anything that
is escaped inside the quotes.
To that end, it is basically this form:
(?<!\\)(?:\\\\)*"[^\\"]*(?:\\[\S\s][^\\"]*)*"
see https://regex101.com/r/LRgBlQ/1
Note that the beginning part (?<!\\)(?:\\\\)* can be omitted if you are already handling the pre-quote part with another sub-expression.
(?<! \\ )                    # Not an escape behind
(?: \\\\ )*                  # Optional even escapes
"                            # Open quote
[^\\"]*                      # Not an escape nor double quote
(?:
     \\ [\S\s] [^\\"]*       # Escape anything then more not escaped, etc ...
)*
"                            # Close quote
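As an illustration, the expanded pattern above can be used nearly verbatim with Python's re.VERBOSE flag. This is only a sketch: the capturing group around the quoted part and the names QUOTED and find_quoted are additions made for the example.

```python
import re

QUOTED = re.compile(r'''
    (?<!\\)            # not an escape behind
    (?:\\\\)*          # optional even number of escapes
    (                  # group 1 (added here): the quoted string itself
      "                # open quote
      [^\\"]*          # not an escape nor a double quote
      (?:
        \\[\S\s]       # an escape followed by any character
        [^\\"]*        # then more unescaped characters
      )*
      "                # close quote
    )
''', re.VERBOSE)

def find_quoted(text):
    """Return every double-quoted string, with escaped quotes kept inside."""
    return [m.group(1) for m in QUOTED.finditer(text)]

text = r'say "a \"quoted\" word" and \"not this\" but "this"'
print(find_quoted(text))
```

The escaped quotes around "not this" do not open a string, because the lookbehind rejects a quote preceded by an odd number of backslashes.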
Match unescaped quotes in quoted csv
EDIT: Updated with regex from @sundance to avoid beginning of line and newline.
You could try removing only the quotes that are not at the start of the line, not right after a comma, and not right before a comma or the end of the line:
import re
newline = re.sub(r'(?<!^)(?<!,)"(?!,|$)', '', line)
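A small self-contained sketch of that substitution follows. The sample lines are made up for illustration and strip_inner_quotes is a hypothetical helper name. Note that a stray quote sitting next to a comma inside a field would be left alone, since the pattern cannot tell it from a field delimiter.

```python
import re

# Quotes at the start of the line, after a comma, before a comma, or at the
# end of the line are treated as field delimiters and kept.
INNER_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,|$)')

def strip_inner_quotes(line):
    """Remove double quotes that are not field delimiters in a quoted CSV line."""
    return INNER_QUOTE.sub('', line)

print(strip_inner_quotes('"a","b "inner" c","d"'))  # "a","b inner c","d"
```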
Regex to remove unescaped quotes from a CSV
The following solution only meets your current requirements and is not a universal solution to fix quotes in CSV:
(^"|"$|";+"|";\d+;")|"
Replace with $1 (or \1, depending on where you use this regex).
Details
(^"|"$|";+"|";\d+;") - Group 1:
    ^" - a " at the start of the string, or
    "$ - a " at the end of the string, or
    ";+" - a ", 1+ ; chars, and then a ", or
    ";\d+;" - ";, 1+ digits, then ;"
| - or
" - a " char.
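In Python the same replacement could look like the sketch below, assuming Python 3.5 or later, where an unmatched group in the replacement expands to an empty string. The sample line and the name fix_quotes are invented for the example, and the function is meant to be applied to one line at a time.

```python
import re

# Group 1 matches the quotes that must survive; the bare " alternative
# matches every other quote, which is then replaced by nothing.
CSV_QUOTES = re.compile(r'(^"|"$|";+"|";\d+;")|"')

def fix_quotes(line):
    """Drop stray quotes from one semicolon-delimited, quoted CSV line."""
    return CSV_QUOTES.sub(r'\1', line)

print(fix_quotes('"a "b" c";"d";;"e";5;"f"'))  # "a b c";"d";;"e";5;"f"
```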
Javascript Regex: count unescaped quotes in string
You need a small parser for this task, because JavaScript regex has no \G operator that could anchor each subsequent match to the end of the previous successful match.
var s = "\"some text\" with 5 unescaped double quotes... \\\"extras\" \\some \\\"string \\\" right\" here \"";
var res = 0;
var in_entity = false;
for (var i = 0; i < s.length; i++) {
  if ((s[i] === '\\' && !in_entity) || in_entity) { // reverse the flag
    in_entity = !in_entity;
  } else if (s[i] === '"' && !in_entity) { // an unescaped "
    res += 1;
  }
}
console.log(s, ": ", res);
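The same state machine carries over to other languages. Below is a minimal Python sketch of it; count_unescaped_quotes is a hypothetical name, and the test string is the one from the JS snippet.

```python
def count_unescaped_quotes(text):
    """Count double quotes that are not preceded by an active backslash escape."""
    count = 0
    escaped = False  # True when the previous char was an unescaped backslash
    for ch in text:
        if escaped:
            escaped = False   # this char is consumed by the escape
        elif ch == "\\":
            escaped = True    # start of an escape sequence
        elif ch == '"':
            count += 1        # an unescaped "
    return count

s = "\"some text\" with 5 unescaped double quotes... \\\"extras\" \\some \\\"string \\\" right\" here \""
print(count_unescaped_quotes(s))  # 5
```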
Regular expression to find unescaped double quotes in CSV file
Try this:
(?m)""(?![ \t]*(,|$))
Explanation:
(?m)          // enable multi-line matching (^ will act as the start of the line and $ will act as the end of the line (i))
""            // match two successive double quotes
(?!           // start negative look ahead
  [ \t]*      //   zero or more spaces or tabs
  (           //   open group 1
    ,         //     match a comma
    |         //     OR
    $         //     the end of the line or string
  )           //   close group 1
)             // stop negative look ahead
So, in plain English: "match two successive double quotes, only if they DON'T have a comma or end-of-the-line ahead of them with optionally spaces and tabs in between".
(i) besides being the normal start-of-the-string and end-of-the-string meta characters.
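To see the pattern at work, here is a short Python sketch that reports where such doubled quotes occur. The sample text and the name find_doubled_quotes are made up for illustration.

```python
import re

# (?m) makes $ match at the end of every line, not only at the end of the string.
DOUBLED = re.compile(r'(?m)""(?![ \t]*(,|$))')

def find_doubled_quotes(text):
    """Return the start index of each "" not followed by a comma or end of line."""
    return [m.start() for m in DOUBLED.finditer(text)]

text = '"a ""b"" c",""\n"d"" e"'
print(find_doubled_quotes(text))  # [3, 6, 17]
```

The "" at the end of the first line is skipped because the end of the line follows it directly.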