Regex for matching something if it is not preceded by something else
You want to use negative lookbehind, like this:
\w*(?<!foo)bar
where (?<!x) means "only if it doesn't have "x" before this point".
See Regular Expressions - Lookaround for more information.
Edit: added the \w* to capture the characters before (e.g. "beach").
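To see the lookbehind in action, here is a minimal Python sketch. The helper name find_bar_words and the sample text are made up for illustration; only the pattern comes from the answer above.

```python
import re

# Pattern from the answer: optional word characters, then "bar",
# but only where "bar" is not immediately preceded by "foo".
pattern = re.compile(r"\w*(?<!foo)bar")

def find_bar_words(text):
    """Return every substring matched by the pattern (hypothetical helper)."""
    return [m.group(0) for m in pattern.finditer(text)]

print(find_bar_words("foobar crowbar bar"))  # ['crowbar', 'bar']
```

"foobar" is skipped because the only "bar" in it sits right after "foo".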
Match string not preceded by another with a regular expression
Replace all B's not preceded by an A with AB.
Find: (?<!A)B
Replace: AB
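The same find/replace can be tried with Python's re.sub. This is only a sketch: the function name and the sample string are assumptions, not part of the original answer.

```python
import re

def prefix_lonely_b(text):
    """Replace each B that is not directly preceded by A with AB."""
    return re.sub(r"(?<!A)B", "AB", text)

print(prefix_lonely_b("B AB CB BB"))  # AB AB CAB ABAB
```

The lookbehind inspects the original text, so the second B in "BB" is also replaced: it is preceded by B, not by A.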
Find 'word' not followed by a certain character
The (?!@) negative look-ahead will make word match only if @ does not appear immediately after word:
word(?!@)
If you need to fail the match when word is followed by a character/string anywhere to the right, you may use any of the three below:
word(?!.*@) # Note: this only checks for @ on the same line as word
(?s)word(?!.*@) # (except Ruby, where you need (?m)): This will check for @ anywhere...
word(?![\s\S]*@) # ... after word even if it is on the next line(s)
See demo
This regex matches the word substring, and (?!@) makes sure there is no @ right after it; if there is, word is not returned as a match (i.e. the match fails).
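A quick way to compare the variants above is to run them in Python. This is a sketch under assumed inputs: the helper starts and both sample strings are invented for the demonstration.

```python
import re

def starts(pattern, text):
    """Return the start offsets of every match of pattern in text."""
    return [m.start() for m in re.finditer(pattern, text)]

same_line = "word@ word! word"
two_lines = "word here\nlater @ there"

# Only the occurrences not immediately followed by @ match.
print(starts(r"word(?!@)", same_line))          # [6, 12]

# Without DOTALL, .* stops at the newline, so the @ on line 2 is not seen.
print(starts(r"word(?!.*@)", two_lines))        # [0]

# With (?s) or [\s\S]*, the lookahead scans across lines and the match fails.
print(starts(r"(?s)word(?!.*@)", two_lines))    # []
print(starts(r"word(?![\s\S]*@)", two_lines))   # []
```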
From Regular-expressions.info:
Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u). The negative lookahead construct is the pair of parentheses, with the opening parenthesis followed by a question mark and an exclamation point.
And on the Character classes page:
It is important to remember that a negated character class still must match a character. q[^u] does not mean: "a q not followed by a u". It means: "a q followed by a character that is not a u". It does not match the q in the string Iraq. It does match the q and the space after the q in Iraq is a country. Indeed: the space becomes part of the overall match, because it is the "character that is not a u" that is matched by the negated character class in the above regexp. If you want the regex to match the q, and only the q, in both strings, you need to use negative lookahead: q(?!u).
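The difference described in the quote is easy to check in Python. A small sketch, with a hypothetical helper first_match and the two strings from the quote:

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if there is none."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

# The negated class must consume a character after q.
print(first_match(r"q[^u]", "Iraq"))               # None
print(first_match(r"q[^u]", "Iraq is a country"))  # 'q ' (q plus the space)

# The lookahead consumes nothing, so only q is matched in both strings.
print(first_match(r"q(?!u)", "Iraq"))              # 'q'
print(first_match(r"q(?!u)", "Iraq is a country")) # 'q'
```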
Match pattern not preceded by character
Use ([^^\w]|^)\w+ (see http://regexr.com/3e85b).
It basically injects a word boundary while excluding the ^ as well: [^\w] = \W\b\w.
Otherwise [^^] will match a '^T' and \w+ will match est.
You can see it if you put capture groups around it.
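Here is a hedged Python sketch of that idea. The sample string is an assumption, and the pattern is adapted slightly: the prefix group is made non-capturing and a capture group is put around \w+ so that re.findall returns only the word. A naive [^^]\w+ is included for comparison.

```python
import re

text = "^Test word"

# Naive attempt: [^^] happily consumes the "T", so "Test" still matches.
naive = re.findall(r"[^^]\w+", text)
print(naive)  # ['Test', ' word']

# Adapted pattern: the prefix must be a non-word, non-^ character
# (or the start of the string); the word itself is captured.
def words_not_after_caret(s):
    return re.findall(r"(?:[^^\w]|^)(\w+)", s)

print(words_not_after_caret(text))         # ['word']
print(words_not_after_caret("Test word"))  # ['Test', 'word']
```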
Match if something is not preceded by something else
Unfortunately, there is no way to use a single pattern to match a string not preceded by some sequence in Lua (note that you can't even rely on capturing an alternative that you need, since TEST%d+|(%d+) will not work in Lua: Lua patterns do not support alternation).
You may remove all substrings that start with TEST followed by digits, and then extract the digit chunks:
local s = "TEST2XX_R_00.01.211_TEST"
for x in string.gmatch(s:gsub("TEST%d+",""), "%d+") do
    print(x)
end
See the Lua demo
Here, s:gsub("TEST%d+","") will remove every TEST<digits> substring, and the %d+ pattern used with string.gmatch will extract all digit chunks that remain.
Match pattern not preceded or followed by string
With your second attempt, which performs a logical AND, you are almost there. Just use | to separate the two possible scenarios:
(?<![A-Z]{2})(\d{9,})|(\d{9,})(?![A-Z]{2})
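A minimal Python sketch of how the alternation behaves. The helper extract_number and the sample strings are assumptions, and the samples deliberately use exactly nine digits. Since each branch has its own capture group, the result is read from whichever group participated.

```python
import re

# Pattern from the answer: digits not preceded by two capitals,
# OR digits not followed by two capitals.
PATTERN = re.compile(r"(?<![A-Z]{2})(\d{9,})|(\d{9,})(?![A-Z]{2})")

def extract_number(text):
    """Return the matched digit run, or None (hypothetical helper)."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return m.group(1) or m.group(2)

print(extract_number("AB123456789CD"))  # None: capitals on both sides
print(extract_number("AB123456789"))    # 123456789 (second branch)
print(extract_number("123456789CD"))    # 123456789 (first branch)
```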
Regex match characters when not preceded by a string
Doing it with only one regex will be tricky: as stated in the comments, there are lots of edge cases.
I would do it in three steps:
- Replace spaces that should stay with some special character (re.sub)
- Split the text (re.split)
- Replace the special character with a space
For example:
import re
zero_width_space = '\u200B'
s = 'I am from New York, N.Y. and I would like to say hello! How are you today? I am well. I owe you $6. 00 because you bought me a No. 3 burger. -Sgt. Smith'
s = re.sub(r'(?<=\.)\s+(?=[\da-z])|(?<=,)\s+|(?<=Sgt\.)\s+', zero_width_space, s)
s = re.split(r'(?<=[.?!])\s+', s)
from pprint import pprint
pprint([line.replace(zero_width_space, ' ') for line in s])
Prints:
['I am from New York, N.Y. and I would like to say hello!',
'How are you today?',
'I am well.',
'I owe you $6. 00 because you bought me a No. 3 burger.',
'-Sgt. Smith']
Regex until character but if not preceded by another character
You may use
\bLocalize\("([^"\\]*(?:\\.[^"\\]*)*)
See this regex demo.
Details:
\bLocalize - a whole word Localize
\(" - a (" substring
([^"\\]*(?:\\.[^"\\]*)*) - Capturing group 1:
  [^"\\]* - 0 or more chars other than " and \
  (?:\\.[^"\\]*)* - 0 or more repetitions of an escaped char followed with 0 or more chars other than " and \
In Python, declare the pattern with
reg = r'\bLocalize\("([^"\\]*(?:\\.[^"\\]*)*)'
Demo:
import re
reg = r'\bLocalize\("([^"\\]*(?:\\.[^"\\]*)*)'
s = "Localize(\"/Windows/Actions/DeleteActionWarning=The action you are trying to \\\"delete\\\" is referenced in this document.\") + \" Want to Proceed ?\";"
m = re.search(reg, s)
if m:
    print(m.group(1))
# => /Windows/Actions/DeleteActionWarning=The action you are trying to \"delete\" is referenced in this document.
Find string not preceded by other string
The current regex matches oo in foo because oo( is not preceded with "def ".
To stop the pattern from matching inside a word, you may use a word boundary, \b, and the fix might look like r"\b(?<!\bdef )([a-zA-Z0-9.]+?)\(".
Note that identifiers can be matched with [a-zA-Z_][a-zA-Z0-9_]*, so your pattern can be enhanced like
re.findall(r'\b(?<!\bdef\s)([a-zA-Z_]\w*(?:\.[a-zA-Z_]\w*)*)\(', s, re.A)
Note that re.A or re.ASCII will make \w match only ASCII letters, digits and _.
See the regex demo.
Details
\b - a word boundary
(?<!\bdef\s) - no def + whitespace allowed immediately to the left of the current location
([a-zA-Z_]\w*(?:\.[a-zA-Z_]\w*)*) - Capturing group 1 (its value will be the result of the re.findall call):
  [a-zA-Z_] - an ASCII letter or _
  \w* - 0 or more word chars
  (?: - start of a non-capturing group matching a sequence of...
    \. - a dot
    [a-zA-Z_] - an ASCII letter or _
    \w* - 0 or more word chars
  )* - ... zero or more times
\( - a ( char.
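Put together, a short runnable sketch looks like this. The sample source string and the function name called_names are made up for the demonstration; the pattern is the one from the answer.

```python
import re

CALL_RE = re.compile(
    r'\b(?<!\bdef\s)([a-zA-Z_]\w*(?:\.[a-zA-Z_]\w*)*)\(',
    re.A,  # \w matches ASCII letters, digits and _ only
)

def called_names(source):
    """Return names followed by ( that are not function definitions."""
    return CALL_RE.findall(source)

sample = "def foo(x):\n    return bar(x) + np.sqrt(x)"
print(called_names(sample))  # ['bar', 'np.sqrt']
```

foo is skipped because it directly follows "def ", and the leading \b prevents a partial match such as oo from inside foo.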