A Regex to Match a Substring That Isn't Followed by a Certain Other Substring

Find 'word' not followed by a certain character

The (?!@) negative look-ahead will make word match only if @ does not appear immediately after word:

word(?!@)

If you need to fail the match when word is followed by a character or string anywhere to its right, you can use any of the three patterns below:

word(?!.*@)       # Note this will require @ to be on the same line as word
(?s)word(?!.*@) # (except Ruby, where you need (?m)): This will check for @ anywhere...
word(?![\s\S]*@) # ... after word even if it is on the next line(s)
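
As a rough illustration of how the three variants differ, here is a minimal Python sketch. The two-line sample text is made up for the demonstration; only the three patterns come from the answer above.

```python
import re

# Hypothetical sample: "word" on line 1, "@" only on line 2.
text = "word here\nand an @ on the next line"

# '.' stops at the newline, so the '@' on line 2 is never seen
# and the negative lookahead lets the match through.
same_line = re.search(r"word(?!.*@)", text)

# With (?s) (DOTALL), '.' also matches newlines, so the '@' is found
# and the match is rejected.
dotall = re.search(r"(?s)word(?!.*@)", text)

# [\s\S] matches any character, newline included, without any flag.
any_char = re.search(r"word(?![\s\S]*@)", text)

print(same_line is not None)  # True
print(dotall is None)         # True
print(any_char is None)       # True
```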


This regex matches the substring word, and (?!@) makes sure there is no @ immediately after it. If there is one, the match fails and word is not returned.
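
A short Python sketch of that behaviour, assuming a made-up input string; it also shows that the lookahead checks the next character without making it part of the match.

```python
import re

pattern = re.compile(r"word(?!@)")

# Hypothetical input: the first "word" is followed by "@", the others are not.
matches = pattern.findall("word@ word word!")
print(matches)  # ['word', 'word']

# The lookahead is zero-width: the "!" is checked but not consumed.
m = pattern.search("a word!")
print(m.group(), m.span())  # word (2, 6)

# Only the character immediately after "word" counts, so a later "@" is fine.
print(pattern.search("word @") is not None)  # True
```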

From Regular-expressions.info:

Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u). The negative lookahead construct is the pair of parentheses, with the opening parenthesis followed by a question mark and an exclamation point.

And on the Character classes page:

It is important to remember that a negated character class still must match a character. q[^u] does not mean: "a q not followed by a u". It means: "a q followed by a character that is not a u". It does not match the q in the string Iraq. It does match the q and the space after the q in Iraq is a country. Indeed: the space becomes part of the overall match, because it is the "character that is not a u" that is matched by the negated character class in the above regexp. If you want the regex to match the q, and only the q, in both strings, you need to use negative lookahead: q(?!u).
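
The difference described in the quote can be checked with a small Python sketch. The helper name first_match is only an illustration, not something from the quoted tutorial.

```python
import re

def first_match(pattern, text):
    """Return the text of the first match, or None if there is none."""
    m = re.search(pattern, text)
    return m.group() if m else None

for text in ("Iraq", "Iraq is a country"):
    # q[^u] must consume one more character after the q.
    consuming = first_match(r"q[^u]", text)
    # q(?!u) only looks at what follows; it matches the q alone.
    lookahead = first_match(r"q(?!u)", text)
    print(repr(text), repr(consuming), repr(lookahead))
```

For "Iraq" the negated class finds nothing because no character follows the q, while for "Iraq is a country" it returns the q together with the space. The lookahead version returns just q in both cases.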

A regex to match a substring that isn't followed by a certain other substring

Try:

/(?!.*bar)(?=.*foo)^(\w+)$/

Tests:

blahfooblah            # pass
blahfooblahbarfail # fail
somethingfoo # pass
shouldbarfooshouldfail # fail
barfoofail # fail
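
The tests above can be reproduced with a minimal Python sketch like the following. The pattern is taken from the answer with the surrounding / delimiters dropped, since Python does not use them.

```python
import re

pattern = re.compile(r"(?!.*bar)(?=.*foo)^(\w+)$")

samples = [
    "blahfooblah",
    "blahfooblahbarfail",
    "somethingfoo",
    "shouldbarfooshouldfail",
    "barfoofail",
]

# True means the whole string is accepted, False means it is rejected.
results = {s: bool(pattern.search(s)) for s in samples}
for s, ok in results.items():
    print(f"{s:25} {'pass' if ok else 'fail'}")
```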

Regular expression explanation

NODE                     EXPLANATION
--------------------------------------------------------------------------------
(?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
  .*                     any character except \n (0 or more times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
  bar                    'bar'
--------------------------------------------------------------------------------
)                        end of look-ahead
--------------------------------------------------------------------------------
(?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
  .*                     any character except \n (0 or more times
                         (matching the most amount possible))
--------------------------------------------------------------------------------
  foo                    'foo'
--------------------------------------------------------------------------------
)                        end of look-ahead
--------------------------------------------------------------------------------
^                        the beginning of the string
--------------------------------------------------------------------------------
(                        group and capture to \1:
--------------------------------------------------------------------------------
  \w+                    word characters (a-z, A-Z, 0-9, _) (1 or
                         more times (matching the most amount
                         possible))
--------------------------------------------------------------------------------
)                        end of \1
--------------------------------------------------------------------------------
$                        before an optional \n, and the end of the
                         string

Other regex

If you only want to exclude bar when it is directly after foo, you can use

/(?!.*foobar)(?=.*foo)^(\w+)$/
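
To see how this differs from the first pattern, here is a small Python comparison. The string xfoobarx is a hypothetical input added for the demonstration; the other one comes from the tests above.

```python
import re

anywhere = re.compile(r"(?!.*bar)(?=.*foo)^(\w+)$")     # rejects bar anywhere
adjacent = re.compile(r"(?!.*foobar)(?=.*foo)^(\w+)$")  # rejects only foobar

# "xfoobarx" is a made-up sample; it contains bar directly after foo.
samples = ["blahfooblahbarfail", "xfoobarx"]

comparison = {
    s: (bool(anywhere.search(s)), bool(adjacent.search(s))) for s in samples
}
for s, (strict, relaxed) in comparison.items():
    print(s, strict, relaxed)
```

The first string is rejected by the stricter pattern but accepted by this one, because its bar does not come directly after foo. The second string is rejected by both.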

Edit

You made an update to your question to make it specific.

/(?=.*foo(?!bar))^(\w+)$/

New tests

fooshouldbarpass               # pass
butnotfoobarfail # fail
fooshouldpassevenwithfoobar # pass
nofuuhere # fail

New explanation

(?=.*foo(?!bar)) ensures that a foo is found that is not directly followed by bar
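
A minimal Python sketch that runs the new tests against the updated pattern (again without the / delimiters):

```python
import re

pattern = re.compile(r"(?=.*foo(?!bar))^(\w+)$")

samples = [
    "fooshouldbarpass",
    "butnotfoobarfail",
    "fooshouldpassevenwithfoobar",
    "nofuuhere",
]

# A string passes if at least one foo is not directly followed by bar.
new_results = {s: bool(pattern.search(s)) for s in samples}
for s, ok in new_results.items():
    print(f"{s:30} {'pass' if ok else 'fail'}")
```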

Regex to match a string not followed by some string

From all the strings above, you want to match:


  1. While #EngineSpeed
  2. set #WaitStart
  3. set #WaitStart<13
  4. set #WaitStart<=13
  5. set #WaitStart <= 13

You can use

(?<=\s)#\w+(?!.*[@$#])


Details:

  • (?<=\s) - there must be a whitespace before...
  • # - a literal hash
  • \w+ - 1 or more word chars
  • (?!.*[@$#]) - fail the match if there is a @, $, or # anywhere after the \w+ on the line.
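
The pattern can be tried in Python roughly as follows. The five accepted strings are the ones listed above; the rejected string is a hypothetical example, since the original non-matching lines are not shown here.

```python
import re

pattern = re.compile(r"(?<=\s)#\w+(?!.*[@$#])")

wanted = [
    "While #EngineSpeed",
    "set #WaitStart",
    "set #WaitStart<13",
    "set #WaitStart<=13",
    "set #WaitStart <= 13",
]
found = [pattern.search(line).group() for line in wanted]
print(found)

# Hypothetical line with an "@" later on: the lookahead rejects the tag.
rejected = pattern.search("send #Msg to @user")
print(rejected)  # None
```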

How to match the character '<' not followed by ('a' or 'em' or 'strong')?

Try this:

<(?!a|em|strong)
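
A quick Python check of this pattern on a few made-up tag strings (none of them come from the original question). Note that the lookahead only tests how the following text starts, so any tag whose name begins with a, such as abbr, is excluded as well.

```python
import re

pattern = re.compile(r"<(?!a|em|strong)")

# Hypothetical sample tags.
tags = ["<p>", '<a href="#">', "<em>", "<strong>", "<div>", "</a>", "<abbr>"]

# Keep the tags whose opening "<" is matched by the pattern.
matched = [t for t in tags if pattern.match(t)]
print(matched)  # ['<p>', '<div>', '</a>']
```

If only whole tag names should be excluded, one possible refinement (an assumption, not part of the answer) is to add a word boundary: <(?!(?:a|em|strong)\b).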

Regex to match a slash that is not followed by an @ character

You can use a negative lookahead:

'~/(?!@)~'


The [^@] is a negated character class and a consuming pattern: it must match one character, which then becomes part of the match. The lookahead only checks the text to the right of the current position in the string and fails the match if its pattern is found there.
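
The answer above uses PHP-style ~ delimiters. The following Python sketch drops them and contrasts the consuming class with the lookahead; the input strings are hypothetical.

```python
import re

text = "a/b/@c"  # made-up sample

# The lookahead version replaces only the slash itself.
with_lookahead = re.sub(r"/(?!@)", "-", text)
print(with_lookahead)  # a-b/@c

# The negated class also consumes the character after the slash.
with_class = re.sub(r"/[^@]", "-", text)
print(with_class)  # a-/@c

# At the end of the string there is no character left for [^@] to consume.
print(re.findall(r"/[^@]", "a/"))   # []
print(re.findall(r"/(?!@)", "a/"))  # ['/']
```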

How to remove string except followed by certain characters with regular expression?

Here's a way using re.sub with a negative lookahead:

re.sub(r'ab(?![xy])', '', s)

import re

s = '123ab456'
re.sub(r'ab(?![xy])', '', s)
# '123456'

s = '123abx456'
re.sub(r'ab(?![xy])', '', s)
# '123abx456'

Details

  • ab(?![xy])
    • ab matches the characters ab literally (case sensitive)
    • Negative lookahead (?![xy]) asserts that what follows does not match:
      • [xy] matches a single character that is either x or y (case sensitive)

How to find a part of string with regex?

The regex recommended by Aaron works as I wished:

xy(?!y)

It matches the xy in 2. zxyz, 3. zxy, and 4. xy, but not in 1. zxyy.
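
The same four strings can be checked in Python with a short sketch:

```python
import re

pattern = re.compile(r"xy(?!y)")

samples = ["zxyy", "zxyz", "zxy", "xy"]

# True where an xy that is not followed by another y was found.
hits = {s: bool(pattern.search(s)) for s in samples}
print(hits)  # {'zxyy': False, 'zxyz': True, 'zxy': True, 'xy': True}
```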

Matching an optional substring in a regex

(\d+)\s+(\(.*?\))?\s?Z

Note the escaped parentheses and the ? (zero or one) quantifiers. Any group you don't want to capture can be written as a (?:...) non-capturing group.

I agree about the spaces. \s is a better option there. I also changed the quantifier to ensure there are digits at the beginning. As for newlines, that depends on context: if the file is parsed line by line, it won't be a problem. Another option is to anchor the start and end of the line (add a ^ at the front and a $ at the end).
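
Since the original input is not shown, here is a sketch with two hypothetical lines, one with the parenthesized part and one without. It uses the anchored form suggested above.

```python
import re

# Anchored variant of the pattern, as suggested in the last paragraph.
pattern = re.compile(r"^(\d+)\s+(\(.*?\))?\s?Z$")

# Made-up sample lines.
with_part = pattern.match("123 (abc) Z")
without_part = pattern.match("123 Z")

print(with_part.groups())     # ('123', '(abc)')
print(without_part.groups())  # ('123', None)
```

When the optional group does not take part in the match, Python reports it as None, so calling code should check for that before using group 2.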


