What Is a Regex to Match a String Not At the End of a Line

What is a regex to match a string NOT at the end of a line?

/abc(?!$)/

(?!$) is a negative lookahead. The pattern matches any abc that is not immediately followed by the end of the line ($).

Tested against

  • abcddee (match)
  • dddeeeabc (no match)
  • adfassdfabcs (match)
  • fabcddee (match)

Applying it to your case:

ruby-1.9.2-p290 :007 > "aslkdjfabcalskdfjaabcaabc".gsub(/abc(?!$)/, 'xyz')
=> "aslkdjfxyzalskdfjaxyzaabc"
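For comparison, here is a minimal sketch of the same check in Python's re module (not the language of the original answer). One assumption to keep in mind: without re.MULTILINE, Python's $ means the end of the string (or just before a single trailing newline) rather than the end of every line.

```python
import re

# Same pattern as the Ruby example: "abc" not immediately followed by the end.
pattern = re.compile(r"abc(?!$)")

# The four strings from the "Tested against" list.
samples = ["abcddee", "dddeeeabc", "adfassdfabcs", "fabcddee"]
results = {s: bool(pattern.search(s)) for s in samples}

# Replace every "abc" except the one sitting at the very end.
replaced = pattern.sub("xyz", "aslkdjfabcalskdfjaabcaabc")

print(results)
print(replaced)
```

Only the final abc is left alone, because it is the one occurrence that touches the end of the input.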

Regular expression to match a line that doesn't contain a word

The notion that regex doesn't support inverse matching is not entirely true. You can mimic this behavior by using negative look-arounds:

^((?!hede).)*$

Non-capturing variant:

^(?:(?!hede).)*$

The regex above will match any string, or line without a line break, not containing the (sub)string 'hede'. As mentioned, this is not something regex is "good" at (or should do), but still, it is possible.
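As a rough illustration, the non-capturing form can be used to filter a list of lines in Python. The sample lines are made up for this sketch.

```python
import re

# Matches a whole line only if "hede" never appears in it.
no_hede = re.compile(r"^(?:(?!hede).)*$")

# Hypothetical input lines, chosen only for illustration.
lines = ["hoho", "hihi", "haha", "hede", "ABhedeCD"]

kept = [line for line in lines if no_hede.match(line)]
print(kept)
```

In practice a plain substring test ("hede" not in line) does the same job more cheaply; the regex form is mainly useful when the condition has to live inside a larger pattern.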

And if you need to match line break chars as well, use the DOT-ALL modifier (the trailing s in the following pattern):

/^((?!hede).)*$/s

or use it inline:

/(?s)^((?!hede).)*$/

(where the /.../ are the regex delimiters, i.e., not part of the pattern)

If the DOT-ALL modifier is not available, you can mimic the same behavior with the character class [\s\S]:

/^((?!hede)[\s\S])*$/
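The three variants can be compared in Python, where the DOT-ALL modifier is spelled re.DOTALL (or (?s) inline). This is a sketch with a made-up two-line input, not part of the original answer.

```python
import re

text = "foo\nbar"  # hypothetical input containing a line break

# Without DOT-ALL the dot stops at "\n", so the whole string cannot be consumed.
plain = re.match(r"^((?!hede).)*$", text)

# Three equivalent ways to let the pattern cross line breaks.
dotall = re.match(r"^((?!hede).)*$", text, re.DOTALL)
inline = re.match(r"(?s)^((?!hede).)*$", text)
char_class = re.match(r"^((?!hede)[\s\S])*$", text)

# The exclusion still works across lines.
rejected = re.match(r"^((?!hede).)*$", "foo\nhede", re.DOTALL)

print(plain, bool(dotall), bool(inline), bool(char_class), rejected)
```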

Explanation

A string is just a list of n characters. Before, and after each character, there's an empty string. So a list of n characters will have n+1 empty strings. Consider the string "ABhedeCD":

    ┌──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┬───┬──┐
S = │e1│ A │e2│ B │e3│ h │e4│ e │e5│ d │e6│ e │e7│ C │e8│ D │e9│
    └──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┴───┴──┘

index    0      1      2      3      4      5      6      7

where the e's are the empty strings. The regex (?!hede). looks ahead to see if there's no substring "hede" to be seen, and if that is the case (so something else is seen), then the . (dot) will match any character except a line break. Look-arounds are also called zero-width-assertions because they don't consume any characters. They only assert/validate something.

So, in my example, every empty string is first validated to see if there's no "hede" up ahead, before a character is consumed by the . (dot). The regex (?!hede). will do that only once, so it is wrapped in a group, and repeated zero or more times: ((?!hede).)*. Finally, the start- and end-of-input are anchored to make sure the entire input is consumed: ^((?!hede).)*$

As you can see, the input "ABhedeCD" will fail because on e3, the regex (?!hede) fails (there is "hede" up ahead!).
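The walk-through can be reproduced by testing the lookahead alone at every position of the string. In this Python sketch, position i stands for the empty string e(i+1) in the diagram; the labels are only a naming convenience.

```python
import re

s = "ABhedeCD"
lookahead = re.compile(r"(?!hede)")

# Try the zero-width assertion at each of the n+1 positions (e1..e9).
failing = [
    f"e{i + 1}"
    for i in range(len(s) + 1)
    if lookahead.match(s, i) is None
]

# The full anchored pattern fails because one position is rejected.
full = re.match(r"^((?!hede).)*$", s)

print(failing, full)
```

A single rejected position is enough: the repeated group cannot get past it, and the $ anchor cannot match any earlier.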

Regex to only match a string that does not end with )

If you don't want anything other than whitespace to follow [\w-]+, add a positive lookahead at the end of the pattern. (?=\s|$) asserts that the match must be followed by a whitespace character or by the end of the line.

@":\s\(([\w-]+)(?=\s|$)"

Keep the \s alternative only if you need it; otherwise @":\s\(([\w-]+)$" would be enough.
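The lookahead version can be tried out in Python as a sketch (the answer itself uses a C# verbatim string). The helper first_group and the sample inputs are hypothetical, and minor differences between .NET and Python in what \w and \s cover are ignored here.

```python
import re

# Same pattern as the C# string, written as a Python raw string.
pattern = re.compile(r":\s\(([\w-]+)(?=\s|$)")

def first_group(text):
    """Return capture group 1 of the first match, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(1) if m else None

followed_by_space = first_group("tag: (foo-bar baz")  # whitespace follows
at_end = first_group("tag: (foo-bar")                 # end of input follows
closed = first_group("tag: (foo-bar)")                # ")" follows, so no match

print(followed_by_space, at_end, closed)
```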


Regex for string not ending with given suffix

You don't give us the language, but if your regex flavour supports lookbehind assertions, this is what you need:

.*(?<!a)$

(?<!a) is a negated lookbehind assertion. It ensures that the character "a" does not appear immediately before the end of the string (or of the row, with the m modifier).


You can also easily extend this to longer suffixes, since the lookbehind checks for a string and is not a character class.

.*(?<!ab)$

This matches anything that does not end with "ab".
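A quick way to check the suffix rule is to run it over a few words in Python, whose re module supports fixed-width lookbehind. The word list is invented for this sketch.

```python
import re

# Accept a string only if it does not end with "ab".
not_ending_ab = re.compile(r".*(?<!ab)$")

# Hypothetical test words, including the empty string.
words = ["cab", "abc", "ab", "a", ""]

accepted = [w for w in words if not_ending_ab.match(w)]
print(accepted)
```

Strings shorter than the suffix pass, because the lookbehind cannot find "ab" when fewer than two characters precede the end.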

Regex: don't match string ending with newline (\n) with end-of-line anchor ($)

You most likely need \Z rather than $:

>>> print(re.match(r'^foobar\Z', 'foobar\n'))
None
  • \Z matches only at the end of the string.
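To make the difference visible, the sketch below runs the same input through $, \Z, and re.fullmatch, which is another way in Python to demand that the whole string is consumed.

```python
import re

s = "foobar\n"

# $ also matches just before a trailing newline, so this succeeds.
with_dollar = re.match(r"^foobar$", s)

# \Z matches only at the very end of the string, so this fails.
with_z = re.match(r"^foobar\Z", s)

# fullmatch requires the pattern to cover the entire string.
with_fullmatch = re.fullmatch(r"foobar", s)

print(bool(with_dollar), with_z, with_fullmatch)
```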

Regex: failing to match end of line

The problem you have is that \s shorthand character class matches both vertical and horizontal whitespace. That is, it matches both spaces and newline sequences.

Thus you need to restrict it to match only horizontal whitespace.

You need to replace \s with [ \t] or with [^\S\r\n\v].
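The effect is easy to see on a small example. The question's actual pattern is not shown here, so this Python sketch uses a made-up key: prefix and input purely for illustration.

```python
import re

text = "key:   \nvalue"  # hypothetical input: an empty value, then another line

# \s* runs across the newline and captures text from the next line.
across_lines = re.search(r"key:\s*(\w*)", text).group(1)

# Horizontal-only whitespace stops at the line break.
spaces_tabs = re.search(r"key:[ \t]*(\w*)", text).group(1)
negated_class = re.search(r"key:[^\S\r\n\v]*(\w*)", text).group(1)

print(repr(across_lines), repr(spaces_tabs), repr(negated_class))
```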

JS Regex - Match until the end of line OR a character

The pattern [^:]:::\n([\s\S]*)(_{3}|$) that you tried matches too much because [\s\S]* is greedy and runs all the way to the end of the string. Once there, the alternation (_{3}|$), which matches either three underscores or the end of the string, succeeds on the $ branch.

The pattern can therefore settle for the end of the string without ever backtracking to ___.
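The over-matching can be observed directly. This sketch uses Python rather than JavaScript (the pattern syntax is the same for this case) and an invented input string.

```python
import re

# Hypothetical input: a ":::" block, a "___" terminator, then more text.
text = "a:::\nline1\nline2\n___\nafter"

m = re.search(r"[^:]:::\n([\s\S]*)(_{3}|$)", text)

captured = m.group(1)    # everything up to the end, "___" included
terminator = m.group(2)  # empty: the $ branch matched, not _{3}

print(repr(captured), repr(terminator))
```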


You could use a capture group and match all following lines that do not start with ___:

[^:](:::(?:\n(?!___).*)*)
  • [^:] Match any char except :
  • ( Capture group 1
    • ::: Match literally
    • (?:\n(?!___).*)* Match all consecutive lines that do not start with ___
  • ) Close group 1


Or, if lookbehind is supported, use a negative lookbehind that asserts there is no : to the left, so that the whole match is the result:

(?<!:):::(?:\n(?!___).*)*
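Both variants can be checked side by side. The sketch is in Python rather than JavaScript, with an invented input; the two patterns are copied unchanged.

```python
import re

# Hypothetical input: a ":::" block, a "___" terminator, then more text.
text = "a:::\nline1\nline2\n___\nafter"

# Variant 1: the result is in capture group 1.
grouped = re.search(r"[^:](:::(?:\n(?!___).*)*)", text).group(1)

# Variant 2: the lookbehind consumes nothing, so the whole match is the result.
whole = re.search(r"(?<!:):::(?:\n(?!___).*)*", text).group(0)

print(repr(grouped), repr(whole))
```

Both stop before the ___ line, because the lookahead (?!___) is tested at the start of each new line before that line is consumed.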


regex pattern that matches the string at the end of a string but is not followed by line break

The reason why /abc$/ matches both "abc\n" and "abc" is that $ matches the location at the end of the string, or (even without /m modifier) the position before the newline that is at the end of the string.

You need the following regex:

/abc\z/

where \z is the unambiguous very end of the string, or

/abc$/D

where the /D modifier will make $ behave the same way as \z. See PHP.NET:

The meaning of dollar can be changed so that it matches only at the very end of the string, by setting the PCRE_DOLLAR_ENDONLY option at compile or matching time.
