How to Match the Last Occurrence of a Pattern

Regex Last occurrence?

Your negative lookahead solution would e.g. be this:

\\(?:.(?!\\))+$

See it here on Regexr

Regex to find last occurrence of pattern in a string

You can try like so:

(\s+)(?=\.[^.]+$)

(?=\.[^.]+$) Positive look ahead for a dot and characters except dot at the end of line.

Demo:

https://regex101.com/r/k9VwC6/3

How to match the last occurrence of a pattern using regex

You can use a negative lookahead regex:

/{+[^}]*}*(?!.*{)/

(?!.*{) is negative lookahead that asserts we don't have any { ahead of the current position.

RegEx Demo

How to match the last occurrence of a pattern?

'String 1 | string 2 | string 3'[/(?<=\|\s)(\w+\s\d+\z)/]
# "string 3"

Where (escaped):

\|  # a pipe
\s # a whitespace
\w+ # one or more of any word character
\s # a whitespace
\d+ # one or more digits
\z # end of string

(...) # capture everything enclosed
(?<=...) # a positive lookbehind

Notice in this case the regex is already getting the last occurrence of the pattern in the string, by attaching it to the end of the string (\z). In other case you could use [\|\s]? instead of (\|\s) to match the string followed by a whitespace and a number and from there, access the last element in the returned array:

'String 1 | string 2 | string 3'.scan(/[\|\s]?(\w+\s\d+)/).last
# ["string 3"]

How do I match the last occurrence of a complex repeating pattern

  • Ctrl+F
  • Find what: [\s\S]+\K^.*?Severity: 20.*\R.*
  • CHECK Wrap around
  • CHECK Regular expression
  • UNCHECK . matches newline
  • Find Next

Explanation:

[\s\S]+             # 1 or more any character
\K # forget all we have seen until this position
^ # beginning of line
.*? # 0 or more any character but newline, not greedy
Severity: 20 # literally
.* # 0 or more any character but newline
\R # any kind of linebreak
.* # 0 or more any character but newline

Screen capture:

enter image description here

How to find a last occurrence of set of characters in string using regex in java?

There are few ways to solve the problem and the best way will depend on the size of the input and the complexity of the pattern:

  1. Reverse the input string and possibly the pattern, this might work for non-complex patterns. Unfortunately java.util.regex doesn't allow to to match the pattern from right to left.

  2. Instead of using a greedy quantifier simply match the pattern and loop Matcher.find() until last occurrence is found.

  3. Use a different regex engine with better performance e.g. RE2/J: linear time regular expression matching in Java.

If option 2 is not efficient enough for your case I'd suggest to try RE2/J:

Java's standard regular expression package, java.util.regex, and many other widely used regular expression packages such as PCRE, Perl and Python use a backtracking implementation strategy: when a pattern presents two alternatives such as a|b, the engine will try to match subpattern a first, and if that yields no match, it will reset the input stream and try to match b instead.

If such choices are deeply nested, this strategy requires an exponential number of passes over the input data before it can detect whether the input matches. If the input is large, it is easy to construct a pattern whose running time would exceed the lifetime of the universe. This creates a security risk when accepting regular expression patterns from untrusted sources, such as users of a web application.

In contrast, the RE2 algorithm explores all matches simultaneously in a single pass over the input data by using a nondeterministic finite automaton.

Find the last occurrence of a pattern

Match all strings in bracket /\(.*?\)/g and post process the result

You can just match all strings satisfying the pattern, and pick the last element from the resulting array. There is no need to come up with a complicated regex for this problem.

> "(Don't match this) and not this (but this)".match(/\(.*?\)/g).pop()
< "(but this)"

> "(Don't match this) and not this (but this) (more)".match(/\(.*?\)/g).pop()
< "(more)"

> "(Don't match this) and not this (but this) (more) the end".match(/\(.*?\)/g).pop()
< "(more)"

Don't want the () in the result? Just use slice(1, -1) to get rid of them, since the pattern fixes their positions:

> "(Don't match this) and not this (but this)".match(/\(.*?\)/g).pop().slice(1, -1)
< "but this"

> "(Don't match this) and not this (but this) (more) the end".match(/\(.*?\)/g).pop().slice(1, -1)
< "more"

Using .* to search for the last instance of the pattern

This is an alternate solution with simple regex. We make use of the greedy property of .* to search for the furthest instance matching pattern \((.*?)\), where the result is captured into capturing group 1:

/^.*\((.*?)\)/

Note that no global flag is used here. When the regex is non-global (find first match only), match function returns the text captured by capturing group, along with the main match.

> "(Don't match this) and not this (but this)".match(/^.*\((.*?)\)/)[1]
< "but this"

> "(Don't match this) and not this (but this) (more) the end".match(/^.*\((.*?)\)/)[1]
< "more"

The ^ is an optimization to prevent the engine from "bumping along" to search at subsequent indices when the pattern .*\((.*?)\) fails to match from index 0.

How do I match the last occurrence of Pattern before another Pattern with REGEX

The regex you could use is either based on a greedy dot pattern placed at the start and followed with a \K match reset operator, or based on a tempered greedy token. Both are very unsafe when it comes to large strings with partial matches (but not matching).

So, the two regexps are

.*\K<IDOC BEGIN.*?0007536846.*?</IDOC>
<IDOC BEGIN(?:(?!<IDOC BEGIN).)*?0007536846(?:(?!<IDOC BEGIN).)*?</IDOC>

The best idea is to unroll the tempered greedy token in these cases:

<IDOC BEGIN[^<]*(?:<(?!IDOC BEGIN)[^<]*?)*0007537181.*?</IDOC>

See the regex demo

The first .*? is replaced with [^<]*(?:<(?!IDOC BEGIN)[^<]*?)*:

  • [^<]* - a negated character class matching 0 or more chars other than <, as many as possible
  • (?:<(?!IDOC BEGIN)[^<]*?)* - 0 or more repetitions of

    • <(?!IDOC BEGIN) - a < char that is not immediately followed with IDOC BEGIN string
    • [^<]*? - a negated character class matching 0 or more chars other than <, as few as possible

Regex match last occurrence of a string from multiple lines

Here is a solution that is not dependent on PCRE feature using negative lookahead:

(?s)\[(\d{2}\/\d{2}\/\d{4} \d{2}:\d{2}:\d{2})\] Moving message 123456789 from NEW to PENDING(?!.* Moving message 123456789 from NEW to PENDING)

RegEx Demo

Date-time is available in 1st capture group.

Here (?!.* Moving message 123456789 from NEW to PENDING) is negative lookahead that ensures we match very last occurrence of given pattern.

How to match the last occurrence of a pattern on a single line string

If you can use perl, then capturing within a regex makes this a lot easier.

 perl -ne 'm(.*<a href="[^:]+://[^/]*/(.*?)" rel="sample".*</dd>) and print "$1\n";'

The regex is basically the same as would also work with grep. I've used m() instead of // to avoid escaping the / inside the regex.

The initial .* will greedily capture everything at the beginning of the line. If you have multiple links on a line, it will capture all but the last. This works with grep too, but it causes grep -o to output the beginning of the line, since this now matches the regex.

This doesn't matter with the capturing parenthesis, as only the part inside the (.*?) is captured and printed.

It would be used the same way as grep.

cat index.html | perl -ne 'm(.*<a href="[^:]+://[^/]*/(.*?)" rel="sample".*</dd>) and print "$1\n";'

or

perl -ne 'm(.*<a href="[^:]+://[^/]*/(.*?)" rel="sample".*</dd>) and print "$1\n";' index.html


Related Topics



Leave a reply



Submit