How to Match All Occurrences of a Regex

Regex - Match all occurrences?

Use the /g modifier for global matching. In list context, the match then returns every captured group:

my @matches = ($result =~ m/INFO\n(.*?)\n/g);

Lazy quantification is unnecessary in this case as . doesn't match newlines. The following would give better performance:

my @matches = ($result =~ m/INFO\n(.*)\n/g);

/s can be used if you do want periods to match newlines. For more info about these modifiers, see perlre.
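
For comparison, here is a minimal Python sketch of the same idea. The sample text in result is made up for illustration (the original input is not shown); re.findall plays the role of /g in list context, and re.S is the counterpart of /s.

```python
import re

# Hypothetical sample input; the original log text is not shown.
result = "INFO\nfirst message\nDEBUG\nskipped\nINFO\nsecond message\n"

# Lazy and greedy quantifiers agree here because "." stops at a newline.
lazy = re.findall(r"INFO\n(.*?)\n", result)
greedy = re.findall(r"INFO\n(.*)\n", result)
print(lazy)
print(greedy)

# With re.S, "." also matches newlines, so the greedy version
# swallows everything up to the last newline in a single match.
greedy_dotall = re.findall(r"INFO\n(.*)\n", result, re.S)
print(len(greedy_dotall))
```

Once the dot is allowed to match newlines, the lazy quantifier stops being optional.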

Regular expression to return all match occurrences

The issue is with the regular expression itself.
The (.*) groups consume more of the string than you expect: .* is greedy, so it takes as much of the string as it can while still allowing the overall pattern to match. That is why you only see one result.

Instead, match something specific, such as Vacation Allowance:\s*\d+; or similar.

import re

text = '02/05/2020 Vacation Allowance: 21; 02/05/2020 Vacation Allowance: 22; nnn'
# Raw string avoids invalid-escape warnings; \d+ requires at least one digit.
m = re.findall(r'Vacation Allowance:\s*(\d+);', text)
print(m)

result: ['21', '22']
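
To see the greedy behaviour in isolation, here is a small sketch. The greedy_pattern below is an assumption, since the pattern from the original question is not shown; it only illustrates how leading and trailing (.*) groups collapse everything into a single match.

```python
import re

text = '02/05/2020 Vacation Allowance: 21; 02/05/2020 Vacation Allowance: 22; nnn'

# Hypothetical greedy pattern (not the asker's actual regex).
greedy_pattern = r'(.*)Vacation Allowance:\s*(\d+);(.*)'
greedy_matches = re.findall(greedy_pattern, text)

# The first (.*) runs to the end of the string and backtracks only as far
# as the last "Vacation Allowance:", so there is just one match.
print(len(greedy_matches))
print(greedy_matches[0][1])

# The specific pattern has nothing greedy around it, so it finds both values.
specific_matches = re.findall(r'Vacation Allowance:\s*(\d+);', text)
print(specific_matches)
```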

Regular Expression to Match All Occurrences Before Specified Character

You can use:

\.(?=.*=)
  • \. Match a literal .
  • (?=.*=) Lookahead to match zero or more characters followed by a =.
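
A quick way to check this pattern is a substitution in Python. The sample string and the underscore replacement are made up for illustration.

```python
import re

# Hypothetical input: dots on both sides of the "=" sign.
sample = "a.b.c=d.e"

# Only dots that still have an "=" somewhere ahead of them are matched.
replaced = re.sub(r"\.(?=.*=)", "_", sample)
print(replaced)
```

The dot after the = is left alone because the lookahead cannot find another = to its right.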

Is there a Regex expression to match all occurrences of a letter except in a certain string?

You may try:

f(?<!\bdef\b)

Explanation of the above regex:

f - Matching f literally.

(?<!\bdef\b) - A negative lookbehind that rejects the match when the f just consumed is the last letter of the whole word def. If you want def to be matched case-insensitively, use the i flag (for example /i, or an inline (?i) in most engines).

\b - A word boundary. The pair around def ensures that only the standalone word def is excluded, not words that merely contain it.
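
A short Python check of this behaviour, using a made-up sample string. Python's re module accepts this lookbehind because \bdef\b has a fixed width of three characters.

```python
import re

# Hypothetical input: "def" as a whole word, plus words that only contain it.
sample = "def f(x): undef deft"

# Every f becomes X, except the one that ends the standalone word "def".
replaced = re.sub(r"f(?<!\bdef\b)", "X", sample)
print(replaced)
```

The f in undef and deft is still replaced, because one of the two word boundaries fails there.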

Find all occurrences of a regex as an array

You can use

select split(trim(regexp_replace(regexp_replace(col, '"([^"]+)"|.', '\\1|'),'\\|+','|'), '|'), '|');

Details:

  • regexp_replace(col, '"([^"]+)"|.', '\\1|') - matches either a string between the closest double quotes, capturing the part inside the quotes into Group 1, or any single char, and replaces each match with the Group 1 contents followed by a | char
  • regexp_replace(...,'\\|+','|') - shrinks each run of consecutive pipe symbols into a single | char
  • trim(..., '|') - removes | chars on both ends of the string
  • split(..., '|') - splits the string with a | char.
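
The same four steps can be traced in Python to see what each stage produces. This is only an illustrative sketch with a made-up value for col; it is not a replacement for the SQL above.

```python
import re

# Hypothetical column value.
col = 'x "foo" y "bar baz" z'

# Step 1: keep quoted contents, turn every other char into "|".
step1 = re.sub(r'"([^"]+)"|.', r'\1|', col)
# Step 2: collapse runs of "|" into one.
step2 = re.sub(r'\|+', '|', step1)
# Step 3: drop leading and trailing "|".
step3 = step2.strip('|')
# Step 4: split on "|".
parts = step3.split('|')
print(parts)
```

In Python itself, re.findall(r'"([^"]+)"', col) would give the same list directly; the replace-and-split chain is for SQL dialects that have no function returning all matches.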

How to match a line with exactly n occurrences with regex?

So we want:

  • START OF LINE
  • 0 or more non-semicolons
  • semicolon
  • 0 or more non-semicolons
  • semicolon
  • EOL

Translating to regex we get:

^[^;]*;[^;]*;$

For the general case with N repetitions you can use

^([^;]*;){N}$
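
As a sketch of the general case, the pattern can be built for any N in Python. The helper name has_exactly_n_fields is made up for illustration.

```python
import re

def has_exactly_n_fields(line: str, n: int) -> bool:
    """Return True if the line is exactly n semicolon-terminated fields."""
    # Same shape as ^([^;]*;){N}$ with N filled in.
    pattern = r"^([^;]*;){%d}$" % n
    return re.match(pattern, line) is not None

print(has_exactly_n_fields("a;b;", 2))
print(has_exactly_n_fields("a;b;c;", 2))
print(has_exactly_n_fields("a;b", 2))
```

As with the regex above, the line has to end with the last semicolon, so "a;b" is rejected.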

Find multiple occurrences of a character after another character

You can use

(?<=a[\w\W]*?)b

Replace with c. Details:

  • (?<=a[\w\W]*?) - a positive lookbehind that matches a location that is immediately preceded with a and then any zero or more chars (as few as possible)
  • b - a b.
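
Python's built-in re module only accepts fixed-width lookbehinds, so this pattern cannot be used there as written. The sketch below swaps in a different technique, plain string partitioning, to get the same result; the function name replace_b_after_a is made up for illustration.

```python
def replace_b_after_a(text: str) -> str:
    """Replace every 'b' that has an 'a' somewhere before it with 'c'."""
    # Split at the first 'a'; only the part after it is eligible.
    head, sep, tail = text.partition("a")
    return head + sep + tail.replace("b", "c")

print(replace_b_after_a("b b a b b"))
```

If the text contains no a at all, it comes back unchanged, which matches what the lookbehind version does.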

Also, see Multi-line regular expressions in Visual Studio Code for more ways to match any char across lines.

If you need to do this kind of replacement across multiple files, be aware that the Rust regex engine used by VSCode's search and replace in files feature is much less powerful: it supports neither \K, nor \G, nor infinite-width lookbehinds. I suggest using the Notepad++ Replace in Files feature instead.

The (?:\G(?!\A(?<!(?s:.)))|a)[^b]*\Kb pattern matches

  • (?:\G(?!\A(?<!(?s:.)))|a) - either of the two options:
    • \G(?!\A(?<!(?s:.))) - the end of the previous successful match ((?!\A(?<!(?s:.))) is necessary to exclude the start of file position from \G)
    • | - or
    • a - an a
  • [^b]* - any zero or more occurrences of chars other than b
  • \K - discard the text matched so far from the reported match
  • b - a b char.

