Alternation Operator Inside Square Brackets Does Not Work

Replace [wd|word|qw] with (wd|word|qw) or (?:wd|word|qw).

[] denotes a character set, in which | is just a literal character; () denotes a logical grouping, in which | separates alternatives.
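The difference is easy to see in a quick check. The following Python sketch uses a made-up sample string (an assumption, not taken from the question) and compares the bracketed pattern with the grouped one:

```python
import re

# Hypothetical sample text, chosen only for illustration.
text = "wd word qw |"

# Character class: matches ONE character from the set {w, d, |, o, r, q}.
char_class = re.findall(r"[wd|word|qw]", text)

# Non-capturing group: matches one of the three whole alternatives.
group = re.findall(r"(?:wd|word|qw)", text)

print(char_class)  # ['w', 'd', 'w', 'o', 'r', 'd', 'q', 'w', '|']
print(group)       # ['wd', 'word', 'qw']
```

Note that the character class also picks up the literal | at the end of the string, which is rarely what was intended.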

Confused why my regex expression isn't working?

Your first character isn't a number, so execution goes straight to the else branch. If you want a dynamic regex, you need to build it with the RegExp constructor.

Also, you don't need a character class here.

/[,|\n|firstChar]/

it should be

/,|\n|firstChar/

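let splitter = (str) => {
  if (str.includes('\n')) {
    let firstChar = str.slice(0, 1);
    // Test for a digit directly; parseInt("0") is 0, which is falsy
    if (/\d/.test(firstChar)) {
      return str.split(/,|\n/);
    } else {
      // building a dynamic regex here; the backslash escapes the delimiter
      let regex = new RegExp(`,|\\n|\\${firstChar}`, 'g');
      // filter(Boolean) drops the empty strings between adjacent delimiters
      return str.split(regex).filter(Boolean);
    }
  }
}
console.log(splitter(";\n2;5")); // [ '2', '5' ]
console.log(splitter("*\n2*5")); // [ '2', '5' ]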
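For comparison, the same idea can be sketched in Python, where re.escape takes care of delimiters that are regex metacharacters. This is a simplified version, not a port of the code above: it assumes the first character is either a digit or the custom delimiter, and it skips the newline check.

```python
import re

def splitter(s: str) -> list:
    """Split on commas, newlines, and a custom first-character delimiter."""
    first_char = s[:1]
    if first_char.isdigit():
        # No custom delimiter: a fixed pattern is enough.
        return re.split(r",|\n", s)
    # re.escape neutralises metacharacters such as "*" or "|".
    pattern = re.compile(r",|\n|" + re.escape(first_char))
    # Drop the empty strings left between adjacent delimiters.
    return [part for part in pattern.split(s) if part]

print(splitter(";\n2;5"))  # ['2', '5']
print(splitter("*\n2*5"))  # ['2', '5']
```

Escaping with a library helper rather than a hand-written backslash avoids surprises when the delimiter is a letter, since a backslash followed by a letter can form an escape sequence of its own.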

Regex capture groups in []

You can omit the outer capture group and use a single alternation inside a non-capturing group for either to or -:

\d+ (?:to|-) \d+ km

import re

s = "3 to 6 km. 3 - 6 km"
print(re.findall(r'\d+ (?:to|-) \d+ km', s))

Output

['3 to 6 km', '3 - 6 km']

Regex with multiple possible patterns

I believe you need

ptrn <- "Z(ABC|ACD|EFG)Y"

(Square brackets [] define character sets. The () keeps the alternation from being read as "ZABC or ACD or EFGY", so that Z and Y apply to every alternative.)

However, I'm a little confused about your expected output. I get

  start end
1 1 5
2 7 11

Using look-ahead and look-behind expressions for the Z and Y might be what you have in mind:

ptrn <- "(?<=Z)(ABC|ACD|EFG)(?=Y)"

This requires the Z and Y to be present but doesn't include them in the match, so the first hit is reported as (2,4), which may be what you want.

As pointed out in the comments, if you want to use lookahead/lookbehind (the (?...) constructs) with base R functions (gsub(), grep(), grepl(), etc.), you have to specify perl=TRUE.
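The effect of the lookarounds on the reported positions can be checked with a short Python sketch. The input string here is a guess chosen to be consistent with the positions quoted above, since the question's actual data is not shown; note also that Python spans are 0-based with an exclusive end, whereas the R output is 1-based and inclusive.

```python
import re

# Hypothetical input; the original question's string is not shown here.
text = "ZABCY ZEFGY"

# Z and Y are part of the match.
plain = [m.span() for m in re.finditer(r"Z(?:ABC|ACD|EFG)Y", text)]

# Z and Y must be present but are excluded from the match.
lookaround = [m.span() for m in re.finditer(r"(?<=Z)(?:ABC|ACD|EFG)(?=Y)", text)]

print(plain)       # [(0, 5), (6, 11)]
print(lookaround)  # [(1, 4), (7, 10)]
```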

regex: In Python, how do I match one of many words if the words can contain each other

You are using a character class where you should be using an alternation:

year_regex = r'\b(?:years|year|yrs|yr|y)\b'
m = re.findall(r'\d+\s+' + year_regex, '10 year 2 months')
print(m)

This prints:

['10 year']

Your character class was actually searching for any one of the individual characters inside it, but you want to search for words. Equally important, Python's regex engine tries the alternatives from left to right. We place longer terms, e.g. years, before shorter ones such as year, so that the engine tries the longer term first and only falls back to the shorter one if the longer one cannot match.
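Both points can be illustrated with a small experiment. The sample text is made up, and the trailing \b is left out on purpose so that the effect of ordering is visible on its own:

```python
import re

# Hypothetical sample text for illustration.
text = "10 years 2 yrs"

# Character class: one character from the set, so only "y" follows the digits.
as_class = re.findall(r"\d+\s+[years|year|yrs|yr|y]", text)

# Shortest alternative first: "y" matches immediately and the rest is ignored.
short_first = re.findall(r"\d+\s+(?:y|yr|yrs|year|years)", text)

# Longest alternatives first: the whole word is matched.
long_first = re.findall(r"\d+\s+(?:years|year|yrs|yr|y)", text)

print(as_class)     # ['10 y', '2 y']
print(short_first)  # ['10 y', '2 y']
print(long_first)   # ['10 years', '2 yrs']
```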

Replacing special characters in Java String

Don't use square brackets, as they represent a set of single characters to match (a character class).

a=a.replaceAll("’|‵", "-");
