Optional Whitespace Regex

Add \s? wherever a single optional space is allowed.

\s stands for a whitespace character.

? says the preceding character may occur once or not at all.

If more than one space is allowed and the whitespace is optional, use \s*.

* says the preceding character can occur zero or more times.
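
To make the difference between the two quantifiers concrete, here is a minimal Python sketch. The sample strings are made up for illustration, and Python's re module is only one of many engines that treat \s, ? and * this way.

```python
import re

# Hypothetical attribute snippets with zero, one, and two spaces before "=".
samples = ['title="x"', 'title ="x"', 'title  ="x"']

optional_one = re.compile(r'title\s?=')  # at most one whitespace character
optional_any = re.compile(r'title\s*=')  # any number of whitespace characters

one_results = [bool(optional_one.search(s)) for s in samples]
any_results = [bool(optional_any.search(s)) for s in samples]

print(one_results)  # [True, True, False]
print(any_results)  # [True, True, True]
```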

'#<a href\s?="(.*?)" title\s?="(.*?)"><img alt\s?="(.*?)" src\s?="(.*?)"\s*width\s?="150"\s*height\s?="(.*?)"></a>#'

allows an optional space between attribute name and =.

If you also want an optional space after the =, add a \s? after it as well.

Likewise, wherever a character is optional, follow it with ? if it may occur at most once, or with * if it may occur any number of times.

Your actual problem was [\s*]. Characters enclosed in [ and ] form a character class, so [\s*] matches a single whitespace character or a literal *. A character class matches exactly one of its members (so remove the * from it); if you append a quantifier (?, +, *, etc.) after the ], the members of the class can occur as many times as that quantifier allows.
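
The character-class pitfall is easy to verify. The following Python sketch compares [\s*] with \s* on a few made-up fragments; only the two sub-patterns come from the answer, the test strings are assumptions.

```python
import re

# [\s*] is a character class: exactly ONE character, whitespace or a literal "*".
char_class = re.compile(r'"[\s*]width')
# \s* is a quantified escape: zero or more whitespace characters.
quantified = re.compile(r'"\s*width')

# Hypothetical fragments: one space, a literal star, no space, two spaces.
fragments = ['" width', '"*width', '"width', '"  width']

class_results = [bool(char_class.search(f)) for f in fragments]
quant_results = [bool(quantified.search(f)) for f in fragments]

print(class_results)  # [True, True, False, False]
print(quant_results)  # [True, False, True, True]
```

Note that the character class accepts a literal * and rejects both the zero-space and two-space cases, which is the opposite of what was intended.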

How to include optional space in Grep statement

You are using a POSIX BRE regex, in which foo\s?(\$ matches foo, a whitespace character, a literal ?, a literal ( and a literal $.

You can use

grep -E 'foo\s?\(\$' log.txt

Here, -E makes the pattern POSIX ERE, and thus it now matches foo, then an optional whitespace, and a ($ substring.

See an online demo:

s='foo($abc) - sometext
foo ($xyz) - moretext
baz($qux) - moartext'
grep -E 'foo\s?\(\$' <<< "$s"

Output:

foo($abc) - sometext
foo ($xyz) - moretext

You may still use a more portable syntax like

grep 'foo[[:space:]]\{0,1\}(\$' log.txt

It is a POSIX BRE regex matching foo, zero or one whitespace character, and then the ($ substring.
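
If grep is not at hand, the same filter can be sketched in Python using the sample lines from the demo above. This is only an approximation of the grep behaviour: Python's re module uses its own syntax (closer to ERE for this pattern) and does not support POSIX bracket classes such as [[:space:]], so \s is used instead.

```python
import re

# Sample lines taken from the grep demo above.
lines = [
    'foo($abc) - sometext',
    'foo ($xyz) - moretext',
    'baz($qux) - moartext',
]

# foo, an optional whitespace character, then a literal "($".
pattern = re.compile(r'foo\s?\(\$')

matched = [line for line in lines if pattern.search(line)]
for line in matched:
    print(line)
```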

Regex optional space solution

You can make the space itself optional:

^([\w\W]{3,50})(\s?\(v[\d]{1,4}\)){0,1}?$
                ^

Which allows 0 or 1 space. To allow an arbitrary number of spaces (including none), you can use the * quantifier

^([\w\W]{3,50})(\s*\(v[\d]{1,4}\)){0,1}?$
                ^

Also, [\w\W] means "a word character or a non-word character", which matches any character. So [\w\W] can be replaced with . (as long as the input contains no line breaks, since . does not match a newline by default).

Lastly, the {0,1} at the end of the expression can simply be omitted, since the optionality of the version number is already being expressed by ?.

Therefore, the expression can be simplified to:

^(.{3,50})(\s*\(v[\d]{1,4}\))?$
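
One thing worth checking when trying the simplified pattern: because .{3,50} is greedy, it can swallow the version suffix itself, in which case the string still matches but the second group stays empty. The Python sketch below shows this with a made-up title; the lazy variant (.{3,50}?) is my own assumption for illustration and is not part of the original answer.

```python
import re

# Simplified pattern from the answer above.
greedy = re.compile(r'^(.{3,50})(\s*\(v[\d]{1,4}\))?$')
# Assumption (not in the original answer): a lazy name group, so that the
# optional version group gets a chance to capture the suffix.
lazy = re.compile(r'^(.{3,50}?)(\s*\(v[\d]{1,4}\))?$')

title = 'MyApp (v12)'  # hypothetical input

greedy_groups = greedy.match(title).groups()
lazy_groups = lazy.match(title).groups()
no_space_groups = lazy.match('MyApp(v12)').groups()
no_version_groups = lazy.match('MyApp').groups()

print(greedy_groups)      # ('MyApp (v12)', None)
print(lazy_groups)        # ('MyApp', ' (v12)')
print(no_space_groups)    # ('MyApp', '(v12)')
print(no_version_groups)  # ('MyApp', None)
```

If you only need to validate the string, the greedy form is fine; the lazy form matters only when the version has to be read back from the second group.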

Regex negative lookaround with optional whitespace

You may use

import re
s = '200 word1 some 50 foo and 5foo 30word2'
pattern = r"\b[0-9]+(?!\s*foo|[0-9])"
print(re.findall(pattern, s))
# => ['200', '30']

Details

  • \b - a word boundary
  • [0-9]+ - 1+ ASCII digits only
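  • (?!\s*foo|[0-9]) - a negative lookahead that fails the match if the digits are immediately followed by

    • \s*foo - 0+ whitespaces and the string foo
    • | - or
    • [0-9] - an ASCII digit (this stops the engine from backtracking to a shorter number, e.g. matching only the 5 in 50 foo).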

How to ignore whitespace in a regular expression subject string?

You can stick optional whitespace characters \s* in between every other character in your regex. Although granted, it will get a bit lengthy.

/cats/ -> /c\s*a\s*t\s*s/
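
Writing the \s* separators by hand gets tedious, so the pattern can be generated instead. A minimal Python sketch, assuming a literal search word; the helper name whitespace_tolerant is hypothetical.

```python
import re

def whitespace_tolerant(word: str) -> re.Pattern:
    """Build a regex matching `word` with optional whitespace between characters."""
    # re.escape protects characters such as "." or "(" in the search word.
    return re.compile(r'\s*'.join(re.escape(ch) for ch in word))

cats = whitespace_tolerant('cats')
print(cats.pattern)  # c\s*a\s*t\s*s

print(bool(cats.search('c a t s')))  # True
print(bool(cats.search('ca\tts')))   # True
print(bool(cats.search('c-ats')))    # False
```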

Regex optional white space in phone number

How about \(?\b[0-9]{3}\)?[-. ]?[0-9]{3}[-. ]?[0-9]{4}\b which matches 3334445555, 333.444.5555, 333-444-5555, 333 444 5555, (333) 444 5555 and all combinations thereof.

Updated

You're running into the limitations of regular expressions here, so the ugly solution really is:

(\d{3}-\d{3}-\d{4}|\(\d{3}\)\s?\d{3}-\d{4}|\d{10})
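
To see how the two patterns above differ, here is a small Python comparison. The phone numbers are invented, and fullmatch is used so that each candidate has to match in its entirety; whether the looser or the stricter behaviour is right depends on your data.

```python
import re

# First suggestion: every separator and parenthesis is independently optional.
loose = re.compile(r'\(?\b[0-9]{3}\)?[-. ]?[0-9]{3}[-. ]?[0-9]{4}\b')
# Updated suggestion: only three exact layouts are accepted.
strict = re.compile(r'(\d{3}-\d{3}-\d{4}|\(\d{3}\)\s?\d{3}-\d{4}|\d{10})')

candidates = [
    '333-444-5555',
    '(333) 444-5555',
    '3334445555',
    '(333-444-5555',   # unbalanced parenthesis
    '333.444-5555',    # mixed separators
]

loose_ok = [bool(loose.fullmatch(c)) for c in candidates]
strict_ok = [bool(strict.fullmatch(c)) for c in candidates]

print(loose_ok)   # [True, True, True, True, True]
print(strict_ok)  # [True, True, True, False, False]
```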

Optional whitespace character in this regular expression pattern

You can use the regex below:

^schedule\s*=\s*([0-9]+)

The value is also captured, so group 1 contains only the value (60 in your case).
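
A short Python sketch of reading the captured value. The input lines are assumptions modelled on the "schedule = 60" case mentioned above, and the helper name schedule_value is hypothetical.

```python
import re

pattern = re.compile(r'^schedule\s*=\s*([0-9]+)')

def schedule_value(line):
    """Return the captured number as a string, or None if the line does not match."""
    m = pattern.search(line)
    return m.group(1) if m else None

# Hypothetical config lines with different spacing around "=".
lines = ['schedule=60', 'schedule = 60', 'schedule   =60 # minutes', 'reschedule = 5']
values = [schedule_value(line) for line in lines]

print(values)  # ['60', '60', '60', None]
```

The last line is rejected because ^ anchors the match to the start of the string.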

Regex to find string with optional spaces

Try this:

edit\s?=\s?yes(once)?

Problems with your regex:

  • Whitespace is \s, not /s - the escape character is backslash, not slash.
  • You don't need [] around a single character (or escaped entity)
  • [yes|yesonce] means any one of the characters y e s | y e s o n c e, not either yes or yesonce.
  • You meant (yes|yesonce), although that would always match yes first and never capture the once that follows. You could use (yesonce|yes) instead to avoid this, but...
  • yes(once)? is simpler :)

If you intended to allow any number of spaces, rather than one or none, you need to replace the appropriate ? symbols ("zero or one") with * ("any number including zero"):

edit\s*=\s*yes(once)?
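
The points in the list above can be checked quickly in Python. This is only a sketch: the input strings are made up, and the "broken" patterns are reconstructions of the mistakes described, not the asker's exact regex.

```python
import re

fixed = re.compile(r'edit\s*=\s*yes(once)?')
# Reconstruction of the character-class mistake: any ONE of y e s | o n c.
char_class = re.compile(r'edit\s*=\s*[yes|yesonce]')
# Alternation with the shorter option first: "yes" always wins.
short_first = re.compile(r'edit\s*=\s*(yes|yesonce)')

full = fixed.search('edit = yesonce')
plain = fixed.search('edit=yes')

print(full.group(0), full.group(1))    # edit = yesonce once
print(plain.group(0), plain.group(1))  # edit=yes None

# The character class happily accepts "no", because "n" is one of its members.
wrong = char_class.search('edit = no')
print(wrong.group(0))  # edit = n

# The shorter alternative matches first, so "once" is left behind.
partial = short_first.search('edit=yesonce')
print(partial.group(1))  # yes
```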

