Match a Whole Word in a String Using Dynamic Regex

Why not use a word boundary?

match_string = r'\b' + word + r'\b'
match_string = r'\b{}\b'.format(word)
match_string = rf'\b{word}\b' # Python 3.6+ required (f-strings)

If you have a list of words (say, in a words variable) to be matched as whole words, use

match_string = r'\b(?:{})\b'.format('|'.join(words))
match_string = rf'\b(?:{"|".join(words)})\b' # Python 3.6+ required (f-strings)

With this pattern, the word is matched only when it is not adjacent to other word characters (letters, digits or underscores). Also note that \b matches at the string start and end, so there is no need to add separate alternatives for those positions.

Sample code:

import re
strn = "word hereword word, there word"
search = "word"
print(re.findall(r"\b" + search + r"\b", strn))

And we found our 3 matches:

['word', 'word', 'word']
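
The same idea carries over to the words list pattern shown earlier. Here is a minimal sketch, assuming the words are plain alphanumeric strings; the sample words and text are invented for illustration and are not part of the original question:

```python
import re

# Hypothetical sample data, not from the original question.
words = ["cat", "dog"]
text = "cat catalog dog hotdog Dog"

# One alternation wrapped in a single pair of word boundaries.
pattern = r"\b(?:{})\b".format("|".join(words))

matches = re.findall(pattern, text)
print(matches)  # ['cat', 'dog']
```

Here catalog and hotdog are skipped because the keyword is attached to other word characters, and Dog is skipped because matching is case-sensitive unless re.IGNORECASE is passed.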

NOTE ON "WORD" BOUNDARIES

When the "words" may in fact contain arbitrary characters, you should re.escape them before inserting them into the regex pattern:

match_string = r'\b{}\b'.format(re.escape(word)) # a single escaped "word" string passed
match_string = r'\b(?:{})\b'.format("|".join(map(re.escape, words))) # words list is escaped
match_string = rf'\b(?:{"|".join(map(re.escape, words))})\b' # Same as above for Python 3.6+

If the words to be matched as whole words may start or end with special characters, \b won't work. Use unambiguous word boundaries instead:

match_string = r'(?<!\w){}(?!\w)'.format(re.escape(word))
match_string = r'(?<!\w)(?:{})(?!\w)'.format("|".join(map(re.escape, words)))

If the word boundaries are whitespace chars or start/end of string, use whitespace boundaries, (?<!\S)...(?!\S):

match_string = r'(?<!\S){}(?!\S)'.format(re.escape(word))
match_string = r'(?<!\S)(?:{})(?!\S)'.format("|".join(map(re.escape, words)))
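
To see how the three boundary styles differ, here is a small sketch that runs each one against the same text. The word c++ and the sample text are assumptions chosen for illustration:

```python
import re

# Hypothetical sample: the "word" ends with non-word characters.
word = "c++"
text = "c++ (c++) c++11"
escaped = re.escape(word)

patterns = {
    "word boundary": r"\b{}\b".format(escaped),
    "unambiguous": r"(?<!\w){}(?!\w)".format(escaped),
    "whitespace": r"(?<!\S){}(?!\S)".format(escaped),
}

# Start offsets of every match for each boundary style.
starts = {
    name: [m.start() for m in re.finditer(pattern, text)]
    for name, pattern in patterns.items()
}
print(starts)
# {'word boundary': [10], 'unambiguous': [0, 5], 'whitespace': [0]}
```

With \b, the only hit is the one inside c++11, because a \b placed after + needs a word character to follow. The unambiguous version finds the standalone and parenthesized occurrences, and the whitespace version finds only the occurrence delimited by whitespace or the string edges.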

How to match entire words dynamically using Regex in Python

You forgot an r in your second '\b'.

re.search(r'\b' + re.escape(word) + r'\b', ...)
#                                   ^

In a regular (non-raw) Python string literal, the escape sequence \b means backspace and becomes \x08 (U+0008). The regex engine then tries to match that literal backspace character instead of a word boundary, and fails.

Also, I used re.escape(word) to escape special regex characters, so e.g. if a word is "etc. and more" the dot will be matched literally, instead of matching any character.
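
A short sketch of the difference, using an invented sample string, might look like this:

```python
import re

text = "a word here"  # hypothetical sample text

# Without the r prefix, '\b' is a single backspace character.
print('\b' == '\x08', len('\b'), len(r'\b'))  # True 1 2

# Looks for backspace + "word" + backspace, which is not in the text.
broken = re.search('\bword\b', text)
# Raw strings keep \b as a regex word boundary.
fixed = re.search(r'\b' + re.escape('word') + r'\b', text)

print(broken)         # None
print(fixed.group())  # word
```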

Matching whole words with special characters with a dynamically built pattern

I suggest either unambiguous word boundaries (that match a string only if the search pattern is not enclosed with letters, digits or underscores):

String pattern = "(?<!\\w)"+Pattern.quote(subItem)+"(?!\\w)";

where (?<!\w) matches a location not preceded by a word char and (?!\w) fails if there is a word char immediately after the current position, or you can use a variation that takes into account leading/trailing special chars of the potential match:

String pattern = "(?:\\B(?!\\w)|\\b(?=\\w))" + Pattern.quote(subword) + "(?:(?<=\\w)\\b|(?<!\\w)\\B)";

Details:

  • (?:\B(?!\w)|\b(?=\w)) - either a non-word boundary if the next char is not a word char, or a word boundary if the next char is a word char
  • Data\[3\] - the quoted subItem, shown here in its escaped form for a subItem of Data[3]
  • (?:(?<=\w)\b|(?<!\w)\B) - either a word boundary if the preceding char is a word char, or a non-word boundary if the preceding char is not a word char.
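
For readers working in Python, the second pattern can be ported roughly as follows. This is a sketch under the assumption that re.escape is an acceptable stand-in for Pattern.quote; the helper name adaptive_pattern and the sample text are hypothetical:

```python
import re

def adaptive_pattern(term):
    # Python port of the Java pattern above; re.escape plays the role of Pattern.quote.
    left = r"(?:\B(?!\w)|\b(?=\w))"
    right = r"(?:(?<=\w)\b|(?<!\w)\B)"
    return left + re.escape(term) + right

# Hypothetical sample text: one standalone occurrence, two glued to a letter.
text = "Data[3] xData[3] Data[3]x"
found = [m.start() for m in re.finditer(adaptive_pattern("Data[3]"), text)]
print(found)  # [0]
```

Only the standalone occurrence is matched; the ones with x attached on either side are rejected.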

Filter list using dynamic whole word matching regex in dart

You may build the pattern dynamically:

var keys = ['dev', 'soft', 'angular', 'java'];
var regex = new RegExp("\\b(?:${keys.join('|')})\\b", caseSensitive: false);
var contactsAll = ['No match', 'I like java', 'I like javascript'];
var cc = contactsAll.where( (i) => regex.hasMatch(i) ).toList();
print(cc); // => [I like java]

The regex will look like \b(?:dev|soft|angular|java)\b and will match any of the keywords inside the non-capturing group as a whole word due to the \b word boundaries.

If the keys can contain special characters, but you still need a whole word search, you need to escape all special characters and use either unambiguous boundaries

var regex = new RegExp("(?:^|\\W)(?:${keys.map(RegExp.escape).join('|')})(?!\\w)", caseSensitive: false);

This results in a (?:^|\W)(?:dev|soft|angular|java)(?!\w) pattern where (?:^|\W) matches start of string or a non-word char and (?!\w) requires the absence of a word char immediately to the right of the current location. Note that (?:^|\W) consumes the preceding non-word char, so that char becomes part of the match.

The keys.map(RegExp.escape) part escapes each key for literal use within the regex.

Or whitespace boundaries:

var regex = new RegExp("(?:^|\\s)(?:${keys.map(RegExp.escape).join('|')})(?!\\S)", caseSensitive: false);

This results in a (?:^|\s)(?:dev|soft|angular|java)(?!\S) pattern where (?:^|\s) matches start of string or a whitespace char and (?!\S) requires the absence of a non-whitespace char immediately to the right of the current location.
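
One practical consequence of a consuming left boundary such as (?:^|\W) is that the matched text includes the preceding character. The Python sketch below contrasts it with a lookbehind, which checks the left context without consuming it. It reuses the keys and contacts from the Dart example; the variable names are illustrative only:

```python
import re

keys = ["dev", "soft", "angular", "java"]  # keys from the Dart example
alternation = "|".join(map(re.escape, keys))

# Left boundary that consumes the preceding non-word character.
consuming = re.compile(r"(?:^|\W)(?:{})(?!\w)".format(alternation), re.IGNORECASE)
# Left boundary that only looks behind, consuming nothing.
lookbehind = re.compile(r"(?<!\w)(?:{})(?!\w)".format(alternation), re.IGNORECASE)

text = "I like Java"
print(repr(consuming.search(text).group()))   # ' Java'
print(repr(lookbehind.search(text).group()))  # 'Java'

contacts = ["No match", "I like java", "I like javascript"]
filtered = [c for c in contacts if lookbehind.search(c)]
print(filtered)  # ['I like java']
```

For a pure yes/no test such as filtering, both variants select the same items; the difference matters only when the matched text itself is used.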

Dynamically match multiple words in a string to another one using regExp

For the situation you're describing, you don't even need regular expressions. If you split the search string on spaces, you can check that every one of the words to match is contained within the array of search words.

function matchesAllWords(searchWords, inputString) {
  var wordsToMatch = inputString.toLowerCase().split(' ');

  return wordsToMatch.every(
    word => searchWords.indexOf(word) >= 0);
}
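
For comparison, the same regex-free check can be sketched in Python. The function name matches_all_words and the sample inputs are assumptions made for illustration:

```python
def matches_all_words(search_words, input_string):
    # True when every space-separated word of input_string appears in search_words.
    words_to_match = input_string.lower().split(" ")
    return all(word in search_words for word in words_to_match)

# Hypothetical search box content, split the same way as in the JavaScript version.
search_words = "juicy red apples".lower().split(" ")

print(matches_all_words(search_words, "red apples"))   # True
print(matches_all_words(search_words, "juicy fruit"))  # False
```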

In the snippet below, typing in the input causes a recalculation of the searchWords. The matching li elements are then given the .match class to highlight them.

function updateClasses(e) {
  var searchWords = e.target.value.toLowerCase().split(' ');

  listItems.forEach(listItem => listItem.classList.remove('match'));

  listItems
    .filter(listItem => matchesAllWords(searchWords, listItem.innerText))
    .forEach(matchingListItem => matchingListItem.classList.add('match'));
}

function matchesAllWords(searchWords, inputString) {
  var wordsToMatch = inputString.toLowerCase().split(' ');

  return wordsToMatch.every(
    word => searchWords.indexOf(word) >= 0);
}

function searchProperties(e) {
  var searchWords = e.target.value.toLowerCase().split(' ');

  for (var property in propertiesToSearch) {
    if (matchesAllWords(searchWords, property)) {
      console.log(property, propertiesToSearch[property]);
    }
  }
}

var propertiesToSearch = {
  "red apples": 1,
  "juicy fruit": 2
};

listItems = [].slice.call(
  document.getElementById('matches').querySelectorAll('li')
);

document.getElementById('search').addEventListener('keyup', updateClasses);
document.getElementById('search').addEventListener('keyup', searchProperties);

.match {
  color: green;
}

<label for="search">Search:</label>
<input type="text" name="search" id="search" />
<ul id="matches">
  <li>red apples</li>
  <li>juicy fruits</li>
</ul>

Regex match two strings with given number of words in between strings

You can use something like

import re
text = 'I want apples and oranges'
k = 2
pattern = rf"apples(?:\s+\w+){{0,{k}}}\s+oranges"
m = re.search(pattern, text)
if m:
    print(m.group())

# => apples and oranges

Here, I used \w+ to match a word. If the word is a non-whitespace chunk, you need to use

pattern = rf"apples(?:\s+\S+){{0,{k}}}\s+oranges"

If you need to add word boundaries, see the posts "Word boundary with words starting or ending with special characters gives unexpected results" and "Match a whole word in a string using dynamic regex". For the current example, fr"\bapples(?:\s+\w+){{0,{k}}}\s+oranges\b" will work.

The pattern will look like apples(?:\s+\w+){0,k}\s+oranges (with k replaced by its value) and match:

  • apples - the literal string apples
  • (?:\s+\w+){0,k} - zero to k repetitions of one or more whitespaces and one or more word chars
  • \s+ - one or more whitespaces
  • oranges - the literal string oranges.
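
To check how the limit k behaves, here is a small sketch that builds the pattern for several values. The helper name words_between is hypothetical, and re.escape is added on the assumption that the two anchor strings should be matched literally:

```python
import re

def words_between(first, last, k):
    # Hypothetical helper: allow at most k words between first and last.
    return rf"\b{re.escape(first)}(?:\s+\w+){{0,{k}}}\s+{re.escape(last)}\b"

text = "I want apples and oranges"
results = {
    k: bool(re.search(words_between("apples", "oranges", k), text))
    for k in range(3)
}
print(results)  # {0: False, 1: True, 2: True}
```

With k = 0 the two strings must be adjacent, so the single word "and" between them prevents a match; any k of 1 or more allows it.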

How can I match a whole word in JavaScript?

To use a dynamic regular expression see my updated code:

new RegExp("\\b" + lookup + "\\b").test(textbox.value)

Your specific example is backwards:

alert((/\b(2)\b/g).test(lookup));

Regex Match ALL Dynamic string words to data attribute

I suggest using

var regExApproved = new RegExp(string.split(",").map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|'), 'i');

Then, to check if a regex matches a string, it makes more sense to use RegExp#test method:

regExApproved.test($(this).attr("data-payment"))

Notes:

  • .split(",") - splits into comma-separated chunks
  • .map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')) (or .map(function(x) { return x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'); })) - escapes the alternatives
  • .join('|') creates the final alternation pattern.
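
The same split, escape and join steps look like this in Python. It is only a sketch: the comma-separated value and the test strings are made up, and re.escape stands in for the manual replace call:

```python
import re

approved = "visa,master.card"  # hypothetical comma-separated input

# Split into chunks, escape each one, and join them into one alternation.
pattern = re.compile("|".join(map(re.escape, approved.split(","))), re.IGNORECASE)

print(bool(pattern.search("Paid with MASTER.CARD")))  # True
print(bool(pattern.search("Paid with masterXcard")))  # False
```

The second test fails to match because the escaped dot only matches a literal dot, not any character.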

Edit

It seems that your string is an array of values. In that case, use

var string = ['visa','mastercard'];
var regExApproved = new RegExp(string.join('|'), 'i');
// If the items must be matched as whole words:
// var regExApproved = new RegExp('\\b(?:' + string.join('|') + ')\\b', 'i');
// If the array items contain special chars:
// var regExApproved = new RegExp(string.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|'), 'i');
console.log(regExApproved.test("There is MasterCard here"));

How to match multiple words in Python

Change your regex to:

keywords_to_match = re.compile(r'\b(?:new zealand|mars|murica)\b')

I'm not sure you need a regex in your case. You can simply do:

keywords = ['new zealand', 'mars', 'murica']
titles = ['walk to new zealand', 'fly to mars', 'drive to murica']
[t for t in titles if any(k in t for k in keywords)]
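
One caveat with the plain in test is that it matches substrings, not whole words. The sketch below compares both approaches on invented titles; which one is right depends on whether partial hits such as marshmallows are acceptable:

```python
import re

keywords = ["new zealand", "mars", "murica"]
# Hypothetical titles, including one where a keyword appears inside a longer word.
titles = ["fly to mars", "roast marshmallows", "walk to new zealand"]

# Substring test: any keyword appearing anywhere in the title counts.
by_substring = [t for t in titles if any(k in t for k in keywords)]

# Whole-word test: the keyword must sit between word boundaries.
whole_word = re.compile(r"\b(?:{})\b".format("|".join(map(re.escape, keywords))))
by_regex = [t for t in titles if whole_word.search(t)]

print(by_substring)  # ['fly to mars', 'roast marshmallows', 'walk to new zealand']
print(by_regex)      # ['fly to mars', 'walk to new zealand']
```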

Python match string with dynamic number in string

re.match(r'([a-z])+/(\d)+/(\w)+', 'abc/11/xyz')

(a-z)+ is a group that matches the literal text a-z. It seems you want to match any character between a and z, so you need square brackets, as in ([a-z])+, to make a character class.
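
One further detail worth checking is what the groups capture. A repeated group such as ([a-z])+ keeps only its last repetition, while placing the quantifier inside the group captures the whole run. A short sketch using the sample string from above:

```python
import re

s = "abc/11/xyz"

# Quantifier outside the group: each group keeps only its last repetition.
repeated_group = re.match(r"([a-z])+/(\d)+/(\w)+", s)
# Quantifier inside the group: each group captures the whole run.
group_around_run = re.match(r"([a-z]+)/(\d+)/(\w+)", s)

print(repeated_group.groups())    # ('c', '1', 'z')
print(group_around_run.groups())  # ('abc', '11', 'xyz')

# Parentheses instead of brackets match the literal text a-z.
literal = re.match(r"(a-z)+", "a-za-z")
print(literal.group())  # a-za-z
```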


