Match a whole word in a string using dynamic regex
Why not use a word boundary?
match_string = r'\b' + word + r'\b'
match_string = r'\b{}\b'.format(word)
match_string = rf'\b{word}\b' # Python 3.6+ required (f-strings)
If you have a list of words (say, in a words variable) to be matched as whole words, use
match_string = r'\b(?:{})\b'.format('|'.join(words))
match_string = rf'\b(?:{"|".join(words)})\b' # Python 3.6+ required (f-strings)
This way, the word is matched only when it is not adjacent to other word characters (letters, digits or underscores). Also note that \b matches at the start and end of the string, so there is no need to add three alternatives.
Sample code:
import re
strn = "word hereword word, there word"
search = "word"
print(re.findall(r"\b" + search + r"\b", strn))
And we found our 3 matches:
['word', 'word', 'word']
NOTE ON "WORD" BOUNDARIES
When the "words" are in fact chunks of arbitrary characters, you should re.escape them before passing them to the regex pattern:
match_string = r'\b{}\b'.format(re.escape(word)) # a single escaped "word" string passed
match_string = r'\b(?:{})\b'.format("|".join(map(re.escape, words))) # words list is escaped
match_string = rf'\b(?:{"|".join(map(re.escape, words))})\b' # Same as above for Python 3.6+
If the words to be matched as whole words may start or end with special characters, \b won't work; use unambiguous word boundaries:
match_string = r'(?<!\w){}(?!\w)'.format(re.escape(word))
match_string = r'(?<!\w)(?:{})(?!\w)'.format("|".join(map(re.escape, words)))
If the word boundaries are whitespace chars or the start/end of the string, use whitespace boundaries, (?<!\S)...(?!\S):
match_string = r'(?<!\S){}(?!\S)'.format(re.escape(word))
match_string = r'(?<!\S)(?:{})(?!\S)'.format("|".join(map(re.escape, words)))
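The three boundary styles above behave differently once the search terms start or end with non-word characters. The following is a minimal sketch comparing them; the helper name whole_word_pattern, the sample text, and the word list are illustrative assumptions, not part of the original answer.

```python
import re

def whole_word_pattern(words, boundary="word"):
    """Return a pattern matching any of `words` with the chosen boundary style.

    Hypothetical helper: the name and the `boundary` keyword are made up
    for this illustration.
    """
    alternation = "|".join(map(re.escape, words))
    boundaries = {
        "word": (r"\b", r"\b"),                 # plain word boundaries
        "unambiguous": (r"(?<!\w)", r"(?!\w)"),  # no word char on either side
        "whitespace": (r"(?<!\S)", r"(?!\S)"),   # whitespace or string edge on either side
    }
    left, right = boundaries[boundary]
    return f"{left}(?:{alternation}){right}"

text = "I use C++ and .NET, not (C++)."
words = ["C++", ".NET"]

results = {
    kind: re.findall(whole_word_pattern(words, kind), text)
    for kind in ("word", "unambiguous", "whitespace")
}
for kind, found in results.items():
    print(kind, found)
# word []
# unambiguous ['C++', '.NET', 'C++']
# whitespace ['C++']
```

Plain \b finds nothing here because there is no word boundary between "+" and a space, or between a space and ".". The whitespace variant is the strictest: it rejects ".NET," (followed by a comma) and "(C++)" (wrapped in parentheses).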
How to match entire words dynamically using Regex in Python
You forgot an r in your second '\b'.
re.search(r'\b' + re.escape(word) + r'\b', ...)
#                                   ^
In a regular (non-raw) Python string literal, the escape sequence \b has a special meaning and becomes \x08 (U+0008, backspace). The regex engine, seeing \x08, will try to match this literal character and fail.
Also, I used re.escape(word) to escape special regex characters, so, e.g., if a word is "etc. and more", the dot will be matched literally instead of matching any character.
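A short sketch of both points, using a made-up word ("a.b") and sample text chosen so that the difference is visible:

```python
import re

word = "a.b"          # illustrative search term containing a regex metacharacter
text = "a.b axb"

# '\b' in a regular string literal is a single backspace character,
# while r'\b' is the two-character regex word-boundary token.
is_backspace = '\b' == '\x08'
raw_length = len(r'\b')

# Missing r prefix: the pattern looks for literal backspace characters.
broken = re.findall('\b' + re.escape(word) + '\b', text)

# Raw strings but no re.escape: the dot matches any character.
unescaped = re.findall(r'\b' + word + r'\b', text)

# Raw strings and re.escape: only the literal "a.b" matches.
correct = re.findall(r'\b' + re.escape(word) + r'\b', text)

print(is_backspace, raw_length)  # True 2
print(broken)                    # []
print(unescaped)                 # ['a.b', 'axb']
print(correct)                   # ['a.b']
```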
Matching whole words with special characters with a dynamically built pattern
I suggest either unambiguous word boundaries (which match a string only if the search pattern is not enclosed with letters, digits or underscores):
String pattern = "(?<!\\w)"+Pattern.quote(subItem)+"(?!\\w)";
where (?<!\w) matches a location not preceded by a word char and (?!\w) fails if there is a word char immediately after the current position. Alternatively, you can use a variation that takes into account leading/trailing special chars of the potential match:
String pattern = "(?:\\B(?!\\w)|\\b(?=\\w))" + Pattern.quote(subword) + "(?:(?<=\\w)\\b|(?<!\\w)\\B)";
Details:
(?:\B(?!\w)|\b(?=\w)) - either a non-word boundary if the next char is not a word char, or a word boundary if the next char is a word char
Data\[3\] - this is a quoted subItem
(?:(?<=\w)\b|(?<!\w)\B) - either a word boundary if the preceding char is a word char, or a non-word boundary if the preceding char is not a word char.
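The answer above is written for Java; as a rough illustration, the same pattern can be ported to Python's re module, which supports \b, \B and lookarounds as well. The function name adaptive_word_pattern and the sample text are assumptions made for this sketch, and re.escape stands in for Pattern.quote.

```python
import re

def adaptive_word_pattern(literal):
    """Python port of the boundary pattern above (hypothetical helper name)."""
    left = r"(?:\B(?!\w)|\b(?=\w))"
    right = r"(?:(?<=\w)\b|(?<!\w)\B)"
    return left + re.escape(literal) + right

text = "Data[3] xData[3] Data[3]x Data[3]!"

# Plain \b...\b matches only the wrong occurrence, the one glued to "x".
plain_starts = [m.start() for m in re.finditer(r"\b" + re.escape("Data[3]") + r"\b", text)]

# The adaptive pattern matches the two standalone occurrences instead.
adaptive_starts = [m.start() for m in re.finditer(adaptive_word_pattern("Data[3]"), text)]

print(plain_starts)     # [17]
print(adaptive_starts)  # [0, 26]
```

With plain \b, the trailing boundary can only be satisfied when "]" is followed by a word character, which is the opposite of what a whole-word search wants.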
Filter list using dynamic whole word matching regex in dart
You may build the pattern dynamically:
var keys = ['dev', 'soft', 'angular', 'java'];
var regex = new RegExp("\\b(?:${keys.join('|')})\\b", caseSensitive: false);
var contactsAll = ['No match', 'I like java', 'I like javascript'];
var cc = contactsAll.where( (i) => regex.hasMatch(i) ).toList();
print(cc); // => [I like java]
The regex will look like \b(?:dev|soft|angular|java)\b and will match any of the keywords inside the non-capturing group as a whole word due to the \b word boundaries.
If the keys can contain special characters, but you still need a whole word search, you need to escape all special characters and use either unambiguous boundaries:
var regex = new RegExp("(?:^|\\W)(?:${keys.map((val) => val.replaceAll(new RegExp(r'[-\/\\^$*+?.()|[\]{}]'), r'\\$&')).join('|')})(?!\\w)", caseSensitive: false);
This results in a (?:^|\W)(?:dev|soft|angular|java)(?!\w) pattern, where (?:^|\W) matches the start of the string or a non-word char and (?!\w) requires the absence of a word char immediately to the right of the current location.
The .map((val) => val.replaceAll(new RegExp(r'[-\/\\^$*+?.()|[\]{}]'), r'\\$&')) part escapes the literal part for use within the regex.
Or whitespace boundaries:
var regex = new RegExp("(?:^|\\s)(?:${keys.map((val) => val.replaceAll(new RegExp(r'[-\/\\^$*+?.()|[\]{}]'), r'\\$&')).join('|')})(?!\\S)", caseSensitive: false);
This results in a (?:^|\s)(?:dev|soft|angular|java)(?!\S) pattern, where (?:^|\s) matches the start of the string or a whitespace char and (?!\S) requires the absence of a non-whitespace char immediately to the right of the current location.
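For comparison, here is a hedged Python sketch of the same list filter. It reuses the keys and contactsAll values from the Dart example; the Python variable names are assumptions, re.escape replaces the manual escaping, and re.IGNORECASE plays the role of caseSensitive: false.

```python
import re

keys = ["dev", "soft", "angular", "java"]
contacts_all = ["No match", "I like java", "I like javascript"]

# Escape each key, join into an alternation, and wrap in word boundaries.
regex = re.compile(
    r"\b(?:{})\b".format("|".join(map(re.escape, keys))),
    re.IGNORECASE,
)

cc = [contact for contact in contacts_all if regex.search(contact)]
print(cc)  # ['I like java']
```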
Dynamically match multiple words in a string to another one using regExp
For the situation you're describing, you don't even need regular expressions. If you split the search string on spaces, you can check that every one of the words to match is contained in the array of search words.
function matchesAllWords(searchWords, inputString) {
  var wordsToMatch = inputString.toLowerCase().split(' ');
  return wordsToMatch.every(
    word => searchWords.indexOf(word) >= 0);
}
In the snippet below, typing in the input causes a recalculation of the searchWords. The matching li elements are then given the .match class to highlight them.
function updateClasses(e) {
  var searchWords = e.target.value.toLowerCase().split(' ');
  listItems.forEach(listItem => listItem.classList.remove('match'));
  listItems.filter(
      listItem =>
        matchesAllWords(searchWords, listItem.innerText))
    .forEach(
      matchingListItem =>
        matchingListItem.classList.add('match'));
}
function matchesAllWords(searchWords, inputString) {
  var wordsToMatch = inputString.toLowerCase().split(' ');
  return wordsToMatch.every(
    word => searchWords.indexOf(word) >= 0);
}
function searchProperties(e) {
  var searchWords = e.target.value.toLowerCase().split(' ');
  for (var property in propertiesToSearch) {
    if (matchesAllWords(searchWords, property)) {
      console.log(property, propertiesToSearch[property]);
    }
  }
}
var propertiesToSearch = {
  "red apples": 1,
  "juicy fruit": 2
};
listItems = [].slice.call(
  document.getElementById('matches').querySelectorAll('li')
);
document.getElementById('search').addEventListener('keyup', updateClasses);
document.getElementById('search').addEventListener('keyup', searchProperties);
.match {
  color: green;
}
<label for="search">
  Search:
</label>
<input type="text" name="search" id="search" />
<ul id="matches">
  <li>red apples</li>
  <li>juicy fruits</li>
</ul>
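The regex-free check from the snippet above can also be sketched in Python. This is an illustrative translation, not part of the original answer: the function name matches_all_words and the sample inputs are assumptions, and str.split() without arguments splits on any run of whitespace, which differs slightly from JavaScript's split(' ').

```python
def matches_all_words(search_words, input_string):
    """Return True if every word of input_string appears in search_words."""
    words_to_match = input_string.lower().split()
    return all(word in search_words for word in words_to_match)

# What the user typed, lowercased and split into a set for fast lookups.
search_words = set("Red apples juicy".lower().split())

red_apples = matches_all_words(search_words, "red apples")
juicy_fruits = matches_all_words(search_words, "juicy fruits")

print(red_apples)    # True
print(juicy_fruits)  # False ("fruits" was not typed)
```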
Regex match two strings with given number of words in between strings
You can use something like
import re
text = 'I want apples and oranges'
k = 2
# Raw f-string so that \s and \w reach the regex engine unchanged
pattern = rf"apples(?:\s+\w+){{0,{k}}}\s+oranges"
m = re.search(pattern, text)
if m:
    print(m.group())
# => apples and oranges
Here, I used \w+ to match a word. If the word is a non-whitespace chunk, you need to use
pattern = rf"apples(?:\s+\S+){{0,{k}}}\s+oranges"
If you need to add word boundaries, see the posts "Word boundary with words starting or ending with special characters gives unexpected results" and "Match a whole word in a string using dynamic regex". For the current example, fr"\bapples(?:\s+\w+){{0,{k}}}\s+oranges\b" will work.
The pattern will look like apples(?:\s+\w+){0,k}\s+oranges and match
apples - an apples string
(?:\s+\w+){0,k} - zero to k repetitions of one or more whitespaces and one or more word chars
\s+ - one or more whitespaces
oranges - an oranges string.
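The pattern can be wrapped in a small helper so that both words and k are supplied at run time. This is only a sketch: the function name within_k_words and the test sentences are assumptions, and re.escape is added so that the two words are treated literally.

```python
import re

def within_k_words(first, second, k):
    """Compile a pattern matching `first`, at most k words, then `second`."""
    return re.compile(
        rf"\b{re.escape(first)}(?:\s+\w+){{0,{k}}}\s+{re.escape(second)}\b"
    )

text = "I want apples and oranges"

no_gap_allowed = within_k_words("apples", "oranges", 0).search(text)
one_word_allowed = within_k_words("apples", "oranges", 1).search(text)

print(no_gap_allowed)            # None: "and" sits between the two words
print(one_word_allowed.group())  # apples and oranges
```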
How can I match a whole word in JavaScript?
To use a dynamic regular expression see my updated code:
new RegExp("\\b" + lookup + "\\b").test(textbox.value)
Your specific example is backwards:
alert((/\b(2)\b/g).test(lookup));
Regex Match ALL Dynamic string words to data attribute
I suggest using
var regExApproved = new RegExp(string.split(",").map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|'), 'i');
Then, to check if a regex matches a string, it makes more sense to use the RegExp#test method:
regExApproved.test($(this).attr("data-payment"))
Notes:
.split(",") - splits into comma-separated chunks
.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')) (or .map(function(x) { return x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&'); })) - escapes the alternatives
.join('|') - creates the final alternation pattern.
Edit
It seems that your string is an array of values. In that case, use
var string= ['visa','mastercard'];
var regExApproved = new RegExp(string.join('|'), 'i');
// If the items must be matched as whole words:
// var regExApproved = new RegExp('\\b(?:' + string.join('|') + ')\\b', 'i');
// If the array items contain special chars:
// var regExApproved = new RegExp(string.map(x => x.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&')).join('|'), 'i');
console.log(regExApproved.test("There is MasterCard here"));
How to match multiple words in Python
Change your regex to:
keywords_to_match = re.compile(r'\b(?:new zealand|mars|murica)\b')
I'm not sure you need a regex in your case. You can simply do:
keywords = ['new zealand', 'mars', 'murica']
titles = ['walk to new zealand', 'fly to mars', 'drive to murica']
[t for t in titles if any(k in t for k in keywords)]
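One caveat worth illustrating: the plain substring test also hits keywords that occur inside longer words, while the \b-wrapped regex does not. The sketch below assumes a keywords list matching the regex above and adds a made-up extra title ("marshal the troops") to show the difference.

```python
import re

keywords = ["new zealand", "mars", "murica"]
titles = [
    "walk to new zealand",
    "fly to mars",
    "drive to murica",
    "marshal the troops",  # hypothetical title containing "mars" inside a word
]

# Substring test: matches "mars" inside "marshal".
substring_hits = [t for t in titles if any(k in t for k in keywords)]

# Whole-word regex built dynamically from the same keywords.
pattern = re.compile(r"\b(?:{})\b".format("|".join(map(re.escape, keywords))))
regex_hits = [t for t in titles if pattern.search(t)]

print(len(substring_hits))  # 4
print(regex_hits)           # ['walk to new zealand', 'fly to mars', 'drive to murica']
```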
Python match string with dynamic number in string
re.match(r'([a-z])+/(\d)+/(\w)+', 'abc/11/xyz')
(a-z)+ matches a-z literally. It seems you want to match any characters between a and z, so you need to use square brackets, ([a-z])+, to make a character class.
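A quick check of the difference is sketched below. The last variant, which moves the quantifiers inside the groups, is an addition of this sketch rather than part of the original answer; it is shown because a repeated group such as ([a-z])+ keeps only its last repetition.

```python
import re

s = "abc/11/xyz"

# Parentheses: the group matches the literal text "a-z", so nothing is found.
literal = re.match(r"(a-z)+/(\d)+/(\w)+", s)

# Square brackets: a character class, so the whole string matches.
char_class = re.match(r"([a-z])+/(\d)+/(\w)+", s)

# Quantifier inside the group (an assumption about what is usually wanted):
# each group captures the full run instead of only its last character.
full_groups = re.match(r"([a-z]+)/(\d+)/(\w+)", s)

print(literal)               # None
print(char_class.group())    # abc/11/xyz
print(char_class.groups())   # ('c', '1', 'z')
print(full_groups.groups())  # ('abc', '11', 'xyz')
```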