Regular expression to find two strings anywhere in input
/^.*?\bcat\b.*?\bmat\b.*?$/m
Using the m modifier (which ensures the beginning/end metacharacters match on line breaks rather than at the very beginning and end of the string):
^ matches the line beginning
.*? matches anything on the line before...
\b matches the first occurrence of a word boundary (as @codaddict discussed)
then the string cat and another word boundary; note that underscores are treated as "word" characters, so _cat_ would not match*
.*?: any characters before...
boundary, mat, boundary
.*?: any remaining characters before...
$: the end of the line.
It's important to use \b to ensure the specified words aren't part of longer words. Non-greedy wildcards (.*?) are preferable to greedy ones (.*): on a string like "There is a cat on top of the mat which is under the cat." a greedy .* first runs to the last occurrence of "cat" rather than the first, and has to backtrack before the line can match.
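As a rough illustration of the points above, here is a small Python sketch. It assumes Python's re module, where re.MULTILINE plays the role of the m modifier; the sample lines are made up for the demonstration.

```python
import re

# Same pattern as above; re.MULTILINE stands in for the /m modifier.
ORDERED = re.compile(r"^.*?\bcat\b.*?\bmat\b.*?$", re.MULTILINE)

# Hypothetical sample input, one case per line.
sample = "\n".join([
    "The cat sat on the mat.",        # both words, in order
    "The mat is under the cat.",      # both words, wrong order
    "Concatenate the format string",  # only substrings of longer words
    "There is a cat on top of the mat which is under the cat.",
])

# With no capturing groups, findall returns each matched line in full.
matched_lines = ORDERED.findall(sample)
for line in matched_lines:
    print(line)
```

Only the first and last sample lines are printed: the second has the words in the wrong order, and in the third \b rejects "cat" and "mat" because they sit inside longer words.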
* If you want to be able to match _cat_, you can use:
/^.*?(?:\b|_)cat(?:\b|_).*?(?:\b|_)mat(?:\b|_).*?$/m
which matches either underscores or word boundaries around the specified words. (?:) indicates a non-capturing group, which can help with performance or avoid unwanted captures.
Edit: A question was raised in the comments about whether the solution would work for phrases rather than just words. The answer is, absolutely yes. The following would match "A line which includes both the first phrase and the second phrase":
/^.*?(?:\b|_)first phrase here(?:\b|_).*?(?:\b|_)second phrase here(?:\b|_).*?$/m
Edit 2: If order doesn't matter you can use:
/^.*?(?:\b|_)(first(?:\b|_).*?(?:\b|_)second|second(?:\b|_).*?(?:\b|_)first)(?:\b|_).*?$/m
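The either-order pattern can also be generated rather than typed by hand. The following is a minimal Python sketch, assuming Python's re module; either_order is a hypothetical helper name, and re.escape is added so that phrases containing regex metacharacters are treated literally.

```python
import re

def either_order(first: str, second: str) -> re.Pattern:
    """Build the order-insensitive pattern for two literal words or phrases."""
    a, b = re.escape(first), re.escape(second)
    edge = r"(?:\b|_)"  # a word boundary or a literal underscore
    body = f"{a}{edge}.*?{edge}{b}|{b}{edge}.*?{edge}{a}"
    return re.compile(f"^.*?{edge}(?:{body}){edge}.*?$", re.MULTILINE)

pattern = either_order("cat", "mat")
print(bool(pattern.search("cat and mat")))                 # in order
print(bool(pattern.search("the mat is under the _cat_")))  # reversed, with underscores
print(bool(pattern.search("concatenate the mat")))         # "cat" is inside a longer word
```

The alternation doubles in size with every extra word, so this approach is only practical for two or perhaps three terms.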
And if performance is really an issue here, lookaround (if your regex engine supports it) might perform better than the above, though it probably won't. I'll leave both the arguably more complex lookaround version and the performance testing as an exercise for the reader.
Edited per @Alan Moore's comment. I didn't have a chance to test it, but I'll take your word for it.
Regex match two words in a string
This will match exactly the words "Mac" and "ExchangeWebServices" with anything else between them:
\bMac\b.*\bExchangeWebServices\b
Regex 101 Example: https://regex101.com/r/sK2qG1/4
Regex to match string containing two names in any order
You can do checks using positive lookaheads. Here is a summary from the indispensable regular-expressions.info:
Lookahead and lookbehind, collectively called “lookaround”, are
zero-length assertions...lookaround actually matches characters, but
then gives up the match, returning only the result: match or no match.
That is why they are called “assertions”. They do not consume
characters in the string, but only assert whether a match is possible
or not.
It then goes on to explain that positive lookaheads are used to assert that what follows matches a certain expression without taking up characters in that matching expression.
So here is an expression using two consecutive positive lookaheads to assert that the phrase contains jack and james in either order:
^(?=.*\bjack\b)(?=.*\bjames\b).*$
Test it.
The expressions in parentheses starting with ?= are the positive lookaheads. I'll break down the pattern:
^ asserts the start of the expression to be matched.
(?=.*\bjack\b) is the first positive lookahead, saying that what follows must match .*\bjack\b.
.* means any character zero or more times.
\b means any word boundary (the position between a word character and white space, the start of the expression, the end of the expression, etc.).
jack is literally those four characters in a row (the same for james in the next positive lookahead).
$ asserts the end of the expression to be matched.
So the first lookahead says "what follows (and is not itself a lookahead or lookbehind) must be an expression that starts with zero or more of any characters followed by a word boundary and then jack and another word boundary," and the second lookahead says "what follows must be an expression that starts with zero or more of any characters followed by a word boundary and then james and another word boundary." After the two lookaheads is .* which simply matches any characters zero or more times and $ which matches the end of the expression.
"start with anything then jack or james then end with anything" satisfies the first lookahead because there are a number of characters then the word jack, and it satisfies the second lookahead because there are a number of characters (which just so happens to include jack, but that is not necessary to satisfy the second lookahead) then the word james. Neither lookahead asserts the end of the expression, so the .* that follows can go beyond what satisfies the lookaheads, such as "then end with anything".
I think you get the idea, but just to be absolutely clear, here it is with jack and james reversed, i.e. "start with anything then james or jack then end with anything". It satisfies the first lookahead because there are a number of characters (which just so happens to include james, but that is not necessary to satisfy the first lookahead) then the word jack, and it satisfies the second lookahead because there are a number of characters then the word james. As before, neither lookahead asserts the end of the expression, so the .* that follows can go beyond what satisfies the lookaheads, such as "then end with anything".
This approach has the advantage that you can easily specify multiple conditions.
^(?=.*\bjack\b)(?=.*\bjames\b)(?=.*\bjason\b)(?=.*\bjules\b).*$
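Because each condition is an independent lookahead, the pattern can be assembled from a list of names. Here is a minimal Python sketch, assuming Python's re module; all_words_pattern is a hypothetical helper name and the test sentences are made up.

```python
import re

def all_words_pattern(words):
    """Build ^(?=.*\\bw1\\b)(?=.*\\bw2\\b)....*$ for every word in the list."""
    lookaheads = "".join(rf"(?=.*\b{re.escape(w)}\b)" for w in words)
    return re.compile(rf"^{lookaheads}.*$")

pattern = all_words_pattern(["jack", "james", "jason", "jules"])

print(bool(pattern.search("jules met jason, james and jack")))  # all four, any order
print(bool(pattern.search("jack and james only")))              # two names missing
print(bool(pattern.search("jackson james jason jules")))        # "jack" is inside "jackson"
```

Adding a further required word is just one more list entry, which is the advantage mentioned above.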
Match a string between two or more words regardless of order
You may use a backreference + a subroutine:
\b(longword1|longword2)\b.*?\b(?!\1\b)(?1)\b
Expanding it for three alternatives:
\b(longword1|longword2|longword3)\b.*?\b(?!\1\b)((?1))\b.*?\b(?!(?:\1|\2)\b)(?1)\b
See the regex demo and this regex demo, too. So, the list of words will be in Group 1, and you will only need to add backreferences before the subsequent subroutines.
Details
\b(longword1|longword2)\b - a whole word, either longword1 or longword2
.*? - any 0 or more chars other than line break chars, as few as possible
\b - a word boundary
(?!\1\b) - there cannot be the same text as matched in Group 1 followed with a word boundary
(?1) - a subroutine that matches the same pattern as in Group 1
\b - a word boundary
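Subroutine calls such as (?1) are a PCRE-style feature, and Python's built-in re module does not support them. The sketch below is therefore an approximation rather than the pattern above: it keeps the backreference check but repeats the alternation as text where the subroutine would go. two_distinct_words is a hypothetical helper name.

```python
import re

def two_distinct_words(words):
    """Match one listed word, then a *different* listed word later on the line."""
    alternation = "|".join(re.escape(w) for w in words)
    # (?!\1\b) forbids repeating whatever Group 1 captured; the second
    # copy of the alternation stands in for the (?1) subroutine call.
    return re.compile(rf"\b({alternation})\b.*?\b(?!\1\b)(?:{alternation})\b")

pattern = two_distinct_words(["longword1", "longword2"])

m = pattern.search("x longword2 y longword1 z")
print(m.group() if m else None)

# The same word twice does not count as two different words.
print(pattern.search("longword1 and longword1"))
```

With an engine that supports subroutines, the original form avoids duplicating the word list inside the pattern.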
Regex match two strings with given number of words in between strings
You can use something like
import re
text = 'I want apples and oranges'
k = 2
# Raw f-string: {{ and }} are literal braces, {k} is interpolated
pattern = fr"apples(?:\s+\w+){{0,{k}}}\s+oranges"
m = re.search(pattern, text)
if m:
    print(m.group())
# => apples and oranges
Here, I used \w+ to match a word. If a word is any non-whitespace chunk, you need to use
pattern = fr"apples(?:\s+\S+){{0,{k}}}\s+oranges"
See this Python demo.
If you need to add word boundaries, you need to study the "Word boundary with words starting or ending with special characters gives unexpected results" and "Match a whole word in a string using dynamic regex" posts. For the current example, fr"\bapples(?:\s+\w+){{0,{k}}}\s+oranges\b" will work.
The pattern will look like apples(?:\s+\w+){0,k}\s+oranges and match
apples - an apples string
(?:\s+\w+){0,k} - zero to k repetitions of one or more whitespaces and one or more word chars
\s+ - one or more whitespaces
oranges - an oranges string.
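When the two terms come from user input, \b can misbehave if a term starts or ends with a non-word character. One possible way around that, sketched here as an assumption rather than as the approach of the posts mentioned above, is to escape the terms and use (?<!\w) and (?!\w) lookarounds in place of \b. within_k_words is a hypothetical helper name.

```python
import re

def within_k_words(first, second, k):
    """Match `first`, then at most k whitespace-separated chunks, then `second`."""
    gap = rf"(?:\s+\S+){{0,{k}}}"  # {{ }} are literal braces, {k} is interpolated
    return re.compile(
        rf"(?<!\w){re.escape(first)}{gap}\s+{re.escape(second)}(?!\w)"
    )

m = within_k_words("apples", "oranges", 2).search("I want apples and oranges")
print(m.group() if m else None)

# k = 0 requires the two terms to be adjacent.
print(within_k_words("apples", "oranges", 0).search("I want apples and oranges"))

# Terms containing regex metacharacters are matched literally.
print(within_k_words("C++", "code", 1).search("I like C++ and code"))
```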
regex to match 2 different words in a string
You can use /pops?/ if you want to match partially.
const obj = {time_pop: 'fhfvla', icon: 'dsfval', home_pops: 'valffg', title: 'sdfsdfs', pop: 'sfsdfsd', rattle: 'sdfdsf', pops: 'sfsdfsdf'}
// Keep only the entries whose key contains "pop" or "pops"; test() does not need the g flag
const only = Object.entries(obj).filter(([k]) => /pops?/.test(k))
console.log(only)
Regex match multiple words that may be separated by another word giving a list of possible intermediate words
You may create a regex like
/\bmake(?:\s+(?:of|the|a))*\s+wish\b/gi
See the regex demo. Details
\b - a word boundary
make - a word make
(?:\s+(?:of|the|a))* - 0 or more occurrences of
  \s+ - 1+ whitespaces
  (?:of|the|a) - either of, the or a (you might want to use an? to also match an)
\s+ - 1+ whitespaces
wish - a word wish
\b - a word boundary
In your code, you may use
let stopword: string[] = ["of", "the", "a"];
let to_match: string = "make wish";
let text: string = "make wish wish make a wish wish wish make the a wish make";
const regex = new RegExp(`\\b${to_match.split(/\s+/).join("(?:\\s+(?:" + stopword.join("|") + "))*\\s+")}\\b`, "gi");
console.log(text.match(regex));
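For comparison, the same construction can be sketched in Python. This assumes Python's re module and reuses the variable names and sample text from the snippet above; re.escape is an addition so that stopwords or search words containing metacharacters are taken literally.

```python
import re

stopwords = ["of", "the", "a"]
to_match = "make wish"
text = "make wish wish make a wish wish wish make the a wish make"

# Optional run of stopwords, then the whitespace before the next word.
filler = r"(?:\s+(?:" + "|".join(map(re.escape, stopwords)) + r"))*\s+"
pattern = re.compile(
    r"\b" + filler.join(map(re.escape, to_match.split())) + r"\b",
    re.IGNORECASE,
)

# No capturing groups are used, so findall returns the full matches.
matches = pattern.findall(text)
print(matches)
# ['make wish', 'make a wish', 'make the a wish']
```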
PHP- Regex to match words with more than two letters
The reason it is not working is that the pattern [^\w{2,}]*([\s]+([^\w{2,}])*|$) matches only spaces, and then you split on those spaces, resulting in an array with all the words. This is due to \s, which matches a whitespace char, and to the negated character class [^\w{2,}], which matches any single char other than a word char, {, a comma or }, and therefore also matches whitespace chars.
If you want to use split, you also have to match the single word characters so that they are not part of the result.
If you must use split, you can match either a single word character surrounded by optional horizontal whitespace characters to remove those as well, or match 1+ horizontal whitespace characters.
\h*\b\w\b\h*|\h+
Regex demo
For example
$input_string = "I have a cake inside my fridge";
$string_array = preg_split("/\h*\b\w\b\h*|\h+/", $input_string, -1, PREG_SPLIT_NO_EMPTY);
print_r($string_array);
Output
Array
(
    [0] => have
    [1] => cake
    [2] => inside
    [3] => my
    [4] => fridge
)
If you want to match all strings that consist of at least 2 non-whitespace characters, you could also use \S{2,} with preg_match_all.
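For readers more comfortable outside PHP, both ideas can be sketched in Python. This assumes Python's re module, which has no \h, so [ \t] stands in for horizontal whitespace here; re.findall plays the role of preg_match_all.

```python
import re

input_string = "I have a cake inside my fridge"

# Matching approach: collect every run of 2+ non-whitespace characters.
words = re.findall(r"\S{2,}", input_string)
print(words)

# Splitting approach: split on single word characters (with surrounding
# blanks) or on runs of blanks, then drop the empty strings, which is
# what PREG_SPLIT_NO_EMPTY does in the PHP version.
parts = [p for p in re.split(r"[ \t]*\b\w\b[ \t]*|[ \t]+", input_string) if p]
print(parts)
```

Both lists contain have, cake, inside, my and fridge for this input.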