PHP Preg_Match to Find Whole Words

preg_match('/\b(express\w+)\b/', $string, $matches); // matches expression
preg_match('/\b(\w*form\w*)\b/', $string, $matches); // matches perform,
// formation, unformatted

Where:

  • \b is a word boundary
  • \w+ is one or more "word" characters*
  • \w* is zero or more "word" characters

See the manual on escape sequences for PCRE.


* Note: although not really a "word character", the underscore _ is also included in the character class \w.
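
As a quick check of the points above, the following sketch runs both patterns against a sample sentence. The sample text and the variable names are made up for illustration; the last match shows that \w also covers the underscore.

```php
<?php
// Sample text is an assumption for illustration only.
$string = 'An expression may perform some formation of unformatted snake_case_form data.';

// \w+ requires at least one word character after "express".
preg_match('/\b(express\w+)\b/', $string, $m);
echo $m[1], "\n"; // expression

// \w* allows zero or more word characters on either side of "form".
// Because \w includes "_", the whole token snake_case_form is one match.
preg_match_all('/\b(\w*form\w*)\b/', $string, $all);
print_r($all[1]); // perform, formation, unformatted, snake_case_form
```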

PHP Preg_match each word in a string to find matches with all the items in an array that contains forbidden words

The key is to use the \b word-boundary assertion:

<?php
$forbidden = ['pool', 'cat', 'rain'];

// Examples, each with the result the pattern below should give
$examples = [
    // pool:
    'the pool is cold',                  // TRUE: "pool" is a whole word
    'the poolside is yellow',            // TRUE: "pool" starts a word
    'the carpool lane is closed',        // FALSE: "pool" is not at the start of a word
    'the print spooler is not working',  // FALSE: "pool" is not at the start of a word

    // cat:
    'the cats are wasting my time',      // TRUE: "cat" starts a word
    'the cat is wasting my time',        // TRUE: "cat" is a whole word
    'joe is using the bobcat right now', // FALSE: "cat" is not at the start of a word
];

// \b anchors each forbidden word to the start of a word; the i flag ignores case
$pattern = '/\b(' . implode('|', $forbidden) . ')/i';

foreach ($examples as $example) {
    echo (preg_match($pattern, $example) ? 'TRUE' : 'FALSE') . ': ' . $example . "\n";
}

http://sandbox.onlinephpfunctions.com/code/f424e6c78d3b13905486f646667c8bc9d48eda3a
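
If the forbidden words might contain regex metacharacters, the same idea could be wrapped in a small function that escapes each word first. This is only a sketch: the function name containsForbidden is hypothetical, and it keeps the leading-\b-only behaviour of the pattern above, so "cats" and "poolside" still count as hits.

```php
<?php
// Hypothetical helper: true when any forbidden word starts a word in $text.
function containsForbidden(string $text, array $forbidden): bool
{
    // Escape each word so characters such as "." or "+" are matched literally.
    $escaped = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $forbidden);

    // \b only on the left: the forbidden word must begin a word.
    $pattern = '/\b(?:' . implode('|', $escaped) . ')/i';
    return preg_match($pattern, $text) === 1;
}

$forbidden = ['pool', 'cat', 'rain'];
var_dump(containsForbidden('the poolside is yellow', $forbidden));     // bool(true)
var_dump(containsForbidden('the carpool lane is closed', $forbidden)); // bool(false)
```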

Regex: Match whole word in string PHP

You can use the following regex

\bIndia(?=$|\s)
  • (?=$|\s) is a positive lookahead: $ asserts the position at the end of the string, and \s matches any whitespace character [\r\n\t\f ]

If you also want to allow a , (comma) or . (dot) after the word, you can simply use

\bIndia(?=[.,]|$|\s)
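
A minimal sketch of how the second pattern behaves in PHP. The function name matchesIndia and the sample strings are assumptions for illustration.

```php
<?php
// Hypothetical helper wrapping the lookahead pattern shown above.
function matchesIndia(string $text): bool
{
    // The word must be followed by ".", ",", whitespace, or the end of the string.
    return preg_match('/\bIndia(?=[.,]|$|\s)/', $text) === 1;
}

var_dump(matchesIndia('I live in India'));   // bool(true)
var_dump(matchesIndia('India, China.'));     // bool(true)
var_dump(matchesIndia('Indiana is a state')); // bool(false)
var_dump(matchesIndia("India's coast"));     // bool(false)
```

The last case is rejected because the apostrophe is not in the lookahead's list of allowed followers; add it to the character class if that is not what you want.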

How to make preg_match to find whole word but not separate hyphen-words?

You can use lookbehind and lookahead assertions. They check the characters before and after the match without including them in it.

For example, use \b(?<!-)xyz(?!-)\b to find whole-word occurrences of xyz that have no - directly before or after.
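
A short sketch of that pattern in use; the sample strings and the $results variable are made up for illustration.

```php
<?php
$pattern = '/\b(?<!-)xyz(?!-)\b/';

$samples = ['abc xyz def', 'abc-xyz def', 'xyz-abc', 'xyzzy'];
$results = [];
foreach ($samples as $sample) {
    // preg_match returns 1 on a match and 0 otherwise.
    $results[$sample] = preg_match($pattern, $sample);
    echo $results[$sample], ': ', $sample, "\n";
}
// Only 'abc xyz def' matches: the others have a hyphen next to xyz
// or, for 'xyzzy', no word boundary after it.
```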

PHP substring matching whole words

Put a word boundary on both sides of the search string and escape it with preg_quote:

function StringMatch($str1,$str2)
{
    return preg_match('/\b'.preg_quote($str1,'/').'\b/i', $str2);
}

echo StringMatch("apple watch", "apple watches"); // output 0
echo "\n";
echo StringMatch("apple watch", "apple watch repairs"); // output 1
echo "\n";
echo StringMatch("apple watch", "new apple watch"); // output 1
echo "\n";
echo StringMatch("apple watch", "pineapple watch"); // output 0
echo "\n";

Output:

0
1
1
0

preg_quote is necessary to avoid issues where $str1 contains characters such as ., which in a regex matches any character.

Furthermore you could strip punctuation like this

$str1 = preg_replace('/[^\w\s]+/', '', $str1);

For example:

echo StringMatch("apple watch.", "apple watch repairs"); // output 1

Without removing the punctuation, this will return 0. Whether or not that is important is up to you.

UPDATE

To match the words in any order, for example:

//words out of order
echo StringMatch("watch apple", "new apple watch"); // output 1

The easy way is implode/explode:

function StringMatch($str1,$str2)
{
    //use one or the other
    $str1 = preg_replace('/[^\w\s]+/', '', $str1);
    //$str1 = preg_quote($str1,'/');
    $words = explode(' ', $str1);
    preg_match_all('/\b('.implode('|',$words).')\b/i', $str2, $matches);
    //all words matched when the counts agree (assumes no word repeats in $str2)
    return count($words) == count($matches[0]) ? '1' : '0';
}

You can also skip the explode/implode and use

 $str1 = preg_replace('/\s/', '|', $str1);

Which can be combined with the other preg_replace

 $str1 = preg_replace(['/[^\w\s]+/','/\s/'], ['','|'], $str1);

Or all together

function StringMatch($str1,$str2)
{
    $str1 = preg_replace(['/[^\w\s]+/','/\s/'], ['','|'], $str1);
    preg_match_all('/\b('.$str1.')\b/i', $str2, $matches);
    return (substr_count($str1, '|')+1) == count($matches[0]) ? '1' : '0';
}

In that case you can't count the words array, but you can count the number of | pipes, which is 1 less than the number of words (hence the +1). That only matters if you care that all the words match.
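
Comparing raw counts gives a false negative when a search word occurs more than once in the text. One possible variation, sketched here under the assumption that you only care whether each distinct word appears at least once, is to compare the sets of distinct words instead. The function name stringMatchAllWords is hypothetical and is not part of the answer above.

```php
<?php
// Hypothetical variant: true when every distinct word of $needle
// appears as a whole word somewhere in $haystack, in any order.
function stringMatchAllWords(string $needle, string $haystack): bool
{
    // Strip punctuation, lowercase, and split on runs of whitespace.
    $clean = preg_replace('/[^\w\s]+/', '', $needle);
    $words = array_unique(preg_split('/\s+/', strtolower($clean), -1, PREG_SPLIT_NO_EMPTY));
    if (count($words) === 0) {
        return false; // nothing to search for
    }

    // Only word characters remain, so the words are safe to place in the pattern.
    preg_match_all('/\b(?:' . implode('|', $words) . ')\b/i', $haystack, $matches);

    // Count distinct matches so repeated words in $haystack do not skew the result.
    $found = array_unique(array_map('strtolower', $matches[0]));
    return count($found) === count($words);
}

var_dump(stringMatchAllWords('watch apple', 'new apple watch'));   // bool(true)
var_dump(stringMatchAllWords('watch apple', 'apple apple watch')); // bool(true)
var_dump(stringMatchAllWords('apple watch', 'pineapple watch'));   // bool(false)
```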

PREG_MATCH check all words and condition

You need anchored look-aheads:

^(?=.*\bWord1\b)(?=.*\bWord2\b)(?=.*\bWord3\b)

If the input string contains newlines, add the /s modifier so that . also matches them.

Here is a demo:

$re = '/^(?=.*\bWord1\b)(?=.*\bWord2\b)(?=.*\bWord3\b)/'; 
$str = "Word4 Word2 Word1 Word3 Word5 Word7";

$myPregMatch = (preg_match($re, $str));
if ($myPregMatch){
    echo "FOUND !!!";
}

Result: FOUND !!!
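
For a variable list of words, the lookaheads could be built in a loop. The sketch below assumes a hypothetical helper named containsAllWords; it escapes each word with preg_quote and uses the /s modifier so the check also works across newlines.

```php
<?php
// Hypothetical helper: true when every word in $words occurs
// as a whole word somewhere in $text, in any order.
function containsAllWords(string $text, array $words): bool
{
    $lookaheads = '';
    foreach ($words as $word) {
        // One anchored lookahead per required word.
        $lookaheads .= '(?=.*\b' . preg_quote($word, '/') . '\b)';
    }
    // /s lets "." match newlines, so words on later lines are found too.
    return preg_match('/^' . $lookaheads . '/s', $text) === 1;
}

$text = "Word4 Word2\nWord1 Word3 Word5";
var_dump(containsAllWords($text, ['Word1', 'Word2', 'Word3'])); // bool(true)
var_dump(containsAllWords($text, ['Word1', 'Word6']));          // bool(false)
```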

PHP Preg_match match exact word

Use the faster strpos if you only need to check for the existence of two numbers.

if(strpos($mystring, '|7|') !== FALSE AND strpos($mystring, '|11|') !== FALSE)
{
    // Found them
}

Or use a slower regex to capture the number:

preg_match('/\|(7|11)\|/', $mystring, $match);

Use regexpal to test regexes for free.
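
To make the difference between the two approaches concrete, here is a small sketch. The pipe-delimited sample value of $mystring is an assumption about the data format; note that the strpos version checks that both numbers are present, while the regex stops at the first of the two it finds.

```php
<?php
// Assumed format: numbers separated and surrounded by pipes.
$mystring = '|3|7|11|20|';

// Both delimited values must be present.
$hasBoth = strpos($mystring, '|7|') !== false
    && strpos($mystring, '|11|') !== false;
var_dump($hasBoth); // bool(true)

// The regex captures whichever of the two numbers appears first.
preg_match('/\|(7|11)\|/', $mystring, $match);
echo $match[1], "\n"; // 7

// The surrounding pipes prevent partial hits such as 7 inside 17.
var_dump(strpos('|17|111|', '|7|')); // bool(false)
```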

PHP preg_match - catching full 'words' (even if they start with a special character)

Intuition breaks down when applying a word boundary to a pattern that contains non-word characters. What you seem to want, for this case, is \s:

function contains($str, $bads)
{
    $template = '/(\s+%1$s\s+|^\s*%1$s\s+|\s+%1$s\s*$|^\s*%1$s\s*$)/';
    foreach ($bads as $a) {
        $regex = sprintf($template, preg_quote($a, '/'));
        if (preg_match($regex, $str)) {
            return true;
        }
    }
    return false;
}

The regex checks for four different cases, each separated by |:

  1. One or more spaces, the bad pattern, then one or more spaces.
  2. Start of input, zero or more spaces, the bad pattern, then one or more spaces.
  3. One or more spaces, the bad pattern, zero or more spaces, then end of input.
  4. Start of input, zero or more spaces, the bad pattern, zero or more spaces, then end of input.

If you could guarantee that all of your bad patterns contained only word characters - [0-9A-Za-z_] - then \b would work just fine. Since that is not true here, you need to deploy a more explicit pattern.
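
The sketch below illustrates why \b misbehaves here, using a made-up bad pattern "#tag". It also shows a more compact alternative built on lookarounds, (?<!\S) and (?!\S), which should accept the same whitespace-delimited cases as the four alternatives above. The helper name containsToken is hypothetical.

```php
<?php
// Hypothetical helper: the token must not be preceded or followed
// by a non-whitespace character (start and end of input also qualify).
function containsToken(string $str, string $token): bool
{
    $regex = '/(?<!\S)' . preg_quote($token, '/') . '(?!\S)/';
    return preg_match($regex, $str) === 1;
}

// \b needs a word character on exactly one side, so it fails between
// a space and "#", yet succeeds between "h" and "#".
var_dump(preg_match('/\b#tag\b/', 'a #tag here')); // int(0)
var_dump(preg_match('/\b#tag\b/', 'hash#tag'));    // int(1)

// The whitespace-based check gives the intended answers.
var_dump(containsToken('a #tag here', '#tag'));    // bool(true)
var_dump(containsToken('hash#tag', '#tag'));       // bool(false)
```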

preg_match_all to find all words after a certain string

First, once you have extracted :amsterdam:hotel you can easily split the string.

If you want to directly obtain separated words, you can use a \G based pattern:

preg_match_all('~(?:\G(?!\A)|cache:search):\K[^:]+~', $subject, $matches);

Where \G matches the position immediately after the previous match. (Note that \G matches the start of the string too, that's why I added (?!\A).)
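
A minimal sketch of the pattern in action. The value of $subject is an assumption based on the :amsterdam:hotel fragment mentioned above.

```php
<?php
// Assumed input: the marker "cache:search" followed by colon-separated words.
$subject = 'cache:search:amsterdam:hotel';

// First match: "cache:search" is consumed, then \K discards it from the result.
// Later matches: \G continues exactly where the previous match ended.
preg_match_all('~(?:\G(?!\A)|cache:search):\K[^:]+~', $subject, $matches);

print_r($matches[0]); // amsterdam, hotel
```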

Using preg_match to find all words in a list

Here is some example code with preg_match_all() that also shows how to remove the nesting:

$pattern = '/\b(?:amsterdam|paris|zurich|munich|frankfurt|bulle)\b/';
$result = preg_match_all($pattern, $subject, $matches);

# Check for errors in the pattern
if (false === $result) {
    throw new Exception(sprintf('Regular Expression failed: %s.', $pattern));
}

# Get the result, for your pattern that's the first element of $matches
$foundCities = $result ? $matches[0] : array();

printf("Found %d city/cities: %s.\n", count($foundCities), implode('; ', $foundCities));

As $foundCities is now a simple array, you can iterate over it directly as well:

foreach($foundCities as $index => $city) {
    echo $index, '. : ', $city, "\n";
}

No need for a nested loop as the $matches return value has been normalized already. The concept is to make the code return / create the data as you need it for further processing.


