How to Search in an Array with Preg_Match

How to search in an array with preg_match?

In this post I'll provide you with three different methods of doing what you ask for. I actually recommend using the last snippet, since it's easiest to comprehend as well as being quite neat in code.

How do I see which elements in an array match my regular expression?

There is a function dedicated to just this purpose: preg_grep. It takes a regular expression as its first parameter and an array as its second, and returns the matching elements with their original keys preserved.

See the below example:

$haystack = array (
    'say hello',
    'hello stackoverflow',
    'hello world',
    'foo bar bas'
);

$matches = preg_grep ('/^hello (\w+)/i', $haystack);

print_r ($matches);

output

Array
(
[1] => hello stackoverflow
[2] => hello world
)

Documentation

  • PHP: preg_grep - Manual
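
Note that the keys in the output above are 1 and 2, not 0 and 1, because preg_grep keeps the original keys. Here is a minimal sketch, using the same sample data, of reindexing the result with array_values and of the PREG_GREP_INVERT flag, which returns the non-matching elements instead. The variable names $matching and $nonMatching are just for illustration.

```php
<?php
// Same sample data as in the answer above.
$haystack = array(
    'say hello',
    'hello stackoverflow',
    'hello world',
    'foo bar bas',
);

// preg_grep keeps the original keys; array_values reindexes from 0.
$matching = array_values(preg_grep('/^hello (\w+)/i', $haystack));

// PREG_GREP_INVERT returns the elements that do NOT match the pattern.
$nonMatching = array_values(
    preg_grep('/^hello (\w+)/i', $haystack, PREG_GREP_INVERT)
);

print_r($matching);
print_r($nonMatching);
```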

But I just want to get the value of the specified groups. How?

array_reduce combined with preg_match can solve this in a clean manner; see the snippet below.

$haystack = array (
    'say hello',
    'hello stackoverflow',
    'hello world',
    'foo bar bas'
);

function _matcher ($m, $str) {
    if (preg_match ('/^hello (\w+)/i', $str, $matches)) {
        $m[] = $matches[1];
    }

    return $m;
}

// N O T E :
// ------------------------------------------------------------------------------
// you could specify '_matcher' as an anonymous function directly to
// array_reduce though that kind of decreases readability and is therefore
// not recommended, but it is possible.

$matches = array_reduce ($haystack, '_matcher', array ());

print_r ($matches);

output

Array
(
[0] => stackoverflow
[1] => world
)

Documentation

  • PHP: array_reduce - Manual
  • PHP: preg_match - Manual
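
For completeness, this is roughly what the anonymous-function variant mentioned in the note would look like. It is only a sketch of the same idea with the callback inlined; the parameter names $carry and $str and the type declarations are my own additions and assume PHP 7 or later.

```php
<?php
$haystack = array(
    'say hello',
    'hello stackoverflow',
    'hello world',
    'foo bar bas',
);

// Same logic as _matcher, passed inline as a closure.
$matches = array_reduce(
    $haystack,
    function (array $carry, string $str): array {
        if (preg_match('/^hello (\w+)/i', $str, $m)) {
            $carry[] = $m[1]; // keep only the first capture group
        }
        return $carry;
    },
    array() // initial value of $carry
);

print_r($matches);
```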

Using array_reduce seems tedious; isn't there another way?

Yes, and this one is actually cleaner, though it doesn't use any pre-existing array_* function, just preg_match in a plain loop.

Wrap it in a function if you are going to use this method more than once.

$matches = array ();

foreach ($haystack as $str) {
    if (preg_match ('/^hello (\w+)/i', $str, $m)) {
        $matches[] = $m[1];
    }
}

Documentation

  • PHP: preg_match - Manual
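
If you do want to wrap the loop in a function, it could look something like the sketch below. The name collect_first_group is hypothetical, and the isset check is an extra guard I added for patterns where the first group is optional.

```php
<?php
// Hypothetical helper: returns the first capture group of every
// element in $haystack that matches $pattern.
function collect_first_group(string $pattern, array $haystack): array
{
    $matches = array();

    foreach ($haystack as $str) {
        // isset() guards against patterns whose first group did not participate.
        if (preg_match($pattern, $str, $m) && isset($m[1])) {
            $matches[] = $m[1];
        }
    }

    return $matches;
}

$haystack = array(
    'say hello',
    'hello stackoverflow',
    'hello world',
    'foo bar bas',
);

print_r(collect_first_group('/^hello (\w+)/i', $haystack));
```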

preg_match array items in string?

How about this:

$badWords = array('one', 'two', 'three');
$stringToCheck = 'some stringy thing';
// $stringToCheck = 'one stringy thing';

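$noBadWordsFound = true;
foreach ($badWords as $badWord) {
    // preg_quote() keeps regex metacharacters in a word from changing the pattern
    if (preg_match('/\b' . preg_quote($badWord, '/') . '\b/', $stringToCheck)) {
        $noBadWordsFound = false;
        break;
    }
}
if ($noBadWordsFound) { ... } else { ... }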
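
An alternative is to build one alternation out of all the words and call preg_match once. This is only a sketch: the function name contains_bad_word is made up for the example, and the empty-list guard is my own addition.

```php
<?php
$badWords = array('one', 'two', 'three');

// Hypothetical helper: true if $text contains any of $badWords as a whole word.
function contains_bad_word(string $text, array $badWords): bool
{
    if ($badWords === array()) {
        return false; // an empty alternation would match at any word boundary
    }

    // Escape each word so regex metacharacters are treated literally.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $badWords);

    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/';

    return preg_match($pattern, $text) === 1;
}

var_dump(contains_bad_word('some stringy thing', $badWords)); // bool(false)
var_dump(contains_bad_word('one stringy thing', $badWords));  // bool(true)
```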

How to find pattern matching and save to array with preg_match

Use preg_match instead of preg_replace

preg_match(
"/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i",
$input, $matches);

print_r($matches[0]);

Update:

preg_match returns only the first match.

To match every occurrence, use preg_match_all.
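
A short sketch of the preg_match_all version, using the same pattern. The $input string is invented for the example; the original question's input isn't shown here.

```php
<?php
// Hypothetical input containing two URLs.
$input = 'See http://example.com/a and www.example.org/b for details';

$pattern = '/\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i';

// preg_match_all returns the number of full matches found;
// $matches[0] holds every full match in order of appearance.
$count = preg_match_all($pattern, $input, $matches);

print_r($matches[0]);
```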

PHP using preg_match to match items in array with values that can or not contain accent characters

I suggest a solution based on removing any combining Unicode characters from both the filtered string and the forbidden words. It requires the intl extension (sudo apt install php7.4-intl && sudo phpenmod intl). First, it decomposes the Unicode string into base characters and combining modifiers; second, it removes all modifiers (\p{M}):

<?php
$string = 'los mamíferos corren libres y quieren acompanar a su madre';

$forbidden = ['mamiferos', 'acompañar'];

function strip (string $accented): string {
    $decomposed = Normalizer::normalize ($accented, Normalizer::FORM_D);
    return preg_replace ('/\p{M}/u', '', $decomposed);
}

function filter (string $string, array $words): bool {
    $regex = '/\b(?:' . implode ('|', $words) . ')/i';
    // preg_match returns 1 on a match, so compare explicitly to get a bool
    return preg_match (strip ($regex), strip ($string)) === 1;
}
echo ((filter ($string, $forbidden) ? 'match!' : 'nope...') . "\n");

By the way, I don't understand the meaning of {3,} in your regular expression, and I removed it from mine. If you think that it will match a string with three or more forbidden words, you are mistaken: the forbidden words will match only if they immediately follow each other.

Further reading: https://www.php.net/manual/en/class.normalizer.

preg_match word and check if it is present in array

The reason it's not working is that you do not break out of the loop when a match is found. If your second word matches, $flag is set to 1 but the loop continues; if the next word does not match, $flag is reset to 0 and "Word not matched" is printed.

Change the line $flag = 1 to these two:

$flag = 1;
break;

And it will work.
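
Since the question's own code isn't reproduced here, the following is only a guess at the overall shape of the fixed loop. The $words and $input values are invented for illustration, and initializing $flag once before the loop (rather than resetting it in an else branch) is my own choice.

```php
<?php
// Hypothetical data; the question's real arrays are not shown.
$words = array('apple', 'banana', 'cherry');
$input = 'I like banana bread';

$flag = 0; // set once, before the loop
foreach ($words as $word) {
    if (preg_match('/\b' . preg_quote($word, '/') . '\b/i', $input)) {
        $flag = 1;
        break; // first match decides the result; skip the remaining words
    }
}

echo $flag === 1 ? "Word matched\n" : "Word not matched\n";
```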

PHP Preg_match each word in a string to find matches with all the items in an array that contains forbidden words

The key is to use the \b word-boundary assertion:

<?php
$forbidden = ['pool', 'cat', 'rain'];

// Examples
$examples = [
    // pool:
    'the pool is cold', //should be TRUE - working fine
    'the poolside is yellow', //should be TRUE - working fine
    'the carpool lane is closed', //should be FALSE - currently failing
    'the print spooler is not working', //should be FALSE - currently failing

    // cat:
    'the cats are wasting my time', //should be TRUE - working fine
    'the cat is wasting my time', //should be TRUE - working fine
    'joe is using the bobcat right now', //should be FALSE - currently failing
];

$pattern = '/\b(' . implode ('|', $forbidden) . ')/i';

foreach ($examples as $example) {
    echo ((preg_match ($pattern, $example) ? 'TRUE' : 'FALSE') . ': ' . $example . "\n");
}

http://sandbox.onlinephpfunctions.com/code/f424e6c78d3b13905486f646667c8bc9d48eda3a

How do you perform a preg_match where the pattern is an array, in php?

First of all, if you literally are only doing dozens every minute, then I wouldn't worry terribly about the performance in this case. These matches are pretty quick, and I don't think you're going to have a performance problem by iterating through your patterns array and calling preg_match separately like this:

$matches = false;
foreach ($pattern_array as $pattern)
{
    if (preg_match($pattern, $page))
    {
        $matches = true;
    }
}

You can indeed combine all the patterns into one using the or operator like some people are suggesting, but don't just slap them together with a |. This will break badly if any of your patterns contain the or operator.

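I would recommend at least grouping your patterns using parentheses like:

$grouped_patterns = array();
foreach ($patterns as $pattern)
{
    $grouped_patterns[] = "(" . $pattern . ")";
}
// separator first: the reversed argument order is no longer accepted as of PHP 8.0
$master_pattern = implode("|", $grouped_patterns);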

But... I'm not really sure if this ends up being faster. Something has to loop through them, whether it's the preg_match or PHP. If I had to guess I'd guess that individual matches would be close to as fast and easier to read and maintain.
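
If you do want to try the combined approach, here is a rough sketch. It assumes the individual patterns are stored without delimiters or flags and contain no unescaped "/", which may not be true of your array; the sample patterns and page text are made up. It uses non-capturing groups, (?:...), so the grouping doesn't add capture groups.

```php
<?php
// Hypothetical patterns, stored WITHOUT delimiters or flags.
$patterns = array('site is closed', 'error \d{3}', 'foo|bar');

$grouped_patterns = array();
foreach ($patterns as $pattern) {
    // Non-capturing group keeps each pattern self-contained.
    $grouped_patterns[] = '(?:' . $pattern . ')';
}

// Add the delimiters and flags once, around the whole alternation.
$master_pattern = '/' . implode('|', $grouped_patterns) . '/i';

$page = 'Sorry, this Site is Closed for maintenance.';
var_dump(preg_match($master_pattern, $page) === 1); // bool(true)
```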

Lastly, if performance is what you're looking for here, I think the most important thing to do is pull out the non regex matches into a simple "string contains" check. I would imagine that some of your checks must be simple string checks like looking to see if "This Site is Closed" is on the page.

So doing this:

foreach ($strings_to_match as $string_to_match)
{
    if (strpos($page, $string_to_match) !== false)
    {
        // etc.
        break;
    }
}
foreach ($pattern_array as $pattern)
{
    if (preg_match($pattern, $page))
    {
        // etc.
        break;
    }
}

and avoiding as many preg_match() as possible is probably going to be your best gain. strpos() is a lot faster than preg_match().
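
The plain-string part could be pulled into a small helper along these lines. The name contains_any is hypothetical, and skipping empty needles is my own addition. Keep in mind that strpos is case-sensitive; stripos is the case-insensitive counterpart.

```php
<?php
// Hypothetical helper: true if $page contains any of the literal strings.
function contains_any(string $page, array $needles): bool
{
    foreach ($needles as $needle) {
        // Skip empty needles; compare with !== false because a match
        // at position 0 is a valid (falsy) return value.
        if ($needle !== '' && strpos($page, $needle) !== false) {
            return true;
        }
    }

    return false;
}

$page = 'Notice: This Site is Closed';
var_dump(contains_any($page, array('Not Found', 'This Site is Closed'))); // bool(true)
```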


