Backreference Does Not Work in PHP

This happens because you are using a double-quoted string. Inside a double-quoted string, \1 is read as the octal notation of a character (the control character SOH, start of heading), not as an escaped 1.

So there are two ways to fix it:

Use a single-quoted string:

'/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\1\]/i'

Or escape the backslash so that the string contains a literal backslash (the escaping is for the PHP string, not for the pattern):

"/\[(b|i|u|s)\]\s*(.*?)\s*\[\/\\1\]/i"

As an aside, you can write your pattern like this:

$pattern = '~\[([bius])]\s*(.*?)\s*\[/\1]~i';

// with the \g{n} notation (Perl 5.10+ syntax, supported by PCRE)
$pattern = '~\[([bius])]\s*(.*?)\s*\[/\g{1}]~i';

// the same \g{} notation, but relative:
// (the second group to the left of the current position)
$pattern = '~\[([bius])]\s*(.*?)\s*\[/\g{-2}]~i';
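The three notations should behave identically. A minimal sketch to check this, assuming two made-up sample strings (the variable names are illustrative and not from the original question):

```php
<?php
// Hypothetical sample inputs, chosen only for illustration.
$matching    = '[b] hello [/b]';
$mismatching = '[b] hello [/i]';

$patterns = [
    'numbered' => '~\[([bius])]\s*(.*?)\s*\[/\1]~i',
    'absolute' => '~\[([bius])]\s*(.*?)\s*\[/\g{1}]~i',
    'relative' => '~\[([bius])]\s*(.*?)\s*\[/\g{-2}]~i',
];

$results = [];
foreach ($patterns as $name => $pattern) {
    $hit  = preg_match($pattern, $matching, $m);   // closing tag repeats group 1
    $miss = preg_match($pattern, $mismatching);    // closing tag differs from group 1
    $results[$name] = ['hit' => $hit, 'content' => $m[2] ?? null, 'miss' => $miss];
}
print_r($results);
```

Each entry reports a hit with the content hello for the first string and no match for the second, so the choice between the notations is mostly a matter of readability.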

Rewrite Rule not working back reference

RewriteCond %{QUERY_STRING} game=MyGame
RewriteRule ^games/freeGames?$ /flash/games/%1? [L,R=301]

In this code, the %1 backreference is always empty, since you have not defined a capturing group (i.e. a parenthesised subpattern) in the preceding RewriteCond directive. So the resulting URL is always /flash/games/ (your "homepage", I presume).

You need to define the capturing group by surrounding the string/regex in parentheses, for example:

RewriteCond %{QUERY_STRING} game=(MyGame)

Another question: is it possible to have an array in %{QUERY_STRING}? For example MyGame, OtherGame, Biliard, etc.?

If you need to match multiple items then you can use alternation in the captured group. For example:

RewriteCond %{QUERY_STRING} game=(MyGame|OtherGame|Biliard|etc.)

Now it will match "MyGame" or "OtherGame" or "Biliard" etc, and whatever matches is saved in the %1 backreference.
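The capture behaviour can be illustrated outside Apache with a rough PHP analogy. This is only a sketch: conditionBackref is a hypothetical helper, and it assumes that %1 behaves like the first captured group of the RewriteCond pattern, as described above.

```php
<?php
// Rough analogy only: this mimics how %1 is filled, it is not Apache configuration.
// conditionBackref() is a hypothetical helper name.
function conditionBackref(string $condPattern, string $queryString): ?string
{
    if (!preg_match('~' . $condPattern . '~', $queryString, $m)) {
        return null;        // the condition fails, so the rule would not run
    }
    return $m[1] ?? '';     // empty string when the pattern has no capturing group
}

$noGroup   = conditionBackref('game=MyGame', 'game=MyGame');
$oneGroup  = conditionBackref('game=(MyGame)', 'game=MyGame');
$alternate = conditionBackref('game=(MyGame|OtherGame|Biliard)', 'game=OtherGame');
$noMatch   = conditionBackref('game=(MyGame|OtherGame|Biliard)', 'game=Chess');

var_dump($noGroup, $oneGroup, $alternate, $noMatch);
```

Without parentheses the "backreference" is empty even though the condition matched, which mirrors the redirect to /flash/games/ described earlier.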

empty backreference causes match failure in PHP... is there a workaround?

Why don't you simply use \1 instead of \2?

preg_match_all("/\{(something(:else)?)\}(.*?)\{\/\\1\}/is", $data, $matches);

As for the "you need a parser" problem: you will need one to parse nested constructs.
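In PCRE, a backreference to a group that did not take part in the match fails, which is presumably why \2 breaks when the optional part is absent. The sketch below compares the two approaches; the sample data and the \2 variant of the pattern are assumptions, since the original question's code is not shown here.

```php
<?php
// Hypothetical input containing both tag variants.
$data = '{something}a{/something} {something:else}b{/something:else}';

// \1 refers to the whole tag name, which always takes part in the match.
$withGroup1 = '/\{(something(:else)?)\}(.*?)\{\/\1\}/is';
// Assumed shape of the failing pattern: \2 is unset for plain {something}.
$withGroup2 = '/\{(something(:else)?)\}(.*?)\{\/something\2\}/is';

$count1 = preg_match_all($withGroup1, $data, $m1);
$count2 = preg_match_all($withGroup2, $data, $m2);

print_r($m1[3]);   // contents found with \1
print_r($m2[3]);   // contents found with \2
```

With \1 both tags are found; with \2 only the {something:else} tag matches, because the unset group makes the closing tag of {something} impossible to match.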

How does this backreference condition in a PHP regex work?

You get 2 matches because you are using a capture group.

With the pattern that you tried, (dog\s*)?cat\s*(?(1)dog), you get a match for cat in dog cat.

This is because the pattern optionally matches dog. If dog is present, it is captured, and the engine then tries to match cat.

Then the if clause states: if group 1 is set, match dog. If group 1 did not take part in the match, the pattern can still match cat on its own, because capture group 1 is optional.

So in dog cat, the attempt that starts at dog eventually fails because the trailing dog is missing, but the attempt that starts at cat succeeds.
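The behaviour described above can be reproduced with a short script. This is a minimal sketch; the three test strings are chosen here for illustration.

```php
<?php
$pattern = '/(dog\s*)?cat\s*(?(1)dog)/';

$found = [];
foreach (['cat', 'dog cat dog', 'dog cat'] as $str) {
    preg_match($pattern, $str, $m);
    $found[$str] = $m[0] ?? null;   // full match, or null if nothing matched
}
print_r($found);
```

For dog cat the reported match is just cat: the attempt starting at dog fails on the condition, and the engine then retries from later positions where group 1 is not set.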


If you want to match all 3 words dog cat dog, or only a single cat, and you don't want to match dog cat, you might use

\b(?:dog cat dog|dog cat\b(*SKIP)(*F)|cat)\b
  • \b A word boundary to prevent a partial match
  • (?: Non capture group
    • dog cat dog Match literally
    • | Or
    • dog cat\b(*SKIP)(*F) In case of dog cat skip the match
    • | Or
    • cat Match only cat
  • ) Close non capture group
  • \b A word boundary

For example

$strings = [
    "cat",
    "dog cat dog",
    "dog cat",
    "cat dog",
    "this cat cat is a test dog cat dog cat"
];
$pattern = "/\b(?:dog cat dog|dog cat\b(*SKIP)(*F)|cat)\b/";
foreach ($strings as $str) {
    preg_match_all($pattern, $str, $matches);
    print_r($matches[0]);
}

Output

Array
(
    [0] => cat
)
Array
(
    [0] => dog cat dog
)
Array
(
)
Array
(
    [0] => cat
)
Array
(
    [0] => cat
    [1] => cat
    [2] => dog cat dog
    [3] => cat
)

An alternative approach using a capture group is to match what you want to avoid and capture what you want to keep. For matching spaces, you could use \s, but note that it could also match a newline.

\bdog cat\b(?! dog\b)|\b(dog cat dog|cat)\b
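With this approach the wanted values are in group 1, so matches where group 1 is empty have to be discarded. A minimal sketch of that filtering step, where keepCaptured is a hypothetical helper name:

```php
<?php
// keepCaptured() is a hypothetical helper: it returns only the group 1 captures.
function keepCaptured(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $sets, PREG_SET_ORDER);
    $kept = [];
    foreach ($sets as $set) {
        // Matches of the "avoid" branch leave group 1 missing or empty.
        if (isset($set[1]) && $set[1] !== '') {
            $kept[] = $set[1];
        }
    }
    return $kept;
}

$pattern = '/\bdog cat\b(?! dog\b)|\b(dog cat dog|cat)\b/';

print_r(keepCaptured($pattern, 'dog cat'));
print_r(keepCaptured($pattern, 'this cat cat is a test dog cat dog cat'));
```

The first call yields an empty array, since dog cat is consumed by the branch without a capture group; the second yields the same four values as the (*SKIP)(*F) version.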

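If your regex engine supports quantifiers inside a lookbehind assertion (PCRE, and therefore PHP's preg_* functions, does not accept an unbounded one such as dog *), you might also use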

\bdog cat dog\b|(?<!dog *)\bcat\b|cat(?= *dog\b)

regex, problem with backreference in pattern with preg_match_all

Make your regex ungreedy:

preg_match_all('/__\((\'|")([^\1]+)\1/U', "__('match this') . 'not this'", $matches);
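The effect of the U modifier can be seen by running the pattern with and without it. This sketch assumes the parenthesis after __ is meant literally and is therefore escaped as \(. Note also that inside a character class \1 is not a backreference but an octal escape, so [^\1] simply means "any character except byte 0x01"; it is the ungreedy quantifier that makes the match stop at the first closing quote.

```php
<?php
$subject = "__('match this') . 'not this'";

// Greedy: [^\1]+ runs on to the last quote in the subject.
preg_match_all('/__\((\'|")([^\1]+)\1/', $subject, $greedy);

// Ungreedy (U modifier): the quantifier stops at the first closing quote.
preg_match_all('/__\((\'|")([^\1]+)\1/U', $subject, $ungreedy);

var_dump($greedy[2][0], $ungreedy[2][0]);
```

A lazy quantifier such as (.+?) without the U modifier would be another way to get the same result.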

regex back-reference not working in PHP PCRE

You didn't show your PHP code, but I surmise you have your regex in double quotes. If so, the backreference \1 is actually converted into an ASCII control character before it reaches PCRE. (All \123-style sequences are interpreted as C-string octal escapes there.)
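Dumping the bytes of each string literal shows what PCRE actually receives. The following is a minimal sketch; the subject string and the pattern are made up for illustration, since the original code was not shown.

```php
<?php
// What each PHP string literal actually contains.
$doubleQuotedBytes = bin2hex("\1");    // 01   -> a single SOH byte
$singleQuotedBytes = bin2hex('\1');    // 5c31 -> backslash followed by "1"
$escapedBytes      = bin2hex("\\1");   // 5c31 -> backslash followed by "1"
$octalLetter       = "\123";           // octal 123 = decimal 83 = "S"

// Hypothetical subject and pattern, used only to show the effect on matching.
$subject    = 'abc-abc';
$singleHits = preg_match('/^(\w+)-\1$/', $subject);   // \1 is a backreference
$doubleHits = preg_match("/^(\w+)-\1$/", $subject);   // PCRE receives byte 0x01 instead

var_dump($doubleQuotedBytes, $singleQuotedBytes, $escapedBytes, $octalLetter);
var_dump($singleHits, $doubleHits);
```

The double-quoted pattern is still a valid regex, so no error is reported; it just silently stops matching.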

RegEx BackReference to Match Different Values

Note that \g{N} is equivalent to \N: it is a backreference that matches the same text that the corresponding capturing group matched, not the group's pattern. The syntax is more flexible, though, because a - before the number makes the reference relative to the current position (for example, with \g{-2}, the pattern (\p{L})(\d)\g{-2} matches a1a).

The PCRE engine allows subroutine calls that re-run a subpattern. To reuse the pattern of group 1, use (?1); to reuse the pattern of the named group Val, use (?&Val).
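The difference between repeating the matched text and repeating the pattern is easiest to see side by side. A minimal sketch with made-up test strings:

```php
<?php
// Relative backreference: \g{-2} points at (\p{L}) here.
$relativeHit  = preg_match('/^(\p{L})(\d)\g{-2}$/u', 'a1a');
$relativeMiss = preg_match('/^(\p{L})(\d)\g{-2}$/u', 'a1b');

// Backreference: the second number must be the same text as the first.
$backrefSame  = preg_match('/^(\d+)-\1$/', '12-12');
$backrefOther = preg_match('/^(\d+)-\1$/', '12-34');

// Subroutine call: the pattern \d+ is reused, so any number is accepted.
$subroutine = preg_match('/^(\d+)-(?1)$/', '12-34');
$namedCall  = preg_match('/^(?P<Val>\d+)-(?&Val)$/', '12-34');

var_dump($relativeHit, $relativeMiss, $backrefSame, $backrefOther, $subroutine, $namedCall);
```

Only the backreference variants reject 12-34 and a1b; the subroutine calls accept any text that fits the group's pattern.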

Also, you may use character classes to match single characters, and consider using ? quantifier to make parts of the regex optional:

(\(\s*(?P<Val>[a-zA-Z]+[0-9]*|[0-9]+|\'.*\'|\[.*\])\s*(ni|in|[*\/+-]|[=!><]=|[><])\s*((?&Val))\s*\))

Note that \'.*\' and \[.*\] can match too much; consider replacing them with \'[^\']*\' and \[[^][]*\].
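A short comparison shows how far the greedy versions can run. This is a sketch with made-up input strings, using the sub-patterns on their own rather than the full expression above:

```php
<?php
// Hypothetical inputs with two quoted strings and two bracketed lists.
$quoted   = "'a' in 'b'";
$brackets = '[1] in [2]';

preg_match("/'.*'/", $quoted, $greedyQuote);        // .* runs to the last quote
preg_match("/'[^']*'/", $quoted, $tightQuote);      // stops at the next quote

preg_match('/\[.*\]/', $brackets, $greedyList);     // .* runs to the last ]
preg_match('/\[[^][]*\]/', $brackets, $tightList);  // stops at the next ]

var_dump($greedyQuote[0], $tightQuote[0], $greedyList[0], $tightList[0]);
```

The negated character classes keep each operand confined to a single quoted string or a single bracket pair.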


