PHP Using Regex to Get Substring of a String

Get integer value from malformed query string

$matches = array();
preg_match('/id=([0-9]+)\?/', $url, $matches);

This stays safe if the format changes. slandau's answer won't work if the URL ever contains any other numbers.

php.net/preg-match
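
As a minimal sketch of that call in context, assume a hypothetical malformed URL in which a second ? appears where & was intended (the URL and the $id variable are illustrative and not taken from the question):

```php
<?php
// Hypothetical malformed URL: a second "?" follows the id value.
$url = 'http://example.com/page.php?id=4711?ref=20';

$matches = array();
if (preg_match('/id=([0-9]+)\?/', $url, $matches)) {
    // Capture group 1 holds only the digits between "id=" and "?".
    $id = (int) $matches[1];
    echo $id, "\n";
} else {
    // The pattern did not match, so there is no id to report.
    $id = null;
}
```

Checking the return value of preg_match() before reading $matches[1] avoids an undefined-index notice when the URL has a different shape.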

How can I extract string using Php regular expression

If you are looking for the name between <p> tags and double opening and closing curly braces {{ }}, you could also do it like this:

<p>{{\K.+?(?=}}<\/p>)

Explanation

  • Match <p>{{
  • Reset the starting point of the reported match \K
  • Match any character one or more times, as few times as possible (this will be the value you are looking for)
  • A positive lookahead (?=}}<\/p>) which asserts that what follows is }}</p>

You can use preg_match_all to find all of the matches, or use preg_match to return the first match.

Output

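Your code could look like:

$subject = "<p>{{name}}</p>";
$pattern = "/<p>{{\K.+?(?=}}<\/p>)/";

$success = preg_match($pattern, $subject, $match);
if ($success) {
    // \K already drops "<p>{{" from the match, so no substr() is needed
    $str = $match[0];
    echo $str;
} else {
    echo 'not match';
}

Note that keeping substr($match[0], 5, -2) here would leave $str as false (an empty string as of PHP 8.0), because $match[0] is already just "name".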
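
To illustrate the preg_match_all variant mentioned above, here is a small sketch. The two-placeholder input string is an assumption made up for the example:

```php
<?php
// Hypothetical input containing two placeholders.
$subject = "<p>{{name}}</p><p>{{city}}</p>";
$pattern = "/<p>{{\K.+?(?=}}<\/p>)/";

// preg_match_all returns the number of full matches found.
$count = preg_match_all($pattern, $subject, $matches);

// $matches[0] lists the full matches, which \K has trimmed to the inner text.
print_r($matches[0]);
```

Because of \K and the lookahead, no capture group is needed; the values sit directly in $matches[0].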

PHP / Regex: extract string from string

Using regular expression:

preg_match('/myCountry\:\s*([^\;]+)/', $mainString, $out);
$myRegion = $out[1];
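
A short sketch of how this might be used, assuming a "key: value;" layout for $mainString (the sample text is hypothetical, since the original input is not shown here):

```php
<?php
// Hypothetical input; the "key: value;" layout is an assumption.
$mainString = 'myName: Anna; myCountry: France; myCity: Lyon;';

if (preg_match('/myCountry:\s*([^;]+)/', $mainString, $out)) {
    // Group 1 runs up to, but does not include, the next semicolon.
    $myRegion = trim($out[1]);
} else {
    $myRegion = null; // key not present
}
var_dump($myRegion);
```

The colon and semicolon do not need escaping in PCRE, so the backslashes from the pattern above are dropped in this sketch; both forms behave the same.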

regular expression to get sub string via php

substr() extracts by position and length; it does not match patterns. You are looking for preg_match().

Update:

$name = 'hello [*kitty*],how good is today';
preg_match( '/\[(.*?)\]/', $name, $match );
var_dump( $match );

You can find the name in $match[1]. I suggest you read up on regular expressions to understand preg_match().
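
For reference, a sketch of what ends up in the match array. The second pattern is an assumed variant (not part of the answer) that also consumes the asterisks, so only the bare word is captured:

```php
<?php
$name = 'hello [*kitty*],how good is today';

// $match[0] is the whole match, $match[1] the first capture group.
preg_match('/\[(.*?)\]/', $name, $match);

// Assumed variant: match the asterisks outside the group to capture the bare word.
preg_match('/\[\*(.*?)\*\]/', $name, $bare);

var_dump($match[0], $match[1], $bare[1]);
```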

php regex find substring in substring

Sounds like you need a string intersection. If you don't mind a non-regex idea, have a look at the PHP section of Wikibooks' Algorithm Implementation/Strings/Longest common substring.

foreach (["maxs", "Mymaxmuis", "Lemu", "muster"] as $str) {
    echo get_longest_common_subsequence($str, "maxmuster") . "\n";
}

max

maxmu

mu

muster

See this PHP demo at tio.run (caseless comparison).


If you need a regex idea, I would join both strings with space and use a pattern like this demo.

(?=(\w+)(?=\w* \w*?\1))\w

At each position before a word character in the first string, it captures inside a lookahead the longest substring that also occurs in the second string. In PHP, the matches of the first group then need to be sorted by length, and the longest one is returned. See the PHP demo at tio.run.

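function get_longest_common_subsequence($w1 = "", $w2 = "")
{
    // Join both words with a space; the pattern relies on that separator.
    // (preg_quote() is meant for patterns, not subjects, so it is not applied here.)
    $test_str = $w1 . " " . $w2;

    if (preg_match_all('/(?=(\w+)(?=\w* \w*?\1))\w/i', $test_str, $out) > 0) {
        // Sort the captured candidates by length, longest first.
        usort($out[1], function ($a, $b) { return strlen($b) - strlen($a); });
        return $out[1][0];
    }

    return ""; // no common substring found
}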
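
For comparison, the non-regex idea can be sketched with the classic dynamic-programming approach. This is a minimal, case-sensitive, byte-based version written for this page; the function name longest_common_substring is hypothetical and the code is not the Wikibooks implementation:

```php
<?php
// Hypothetical helper: O(n*m) dynamic programming over two byte strings.
function longest_common_substring(string $a, string $b): string
{
    $best = 0;     // length of the longest common run found so far
    $bestEnd = 0;  // end offset (exclusive) of that run in $a
    $prev = array_fill(0, strlen($b) + 1, 0);

    for ($i = 1; $i <= strlen($a); $i++) {
        $curr = array_fill(0, strlen($b) + 1, 0);
        for ($j = 1; $j <= strlen($b); $j++) {
            if ($a[$i - 1] === $b[$j - 1]) {
                // Extend the common run that ended at the previous characters.
                $curr[$j] = $prev[$j - 1] + 1;
                if ($curr[$j] > $best) {
                    $best = $curr[$j];
                    $bestEnd = $i;
                }
            }
        }
        $prev = $curr;
    }

    return substr($a, $bestEnd - $best, $best);
}

echo longest_common_substring('Mymaxmuis', 'maxmuster'), "\n";
```

Unlike the regex version, this sketch does not depend on a separator character and works for inputs that contain spaces or non-word characters.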

Regex to match a string that contains substrings separated with dots

Try this:

^[^.\s]\S*\.\S*[^.\s]$

Explanation:

^[^.\s]   Start with any character that is neither a dot nor whitespace
\S*       Any number of non-whitespace characters
\.        Require at least one dot
\S*       Any number of non-whitespace characters
[^.\s]$   End with any character that is neither a dot nor whitespace

Demo here.
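
A quick way to check the pattern is to run it over a few inputs. The sample strings below are assumptions chosen for illustration:

```php
<?php
$pattern = '/^[^.\s]\S*\.\S*[^.\s]$/';

// Hypothetical sample inputs.
$samples = array('foo.bar', 'a.b.c', 'a..b', '.foo.bar', 'foo.bar.', 'foobar', 'foo .bar');

$results = array();
foreach ($samples as $s) {
    // preg_match returns 1 on a match and 0 otherwise.
    $results[$s] = preg_match($pattern, $s) === 1;
}
var_dump($results);
```

Note that \S also matches dots, so an input with consecutive dots such as a..b is accepted; tighten the pattern if that should be rejected.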

Find a substring inside a string with special characters PHP

This will work if idIwant has only numbers.

$string = '{},\"employees\":{},\"idIwant\":{\"2545\":{\"attributes\":{\"offset\":9855,';

preg_match('/idIwant.*?(\d+)/', $string, $matches);

echo $matches[1];


PHP Using RegEx(preg match) to get substring of a string

Here is an expression for ya:

(?<=25k8cp1gl6-).*?(?=_(?:SVD|MNA|IMC))

Explanation:

(?<=...) is syntax for a lookbehind, meaning we start by finding (but not including in our match) "25k8cp1gl6-". Then we lazily match the part we want with .*?. Finally, (?=...) is lookahead syntax. We look for "_" followed by "SVD", "MNA", or "IMC" (separated with | in the non-capturing group (?:...)).

PHP:

$strings = array(
'25k8cp1gl6-Mein Herze im Blut, BWV 199: Recitative: Ich Wunden_SVD1329578_14691639_unified :CPN_trans:',
'25k8cp1gl6-La Puesta Del Sol_SVD1133599_12537702_unified :CPN_trans:',
'25k8cp1gl6-La Puesta Del Sol_MNA1133599_12537702_unified :CPN_trans:',
'25k8cp1gl6-La Puesta Del Sol_IMC1133599_12537702_unified :CPN_trans:',
);

foreach ($strings as $string) {
    if (preg_match('/(?<=25k8cp1gl6-).*?(?=_(?:SVD|MNA|IMC))/', $string, $matches)) {
        $substring = reset($matches);
        var_dump($substring);
    }
}

Another option, which would use preg_replace(), is demoed here:

^\w+-(.*?)_(?:SVD|MNA|IMC).*

Explanation:

This one matches the entire string, but captures the part we want to keep so that we can reference it in our replacement. Also note that I began with ^\w+- instead of 25k8cp1gl6-. This pretty much just looks for any number of "word characters" ([A-Za-z0-9_]) followed by a hyphen at the beginning of the string. If it needs to be "25k8cp1gl6-", you can replace this; I just wanted to show another option.

PHP:

$strings = array(
'25k8cp1gl6-Mein Herze im Blut, BWV 199: Recitative: Ich Wunden_SVD1329578_14691639_unified :CPN_trans:',
'25k8cp1gl6-La Puesta Del Sol_SVD1133599_12537702_unified :CPN_trans:',
'25k8cp1gl6-La Puesta Del Sol_MNA1133599_12537702_unified :CPN_trans:',
'25k8cp1gl6-La Puesta Del Sol_IMC1133599_12537702_unified :CPN_trans:',
);

foreach ($strings as $string) {
    $substring = preg_replace('/^\w+-(.*?)_(?:SVD|MNA|IMC).*/', '$1', $string);
    var_dump($substring);
}
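
One caveat with the preg_replace() approach: when the pattern does not match, the input comes back unchanged. If that matters, the same capture can be read with preg_match() instead. A sketch, in which the second input string and the $titles array are hypothetical:

```php
<?php
// Hypothetical inputs: the second one lacks the _SVD/_MNA/_IMC marker.
$strings = array(
    '25k8cp1gl6-La Puesta Del Sol_SVD1133599_12537702_unified :CPN_trans:',
    '25k8cp1gl6-No Marker Here_XYZ1133599',
);

$titles = array();
foreach ($strings as $string) {
    if (preg_match('/^\w+-(.*?)_(?:SVD|MNA|IMC)/', $string, $m)) {
        $titles[] = $m[1]; // captured title
    } else {
        $titles[] = null;  // no match; preg_replace() would have returned $string as-is
    }
}
var_dump($titles);
```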

PHP Regex to find a substring from a big string - Matching start and end

Try

 preg_match_all('/(Series.+?information)/', $str, $matches );

See:

https://regex101.com/r/oJ0jZ4/1

As I said in the comments, remove the literal dot (\.) and the start and end anchors. I would also use a non-greedy quantifier that requires at least one character: .+?

Otherwise, with a quantifier that allows zero characters, you could match this:

Seriesinformation

If the casing of Series or information may vary, such as

Series .... Information

add the /i flag, as in

     preg_match_all('/(Series.+?information)/i', $str, $matches );

The outer capture group isn't really needed, but I think it looks nicer with it in there. If you just want the variable content without Series or information, move the capture group ( ) to that part.

 preg_match_all('/Series(.+?)information/i', $str, $matches );

Note that you'll want to trim() the match, because it will likely have spaces at the beginning and end. Alternatively, add the spaces to the regex like this:

 preg_match_all('/Series\s(.+?)\sinformation/i', $str, $matches );

But that will no longer match "Series information" with only a single space in between.

If you want to be sure you don't match across an intermediate information, as in

[Series Hell In Heaven information Series Hell In Heaven information]

then, rather than matching all of that, you can use a positive lookbehind:

preg_match_all('/(Series.+?(?<=information))/i', $str, $matches );
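
To see how the quantifier choice plays out on that sample, here is a small comparison sketch. It contrasts the lazy capture used earlier with an assumed greedy variant, which is not recommended in the answer and is shown only for contrast:

```php
<?php
$str = '[Series Hell In Heaven information Series Hell In Heaven information]';

// Lazy: stops at the first "information" after each "Series".
$lazyCount = preg_match_all('/Series(.+?)information/i', $str, $lazy);

// Greedy (for contrast): runs on to the last "information" in the string.
$greedyCount = preg_match_all('/Series(.+)information/i', $str, $greedy);

print_r($lazy[1]);
print_r($greedy[1]);
```

The lazy form yields two separate captures, while the greedy form yields a single capture spanning both phrases.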

Conversely, if there is a possibility it will contain two information words

   <a href="http://example.com/123">
Series information is power information
</a>

You can do this

    preg_match_all('/(Series[^<]+)</i', $str, $matches );

This will match up to the < of </a>.

As a side note, you could use the phpQuery library (which is a DOM parser) and look for an a tag that contains those words.

https://github.com/punkave/phpQuery

And

https://code.google.com/archive/p/phpquery/wikis/Manual.wiki

Using something like

  $tags = $doc->find("a:contains('Series')")->text();

This is an excellent library for parsing HTML.
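
If adding a library is not an option, a similar lookup can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath). This swaps in a different technique from phpQuery, and the $html sample and $texts array are assumptions for illustration:

```php
<?php
// Sketch using PHP's built-in DOM extension instead of phpQuery.
$html = '<a href="http://example.com/123">Series information is power information</a>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
$texts = array();
// XPath contains() is case-sensitive, unlike the /i regex flag.
foreach ($xpath->query('//a[contains(., "Series")]') as $a) {
    $texts[] = trim($a->textContent);
}
print_r($texts);
```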

How to extract all matches from a string using single regular expression in PHP?

You may use the following regex solution:

$txt="calculated F 15 513 153135 155 125 156 155";
preg_match_all("/(?:\G(?!\A)|calculated(?:\s+F)?)\s*\K[\w.]+/",$txt,$matches);
print_r($matches[0]);

See the regex demo.

Also, see the PHP demo.

Note that it is basically your regex with a custom \G-based boundary added, so that consecutive tokens are matched only after a specific pattern. Note also that your [\d\w_\.] is the same as [\w.], since \w already matches what \d and _ match.

Pattern details:

  • (?:\G(?!\A)|calculated(?:\s+F)?) - either the end of the previous match (\G(?!\A): \G by itself matches the start of the string or the end of the previous match, so (?!\A) excludes the start-of-string position), or calculated, optionally followed by one or more whitespaces and F (matched with the calculated(?:\s+F)? branch)
  • \s* - zero or more whitespaces
  • \K - match reset operator
  • [\w.]+ - 1 or more digits, letters, _ or . characters.
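
The effect of the \G anchor is easiest to see on an input where the chain is interrupted. In this sketch the input string is hypothetical: the numbers before calculated and the number after the comma are expected to be skipped.

```php
<?php
// Hypothetical input: "7 8" precede the keyword, and "," breaks the chain before "99".
$txt = "7 8 calculated 15 513, 99";

preg_match_all('/(?:\G(?!\A)|calculated(?:\s+F)?)\s*\K[\w.]+/', $txt, $matches);

// Only tokens that directly follow "calculated" (or a previous match) are collected.
print_r($matches[0]);
```

Since a comma is neither whitespace nor part of [\w.], no match can start right after 513, and \G cannot match anywhere else, so 99 is left out.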

