Get Everything Between ≪Tag≫ and ≪/Tag≫ With PHP

How to extract text between HTML tags in PHP

You should use the DOMDocument class.

Example:


<?php
$html = "<p>hi</p>
<h1>H1 title</h1>
<h2>H2 title</h2>
<h3>H3 title</h3>";
// create a new DOM object
$dom = new DOMDocument('1.0', 'utf-8');
// discard whitespace-only text nodes (set this before loading)
$dom->preserveWhiteSpace = false;
// load the HTML into the object
$dom->loadHTML($html);
$hTwo = $dom->getElementsByTagName('h2'); // use your desired tag here
echo $hTwo->item(0)->nodeValue;
// prints "H2 title"
?>
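
The DOMDocument example reads only the first match, and item(0) returns null when the tag is absent. A minimal sketch of collecting every match instead, assuming a made-up $html string (the sample markup and the $titles variable are illustrative, not from the original answer):

```php
<?php
// Hypothetical sample markup, not taken from the original answer.
$html = '<div><h2>First</h2><p>one</p><h2>Second</h2></div>';

// Collect libxml warnings internally instead of printing them,
// since loadHTML() complains about imperfect markup.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// Gather the text of every <h2>; the loop simply runs zero times
// when the tag does not occur.
$titles = [];
foreach ($dom->getElementsByTagName('h2') as $node) {
    $titles[] = trim($node->textContent);
}

print_r($titles);
```

Looping over the node list avoids the null check that item(0) would otherwise need.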

You can also use a DOM parsing library such as Simple HTML DOM.

Example:

// Simple HTML DOM example (the library file must be included first)
// Create DOM from URL or file
$html = file_get_html('http://localhost/blah.php');

// Find all paragraphs
foreach ($html->find('p') as $element) {
    echo $element->innerText() . '<br>';
}

Get text between HTML tags

Regular expressions can work well for this problem, as long as the markup is simple and predictable.

When you look at http://php.net/preg_match you see that $matches will be an array: index 0 holds the text that matched the full pattern, and the following indexes hold the captured groups. Try

print_r($matches);

to get an idea of how the result looks, and then pick the right index.

EDIT:

If there is a match, you can get the text captured by the parenthesized group with

print($matches[1]);

If you had more than one parenthesized group, they would be numbered 2, 3, etc. You should also handle the case where there is no match, in which case the array will be empty.
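
To make the indexes concrete, here is a small sketch that checks the return value of preg_match before reading the captured group. The $source string, the pattern, and the $text variable are assumptions for illustration:

```php
<?php
// Hypothetical input; the pattern and variable names are illustrative.
$source = '<p>intro</p><h2>H2 title</h2>';

// One parenthesized group, so a successful match fills indexes 0 and 1.
if (preg_match('~<h2>(.*?)</h2>~s', $source, $matches) === 1) {
    print_r($matches);   // index 0: full match, index 1: captured text
    $text = $matches[1];
} else {
    $text = null;        // no match: $matches is an empty array
}
```

Testing the return value first means $matches[1] is never read when nothing matched.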

PHP: Regex replace everything between two strings/HTML tags

You're getting the error because you haven't escaped an opening [ character in your regular expression. See the [ marked below:

preg_replace('/\<p\>\[quote\]\<\/p\>[\s\S]+?\<p\>[\/quote\]\<\/p\>/', '', $string);
                                                 ^

This starts a character class that is never closed. You should simply escape this opening bracket like this:

preg_replace('/\<p\>\[quote\]\<\/p\>[\s\S]+?\<p\>\[\/quote\]\<\/p\>/', '', $string);
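
As a quick check of the corrected pattern, the sketch below runs it against a made-up input. The $string contents and the $pattern and $result names are assumptions for illustration:

```php
<?php
// Hypothetical sample input containing a quoted block.
$string = '<p>before</p><p>[quote]</p><p>quoted text</p><p>[/quote]</p><p>after</p>';

// Both opening brackets are escaped, so the pattern compiles.
$pattern = '/\<p\>\[quote\]\<\/p\>[\s\S]+?\<p\>\[\/quote\]\<\/p\>/';
$result = preg_replace($pattern, '', $string);

echo $result; // <p>before</p><p>after</p>
```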

PHP get string between tags

If you must use a regular expression, the following will do the trick.

$str = 'foo {Vimeo}123456789{/Vimeo} bar';
preg_match('~{Vimeo}([^{]*){/Vimeo}~i', $str, $match);
var_dump($match[1]); // string(9) "123456789"

This may be more than what you want to go through, but here is a way to avoid regex.

$str = 'foo {Vimeo}123456789{/Vimeo} bar';
$m = substr($str, strpos($str, '{Vimeo}')+7);
$m = substr($m, 0, strpos($m, '{/Vimeo}'));
var_dump($m); // string(9) "123456789"
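
The substr/strpos version above assumes both markers are present; if strpos returns false, the offsets silently go wrong. A possible wrapper that guards against that is sketched below. The function name stringBetween and its signature are hypothetical, and the nullable return type needs PHP 7.1 or later:

```php
<?php
// Hypothetical helper; the name and signature are illustrative only.
function stringBetween(string $str, string $open, string $close): ?string
{
    $start = strpos($str, $open);
    if ($start === false) {
        return null;            // opening marker not found
    }
    $start += strlen($open);    // skip past the opening marker
    $end = strpos($str, $close, $start);
    if ($end === false) {
        return null;            // closing marker not found
    }
    return substr($str, $start, $end - $start);
}

var_dump(stringBetween('foo {Vimeo}123456789{/Vimeo} bar', '{Vimeo}', '{/Vimeo}'));
// string(9) "123456789"
```

Returning null lets the caller tell a missing marker apart from an empty string between the markers.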

Regex select all text between tags

You can use "<pre>(.*?)</pre>" (replacing pre with whatever tag you want) and extract the first group. For more specific instructions, specify a language. Note that this assumes you have very simple and valid HTML.

As other commenters have suggested, if you're doing something complex, use an HTML parser.
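
Since the rest of this page uses PHP, here is one way the pattern could be applied there with preg_match_all. The $source string is a made-up input, and the "s" modifier is added so that "." also matches newlines inside the tag:

```php
<?php
// Hypothetical input with two <pre> blocks.
$source = "<pre>first</pre>\n<p>skip</p>\n<pre>second\nline</pre>";

// "s" lets "." match newlines; the lazy ".*?" stops at the nearest </pre>.
preg_match_all('~<pre>(.*?)</pre>~s', $source, $matches);

print_r($matches[1]); // captured text of every <pre> element
```

This still breaks on nested tags or on attributes such as <pre class="x">, which is where a parser becomes the safer choice.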

Preg match text in PHP between HTML tags

preg_match("'<p class=\"review\">(.*?)</p>'si", $source, $match);
if($match) echo "result=".$match[1];

Regex that extracts text between tags, but not the tags

You can use the following regex:

>([^<]*)<

or >[^<]*<

With the first pattern, the text is in capture group 1. With the second, strip the leading '>' and trailing '<' from each match.
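
A rough sketch of the capture-group variant in PHP follows, assuming a made-up $source string. Because [^<]* also matches the empty gap between adjacent tags, the sketch filters out empty results; the $texts variable is illustrative:

```php
<?php
// Hypothetical input with two adjacent elements.
$source = '<p>hello</p><b>world</b>';

// Group 1 holds whatever sits between a ">" and the next "<".
preg_match_all('~>([^<]*)<~', $source, $matches);

// Adjacent tags such as "</p><b>" yield an empty capture, so drop those.
$texts = array_values(array_filter($matches[1], function ($t) {
    return trim($t) !== '';
}));

print_r($texts);
```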
