Extract Dom-Elements from String, in PHP

Get DOM element string using PHP

Use the DOMNode::C14N method to serialize the node to a canonical string, then use substr and strpos to cut out the fragment you need (here, the opening tag):

...
$el = $dom->getElementById("myelementID");
$elString = $el->C14N();

var_dump(substr($elString, 0, strpos($elString, '>') + 1));

The output (for your example):

string(51) "<div class="hello" data-foo="bar" id="myelementID">"

http://php.net/manual/ru/domnode.c14n.php
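Since the snippet above starts with an elided setup, here is a minimal self-contained sketch of the same idea. The sample markup is an assumption made up for illustration. Note that canonical form orders attributes alphabetically, and that the substr/strpos cut assumes no attribute value contains a '>' character.

```php
<?php
// Hypothetical sample markup; the attribute order is deliberately not alphabetical.
$html = '<div id="myelementID" data-foo="bar" class="hello"><p>content</p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$el = $dom->getElementById('myelementID');
if ($el === null) {
    exit("Element not found\n");
}

// C14N() serializes the element and its children in canonical form,
// which orders the attributes alphabetically (class, data-foo, id).
$elString = $el->C14N();

// Keep everything up to and including the first '>', i.e. the opening tag.
$openingTag = substr($elString, 0, strpos($elString, '>') + 1);

var_dump($openingTag);
```

If only the opening tag is needed and attribute values may contain '>', building the tag from $el->attributes would likely be more robust than string slicing.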

Extract DOM-elements from string, in PHP

You need to use the DOMDocument class and, more specifically, its loadHTML method to load your HTML string into a DOM object.

For example:

$string = <<<HTML
<p>test</p>
<div class="someclass">text</div>
<p>another</p>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($string);


After that, you'll be able to manipulate the DOM, using for instance the DOMXPath class to do XPath queries on it.

For example, in your case, you could use something based on this code:

$xpath = new DOMXPath($dom);
$result = $xpath->query('//div[@class="someclass"]');
if ($result->length > 0) {
    var_dump($result->item(0)->nodeValue);
}

Here, that gives the following output:

string 'text' (length=4)


As an alternative, instead of DOMDocument, you could also use simplexml_load_string and SimpleXMLElement::xpath -- but for complex manipulations, I generally prefer using DOMDocument.
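A minimal sketch of the SimpleXML alternative follows. It assumes the input is well-formed XML with a single root element, so the sample string is wrapped in a made-up root tag here; SimpleXML does not tolerate loose HTML the way loadHTML does.

```php
<?php
// The <root> wrapper is an assumption: simplexml_load_string needs one root element.
$xmlString = '<root><p>test</p><div class="someclass">text</div><p>another</p></root>';

$xml = simplexml_load_string($xmlString);
if ($xml === false) {
    exit("Could not parse the XML string\n");
}

// xpath() returns an array of SimpleXMLElement objects.
$matches = $xml->xpath('//div[@class="someclass"]');

$text = '';
if (!empty($matches)) {
    // Casting a SimpleXMLElement to string yields its text content.
    $text = (string) $matches[0];
}

var_dump($text);
```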

How to get specific html element from string using PHP

You can use the DOMDocument class (https://www.php.net/manual/en/class.domdocument.php) like this example:

$string = '<div id="pico"><figure id="attachment_84751" aria-describedby="caption-attachment-84751" style="width: 2048px" class="wp-caption aligncenter"><figcaption id="caption-attachment-84751" class="wp-caption-text">No access to a gym? No problem. Push ups and sits ups are the basis for a quick burner workout on Thanksgiving.</figcaption></figure><p>To help pre-empt some of that Thanksgiving feast guilt, why not get your blood flowing and your heart rate up with a quick workout before the holiday meal?</p><p>This five minute burner can be adjusted to any skill level.</p><p><strong>Here’s how it works:</strong></p><ul><li>Set a timer for five minutes. At the top of each minute, set out to do a set of push ups and sit ups. Beginners should shoot for five push ups and five sits ups per minute. Intermediate athletes should shoot for 10 of each and advanced athletes should shoot for 15 of each.</li><li>Once you’ve gotten through a round, rest until the next minute begins and start over.</li></ul><p>It’s just five minutes — and there will be rest in there — but you’ll definitely feel like you got a good workout in!</p><span style="display: inline-block; width: 2px; height: 2px;"></span></div>';

// create a new instance of the DOMDocument class
$dom = new DOMDocument();

// load the HTML from your variable (@ suppresses the warning libxml raises for the <figure> tag)
@$dom->loadHTML($string);

// get all <figure> elements
$figures = $dom->getElementsByTagName('figure');

// get the outer HTML of the first <figure>
$element = $dom->saveHTML($figures->item(0));

// $element now contains the HTML string of the first <figure>
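As an alternative to the @ operator, libxml can be told to collect parse warnings internally. The sketch below uses a shortened, made-up stand-in for the HTML string above; the variable names are illustrative only.

```php
<?php
// Shortened, hypothetical stand-in for the HTML string used above.
$string = '<div id="pico"><figure id="attachment_1"><figcaption>Caption</figcaption></figure><p>Text</p></div>';

// Collect libxml warnings internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($string);

// Discard the collected warnings and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// item(0) returns null when no <figure> exists, so guard before serializing.
$figure = $dom->getElementsByTagName('figure')->item(0);
$outerHtml = $figure !== null ? $dom->saveHTML($figure) : '';

echo $outerHtml, "\n";
```

Unlike @, this approach only silences libxml messages, so unrelated errors on the same line stay visible.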

How to get string for a DomElement?

Every DOMElement has an ownerDocument property that refers to its DOMDocument.

Hence you can fetch the XML of the DOMElement via:

$domElementXml = $domElement->ownerDocument->saveXML($domElement);

You have to pass the node as an argument because ownerDocument refers to the whole document. Calling $domElement->ownerDocument->saveXML() without an argument would return the XML of the entire document, which may contain other DOMElement objects as well.
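To make the difference concrete, here is a small sketch with a made-up XML document that serializes a single element and then the whole document.

```php
<?php
// Hypothetical sample document for illustration.
$doc = new DOMDocument();
$doc->loadXML('<catalog><book id="1"><title>First</title></book><book id="2"><title>Second</title></book></catalog>');

// Pick the second <book> element.
$domElement = $doc->getElementsByTagName('book')->item(1);

// Passing the node serializes only that element and its children.
$domElementXml = $domElement->ownerDocument->saveXML($domElement);
echo $domElementXml, "\n";

// Without an argument, the entire document is serialized,
// including the XML declaration and both <book> elements.
$wholeXml = $domElement->ownerDocument->saveXML();
echo $wholeXml;
```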

Converting a DOM element into a string in PHP

First, get the URL with the following steps:

  1. $page = icl_object_id(2880, 'page', true);
  2. $url = get_permalink($page);

Then split it with $parts = explode("/", $url) and take the last element of the resulting array, for example with array_pop($parts).
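The splitting step can be sketched without WordPress by using a hard-coded URL. The URL below is an assumption for illustration. One detail worth noting: if the URL ends with a slash, the last array element is an empty string, so this sketch trims trailing slashes first.

```php
<?php
// Hypothetical URL standing in for the value returned by get_permalink().
$url = 'https://example.com/en/sample-page/';

// Remove trailing slashes so the last segment is not an empty string.
$parts = explode('/', rtrim($url, '/'));

// array_pop() removes and returns the last element of the array.
$lastSegment = array_pop($parts);

var_dump($lastSegment);
```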

Get first level dom element with HTML code

You can try using DOMDocument to parse the HTML and get the tags you want.

Here is some code that does what you describe...

<?php

// Your HTML you provided
$html = <<<HTML
<h1>Indice</h1>
<p class="l3"><a href="#c1" class="ddb1a">I. El Censo</a></p>
<p class="l3"><a href="#c2" class="ddb1a">II. Leyes diversas</a></p>
<p class="l3">
<a href="#c3" class="ddb1a">III. Ofrenda de los Jefes y consagración de los levitas</a>
</p>
HTML;

// Create a DOM document
$dom = new DOMDocument ();

// Load the HTML
$dom->loadHTML ($html);

// Get the <body> tag
$bodys = $dom->getElementsByTagName ('body');
$body = $bodys->item (0);

// The HTML array you want
$html_array = array ();

// Run through each tag, and convert them to HTML strings
foreach ($body->childNodes as $child) {
    if ($child instanceof DOMElement) {
        $html_array[] = $dom->saveHTML ($child);
    }
}

// And lastly, display the array
print_r ($html_array);

PHP DOMDocument extract elements and create new document

Try this:

<?php

$host = 'example.com';

$stringBody = '<head>
<link rel="preload" href="/_next/list.js" as="script">
<!-- ... other link elements -->
<style data-styled="" data-styled-version="4.2.0"></style>
</head>';

$dom = new DOMDocument();
$dom->loadHTML($stringBody);
$xpath = new DOMXPath($dom);
$headItems = $xpath->query("//head/link[@rel='preload' or @rel='stylesheet'] | //head/style");

$links = [];

foreach ($headItems as $headNode) {
    if ($headNode->hasAttribute('href')) {
        $headNode->setAttribute('href', $host . $headNode->getAttribute('href'));
    }
    $links[] = $headNode->ownerDocument->saveHTML($headNode);
}

print_r($links);

Output

Array
(
[0] => <link rel="preload" href="example.com/_next/list.js" as="script">
[1] => <style data-styled="" data-styled-version="4.2.0"></style>
)
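If the extracted nodes should end up in a new document rather than an array of strings, one option is DOMDocument::importNode. The following is only a sketch under assumptions: the markup is a shortened stand-in for the head above, and the structure of the target document (a bare head element) is made up for illustration.

```php
<?php
// Shortened, hypothetical stand-in for the <head> markup above.
$source = new DOMDocument();
$source->loadHTML('<html><head><link rel="preload" href="/_next/list.js" as="script"><style data-styled=""></style></head><body></body></html>');

$xpath = new DOMXPath($source);
$nodes = $xpath->query("//head/link[@rel='preload' or @rel='stylesheet'] | //head/style");

// Build the new document with an empty <head> to receive the copies.
$target = new DOMDocument();
$head = $target->createElement('head');
$target->appendChild($head);

foreach ($nodes as $node) {
    // importNode() copies the node into the target document;
    // true requests a deep copy that includes child nodes.
    $head->appendChild($target->importNode($node, true));
}

echo $target->saveHTML();
```

A node belongs to exactly one document, which is why the copy from importNode is appended instead of the original node.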

