Getting Node's Text in PHP DOM

As long as you are free to modify the DOM, you can remove the span and then read the remaining text from the div's nodeValue.

$span = $div->getElementsByTagName('span')->item(0);
$div->removeChild($span);

$nodeValue = $div->nodeValue;

Alternatively, just access the text node of $div.

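foreach ($div->childNodes as $node) {
    // Skip everything that is not a text node (elements, comments, ...)
    if ($node->nodeType != XML_TEXT_NODE) {
        continue;
    }
    // Store the text itself, not the DOMText object
    $nodeValue = $node->nodeValue;
}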

If you end up with more text nodes and only want the first, you can break after the first assignment of $nodeValue.
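
The loop with an early break can be tried on a small document. This is a minimal sketch: the markup and the variable name $ownText are assumptions made up for illustration, not taken from the original question.

```php
<?php
// Hypothetical markup: a <div> with its own text followed by a <span>.
$doc = new DOMDocument();
$doc->loadXML('<div>Hello <span>World</span></div>');
$div = $doc->documentElement;

// Collect only the first direct text node of the <div>.
$ownText = null;
foreach ($div->childNodes as $node) {
    if ($node->nodeType === XML_TEXT_NODE) {
        $ownText = $node->nodeValue;
        break; // stop after the first text node
    }
}

var_dump($ownText);        // the <div>'s own text only
var_dump($div->nodeValue); // also includes the text inside <span>
```

Reading nodeValue on the element concatenates the text of all descendants, which is why the loop is needed when only the element's own text is wanted.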

PHP DOM get children excluding text nodes

For the example given in the question, you can simply set the preserveWhiteSpace flag to false before loading the document. This prevents whitespace (new lines, tabs, spaces, etc.) between elements from creating extra text nodes.

$doc->preserveWhiteSpace = false;
$doc->load($file_path);
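
To see the effect, the same input can be loaded with and without the flag. This is only a sketch: the XML string and the variable names $keep and $strip are assumptions for illustration, and loadXML() is used instead of load() so that no file is needed.

```php
<?php
// Hypothetical input: two child elements separated by newlines and indentation.
$xml = "<root>\n  <a/>\n  <b/>\n</root>";

// Default behaviour: whitespace between elements becomes text nodes.
$keep = new DOMDocument();
$keep->loadXML($xml);

// With the flag cleared before loading, those text nodes are dropped.
$strip = new DOMDocument();
$strip->preserveWhiteSpace = false;
$strip->loadXML($xml);

var_dump($keep->documentElement->childNodes->length);
var_dump($strip->documentElement->childNodes->length);
```

The flag has to be set before the document is parsed; changing it afterwards does not remove text nodes that already exist.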

How to add text node in new created element with PHP DOMDocument

You can set the content using the textContent property of DOMElement.

$newElement = $finalDom->createElement("script");
$newElement->setAttribute("src", "https://stackoverflow.com/");

$newElement->textContent = 'alert("ok");';
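
If an explicit text node is preferred, DOMDocument::createTextNode() combined with appendChild() should give the same result. The following is a minimal sketch that reuses the variable names from the snippet above; appending the element to the document root is an assumption made only so the example is self-contained.

```php
<?php
$finalDom = new DOMDocument();
$newElement = $finalDom->createElement('script');

// Build the text node explicitly and append it to the new element.
$text = $finalDom->createTextNode('alert("ok");');
$newElement->appendChild($text);
$finalDom->appendChild($newElement);

// Serialize just the new element to inspect the result.
echo $finalDom->saveXML($newElement);
```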

PHP - DOM - get text with tags

You can use the following code using XPath:

$string = <<<EOF
<div id='1' data-AAA='something1' data-BBB='something2'><em>My</em></div>
<div id='5' data-AAA='something5' data-BBB='something6'><span style='color:red;'>Web</span></div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($string);

$selector = new DOMXPath($doc);

// Select the parent elements of text nodes somewhere
// in div elements
foreach ($selector->query('//div//text()/..') as $node) {
    var_dump($doc->saveHTML($node));
}

Output:

string(11) "<em>My</em>"
string(35) "<span style="color:red;">Web</span>"

Fetching value of specific text node using DOMXPath

You should provide the actual snippet, not just a screenshot of it. If I interpreted the screenshot correctly, the snippet is something like:

$xml = <<<'XML'
<body>
<div class="cat_price">
<div class="was">67,000 - PKR</div>
"
64,9999"<span> - PKR</span>
</div>
</body>
XML;

The text node with the price is the following sibling of the div with the class "was", so it can be fetched using the following-sibling axis:

$document = new DOMDocument();
$document->loadXML($xml);
$xpath = new DOMXPath($document);

$expression = 'string(//div[@class="cat_price"]
/div[@class="was"]/following-sibling::text()[1])';

var_dump($xpath->evaluate($expression));

Unlike DOMXPath::query(), DOMXPath::evaluate() can return scalar values, depending on the expression. A string cast or a string function such as string() makes it return a string.

string(25) "
"
64,9999""

However, the result contains not only the number but also the quotes and some whitespace. translate() and normalize-space() can be used to clean it up:

$expression = 'normalize-space(
translate(//div[@class="cat_price"]
/div[@class="was"]/following-sibling::text()[1], \'"\', " ")
)';

var_dump($xpath->evaluate($expression));

Output:

string(7) "64,9999"

DOM: fetch all text nodes in the document (PHP)

The XPath expression you need is //text(). Try using it with DOMXPath::query. For example:

$xpath = new DOMXPath($doc);
$textnodes = $xpath->query('//text()');
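
The returned DOMNodeList can then be iterated like any other node list. A minimal sketch, assuming a made-up sample document and a hypothetical $texts array to collect the values:

```php
<?php
// Hypothetical sample document with text at several nesting levels.
$doc = new DOMDocument();
$doc->loadXML('<p>one<b>two</b>three</p>');

$xpath = new DOMXPath($doc);

// Collect the value of every text node, in document order.
$texts = [];
foreach ($xpath->query('//text()') as $textNode) {
    $texts[] = $textNode->nodeValue;
}

print_r($texts);
```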

How can I get text only from the current node with DOMElement?

Just iterate over the child nodes of the <div> and concatenate all text nodes:

http://3v4l.org/fnTAF

$dom = new DOMDocument;
$dom->loadHTML(<<<HTML
<div>
<a>abc</a>
xyz
</div>
HTML
);
$div = $dom->getElementsByTagName("div")->item(0);
var_dump($div->childNodes->length); // just to debug
$txt = "";
// Concatenate only the direct text-node children of the <div>;
// iterating the node list directly also works when it is empty
foreach ($div->childNodes as $child) {
    if ($child->nodeType === XML_TEXT_NODE) {
        $txt .= $child->nodeValue;
    }
}
var_dump($txt);

XML_TEXT_NODE has the value 3, so nodeType == 3 means a text node. The corresponding nodeName is #text.


