PHP HTML DOMDocument getElementById problems

The Manual explains why:

For this function to work, you will need either to set some ID attributes with DOMElement->setIdAttribute() or a DTD which defines an attribute to be of type ID. In the latter case, you will need to validate your document with DOMDocument->validate() or DOMDocument->validateOnParse before using this function.

By all means, go for valid HTML & provide a DTD.

Quick fixes:

  1. Call $dom->validate(); and put up with the errors (or fix them). Afterwards $dom->getElementById() works, even if validation reported errors.
  2. Use XPath if you don't feel like validating: $x = new DOMXPath($dom); $el = $x->query("//*[@id='bid']")->item(0);
  3. Come to think of it: setting validateOnParse to true before loading the HTML also works ;P

$dom = new DOMDocument();
$html = '<html>
<body>Hello <b id="bid">World</b>.</body>
</html>';
$dom->validateOnParse = true; // set this first, before loading
$dom->loadHTML($html);        // because loading is what parses the document

$dom->preserveWhiteSpace = false; // no effect on the already-parsed document

$belement = $dom->getElementById("bid");
echo $belement->nodeValue;

Outputs 'World' here.
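
Loading real-world HTML with validateOnParse switched on tends to produce a stream of libxml warnings. Here is a minimal sketch of keeping them out of the output, assuming the same markup as above. libxml_use_internal_errors(), libxml_get_errors() and libxml_clear_errors() are standard PHP functions; the variable names are only illustrative.

```php
<?php
$html = '<html>
<body>Hello <b id="bid">World</b>.</body>
</html>';

// Collect libxml errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->validateOnParse = true; // must be set before loading
$dom->loadHTML($html);

// Keep the collected errors for inspection, then clear the buffer
// and restore the previous error-handling mode.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$element = $dom->getElementById('bid');
if ($element !== null) {
    echo $element->nodeValue, "\n";
} else {
    echo "No element was registered under that id\n";
}
```

Checking for null before touching nodeValue avoids a fatal error if the lookup fails for the reasons the manual describes.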

Alternatives to PHP getElementById() of DOMDocument?

Got it working with XPath, though I don't know how well it copes with invalid documents:

$xpath = new \DOMXPath($document);
$nodes = $xpath->query('//img[@id="banner"]');

// Return the content unchanged unless exactly one image has id="banner"
if (1 !== $nodes->length) {
    return $content;
}

// DOMElement of the banner
$banner = $nodes->item(0);

// Set the new src attribute
$banner->setAttribute('src', 'http://mysite.come/img/myimage.png');

// Serialize the whole modified document
return $document->saveXML();

DOM getElementbyId doesn't work properly

As noted in the comments on the doc page, you must declare a doctype for getElementById to perform as expected:

$t = <<<D
<!DOCTYPE html>
<form id="frm-send" method="post" action="index.php" >

...code continues ...

Per the documentation, getElementById needs a DTD to know which attribute of an element serves as its unique identifier. Declaring a doctype accomplishes this. You can also set the ID attribute explicitly, without providing a DTD, by using setIdAttribute.
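
To illustrate the setIdAttribute route, here is a minimal sketch on an XML document that has no DTD at all. The markup is a made-up fragment (only the frm-send id comes from the question), so treat it as an assumption rather than the asker's actual form.

```php
<?php
$doc = new DOMDocument();
// No DTD here, so "id" is just an ordinary attribute as far as the parser knows.
$doc->loadXML('<form id="frm-send"><input id="email"/></form>');

// Nothing has been registered as an ID yet, so the lookup finds nothing.
$before = $doc->getElementById('frm-send');

// Mark "id" as an ID attribute on every element that carries one.
foreach ($doc->getElementsByTagName('*') as $element) {
    if ($element->hasAttribute('id')) {
        $element->setIdAttribute('id', true);
    }
}

$after = $doc->getElementById('frm-send');

var_dump($before === null); // bool(true)
echo $after->tagName, "\n"; // form
```

The hasAttribute() check matters because setIdAttribute() throws a DOMException when the element does not have the named attribute.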

Documentation

  • http://www.php.net/manual/en/domdocument.getelementbyid.php
  • http://www.php.net/manual/en/domelement.setidattribute.php

DOMDocument getelementbyid conflict?

You can use a DOMXPath query instead of getElementById() to dodge the name attribute and target only the element with an id attribute of "favela":

$xpath = new DOMXPath($doc);
$favelaElement = $xpath->query('//*[@id="favela"]')->item(0);

print_r($favelaElement->nodeValue);

Output:

A shantytown or slum, especially in Brazil.
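
For reference, a self-contained sketch of the same idea. The markup is hypothetical (an anchor whose name collides with another element's id), and findById() is an illustrative helper, not part of the DOM extension.

```php
<?php
// Hypothetical markup: the anchor's name collides with the paragraph's id.
$html = '<html><body>
<a name="favela">Jump target</a>
<p id="favela">A shantytown or slum, especially in Brazil.</p>
</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

/**
 * Return the first element whose id attribute equals $id, or null.
 */
function findById(DOMDocument $doc, string $id): ?DOMElement
{
    // XPath 1.0 string literals have no escape syntax, so pick a quote
    // character that does not occur in the id.
    if (strpos($id, '"') === false) {
        $literal = '"' . $id . '"';
    } elseif (strpos($id, "'") === false) {
        $literal = "'" . $id . "'";
    } else {
        return null; // ids containing both quote types are not handled in this sketch
    }

    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//*[@id=' . $literal . ']')->item(0);

    return $node instanceof DOMElement ? $node : null;
}

$element = findById($doc, 'favela');
echo $element->tagName, ': ', $element->nodeValue, "\n";
// p: A shantytown or slum, especially in Brazil.
```

Because the query tests only @id, the anchor with name="favela" can never match.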

PHP DOMDocument getElementById nodeValue stripping HTML

This is the expected behavior: your tag is assigned as a string, and text in XML can't contain literal angle brackets (only tags can), so they get escaped. Build the markup as a DOMNode and append it instead:

$node = $mydoc->createElement("b");
$node->nodeValue = "test";
$mydoc->getElementById("whatever")->appendChild($node);

Update with working example:

$html = '<html>
<body id="myBody">
<b id="myBTag">my old element</b>
</body>
</html>';

$mydoc = new DOMDocument("1.0", "utf-8");
$mydoc->loadXML($html);

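// Register "id" as an ID attribute so that getElementById() works
$all_tags = $mydoc->documentElement->getElementsByTagName("*");
foreach ($all_tags as $element) {
    if ($element->hasAttribute("id")) { // setIdAttribute() throws if the attribute is missing
        $element->setIdAttribute("id", true);
    }
}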

$current_b_tag = $mydoc->getElementById("myBTag");
$new_b_tag = $mydoc->createElement("b");
$new_b_tag->nodeValue = "my new element";
$result = $mydoc->getElementById("myBody");
$result->replaceChild($new_b_tag, $current_b_tag);

echo $mydoc->saveXML($mydoc->documentElement);
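
If the markup already exists as a string, another option is to parse it into a document fragment rather than building each node by hand. The sketch below contrasts the two behaviors on a throwaway document; the element names are made up for illustration, and DOMDocumentFragment::appendXML() expects well-formed XML.

```php
<?php
$mydoc = new DOMDocument();
$mydoc->loadXML('<body><div/><div/></body>');

$divs = $mydoc->getElementsByTagName('div');
$first = $divs->item(0);
$second = $divs->item(1);

// Assigning markup to nodeValue stores it as text, so it is escaped on output.
$first->nodeValue = '<b>test</b>';
$asText = $mydoc->saveXML($first);

// Parsing the same string into a fragment yields real element nodes.
$fragment = $mydoc->createDocumentFragment();
$fragment->appendXML('<b>test</b>');
$second->appendChild($fragment);
$asNodes = $mydoc->saveXML($second);

echo $asText, "\n";  // <div>&lt;b&gt;test&lt;/b&gt;</div>
echo $asNodes, "\n"; // <div><b>test</b></div>
```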

DOMDocument::getElementById returns NULL

I think the manual explains why this may happen:

For this function to work, you will need either to set some ID attributes with DOMElement->setIdAttribute() or a DTD which defines an attribute to be of type ID. In the latter case, you will need to validate your document with DOMDocument->validate() or DOMDocument->validateOnParse before using this function.

Potential fixes:

  1. Call $dom->validate(); afterwards you can use $dom->getElementById(), even if validation reported errors.
  2. Use XPath if you don't feel like validating:

    $x = new DOMXPath($dom);
    $el = $x->query("//*[@id='title']")->item(0); // look for id="title"

Example of using a custom DTD:

$dtd = '<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>';

$systemId = 'data://text/plain;base64,'.base64_encode($dtd);

$root = 'note'; // root element name, taken from the DTD above
$creator = new DOMImplementation;
$doctype = $creator->createDocumentType($root, '', $systemId); // doctype pointing at the DTD above
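
Keep in mind that a DTD only helps getElementById() if it declares some attribute to be of type ID. Here is a minimal sketch of that, assuming a hypothetical id attribute on the to element and an internal DTD subset in place of the data:// URI. The sample values are invented.

```php
<?php
// Hypothetical document: the internal DTD subset declares "id" on <to> as type ID.
$xml = <<<'XML'
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ATTLIST to id ID #IMPLIED>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<note><to id="recipient">Tove</to><from>Jani</from><heading>Reminder</heading><body>Call me</body></note>
XML;

$dom = new DOMDocument();
$dom->validateOnParse = true; // validate against the DTD while loading
$dom->loadXML($xml);

$to = $dom->getElementById('recipient');
echo $to->nodeValue, "\n"; // Tove
```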

Document::getElementById('content') doesn't work, but the element is there

I found this:

Please note that if your HTML does not contain a doctype declaration,
then getElementById will always return null.

As a workaround, look up the elements by tag name and then pick the one whose id attribute matches.
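
A rough sketch of that tag-name approach, assuming the target is a div (the question only tells us the id is content, so the markup here is made up):

```php
<?php
// Hypothetical markup; only the id "content" comes from the question.
$html = '<html><body><div id="sidebar">Links</div><div id="content">Main text</div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// Walk every <div> and stop at the first one whose id attribute matches.
$content = null;
foreach ($doc->getElementsByTagName('div') as $div) {
    if ($div->getAttribute('id') === 'content') {
        $content = $div;
        break;
    }
}

echo $content !== null ? $content->nodeValue : 'not found', "\n"; // Main text
```

This compares plain attribute values, so it does not depend on any ID registration at all.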
