DOMDocument::loadHTML Error

DOMDocument::loadHTML error

Header, Nav and Section are elements from HTML5. Because the HTML5 developers felt it was too difficult to remember Public and System Identifiers, the DOCTYPE declaration is just:

<!DOCTYPE html>

In other words, there is no DTD to check against, so DOM falls back to the HTML4 Transitional DTD. That DTD doesn't contain those elements, hence the warnings.

To suppress the warnings, put

libxml_use_internal_errors(true);

before the call to loadHTML and

libxml_use_internal_errors(false);

after it.
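
Put together, the suppression described above might look like the following minimal sketch. The HTML string and variable names are hypothetical and only there for illustration; the point is that the HTML5 elements still end up in the tree even though the parser complains about them.

```php
<?php
// Hypothetical HTML5 snippet used only for illustration.
$html = '<!DOCTYPE html><html><body><header><nav>Menu</nav></header>'
      . '<section>Text</section></body></html>';

$dom = new DOMDocument();

// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$dom->loadHTML($html);

// Discard the collected errors and switch back to normal reporting.
libxml_clear_errors();
libxml_use_internal_errors(false);

// The HTML5 elements are still available in the parsed document.
$navText = $dom->getElementsByTagName('nav')->item(0)->textContent;
$sectionCount = $dom->getElementsByTagName('section')->length;
```

Calling libxml_clear_errors() before switching back is an addition to the steps above; it keeps the internal error buffer from growing across repeated loads.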

An alternative would be to use https://github.com/html5lib/html5lib-php.

PHP DOMDocument loadHTML error

The HTML page you are trying to grab is malformed. The document type declaration must be the first line of a document. You could try cutting the first two lines off of the content before loading it with loadHTML().
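
Rather than hard-coding "two lines", one possible variation is to search for the doctype and drop whatever precedes it. This is a sketch of that swapped-in approach, not the exact fix suggested above; the page content and variable names are made up for illustration.

```php
<?php
// Hypothetical page content with stray markup before the doctype.
$content = "<!-- cached copy -->\n<script>var a = 1;</script>\n"
         . "<!DOCTYPE html>\n<html><body><p>Hello</p></body></html>";

// Find the doctype case-insensitively and drop everything before it.
$pos = stripos($content, '<!DOCTYPE');
$trimmed = ($pos === false) ? $content : substr($content, $pos);

$dom = new DOMDocument();
$previous = libxml_use_internal_errors(true);
$dom->loadHTML($trimmed);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$paragraph = $dom->getElementsByTagName('p')->item(0)->textContent;
```

If no doctype is found, the content is passed through unchanged, so the sketch does not make a well-formed page worse.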

PHP DOMDocument errors/warnings on html5-tags

No, there is no way of specifying a particular doctype to use, or to modify the requirements of the existing one.

Your best workable solution is going to be to disable error reporting with libxml_use_internal_errors:

$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML('...');
libxml_clear_errors();

PHP: DOMDocument loadHTML returns an error when using HTML5 tags

I've run into this issue with PHP's DOMDocument and XSL functions. You basically have to load the document as XML. That's the only way I got the <video> tag to work.

Update:
You can also try adding elements & entities to the <!DOCTYPE html5 > as long as $doc->resolveExternals = true.
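
As a rough illustration of the load-as-XML approach, the sketch below parses a small document containing a <video> element with loadXML(). It assumes the markup is well-formed XML (every tag closed, every attribute given a value); the markup and file name are hypothetical.

```php
<?php
// Hypothetical well-formed markup; attributes need explicit values in XML.
$xhtml = '<?xml version="1.0" encoding="UTF-8"?>'
       . '<html><body>'
       . '<video src="movie.mp4" controls="controls"></video>'
       . '</body></html>';

$doc = new DOMDocument();

// loadXML() uses the XML parser, which has no fixed list of known tags.
$loaded = $doc->loadXML($xhtml);

$video = $doc->getElementsByTagName('video')->item(0);
$src = $video->getAttribute('src');
```

The trade-off is that the XML parser rejects markup that is not well-formed, so this only suits pages you control or have already cleaned up.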

DOMDocument loadHTML doesn't work properly on a server

To disable the warning, you can use

libxml_use_internal_errors(true);

This works for me (see the PHP manual); read on:


Background: You are loading invalid HTML. Invalid HTML is quite common, and DOMDocument::loadHTML corrects most of the problems but emits warnings by default.

With libxml_use_internal_errors you can control that behavior. Set it before loading the document:

$previously = libxml_use_internal_errors(true);
$doc->loadHTML($amazon);

Then after loading you can deal with the errors (if you want/need to):

/* @var LibXMLError[] $xmlErrors */
$xmlErrors = libxml_get_errors();

And finally clear them (as they will add up) and restore the previous setting if applicable:

unset($xmlErrors);
libxml_clear_errors();
libxml_use_internal_errors($previously);
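
To show what "dealing with the errors" could involve, here is a sketch that turns each LibXMLError into a readable line before clearing the buffer. The input markup, the $labels map and the message format are assumptions for illustration; which errors libxml actually reports depends on the libxml2 version installed.

```php
<?php
// Hypothetical invalid markup: an unescaped ampersand and a stray closing tag.
$html = '<html><body><p>Fish & chips</p></div></body></html>';

$doc = new DOMDocument();
$previously = libxml_use_internal_errors(true);
$doc->loadHTML($html);

// Map libxml severity constants to readable labels.
$labels = [
    LIBXML_ERR_WARNING => 'warning',
    LIBXML_ERR_ERROR   => 'error',
    LIBXML_ERR_FATAL   => 'fatal',
];

$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = sprintf(
        '%s %d at line %d: %s',
        $labels[$error->level] ?? 'unknown',
        $error->code,
        $error->line,
        trim($error->message)
    );
}

// Clear the buffer and restore the previous setting.
libxml_clear_errors();
libxml_use_internal_errors($previously);

echo implode(PHP_EOL, $messages), PHP_EOL;
```

In real code the messages would more likely go to a logger than to standard output.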

References

  • libxml_use_internal_errors: Disable libxml errors and allow user to fetch error information as needed
  • libxml_clear_errors: Clear libxml error buffer
  • libxml_get_errors: Retrieve array of errors
  • LibXMLError: The LibXMLError class
  • Stack Overflow answer to "DOMDocument PHP Memory Leak" (by Tak; Dec 2011)

PHP DOM loadHTML() method unusual warning

You're affected by a PHP bug. The issue was present only in PHP 5.6.8 and 5.6.9. Most likely you have an affected PHP version on the server and a bug-free version on your localhost.

The bug forbids all null characters in the HTML document you're loading, so as a workaround you can remove those characters (they aren't needed anyway) before further parsing.

$document = new DOMDocument();
$p_result_without_null_chars = str_replace("\0", '', $p_result);
$document->loadHTML($p_result_without_null_chars);

DOMDocument::loadHTML(): warning - htmlParseEntityRef: no name in Entity

This correct answer comes from a comment from @lonesomeday.

My best guess then is that there is an unescaped ampersand (&) somewhere in the HTML. This will make the parser think we're in an entity reference (e.g. &copy;). When it gets to ;, it thinks the entity is over. It then realises that what it has doesn't conform to an entity, so it sends out a warning and returns the content as plain text.
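
If the cause really is a bare ampersand, one possible workaround is to escape such ampersands before handing the markup to loadHTML(). The sketch below is a heuristic, not part of the quoted answer: the regular expression and sample input are assumptions, and it only leaves alone ampersands that already start a named, decimal or hexadecimal reference.

```php
<?php
// Hypothetical input with one bare ampersand and several valid references.
$html = '<p>Fish & chips &amp; peas &copy; &#169; &#xA9;</p>';

// Escape any "&" that is not the start of a complete entity reference.
$pattern = '/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)/';
$escaped = preg_replace($pattern, '&amp;', $html);

echo $escaped, PHP_EOL;
```

Ampersands inside script blocks or in URLs that happen to be followed by a word and a semicolon are not handled, so treat this as a starting point rather than a complete fix.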


