PHP DOMDocument: element ending up within another
A DOMDocument has to have a single root element. When a fragment is loaded without the implied html/body wrappers (for example with LIBXML_HTML_NOIMPLIED), the parser moves all following siblings inside the first top-level element.
The easiest way to address this is to wrap your content in a container tag, e.g.:
$content = '<div><figure class="image image-style-align-left">
<img src="https://placekitten.com/g/200/300"></figure>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p></div>';
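A minimal sketch of the wrapper approach in use. It assumes the fragment is loaded with LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD (the asker's exact flags are not shown here, so that part is an assumption) and checks that the paragraph stays a sibling of the figure instead of ending up inside it:

```php
<?php
$content = '<div><figure class="image image-style-align-left">'
    . '<img src="https://placekitten.com/g/200/300"></figure>'
    . '<p>Lorem ipsum dolor sit amet, consectetuer adipiscing elit.</p></div>';

$dom = new DOMDocument();
// <figure> is an HTML5 element; some libxml2 versions warn about it,
// so collect warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
libxml_clear_errors();

$figure = $dom->getElementsByTagName('figure')->item(0);
$p = $dom->getElementsByTagName('p')->item(0);

// Both elements should report the wrapper <div> as their parent.
var_dump($figure->parentNode->nodeName);
var_dump($p->parentNode->nodeName);
```

If the wrapper is not wanted in the final markup, it can be stripped again after serialising.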
PHP DOMDocument saveHTML not encoding cyrillic correctly
The problem is with $dom->saveHTML(); you need to pass the root node as a parameter, like this:
return $dom->saveHTML((new \DOMXPath($dom))->query('/')->item(0));
Then it suddenly renders the page differently, with substitution. If it does not, double-check the values of $dom->encoding and $dom->substituteEntities; they should read UTF-8 and TRUE.
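A short sketch of the node-argument approach, under two assumptions that are not part of the answer above: the input is given a meta charset hint so that loadHTML() reads it as UTF-8, and the body element is passed to saveHTML() in place of the XPath lookup. The sample string is purely illustrative:

```php
<?php
$html = '<p>Привет, мир</p>';

$dom = new DOMDocument();
// Assumption: prepend a charset hint so the parser does not fall back
// to treating the input bytes as ISO-8859-1.
$dom->loadHTML(
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html
);

// Serialise a specific node rather than the whole document.
$body = $dom->getElementsByTagName('body')->item(0);
$out = $dom->saveHTML($body);

echo $out, "\n";
```

Serialising a node this way should leave the Cyrillic text as literal UTF-8 characters rather than numeric entities.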
PHP DOMDocument works incorrectly when the node is wrapped in a figure?
I was unable to reproduce your problem. My guess is that there is a misplaced element somewhere in your source HTML. That said, your code can be simplified quite a bit.
There's no need to put your image nodes into an array; you can work directly with the results of DOMDocument::getElementsByTagName().
As mentioned in the comments, you can set up DOMDocument::loadHTML() not to add the doctype and implied elements (via the LIBXML_HTML_NODEFDTD and LIBXML_HTML_NOIMPLIED flags), instead of removing them later with potentially tricky string manipulation.
A simple DOMDocument::createElement() call can be used for the element you want to append, instead of creating a new object.
Finally, the error control operator @ should generally be avoided. Instead, libxml_use_internal_errors() can be used to set the error behaviour. This allows you to examine error messages with libxml_get_errors() if desired.
$content = <<< HTML
<div class="content">
<a href="..."><img src=""></a>
<figure>
<a href="..."><img src=""></a>
<figcaption>Caption</figcaption>
</figure>
</div>
HTML;
$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($content, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
libxml_use_internal_errors(false);
foreach ($dom->getElementsByTagName('img') as $node) {
    $node->parentNode->appendChild($dom->createElement("span", "11"));
}
$newHtml = $dom->saveHTML();
echo $newHtml;
Output:
<div class="content">
<a href="..."><img src=""><span>11</span></a>
<figure>
<a href="..."><img src=""><span>11</span></a>
<figcaption>Caption</figcaption>
</figure>
</div>
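To illustrate the last point, here is a hedged sketch of reading the collected parser messages with libxml_get_errors(). The input string is a made-up fragment, and whether any warnings are reported for it depends on the installed libxml2 version:

```php
<?php
$dom = new DOMDocument();

// Collect parser messages internally and remember the previous setting.
$previous = libxml_use_internal_errors(true);

$dom->loadHTML(
    '<div><figure><img src=""></figure></div>',
    LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
);

// Each entry is a LibXMLError object with level, code, line and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

libxml_clear_errors();                 // empty the buffer before the next parse
libxml_use_internal_errors($previous); // restore the earlier behaviour
```

Clearing the buffer after each parse keeps messages from one document from showing up when the next one is inspected.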
How does one strip tags (and their content) from an HTML string using PHP's DOMDocument?
Based on Niet the Dark Absol's comment, my solution was to simply wrap my code snippet in a div, and then use substr to remove it. This seems like an acceptable workaround for working with valid inline HTML snippets (rather than the entire DOM) via DOMDocument.
$html = '<a href="#">LINK1</a> - and <i>also</i> <a href="#">LINK2</a>';
$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->validateOnParse = false;
$dom->resolveExternals = false;
$dom->substituteEntities = false;
$dom->loadHTML( '<div>'.$html.'</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD );
$list = $dom->getElementsByTagName('a');
while ($list->length > 0) {
    $p = $list->item(0);
    $p->parentNode->removeChild($p);
}
// trim() drops any trailing newline so the offsets match "<div>" and "</div>" exactly
$result = substr(trim($dom->saveHTML()), 5, -6);
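As an alternative to slicing the wrapper off with substr, the children of the wrapper can be serialised one by one. This is a different technique from the one in the answer, sketched here on the same input; it does not depend on the wrapper tag's length:

```php
<?php
$html = '<a href="#">LINK1</a> - and <i>also</i> <a href="#">LINK2</a>';

$dom = new DOMDocument();
$dom->loadHTML(
    '<div>' . $html . '</div>',
    LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD
);

// The node list is live, so keep removing the first item until none remain.
$links = $dom->getElementsByTagName('a');
while ($links->length > 0) {
    $link = $links->item(0);
    $link->parentNode->removeChild($link);
}

// Serialise each child of the wrapper <div>, leaving the wrapper itself out.
$result = '';
foreach ($dom->documentElement->childNodes as $child) {
    $result .= $dom->saveHTML($child);
}

echo $result, "\n";
```

The same loop keeps working if the wrapper is later changed to a different tag or given attributes.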