Using PHP to Get DOM Element

getElementsByTagName returns a list of elements (a DOMNodeList), so you first need to loop through the elements, then through each element's attributes.

$divs = $dom->getElementsByTagName('div');
foreach ($divs as $div) {
    foreach ($div->attributes as $attr) {
        $name = $attr->nodeName;
        $value = $attr->nodeValue;
        echo "Attribute '$name' :: '$value'<br />";
    }
}

In your case, you said you needed a specific ID. IDs are supposed to be unique, so you can use getElementById. Note that for a document loaded as XML it may find nothing unless the id attribute is declared as type ID by a DTD and you call $dom->validate() first:

$div = $dom->getElementById('divID');

Then to get your attribute:

$attr = $div->getAttribute('customAttr');
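
As a minimal, self-contained sketch of the two steps above: the markup below is hypothetical, and the attribute name data-custom is an assumption chosen to be lowercase, because the HTML parser lowercases attribute names when it builds the tree. The null check covers the case where no element has the requested ID.

```php
<?php
// Hypothetical sample markup; only the ID 'divID' comes from the answer above.
$html = '<html><body><div id="divID" data-custom="hello">Content</div></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$div = $dom->getElementById('divID');

// getElementById returns null when nothing matches, so guard before reading
$attr = ($div !== null) ? $div->getAttribute('data-custom') : null;

echo $attr, PHP_EOL;
```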

EDIT: $dom->loadHTML only parses markup; it doesn't execute it, so index.php won't be run this way. To parse the rendered page, you might have to request it through the web server, something like:

$dom->loadHTML(file_get_contents('http://localhost/index.php'));

Get a Dom element from website using php

There's no need to apply htmlentities to an HTML string before parsing it. If you do, all angle brackets are replaced with entities and the parser can no longer find any tags.

There's also no need to use file_get_contents to load a file, since DOMDocument has a method for that.

In your comment, you didn't use the right method to load an HTML file from its URL (as opposed to an HTML string).

The DOMDocument method to use is DOMDocument::loadHTMLFile, not DOMDocument::loadHTML:

$doc = new DOMDocument();
$doc->loadHTMLFile("http://stackoverflow.com/");
// Read the text of the first <title> element
$title = $doc->getElementsByTagName("title")->item(0)->textContent;
echo $title, PHP_EOL;

Note that you can keep the parser's warnings from being displayed by calling libxml_use_internal_errors(true); before this method.
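
A rough sketch of that error handling, using a hypothetical HTML string instead of a remote URL so it runs offline. It buffers the parser's complaints, prints whatever was collected, and then restores the previous setting; which messages appear depends on the libxml version.

```php
<?php
// Hypothetical markup with an unknown tag, used only for illustration.
$html = '<html><body><div>Text</div><bogus></bogus></body></html>';

// Buffer parser errors instead of emitting PHP warnings
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Inspect the collected errors, then empty the buffer
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), ' (line ', $error->line, ')', PHP_EOL;
}
libxml_clear_errors();

// Restore whatever setting was active before
libxml_use_internal_errors($previous);

$text = $doc->getElementsByTagName('div')->item(0)->textContent;
echo $text, PHP_EOL;
```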

Get DOM element string using PHP

Use the DOMNode::C14N method to canonicalize the node to a string, then substr and strpos to extract the fragment you need:

...
$el = $dom->getElementById("myelementID");
$elString = $el->C14N();

var_dump(substr($elString, 0, strpos($elString, '>') + 1));

The output (for your example):

string(51) "<div class="hello" data-foo="bar" id="myelementID">"

http://php.net/manual/ru/domnode.c14n.php
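
For a version that runs on its own, here is a sketch with hypothetical markup chosen to reproduce the output shown above. The attributes are deliberately written in a different order in the source, since canonicalization sorts them by name.

```php
<?php
// Hypothetical markup; the attribute order differs from the canonical output on purpose.
$html = '<html><body><div id="myelementID" data-foo="bar" class="hello"><p>Inner</p></div></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$el = $dom->getElementById('myelementID');
$elString = $el->C14N(); // canonical form of the element and its children

// Keep everything up to and including the first '>'
$openingTag = substr($elString, 0, strpos($elString, '>') + 1);
echo $openingTag, PHP_EOL;
```

This relies on the first '>' being the end of the opening tag, so it would cut too early if an attribute value itself contained a literal '>'.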

PHP access DOM within your page

If you have PHP code generating the page, you could use the output buffer to generate the page in memory, edit the generated page and then flush it to the browser. You can only change the DOM before the browser gets it.

You could do the following:

ob_start(); // Should be called before any output is generated

// ... PHP code that outputs HTML ...

$generated_html = ob_get_clean(); // Store generated HTML to string

// Load and manipulate HTML
$doc = new DOMDocument();
$doc->loadHTML($generated_html);

// ... Manipulate the generated HTML ...

echo $doc->saveHTML(); // echo the modified HTML

However, since you are generating the HTML, it would make more sense to change whatever you need to change before it's generated, to reduce processing time.

If you want to change the HTML of a page which is already shown in the browser you'll need another way (such as JS/AJAX) since at that point PHP can't possibly access the DOM.
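
To make the buffering approach concrete, here is a small sketch with the placeholders filled in. The echoed markup and the heading edit are hypothetical stand-ins for the page's real output and for whatever change you need.

```php
<?php
ob_start(); // start buffering before any output is generated

// Stand-in for the page's normal output (hypothetical markup)
echo '<html><body><h1 id="title">Old title</h1></body></html>';

$generated_html = ob_get_clean(); // take the buffered HTML and stop buffering

$doc = new DOMDocument();
$doc->loadHTML($generated_html);

// Example edit: replace the heading text
$heading = $doc->getElementById('title');
if ($heading !== null) {
    $heading->nodeValue = 'New title';
}

$output = $doc->saveHTML();
echo $output; // send the modified HTML to the browser
```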

Get first level dom element with HTML code

You can try using DOMDocument to parse the HTML and get the tags you want.

Here is some code that does what you describe...

<?php

// Your HTML you provided
$html = <<<HTML
<h1>Indice</h1>
<p class="l3"><a href="#c1" class="ddb1a">I. El Censo</a></p>
<p class="l3"><a href="#c2" class="ddb1a">II. Leyes diversas</a></p>
<p class="l3">
<a href="#c3" class="ddb1a">III. Ofrenda de los Jefes y consagración de los levitas</a>
</p>
HTML;

// Create a DOM document
$dom = new DOMDocument ();

// Load the HTML
$dom->loadHTML ($html);

// Get the <body> tag
$bodys = $dom->getElementsByTagName ('body');
$body = $bodys->item (0);

// The HTML array you want
$html_array = array ();

// Run through each tag, and convert them to HTML strings
foreach ($body->childNodes as $child) {
    if ($child instanceof DOMElement) {
        $html_array[] = $dom->saveHTML ($child);
    }
}

// And lastly, display the array
print_r ($html_array);

Getting DOM elements by classname

Update: XPath version of the *[class~='my-class'] CSS selector

So after my comment below in response to hakre's comment, I got curious and looked into the code behind Zend_Dom_Query. It looks like the above selector is compiled to the following xpath (untested):

[contains(concat(' ', normalize-space(@class), ' '), ' my-class ')]

So the PHP would be:

$dom = new DomDocument();
$dom->load($filePath);
$finder = new DomXPath($dom);
$classname="my-class";
$nodes = $finder->query("//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]");

Basically, all we do here is normalize the class attribute so that the whole class list is bounded by spaces, even when it holds a single class. Then we pad the class we are searching for with a space on each side. This way we match only whole instances of my-class.


Use an xpath selector?

$dom = new DomDocument();
$dom->load($filePath);
$finder = new DomXPath($dom);
$classname="my-class";
$nodes = $finder->query("//*[contains(@class, '$classname')]");

If only one type of element ever carries the class, you can replace the * with that particular tag name.
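
To see how the two XPath queries above differ, here is a small sketch run against hypothetical markup. It loads an HTML string with loadHTML rather than a file with load, purely so the example is self-contained.

```php
<?php
// Hypothetical markup: only the first <p> carries the exact class "my-class".
$html = '<html><body>'
      . '<p class="intro my-class">exact</p>'
      . '<p class="my-class-wide">lookalike</p>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$finder = new DOMXPath($dom);
$classname = 'my-class';

// Substring test: also matches "my-class-wide"
$loose = $finder->query("//*[contains(@class, '$classname')]");

// Whole-token test: pads both sides with spaces before comparing
$strict = $finder->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]"
);

echo $loose->length, ' vs ', $strict->length, PHP_EOL; // prints "2 vs 1"
```

The plain contains() form is fine when no other class name contains the one you are looking for; otherwise the padded form is the safer choice.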

If you need to do a lot of this with very complex selectors, I would recommend Zend_Dom_Query, which supports CSS selector syntax (a la jQuery):

$finder = new Zend_Dom_Query($html);
$classname = 'my-class';
$nodes = $finder->query("*[class~=\"$classname\"]");

How to find the style property of Dom element using Dom Document in PHP?

If the CSS is on the same page, you can try this:

$html = file_get_contents('testing.html');
$dom = new DOMDocument();
$dom->loadHTML($html);
$div = $dom->getElementById('test');
if ($div->hasAttributes()) {
    foreach ($div->attributes as $attr) {
        $name = $attr->nodeName;
        $value = $attr->nodeValue;
        if (strcmp($name, "class") == 0) {
            // Look up the rule for this class in the document's text
            $x = getStyle($dom->textContent, ".$value{", "}");
            echo "<pre>";
            print_r($x);
            echo "</pre>";
        }
        if (strcmp($name, "id") == 0) {
            // Look up the rule for this ID in the document's text
            $idCss = getStyle($dom->textContent, "#$value{", "}");
            echo "<pre>";
            print_r($idCss);
            echo "</pre>";
        }
    }
}

// Returns the text from $start through the first $end after it, or '' if not found
function getStyle($string, $start, $end) {
    $ini = strpos($string, $start);
    if ($ini === false) return '';
    $stop = strpos($string, $end, $ini);
    if ($stop === false) return '';
    // substr takes a length, not an end position
    return substr($string, $ini, $stop - $ini + strlen($end));
}

To update the style, you can use:

$div->setAttribute('style', 'background-color:blue;');
echo $dom->saveHTML(); exit;

testing.html:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Insert title here</title>
</head>
<body>
<div id="test" class="stl" style="width:100px;">

</div>
</body>
</html>

<style>
.stl{
background-color:green;
}

</style>

Output:

.stl{
background-color:green;
}

If you only need the inline style attribute, you can do this:

$html = file_get_contents('testing.html');
$dom = new DOMDocument();
$dom->loadHTML($html);
$div = $dom->getElementById('test');
if ($div->hasAttributes()) {
    echo $div->getAttribute('style'); // width:100px;
}
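
If you then want individual properties rather than the raw string, one possible sketch is below. parseInlineStyle is a hypothetical helper, not part of the DOM extension, and the markup is an inline copy of the div from testing.html so the example runs without the file.

```php
<?php
// Hypothetical helper: splits an inline style string such as
// "width:100px; color:red" into property => value pairs.
function parseInlineStyle(string $style): array
{
    $rules = [];
    foreach (explode(';', $style) as $declaration) {
        // Split on the first colon only, so values that contain colons stay intact
        $parts = explode(':', $declaration, 2);
        if (count($parts) === 2 && trim($parts[0]) !== '') {
            $rules[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return $rules;
}

// Inline copy of the div from testing.html
$html = '<html><body><div id="test" class="stl" style="width:100px;"></div></body></html>';
$dom = new DOMDocument();
$dom->loadHTML($html);

$div = $dom->getElementById('test');
$styles = ($div !== null) ? parseInlineStyle($div->getAttribute('style')) : [];
print_r($styles);
```

This simple split would mishandle values that contain semicolons, so treat it as a starting point rather than a full CSS parser.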

