DOMXpath - Get href attribute and text value of an a element
Fetch //td[@class='name']/a and then pluck the text with nodeValue and the attribute with getAttribute('href').
Apart from that, you can combine XPath queries with the union operator |, so you can use
//td[@class='name']/a/@href|//td[@class='name']
as well.
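A minimal sketch of both approaches, assuming a made-up table in which each td with class "name" holds a single link. The markup, names, and URLs are illustrative only and are not taken from the original question.

```php
<?php
// Hypothetical markup for illustration only.
$html = <<<HTML
<table>
  <tr><td class="name"><a href="/users/1">Alice</a></td></tr>
  <tr><td class="name"><a href="/users/2">Bob</a></td></tr>
</table>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Approach 1: fetch the a elements, then read text and attribute from each.
$links = [];
foreach ($xpath->query("//td[@class='name']/a") as $a) {
    $links[] = ['text' => $a->nodeValue, 'href' => $a->getAttribute('href')];
}

// Approach 2: the union query returns href attribute nodes and td elements
// in a single node list; nodeName tells them apart.
$union = [];
foreach ($xpath->query("//td[@class='name']/a/@href|//td[@class='name']") as $node) {
    $union[] = $node->nodeName . ': ' . trim($node->nodeValue);
}

print_r($links);
print_r($union);
```

The first approach keeps each link's text and href paired, which is usually easier to work with than the mixed list the union returns.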
Get href value with DOMDocument in PHP
h1//a[@href=""] is looking for an a element with an href attribute with an empty string as the value, whereas your href attribute contains something other than the empty string as the value.
If that's the entire document, then you could use the expression //a. Otherwise, h1//a should work as well.
If you require the a element to have an href attribute with any kind of value, you could use h1//a[@href].
If the h1 is not at the root of the document, you might want to use //h1 instead. So the last example would become //h1//a[@href].
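To illustrate the difference between the two predicates, here is a small sketch. The HTML string is an assumption, since the markup from the question is not reproduced here.

```php
<?php
// Hypothetical document, assumed for illustration.
$html = '<div><h1><a href="/home">Home</a></h1></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Matches only a elements whose href is the empty string: none in this document.
$empty = $xpath->query('//h1//a[@href=""]');

// Matches a elements that have an href attribute, whatever its value.
$any = $xpath->query('//h1//a[@href]');

echo $empty->length, "\n";                       // 0
echo $any->length, "\n";                         // 1
echo $any->item(0)->getAttribute('href'), "\n";  // /home
```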
Getting Text Value of Data-Attribute Link in DOM XPath
Quite often I find it easiest to inspect the element I wish to target using the developer tools in Chrome, from which you can copy the XPath expression that targets that particular node. This doesn't always return the most useful XPath expression, but it is usually a good starting point. In this case I tweaked the returned query and added the class name.
Hope it helps
$term='dog show';
$url=sprintf('https://en.wikipedia.org/w/index.php?search=%s&title=Special:Search&fulltext=Search', urlencode( $term ) );
printf( '<a href="%s" target="_blank">%s</a>', $url, $url );
libxml_use_internal_errors(true);
$dom=new DOMDocument;
$dom->recover=true;
$dom->formatOutput=true;
$dom->preserveWhiteSpace=true;
$dom->strictErrorChecking=false;
$dom->loadHTMLFile( $url );
$xp=new DOMXPath( $dom );
/* possibly the important bit */
$query='//*[@id="mw-content-text"]/div/ul/li/div[@class="mw-search-result-heading"]/a';
$col=$xp->query( $query );
$html=array();
if( $col && $col->length > 0 ){
    foreach( $col as $node ){
        $html[]=array(
            'title'=>$node->nodeValue,
            'href'=>$node->getAttribute('href')
        );
    }
}
printf('<pre>%s</pre>',print_r($html,true));
Will output:
https://en.wikipedia.org/w/index.php?search=dog+show&title=Special:Search&fulltext=Search
Array(
[0] => Array
(
[title] => Dog show
[href] => /wiki/Dog_show
)
[1] => Array
(
[title] => Show dog
[href] => /wiki/Show_dog
)
[2] => Array
(
[title] => Westminster Kennel Club Dog Show
[href] => /wiki/Westminster_Kennel_Club_Dog_Show
)
[3] => Array
(
[title] => Dog Eat Dog (U.S. game show)
[href] => /wiki/Dog_Eat_Dog_(U.S._game_show)
)
.......... etc
XPath get attribute value in PHP
XPath can do the job of getting the value attribute with $xpath->query("//input[@name='text1']/@value");
Then you can iterate over the resulting node list of attribute nodes and read the value property of each attribute node (for example $attr->value).
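A short sketch of that loop, assuming a hypothetical form with two text inputs. Only the input named text1 comes from the query above; the rest of the markup is made up.

```php
<?php
// Hypothetical form markup, assumed for illustration.
$html = '<form>'
    . '<input type="text" name="text1" value="hello">'
    . '<input type="text" name="text2" value="world">'
    . '</form>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$values = [];
foreach ($xpath->query("//input[@name='text1']/@value") as $attr) {
    // Each result is a DOMAttr node; its value property holds the attribute text.
    $values[] = $attr->value;
}

print_r($values);
```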
PHP Xpath : get all href values that contain needle
Not sure I understand the question correctly, but the second XPath expression already does what you are describing. It does not match against the text node of the A element, but the href attribute:
$html = <<< HTML
<ul>
<li>
<a href="http://example.com/page?foo=bar">Description</a>
</li>
<li>
<a href="http://example.com/page?lang=de">Description</a>
</li>
</ul>
HTML;
$xml = simplexml_load_string($html);
$list = $xml->xpath("//a[contains(@href,'foo')]");
var_dump($list);
Outputs:
array(1) {
[0]=>
object(SimpleXMLElement)#2 (2) {
["@attributes"]=>
array(1) {
["href"]=>
string(31) "http://example.com/page?foo=bar"
}
[0]=>
string(11) "Description"
}
}
As you can see, the returned array contains only the A element with href containing foo (which I understand is what you are looking for). It contains the entire element, because the XPath translates to "fetch all A elements with an href attribute containing foo". You would then access the attribute with
echo $list[0]['href']; // gives "http://example.com/page?foo=bar"
If you only want to return the attribute itself, you'd have to do
//a[contains(@href,'foo')]/@href
Note that in SimpleXML, this would still return a SimpleXMLElement though:
array(1) {
[0]=>
object(SimpleXMLElement)#3 (1) {
["@attributes"]=>
array(1) {
["href"]=>
string(31) "http://example.com/page?foo=bar"
}
}
}
but you can now output the URL with
echo $list[0]; // gives "http://example.com/page?foo=bar"
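If you want plain strings rather than SimpleXMLElement objects, one option is to cast each result. This is a minimal sketch that reuses the sample list from above; the $hrefs variable name is introduced here for illustration.

```php
<?php
// Same sample markup as above, built as a single string.
$xml = simplexml_load_string(
    '<ul>'
    . '<li><a href="http://example.com/page?foo=bar">Description</a></li>'
    . '<li><a href="http://example.com/page?lang=de">Description</a></li>'
    . '</ul>'
);

// Select only the href attributes, then cast each SimpleXMLElement to a string.
$hrefs = array_map(
    function ($attr) {
        return (string) $attr;
    },
    $xml->xpath("//a[contains(@href,'foo')]/@href")
);

print_r($hrefs);
```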
Get href attribute of an element of a web page using Selenium
To print the value of the href attribute you can use either of the following locator strategies:
Using cssSelector:
System.out.println(wd.findElement(By.cssSelector("a.a-link-normal#vvp-product-details-modal--product-title")).getAttribute("href"));
Using xpath:
System.out.println(wd.findElement(By.xpath("//a[@class='a-link-normal' and @id='vvp-product-details-modal--product-title']")).getAttribute("href"));
Ideally, to extract the value of the href attribute, you should induce WebDriverWait for visibilityOfElementLocated(), again with either of the following locator strategies:
Using cssSelector:
System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector("a.a-link-normal#vvp-product-details-modal--product-title"))).getAttribute("href"));
Using xpath:
System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfElementLocated(By.xpath("//a[@class='a-link-normal' and @id='vvp-product-details-modal--product-title']"))).getAttribute("href"));
PHP DOMXPath extract href of anchor inside a td
With the structure you posted, the following outputs the href-value:
<?php
$dom = new DOMDocument('1.0');
$dom->loadHTMLFile('input.html');
$xpath = new DOMXPath($dom);
$query = '//*[@id="main_content"]/table/tr/td/table[3]/tr[2]/td/table/tr[position() >= 3]/td[2]/a';
$nodes = $xpath->query($query);
foreach ($nodes as $node) {
    /** @var $node DOMElement */
    var_dump(
        $node->getAttribute('href'), // the href-attribute value
        $node->nodeValue // the inner text
    );
}