How do I access HTML elements that are rendered in JavaScript using XPath?
The "capybara-webkit" gem is a viable way to work with this website in its fully JavaScript-rendered state.
Here is a scratch example of what a capybara-webkit script might look like.
#!/usr/bin/env ruby
require "rubygems"
require "pp"
require "bundler/setup"
require "capybara"
require "capybara/dsl"
require "capybara-webkit"

Capybara.run_server = false
Capybara.current_driver = :webkit
Capybara.app_host = "http://www.goalzz.com/"

module Test
  class Goalzz
    include Capybara::DSL

    def get_results
      visit('/default.aspx?c=8358')
      all(:xpath, '//td[@class="m_g"]').each { |node| pp node.to_s }
    end
  end
end

spider = Test::Goalzz.new
spider.get_results
Because the page is created dynamically, finding the example XPath requires a driver with a fully functional JavaScript engine.
How to access HTML element using XPath in IE8?
You might also like this solution to add xpath support for HTML in IE:
http://sourceforge.net/projects/html-xpath/
This has the benefit of unifying the API between IE and other browsers, as well.
Get element by xpath on HTML string
You can use DOMParser:
var response = "<!DOCTYPE html><html><body><h1>This is heading 1</h1></body></html>";
var dom = new DOMParser().parseFromString(response, 'text/html');
// Example
console.log(dom.querySelector('h1').textContent);
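The same parse-then-query idea can be sketched outside the browser. The following is a minimal Python illustration, not part of the original answer. It assumes the markup is well-formed XML, since the standard library's ElementTree supports only a subset of XPath and is not an HTML parser.

```python
import xml.etree.ElementTree as ET

# Assumption: the markup is well-formed XML (every tag is closed).
# Real-world HTML often is not and would need a dedicated HTML parser.
response = "<html><body><h1>This is heading 1</h1></body></html>"

root = ET.fromstring(response)

# ".//h1" selects the first <h1> anywhere below the root element.
heading = root.find(".//h1")
heading_text = heading.text if heading is not None else None
print(heading_text)
```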
Compare two DOM elements using xpath
How exactly are you using XPath? Be aware that XPath may not work in IE, at least without a shim (its native support is only for XML).
But with a proper polyfill/shim, you should be able to just compare the DOM element (or iterate through each XPath result if multiple are returned and compare each individual item).
if (xPathClassDOM === xPathIDDDOM) {...}
Also, .tab-header-home is not XPath but a CSS selector. You can still compare it with an XPath-retrieved element once you have grabbed the DOM element (e.g., via document.querySelector).
For example:
JavaScript:
var classBasedExpression = ".tab-header-home"; // CSS-Selector-based
var idBasedExpression = ".//*[@id='1']/div/ul/li[5]"; // XPath-based
var firstCSSSelection = document.querySelector(classBasedExpression);
var iterator = document.evaluate(idBasedExpression, document, null, XPathResult.UNORDERED_NODE_ITERATOR_TYPE, null);
try {
    var thisNode = iterator.iterateNext();
    while (thisNode) {
        if (firstCSSSelection === thisNode) {
            alert('found!');
            break;
        }
        thisNode = iterator.iterateNext();
    }
}
catch (e) {
    alert('error:' + e);
}
HTML:
<div id="1">
    <div>
        <ul>
            <li>a</li>
            <li>b</li>
            <li>c</li>
            <li>d</li>
            <li class="tab-header-home">e</li>
        </ul>
    </div>
</div>
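The comparison itself (look up the same node two different ways, then test identity) can also be sketched in Python. This is only an illustration under stated assumptions: the markup is wrapped in a hypothetical <body> root so that it parses as XML, and ElementTree's limited XPath stands in for both the CSS selector and the XPath expression.

```python
import xml.etree.ElementTree as ET

# Assumption: the sample markup is wrapped in a <body> root so it is
# well-formed XML and the id-based path can match a descendant of the root.
markup = """
<body>
  <div id="1">
    <div>
      <ul>
        <li>a</li><li>b</li><li>c</li><li>d</li>
        <li class="tab-header-home">e</li>
      </ul>
    </div>
  </div>
</body>
""".strip()

root = ET.fromstring(markup)

# Lookup 1: by class attribute (stands in for the CSS selector).
by_class = root.find(".//li[@class='tab-header-home']")
# Lookup 2: by id and position (the XPath from the example).
by_position = root.find(".//*[@id='1']/div/ul/li[5]")

# Both lookups return the same Element object, so identity comparison works.
same_element = by_class is not None and by_class is by_position
print(same_element)
```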
Highlighting when HTML and Xpath is given
This is what I ended up doing.
public render() {
    let htmlText = //The string above
    let doc = new DOMParser().parseFromString(htmlText, 'text/html');
    let xpathNode = doc.evaluate("/html/body/ul/li[1]/a[1]", doc, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null);
    // char_start and char_end (defined elsewhere) delimit the text to highlight
    const highlightedNode = xpathNode.singleNodeValue.innerText;
    const textValuePrev = highlightedNode.slice(0, char_start);
    const textValueAfter = highlightedNode.slice(char_end, highlightedNode.length);
    xpathNode.singleNodeValue.innerHTML = `${textValuePrev}
        <span class='pt-tag'>
        ${highlightedNode.slice(char_start, char_end)}
        </span> ${textValueAfter}`;
    // A component must return a single root element, so wrap both children
    return (
        <div>
            <h5> Display html data </h5>
            <div dangerouslySetInnerHTML={{__html: doc.body.outerHTML}} />
        </div>
    )
}
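The slicing step can be looked at in isolation. The sketch below is a hypothetical Python helper (the name `highlight` and the sample string are assumptions, not from the original answer) that wraps the characters between two offsets in the same `pt-tag` span. It also escapes the text, which the snippet above does not do.

```python
from html import escape


def highlight(text: str, char_start: int, char_end: int) -> str:
    """Wrap text[char_start:char_end] in a span, escaping all three parts."""
    before = text[:char_start]
    marked = text[char_start:char_end]
    after = text[char_end:]
    return (
        f"{escape(before)}"
        f"<span class='pt-tag'>{escape(marked)}</span>"
        f"{escape(after)}"
    )


highlighted = highlight("Display html data", 8, 12)
print(highlighted)
```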
How to access HTML DOM Property using Xpath or Css Selector in Selenium
Now I can get the <img> tag and the URL of the picture:
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.freepik.com/search?dates=any&format=search&page=1&query=Polygonal%20Human&sort=popular")
# Optional explicit wait (needs WebDriverWait and expected_conditions as EC):
# WebDriverWait(driver, 5).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, "[class='lzy landscape lazyload--done']")))
# The class is always "lzy landscape lazyload--done"
result = driver.find_elements(By.CSS_SELECTOR, "[class='lzy landscape lazyload--done']")
for i in result:
    print(i.get_attribute('src'))
Result:
https://img.freepik.com/free-vector/innovative-medicine-abstract-composition-with-polygonal-wireframe-images-human-hand-carefully-holding-heart-vector-illustration_1284-30757.jpg?size=626&ext=jpg
https://img.freepik.com/free-vector/computer-generated-rendering-hand_41667-189.jpg?size=626&ext=jpg
https://img.freepik.com/free-vector/polygonal-wireframe-business-strategy-composition-with-glittering-images-human-hand-incandescent-lamp-with-text_1284-32265.jpg?size=626&ext=jpg
https://img.freepik.com/free-vector/particles-geometric-art-line-dot-engineering_31941-119.jpg?size=626&ext=jpg
........
Or get showcase__link:
result = driver.find_elements(By.CSS_SELECTOR, "[class='showcase__link']")
for i in result:
    print(i.get_attribute('href'), i.get_attribute('id'), i.get_attribute('data-download-file-url'))
Selenium - Able to get the list of Webelements rendered & displayed in a Web page
Getting only the web elements that are displayed on a web page does not strike me as a valid business case. A typical test case verifies whether a particular element (button, text, etc.) is displayed or not.
As you mentioned, for the web page https://login.yahoo.com/ only a few web elements are rendered in the browser (1 input box, 7 links, 1 button, etc.). You saw that correctly as an end user. What you missed is that those elements (1 input box, 7 links, 1 button, etc.) have their visible property set to true, which is why you see them when you access the URL.
Next, when you try to get only these web elements, you end up with a large number of web elements in your collection, for two reasons:
- To find a particular element (or elements) you need a locator: id, name, linkText, css, or xpath. A locator should identify its target unambiguously in the HTML DOM, so if you use xpath or css, make sure the locator you construct identifies a unique element (or a unique set of elements).
- Not every element in the HTML DOM shows up on the website. Some elements are kept hidden so that end users cannot perform any action on them. These elements are not visible on the page, but they still exist in the DOM as hidden elements.
Conclusion:
When you try to get only these web elements, consider identifying each element through a locator that identifies it uniquely in the HTML DOM. Once identified, you can perform any desired action (sendKeys(), click(), etc.) on it as long as it remains attached to the HTML DOM.
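If a list of only the displayed elements is still wanted, one possible approach is to filter a located collection by visibility. The sketch below assumes each element exposes an `is_displayed()` method, as Selenium's Python WebElement does. `FakeElement` and `visible_only` are hypothetical names used so the example runs without a browser.

```python
from typing import Iterable, List


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement, used only in this sketch."""

    def __init__(self, name: str, displayed: bool) -> None:
        self.name = name
        self._displayed = displayed

    def is_displayed(self) -> bool:
        return self._displayed


def visible_only(elements: Iterable) -> List:
    """Keep only the elements that report themselves as displayed."""
    return [element for element in elements if element.is_displayed()]


candidates = [
    FakeElement("username", True),
    FakeElement("csrf-token", False),  # present in the DOM but hidden
    FakeElement("submit", True),
]
visible_names = [element.name for element in visible_only(candidates)]
print(visible_names)
```

With a real driver, the same filter could be applied to whatever `find_elements` returns.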