How to Access HTML Elements That Are Rendered in JavaScript Using XPath

How do I access HTML elements that are rendered in JavaScript using XPath?

The "capybara-webkit" gem is a viable way to drive this website with its JavaScript fully rendered.

Here is a scratch example of what a capybara-webkit script might look like.

#!/usr/bin/env ruby
require "rubygems"
require "pp"
require "bundler/setup"
require "capybara"
require "capybara/dsl"
require "capybara-webkit"

Capybara.run_server = false
Capybara.current_driver = :webkit
Capybara.app_host = "http://www.goalzz.com/"

module Test
  class Goalzz
    include Capybara::DSL

    def get_results
      visit('/default.aspx?c=8358')
      # Print the text of each matching cell (to_s would only print the node object).
      all(:xpath, '//td[@class="m_g"]').each { |node| pp node.text }
    end
  end
end

spider = Test::Goalzz.new
spider.get_results

Because the page is built dynamically, finding the example XPath requires a fully functional, JavaScript-capable browser engine to drive the page.

How to access HTML element using XPath in IE8?

You might also like this solution, which adds XPath support for HTML in IE:

http://sourceforge.net/projects/html-xpath/

This has the benefit of unifying the API between IE and other browsers, as well.

Get element by xpath on HTML string

You can use DOMParser:

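var response = "<!DOCTYPE html><html><body><h1>This is heading 1</h1></body></html>";
var dom = new DOMParser().parseFromString(response, 'text/html');
// Example using a CSS selector
console.log(dom.querySelector('h1').textContent);
// Example using XPath on the parsed document
var h1 = dom.evaluate('//h1', dom, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
console.log(h1.textContent);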
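
Outside the browser, the same idea can be sketched in Python with the standard library. This is only an illustration under two assumptions: the markup is well-formed enough to parse as XML (real-world HTML often is not), and the query fits the limited XPath subset that ElementTree supports. The doctype from the string above is left out to keep the input minimal.

```python
import xml.etree.ElementTree as ET

# Assumption: the markup is well-formed XML; real-world HTML often is not.
response = "<html><body><h1>This is heading 1</h1></body></html>"
root = ET.fromstring(response)

# ElementTree implements only a subset of XPath; ".//h1" finds the first
# <h1> anywhere below the root element.
heading = root.find(".//h1")
heading_text = heading.text if heading is not None else None
print(heading_text)
```

For arbitrary HTML, a tolerant HTML parser would be needed instead of an XML parser.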

Compare two DOM elements using xpath

How exactly are you using XPath? Be aware that XPath may not work in IE, at least without a shim (its native support is only for XML).

But with a proper polyfill/shim, you should be able to just compare the DOM element (or iterate through each XPath result if multiple are returned and compare each individual item).

if (xPathClassDOM === xPathIDDOM) {...}

Also, .tab-header-home is not XPath but a CSS selector. You can still compare it with an XPath-retrieved element once you have grabbed the DOM element (e.g., via document.querySelector).

For example:

JavaScript:

var classBasedExpression = ".tab-header-home"; // CSS-Selector-based
var idBasedExpression = ".//*[@id='1']/div/ul/li[5]"; // XPath-based

var firstCSSSelection = document.querySelector(classBasedExpression);

var iterator = document.evaluate(idBasedExpression, document, null, XPathResult.UNORDERED_NODE_ITERATOR_TYPE, null ); 

try {
  var thisNode = iterator.iterateNext();

  while (thisNode) {
    if (firstCSSSelection === thisNode) {
      alert('found!')
      break;
    }
    thisNode = iterator.iterateNext();
  }
}
catch (e) {
    alert('error:' + e);
}

HTML:

<div id="1">
    <div>
        <ul>
            <li>a</li>
            <li>b</li>
            <li>c</li>
            <li>d</li>
            <li class="tab-header-home">e</li>
        </ul>
    </div>
</div>
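
The same identity comparison can be sketched outside the browser. The following Python version is an illustration, not part of the original answer: it assumes the markup is well-formed XML, wraps it in a hypothetical `<body>` element so the `id` lookup has a parent to search from, and uses the limited XPath subset of the standard library's ElementTree.

```python
import xml.etree.ElementTree as ET

# The HTML from above, wrapped in a <body> element (an assumption for this sketch).
markup = """
<body>
  <div id="1">
    <div>
      <ul>
        <li>a</li><li>b</li><li>c</li><li>d</li>
        <li class="tab-header-home">e</li>
      </ul>
    </div>
  </div>
</body>
"""
root = ET.fromstring(markup)

# Locate the element in two different ways.
by_class = root.find(".//li[@class='tab-header-home']")
by_position = root.find(".//*[@id='1']/div/ul/li[5]")

# Both lookups return the Element object stored in the tree, so an
# identity check tells us whether they hit the same node.
same_node = by_class is not None and by_class is by_position
print(same_node)
```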


Highlighting when HTML and XPath are given

This is what I ended up doing.

public render() {
  const htmlText = "..."; // the HTML string above (elided here)
  const doc = new DOMParser().parseFromString(htmlText, 'text/html');
  const xpathNode = doc.evaluate("/html/body/ul/li[1]/a[1]", doc, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null);
  // char_start and char_end are the character offsets of the text to highlight
  const highlightedNode = xpathNode.singleNodeValue.innerText;
  const textValuePrev = highlightedNode.slice(0, char_start);
  const textValueAfter = highlightedNode.slice(char_end, highlightedNode.length);
  // Wrap the selected range in a span so it can be styled
  xpathNode.singleNodeValue.innerHTML =
    `${textValuePrev}<span class='pt-tag'>${highlightedNode.slice(char_start, char_end)}</span>${textValueAfter}`;
  // JSX needs a single root element, so wrap both children in a div
  return (
    <div>
      <h5> Display html data </h5>
      <div dangerouslySetInnerHTML={{__html: doc.body.outerHTML}} />
    </div>
  );
}

How to access HTML DOM Property using Xpath or Css Selector in Selenium

Now I can get the <img> tags and read the URL of each picture:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.freepik.com/search?dates=any&format=search&page=1&query=Polygonal%20Human&sort=popular")
# The loaded images always carry the class "lzy landscape lazyload--done".
selector = "[class='lzy landscape lazyload--done']"
# Optional explicit wait (needs WebDriverWait and expected_conditions as EC imported):
# result = WebDriverWait(driver, 5).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, selector)))
# find_elements(By.CSS_SELECTOR, ...) replaces find_elements_by_css_selector, which Selenium 4 removed.
result = driver.find_elements(By.CSS_SELECTOR, selector)
for i in result:
    print(i.get_attribute('src'))

Result:

https://img.freepik.com/free-vector/innovative-medicine-abstract-composition-with-polygonal-wireframe-images-human-hand-carefully-holding-heart-vector-illustration_1284-30757.jpg?size=626&ext=jpg
https://img.freepik.com/free-vector/computer-generated-rendering-hand_41667-189.jpg?size=626&ext=jpg
https://img.freepik.com/free-vector/polygonal-wireframe-business-strategy-composition-with-glittering-images-human-hand-incandescent-lamp-with-text_1284-32265.jpg?size=626&ext=jpg
https://img.freepik.com/free-vector/particles-geometric-art-line-dot-engineering_31941-119.jpg?size=626&ext=jpg
........

Or get the showcase__link elements:

from selenium.webdriver.common.by import By  # if not already imported

result = driver.find_elements(By.CSS_SELECTOR, "[class='showcase__link']")
for i in result:
    print(i.get_attribute('href'), i.get_attribute('id'), i.get_attribute('data-download-file-url'))

Selenium - Able to get the list of Webelements rendered & displayed in a Web page

Getting only the web elements that are displayed on a web page does not strike me as a valid business case. A typical test case would instead verify whether a particular element (button, text, etc.) is displayed or not.

As you mentioned, the web page https://login.yahoo.com/ renders only a few web elements in the browser (1 input box, 7 links, 1 button, etc.). That is indeed what you see as an end user. What you missed is that those elements (1 input box, 7 links, 1 button, etc.) are the ones whose visible property is set to true, which is why you see them when you access the URL.

Next, when you try to get only these web elements, you end up with a large number of web elements in your collection, for two reasons:

To find a particular element or set of elements, you need locators (id, name, linkText, css, or xpath). A locator should pin down exactly the element you want, so if you use xpath or css, make sure the locator you construct identifies a unique element (or a unique set of elements) on the HTML DOM.
Not every element on the HTML DOM shows up on the website. Some elements are kept hidden so that end users cannot perform any action on them. These elements are not visible on the website, but they do exist in the DOM as hidden elements.

Conclusion:

So when you try to get only these web elements, consider identifying each element through a locator that identifies it uniquely on the HTML DOM. Once identified, you can perform any desired action (sendKeys(), click(), etc.) on it as long as it remains attached to the HTML DOM.
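
If a list of only the displayed elements is still wanted, one possible approach is to filter a collection by each element's `is_displayed()` result. The sketch below is an assumption-laden illustration: `visible_only` and `FakeElement` are hypothetical names, and the stand-in class is there so the snippet runs without a browser. `is_displayed()` itself is the method Selenium's Python WebElement exposes.

```python
class FakeElement:
    """Stand-in for a Selenium WebElement, used only for this demo."""

    def __init__(self, name, displayed):
        self.name = name
        self._displayed = displayed

    def is_displayed(self):
        return self._displayed


def visible_only(elements):
    """Keep only the elements that report themselves as displayed."""
    return [el for el in elements if el.is_displayed()]


elements = [
    FakeElement("username", True),
    FakeElement("csrf-token", False),  # hidden element present in the DOM
    FakeElement("next", True),
]
visible_names = [el.name for el in visible_only(elements)]
print(visible_names)
```

With a real driver, the input would be whatever `find_elements` returns for the chosen locator.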