Python Selenium Accessing HTML Source

Get HTML source of WebElement in Selenium WebDriver using Python

Read the element's innerHTML attribute to get the markup of its contents, or outerHTML to get the markup including the element itself.

Python:

element.get_attribute('innerHTML')

Java:

elem.getAttribute("innerHTML");

C#:

element.GetAttribute("innerHTML");

Ruby:

element.attribute("innerHTML")

JavaScript:

element.getAttribute('innerHTML');

PHP:

$element->getAttribute('innerHTML');

This was tested with ChromeDriver and worked.
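To make the difference between the two attributes concrete, here is a minimal Python sketch. The helper name element_html and its include_self flag are made up for illustration; the only Selenium call it relies on is get_attribute, as shown above.

```python
def element_html(element, include_self=False):
    """Return an element's markup as a string.

    `element` is expected to be a Selenium WebElement (any object with a
    compatible get_attribute method works). The function name and the
    include_self flag are illustrative, not part of Selenium.
    """
    # outerHTML includes the element's own tag; innerHTML only its children.
    attribute = "outerHTML" if include_self else "innerHTML"
    return element.get_attribute(attribute)


# Usage with a live session (needs a browser and a matching driver):
#   from selenium import webdriver
#   from selenium.webdriver.common.by import By
#   driver = webdriver.Chrome()
#   driver.get("my_URL")                           # placeholder URL
#   body = driver.find_element(By.TAG_NAME, "body")
#   print(element_html(body))                      # children of <body> only
#   print(element_html(body, include_self=True))   # <body ...>...</body>
#   driver.quit()
```

For an element such as `<p><b>hi</b></p>`, the first call would give the `<b>` markup only, while the second would include the surrounding `<p>` tag.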

Get html of inspect element source with selenium

It seems to work after a short delay. If I were you, I would experiment with the delay time.

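import time

from selenium import webdriver
from selenium.webdriver.common.by import By

browser = webdriver.Chrome()

browser.get('http://bijsluiters.fagg-afmps.be/?localeValue=nl')
# Selenium 4 removed the find_element_by_* helpers; use find_element(By.<strategy>, value)
searchform = browser.find_element(By.CLASS_NAME, 'iceInpTxt')
searchform.send_keys('cefuroxim')
browser.find_element(By.CLASS_NAME, 'iceCmdBtn').click()

# Crude fixed delay so the JavaScript-rendered results can appear; tune as needed
time.sleep(10)

element = browser.find_element(By.CLASS_NAME, 'contentContainer')
html = element.get_attribute('innerHTML')
browser.quit()  # end the session and shut down the driver process
print(html)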

Addition: a nicer approach is to let the script proceed as soon as a specific element is available, since JavaScript may take a while to add it to the DOM. In your example, the element to wait for appears to be the table with id iceDatTbl (as far as I could tell from a quick look).

How can I get the code from the View source on the site page using BS4 or another library?

Solution:

import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")  # run Chrome without a visible window
driver = webdriver.Chrome(options=chrome_options)
driver.get("my_URL")  # placeholder: replace with the target URL

time.sleep(10)  # give the page's JavaScript time to finish

html_source = driver.page_source
driver.quit()

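The headless option launches the browser without showing a window. The pause gives the page's JavaScript time to run; without it, the data we need may not have loaded yet. Note that page_source typically returns the DOM as it stands after scripts have run, so the result is usually closer to what "Inspect element" shows than to the raw "View source" output, which is what you want when the data is rendered by JavaScript.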
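Since the question mentions BS4 "or another library", here is a small sketch of what to do with the string once you have it. It uses html.parser from the standard library so it runs without extra packages; with BeautifulSoup the usual entry point would be BeautifulSoup(html_source, "html.parser"). The names LinkCollector and extract_links, and the sample markup, are invented for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def extract_links(html_source):
    """Return all link targets found in an HTML string, in document order."""
    collector = LinkCollector()
    collector.feed(html_source)
    return collector.links


# With Selenium this would be called as extract_links(driver.page_source).
sample = '<html><body><a href="/a">A</a> <a href="/b">B</a></body></html>'
print(extract_links(sample))  # ['/a', '/b']
```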

How to access particular text in web page source using selenium python?

In [41]: from selenium import webdriver
...: from selenium.webdriver.chrome.service import Service
...: from selenium.webdriver.common.by import By
...:
...: # Selenium 4: the driver path goes through Service, and find_elements_by_* is gone
...: driver = webdriver.Chrome(service=Service(executable_path="/Users/bigbounty/Downloads/chromedriver"))
...: html_content = """
...: <html>
...: <head></head>
...: <body>
...: <div class="dropdown-popup">
...: <a href="/integrations/sources/" class="dropdown-item">Sources</a>
...: <a href="/integrations/destinations/" class="dropdown-item">Destinations</a>
...: <a href="/integrations/analysis-tools/" class="dropdown-item">Analysis</a>
...: </div>
...: </body>
...: </html>"""

In [42]: driver.get("data:text/html;charset=utf-8,{html_content}".format(html_content=html_content))

In [43]: driver.find_elements(By.CSS_SELECTOR, "div.dropdown-popup")
Out[43]: [<selenium.webdriver.remote.webelement.WebElement (session="3dd84e3bc612314090a4105a1658c6fc", element="e92de33a-b219-45da-bc17-0fa06533cc06")>]

In [44]: tag_list = [i.text.strip() for i in driver.find_elements(By.CSS_SELECTOR, "div.dropdown-popup")[0].find_elements(By.TAG_NAME, 'a')]

In [45]: tag_list
Out[45]: ['Sources', 'Destinations', 'Analysis']
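One caveat worth knowing: WebElement.text only returns text that is rendered as visible, so links inside a collapsed dropdown can come back as empty strings on a real page. A possible workaround, sketched here under that assumption, is to fall back to the textContent property. The helper name element_text is invented for illustration.

```python
def element_text(element):
    """Return an element's text, falling back to textContent when .text is empty.

    `element` is expected to be a Selenium WebElement. The helper name is
    illustrative, not part of Selenium.
    """
    visible = element.text.strip()
    if visible:
        return visible
    # textContent is read from the DOM regardless of visibility.
    raw = element.get_attribute("textContent")
    return (raw or "").strip()


# Usage with a session like the one above:
#   from selenium.webdriver.common.by import By
#   popup = driver.find_element(By.CSS_SELECTOR, "div.dropdown-popup")
#   tag_list = [element_text(a) for a in popup.find_elements(By.TAG_NAME, "a")]
```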

How to get page html code using selenium?

To get the entire page source, read driver.page_source after loading the page:

driver.get('https://mangalib.me/manga-list')
html = driver.page_source

You can then process the string however you like, for example by parsing it or saving it to a file.
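As one example of what to do with the result, the sketch below writes the source to disk. The helper name save_page_source and the file name in the usage comment are made up; the only thing it reads from the driver is the page_source attribute shown above.

```python
from pathlib import Path


def save_page_source(driver, path):
    """Write driver.page_source to `path` as UTF-8 and return the Path."""
    target = Path(path)
    # An explicit encoding avoids platform-dependent defaults for non-ASCII pages.
    target.write_text(driver.page_source, encoding="utf-8")
    return target


# Usage after driver.get(...):
#   save_page_source(driver, "manga-list.html")
```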

How to get the entire web page source using Selenium WebDriver in python

Your WebDriver object should have a page_source attribute, so for Firefox it would look like this:

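from selenium import webdriver

driver = webdriver.Firefox()
# Load a page with driver.get(...) first; page_source reflects the current page.
html = driver.page_source
driver.quit()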

