How can I render JavaScript HTML to HTML in python?
You can run pip install selenium from a command line, and then run something like:
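from urllib.request import urlopen  # Python 3; urllib2 exists only in Python 2
from selenium import webdriver

url = 'http://www.google.com'
# Use an .html extension so the browser renders the file as HTML, not plain text
file_name = 'C:/Users/Desktop/test.html'

# Download the raw page bytes
with urlopen(url) as conn:
    data = conn.read()

# Save the bytes unchanged (binary mode avoids encoding problems)
with open(file_name, 'wb') as f:
    f.write(data)

# Load the saved file in a real browser so its JavaScript runs
browser = webdriver.Firefox()
browser.get('file:///' + file_name)
html = browser.page_source  # the HTML after the browser has rendered it
browser.quit()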
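Gluing 'file:///' onto a path by hand works for a simple Windows path, but it is easy to get wrong with spaces or on other platforms. Here is a minimal sketch, assuming Python 3, that lets pathlib build the URL instead. The helper name save_html_for_browser and the file name page.html are made up for illustration.

```python
import tempfile
from pathlib import Path


def save_html_for_browser(data: bytes, directory: str) -> str:
    """Write raw page bytes to an .html file and return a file:// URL for it."""
    path = Path(directory) / "page.html"  # hypothetical file name
    path.write_bytes(data)                # bytes in, bytes out: no encoding guesswork
    return path.resolve().as_uri()        # absolute path, percent-encoded where needed


# Demo with a throwaway directory. In real use, keep the directory around
# until the browser has finished loading the file.
with tempfile.TemporaryDirectory() as tmp:
    file_url = save_html_for_browser(b"<html><body>hello</body></html>", tmp)
    print(file_url)
```

The returned string can be passed straight to browser.get() in place of the hand-built URL.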
How to render HTML in python?
I fixed this by using the tkinterweb library.
Code:
import tkinter
from tkinterweb import HtmlFrame

screen = tkinter.Tk()
screen.geometry("700x700")

# HTML view with an automatic horizontal scrollbar
frame = HtmlFrame(screen, horizontal_scrollbar="auto")
urlInput = tkinter.Entry(screen)

def search():
    # Load whatever URL is currently typed into the entry box
    frame.load_website(urlInput.get())

button = tkinter.Button(screen, text="search", command=search)

urlInput.grid(row=0, column=0, columnspan=2)
button.grid(row=1, column=0)
frame.grid(row=2, column=0)
screen.mainloop()
This is for anyone who wants to know how I solved it.
Trouble getting the trade-price using Requests-HTML library
You have several errors. The first is a 'navigation' timeout, showing that the page didn’t complete rendering:
Exception in callback NavigatorWatcher.waitForNavigation.<locals>.watchdog_cb(<Task finishe...> result=None>) at C:\Users\ar\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pyppeteer\navigator_watcher.py:49
handle: <Handle NavigatorWatcher.waitForNavigation.<locals>.watchdog_cb(<Task finishe...> result=None>) at C:\Users\ar\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pyppeteer\navigator_watcher.py:49>
Traceback (most recent call last):
  File "C:\Users\ar\AppData\Local\Programs\Python\Python36-32\lib\asyncio\events.py", line 145, in _run
    self._callback(*self._args)
  File "C:\Users\ar\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pyppeteer\navigator_watcher.py", line 52, in watchdog_cb
    self._timeout)
  File "C:\Users\ar\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pyppeteer\navigator_watcher.py", line 40, in _raise_error
    raise error
concurrent.futures._base.TimeoutError: Navigation Timeout Exceeded: 3000 ms exceeded
This traceback is not raised in the main thread, so your code was not aborted because of it. Your page may or may not be complete; you may want to set a longer timeout or introduce a sleep cycle so the browser has time to process AJAX responses.
Next, the response.html.render() method returns None. It loads the HTML into a headless Chromium browser, leaves JavaScript rendering to that browser, then copies the page HTML back into the response.html data structure in place, so nothing needs to be returned. That means js is set to None, not a new HTML instance, causing your next traceback.
Use the existing response.html object to search, after rendering:
r.html.render()
item = r.html.find('.MarketInfo_market-num_1lAXs', first=True)
There is most likely no such CSS class, because the last 5 characters are generated on each page render, after JSON data is loaded over AJAX. This makes it hard to use CSS to find the element in question.
Moreover, I found that without a sleep cycle, the browser has no time to fetch AJAX resources and render the information you wanted to load. Give it, say, 10 seconds of sleep to do some work before copying back the HTML. Set a longer timeout (the default is 8 seconds) if you see network timeouts:
r.html.render(timeout=10, sleep=10)
You could set the timeout to 0 too, to remove the timeout and just wait indefinitely until the page has loaded.
Hopefully a future API update also provides features to wait for network activity to cease.
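Until something like that exists, one workaround is to retry the render with progressively longer sleeps and stop as soon as the class name shows up. This is only a sketch under assumptions: the function name render_until_found and the sleep values are made up, and it relies only on the render(sleep=..., timeout=...) and search_all(template) calls already used in this answer.

```python
def render_until_found(html, template, sleeps=(2, 5, 10), timeout=20):
    """Re-render with longer and longer sleeps until `template` matches.

    `html` is expected to behave like the response.html object:
    render() updates it in place and returns None, and
    search_all(template) returns the matches found in the markup.
    Returns the list of matches, or an empty list if nothing was found.
    """
    for sleep in sleeps:
        html.render(sleep=sleep, timeout=timeout)   # in place, returns None
        matches = list(html.search_all(template))
        if matches:
            return matches
    return []


# Hypothetical usage (needs requests-html installed and a real URL):
# from requests_html import HTMLSession
# r = HTMLSession().get(url)
# matches = render_until_found(r.html, 'MarketInfo_market-num_{:w}')
```

Short sleeps are tried first so that fast pages are not penalised; the worst case is the sum of all sleeps plus the page load times.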
You can use the included parse library to find the matching CSS classes:
# search for CSS suffixes
suffixes = [match[0] for match in r.html.search_all('MarketInfo_market-num_{:w}')]
for suffix in suffixes:
    # for each suffix, find all matching elements with that class
    items = r.html.find('.MarketInfo_market-num_{}'.format(suffix))
    for item in items:
        print(item.text)
Now we get output:
169.81 EUR
+
1.01 %
18,420 LTC
169.81 EUR
+
1.01 %
18,420 LTC
169.81 EUR
+
1.01 %
18,420 LTC
169.81 EUR
+
1.01 %
18,420 LTC
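The same values repeat in the output above, which suggests the same suffix was matched more than once in the page source (that is my inference, not something confirmed here). Deduplicating the suffixes before looping would avoid this. The sketch below shows the idea with the standard re module standing in for parse, where \w+ plays the role of {:w}. The sample_html string is invented for illustration.

```python
import re

# Invented markup: the same generated class name appears on several elements.
sample_html = (
    '<div class="MarketInfo_market-num_1lAXs">169.81 EUR</div>'
    '<span class="MarketInfo_market-num_1lAXs">1.01 %</span>'
    '<span class="MarketInfo_market-num_1lAXs">18,420 LTC</span>'
)

# Every occurrence of the prefix yields one match, so duplicates are expected.
all_suffixes = re.findall(r'MarketInfo_market-num_(\w+)', sample_html)

# Keep each suffix once; sorted() just makes the order predictable.
suffixes = sorted(set(all_suffixes))
print(suffixes)
```

With the parse-based code, the equivalent change is to wrap the list of suffixes in set() before the for loop.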
Your last traceback shows that the Chromium user data path could not be cleaned up. The underlying Pyppeteer library configures the headless Chromium browser with a temporary user data path, and in your case the directory contains some still-locked resource. You can ignore the error, although you may want to try and remove any remaining files in the .pyppeteer folder at a later time.
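If you want to script that cleanup, something like the following could work once no browser process is holding the files. It is a sketch only: the function name is made up, and the exact location of the leftover profile directories depends on your Pyppeteer version, so the path in the usage comment is an assumption you should check on your machine.

```python
import shutil
from pathlib import Path


def remove_leftover_profiles(profile_root):
    """Delete every subdirectory of profile_root.

    Returns the names of directories that could not be removed
    (for example because a file inside is still locked).
    """
    root = Path(profile_root)
    failed = []
    if not root.is_dir():
        return failed
    for child in root.iterdir():
        if child.is_dir():
            shutil.rmtree(child, ignore_errors=True)  # do not stop on locked files
            if child.exists():
                failed.append(child.name)
    return failed


# Hypothetical usage; the location below is an assumption, verify it first:
# leftovers = remove_leftover_profiles(Path.home() / ".pyppeteer" / ".dev_profile")
# print("still locked:", leftovers)
```

Anything reported as still locked can be retried after closing any stray Chromium processes.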