Save a Web Page with Python Selenium
Unfortunately, Selenium can't do exactly what you want. You can use page_source to get the HTML, but the HTML is all you will get.
Selenium also can't interact with the native dialog that the browser opens when you choose Save As.
You can bring the dialog up as follows, but you will then need a tool such as AutoIt to finish the job:
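from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.keys import Keys

# Send Ctrl+S to open the browser's native "Save As" dialog
saveas = ActionChains(driver).key_down(Keys.CONTROL)\
    .send_keys('s').key_up(Keys.CONTROL)
saveas.perform()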
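If the HTML alone is enough, the page_source route mentioned above could look roughly like this. The helper name save_page_source and the output file name are made up for illustration; the only assumption is that the driver exposes a page_source attribute, as Selenium WebDriver objects do.

```python
from pathlib import Path


def save_page_source(driver, path, encoding="utf-8"):
    """Write the driver's current HTML to `path` and return the path."""
    target = Path(path)
    target.write_text(driver.page_source, encoding=encoding)
    return target


# Hypothetical usage with a real browser session:
#   from selenium import webdriver
#   driver = webdriver.Firefox()
#   driver.get("https://example.com")
#   save_page_source(driver, "page.html")
```

This saves only the markup, so the page will not look the same offline unless its linked resources are still reachable.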
Using Selenium, Firefox, and Python to save downloaded EPS files to disk after automated clicking of a download link
Thank you to @unutbu for helping me solve this. I just didn't understand the anatomy of a file download. I do understand a little bit better now.
I ended up installing an extension called "Live HTTP Headers" on Firefox to examine the headers sent by the server. As it turned out, the 'EPS' files were sent with a 'Content-Type' of 'application/octet-stream'.
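I used a browser extension, but the same header can probably be checked from Python as well. This is only a sketch, not what I actually did: the function name content_type_of and its opener parameter are hypothetical, and it assumes the server answers a HEAD request without needing the browser's cookies or login session.

```python
import urllib.request


def content_type_of(url, opener=urllib.request.urlopen):
    """Return the MIME type the server reports for `url` via a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    with opener(request) as response:
        # get_content_type() drops parameters such as "; charset=..."
        # and falls back to "text/plain" when the header is missing
        return response.headers.get_content_type()
```

Whatever value comes back is the one that has to appear in the neverAsk.saveToDisk preference.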
Once I added that type to the Firefox preferences as follows, the EPS files were saved to disk as expected:
profile.set_preference('browser.helperApps.neverAsk.saveToDisk',
                       'image/jpeg,image/png,'
                       'application/octet-stream')
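For context, here is a rough sketch of how that preference might sit alongside the usual download-folder settings. The helper apply_download_prefs and the directory are hypothetical; it works with anything that has a set_preference method, which includes both FirefoxProfile and the Firefox Options class in Selenium.

```python
def apply_download_prefs(prefs_target, download_dir, mime_types):
    """Set Firefox preferences so the listed MIME types download without a prompt."""
    # 2 means "use the directory given in browser.download.dir"
    prefs_target.set_preference("browser.download.folderList", 2)
    prefs_target.set_preference("browser.download.dir", download_dir)
    prefs_target.set_preference(
        "browser.helperApps.neverAsk.saveToDisk", ",".join(mime_types)
    )
    return prefs_target


# Hypothetical usage with Selenium 4:
#   from selenium import webdriver
#   from selenium.webdriver.firefox.options import Options
#   options = apply_download_prefs(
#       Options(), "/tmp/downloads",
#       ["image/jpeg", "image/png", "application/octet-stream"])
#   driver = webdriver.Firefox(options=options)
```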
How to download the PDF by using Selenium Module (FireFox) in Python 3
Apart from Tarun's solution, you can also download the file through JavaScript and store it as a blob. You can then extract the data into Python via Selenium's execute_script, as shown in this answer.
In your case:
import base64
import json
import time

# Raw string so the backslashes in the query string are kept literally
url = r'http://technical.traders.com/archive/articlefinal.asp?file=\V26\C07\131INTR.pdf'

# Fetch the file inside the browser and stash it on window as a data URL
browser.execute_script("""
    window.file_contents = null;
    var xhr = new XMLHttpRequest();
    xhr.responseType = 'blob';
    xhr.onload = function() {
        var reader = new FileReader();
        reader.onloadend = function() {
            window.file_contents = reader.result;
        };
        reader.readAsDataURL(xhr.response);
    };
    xhr.open('GET', %(download_url)s);
    xhr.send();
""".replace('\r\n', ' ').replace('\r', ' ').replace('\n', ' ') % {
    'download_url': json.dumps(url),
})
Now your data exists as a base64-encoded data URL on the window object, so you can easily extract it into Python:
time.sleep(3)  # give the request time to finish; larger files may need longer
# Drop the "data:...;base64," prefix and return only the encoded payload
downloaded_file = browser.execute_script("return (window.file_contents !== null ? window.file_contents.split(',')[1] : null);")
with open('/Users/Chetan/Desktop/dummy.pdf', 'wb') as f:
    f.write(base64.b64decode(downloaded_file))
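The fixed time.sleep(3) is a guess about how long the download takes. One possible alternative is to poll until window.file_contents is set. This is just a sketch: wait_for_file_contents, its timeout and its poll_interval are names I made up, and it assumes the XHR script above has already been run in the same browser session.

```python
import base64
import time


def wait_for_file_contents(driver, timeout=30.0, poll_interval=0.5):
    """Poll window.file_contents until it is set, then return the decoded bytes."""
    script = (
        "return (window.file_contents !== null"
        " ? window.file_contents.split(',')[1] : null);"
    )
    deadline = time.monotonic() + timeout
    while True:
        encoded = driver.execute_script(script)
        if encoded is not None:
            return base64.b64decode(encoded)
        if time.monotonic() >= deadline:
            raise TimeoutError("window.file_contents was not set in time")
        time.sleep(poll_interval)


# Hypothetical usage:
#   data = wait_for_file_contents(browser, timeout=60)
#   with open("dummy.pdf", "wb") as f:
#       f.write(data)
```

If the request fails, onload never fires and the helper raises TimeoutError instead of writing an empty file.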