How to Save/Download Pdf Embedded in Web Page Without a Pdf Filename

The common way to stream a PDF to the browser in ColdFusion (CF) is:

<cfheader name="Content-Disposition" value="attachment;filename=#PDFFileName#">
<cfcontent type="application/pdf" reset="true" variable="#toBinary(PDFinMemory)#">
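
For comparison, here is roughly what those two tags amount to on the wire, sketched as a tiny Python WSGI app. This is only an illustration of the headers involved, not the CF implementation; make_pdf_app and its arguments are names made up for the example.

```python
def make_pdf_app(pdf_bytes, file_name):
    """Build a WSGI app that serves pdf_bytes as a download named file_name."""

    def app(environ, start_response):
        headers = [
            # Tells the browser the body is a PDF, whatever the URL looks like.
            ("Content-Type", "application/pdf"),
            # "attachment" asks for a download; filename sets the suggested name.
            ("Content-Disposition",
             'attachment; filename="{}"'.format(file_name)),
            ("Content-Length", str(len(pdf_bytes))),
        ]
        start_response("200 OK", headers)
        return [pdf_bytes]

    return app
```

This is why the URL itself does not need to end in .pdf: the file name travels in the Content-Disposition header.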

Use a C# WebRequest to request the URL of the PDF. Then check the response headers for a Content-Type of 'application/pdf'. If it matches, save the binary response stream to a PDF file on disk.
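
The same check can be sketched in Python rather than C#, using only the standard library. The names is_pdf and save_if_pdf are made up for this illustration, and the sketch assumes the URL can be fetched without a login or cookies.

```python
import shutil
import urllib.request


def is_pdf(content_type):
    """Return True when a Content-Type header value denotes a PDF."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=binary" before comparing.
    return content_type.split(";")[0].strip().lower() == "application/pdf"


def save_if_pdf(url, dest_path):
    """Fetch url and write the body to dest_path only if it is a PDF."""
    with urllib.request.urlopen(url) as response:
        if not is_pdf(response.headers.get("Content-Type")):
            return False
        with open(dest_path, "wb") as out_file:
            # Stream the binary body to disk without loading it all in memory.
            shutil.copyfileobj(response, out_file)
    return True
```

Because the decision is based on the response header, it works even when the URL ends in something like .cfm instead of .pdf.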

(HTML) Download a PDF file instead of opening them in browser when clicked

There is now the HTML5 download attribute, set on the <a> element, that can handle this.

I agree, and think Sarim's answer is good (it probably should be the chosen answer if the OP ever returns). However, this answer is still the reliable way to handle it (as Yiğit Yener's answer points out and--oddly--people agree with). While the download attribute has gained support, it's still spotty:

http://caniuse.com/#feat=download

Python Download PDF Embedded in a Page

Using Selenium with a specific Chrome profile, you can download embedded PDFs with the following code:

Code:

from time import sleep

from selenium import webdriver


def download_pdf(lnk):
    options = webdriver.ChromeOptions()

    download_folder = "C:\\"

    # Make Chrome save PDFs to download_folder instead of rendering them.
    profile = {"plugins.plugins_list": [{"enabled": False,
                                         "name": "Chrome PDF Viewer"}],
               "download.default_directory": download_folder,
               "download.extensions_to_open": "",
               "plugins.always_open_pdf_externally": True}

    options.add_experimental_option("prefs", profile)

    print("Downloading file from link: {}".format(lnk))

    # "options=" replaces the deprecated "chrome_options=" keyword.
    driver = webdriver.Chrome(options=options)
    driver.get(lnk)

    # Assumes the script name is the fifth "/"-separated piece of the URL.
    filename = lnk.split("/")[4].split(".cfm")[0]
    print("File: {}".format(filename))

    # Crude fixed wait so the download can finish before the browser closes;
    # the 5 seconds is an arbitrary guess, adjust it for large files.
    sleep(5)

    print("Status: Download Complete.")
    print("Folder: {}".format(download_folder))

    driver.close()

And when I call this function:

download_pdf("http://www.equibase.com/premium/eqbPDFChartPlus.cfm?RACE=1&BorP=P&TID=ALB&CTRY=USA&DT=06/17/2002&DAY=D&STYLE=EQB")

That's the output:

>>> Downloading file from link: http://www.equibase.com/premium/eqbPDFChartPlus.cfm?RACE=1&BorP=P&TID=ALB&CTRY=USA&DT=06/17/2002&DAY=D&STYLE=EQB
>>> File: eqbPDFChartPlus
>>> Status: Download Complete.
>>> Folder: C:\

Take a look at the specific profile:

profile = {"plugins.plugins_list": [{"enabled": False,
                                     "name": "Chrome PDF Viewer"}],
           "download.default_directory": download_folder,
           "download.extensions_to_open": ""}

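It disables the Chrome PDF Viewer plugin (which embeds the PDF in the web page), sets the default download folder to the path stored in the download_folder variable, and leaves the list of file extensions that Chrome opens automatically after download empty. The function above also sets plugins.always_open_pdf_externally, which tells Chrome to download PDFs instead of opening them in the viewer.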

After that, when you open the so-called "Internal link", your webdriver will automatically download the .pdf file to download_folder.
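
Note that the function prints "Download Complete" without checking that the file has actually arrived. One way to check, sketched here under some assumptions, is to poll the download folder before closing the browser. wait_for_download is a hypothetical helper, not part of Selenium; it relies on Chrome naming in-progress downloads with a .crdownload suffix and assumes the folder holds no older PDFs.

```python
import os
import time


def wait_for_download(folder, timeout=30.0, poll_interval=0.5):
    """Return the finished PDF paths in folder, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        names = os.listdir(folder)
        # Chrome writes in-progress downloads with a ".crdownload" suffix.
        in_progress = [n for n in names if n.endswith(".crdownload")]
        pdfs = [n for n in names if n.lower().endswith(".pdf")]
        if pdfs and not in_progress:
            return sorted(os.path.join(folder, n) for n in pdfs)
        if time.monotonic() >= deadline:
            raise TimeoutError("no completed PDF appeared in " + folder)
        time.sleep(poll_interval)
```

It could be called as wait_for_download(download_folder) right after driver.get(lnk) and before driver.close().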

Set the default "save as" name for an <embed> or <iframe> that uses a Blob

Note:

This answer is outdated.

The behavior described below has changed since this was posted, and it may change again in the future.

Since this question has been asked elsewhere, with better responses, I invite you to read these instead: Can I set the filename of a PDF object displayed in Chrome?


I haven't yet found a way to do this for Chrome's default PDF plugin.

I've got something that works for Firefox though, and which will default to download.pdf in Chrome, for some odd reason...

By passing a dataURI in the form of

'data:application/pdf;headers=filename%3D' + FILE_NAME + ';base64,...'

Firefox accepts FILE_NAME as the name of your file, but Chrome doesn't...
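
To make the shape of that string concrete, here is a small Python sketch that assembles such a data URI from raw PDF bytes. pdf_data_uri is a name invented for this example, and percent-encoding the whole "filename=..." pair is an assumption based on the %3D in the pattern above; as the note says, browsers may no longer honor this at all.

```python
import base64
from urllib.parse import quote


def pdf_data_uri(pdf_bytes, file_name):
    """Build a data URI that carries a file name hint in a 'headers' parameter."""
    # "=" becomes "%3D", matching the pattern 'headers=filename%3D' + FILE_NAME.
    header = quote("filename=" + file_name, safe="")
    payload = base64.b64encode(pdf_bytes).decode("ascii")
    return "data:application/pdf;headers=" + header + ";base64," + payload
```

The result would then be used as the src of the <embed> or <iframe>.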


A plnkr to show a better download.pdf in Chrome, which doesn't like nested iframes...

And a snippet which will only work in Firefox: