Download File from Internet via R Despite The Popup

How to download a file from the Internet via R

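Setting mode = "wb" may be required so that download.file() treats the file as binary data while saving it (on Windows the default, mode = "w", writes in text mode, which can corrupt binary files such as PDFs). If I leave that argument out, I get a blank file, but this way works for me: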

url <- "http://journal.gucas.ac.cn/CN/article/downloadArticleFile.do?attachType=PDF&id=11771"
destfile <- "myfile.pdf"
download.file(url, destfile, mode="wb")
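
A blank or corrupted download is easy to miss, so it can help to check the saved file before opening it. The following is a minimal sketch, not part of the original answer: `is_pdf` is a hypothetical helper name, and it only checks that the file starts with the `%PDF` signature that PDF files begin with.

```r
# Hypothetical helper: TRUE if the file starts with the PDF signature "%PDF".
is_pdf <- function(path) {
  # A missing or very short file cannot be a valid PDF.
  if (!file.exists(path) || file.size(path) < 4) return(FALSE)
  con <- file(path, "rb")   # open in binary mode so bytes are read unchanged
  on.exit(close(con))
  header <- readBin(con, what = "raw", n = 4)
  identical(header, charToRaw("%PDF"))
}

# Possible usage after the download above (needs network access):
# download.file(url, destfile, mode = "wb")
# if (!is_pdf(destfile)) warning("download does not look like a PDF")
```

If the check fails, the server may have returned an HTML page (for example a login or popup page) instead of the document.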

Using PowerShell to save a file in IE, bypassing the download popup

You can first activate the IE window and bring it to the front using AppActivate, then use SendKeys to send the Alt+S keystroke (written "%{s}" in SendKeys notation) to save the file.

Sample code is below; change the URL and element selector to your own:

[void] [System.Reflection.Assembly]::LoadWithPartialName("System.Windows.Forms")
[void] [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.VisualBasic")

$ie = New-Object -ComObject 'internetExplorer.Application'
$ie.Visible=$true
$ie.Navigate("https://www.example.com/download.html") #change it to your own url
while($ie.ReadyState -ne 4 -or $ie.Busy) {Start-Sleep -m 100}
$link=$ie.Document.getElementById("btnDowload") #change it to your own selector
$link.click()

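Start-Sleep -Seconds 5 #give the download prompt time to appear
$ieProc = Get-Process | Where-Object { $_.MainWindowHandle -eq $ie.HWND }
[Microsoft.VisualBasic.Interaction]::AppActivate($ieProc.Id) #bring the IE window to the front
[System.Windows.Forms.SendKeys]::SendWait("%{s}") #"%" is Alt in SendKeys notation, so this sends Alt+S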

Download a file from HTTPS using download.file()

It might be easiest to try the RCurl package. Install the package and try the following:

# install.packages("RCurl")
library(RCurl)
URL <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv"
x <- getURL(URL)
## Or
## x <- getURL(URL, ssl.verifypeer = FALSE)
out <- read.csv(textConnection(x))
head(out[1:6])
#   RT SERIALNO DIVISION PUMA REGION ST
# 1  H      186        8  700      4 16
# 2  H      306        8  700      4 16
# 3  H      395        8  100      4 16
# 4  H      506        8  700      4 16
# 5  H      835        8  800      4 16
# 6  H      989        8  700      4 16
dim(out)
# [1] 6496 188
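
Since getURL() returns the response body as one character string, read.csv() can also parse it through its `text` argument instead of an explicit textConnection(). A minimal sketch, assuming the body is plain CSV; the sample string below is made up and only stands in for the real download:

```r
# Made-up sample standing in for the string returned by getURL(URL).
x <- "RT,SERIALNO,DIVISION\nH,186,8\nH,306,8\n"

# read.csv() parses the string directly through its `text` argument.
out <- read.csv(text = x, stringsAsFactors = FALSE)
dim(out)
```

This behaves the same way as the textConnection() call above, without leaving an open connection behind.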

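# Alternatively, when R is built with libcurl support, download.file() can fetch the HTTPS URL directly:
download.file("https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv", destfile = "reviews.csv", method = "libcurl")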

