Importing an Excel File from a URL Using read.xls

Read Excel file from a URL using the readxl package

This works for me on Windows:

library(readxl)
library(httr)
packageVersion("readxl")
# [1] ‘0.1.1’

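# url1 holds the URL of the .xls file; it is not defined in this snippet
tf <- tempfile(fileext = ".xls")
GET(url1, write_disk(tf))   # download the file to a temporary path
df <- read_excel(tf, 2L)    # read the second sheet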
str(df)
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 20131 obs. of 8 variables:
# $ Code : chr "C115388" "C115800" "C115801" "C115802" ...
# $ Codelist Code : chr NA "C115388" "C115388" "C115388" ...
# $ Codelist Extensible (Yes/No): chr "No" NA NA NA ...
# $ Codelist Name : chr "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" ...
# $ CDISC Submission Value : chr "SIXMW1TC" "SIXMW101" "SIXMW102" "SIXMW103" ...
# $ CDISC Synonym(s) : chr "6 Minute Walk Functional Test Test Code" "SIXMW1-Distance at 1 Minute" "SIXMW1-Distance at 2 Minutes" "SIXMW1-Distance at 3 Minutes" ...
# $ CDISC Definition : chr "6 Minute Walk Test test code." "6 Minute Walk Test - Distance at 1 minute." "6 Minute Walk Test - Distance at 2 minutes." "6 Minute Walk Test - Distance at 3 minutes." ...
# $ NCI Preferred Term : chr "CDISC Functional Test 6MWT Test Code Terminology" "6MWT - Distance at 1 Minute" "6MWT - Distance at 2 Minutes" "6MWT - Distance at 3 Minutes" ...

Importing Excel file using url using read.xls

@G.G is correct that read.xls does not support https. However, if you simply replace https with http in the URL, you should be able to download the file, provided the server also serves it over plain http.

Give this a try:

require(RCurl)
require(gdata)
url <- "http://dl.dropboxusercontent.com/u/27644144/NADAC%2020140101.xls"
test <- read.xls(url)

R Importing excel file directly from web

Without knowing why the gdata package does not work for you, I have to guess. Make sure you have Perl installed, since read.xls depends on it. You can download it at http://www.activestate.com/activeperl

This works for me:

library('gdata')

## URL broken into multiple lines for readability
url <- paste("https://quotespeed.morningstar.com/exportChartDataToExcel.",
             "jsp?tickers=AAPL&symbols=126.1.AAPL&st=1980-12-1&ed=2015-",
             "6-8&f=m&dty=1&types=1&ver=1.6.0&qs_wsid=E43474CC03753FE0E",
             "777D89877788ECB", sep = "")
## read.xls cannot fetch https, so switch the scheme to http
url <- gsub("https", "http", url)
data <- read.xls(url, perl = "C:/Perl64/bin/perl.exe")

Without the perl = "path_to_perl.exe" argument I got this error:

Error in findPerl(verbose = verbose) : 
perl executable not found. Use perl= argument to specify the correct path.
Error in file.exists(tfn) : invalid 'file' argument

Python: how to import an Excel file from the web?

Your file isn't a CSV or an Excel file. Its actual contents are an HTML table, shown below.

Exchange in {0}, Import(+)/Export(-)
<html>
<body>
<table>
<thead>
<tr>
<td colspan="5">Exchange EE connections in MWh, MW</td>
</tr><tr>
<td colspan="5">Data was last updated 06-01-2021</td>
</tr><tr>
<td></td><td style="text-align:center;">EE net exchange</td><td style="text-align:center;">EE - FI</td><td style="text-align:center;">EE - LV</td><td style="text-align:center;">EE - RU</td>
</tr>
</thead><tbody>
<tr>
<td style="text-align:left;">01-01-2021</td><td style="text-align:right;">14575</td><td style="text-align:right;">20969,0</td><td style="text-align:right;">-4884,0</td><td style="text-align:right;">-1510,0</td>
</tr><tr>
<td style="text-align:left;">02-01-2021</td><td style="text-align:right;">12073</td><td style="text-align:right;">22479,0</td><td style="text-align:right;">-8001,0</td><td style="text-align:right;">-2405,0</td>
</tr><tr>
<td style="text-align:left;">03-01-2021</td><td style="text-align:right;">14321</td><td style="text-align:right;">22540,0</td><td style="text-align:right;">-8259,0</td><td style="text-align:right;">40,0</td>
</tr><tr>
<td style="text-align:left;">04-01-2021</td><td style="text-align:right;">14662</td><td style="text-align:right;">17653,0</td><td style="text-align:right;">-5829,0</td><td style="text-align:right;">2838,0</td>
</tr><tr>
<td style="text-align:left;">05-01-2021</td><td style="text-align:right;">13570</td><td style="text-align:right;">13779,0</td><td style="text-align:right;">-5314,0</td><td style="text-align:right;">5105,0</td>
</tr><tr>
<td style="text-align:left;">06-01-2021</td><td style="text-align:right;">6243</td><td style="text-align:right;"></td><td style="text-align:right;"></td><td style="text-align:right;"></td>
</tr>
</tbody>
</table>
</body>
</html>

Use pd.read_html like so:

import pandas as pd

url = 'https://www.nordpoolgroup.com/48d3ac/globalassets/marketdata-excel-files/exchange-ee-connections_2021_daily.xls'
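# The table uses decimal commas (e.g. 20969,0). With the default thousands=","
# pandas would strip those commas, so set both separators explicitly.
dfs = pd.read_html(url, decimal=",", thousands=".")
df = dfs[0]  # read_html returns a list of DataFrames, one per table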

You can open the file in Excel because Excel tries possible formats until it finds one that works. For example, you can create a tab-separated values file (which should have the extension .tsv), rename it to end in .xls, and although it isn't in the actual (horrible) XLS spreadsheet format, Excel will still open it normally. It does the same with HTML data.
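One way to find out what a downloaded file really is, before choosing a reader, is to inspect its first bytes. The following is a minimal sketch under stated assumptions: the helper name sniff_spreadsheet_format is hypothetical, it relies on the container signatures for legacy XLS (an OLE2 compound file) and XLSX (a ZIP archive), and it treats anything containing an <html or <table tag near the start as HTML. Note that the ZIP signature matches any ZIP archive, not only XLSX.

```python
# Signatures of the containers that real Excel files are stored in.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy .xls (OLE2 compound file)
ZIP_MAGIC = b"PK\x03\x04"                        # .xlsx (ZIP archive)


def sniff_spreadsheet_format(head: bytes) -> str:
    """Guess the real format from the first bytes of a file (hypothetical helper)."""
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    # HTML exports may carry a text line before the markup, so search
    # the whole head rather than only its first character.
    lowered = head.lower()
    if b"<html" in lowered or b"<table" in lowered:
        return "html"
    return "other"  # e.g. CSV/TSV or something unrecognised


if __name__ == "__main__":
    sample = b"Exchange in {0}, Import(+)/Export(-)\n<html>\n<body>\n<table>"
    print(sniff_spreadsheet_format(sample))
```

A result of "html" suggests pd.read_html, while "xls" or "xlsx" suggests pd.read_excel. In practice you would pass in the first kilobyte or so of the downloaded content.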

Read xls file from a URL in python

Use this example to download the Excel file from Google Drive (the fileid is the ID after the /d/ part in your URL):

import pandas as pd

fileid = "16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV"

# Build the direct-download URL and skip the 17 rows above the data
df = pd.read_excel(
    "https://drive.google.com/uc?export=download&id={fileid}".format(
        fileid=fileid
    ),
    skiprows=17,
)
print(df)

Prints:

   Unnamed: 0      Unnamed: 1      Unnamed: 2  Petajoules  Gigajoules           %
0         NaN     Afghanistan     Afghanistan         321          10   78.669280
1         NaN         Albania         Albania         102          35  100.000000
2         NaN         Algeria         Algeria        1959          51    0.551010
3         NaN  American Samoa  American Samoa         ...         ...    0.641026
4         NaN         Andorra         Andorra           9         121   88.695650
5         NaN          Angola          Angola         642          27   70.909090

...and so on.
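If you only have the sharing link, the file ID can be pulled out of it programmatically. This is a rough sketch, not an official Google API: the helper name drive_download_url is hypothetical, and the example link assumes the common https://drive.google.com/file/d/<fileid>/view shape around the file ID used above.

```python
import re

# Matches the ID segment that follows /d/ in a Google Drive sharing link.
_DRIVE_ID = re.compile(r"/d/([A-Za-z0-9_-]+)")


def drive_download_url(share_url: str) -> str:
    """Build a direct-download URL from a sharing link (hypothetical helper)."""
    match = _DRIVE_ID.search(share_url)
    if match is None:
        raise ValueError("no /d/<fileid> segment found in URL")
    return "https://drive.google.com/uc?export=download&id={}".format(
        match.group(1)
    )


if __name__ == "__main__":
    # Assumed link shape; only the part after /d/ matters here.
    link = "https://drive.google.com/file/d/16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV/view"
    print(drive_download_url(link))
```

The returned string can be passed straight to pd.read_excel in place of the hand-built URL.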

Using Pandas to read excel from url

Use the link to the raw Excel file (note the ?raw=true at the end) rather than the GitHub page that displays it:

import pandas as pd

url = 'https://github.com/owid/covid-19-data/blob/master/public/data/owid-covid-data.xlsx?raw=true'
df = pd.read_excel(url, sheet_name='Sheet1')
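If you start from a plain GitHub blob link, the raw=true parameter can be added without string concatenation. The sketch below uses only the standard library; the helper name with_raw_param is hypothetical and it makes no network request.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_raw_param(url: str) -> str:
    """Return url with raw=true set in its query string (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep any existing query parameters and set (or overwrite) raw.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["raw"] = "true"
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    blob = ("https://github.com/owid/covid-19-data/blob/master/"
            "public/data/owid-covid-data.xlsx")
    print(with_raw_param(blob))
```

Calling it on a URL that already ends in ?raw=true leaves the URL unchanged, so it is safe to apply unconditionally.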

