Read Excel file from a URL using the readxl package
This works for me on Windows:
library(readxl)
library(httr)
packageVersion("readxl")
# [1] ‘0.1.1’
# url1 must hold the URL of the .xls file; it is not defined in this snippet
GET(url1, write_disk(tf <- tempfile(fileext = ".xls")))
df <- read_excel(tf, 2L)
str(df)
# Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 20131 obs. of 8 variables:
# $ Code : chr "C115388" "C115800" "C115801" "C115802" ...
# $ Codelist Code : chr NA "C115388" "C115388" "C115388" ...
# $ Codelist Extensible (Yes/No): chr "No" NA NA NA ...
# $ Codelist Name : chr "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" "6 Minute Walk Functional Test Test Code" ...
# $ CDISC Submission Value : chr "SIXMW1TC" "SIXMW101" "SIXMW102" "SIXMW103" ...
# $ CDISC Synonym(s) : chr "6 Minute Walk Functional Test Test Code" "SIXMW1-Distance at 1 Minute" "SIXMW1-Distance at 2 Minutes" "SIXMW1-Distance at 3 Minutes" ...
# $ CDISC Definition : chr "6 Minute Walk Test test code." "6 Minute Walk Test - Distance at 1 minute." "6 Minute Walk Test - Distance at 2 minutes." "6 Minute Walk Test - Distance at 3 minutes." ...
# $ NCI Preferred Term : chr "CDISC Functional Test 6MWT Test Code Terminology" "6MWT - Distance at 1 Minute" "6MWT - Distance at 2 Minutes" "6MWT - Distance at 3 Minutes" ...
Importing Excel file using url using read.xls
@G.G is correct that read.xls does not support https. However, if you simply replace the https with http in the URL, you should be able to download the file.
Give this a try:
require(RCurl)
require(gdata)
url <- "http://dl.dropboxusercontent.com/u/27644144/NADAC%2020140101.xls"
test <- read.xls(url)
R Importing excel file directly from web
Without knowing why the gdata package does not work for you, I have to guess. Make sure you have Perl installed; you can download it at http://www.activestate.com/activeperl
This works for me:
library('gdata')
## URL broken into multiple lines for readability
url <- paste("https://quotespeed.morningstar.com/exportChartDataToExcel.",
"jsp?tickers=AAPL&symbols=126.1.AAPL&st=1980-12-1&ed=2015-",
"6-8&f=m&dty=1&types=1&ver=1.6.0&qs_wsid=E43474CC03753FE0E",
"777D89877788ECB", sep = "")
## read.xls cannot fetch https, so switch only the scheme at the start of the URL
url <- sub("^https", "http", url)
data <- read.xls(url, perl = "C:/Perl64/bin/perl.exe")
Without perl = "path_to_perl.exe" I got the error:
Error in findPerl(verbose = verbose) :
perl executable not found. Use perl= argument to specify the correct path.
Error in file.exists(tfn) : invalid 'file' argument
Python: how import excel file from the web?
Your file isn't a CSV or an Excel file. Its actual contents are an HTML table, shown below.
Exchange in {0}, Import(+)/Export(-)
<html>
<body>
<table>
<thead>
<tr>
<td colspan="5">Exchange EE connections in MWh, MW</td>
</tr><tr>
<td colspan="5">Data was last updated 06-01-2021</td>
</tr><tr>
<td></td><td style="text-align:center;">EE net exchange</td><td style="text-align:center;">EE - FI</td><td style="text-align:center;">EE - LV</td><td style="text-align:center;">EE - RU</td>
</tr>
</thead><tbody>
<tr>
<td style="text-align:left;">01-01-2021</td><td style="text-align:right;">14575</td><td style="text-align:right;">20969,0</td><td style="text-align:right;">-4884,0</td><td style="text-align:right;">-1510,0</td>
</tr><tr>
<td style="text-align:left;">02-01-2021</td><td style="text-align:right;">12073</td><td style="text-align:right;">22479,0</td><td style="text-align:right;">-8001,0</td><td style="text-align:right;">-2405,0</td>
</tr><tr>
<td style="text-align:left;">03-01-2021</td><td style="text-align:right;">14321</td><td style="text-align:right;">22540,0</td><td style="text-align:right;">-8259,0</td><td style="text-align:right;">40,0</td>
</tr><tr>
<td style="text-align:left;">04-01-2021</td><td style="text-align:right;">14662</td><td style="text-align:right;">17653,0</td><td style="text-align:right;">-5829,0</td><td style="text-align:right;">2838,0</td>
</tr><tr>
<td style="text-align:left;">05-01-2021</td><td style="text-align:right;">13570</td><td style="text-align:right;">13779,0</td><td style="text-align:right;">-5314,0</td><td style="text-align:right;">5105,0</td>
</tr><tr>
<td style="text-align:left;">06-01-2021</td><td style="text-align:right;">6243</td><td style="text-align:right;"></td><td style="text-align:right;"></td><td style="text-align:right;"></td>
</tr>
</tbody>
</table>
</body>
</html>
Use pd.read_html like so:
import pandas as pd
url = 'https://www.nordpoolgroup.com/48d3ac/globalassets/marketdata-excel-files/exchange-ee-connections_2021_daily.xls'
dfs = pd.read_html(url)
df = dfs[0]
You can open the file in Excel because Excel tries possible formats in turn until it finds one that works. For example, you can create a tab-separated values file (which should have the extension .tsv), rename it with an .xls extension, and Excel will still open it normally even though it isn't in the actual (horrible) XLS spreadsheet format. It does the same with HTML data.
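To find out what a downloaded "Excel" file really contains before choosing a reader, you can look at its first bytes. The following is a minimal sketch, not part of the original answer: the function name sniff_format is hypothetical, and the check only covers the three cases discussed here (binary XLS, ZIP-based XLSX, and HTML).

```python
# Leading bytes ("magic numbers") of the container formats involved.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy binary .xls (OLE2 container)
ZIP_MAGIC = b"PK\x03\x04"                        # .xlsx is a ZIP archive


def sniff_format(head: bytes) -> str:
    """Guess the real format from the first bytes of a file (hypothetical helper)."""
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        # Any ZIP archive matches here, not only .xlsx.
        return "xlsx"
    # HTML exports may carry a title line before the markup, so search
    # the whole sample instead of only the very first characters.
    lowered = head.lower()
    if b"<html" in lowered or b"<table" in lowered:
        return "html"
    return "unknown"


sample = b"Exchange in {0}, Import(+)/Export(-)\n<html>\n<body>\n<table>"
print(sniff_format(sample))
```

If the result is "html", pd.read_html is the appropriate reader; for "xls" or "xlsx", use pd.read_excel. Passing the first few hundred bytes of the downloaded content is enough for this check.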
Read xls file from a URL in python
Use this example to download the Excel file from Google Drive (the fileid is the ID after the /d/ part in your URL):
import pandas as pd

fileid = "16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV"
df = pd.read_excel(
"https://drive.google.com/uc?export=download&id={fileid}".format(
fileid=fileid
),
skiprows=17,
)
print(df)
Prints:
Unnamed: 0 Unnamed: 1 Unnamed: 2 Petajoules Gigajoules %
0 NaN Afghanistan Afghanistan 321 10 78.669280
1 NaN Albania Albania 102 35 100.000000
2 NaN Algeria Algeria 1959 51 0.551010
3 NaN American Samoa American Samoa ... ... 0.641026
4 NaN Andorra Andorra 9 121 88.695650
5 NaN Angola Angola 642 27 70.909090
...and so on.
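If you only have the share link, the ID can be pulled out of it programmatically. This is a sketch under assumptions: the helper name drive_download_url is hypothetical, the share link is assumed to contain a /d/<id> segment as described above, and the ID is assumed to consist of letters, digits, hyphens and underscores.

```python
import re


def drive_download_url(share_url: str) -> str:
    """Build a direct-download URL from a Google Drive share link (hypothetical helper)."""
    # Assumption: the file ID is the path segment right after "/d/".
    match = re.search(r"/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError("no /d/<id> segment found in URL")
    return "https://drive.google.com/uc?export=download&id=" + match.group(1)


# Example share link shape (assumed for illustration).
share = "https://drive.google.com/file/d/16cp23cJxeyUfnBHMp-sNCuFNQxe8cqOV/view"
print(drive_download_url(share))
```

The returned string can be passed to pd.read_excel in place of the hand-built URL.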
Using Pandas to read excel from url
Try the link to the raw Excel file:
import pandas as pd
url = 'https://github.com/owid/covid-19-data/blob/master/public/data/owid-covid-data.xlsx?raw=true'
df = pd.read_excel(url, sheet_name='Sheet1')
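The difference from the regular GitHub page link is the raw=true query parameter, which makes the URL serve the file itself instead of the HTML page around it. A small sketch for adding that parameter to an existing link follows; the helper name with_raw_flag is hypothetical and not part of pandas or any GitHub library.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_raw_flag(url: str) -> str:
    """Return the URL with raw=true set in its query string (hypothetical helper)."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["raw"] = "true"  # overwrite or add; calling this twice changes nothing
    return urlunsplit(parts._replace(query=urlencode(query)))


page_url = "https://github.com/owid/covid-19-data/blob/master/public/data/owid-covid-data.xlsx"
print(with_raw_flag(page_url))
```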