Download File from Web in Python 3

If you want to read the contents of a web page into a variable, just read the response of urllib.request.urlopen:

import urllib.request
...
url = 'http://example.com/'
response = urllib.request.urlopen(url)
data = response.read() # a `bytes` object
text = data.decode('utf-8') # a `str`; this step can't be used if data is binary
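The snippet above hard-codes UTF-8. Here is a minimal sketch of the same idea that uses the charset the server declares, if any, and falls back to a default otherwise. The function name `fetch_text` and the `default_encoding` parameter are my own, not part of urllib.

```python
import urllib.request


def fetch_text(url, default_encoding="utf-8"):
    """Return the body at `url` as a str, honoring the declared charset."""
    with urllib.request.urlopen(url) as response:
        data = response.read()  # a `bytes` object
        # get_content_charset() returns None when no charset is declared,
        # in which case we fall back to `default_encoding`.
        encoding = response.headers.get_content_charset() or default_encoding
    return data.decode(encoding)
```

You would call it as `text = fetch_text('http://example.com/')`. As with the original snippet, this only makes sense for text, not for binary data.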

The easiest way to download and save a file is to use the urllib.request.urlretrieve function:

import urllib.request
...
# Download the file from `url` and save it locally under `file_name`:
urllib.request.urlretrieve(url, file_name)

import urllib.request
...
# Download the file from `url`, save it in a temporary directory and get the
# path to it (e.g. '/tmp/tmpb48zma.txt') in the `file_name` variable:
file_name, headers = urllib.request.urlretrieve(url)
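urlretrieve also accepts a `reporthook` callback, which is handy for showing progress. A rough sketch follows; the helper names `make_progress_hook` and `download_with_progress` are made up for illustration, and the hook just records numbers instead of printing them.

```python
import urllib.request


def make_progress_hook(log):
    """Build a reporthook that appends (bytes_so_far, total_size) to `log`."""
    def hook(block_count, block_size, total_size):
        # bytes_so_far is an estimate and can overshoot on the last block;
        # total_size is -1 when the server does not report a length.
        log.append((block_count * block_size, total_size))
    return hook


def download_with_progress(url, file_name):
    """Download `url` to `file_name` and return the recorded progress."""
    progress = []
    urllib.request.urlretrieve(
        url, file_name, reporthook=make_progress_hook(progress)
    )
    return progress
```

The hook is called once when the connection is established and then once per block read, so the returned list has at least one entry for any successful download.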

But keep in mind that the Python documentation lists urlretrieve under its legacy interface, carried over from Python 2's urllib, and notes that it might become deprecated at some point.

So a more future-proof way to do this is to use the urllib.request.urlopen function, which returns a file-like object that represents the HTTP response, and copy it to a real file using shutil.copyfileobj:

import urllib.request
import shutil
...
# Download the file from `url` and save it locally under `file_name`:
with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)

If this seems too complicated, you can store the whole download in a bytes object and then write it to a file. But this works well only for small files, because the entire download is held in memory at once.

import urllib.request
...
# Download the file from `url` and save it locally under `file_name`:
with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
    data = response.read() # a `bytes` object
    out_file.write(data)
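For larger files, a middle ground is to read the response in fixed-size chunks, so only one chunk is in memory at a time. This is a sketch of that loop; `download_in_chunks` and the 64 KiB default are arbitrary choices of mine.

```python
import urllib.request


def download_in_chunks(url, file_name, chunk_size=64 * 1024):
    """Stream `url` to `file_name`; return the number of bytes written."""
    total = 0
    with urllib.request.urlopen(url) as response, \
            open(file_name, "wb") as out_file:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # empty bytes means end of stream
                break
            out_file.write(chunk)
            total += len(chunk)
    return total
```

This does essentially what shutil.copyfileobj does, but the explicit loop gives you a place to count bytes or update a progress display.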

It is also possible to decompress .gz (and maybe other formats) data on the fly. gzip.GzipFile reads its input sequentially, so reading from the start as shown here should not require the HTTP server to support random access to the file.

import urllib.request
import gzip
...
# Read the first 64 bytes of the file inside the .gz archive located at `url`
url = 'http://example.com/something.gz'
with urllib.request.urlopen(url) as response:
    with gzip.GzipFile(fileobj=response) as uncompressed:
        file_header = uncompressed.read(64) # a `bytes` object
        # Or do anything shown above using `uncompressed` instead of `response`.
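Combining this with shutil.copyfileobj gives a way to save the decompressed file directly, without storing the .gz archive first. This is a minimal sketch; the function name `download_and_gunzip` is hypothetical.

```python
import gzip
import shutil
import urllib.request


def download_and_gunzip(url, file_name):
    """Fetch the .gz archive at `url` and write its decompressed contents."""
    with urllib.request.urlopen(url) as response, \
            gzip.GzipFile(fileobj=response) as uncompressed, \
            open(file_name, "wb") as out_file:
        # Decompression happens chunk by chunk as copyfileobj reads.
        shutil.copyfileobj(uncompressed, out_file)
```

For example, `download_and_gunzip('http://example.com/something.gz', 'something')` would leave the uncompressed data in `something`.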

Unable to download file from URL using python

This worked for me. It sends a browser-like User-Agent header with the request:

import requests

headers = {
    "User-Agent": 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36'}
response = requests.get(
    "https://www.cmegroup.com/content/dam/cmegroup/notices/clearing/2020/08/Chadv20-239.pdf", headers=headers)
response.raise_for_status()  # fail on an HTTP error instead of saving an error page
# The `with` block closes the file even if the write fails
with open("Chadv20-239.pdf", 'wb') as pdf:
    pdf.write(response.content)
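`response.content` loads the whole file into memory. For big downloads, requests can stream the body instead. Here is a sketch under the assumption that the requests package is installed; `download_streaming` and its defaults are names I picked, not part of requests.

```python
import requests


def download_streaming(url, file_name, headers=None,
                       chunk_size=8192, timeout=30):
    """Stream `url` to `file_name` with requests; return bytes written."""
    written = 0
    with requests.get(url, headers=headers, stream=True,
                      timeout=timeout) as response:
        response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
        with open(file_name, "wb") as out_file:
            # iter_content yields the body piece by piece
            for chunk in response.iter_content(chunk_size=chunk_size):
                out_file.write(chunk)
                written += len(chunk)
    return written
```

You can pass the same `headers` dictionary as above, e.g. `download_streaming(url, "Chadv20-239.pdf", headers=headers)`.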

How do I download a file using urllib.request in Python 3?

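Assuming g is the response object returned by urlopen, f.write() needs the bytes read from it rather than the object itself. Change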

f.write(g)

to

f.write(g.read())

How to download a file over HTTP?

Use urllib.request.urlopen():

import urllib.request
with urllib.request.urlopen('http://www.example.com/') as f:
    html = f.read().decode('utf-8')

This is the most basic way to use the library, minus any error handling. You can also do more complex stuff such as changing headers.
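To make that concrete, here is a sketch that sets a custom header through urllib.request.Request and handles the two standard urllib exceptions. The function name `fetch`, the User-Agent string and the `(body, error)` return convention are all assumptions of mine for the example.

```python
import urllib.error
import urllib.request


def fetch(url, user_agent="example-downloader/0.1", timeout=10):
    """Return (body_bytes, None) on success or (None, message) on failure."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read(), None
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        # HTTPError is a subclass of URLError, so it must be caught first.
        return None, f"HTTP {err.code}: {err.reason}"
    except urllib.error.URLError as err:
        # No valid response at all (DNS failure, refused connection, ...).
        return None, f"URL error: {err.reason}"
```

Usage would look like `body, error = fetch('http://www.example.com/')`, followed by a check that `error` is None before decoding `body`.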

On Python 2, the method is in urllib2:

import urllib2
response = urllib2.urlopen('http://www.example.com/')
html = response.read()

Downloading a file with a URL using python

To download the file:

In [1]: import requests

In [2]: url = 'https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/959864/COVID-19-transport-use-statistics.ods'

In [3]: with open('COVID-19-transport-use-statistics.ods', 'wb') as out_file:
   ...:     content = requests.get(url).content
   ...:     out_file.write(content)

You can then read the file with pandas-ods-reader. Install it by running:

pip install pandas-ods-reader

Then:

In [4]: from pandas_ods_reader import read_ods

In [5]: df = read_ods('COVID-19-transport-use-statistics.ods', 1)

In [6]: df
Out[6]:
Department for Transport statistics ... unnamed.9
0 https://www.gov.uk/government/statistics/trans... ... None
1 None ... None
2 Use of transport modes: Great Britain, since 1... ... None
3 Figures are percentages of an equivalent day o... ... None
4 None ... Percentage
.. ... ... ...
390 Transport for London Tube and Bus ... None
391 Buses (excl. London) ... None
392 Cycling ... None
393 Any other queries ... None
394 Media enquiries ... None

If you want a CSV file, you can save it with df.to_csv('my_data.csv', index=False).

Python3, download file from url by button click

By adding the proper headers and using a session, we can download and save the file with the requests module.

import requests

headers = {
    "Host": "freemidi.org",
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.9",
}

session = requests.Session()

# The website sets the cookies on the first request
req1 = session.get("https://freemidi.org/getter-13560", headers=headers)

# Request again, now with the cookies, to download the file
req2 = session.get("https://freemidi.org/getter-13560", headers=headers)
print(len(req2.content))  # Size of the MIDI file in bytes

with open("testFile.mid", "wb") as saveMidi:
    saveMidi.write(req2.content)

Basic http file downloading and saving to disk in python?

In Python 2, a clean way to download a file is:

import urllib

testfile = urllib.URLopener()
testfile.retrieve("http://randomsite.com/file.gz", "file.gz")

This retrieves the file directly from the source and saves it as file.gz. It is one of my favorite solutions, from Downloading a picture via urllib and python.

Note that the top-level urllib.URLopener is not available in Python 3, where urllib.request.urlretrieve or urlopen (shown earlier) does the same job.


