Download image file from the HTML page source
Here is some code to download all the images from the supplied URL, and save them in the specified output folder. You can modify it to your own needs.
"""
dumpimages.py
Downloads all the images on the supplied URL, and saves them to the
specified output folder ("/test/" by default)

Usage:
    python dumpimages.py http://example.com/ [output]
"""
import os
import sys
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve

from bs4 import BeautifulSoup as bs


def main(url, out_folder="/test/"):
    """Downloads all the images at 'url' to out_folder ("/test/" by default)"""
    os.makedirs(out_folder, exist_ok=True)  # urlretrieve needs the folder to exist
    soup = bs(urlopen(url), "html.parser")
    for image in soup.find_all("img", src=True):  # skip <img> tags without a src
        # Resolve relative and root-relative src values against the page URL
        image_url = urljoin(url, image["src"])
        print("Image: %s" % image_url)
        # Name the file after the last path segment, ignoring any query string
        filename = os.path.basename(urlparse(image_url).path)
        if not filename:
            continue
        urlretrieve(image_url, os.path.join(out_folder, filename))


def _usage():
    print("usage: python dumpimages.py http://example.com [outpath]")


if __name__ == "__main__":
    if len(sys.argv) < 2:
        _usage()
        sys.exit(-1)
    url = sys.argv[-1]
    out_folder = "/test/"
    if not url.lower().startswith("http"):
        # The last argument is the output folder; the URL comes before it
        out_folder = sys.argv[-1]
        url = sys.argv[-2]
        if not url.lower().startswith("http"):
            _usage()
            sys.exit(-1)
    main(url, out_folder)
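The src attribute of an img tag can be an absolute URL, a root-relative path, or a path relative to the page, so each value has to be resolved against the page URL before it can be downloaded. Here is a minimal sketch of that step using only the standard library. The helper name resolve_image and the example URLs are made up for illustration and are not part of the script above.

```python
import posixpath
from urllib.parse import urljoin, urlparse


def resolve_image(page_url, src):
    """Return (absolute_url, filename) for an <img> src found on page_url."""
    # urljoin leaves absolute URLs alone and resolves the relative ones
    absolute = urljoin(page_url, src)
    # Take the file name from the path only, so a query string is not included
    filename = posixpath.basename(urlparse(absolute).path)
    return absolute, filename


page = "http://example.com/gallery/index.html"  # hypothetical page URL
for src in ("http://cdn.example.com/a.png", "/static/b.jpg", "thumbs/c.gif?v=2"):
    print(resolve_image(page, src))
```

Taking the file name from the parsed path instead of splitting the raw src on "/" keeps things like "?v=2" out of the saved file name.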
Edit: You can specify the output folder now.
Download and save image from HTML source
Found the answer: you have to attach the web site's cookie container to your request.
// Requires: using System.IO; using System.Net;
public static Stream DownloadImageData(CookieContainer cookies, string siteURL)
{
    var httpRequest = (HttpWebRequest)WebRequest.Create(siteURL);
    httpRequest.CookieContainer = cookies;  // send the site's cookies with the request
    httpRequest.AllowAutoRedirect = true;
    try
    {
        using (var httpResponse = (HttpWebResponse)httpRequest.GetResponse())
        {
            if (httpResponse.StatusCode != HttpStatusCode.OK)
            {
                return null;
            }
            // Copy the body into memory before the response is disposed;
            // the raw response stream cannot be read after the response closes.
            var imageData = new MemoryStream();
            using (var responseStream = httpResponse.GetResponseStream())
            {
                responseStream.CopyTo(imageData);
            }
            imageData.Position = 0;
            return imageData;
        }
    }
    catch (WebException)
    {
        return null;
    }
}
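The same idea, sketched in Python with the standard library for comparison: one cookie jar is shared by every request, so an image that is only served to a session with the right cookies can still be downloaded. The function names make_cookie_opener and download_image and the URLs in the usage comment are assumptions for illustration, not part of the answer above.

```python
import shutil
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener


def make_cookie_opener(cookies=None):
    """Build an opener that stores received cookies in a jar and resends them."""
    jar = cookies if cookies is not None else CookieJar()
    return build_opener(HTTPCookieProcessor(jar)), jar


def download_image(opener, image_url, out_path):
    """Fetch image_url with the opener's cookies and write the body to out_path."""
    with opener.open(image_url) as response, open(out_path, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return out_path


# Usage with hypothetical URLs (visit the page first so the site can set cookies):
#   opener, jar = make_cookie_opener()
#   opener.open("http://example.com/gallery").close()
#   download_image(opener, "http://example.com/protected/image.jpg", "image.jpg")
```

Unlike the C# version, which returns null on failure, opener.open raises urllib.error.HTTPError for error status codes, so the caller decides how to handle a failed download.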
How to save all files from source code of a web site?
In Chrome, open the menu (Customize and control, the 3 dots/bars at top right) ---> More Tools ---> Save page as
In the Save page as dialog, set:
filename : any_name.html
save as type : Webpage, Complete
Then you will get any_name.html and an any_name_files folder that holds the page's images, scripts and stylesheets.
href image link download on click
<a download="custom-filename.jpg" href="/path/to/image" title="ImageName">
<img alt="ImageName" src="/path/to/image">
</a>
The download attribute is not yet supported in every browser (check caniuse), but you can use Modernizr (under Non-core detects) to test whether the browser supports it.