Download Image File from the HTML Page Source

Download image file from the HTML page source

Here is some code to download all the images from the supplied URL and save them in the specified output folder. It relies on BeautifulSoup (the bs4 package); modify it to suit your own needs.

"""
dumpimages.py
Downloads all the images on the supplied URL and saves them to the
specified output folder ("/test/" by default).

Usage:
    python dumpimages.py http://example.com/ [output]
"""
import os
import sys
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve

from bs4 import BeautifulSoup as bs


def main(url, out_folder="/test/"):
    """Downloads all the images at 'url' to out_folder ("/test/" by default)."""
    os.makedirs(out_folder, exist_ok=True)  # urlretrieve does not create folders
    soup = bs(urlopen(url), "html.parser")

    for image in soup.find_all("img", src=True):  # skip <img> tags without src
        # urljoin resolves relative paths and leaves absolute URLs unchanged
        image_url = urljoin(url, image["src"])
        print("Image: %s" % image_url)
        # Take the file name from the URL path, ignoring any query string
        filename = os.path.basename(urlparse(image_url).path)
        if not filename:
            continue
        urlretrieve(image_url, os.path.join(out_folder, filename))


def _usage():
    print("usage: python dumpimages.py http://example.com [outpath]")


if __name__ == "__main__":
    args = sys.argv[1:]
    if not 1 <= len(args) <= 2 or not args[0].lower().startswith("http"):
        _usage()
        sys.exit(1)
    url = args[0]
    out_folder = args[1] if len(args) == 2 else "/test/"
    main(url, out_folder)

Edit: You can specify the output folder now.
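
Resolving each src value against the page URL is the step most likely to go wrong, because src may be an absolute URL, a root-relative path, or a path relative to the page. Here is a minimal sketch of that step on its own, using only the standard library. The helper names resolve_image_url and filename_from_url are illustrative and are not part of the script above.

```python
import os
from urllib.parse import urljoin, urlparse


def resolve_image_url(page_url, src):
    """Return an absolute URL for an <img> src found on page_url."""
    # urljoin leaves absolute URLs untouched and resolves relative ones.
    return urljoin(page_url, src)


def filename_from_url(image_url):
    """Return the last path segment, ignoring any query string."""
    return os.path.basename(urlparse(image_url).path)


if __name__ == "__main__":
    page = "http://example.com/gallery/index.html"
    for src in ("http://cdn.example.com/a.png", "/img/b.jpg", "c.gif?size=large"):
        full = resolve_image_url(page, src)
        print(full, "->", filename_from_url(full))
```

Taking the file name from the parsed path rather than from the raw src keeps query strings such as ?size=large out of the saved file name.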

Download and save image from HTML source

Found the answer. The image request has to carry the cookie container that was filled by the web site, so pass that container to the request.

// Requires: using System.IO; using System.Net;
public static Stream DownloadImageData(CookieContainer cookies, string siteURL)
{
    HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create(siteURL);

    httpRequest.CookieContainer = cookies;
    httpRequest.AllowAutoRedirect = true;

    try
    {
        using (HttpWebResponse httpResponse = (HttpWebResponse)httpRequest.GetResponse())
        {
            if (httpResponse.StatusCode != HttpStatusCode.OK)
            {
                return null;
            }

            // Copy the body into memory before the response is disposed;
            // the network stream is unusable once the response is closed.
            MemoryStream imageData = new MemoryStream();
            using (Stream responseStream = httpResponse.GetResponseStream())
            {
                responseStream.CopyTo(imageData);
            }
            imageData.Position = 0;
            return imageData;
        }
    }
    catch (WebException)
    {
        return null;
    }
}
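
For comparison, the same idea can be sketched in Python with the standard library: a cookie jar is attached to the opener so that cookies collected from the site are sent with the image request. This is an illustration under assumptions, not a port of the code above. The function name download_image_data is hypothetical, and the body is read into memory so the caller gets usable data after the response is closed.

```python
import io
from http.cookiejar import CookieJar
from urllib.error import URLError
from urllib.request import HTTPCookieProcessor, build_opener


def download_image_data(cookies, site_url):
    """Fetch site_url with the given cookie jar; return BytesIO or None."""
    # The opener sends matching cookies from the jar with the request and
    # stores any cookies the server sets; redirects are followed by default.
    opener = build_opener(HTTPCookieProcessor(cookies))
    try:
        with opener.open(site_url) as response:
            # Read the body before the response is closed.
            return io.BytesIO(response.read())
    except URLError:
        # Covers network failures and HTTP error statuses (HTTPError).
        return None
```

In practice the cookies argument would be the same CookieJar instance that was used for the earlier requests to the site (for example the login request), so the session cookies are already in it.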

How to save all files from source code of a web site?

In Chrome, open the menu (Customize and control, the three dots/bars at the top right) ---> More tools ---> Save page as.

In the save dialog, set:
File name: any_name.html
Save as type: Webpage, Complete

You will then get any_name.html and a folder holding the page's images and other files (Chrome typically names it any_name_files).

href image link download on click

<a download="custom-filename.jpg" href="/path/to/image" title="ImageName">
    <img alt="ImageName" src="/path/to/image">
</a>

The download attribute is not yet supported by every browser (see caniuse), but you can use Modernizr (under Non-core detects) to check whether the browser supports it.
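
If the page is generated on the server, the same markup can be built programmatically. The sketch below assumes a hypothetical helper named download_link; it escapes the attribute values so that file names or titles containing quotes do not break the tag.

```python
from html import escape


def download_link(image_path, filename, title):
    """Build an <a download> element wrapping an <img> (illustrative helper)."""
    return (
        '<a download="{name}" href="{path}" title="{title}">\n'
        '    <img alt="{title}" src="{path}">\n'
        '</a>'
    ).format(
        name=escape(filename, quote=True),
        path=escape(image_path, quote=True),
        title=escape(title, quote=True),
    )


print(download_link("/path/to/image", "custom-filename.jpg", "ImageName"))
```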


