How to Know the Size of a File Before Downloading It

How to know the size of a file before downloading it?

You can read the Content-Length header from the HTTP response object; it gives the length of the file in bytes.
Note, though, that some servers don't return that header, and then the only way to know the actual size is to read the whole response.

Example:

URL url = new URL("http://server.com/file.mp3");
URLConnection urlConnection = url.openConnection();
urlConnection.connect();
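// Returns -1 when the server sends no Content-Length header;
// the long variant avoids overflow for files larger than 2 GB
long fileSize = urlConnection.getContentLengthLong();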
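
The same check can be sketched in Python with only the standard library. This is a minimal sketch rather than a definitive implementation: parse_content_length and remote_file_size are hypothetical helper names introduced here, and the sketch assumes the server answers HEAD requests. It returns None when the header is missing or malformed, which is the case where the size can only be learned by reading the body.

```python
import urllib.request
from typing import Optional


def parse_content_length(value: Optional[str]) -> Optional[int]:
    """Convert a raw Content-Length value to an int, or None if unusable."""
    if value is None:
        return None
    value = value.strip()
    # Accept only plain ASCII decimal digits (no sign, no separators).
    if not (value.isascii() and value.isdigit()):
        return None
    return int(value)


def remote_file_size(url: str, timeout: float = 10.0) -> Optional[int]:
    """Send a HEAD request and return the advertised size in bytes, if any."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers.get("Content-Length"))
```

Calling remote_file_size("http://server.com/file.mp3") would then yield either an integer byte count or None.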

Python | HTTP - How to check file size before downloading it

If the server supplies a Content-Length header, then you can use that to determine if you'd like to continue downloading the remainder of the body or not. If the server does not provide the header, then you'll need to stream the response until you decide you no longer want to continue.

To do this, you'll need to make sure that you're not preloading the full response.

from urllib3 import PoolManager

pool = PoolManager()
response = pool.request("GET", url, preload_content=False)

# Maximum amount we want to read
max_bytes = 1000000

content_bytes = response.headers.get("Content-Length")
if content_bytes and int(content_bytes) < max_bytes:
    # Expected body is smaller than our maximum, read the whole thing
    data = response.read()
    # Do something with data
    ...
elif content_bytes is None:
    # Alternatively, stream until we hit our limit
    amount_read = 0
    for chunk in response.stream():
        amount_read += len(chunk)
        # Save chunk
        ...
        if amount_read > max_bytes:
            break

# Release the connection back into the pool
response.release_conn()
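
The streaming branch above can be isolated into a small helper that does not depend on any HTTP library. The following is a sketch under assumptions: read_up_to is a hypothetical name, and it accepts any iterable of byte chunks. As in the loop above, the chunk that crosses the limit is still kept before reading stops.

```python
from typing import Iterable, Tuple


def read_up_to(chunks: Iterable[bytes], max_bytes: int) -> Tuple[bytes, bool]:
    """Collect chunks until the running total exceeds max_bytes.

    Returns (data, exceeded), where exceeded is True if reading stopped
    early because the limit was crossed.
    """
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        if len(buffer) > max_bytes:
            return bytes(buffer), True
    return bytes(buffer), False
```

With urllib3 it could be called as read_up_to(response.stream(), max_bytes), provided the request was made with preload_content=False.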

Get file size before downloading & counting how much already downloaded (http+ruby)

So I made it work, even with the progress bar:

require 'net/http'
require 'uri'
require 'progressbar'

url = "url with some file"

url_base = url.split('/')[2]
url_path = '/'+url.split('/')[3..-1].join('/')
@counter = 0

Net::HTTP.start(url_base) do |http|
  # Note: URI.escape was removed in Ruby 3.0
  response = http.request_head(URI.escape(url_path))
  ProgressBar#format_arguments=[:title, :percentage, :bar, :stat_for_file_transfer]
  pbar = ProgressBar.new("file name:", response['content-length'].to_i)
  # Open in binary mode so the bytes are written unchanged
  File.open("test.file", 'wb') {|f|
    http.get(URI.escape(url_path)) do |str|
      f.write str
      @counter += str.length
      pbar.set(@counter)
    end
  }
  # pbar is local to this block, so it has to be finished here
  pbar.finish
end
puts "Done."
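
The counting idea (take the total from Content-Length, then add up the length of each chunk as it is written) can be sketched in Python as well. This is an illustrative sketch, not the answer's method: copy_with_progress and print_progress are hypothetical names, and the demo uses in-memory streams in place of a real HTTP response and output file.

```python
import io
from typing import BinaryIO, Callable, Optional


def copy_with_progress(
    src: BinaryIO,
    dst: BinaryIO,
    total: Optional[int],
    on_progress: Callable[[int, Optional[int]], None],
    chunk_size: int = 8192,
) -> int:
    """Copy src to dst in chunks, reporting the bytes copied after each chunk."""
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        copied += len(chunk)
        on_progress(copied, total)
    return copied


def print_progress(copied: int, total: Optional[int]) -> None:
    """Print a percentage when the total is known, else just the byte count."""
    if total:
        print(f"{copied}/{total} bytes ({100 * copied // total}%)")
    else:
        print(f"{copied} bytes (total unknown)")


# Demo with in-memory streams standing in for the network and the file.
source = io.BytesIO(b"x" * 20000)
target = io.BytesIO()
copy_with_progress(source, target, 20000, print_progress)
```

Passing total=None covers servers that send no Content-Length, where only the running count can be shown.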

Ajax - Get size of file before downloading

You can get XHR response header data manually:

http://www.w3.org/TR/XMLHttpRequest/#the-getresponseheader()-method

This function will get the filesize of the requested URL:

function get_filesize(url, callback) {
    var xhr = new XMLHttpRequest();
    xhr.open("HEAD", url, true); // Notice "HEAD" instead of "GET",
                                 // to get only the header
    xhr.onreadystatechange = function() {
        if (this.readyState == this.DONE) {
            callback(parseInt(xhr.getResponseHeader("Content-Length")));
        }
    };
    xhr.send();
}

get_filesize("http://example.com/foo.exe", function(size) {
    alert("The size of foo.exe is: " + size + " bytes.");
});

Get size of a file before downloading in Python

I have reproduced what you are seeing:

import urllib, os
link = "http://python.org"
print "opening url:", link
site = urllib.urlopen(link)
meta = site.info()
print "Content-Length:", meta.getheaders("Content-Length")[0]

f = open("out.txt", "r")
print "File on disk:",len(f.read())
f.close()

f = open("out.txt", "w")
f.write(site.read())
site.close()
f.close()

f = open("out.txt", "r")
print "File on disk after download:",len(f.read())
f.close()

print "os.stat().st_size returns:", os.stat("out.txt").st_size

Outputs this:

opening url: http://python.org
Content-Length: 16535
File on disk: 16535
File on disk after download: 16535
os.stat().st_size returns: 16861

What am I doing wrong here? Is os.stat().st_size not returning the correct size?


Edit:
OK, I figured out what the problem was:

import urllib, os
link = "http://python.org"
print "opening url:", link
site = urllib.urlopen(link)
meta = site.info()
print "Content-Length:", meta.getheaders("Content-Length")[0]

f = open("out.txt", "rb")
print "File on disk:",len(f.read())
f.close()

f = open("out.txt", "wb")
f.write(site.read())
site.close()
f.close()

f = open("out.txt", "rb")
print "File on disk after download:",len(f.read())
f.close()

print "os.stat().st_size returns:", os.stat("out.txt").st_size

This outputs:

$ python test.py
opening url: http://python.org
Content-Length: 16535
File on disk: 16535
File on disk after download: 16535
os.stat().st_size returns: 16535

Make sure you are opening both files for binary read/write.

# open for binary write
open(filename, "wb")
# open for binary read
open(filename, "rb")
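
The size mismatch is consistent with text-mode newline translation, where each "\n" is written as "\r\n". The sketch below (Python 3) reproduces that effect on any platform by requesting the translation explicitly with newline="\r\n". size_on_disk is a hypothetical helper, and the payload is made-up sample data rather than the page from the example.

```python
import os
import tempfile


def size_on_disk(payload: bytes, translate_newlines: bool) -> int:
    """Write payload to a temporary file and return os.stat().st_size."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if translate_newlines:
            # Text mode with newline="\r\n" writes every "\n" as "\r\n",
            # which is what Windows text mode does by default.
            with open(path, "w", newline="\r\n", encoding="latin-1") as f:
                f.write(payload.decode("latin-1"))
        else:
            # Binary mode writes the bytes exactly as received.
            with open(path, "wb") as f:
                f.write(payload)
        return os.stat(path).st_size
    finally:
        os.remove(path)


payload = b"line one\nline two\n"
print(len(payload))                   # 18
print(size_on_disk(payload, False))   # 18
print(size_on_disk(payload, True))    # 20
```

The text-mode file grows by one byte per newline, so only the binary-mode size matches the byte count the server advertised.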

How to determine online file size before download in R?

A simple solution would be:

download_size <- function(url) as.numeric(httr::HEAD(url)$headers$`content-length`)

This allows a call such as:

download_size("https://cran.r-project.org/doc/manuals/r-release/R-ints.pdf")
#> [1] 452557

