Downloading a Large File Using Curl

Downloading a large file using curl

<?php
// Allow the script to run for as long as the download takes
set_time_limit(0);
// URL of the file to download; placeholder value, replace with the real URL
$url = 'https://example.com/largefile.zip';
// This is the local file where we save the downloaded data
$fp = fopen(dirname(__FILE__) . '/localfile.tmp', 'w+');
// Here is the file we are downloading; replace spaces with %20
$ch = curl_init(str_replace(" ", "%20", $url));
// Make sure the timeout is high enough;
// if it is too low the download will be interrupted
curl_setopt($ch, CURLOPT_TIMEOUT, 600);
// Write the curl response directly to the file instead of into memory
curl_setopt($ch, CURLOPT_FILE, $fp);
// Follow redirects to the real file location
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
// Run the transfer, then clean up
curl_exec($ch);
curl_close($ch);
fclose($fp);
?>
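
If you save that script as, say, download.php (the filename is just an example), you can run it from the command line with the PHP CLI instead of through a web server, which sidesteps web-server timeouts:

# Run the downloader with the PHP CLI (the filename download.php is hypothetical)
php download.php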

Download a large file via curl

You can use the following bash script:

#!/usr/bin/env bash

mkdir -p "Data/"
echo "Downloading Data"
fileid="1fdFu5NGXe4rTLYKD5wOqk9dl-eJOefXo"
filename="nyu_data.zip"
# First request: save the confirmation cookie Google sets for files too large to virus-scan
curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=${fileid}" > /dev/null
# Second request: send the confirm token from the cookie to start the actual download
curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=$(awk '/download/ {print $NF}' ./cookie)&id=${fileid}" -o "Data/${filename}"
rm ./cookie

This will put the zip file in a newly created Data directory.
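
If you then want to extract the archive, a follow-up step like this would work (assuming the unzip utility is installed; the paths match the script above):

# Unpack the downloaded archive into the Data directory (assumes unzip is installed)
unzip "Data/nyu_data.zip" -d "Data/"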

wget/curl large file from google drive

WARNING: This functionality is deprecated.


Have a look at this question: Direct download from Google Drive using Google Drive API

Basically you have to create a public directory and access your files by relative reference with something like

wget https://googledrive.com/host/LARGEPUBLICFOLDERID/index4phlat.tar.gz

Alternatively, you can use this script: https://github.com/circulosmeos/gdown.pl

How to download a big file from google drive via curl in Bash?

  • You want to download a file from Google Drive using the curl command with an access token.

If my understanding is correct, how about this modification?

Modified curl command:

Please add the query parameter alt=media.

curl -H "Authorization: Bearer $token" "https://www.googleapis.com/drive/v3/files/$id?alt=media" -o "$file"
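
For context, a minimal end-to-end sketch of that call is shown below. The token, file ID, and output name are placeholders you must supply yourself, and it assumes the token is a valid OAuth 2.0 access token with read access to the file:

#!/usr/bin/env bash
# Placeholder values - substitute your own
token="ya29.your-oauth2-access-token"
id="your-drive-file-id"
file="downloaded.zip"

# alt=media asks the Drive API for the file's content instead of its metadata
curl -H "Authorization: Bearer $token" \
  "https://www.googleapis.com/drive/v3/files/$id?alt=media" -o "$file"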

Note:

  • This modified curl command assumes that your access token is allowed to download the file.
  • With this modification, files other than Google Docs formats can be downloaded. To download Google Docs files (Documents, Spreadsheets, Slides and so on), use the Files: export method of the Drive API instead.

Reference:

  • Download files

If I misunderstood your question and this was not the direction you want, I apologize.

How to download a big zip file with curl?

Well, I always need to search for the correct keywords AFTER I ask on SE...

If I understood well:

There is a redirection. Curl does not follow it by default, but you can tell curl to do it with the following option:

-L

Source: https://curl.se/docs/faq.html#301_Moved_Permanently
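
As a concrete illustration (the URL is just a placeholder), telling curl to follow the redirect and save the result looks like this:

# -L follows redirects, -o names the local output file (URL is a placeholder)
curl -L -o bigfile.zip "https://example.com/bigfile.zip"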

PHP cUrl - Unable to fetch large files

What do you think this line does? curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); It tells PHP to capture the entire output of curl_exec and hold it all in memory before doing anything else. That is both slow (nothing is written to disk until the download is 100% complete) and extremely memory hungry (the whole file sits in RAM at once); neither is desirable. Instead, do $fp = fopen(basename($url), 'wb'); curl_setopt($ch, CURLOPT_FILE, $fp); Now curl writes the content directly to disk as it is downloaded, which is much faster and uses only a small amount of RAM no matter how big the file is.

  • Also note that if you are going to run a large number of slow downloads simultaneously, PHP behind a web server is simply a bad tool for the job. The number of concurrent PHP processes you can run is usually very limited, and your entire website will stop loading while all of them are busy; PHP aborts if the client disconnects (see ignore_user_abort()); many web servers time out if the script takes too long (see nginx's proxy_read_timeout, for example); and PHP will often kill itself for timeout reasons as well (see set_time_limit()). If that is your situation, consider writing the downloader in another language; Go's goroutines, for example, can handle a massive number of concurrent slow downloads with little resource usage, unlike PHP.

