Python: Download Files from Google Drive Using Url

Python: download files from google drive using url

If by "drive's url" you mean the shareable link of a file on Google Drive, then the following might help:

import requests

def download_file_from_google_drive(id, destination):
    URL = "https://docs.google.com/uc?export=download"

    session = requests.Session()

    response = session.get(URL, params = { 'id' : id }, stream = True)
    token = get_confirm_token(response)

    if token:
        params = { 'id' : id, 'confirm' : token }
        response = session.get(URL, params = params, stream = True)

    save_response_content(response, destination)

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value

    return None

def save_response_content(response, destination):
    CHUNK_SIZE = 32768

    with open(destination, "wb") as f:
        for chunk in response.iter_content(CHUNK_SIZE):
            if chunk: # filter out keep-alive new chunks
                f.write(chunk)

if __name__ == "__main__":
    file_id = 'TAKE ID FROM SHAREABLE LINK'
    destination = 'DESTINATION FILE ON YOUR DISK'
    download_file_from_google_drive(file_id, destination)

The snippet does not use pydrive, nor the Google Drive SDK, though. It uses the requests module (which is, in a way, an alternative to urllib2).

When downloading large files from Google Drive, a single GET request is not sufficient. A second one, carrying the confirmation token from the first response, is needed - see wget/curl large file from google drive.
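
The file_id placeholder in the snippet has to be taken from the shareable link. Here is a minimal sketch of how that could be done, assuming the link looks like https://drive.google.com/file/d/<id>/view?usp=sharing or carries the id in an id= query parameter. The helper name extract_file_id is hypothetical and not part of any library, and other link formats may need extra handling.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_file_id(share_url):
    """Return the Drive file id found in share_url, or None if none is found.

    Hypothetical helper; only covers the two link shapes described above.
    """
    # Shape 1: https://drive.google.com/file/d/<id>/view?usp=sharing
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_url)
    if match:
        return match.group(1)

    # Shape 2: https://drive.google.com/open?id=<id> (or .../uc?id=<id>)
    ids = parse_qs(urlparse(share_url).query).get("id")
    return ids[0] if ids else None
```

The returned value can then be passed as the first argument of download_file_from_google_drive.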

Downloading files from public Google Drive in python: scoping issues?

This is due to a security update that Google released a few months ago. It makes link sharing stricter: in addition to the fileId, you now need a resource key to access the file.

As per the documentation, for newer links you need to provide the resource key in the X-Goog-Drive-Resource-Keys header, in the form fileId1/resourceKey1.

If you apply this change in your code, it will work as normal. Example edit below:

import io
import re

from googleapiclient.http import MediaIoBaseDownload

# drive_service and links_to_download are assumed to be defined earlier.
# File ids may also contain "-" and "_", so both are allowed in the pattern.
regex = "(?<=https://drive.google.com/file/d/)[a-zA-Z0-9_-]+"
regex_rkey = "(?<=resourcekey=)[a-zA-Z0-9-]+"
for i, l in enumerate(links_to_download):
    url = l
    file_id = re.search(regex, url)[0]
    resource_key = re.search(regex_rkey, url)[0]
    request = drive_service.files().get_media(fileId=file_id)
    # Header value has the form fileId/resourceKey
    request.headers["X-Goog-Drive-Resource-Keys"] = f"{file_id}/{resource_key}"
    fh = io.FileIO(f"file_{i}", mode='wb')
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
    fh.close()

The regex for the resource key is something I put together quickly, so I cannot be sure it covers every case, but it shows the approach.
You may also have to handle both old links (without a resource key) and new links (with one), and set the header only where it applies.
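
To handle old and new links in one place, something like the following sketch could work. It assumes links of the form https://drive.google.com/file/d/<id>/... with an optional resourcekey= query parameter. The helper name resource_key_header is hypothetical, and the character classes in the patterns are a guess at what ids and keys may contain.

```python
import re

# Assumed link shape: .../file/d/<id>/view?usp=sharing&resourcekey=<key>
FILE_ID_RE = re.compile(r"/file/d/([A-Za-z0-9_-]+)")
RESOURCE_KEY_RE = re.compile(r"[?&]resourcekey=([A-Za-z0-9_-]+)")


def resource_key_header(url):
    """Return the extra header for links that carry a resource key, else {}.

    Hypothetical helper: old links without a resource key yield an empty dict,
    so the caller can merge the result into the request headers either way.
    """
    file_id = FILE_ID_RE.search(url)
    resource_key = RESOURCE_KEY_RE.search(url)
    if file_id is None or resource_key is None:
        return {}
    value = f"{file_id.group(1)}/{resource_key.group(1)}"
    return {"X-Goog-Drive-Resource-Keys": value}
```

In the loop above, request.headers.update(resource_key_header(url)) would then replace the unconditional header assignment.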

Download Files From Google Drive Using Python

Try this code:

from zdrive import Downloader

output_directory = "/home/abhinav/Documents"
d = Downloader()

# id of the folder you want to download from Drive
folder_id = 'XXXX-YYYY-ZZZZ'
d.downloadFolder(folder_id, destinationFolder=output_directory)

Download Google Drive files to a specific location using Python

I understand that you have an array of Drive file links, and you want to download them locally using Python. I am assuming that you want to download files stored on Drive, not Workspace files (i.e. Docs, Sheets…). You can do it very easily by following the Drive API Python quickstart guide. That walkthrough installs all the necessary dependencies and shows you example code. Then, you only have to edit the main function to download the files instead of running the sample operations.

To download the files with Python you only need to know each file's id and use the Files.get method. I see that you already know the ids, so you are ready to make the request. To build the request, pass the id of the file and set the parameter alt to the value media. If you are using the example from the paragraph above, you can do it just by using the id. If those guides don't work for you, please let me know.
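
As a rough illustration of what such a request looks like at the HTTP level, here is a minimal sketch that only builds the URL and headers for a Drive API v3 Files.get call with alt=media. The function name build_media_request and the access_token argument are assumptions for illustration; obtaining a valid OAuth token is outside the scope of this sketch.

```python
from urllib.parse import quote, urlencode

# Drive API v3 Files endpoint
DRIVE_FILES_ENDPOINT = "https://www.googleapis.com/drive/v3/files"


def build_media_request(file_id, access_token):
    """Return (url, headers) for downloading the content of one Drive file.

    Hypothetical helper: access_token is assumed to be a valid OAuth 2.0
    token that was obtained elsewhere.
    """
    # alt=media asks for the file content instead of its metadata
    query = urlencode({"alt": "media"})
    url = f"{DRIVE_FILES_ENDPOINT}/{quote(file_id, safe='')}?{query}"
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers
```

The result could be passed to any HTTP client, for example requests.get(url, headers=headers, stream=True).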

How to download a file from Google Drive using Python and the Drive API v3

To make requests to Google APIs, the workflow is in essence the following:

  1. Go to the developer console and log in if you haven't.
  2. Create a Cloud Platform project.
  3. Enable, for your project, the APIs you want to use in your project's apps (for example: Google Drive API).
  4. Create and download OAuth 2.0 Client ID credentials, which will allow your app to gain authorization for using your enabled APIs.
  5. Head over to the OAuth consent screen and add your scope (scope: https://www.googleapis.com/auth/drive.readonly for you). Choose Internal/External according to your needs, and for now ignore any warnings.
  6. To get a valid token for making API requests, the app goes through the OAuth flow to receive the authorization token (since it needs consent).
  7. During the OAuth flow the user is redirected to your OAuth consent screen, where they are asked to approve or deny access to your app's requested scopes.
  8. If consent is given, your app will receive an authorization token.
  9. Pass the token in your request to your authorized API endpoints.[2]
  10. Build a Drive Service to make API requests (you will need the valid token).[1]


NOTE:

The available methods for the Files resource for Drive API v3 are here.

When using the Google APIs Client for Python, you can use export_media() or get_media(), as per the Google APIs Client for Python documentation.
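
Which of the two to call depends on the kind of file. A minimal sketch of that decision, assuming you already have the file's mimeType (for example from its metadata); the helper name pick_download_method is hypothetical:

```python
# Google Workspace files (Docs, Sheets, ...) have MIME types with this prefix
GOOGLE_WORKSPACE_PREFIX = "application/vnd.google-apps."
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


def pick_download_method(mime_type):
    """Return the name of the client method suited to a file's MIME type.

    Hypothetical helper: Workspace files have to be exported to another
    format, while files stored as-is can be fetched directly.
    """
    if mime_type == FOLDER_MIME_TYPE:
        raise ValueError("folders have no content to download")
    if mime_type.startswith(GOOGLE_WORKSPACE_PREFIX):
        return "export_media"
    return "get_media"
```

For a Google Doc this returns "export_media", and for an uploaded PDF it returns "get_media".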



IMPORTANT:

Also, check that the scope you are using actually allows you to do what you want (downloading files from the user's Drive) and set it accordingly. At the moment you have an incorrect scope for your goal. See OAuth 2.0 API Scopes.



Sample Code References:

  1. Building a Drive Service:
import google_auth_oauthlib.flow
from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build


class Auth:

    def __init__(self, client_secret_filename, scopes):
        self.client_secret = client_secret_filename
        self.scopes = scopes
        self.flow = google_auth_oauthlib.flow.Flow.from_client_secrets_file(self.client_secret, self.scopes)
        self.flow.redirect_uri = 'http://localhost:8080/'
        self.creds = None

    def get_credentials(self):
        # Opens the browser for consent and waits for the redirect on port 8080
        flow = InstalledAppFlow.from_client_secrets_file(self.client_secret, self.scopes)
        self.creds = flow.run_local_server(port=8080)
        return self.creds


# The scopes your app will use, given as a list.
# (They NEED to be among those enabled in your OAuth consent screen)
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
CLIENT_SECRET_FILE = "credentials.json"

credentials = Auth(client_secret_filename=CLIENT_SECRET_FILE, scopes=SCOPES).get_credentials()

drive_service = build('drive', 'v3', credentials=credentials)

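  2. Making the request to export or get a file
import io
import shutil

from googleapiclient.http import MediaIoBaseDownload

# file_id is the id of the file to download (assumed to be defined).
# export_media() is for Google Workspace files (Docs, Sheets, ...);
# for files stored as-is, use get_media(fileId=file_id) instead.
request = drive_service.files().export_media(fileId=file_id, mimeType='application/pdf')

fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%" % int(status.progress() * 100))

# The file has been downloaded into RAM, now save it in a file
fh.seek(0)
with open('your_filename.pdf', 'wb') as f:
    shutil.copyfileobj(fh, f, length=131072)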

Download pdf file(Not restricted) from google drive through URL

I was able to find the solution for it through wget in python. Answering it so that it could help someone in the future.

import os
import re
from datetime import datetime

import wget

def download_candidate_resume(email: str, resume_url: str):
    """
    This function is used to download resume from google drive and store on the local system
    @param email: candidate email
    @type email: str
    @param resume_url: url of resume on google drive
    @type resume_url: str
    """
    file_extension = "pdf"
    current_time = datetime.now()
    file_name = f'{email}_{int(current_time.timestamp())}.{file_extension}'
    temp_file_path = os.path.join(os.getcwd(), file_name)
    # Turn the shareable link into a direct-download link
    downloadable_resume_url = re.sub(
        r"https://drive\.google\.com/file/d/(.*?)/.*?\?usp=sharing",
        r"https://drive.google.com/uc?export=download&id=\1",
        resume_url,
    )
    wget.download(downloadable_resume_url, out=temp_file_path)

