How to Download a File from Google Drive Using Python and the Drive API V3

To make requests to Google APIs, the workflow is essentially the following:

  1. Go to the developer console and log in if you haven't already.
  2. Create a Cloud Platform project.
  3. Enable, for your project, the APIs you want to use with your project's apps (for example, the Google Drive API).
  4. Create and download OAuth 2.0 Client ID credentials, which allow your app to request authorization for the enabled APIs.
  5. Head over to the OAuth consent screen and add your scope (in your case, https://www.googleapis.com/auth/drive.readonly). Choose Internal or External according to your needs, and ignore any warnings for now.
  6. To get a valid token for making API requests, the app goes through the OAuth flow, since user consent is required.
  7. During the OAuth flow, the user is redirected to your OAuth consent screen and asked to approve or deny access to the scopes your app requests.
  8. If consent is given, your app receives an authorization token.
  9. Pass the token in your requests to the authorized API endpoints.[2]
  10. Build a Drive service to make API requests (you will need the valid token).[1]


NOTE:

The available methods for the Files resource of Drive API v3 are listed in the API reference.

When using the Google APIs Client for Python, you can use export_media() or get_media(), as described in its documentation.



IMPORTANT:

Also check that the scope you are using actually allows what you want to do (downloading files from the user's Drive), and set it accordingly. At the moment you have an incorrect scope for your goal. See OAuth 2.0 API Scopes.
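As a rough illustration of the scope check, here is a minimal sketch. The helper name can_download and the scope set are my own assumptions, not part of any Google library, and the set is not meant to be exhaustive. Note that drive.file only covers files your app created or that the user opened with it, while drive.metadata.readonly does not allow reading file content at all.

```python
# Scopes that, to my understanding, permit reading file content.
# This list is an assumption for illustration and is not exhaustive.
DOWNLOAD_CAPABLE_SCOPES = {
    "https://www.googleapis.com/auth/drive",
    "https://www.googleapis.com/auth/drive.readonly",
    "https://www.googleapis.com/auth/drive.file",
}


def can_download(scopes):
    """Return True if at least one of the given scopes permits reading file content."""
    if isinstance(scopes, str):
        # Accept a single space-separated string as well as a list of scopes.
        scopes = scopes.split()
    return any(scope in DOWNLOAD_CAPABLE_SCOPES for scope in scopes)
```

Calling can_download(SCOPES) before starting the OAuth flow would catch a metadata-only scope early, instead of failing later on the download request.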



Sample Code References:

  1. Building a Drive Service:
import google_auth_oauthlib.flow
from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build


class Auth:

    def __init__(self, client_secret_filename, scopes):
        self.client_secret = client_secret_filename
        self.scopes = scopes
        self.flow = google_auth_oauthlib.flow.Flow.from_client_secrets_file(self.client_secret, self.scopes)
        self.flow.redirect_uri = 'http://localhost:8080/'
        self.creds = None

    def get_credentials(self):
        # Opens the browser for consent and listens on localhost for the redirect.
        flow = InstalledAppFlow.from_client_secrets_file(self.client_secret, self.scopes)
        self.creds = flow.run_local_server(port=8080)
        return self.creds


# The scope your app will use.
# (It NEEDS to be among those enabled in your OAuth consent screen.)
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
CLIENT_SECRET_FILE = "credentials.json"

credentials = Auth(client_secret_filename=CLIENT_SECRET_FILE, scopes=SCOPES).get_credentials()

drive_service = build('drive', 'v3', credentials=credentials)

  2. Making the request to export or get a file:
import io
import shutil

from googleapiclient.http import MediaIoBaseDownload

file_id = '###'  # Please set the file ID.

# export_media() converts a Google Docs file; use get_media() for binary files.
request = drive_service.files().export_media(fileId=file_id, mimeType='application/pdf')

fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%" % int(status.progress() * 100))

# The file has been downloaded into RAM, now save it in a file
fh.seek(0)
with open('your_filename.pdf', 'wb') as f:
    shutil.copyfileobj(fh, f, length=131072)

How to save downloaded data from Google Drive in file - Python Drive API

  • You want to save the downloaded file as a file to the local PC.
  • You want to achieve this using google-api-python-client with Python.
  • You have already been able to get and put the file using Drive API.

If my understanding is correct, how about this modification?

From:

fh = io.BytesIO()

To:

fh = io.FileIO("### filename ###", mode='wb')
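Putting this change together with the usual download loop might look like the sketch below. The function name download_to_path and the downloader_cls parameter are hypothetical; the parameter exists only so the loop can be tried with a stand-in class when the client library is not installed. By default it uses MediaIoBaseDownload from googleapiclient.http.

```python
import io


def download_to_path(request, path, downloader_cls=None):
    """Write the chunks of a Drive media request straight to a local file."""
    if downloader_cls is None:
        # Imported lazily so the helper can be exercised without the client library.
        from googleapiclient.http import MediaIoBaseDownload
        downloader_cls = MediaIoBaseDownload
    # io.FileIO writes each chunk to disk instead of keeping it in memory.
    with io.FileIO(path, mode="wb") as fh:
        downloader = downloader_cls(fh, request)
        done = False
        while not done:
            status, done = downloader.next_chunk()
            print("Download %d%%." % int(status.progress() * 100))
    return path
```

With a real service object, the call would be something like download_to_path(drive_service.files().get_media(fileId=file_id), "### filename ###").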

References:

  • Binary I/O
  • File I/O

If this was not the result you want, I apologize.

Google drive api python - How to download all the files inside a folder or the folder to a specific local destination path

Call files.list with a query that restricts the parent to the file ID of the directory you are looking for.

Then loop through each file returned and download it using the code you have now.

file_id = '###'  # File ID returned by the first call
request = service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))

If you are looking for a method that will do this for you, you won't find one. You will need to download the files one at a time.
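The first call (the files.list step) could be sketched roughly as follows. The function name list_folder_files is hypothetical, and service is assumed to be a Drive v3 service object built as elsewhere on this page. The loop follows nextPageToken because a folder listing can span more than one page.

```python
def list_folder_files(service, folder_id):
    """Collect id, name and mimeType for every non-trashed item directly in a folder."""
    query = "'{}' in parents and trashed = false".format(folder_id)
    files = []
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            fields="nextPageToken, files(id, name, mimeType)",
            pageToken=page_token,
        ).execute()
        files.extend(response.get("files", []))
        # A missing nextPageToken means this was the last page.
        page_token = response.get("nextPageToken")
        if page_token is None:
            break
    return files
```

Each returned entry carries the id to pass to get_media. Subfolders show up as items too, so a recursive download would need to call this again for them.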

python: How do i download a file from Google drive using api

Just write the content variable to a file instead of returning it:

with open("foo.jpg", "wb") as fo:
    fo.write(content)

Google Drive API downloading files with Python

The lines starting with 'file_id=....' and below are indented with 2 spaces, whereas the code above them is indented with 4 spaces. If these lines are meant to be inside the 'main' function, they need to be indented like the rest of the code: 4 spaces at a time.

This is a common error when copying and pasting code. If you plan to extend the code further, either modify the copied code to match or change your style to suit. Note that development environments like PyCharm and VSCode have settings to configure the default indentation.

I'm trying to get the direct download link for a file using the Google Drive API

I believe your goal is as follows.

  • From "I'm trying to get the direct download link for a file in Google Drive using the Google Drive API (v3)", I understand that you want to retrieve webContentLink.
  • The files for which you want to retrieve webContentLink are files other than Google Docs files.
  • You have already been able to get the file metadata using the Drive API, so your access token can be used for this.

Modification points:

  • When the file is not shared, the API key cannot be used. Because of this, https://www.googleapis.com/drive/v3/files/**FILE_ID**?alt=media&supportsAllDrives=True&includeItemsFromAllDrives=True&key=**API_KEY** returns File not found. I think this is the reason for the issue.
  • Looking at the script in your question, it seems that you want to download the file content.
  • In your script, headers is not used, so the access token is not used.
  • The "Files: get" method has no includeItemsFromAllDrives parameter.
  • In your script, I think an error occurs at credentials.access_token. If my understanding is correct, please try changing it to accessToken = credentials.token.
  • In Drive API v3, the default response values don't include webContentLink, so the fields value must be set, like fields=webContentLink.

When your script is modified, it becomes as follows.

Modified script:

file_id = '###'  # Please set the file ID.
req_url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?supportsAllDrives=true&fields=webContentLink"
headers = {'Authorization': 'Bearer %s' % accessToken}
res = requests.get(req_url, headers=headers)
obj = res.json()
print(obj.get('webContentLink'))

Or, if you use drive = build('drive', 'v3', credentials=credentials) in your script, you can also use the following script.

file_id = '###'  # Please set the file ID.
drive = build('drive', 'v3', credentials=credentials)
request = drive.files().get(fileId=file_id, supportsAllDrives=True, fields='webContentLink').execute()
print(request.get('webContentLink'))

Note:

  • In this modified script,
    • When the file is in a shared drive and you don't have permission to retrieve the file metadata, an error occurs.
    • When your access token cannot be used for retrieving the file metadata, an error occurs.

So please be careful about the above points.

  • When * is used for fields, all file metadata can be retrieved.

Reference:

  • Files: get

Added:

  • You want to download the binary data from the Google Drive by the URL.
  • The file size is large like "2-10 gigabytes".

In this case, unfortunately, webContentLink cannot be used, because for such a large file webContentLink is redirected. So I think that publicly sharing the file and using the API key is the suitable method for achieving your goal. However, you cannot publicly share the file.

Given this situation, as a workaround, I would like to propose the method "One Time Download for Google Drive". In Google Drive, once the download of a publicly shared file has started, the download continues even if the sharing permission is removed while it is in progress. This method takes advantage of that.

Flow

In this sample script, the API key is used.

  1. Send a request to the Web App with the API key and the ID of the file you want to download.
  2. In the Web App, the following steps are run.
    • The permissions of the file with the received file ID are changed, and the file becomes publicly shared.
    • A time-driven trigger is installed. In this case, the trigger runs after 1 minute.
      • When the function is run by the time-driven trigger, the file permissions are changed again and sharing is stopped. In this way, the file is shared for only one minute.
  3. The Web App returns the endpoint for downloading the file with that file ID.
    • After you get the endpoint, please download the file from it within 1 minute, because the file is shared for only one minute.

Usage:

1. Create a standalone script

In this workaround, Google Apps Script is used as the server side. Please create a standalone script.
If you want to create it directly, please go to https://script.new/. If you are not logged in to Google, the login screen opens, so please log in. The script editor of Google Apps Script then opens.

2. Set sample script of Server side

Please copy and paste the following script into the script editor, and set your API key to the variable key in the function doGet(e). The Web App runs the script only when the API key sent in the request matches this value.

function deletePermission() {
  const forTrigger = "deletePermission";
  const id = CacheService.getScriptCache().get("id");
  const triggers = ScriptApp.getProjectTriggers();
  triggers.forEach(function(e) {
    if (e.getHandlerFunction() == forTrigger) ScriptApp.deleteTrigger(e);
  });
  const file = DriveApp.getFileById(id);
  file.setSharing(DriveApp.Access.PRIVATE, DriveApp.Permission.NONE);
}

function checkTrigger(forTrigger) {
  const triggers = ScriptApp.getProjectTriggers();
  for (var i = 0; i < triggers.length; i++) {
    if (triggers[i].getHandlerFunction() == forTrigger) {
      return false;
    }
  }
  return true;
}

function doGet(e) {
  const key = "###"; // <--- API key. This is also used for checking the user.

  const forTrigger = "deletePermission";
  var res = "";
  if (checkTrigger(forTrigger)) {
    if ("id" in e.parameter && e.parameter.key == key) {
      const id = e.parameter.id;
      CacheService.getScriptCache().put("id", id, 180);
      const file = DriveApp.getFileById(id);
      file.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.VIEW);
      var d = new Date();
      d.setMinutes(d.getMinutes() + 1);
      ScriptApp.newTrigger(forTrigger).timeBased().at(d).create();
      res = "https://www.googleapis.com/drive/v3/files/" + id + "?alt=media&key=" + e.parameter.key;
    } else {
      res = "unavailable";
    }
  } else {
    res = "unavailable";
  }
  return ContentService.createTextOutput(res);
}

3. Deploy Web Apps

  1. In the script editor, open the dialog box via "Publish" -> "Deploy as web app".
  2. Select "Me" for "Execute the app as:".
  3. Select "Anyone, even anonymous" for "Who has access to the app:". This is a test case.
    • If "Only myself" is used, only you can access the Web App. In that case, please use your access token.
  4. Click the "Deploy" button to deploy as a new "Project version".
  5. The "Authorization required" dialog box opens automatically.
    1. Click "Review Permissions".
    2. Select your own account.
    3. Click "Advanced" at "This app isn't verified".
    4. Click "Go to ### project name ###(unsafe)"
    5. Click "Allow" button.
  6. Click "OK"

4. Test run: Client side

This is a sample Python script. Before you test it, please confirm that the above script is deployed as a Web App, and set the URL of the Web App, the file ID and your API key.

import requests

url1 = "https://script.google.com/macros/s/###/exec"
url1 += "?id=###fileId###&key=###your API key###"
res1 = requests.get(url1)  # The Web App shares the file and returns the download endpoint.
url2 = res1.text
res2 = requests.get(url2)
with open("###sampleFilename###", "wb") as f:
    f.write(res2.content)

  • This sample script first sends a request to the Web App with the file ID and API key, and the file is publicly shared for 1 minute. The file can then be downloaded. After 1 minute the file is no longer publicly shared, but a download that has already started can continue.
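For files in the "2-10 gigabytes" range, reading the whole response into memory with res2.content may not be practical. The following is a minimal sketch that streams the body to disk in chunks. It swaps in the standard library's urllib.request instead of requests, and the name stream_to_file and the chunk size are my own choices for illustration.

```python
import shutil
import urllib.request


def stream_to_file(url, destination, chunk_size=1024 * 1024):
    """Copy the response body to disk in fixed-size chunks instead of buffering it all."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        # copyfileobj reads and writes chunk_size bytes at a time.
        shutil.copyfileobj(response, out, length=chunk_size)
    return destination
```

Here url would be the endpoint returned by the Web App (url2 in the sample), and the call has to start within the one-minute sharing window.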

Note:

  • When you modify the Web App script, please redeploy the Web App as a new version so that the latest script is reflected. Please be careful about this.

References:

  • One Time Download for Google Drive
  • Web Apps
  • Taking advantage of Web Apps with Google Apps Script

Downloading files from public Google Drive in python: scoping issues?

This is due to a security update that Google released a few months ago. It makes link sharing stricter: in addition to the fileId, you now also need a resource key to access the file.

As per the documentation, for newer links you need to provide the resource key in the X-Goog-Drive-Resource-Keys header, in the form fileId1/resourceKey1.

If you apply this change to your code, it will work as normal. Example edit below:

# File IDs and resource keys can contain "_" and "-" as well as letters and digits.
regex = "(?<=https://drive.google.com/file/d/)[a-zA-Z0-9_-]+"
regex_rkey = "(?<=resourcekey=)[a-zA-Z0-9_-]+"
for i, l in enumerate(links_to_download):
    url = l
    file_id = re.search(regex, url)[0]
    resource_key = re.search(regex_rkey, url)[0]
    request = drive_service.files().get_media(fileId=file_id)
    request.headers["X-Goog-Drive-Resource-Keys"] = f"{file_id}/{resource_key}"
    fh = io.FileIO(f"file_{i}", mode='wb')
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))

The regex for the resource key is something I put together quickly, so I can't be sure it covers every case, but this gives you the solution.
You may now have to handle both old and new links and adjust accordingly.
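One way to handle both old and new links is to parse the URL rather than rely on a lookbehind regex alone. This is only a sketch: the function names are hypothetical, and it assumes links of the form https://drive.google.com/file/d/FILE_ID/... with an optional resourcekey query parameter.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumed link shape: .../file/d/<file id>/...?resourcekey=<key>
_FILE_ID_IN_PATH = re.compile(r"/file/d/([A-Za-z0-9_-]+)")


def parse_drive_link(url):
    """Return (file_id, resource_key); resource_key is None for links without one."""
    parsed = urlparse(url)
    match = _FILE_ID_IN_PATH.search(parsed.path)
    if match is None:
        raise ValueError("no file ID found in link: {}".format(url))
    resource_key = parse_qs(parsed.query).get("resourcekey", [None])[0]
    return match.group(1), resource_key


def resource_key_header(file_id, resource_key):
    """Build the X-Goog-Drive-Resource-Keys header, or no header for old links."""
    if resource_key is None:
        return {}
    return {"X-Goog-Drive-Resource-Keys": "{}/{}".format(file_id, resource_key)}
```

In the loop, request.headers.update(resource_key_header(file_id, resource_key)) would then add the header only when the link carries a resource key.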

How to do a partial download in Google Drive Api v3?

  • You want to achieve the partial download of the file from Google Drive using google-api-python-client with python.
  • You have already been able to download the file from Google Drive using Drive API with your script.

If my understanding is correct, how about this answer? Please think of this as just one of several possible answers.

Modification points:

  • In this case, a range like Range: bytes=500-999 must be included in the request header. This has already been mentioned in your question.

    • For request = drive_service.files().get_media(fileId=file_id), add the range to request.headers.

When your script is modified, it becomes as follows.

Modified script:

From:

request = drive_service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request, chunksize=length)
done = False
while done is False:
    status, done = downloader.next_chunk()
return fh.getvalue()

To:

request = drive_service.files().get_media(fileId=file_id)
# The end of an HTTP byte range is inclusive, so the last byte is start + length - 1.
request.headers["Range"] = "bytes={}-{}".format(start, start + length - 1)
fh = io.BytesIO(request.execute())
return fh.getvalue()

Note:

  • In the modified script above, MediaIoBaseDownload is not used, because it was found that with MediaIoBaseDownload the file is downloaded completely, ignoring the range property.
  • You can also use requests, as follows.

    url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?alt=media"
    # The end of an HTTP byte range is inclusive, so the last byte is start + length - 1.
    headers = {"Authorization": "Bearer ###accessToken###", "Range": "bytes={}-{}".format(start, start + length - 1)}
    res = requests.get(url, headers=headers)
    fh = io.BytesIO(res.content)
    return fh.getvalue()

Reference:

  • Partial download
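Since both ends of an HTTP byte range are inclusive, it is easy to be off by one when converting a start offset and a length into a header. A small helper like the following may help; the name range_header is hypothetical.

```python
def range_header(start, length):
    """Build a Range header covering exactly `length` bytes beginning at `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # Both ends of the range are inclusive, hence the -1 on the last byte.
    return {"Range": "bytes={}-{}".format(start, start + length - 1)}
```

For example, range_header(500, 500) gives the Range: bytes=500-999 header mentioned above, and the result can be merged into request.headers or into the headers passed to requests.get.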

If I misunderstood your question and this was not the direction you want, I apologize.

Google Drive Authenticate and Download Files with Service Account

Daily Limit for Unauthenticated Use Exceeded.

This error message is normally the result of not applying the authorization credentials in your code. I couldn't spot any issues with your code offhand, though. The first thing I would suggest is to double-check that the credentials added to your project are service account credentials and not the wrong type. However, I would have expected a different error message if that were the issue.

Try this; it is based on the official "manage downloads" samples:

import io

from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from oauth2client.service_account import ServiceAccountCredentials

SCOPES = ['https://www.googleapis.com/auth/drive']
KEY_FILE_LOCATION = '<REPLACE_WITH_JSON_FILE>'

def initialize_drive():
    """Initializes a Drive service object.

    Returns:
      An authorized Drive service object.
    """
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        KEY_FILE_LOCATION, SCOPES)

    # Build the service object.
    service = build('drive', 'v3', credentials=credentials)

    return service

def download_report(drive_service, file_id):
    # Download a binary (non-Google-Docs) file into memory.
    request = drive_service.files().get_media(fileId=file_id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
    return fh

Only files with binary content can be downloaded

Remember that there are two types of files on Drive: Google Drive MIME type files (Google Docs, Sheets and so on), which need to be downloaded using the export method, and all other, binary files. Binary files are downloaded using the method you are using now.

def main():
    service = initialize_drive()

    file_id = '###'  # Please set the file ID.
    buffer = download_report(service, file_id)

if __name__ == '__main__':
    main()

Export method

file_id = '1ZdR3L3qP4Bkq8noWLJHSr_iBau0DNT4Kli4SxNc2YEo'
request = drive_service.files().export_media(fileId=file_id,
                                             mimeType='application/pdf')
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
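To pick between the two methods automatically, one option is to look at the file's mimeType from its metadata. The sketch below assumes that Google-native types share the prefix application/vnd.google-apps. and that PDF is an acceptable export format. The function name is hypothetical, and not every Google-native type can be exported (forms or shortcuts, for instance), so treat it as a starting point.

```python
GOOGLE_APPS_PREFIX = "application/vnd.google-apps."
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


def build_download_request(service, file_meta, export_mime_type="application/pdf"):
    """Return an export_media request for Google-native files, get_media otherwise."""
    mime_type = file_meta["mimeType"]
    if mime_type == FOLDER_MIME_TYPE:
        raise ValueError("folders have no content to download")
    if mime_type.startswith(GOOGLE_APPS_PREFIX):
        # Google Docs, Sheets and similar files have to be converted on export.
        return service.files().export_media(
            fileId=file_meta["id"], mimeType=export_mime_type
        )
    # Everything else is stored as binary content and is fetched as-is.
    return service.files().get_media(fileId=file_meta["id"])
```

The returned request can be passed to MediaIoBaseDownload exactly as in the snippets above. file_meta is expected to hold at least id and mimeType, for example from files().get(fileId=..., fields='id, mimeType').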

Google Drive API:How to download files from google drive?

The access_token should not be placed in the request body; it should go in the request header. You can try this out on the OAuth Playground site.
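As a minimal sketch of what that looks like, the following builds (but does not send) a download request with the token in the Authorization header. It uses the standard library's urllib.request, and the function name build_media_request is hypothetical.

```python
import urllib.request


def build_media_request(file_id, access_token):
    """Create a GET request for the file content with the token in the header."""
    url = "https://www.googleapis.com/drive/v3/files/{}?alt=media".format(file_id)
    # The token travels in the Authorization header; the request has no body.
    return urllib.request.Request(
        url, headers={"Authorization": "Bearer {}".format(access_token)}
    )
```

Passing the returned object to urllib.request.urlopen would perform the actual download.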
