How to download a file from Google Drive using Python and the Drive API v3
To make requests to Google APIs, the workflow is essentially the following:
- Go to the developer console and log in if you haven't already.
- Create a Cloud Platform project.
- Enable, for your project, the APIs you want to use from your project's apps (for example, the Google Drive API).
- Create and download OAuth 2.0 Client ID credentials, which allow your app to obtain authorization for the enabled APIs.
- Head over to the OAuth consent screen and add your scope (in your case, https://www.googleapis.com/auth/drive.readonly). Choose Internal or External according to your needs, and for now ignore any warnings.
- To get a valid token for making API requests, the app goes through the OAuth flow to receive the authorization token (since it needs consent).
- During the OAuth flow, the user is redirected to your OAuth consent screen and asked to approve or deny access to your app's requested scopes.
- If consent is given, your app receives an authorization token.
- Pass the token in your requests to the authorized API endpoints.[2]
- Build a Drive Service to make API requests (you will need the valid token).[1]
NOTE:
The available methods for the Files resource in Drive API v3 are listed in the API reference. When using the Python Google APIs Client, you can use export_media() or get_media(), as per the Google APIs Client for Python documentation.
IMPORTANT:
Also, check that the scope you are using actually allows you to do what you want (downloading files from the user's Drive) and set it accordingly. At the moment you have an incorrect scope for your goal. See OAuth 2.0 API Scopes.
Sample Code References:
- Building a Drive Service:
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build


class Auth:
    def __init__(self, client_secret_filename, scopes):
        self.client_secret = client_secret_filename
        self.scopes = scopes
        self.creds = None

    def get_credentials(self):
        # Opens a browser for consent and listens on localhost:8080 for the redirect.
        flow = InstalledAppFlow.from_client_secrets_file(self.client_secret, self.scopes)
        self.creds = flow.run_local_server(port=8080)
        return self.creds


# The scopes your app will use.
# (They NEED to be among those enabled in your OAuth consent screen.)
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]
CLIENT_SECRET_FILE = "credentials.json"

credentials = Auth(client_secret_filename=CLIENT_SECRET_FILE, scopes=SCOPES).get_credentials()
drive_service = build('drive', 'v3', credentials=credentials)
- Making the request to export or get a file
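import io
import shutil

from googleapiclient.http import MediaIoBaseDownload

# file_id is the ID of the Google Docs file you want to export.
# export_media() is the media-download variant used with MediaIoBaseDownload.
request = drive_service.files().export_media(fileId=file_id, mimeType='application/pdf')
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%" % int(status.progress() * 100))

# The file has been downloaded into RAM, now save it in a file
fh.seek(0)
with open('your_filename.pdf', 'wb') as f:
    shutil.copyfileobj(fh, f, length=131072)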
How to save downloaded data from Google Drive in file - Python Drive API
- You want to save the downloaded file as a file to the local PC.
- You want to achieve this using google-api-python-client with Python.
- You have already been able to get and put the file using Drive API.
From:
fh = io.BytesIO()
To:
fh = io.FileIO("### filename ###", mode='wb')
References:
- Binary I/O
- File I/O
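As a rough illustration of why this one-line swap is enough: the downloader only needs a target object with a write() method, so an in-memory buffer and an on-disk file are interchangeable. The sketch below uses only the standard library. The write_chunks helper and the chunk data are hypothetical stand-ins for a download loop and are not part of the Drive client.

```python
import io
import os
import tempfile


def write_chunks(fh, chunks):
    """Write each chunk to fh, the way a downloader writes to its target."""
    total = 0
    for chunk in chunks:
        total += fh.write(chunk)
    return total


chunks = [b"first-", b"second-", b"third"]  # hypothetical downloaded chunks

# In-memory target: everything stays in RAM until you copy it out.
memory_target = io.BytesIO()
write_chunks(memory_target, chunks)

# On-disk target: each chunk goes straight to the file.
path = os.path.join(tempfile.mkdtemp(), "download.bin")
with io.FileIO(path, mode="wb") as file_target:
    write_chunks(file_target, chunks)

with open(path, "rb") as f:
    on_disk = f.read()
```

With io.FileIO the data never has to fit in memory, which matters for large files.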
Google drive api python - How to download all the files inside a folder or the folder to a specific local destination path
Do a files.list with the parent set to the file ID of the directory you are looking for.
Then loop through each file returned and download it using the code you have now.
file_id = "FILE_ID_FROM_FIRST_CALL"  # a file ID returned by the files.list call
request = service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
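The listing step described above could be sketched roughly as follows. This is a minimal sketch, assuming service is an already authorized Drive v3 service object. The helper name iter_folder_files is hypothetical, and the query skips trashed files as an extra assumption.

```python
def iter_folder_files(service, folder_id):
    """Yield metadata dicts for the direct children of folder_id, page by page."""
    page_token = None
    while True:
        response = service.files().list(
            q=f"'{folder_id}' in parents and trashed = false",
            fields="nextPageToken, files(id, name, mimeType)",
            pageToken=page_token,
        ).execute()
        yield from response.get("files", [])
        # A missing nextPageToken means this was the last page.
        page_token = response.get("nextPageToken")
        if page_token is None:
            break
```

Each yielded dict carries an "id" that can be passed as file_id to the get_media() download loop, and "name" can be joined with your local destination path to build the output filename.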
If you are looking for a method that will do it all for you, you won't find one. You will need to download them one at a time.
python: How do i download a file from Google drive using api
Just write the content variable to a file instead of returning the content:
with open("foo.jpg", "wb") as fo:
    fo.write(content)
Google Drive API downloading files with Python
The lines starting with 'file_id=....' and below are indented with 2 spaces from the left, but the code above them is indented with 4 spaces. If these lines are meant to be within the 'main' function, they need to be indented like the rest of the code: 4 spaces at a time.
This is a common error when copying and pasting code. If you plan to extend the code further, either modify the copied code to match or change your own style to suit. Note that development environments like PyCharm and VSCode have settings to configure the default indentation.
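To see the effect in isolation, here is a small sketch that compiles two versions of a snippet: one mixing 4-space and 2-space indentation inside the same function, and one re-indented consistently. The snippet contents and the check_indentation helper are made up for illustration.

```python
def check_indentation(source):
    """Return None if source compiles, else the IndentationError message."""
    try:
        compile(source, "<pasted>", "exec")
    except IndentationError as err:
        return err.msg
    return None


# Function body at 4 spaces, pasted line at 2 spaces: the mix is rejected.
mixed = "def main():\n    service = 'drive'\n  file_id = 'abc'\n"
# Same code with the pasted line re-indented to 4 spaces.
fixed = "def main():\n    service = 'drive'\n    file_id = 'abc'\n"

print(check_indentation(mixed))
print(check_indentation(fixed))
```

The first call reports an indentation problem, while the second returns None because the block is consistent.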
I'm trying to get the direct download link for a file using the Google Drive API
I believe your goal is as follows.
- From "I'm trying to get the direct download link for a file in Google Drive using the Google Drive API (v3)", I understand that you want to retrieve webContentLink.
- The file whose webContentLink you want to retrieve is not a Google Docs file.
- You have already been able to get the file metadata using the Drive API, so your access token can be used for this.
Modification points:
- When the file is not shared, the API key cannot be used. Because of this, https://www.googleapis.com/drive/v3/files/**FILE_ID**?alt=media&supportsAllDrives=True&includeItemsFromAllDrives=True&key=**API_KEY** returns File not found. I think this is the reason for the issue.
- When I saw the script in your question, it seems that you want to download the file content.
- In your script, headers is not used, so the access token is not used.
- The "Files: get" method has no includeItemsFromAllDrives parameter.
- In your script, I think that an error occurs at credentials.access_token. If my understanding is correct, please try modifying it to accessToken = credentials.token.
- In Drive API v3, the default response values don't include webContentLink, so the fields value must be set, like fields=webContentLink.
Modified script:
import requests

file_id = '###' # Please set the file ID.
accessToken = credentials.token
req_url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?supportsAllDrives=true&fields=webContentLink"
headers = {'Authorization': 'Bearer %s' % accessToken}
res = requests.get(req_url, headers=headers)
obj = res.json()
print(obj.get('webContentLink'))
Or, if you can use drive = build('drive', 'v3', credentials=credentials) in your script, you can also use the following script.
file_id = '###' # Please set the file ID.
drive = build('drive', 'v3', credentials=credentials)
res = drive.files().get(fileId=file_id, supportsAllDrives=True, fields='webContentLink').execute()
print(res.get('webContentLink'))
Note:
- In this modified script,
  - When the file is in a shared drive and you don't have permission to retrieve the file metadata, an error occurs.
  - When your access token cannot be used for retrieving the file metadata, an error occurs.
- When * is used for fields, all file metadata can be retrieved.
Reference:
- Files: get
Added:
- You want to download the binary data from Google Drive by URL.
- The file size is large, like "2-10 gigabytes".
In this case, webContentLink cannot be used, because for such a large file webContentLink is redirected. So I think that publicly sharing the file and using the API key is suitable for achieving your goal. But you cannot publicly share the file.
From this situation, as a workaround, I would like to propose the following method: "One Time Download for Google Drive". In Google Drive, when a publicly shared file is being downloaded, the download continues even if the file's sharing permission is removed during the download. This method relies on that behavior.
Flow
In this sample script, the API key is used.
- Send a request to the Web Apps with the API key and the ID of the file you want to download.
- At the Web Apps, the following functions are run.
  - The permissions of the file with the received ID are changed, and the file starts being publicly shared.
  - A time-driven trigger is installed. In this case, the trigger runs after 1 minute.
  - When the function is run by the time-driven trigger, the permissions of the file are changed and sharing is stopped. This way, the file is shared for only one minute.
- The Web Apps returns the endpoint for downloading the file.
- After you get the endpoint, please download the file from it within 1 minute, because the file is shared for only one minute.
Usage:
1. Create a standalone script
In this workaround, Google Apps Script is used as the server side. Please create a standalone script.
If you want to create it directly, please access https://script.new/. In this case, if you are not logged in to Google, the login screen is opened, so please log in. The script editor of Google Apps Script then opens.
2. Set sample script of Server side
Please copy and paste the following script into the script editor. At that time, please set your API key to the variable key in the function doGet(e). In this Web Apps, the script runs only when the API key sent in the request matches it.
function deletePermission() {
  const forTrigger = "deletePermission";
  const id = CacheService.getScriptCache().get("id");
  const triggers = ScriptApp.getProjectTriggers();
  triggers.forEach(function(e) {
    if (e.getHandlerFunction() == forTrigger) ScriptApp.deleteTrigger(e);
  });
  const file = DriveApp.getFileById(id);
  file.setSharing(DriveApp.Access.PRIVATE, DriveApp.Permission.NONE);
}

function checkTrigger(forTrigger) {
  const triggers = ScriptApp.getProjectTriggers();
  for (var i = 0; i < triggers.length; i++) {
    if (triggers[i].getHandlerFunction() == forTrigger) {
      return false;
    }
  }
  return true;
}

function doGet(e) {
  const key = "###"; // <--- API key. This is also used for checking the user.
  const forTrigger = "deletePermission";
  var res = "";
  if (checkTrigger(forTrigger)) {
    if ("id" in e.parameter && e.parameter.key == key) {
      const id = e.parameter.id;
      CacheService.getScriptCache().put("id", id, 180);
      const file = DriveApp.getFileById(id);
      file.setSharing(DriveApp.Access.ANYONE_WITH_LINK, DriveApp.Permission.VIEW);
      var d = new Date();
      d.setMinutes(d.getMinutes() + 1);
      ScriptApp.newTrigger(forTrigger).timeBased().at(d).create();
      res = "https://www.googleapis.com/drive/v3/files/" + id + "?alt=media&key=" + e.parameter.key;
    } else {
      res = "unavailable";
    }
  } else {
    res = "unavailable";
  }
  return ContentService.createTextOutput(res);
}
3. Deploy Web Apps
- On the script editor, Open a dialog box by "Publish" -> "Deploy as web app".
- Select "Me" for "Execute the app as:".
- Select "Anyone, even anonymous" for "Who has access to the app:". This is a test case.
  - If "Only myself" is used, only you can access the Web Apps. In that case, please use your access token.
- Click "Deploy" button as new "Project version".
- Automatically open a dialog box of "Authorization required".
- Click "Review Permissions".
- Select own account.
- Click "Advanced" at "This app isn't verified".
- Click "Go to ### project name ###(unsafe)"
- Click "Allow" button.
- Click "OK"
4. Test run: Client side
This is a sample script in Python. Before you test this, please confirm that the above script is deployed as Web Apps, and set the URL of the Web Apps, the file ID, and your API key.
import requests

url1 = "https://script.google.com/macros/s/###/exec"
url1 += "?id=###fileId###&key=###your API key###"
res1 = requests.get(url1)
url2 = res1.text
# Stream the response so a multi-gigabyte file is not held in memory at once.
with requests.get(url2, stream=True) as res2:
    with open("###sampleFilename###", "wb") as f:
        for chunk in res2.iter_content(chunk_size=1024 * 1024):
            f.write(chunk)
- In this sample script, a request is first sent to the Web Apps with the file ID and API key, and the file is publicly shared for 1 minute. The file can then be downloaded. After 1 minute, the file is no longer publicly shared, but a download already in progress continues.
Note:
- When you modify the script of the Web Apps, please redeploy the Web Apps as a new version so that the latest script is reflected in the Web Apps. Please be careful about this.
References:
- One Time Download for Google Drive
- Web Apps
- Taking advantage of Web Apps with Google Apps Script
Downloading files from public Google Drive in python: scoping issues?
Thanks to the security update released by Google a few months ago, link sharing is stricter, and you need the resource key in addition to the fileId to access the file.
As per the documentation, for newer links you need to provide the resource key as well, in the header X-Goog-Drive-Resource-Keys, as fileId1/resourceKey1.
If you apply this change in your code, it will work as normal. Example edit below:
# File IDs can contain "-" and "_" as well as letters and digits.
regex = "(?<=https://drive.google.com/file/d/)[a-zA-Z0-9_-]+"
regex_rkey = "(?<=resourcekey=)[a-zA-Z0-9-]+"
for i, url in enumerate(links_to_download):
    file_id = re.search(regex, url)[0]
    resource_key = re.search(regex_rkey, url)[0]
    request = drive_service.files().get_media(fileId=file_id)
    request.headers["X-Goog-Drive-Resource-Keys"] = f"{file_id}/{resource_key}"
    fh = io.FileIO(f"file_{i}", mode='wb')
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
The regex for the resource key was something I quickly made, so I cannot be sure it supports every case, but this gives you the solution.
Now, you may have to handle both old and new links and adjust accordingly.
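As one possible way to handle both old and new links without relying on a hand-written regex, the link can be parsed with the standard library. This is only a sketch: it assumes links of the form https://drive.google.com/file/d/<id>/... with an optional resourcekey query parameter, and the helper names parse_drive_link and resource_key_headers are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def parse_drive_link(url):
    """Return (file_id, resource_key) from a /file/d/<id>/ sharing link.

    resource_key is None for older links that carry no resourcekey parameter.
    """
    parts = urlparse(url)
    segments = parts.path.split("/")
    try:
        # The path looks like /file/d/<file_id>/view
        file_id = segments[segments.index("d") + 1]
    except (ValueError, IndexError):
        file_id = ""
    if not file_id:
        raise ValueError(f"no file ID found in {url!r}")
    resource_key = parse_qs(parts.query).get("resourcekey", [None])[0]
    return file_id, resource_key


def resource_key_headers(file_id, resource_key):
    """Build the extra request header, or an empty dict when no key is present."""
    if resource_key is None:
        return {}
    return {"X-Goog-Drive-Resource-Keys": f"{file_id}/{resource_key}"}
```

In the download loop, request.headers.update(resource_key_headers(file_id, resource_key)) would then add the header only for links that actually carry a resource key.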
How to do a partial download in Google Drive Api v3?
- You want to achieve a partial download of the file from Google Drive using google-api-python-client with Python.
- You have already been able to download the file from Google Drive using the Drive API with your script.
Modification points:
- In this case, a range property like Range: bytes=500-999 must be included in the request header. This has already been mentioned in your question.
- For request = drive_service.files().get_media(fileId=file_id), the range property is added to the request header.
Modified script:
From:
request = drive_service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request, chunksize=length)
done = False
while done is False:
    status, done = downloader.next_chunk()
return fh.getvalue()
To:
request = drive_service.files().get_media(fileId=file_id)
# Range is inclusive at both ends, so the last byte is start + length - 1.
request.headers["Range"] = "bytes={}-{}".format(start, start + length - 1)
fh = io.BytesIO(request.execute())
return fh.getvalue()
Note:
- In the above modified script, when MediaIoBaseDownload is used, it was found that the file is completely downloaded without using the range property. So I don't use MediaIoBaseDownload.
- You can also use requests as follows.
url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?alt=media"
headers = {"Authorization": "Bearer ###accessToken###", "Range": "bytes={}-{}".format(start, start + length - 1)}
res = requests.get(url, headers=headers)
fh = io.BytesIO(res.content)
return fh.getvalue()
Reference:
- Partial download
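Because HTTP byte ranges are inclusive at both ends (bytes=500-999 is 500 bytes), computing the end offset is an easy place to be off by one. The helpers below are a small sketch of that arithmetic. The names byte_range_header and split_into_ranges are hypothetical and not part of any Drive library.

```python
def byte_range_header(start, length):
    """Return a Range header covering exactly `length` bytes from `start`."""
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    # HTTP byte ranges are inclusive at both ends.
    return {"Range": f"bytes={start}-{start + length - 1}"}


def split_into_ranges(total_size, chunk_size):
    """Yield (start, length) pairs that cover total_size bytes without overlap."""
    for start in range(0, total_size, chunk_size):
        yield start, min(chunk_size, total_size - start)
```

For example, request.headers.update(byte_range_header(start, length)) sets the header on a get_media() request, and split_into_ranges can drive a loop that fetches a large file piece by piece.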
Google Drive Authenticate and Download Files with Service Account
Daily Limit for Unauthenticated Use Exceeded.
This error message is normally the result of not applying the authorization credentials in your code. I couldn't spot any issues with your code right away, though. The first thing I would suggest is that you double-check that you have service account credentials added in your project and not the wrong type. However, I would have expected a different error message if this was the issue.
Try this; it's based upon the official samples for managing downloads.
import io

from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload
from oauth2client.service_account import ServiceAccountCredentials

SCOPES = ['https://www.googleapis.com/auth/drive']
KEY_FILE_LOCATION = '<REPLACE_WITH_JSON_FILE>'


def initialize_drive():
    """Initializes a drive service object.

    Returns:
      An authorized drive service object.
    """
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        KEY_FILE_LOCATION, SCOPES)
    # Build the service object.
    service = build('drive', 'v3', credentials=credentials)
    return service


def download_report(drive_service, file_id):
    # Example file ID from the sample: '0BwwA4oUTeiV1UVNwOHItT0xfa2M'
    request = drive_service.files().get_media(fileId=file_id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
    return fh
Only files with binary content can be downloaded
Remember there are two types of files on Drive: Google Drive mimetype files, which need to be downloaded using the export method, and all other binary-type files. Binary files are downloaded using the method you are using now.
def main():
    service = initialize_drive()
    buffer = download_report(service, '<REPLACE_WITH_FILE_ID>')


if __name__ == '__main__':
    main()
Export method
file_id = '1ZdR3L3qP4Bkq8noWLJHSr_iBau0DNT4Kli4SxNc2YEo'
request = drive_service.files().export_media(fileId=file_id,
                                             mimeType='application/pdf')
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
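The distinction between the two file types can be sketched as a small dispatcher that picks the request based on the file's mimeType. This is a minimal sketch, assuming drive_service is an authorized Drive v3 service and file_meta is a metadata dict that includes "id" and "mimeType". The helper name build_download_request and the PDF default are assumptions for illustration.

```python
GOOGLE_APPS_PREFIX = "application/vnd.google-apps."
FOLDER_MIME_TYPE = "application/vnd.google-apps.folder"


def build_download_request(drive_service, file_meta, export_mime_type="application/pdf"):
    """Pick export_media() for Google-native files and get_media() for the rest.

    file_meta is a dict with at least "id" and "mimeType", as returned by
    files().get() or files().list() when those fields are requested.
    """
    mime_type = file_meta["mimeType"]
    if mime_type == FOLDER_MIME_TYPE:
        raise ValueError("folders have no content to download")
    files = drive_service.files()
    if mime_type.startswith(GOOGLE_APPS_PREFIX):
        # Google Docs, Sheets, Slides, etc. must be converted on export.
        return files.export_media(fileId=file_meta["id"], mimeType=export_mime_type)
    # Everything else is stored as binary content and downloaded directly.
    return files.get_media(fileId=file_meta["id"])
```

The returned request can be passed to MediaIoBaseDownload exactly as in the loops above. The export MIME type has to be one that Drive supports for the given document type, so the PDF default will not suit every file.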
Google Drive API:How to download files from google drive?
The access_token should not be placed in the request body; it should be put in the header. You can try this on the oauthplayground site.
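A minimal sketch of what putting the token in the header looks like, using only the standard library. The file ID and token are placeholders, the helper name build_media_request is hypothetical, and the request is prepared but not sent.

```python
import urllib.request


def build_media_request(file_id, access_token):
    """Prepare (but do not send) a GET for the file content with a Bearer token."""
    url = f"https://www.googleapis.com/drive/v3/files/{file_id}?alt=media"
    # The token travels in the Authorization header, not in the request body.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {access_token}"})


req = build_media_request("FILE_ID", "ACCESS_TOKEN")  # placeholder values
# urllib.request.urlopen(req) would send it; that needs a real file ID and token.
```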