How to Download Entire Folder Located on S3 Bucket

Downloading an entire S3 bucket?

AWS CLI

See the "AWS CLI Command Reference" for more information.

AWS released its Command Line Tools (the AWS CLI), which work much like boto and can be installed using

sudo easy_install awscli

or

sudo pip install awscli

Once installed, you can then simply run:

aws s3 sync s3://<source_bucket> <local_destination>

For example:

aws s3 sync s3://mybucket .

will download all the objects in mybucket to the current directory.

And will output:

download: s3://mybucket/test.txt to test.txt
download: s3://mybucket/test2.txt to test2.txt

This will download all of your files using a one-way sync. It will not delete any existing files in your current directory unless you specify --delete, and it won't change or delete any files on S3.
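For instance, if you do want the local directory to mirror the bucket exactly, removing local files that no longer exist in the bucket, a sketch of that (using the same example bucket) would be:

aws s3 sync s3://mybucket . --delete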

You can also sync from one S3 bucket to another, or from a local directory to an S3 bucket.
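As a rough sketch with placeholder names, those two directions look like:

aws s3 sync s3://<source_bucket> s3://<destination_bucket>
aws s3 sync <local_source_directory> s3://<destination_bucket>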

Check out the documentation and other examples.

While the above example shows how to download a full bucket, you can also download a specific folder recursively by running

aws s3 cp s3://BUCKETNAME/PATH/TO/FOLDER LocalFolderName --recursive

This instructs the CLI to recursively download all files and folder keys under the PATH/TO/FOLDER prefix in the BUCKETNAME bucket.
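For example, assuming a hypothetical bucket layout where monthly reports live under a reports/ prefix, that might look like:

aws s3 cp s3://mybucket/reports/2021-02 ./reports-2021-02 --recursive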

Amazon S3 console: download multiple files at once

It is not possible through the AWS Console web user interface, but it's a very simple task if you install the AWS CLI.
You can check the installation and configuration steps in "Installing the AWS Command Line Interface".
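If you have not configured credentials yet, running aws configure will prompt you for an access key, secret key, default region, and output format before the commands below will work:

aws configure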

After that, go to the command line and run:

aws s3 cp --recursive s3://<bucket>/<folder> <local_folder> 

This will copy all the files from the given S3 path to the given local path.
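If you only need part of the folder, the same command accepts include/exclude filters; as a sketch, copying only .csv files (the pattern here is just illustrative) would be:

aws s3 cp --recursive s3://<bucket>/<folder> <local_folder> --exclude "*" --include "*.csv"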

Download Entire Content of a Subfolder in an S3 Bucket

I think your best bet would be the awscli:

aws s3 cp --recursive s3://mybucket/your_folder_named_a path/to/your/destination

From the docs:

--recursive (boolean) Command is performed on all files or objects under the specified directory or prefix.
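If you want to preview what would be copied before actually downloading anything, the CLI also supports a dry run; a sketch using the same placeholder paths as above:

aws s3 cp --recursive s3://mybucket/your_folder_named_a path/to/your/destination --dryrun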

EDIT:

To do this with boto3, try this:

import os
import errno
import boto3

client = boto3.client('s3')


def assert_dir_exists(path):
    try:
        os.makedirs(path)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise


def download_dir(bucket, path, target):
    # Handle missing / at end of prefix
    if not path.endswith('/'):
        path += '/'

    paginator = client.get_paginator('list_objects_v2')
    for result in paginator.paginate(Bucket=bucket, Prefix=path):
        # Download each file individually
        for key in result.get('Contents', []):
            # Calculate relative path
            rel_path = key['Key'][len(path):]
            # Skip paths ending in /
            if not key['Key'].endswith('/'):
                local_file_path = os.path.join(target, rel_path)
                # Make sure directories exist
                local_file_dir = os.path.dirname(local_file_path)
                assert_dir_exists(local_file_dir)
                client.download_file(bucket, key['Key'], local_file_path)


download_dir('your_bucket', 'your_folder', 'destination')

How to download specific folder content from an AWS S3 bucket using Python

Here is an example of how to do that with MinIO (Amazon S3 compatible) using Python:

from minio import Minio

client = Minio(
    "localhost:port",
    access_key="access_key",
    secret_key="secret_key",
    secure=False,
)
# recursive=True walks into sub-prefixes instead of stopping at the next delimiter
objects = client.list_objects("index", prefix="public/", recursive=True)
for obj in objects:
    # for example, download each object to a local path mirroring its object name
    client.fget_object("index", obj.object_name, obj.object_name)

How to download everything in that folder using boto3

Marcin's answer is correct, but files with the same name in different paths would be overwritten.
You can avoid that by replicating the folder structure of the S3 bucket locally.

import boto3
import os
from pathlib import Path

s3 = boto3.resource('s3')

bucket = s3.Bucket('bucket')

key = 'product/myproject/2021-02-15/'
objs = list(bucket.objects.filter(Prefix=key))

for obj in objs:
    # print(obj.key)

    # skip zero-byte "directory" placeholder keys
    if obj.key.endswith('/'):
        continue

    # remove the file name from the object key
    obj_path = os.path.dirname(obj.key)

    # create nested directory structure
    Path(obj_path).mkdir(parents=True, exist_ok=True)

    # save file with full path locally
    bucket.download_file(obj.key, obj.key)
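
If you would rather root the replicated structure under a specific local directory instead of the current working directory, a minimal variation (reusing bucket, key, os, and Path from the snippet above; dest_dir is just an illustrative name) would be:

dest_dir = 'destination'

for obj in bucket.objects.filter(Prefix=key):
    # skip "directory" placeholder keys
    if obj.key.endswith('/'):
        continue

    # build the local path under dest_dir and create parent folders
    local_path = os.path.join(dest_dir, obj.key)
    Path(os.path.dirname(local_path)).mkdir(parents=True, exist_ok=True)

    # download the object to that local path
    bucket.download_file(obj.key, local_path)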

