Getting S3 Objects' Last Modified Datetimes With Boto

Here's a snippet of Python/boto code that will print the last_modified attribute of all keys in a bucket:

>>> import boto
>>> s3 = boto.connect_s3()
>>> bucket = s3.lookup('mybucket')
>>> for key in bucket:
...     print key.name, key.size, key.last_modified
...
index.html 13738 2012-03-13T03:54:07.000Z
markdown.css 5991 2012-03-06T18:32:43.000Z
>>>
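
As the output above shows, the listing returns last_modified as a string rather than a datetime. The following is a minimal sketch for converting it, assuming the timestamp always has the form shown above (fractional seconds and a trailing Z). The helper name parse_s3_timestamp is made up for illustration and is not part of boto.

```python
from datetime import datetime, timezone

def parse_s3_timestamp(value):
    """Parse a listing timestamp such as '2012-03-13T03:54:07.000Z'
    into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    # The trailing 'Z' means UTC, so attach that timezone explicitly.
    return parsed.replace(tzinfo=timezone.utc)

print(parse_s3_timestamp("2012-03-13T03:54:07.000Z").isoformat())
```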

Find last modified date of a particular file in S3

You can do it this way:

import boto3

s3 = boto3.resource("s3")
s3_object = s3.Object("your_bucket", "your_object_key")
print(s3_object.last_modified)
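
The same value can also be read through the low-level client, whose head_object call returns a LastModified entry as a timezone-aware datetime. This is only a sketch: the function name fetch_last_modified is an assumption for illustration, and the client is passed in as a parameter so the function does not create a connection itself.

```python
def fetch_last_modified(client, bucket, key):
    """Return the LastModified datetime of a single object.

    `client` is expected to be a boto3 S3 client (or anything that
    offers a compatible head_object method).
    """
    response = client.head_object(Bucket=bucket, Key=key)
    return response["LastModified"]

# Usage (needs AWS credentials, so it is left as a comment):
#   import boto3
#   print(fetch_last_modified(boto3.client("s3"), "your_bucket", "your_object_key"))
```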

How to filter s3 objects by last modified date with Boto3

The following snippet lists all objects under a specific folder (key prefix) and prints those whose last-modified time is later than the date you specify.

Replace YEAR, MONTH, and DAY with your values.

import boto3
import datetime

# Bucket name
bucket_name = 'BUCKET NAME'
# Folder name (used as the key prefix)
folder_name = 'FOLDER NAME'
# Bucket resource
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket_name)

def lambda_handler(event, context):
    for file in bucket.objects.filter(Prefix=folder_name):
        # Compare dates; last_modified is timezone-aware, so drop its tzinfo first
        if file.last_modified.replace(tzinfo=None) > datetime.datetime(YEAR, MONTH, DAY):
            # Print results
            print('File Name: %s ---- Date: %s' % (file.key, file.last_modified))
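
An alternative to stripping tzinfo is to keep both sides of the comparison timezone-aware. The sketch below assumes objects that expose key and last_modified attributes, as boto3's ObjectSummary does. The function name modified_after and the FakeObject stand-in are hypothetical and only serve the example.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Stand-in for boto3's ObjectSummary, which exposes the same two attributes.
FakeObject = namedtuple("FakeObject", ["key", "last_modified"])

def modified_after(objects, cutoff):
    """Yield the objects whose last_modified is strictly later than cutoff.

    `cutoff` must be timezone-aware so it can be compared with the
    timezone-aware datetimes that S3 listings return.
    """
    if cutoff.tzinfo is None:
        raise ValueError("cutoff must be timezone-aware")
    for obj in objects:
        if obj.last_modified > cutoff:
            yield obj

sample = [
    FakeObject("folder/old.txt", datetime(2020, 5, 1, tzinfo=timezone.utc)),
    FakeObject("folder/new.txt", datetime(2021, 5, 1, tzinfo=timezone.utc)),
]
cutoff = datetime(2021, 1, 1, tzinfo=timezone.utc)
for obj in modified_after(sample, cutoff):
    print('File Name: %s ---- Date: %s' % (obj.key, obj.last_modified))
```

With a real bucket, the first argument would be bucket.objects.filter(Prefix=folder_name).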

How to get last modified filename using boto3 from S3

Here is a simple snippet. In short, you iterate over all files to find the most recent last-modified date, then print every file that has this date (there might be more than one).

from datetime import datetime

import boto3

s3 = boto3.resource('s3',aws_access_key_id='demo', aws_secret_access_key='demo')

my_bucket = s3.Bucket('demo')

last_modified_date = datetime(1939, 9, 1).replace(tzinfo=None)
for file in my_bucket.objects.all():
    file_date = file.last_modified.replace(tzinfo=None)
    if last_modified_date < file_date:
        last_modified_date = file_date

print(last_modified_date)

# you can have more than one file with this date, so you must iterate again
for file in my_bucket.objects.all():
    if file.last_modified.replace(tzinfo=None) == last_modified_date:
        print(file.key)
        print(last_modified_date)
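
The two passes above can also be folded into one, which avoids listing the bucket twice and removes the need for a sentinel start date. This is a sketch under the assumption that each object exposes key and last_modified. The names newest_keys and FakeObject are invented for the example.

```python
from collections import namedtuple
from datetime import datetime, timezone

# Stand-in for boto3's ObjectSummary.
FakeObject = namedtuple("FakeObject", ["key", "last_modified"])

def newest_keys(objects):
    """Return (latest_datetime, [keys sharing that datetime]) in one pass.

    Returns (None, []) when there are no objects at all.
    """
    newest = None
    keys = []
    for obj in objects:
        if newest is None or obj.last_modified > newest:
            # Found a strictly newer object: restart the list of keys.
            newest, keys = obj.last_modified, [obj.key]
        elif obj.last_modified == newest:
            keys.append(obj.key)
    return newest, keys

sample = [
    FakeObject("a.txt", datetime(2020, 1, 1, tzinfo=timezone.utc)),
    FakeObject("b.txt", datetime(2021, 6, 1, tzinfo=timezone.utc)),
    FakeObject("c.txt", datetime(2021, 6, 1, tzinfo=timezone.utc)),
]
print(newest_keys(sample))
```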

How to list last modified file in S3 using Python

OK. I've resolved the "issue" and now have what I need.

import boto3

bucket_name = "actual_bucket_name"
prefix = "path/to/files/"

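# Sort key: the object's last-modified time as a Unix timestamp
get_last_modified = lambda obj: int(obj['LastModified'].timestamp())

s3 = boto3.client('s3')
objs = s3.list_objects_v2(Bucket=bucket_name, Prefix=prefix, Delimiter='/')['Contents']
# Sort newest first so that index 0 is the most recently modified key
last_added = [obj['Key'] for obj in sorted(objs, key=get_last_modified, reverse=True)][0]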
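
Note that a single list_objects_v2 call returns at most 1000 keys, and the response has no 'Contents' entry when nothing matches. A hedged sketch of handling both cases with the client's paginator follows. The function names newest_key and newest_key_in_bucket are assumptions for illustration.

```python
def newest_key(pages):
    """Return the key with the latest LastModified across response pages.

    `pages` is an iterable of list_objects_v2 response dicts.
    Returns None when no page contains any object.
    """
    newest = None
    for page in pages:
        # 'Contents' is missing from a page when nothing matched the prefix.
        for obj in page.get("Contents", []):
            if newest is None or obj["LastModified"] > newest["LastModified"]:
                newest = obj
    return None if newest is None else newest["Key"]

def newest_key_in_bucket(bucket_name, prefix):
    """Wire newest_key to a real bucket (needs AWS credentials)."""
    import boto3  # imported here so newest_key stays usable without boto3
    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    return newest_key(paginator.paginate(Bucket=bucket_name, Prefix=prefix))
```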

Thank you for the pointers. I was reading through the documentation; however, we know how it can be after staring at walls of text for a while. The "issue" was me not fully comprehending it.

Boto3 S3, sort bucket by last modified

I did a small variation of what @helloV posted below. It's not 100% optimal, but it gets the job done within the limitations boto3 has at this time.

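import boto3

# Sort key (assumed here, since the snippet does not define it):
# the last-modified time of each ObjectSummary
get_last_modified = lambda obj: obj.last_modified

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('myBucket')
unsorted = []
for file in my_bucket.objects.filter():
    unsorted.append(file)

files = [obj.key for obj in sorted(unsorted, key=get_last_modified,
                                   reverse=True)][0:9]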
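
When only the few most recent keys are needed, the standard library's heapq.nlargest avoids sorting the whole listing. This is a sketch rather than a drop-in replacement: most_recent_keys and FakeObject are hypothetical names, and the objects are assumed to expose key and last_modified.

```python
import heapq
from collections import namedtuple
from datetime import datetime, timezone
from operator import attrgetter

# Stand-in for boto3's ObjectSummary.
FakeObject = namedtuple("FakeObject", ["key", "last_modified"])

def most_recent_keys(objects, count):
    """Return the keys of the `count` most recently modified objects, newest first."""
    newest = heapq.nlargest(count, objects, key=attrgetter("last_modified"))
    return [obj.key for obj in newest]

sample = [
    FakeObject("jan.txt", datetime(2021, 1, 1, tzinfo=timezone.utc)),
    FakeObject("mar.txt", datetime(2021, 3, 1, tzinfo=timezone.utc)),
    FakeObject("feb.txt", datetime(2021, 2, 1, tzinfo=timezone.utc)),
]
print(most_recent_keys(sample, 2))
```

With a real bucket, the first argument would be my_bucket.objects.all().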

How to retrieve only the last_modified key in S3 with boto3

--Update Start--

It is probably better to create the objects in S3 with date prefixes.

{bucket}/yyyy/mm/dd/{object}

Example: myS3bucket/2018/12/29/myfile.txt

With this approach, it becomes simple to check whether you received any files for a particular day, and the list of files you retrieve becomes shorter.

# Zero-padded yyyy/mm/dd/ prefix with no leading slash, matching the example key above
prefix = today.strftime("%Y/%m/%d/")
objs = bucket.objects.filter(Prefix=prefix).all()
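
A small sketch of building such a prefix follows. It assumes the keys are stored without a leading slash and with zero-padded month and day, as in the myS3bucket/2018/12/29/myfile.txt example. The helper name date_prefix is made up for illustration.

```python
from datetime import date

def date_prefix(day):
    """Build a 'yyyy/mm/dd/' key prefix for the given date."""
    # strftime zero-pads month and day, which str(day.month) would not do.
    return day.strftime("%Y/%m/%d/")

print(date_prefix(date(2018, 12, 29)))  # 2018/12/29/
print(date_prefix(date(2019, 3, 5)))    # 2019/03/05/
```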

--Update Complete--

I am not sure you posted the full code, but there are some indentation issues in the snippet above. I just tested the code below; it works fine and I get the correct last_modified date.

Please make sure you are using the same region as the bucket. Also, last_modified is in UTC, so your comparison should take that into account.

import boto3
from datetime import date
import botocore

# Get today's date (local time)
today = date.today()
# Get the objects stored under today's date prefix
s3 = boto3.resource('s3', region_name='us-east-1')
bucket = s3.Bucket('xxxx')
prefix = today.strftime("%Y/%m/%d/")
objs = bucket.objects.filter(Prefix=prefix).all()

def get_object_check_alarm():
    lastobjectdate = None  # stays None if the listing returns no objects
    try:
        for obj in objs:
            print(obj)
            lastobjectdate = (obj.last_modified).date()
    except botocore.exceptions.ClientError as e:
        error_code = e.response['Error']['Code']
        if error_code == '404':
            print("There is no file")

    # Compare with defined date
    if today == lastobjectdate:
        print(today)
        print(lastobjectdate)
        print("OK, latest file comes from today")
    else:
        print(today)
        print(lastobjectdate)
        print("Mail sent")

get_object_check_alarm()

Below is the output. I am in the EST time zone, so my local date is still 12/28, but the object's creation date came back as 12/29 because it was already 12/29 in UTC when the object was created.

s3.ObjectSummary(bucket_name='xxxx', key='yyyy/')

2018-12-28

2018-12-29

Mail sent
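
The mismatch in the output above goes away if both dates are taken in UTC. The following is a minimal sketch: the function name is_from_today_utc is hypothetical, and the two timestamps are invented to mirror the situation described above.

```python
from datetime import datetime, timedelta, timezone

def is_from_today_utc(last_modified, now=None):
    """Return True if last_modified falls on the current UTC calendar date."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert both values to UTC before comparing their dates.
    return (last_modified.astimezone(timezone.utc).date()
            == now.astimezone(timezone.utc).date())

# Evening of 12/28 in EST (UTC-5), which is already 12/29 in UTC.
est = timezone(timedelta(hours=-5))
now = datetime(2018, 12, 28, 20, 30, tzinfo=est)
uploaded = datetime(2018, 12, 29, 1, 0, tzinfo=timezone.utc)
print(is_from_today_utc(uploaded, now))  # True
```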


