Getting S3 objects' last modified datetimes with boto
Here's a snippet of Python/boto code that will print the last_modified attribute of all keys in a bucket:
>>> import boto
>>> s3 = boto.connect_s3()
>>> bucket = s3.lookup('mybucket')
>>> for key in bucket:
...     print key.name, key.size, key.last_modified
...
index.html 13738 2012-03-13T03:54:07.000Z
markdown.css 5991 2012-03-06T18:32:43.000Z
>>>
Find last modified date of a particular file in S3
You can do it this way:
import boto3
s3 = boto3.resource("s3")
s3_object = s3.Object("your_bucket", "your_object_key")
print(s3_object.last_modified)
How to filter s3 objects by last modified date with Boto3
The following code snippet gets all objects under a specific folder (key prefix) and checks whether each file was last modified after the date you specify. Replace YEAR, MONTH, DAY with your values.
import boto3
import datetime

# Bucket name
bucket_name = 'BUCKET NAME'
# Folder name (used as the key prefix)
folder_name = 'FOLDER NAME'
# Bucket resource
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket_name)

def lambda_handler(event, context):
    for file in bucket.objects.filter(Prefix=folder_name):
        # Compare dates (both sides are made timezone-naive first)
        if file.last_modified.replace(tzinfo=None) > datetime.datetime(YEAR, MONTH, DAY, tzinfo=None):
            # Print results
            print('File Name: %s ---- Date: %s' % (file.key, file.last_modified))
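As a variation on the comparison above, here is a minimal sketch that keeps the timestamps timezone-aware instead of stripping tzinfo from both sides. The helper name modified_after and the stand-in objects are made up for illustration, and it assumes a naive cutoff is meant as UTC. With a real bucket you would pass bucket.objects.filter(Prefix=folder_name) as the first argument.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def modified_after(objects, cutoff):
    """Yield objects whose last_modified is strictly later than cutoff.

    `objects` is any iterable of items with a timezone-aware
    `last_modified` attribute (boto3 object summaries have one).
    """
    if cutoff.tzinfo is None:
        # Assumption: a naive cutoff is interpreted as UTC.
        cutoff = cutoff.replace(tzinfo=timezone.utc)
    for obj in objects:
        if obj.last_modified > cutoff:
            yield obj


# Stand-in objects so the sketch runs without AWS credentials.
sample = [
    SimpleNamespace(key="logs/a.txt",
                    last_modified=datetime(2020, 1, 1, tzinfo=timezone.utc)),
    SimpleNamespace(key="logs/b.txt",
                    last_modified=datetime(2021, 6, 1, tzinfo=timezone.utc)),
]
recent_keys = [o.key for o in modified_after(sample, datetime(2020, 12, 31))]
print(recent_keys)
```

Keeping the comparison timezone-aware avoids silently mixing local and UTC dates when the cutoff comes from somewhere else in your code.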
How to get last modified filename using boto3 from S3
Here is a simple snippet. In short, you have to iterate over the files to find the latest last-modified date among them. Then you print the files with this date (there might be more than one).
from datetime import datetime
import boto3

s3 = boto3.resource('s3', aws_access_key_id='demo', aws_secret_access_key='demo')
my_bucket = s3.Bucket('demo')

# Start from a date older than any object in the bucket
last_modified_date = datetime(1939, 9, 1).replace(tzinfo=None)
for file in my_bucket.objects.all():
    file_date = file.last_modified.replace(tzinfo=None)
    if last_modified_date < file_date:
        last_modified_date = file_date

print(last_modified_date)

# you can have more than one file with this date, so you must iterate again
for file in my_bucket.objects.all():
    if file.last_modified.replace(tzinfo=None) == last_modified_date:
        print(file.key)
        print(last_modified_date)
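If listing the bucket twice is a concern, the same result can probably be had in one pass by remembering the ties as you go. This is only a sketch: latest_objects is a hypothetical helper, and the stand-in objects below replace my_bucket.objects.all() so it runs without credentials.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def latest_objects(objects):
    """Return (latest_date, objects_with_that_date) in a single pass."""
    latest_date = None
    latest = []
    for obj in objects:
        if latest_date is None or obj.last_modified > latest_date:
            # Found a newer timestamp: start a fresh list of ties.
            latest_date = obj.last_modified
            latest = [obj]
        elif obj.last_modified == latest_date:
            latest.append(obj)
    return latest_date, latest


# Stand-in objects; two of them share the newest timestamp.
sample = [
    SimpleNamespace(key="a.txt",
                    last_modified=datetime(2022, 5, 1, tzinfo=timezone.utc)),
    SimpleNamespace(key="b.txt",
                    last_modified=datetime(2022, 7, 1, 12, 0, tzinfo=timezone.utc)),
    SimpleNamespace(key="c.txt",
                    last_modified=datetime(2022, 7, 1, 12, 0, tzinfo=timezone.utc)),
]
newest_date, newest = latest_objects(sample)
newest_keys = [obj.key for obj in newest]
print(newest_date, newest_keys)
```

An empty bucket gives (None, []), so no sentinel start date is needed.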
How to list last modified file in S3 using Python
OK. I've resolved the "issue" and now have what I need.
import boto3
bucket_name = "actual_bucket_name"
prefix = "path/to/files/"
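# Sort key: the object's LastModified timestamp (a timezone-aware datetime)
get_last_modified = lambda obj: obj['LastModified']
s3 = boto3.client('s3')
objs = s3.list_objects_v2(Bucket=bucket_name, Prefix=prefix, Delimiter='/')['Contents']
# Sort newest first so that index 0 is the most recently modified key
last_added = [obj['Key'] for obj in sorted(objs, key=get_last_modified, reverse=True)][0]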
Thank you for the pointers. I was reading through the documentation; however, we know how it can be after staring at walls of text for a while. The "issue" was me not fully comprehending.
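One thing to keep in mind with the list_objects_v2 call above: a single call returns at most 1,000 keys, and the response has no 'Contents' entry when nothing matches the prefix. A rough sketch of handling both with the client's paginator follows. The function names newest_key and newest_key_in_bucket are made up for illustration, and the sample pages are fake data so the first helper can be tried without AWS access.

```python
from datetime import datetime, timezone


def newest_key(pages):
    """Return the key with the latest LastModified across all pages, or None."""
    newest = None
    for page in pages:
        # 'Contents' is missing from a page when no keys match.
        for obj in page.get("Contents", []):
            if newest is None or obj["LastModified"] > newest["LastModified"]:
                newest = obj
    return newest["Key"] if newest else None


def newest_key_in_bucket(bucket_name, prefix):
    """Hypothetical wrapper: feed paginated list_objects_v2 results to newest_key."""
    import boto3  # imported here so newest_key can be tried without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket_name, Prefix=prefix, Delimiter="/")
    return newest_key(pages)


# Fake pages shaped like list_objects_v2 responses.
sample_pages = [
    {"Contents": [
        {"Key": "path/to/files/a.csv",
         "LastModified": datetime(2021, 1, 1, tzinfo=timezone.utc)},
        {"Key": "path/to/files/b.csv",
         "LastModified": datetime(2021, 3, 1, tzinfo=timezone.utc)},
    ]},
    {"Contents": [
        {"Key": "path/to/files/c.csv",
         "LastModified": datetime(2021, 4, 1, tzinfo=timezone.utc)},
    ]},
]
print(newest_key(sample_pages))
```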
Boto3 S3, sort bucket by last modified
I did a small variation of what @helloV posted below. It's not 100% optimal, but it gets the job done given the limitations boto3 has at this time.
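import boto3

s3 = boto3.resource('s3')
my_bucket = s3.Bucket('myBucket')
# Sort key: each object summary's last_modified timestamp
get_last_modified = lambda obj: obj.last_modified
unsorted = []
for file in my_bucket.objects.filter():
    unsorted.append(file)
# Newest first; the slice keeps the first nine keys
files = [obj.key for obj in sorted(unsorted, key=get_last_modified,
    reverse=True)][0:9]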
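If you only need the newest few keys, heapq.nlargest from the standard library can pick them without sorting the whole listing. This is just a sketch: newest_keys is a hypothetical helper, and the stand-in objects take the place of my_bucket.objects.filter() so it runs locally.

```python
import heapq
from datetime import datetime, timezone
from types import SimpleNamespace


def newest_keys(objects, n):
    """Return the keys of the n most recently modified objects, newest first."""
    newest = heapq.nlargest(n, objects, key=lambda obj: obj.last_modified)
    return [obj.key for obj in newest]


# Stand-in objects with distinct timestamps.
sample = [
    SimpleNamespace(key="a.txt",
                    last_modified=datetime(2022, 1, 1, tzinfo=timezone.utc)),
    SimpleNamespace(key="b.txt",
                    last_modified=datetime(2022, 3, 1, tzinfo=timezone.utc)),
    SimpleNamespace(key="c.txt",
                    last_modified=datetime(2022, 2, 1, tzinfo=timezone.utc)),
    SimpleNamespace(key="d.txt",
                    last_modified=datetime(2021, 12, 1, tzinfo=timezone.utc)),
]
top_two = newest_keys(sample, 2)
print(top_two)
```

The bucket still has to be listed in full, since S3 does not return keys ordered by modification time; this only saves the full sort on the client side.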
How to retrieve only the last_modified key in S3 with boto3
--Update Start--
It might be better to create the objects in S3 with date prefixes:
{bucket}/yyyy/mm/dd/{object}
Example: myS3bucket/2018/12/29/myfile.txt
With this approach, checking whether you received any files for a particular day becomes a simple query, and the list of objects you retrieve is much shorter.
# Zero-padded yyyy/mm/dd/ prefix with no leading slash, matching the key layout above
prefix = today.strftime("%Y/%m/%d/")
objs = bucket.objects.filter(Prefix=prefix).all()
--Update Complete--
I am not sure you gave the full code, but there are some indentation issues in the above snippet. I just tested the code below and it works fine; I get the correct last_modified date.
Please make sure you are using the same region as the bucket. Also, last_modified is in the UTC timezone, so your comparison should take that into account.
import boto3
from datetime import date
import botocore

# Get today's date (local time)
today = date.today()

# Get the objects under today's date prefix
s3 = boto3.resource('s3', region_name='us-east-1')
bucket = s3.Bucket('xxxx')
# Zero-padded yyyy/mm/dd/ prefix with no leading slash
prefix = today.strftime("%Y/%m/%d/")
objs = bucket.objects.filter(Prefix=prefix).all()

def get_object_check_alarm():
    lastobjectdate = None  # stays None if no object is found
    try:
        for obj in objs:
            print(obj)
            lastobjectdate = (obj.last_modified).date()
    except botocore.exceptions.ClientError as e:
        error_code = e.response['Error']['Code']
        if error_code == '404':
            print("There is no file")

    # Compare with defined date
    if today == lastobjectdate:
        print(today)
        print(lastobjectdate)
        print("OK, latest file comes from today")
    else:
        print(today)
        print(lastobjectdate)
        print("Mail sent")

get_object_check_alarm()
Below is the output. I am in the EST zone, so my local date is still 12/28, but the object's date came in as 12/29 because it was already 12/29 in UTC when the object was created.
s3.ObjectSummary(bucket_name='xxxx', key='yyyy/')
2018-12-28
2018-12-29
Mail sent
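To avoid the EST/UTC mismatch shown in this output, one option is to compare both dates in UTC rather than using the local date.today(). The helper below is only a sketch: is_from_today_utc is a made-up name, and the timestamps are fixed values chosen to mirror the 12/28 versus 12/29 situation, with EST taken as a fixed UTC-5 offset.

```python
from datetime import datetime, timedelta, timezone


def is_from_today_utc(last_modified, now=None):
    """True if last_modified falls on the current UTC calendar date."""
    if now is None:
        now = datetime.now(timezone.utc)
    # Convert both timestamps to UTC before taking the calendar date.
    return (last_modified.astimezone(timezone.utc).date()
            == now.astimezone(timezone.utc).date())


# Assumption: EST modeled as a fixed UTC-5 offset for this example.
est = timezone(timedelta(hours=-5))
local_now = datetime(2018, 12, 28, 21, 0, tzinfo=est)             # 9 pm EST on 12/28
object_time = datetime(2018, 12, 29, 1, 30, tzinfo=timezone.utc)  # 12/29 in UTC

same_day = is_from_today_utc(object_time, now=local_now)
print(local_now.date(), object_time.date(), same_day)
```

The local calendar dates differ here, yet the object counts as coming from today once both sides are expressed in UTC.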