List of the Most Recently Updated Files in Python

How to find most recently modified file from all subdirectories in a directory in Python?

Use os.walk (which lets you skip directories) instead of os.listdir, in a nested list comprehension fed to sorted:

import os

files = sorted(
    [os.path.join(root, f)
     for root, _, the_files in os.walk(path)
     for f in the_files
     if f.lower().endswith(".cpp")],
    key=os.path.getctime,
    reverse=True,
)

As someone noted, if you only need one file, you can just apply max with a key (and in that case, switch to a generator expression, since max doesn't need the full list that sorted would, which saves memory and a little time):

most_recent_file = max(
    (os.path.join(root, f)
     for root, _, the_files in os.walk(path)
     for f in the_files
     if f.lower().endswith(".cpp")),
    key=os.path.getctime,
)

Note that your expression files = sorted(os.listdir(path), key=os.path.getctime, reverse=True) requires you to change the current directory first, unless path is the current directory, because listdir returns file names, not file paths (and yes, you also have to filter out directories, which complicates the expression further). Neither problem arises with the solution above, since os.path.join is applied before sorting.
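To make the listdir pitfall concrete, here is a self-contained sketch (using a temporary directory, getmtime, and timestamps set explicitly with os.utime so the ordering is deterministic): join the names with the directory first, then filter out subdirectories before sorting.

```python
import os
import tempfile

# Build a throwaway directory with two files and a subdirectory.
path = tempfile.mkdtemp()
os.mkdir(os.path.join(path, "subdir"))
for name, mtime in (("a.cpp", 1000), ("b.cpp", 2000)):
    full = os.path.join(path, name)
    with open(full, "w") as f:
        f.write("// stub")
    os.utime(full, (mtime, mtime))  # set access/modification times explicitly

# os.listdir returns bare names, so join with the directory first,
# then drop subdirectories before sorting by modification time.
entries = (os.path.join(path, n) for n in os.listdir(path))
files = sorted(
    (p for p in entries if os.path.isfile(p)),
    key=os.path.getmtime,
    reverse=True,
)
print([os.path.basename(p) for p in files])  # → ['b.cpp', 'a.cpp']
```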

Program to check all last modified files in a folder using python?

On Windows, a copy of a file typically gets a new creation time. You can use os.path.getctime() to check the creation time of the copy.

If that works as expected, you can include os.path.getctime() as an additional check in the key passed to max().

import os.path

def latest_change(filename):
    return max(os.path.getmtime(filename), os.path.getctime(filename))

latest_file = max(list_of_files, key=latest_change)

The key function grabs whichever of the modification or creation time is later, and uses that as the key.

Python listing last 10 modified files and reading each line of all 10 files

It looks like you almost did everything already:

import glob
import os.path
import re

files = glob.glob("data_event_log.*")
files.sort(key=os.path.getmtime)
latest = files[-10:]  # last 10 entries
print("\n".join(latest))
lineRegex = re.compile(r'\d{4}-\d{2}-\d{2}')
for fn in latest:
    with open(fn) as f:
        for line in f:
            if line.startswith('Time'):
                a = lineRegex.findall(line)

Edit:

Especially if you have many files, a better and simpler solution would be:

import glob
import heapq
import os.path
import re

files = glob.iglob("data_event_log.*")
latest = heapq.nlargest(10, files, key=os.path.getmtime)  # newest 10 entries
print("\n".join(latest))
lineRegex = re.compile(r'\d{4}-\d{2}-\d{2}')
for fn in latest:
    with open(fn) as f:
        for line in f:
            if line.startswith('Time'):
                a = lineRegex.findall(line)
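If the directory is very large, os.scandir can shave off one stat() call per file, because each DirEntry usually serves .stat() from a cache filled during listing. A sketch of that variant (latest_n is a name made up here, and the demo uses explicit timestamps so the result is deterministic):

```python
import heapq
import os
import tempfile

def latest_n(directory, prefix, n=10):
    # DirEntry.stat() is typically answered from the cache that
    # scandir filled while listing, avoiding an extra stat per file.
    with os.scandir(directory) as it:
        entries = [e for e in it
                   if e.is_file() and e.name.startswith(prefix)]
    newest = heapq.nlargest(n, entries, key=lambda e: e.stat().st_mtime)
    return [e.path for e in newest]

# Demo with deterministic modification times.
d = tempfile.mkdtemp()
for i in range(5):
    p = os.path.join(d, "data_event_log.%d" % i)
    open(p, "w").close()
    os.utime(p, (1000 + i, 1000 + i))

print([os.path.basename(p) for p in latest_n(d, "data_event_log.", n=3)])
```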

Last Modified File in Python?

Try:

import glob
import os

newest = max(glob.iglob('Directory Name/*'), key=os.path.getctime)
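If the newest file may sit in a subdirectory, a recursive sketch using glob's `**` pattern (with getmtime rather than getctime, since on Unix ctime is the metadata-change time, not creation time); the temporary directory here stands in for 'Directory Name':

```python
import glob
import os
import tempfile

# A stand-in directory tree with one nested, more recent file.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "nested"))
for rel, mtime in (("old.txt", 1000),
                   (os.path.join("nested", "new.txt"), 2000)):
    full = os.path.join(root, rel)
    open(full, "w").close()
    os.utime(full, (mtime, mtime))  # deterministic timestamps

# recursive=True makes ** match files at any depth; skip directories.
newest = max(
    (p for p in glob.iglob(os.path.join(root, "**", "*"), recursive=True)
     if os.path.isfile(p)),
    key=os.path.getmtime,
)
print(os.path.basename(newest))  # → new.txt
```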

Using Python, select files in a directory that have been modified in the last 60 days

Building on what you already have with os.path.getmtime(), you can use time.time() to get the current time. Subtracting the modification time from the current time gives the difference in seconds; dividing by (60 * 60 * 24) converts that to days.

The following code does each of those steps:

import glob
import os
import time

files = glob.glob("C:/Folder/*.csv")
modified_files = list()
current_time = time.time()

for csv_file in files:
    time_delta = current_time - os.path.getmtime(csv_file)
    time_delta_days = time_delta / (60 * 60 * 24)
    if time_delta_days < 60:
        modified_files.append(csv_file)

print(modified_files)

Edit:
A more pythonic way to write this might be:

import glob
import os
import time

def test_modified(filename):
    delta = time.time() - os.path.getmtime(filename)
    delta = delta / (60 * 60 * 24)
    return delta < 60

mfiles = [mfile for mfile in glob.glob("C:/Folder/*.csv") if test_modified(mfile)]
print(mfiles)
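The same filter also reads naturally with pathlib and datetime (a sketch; the 60-day cutoff and the .csv pattern mirror the code above, and a temporary directory stands in for C:/Folder):

```python
import datetime as dt
import os
import tempfile
from pathlib import Path

folder = Path(tempfile.mkdtemp())
cutoff = dt.datetime.now() - dt.timedelta(days=60)

# One recent file and one stale file, with explicit timestamps.
recent = folder / "recent.csv"
stale = folder / "stale.csv"
for p in (recent, stale):
    p.touch()
old = (dt.datetime.now() - dt.timedelta(days=90)).timestamp()
os.utime(stale, (old, old))  # push the stale file past the cutoff

# Keep files whose modification time falls within the last 60 days.
mfiles = [
    p for p in folder.glob("*.csv")
    if dt.datetime.fromtimestamp(p.stat().st_mtime) > cutoff
]
print([p.name for p in mfiles])  # → ['recent.csv']
```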

List of filenames modified within 1 week

Depending on how many filenames there are and how little memory you have (a 512MB VPS?), it's possible you're running out of memory by creating two lists of all the filenames (one from glob and one from your list comprehension). Not necessarily the case, but it's all I have to go on.

Try switching to iglob (which uses os.scandir under the hood and returns an iterator) and using a generator expression and see if that helps.

Also, getmtime returns a point in time, not an interval measured from now.

import os
import glob
import time

week_ago = time.time() - 7 * 24 * 60 * 60
log_files = (
    x for x in glob.iglob('/var/opt/cray/log/p0-current/*')
    if not os.path.isdir(x)
    and os.path.getmtime(x) > week_ago
)
for filename in log_files:
    pass  # do something

