How to find most recently modified file from all subdirectories in a directory in Python?
Use os.walk (ignoring the directories it yields) instead of os.listdir, in a nested list comprehension fed to sorted:
files = sorted(
    [os.path.join(root, f)
     for root, _, the_files in os.walk(path)
     for f in the_files
     if f.lower().endswith(".cpp")],
    key=os.path.getctime,
    reverse=True,
)
As someone noted, if you only need one file you can just apply max with a key (and in that case switch to a generator expression, since you don't need to build the full list to feed to sorted, which saves time and memory):
most_recent_file = max(
    (os.path.join(root, f)
     for root, _, the_files in os.walk(path)
     for f in the_files
     if f.lower().endswith(".cpp")),
    key=os.path.getctime,
)
Note that your expression files = sorted(os.listdir(path), key=os.path.getctime, reverse=True) only works if you first change the current directory to path, because listdir returns bare file names, not paths (and you would also have to filter out directories, which complicates the expression further). The solution above doesn't have this problem, since os.path.join is applied before sorting.
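As a side note, a minimal sketch (the function name is mine) of what the os.listdir version would need to look like, joining paths up front and filtering out directories:

```python
import os

def sorted_by_mtime(path):
    # listdir returns bare names, so join them to the directory first,
    # then drop subdirectories before sorting newest-first.
    entries = (os.path.join(path, name) for name in os.listdir(path))
    files = [p for p in entries if os.path.isfile(p)]
    return sorted(files, key=os.path.getmtime, reverse=True)
```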
Program to check all last modified files in a folder using python?
On Windows, a copy of a file usually gets a new creation time. You can use os.path.getctime() to check the creation time of the copy. If that works as expected, you can include os.path.getctime() as an additional check in the key passed to max():
import os.path

def latest_change(filename):
    # Whichever of modification time or creation time is most recent
    return max(os.path.getmtime(filename), os.path.getctime(filename))

latest_file = max(list_of_files, key=latest_change)
The key function simply takes whichever of the modification or creation time is greater, and uses that greater time as the key.
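A quick self-contained check of that key function (the file names and timestamps are invented for the demo):

```python
import os
import tempfile
import time

def latest_change(filename):
    # Whichever of mtime/ctime is most recent becomes the sort key
    return max(os.path.getmtime(filename), os.path.getctime(filename))

with tempfile.TemporaryDirectory() as d:
    old = os.path.join(d, "old.log")
    open(old, "w").close()
    os.utime(old, (1_000_000, 1_000_000))  # push old's mtime back to 1970
    time.sleep(0.05)
    new = os.path.join(d, "new.log")
    open(new, "w").close()
    most_recent = max([old, new], key=latest_change)  # new.log
```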
Python listing last 10 modified files and reading each line of all 10 files
It looks like you have almost done everything already:
import glob
import os.path
import re

files = glob.glob("data_event_log.*")
files.sort(key=os.path.getmtime)
latest = files[-10:]  # last 10 entries
print("\n".join(latest))

lineRegex = re.compile(r'\d{4}-\d{2}-\d{2}')
for fn in latest:
    with open(fn) as f:
        for line in f:
            if line.startswith('Time'):
                a = lineRegex.findall(line)
Edit:
Especially if you have many files, a better and simpler solution would be:
import glob
import heapq
import os.path
import re

files = glob.iglob("data_event_log.*")
latest = heapq.nlargest(10, files, key=os.path.getmtime)  # 10 most recent
print("\n".join(latest))

lineRegex = re.compile(r'\d{4}-\d{2}-\d{2}')
for fn in latest:
    with open(fn) as f:
        for line in f:
            if line.startswith('Time'):
                a = lineRegex.findall(line)
Last Modified File in Python?
Try:
newest = max(glob.iglob('Directory Name/*'), key=os.path.getctime)
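Note that ctime is the creation time only on Windows; on Unix it is the inode change time. If you actually want the most recently *modified* file, a pathlib sketch (the directory name is an assumption):

```python
from pathlib import Path

def newest_file(directory):
    # Most recently modified regular file, or None for an empty directory
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)
```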
Using Python, select files in a directory that have been modified in the last 60 days
Building on what you already have with os.path.getmtime(), you can use the time.time() function to get the current time. Subtracting the modified time from the current time gives the file's age in seconds; dividing by (60*60*24) converts that to days.
The following code does each of those steps:
import glob
import os
import time
files = glob.glob("C:/Folder/*.csv")
modified_files = list()
current_time = time.time()
for csv_file in files:
    time_delta = current_time - os.path.getmtime(csv_file)
    time_delta_days = time_delta / (60 * 60 * 24)
    if time_delta_days < 60:
        modified_files.append(csv_file)
print(modified_files)
Edit:
A more Pythonic way to write this might be:
import glob
import os
import time
def test_modified(filename):
    delta = time.time() - os.path.getmtime(filename)
    delta = delta / (60 * 60 * 24)
    return delta < 60
mfiles = [mfile for mfile in glob.glob("C:/Folder/*.csv") if test_modified(mfile)]
print(mfiles)
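The same cutoff can also be expressed with datetime/timedelta, which some find easier to read (the folder path and 60-day window mirror the snippet above and are assumptions):

```python
import glob
import os
from datetime import datetime, timedelta

def modified_within(filename, days=60):
    # True if the file's mtime falls inside the last `days` days
    cutoff = datetime.now() - timedelta(days=days)
    return datetime.fromtimestamp(os.path.getmtime(filename)) > cutoff

mfiles = [f for f in glob.glob("C:/Folder/*.csv") if modified_within(f)]
```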
List of filenames modified within 1 week
Depending on how many filenames there are and how little memory you have (512MB VPS?), it's possible you're running out of memory by creating two lists of all the filenames (one from glob and one from your list comprehension). Not necessarily the case, but it's all I have to go on.
Try switching to iglob (which uses os.scandir under the hood and returns an iterator) and a generator expression, and see if that helps.
Also, getmtime returns a point in time, not an interval from now.
import os
import glob
import time
week_ago = time.time() - 7 * 24 * 60 * 60
log_files = (
x for x in glob.iglob('/var/opt/cray/log/p0-current/*')
if not os.path.isdir(x)
and os.path.getmtime(x) > week_ago
)
for filename in log_files:
    pass  # do something
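Since iglob sits on top of os.scandir anyway, you could also call os.scandir directly; its DirEntry objects usually cache stat information, saving a system call per file (the directory path and window are assumptions):

```python
import os
import time

def files_modified_within(directory, days=7):
    # Yield paths of regular files whose mtime falls in the last `days` days
    cutoff = time.time() - days * 24 * 60 * 60
    with os.scandir(directory) as it:
        for entry in it:
            if entry.is_file() and entry.stat().st_mtime > cutoff:
                yield entry.path
```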