Detect New or Modified Files with Python

How do I check the last modified files in a folder using Python?

On Windows, a copy of a file usually gets a new creation time. You can use os.path.getctime() to check the creation time of the copy.
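A quick way to see both timestamps side by side; the path here is just a hypothetical example:

import os

path = "copied_file.txt"  # hypothetical example path
print("created: ", os.path.getctime(path))
print("modified:", os.path.getmtime(path))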

If that works as expected, you can include os.path.getctime() as an additional check in the key passed to max().

import os

def latest_change(filename):
    # Use whichever of the modification and creation times is more recent.
    return max(os.path.getmtime(filename), os.path.getctime(filename))

latest_file = max(list_of_files, key=latest_change)

The key function simply returns whichever of the modification and creation times is greater, and max() then compares the files by that value.
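For completeness, list_of_files can be built with glob; the folder path here is an assumption:

import glob

list_of_files = glob.glob('/path/to/folder/*')  # assumed folder path
# Note: max() raises ValueError if list_of_files is empty.
latest_file = max(list_of_files, key=latest_change)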

How to get the latest file in a folder?

Whatever is being assigned to the files variable is likely incorrect. Use the following code instead.

import glob
import os

list_of_files = glob.glob('/path/to/folder/*')  # '*' matches everything; use '*.csv' for a specific format
latest_file = max(list_of_files, key=os.path.getctime)
print(latest_file)
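Note that on Unix, os.path.getctime() returns the time of the last metadata change rather than the creation time, so os.path.getmtime() is the safer cross-platform choice for "latest". A minimal pathlib-based sketch of the same idea, assuming the same folder path:

from pathlib import Path

folder = Path('/path/to/folder')  # assumed folder path
latest_file = max(folder.glob('*'), key=lambda p: p.stat().st_mtime)
print(latest_file)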

Using Python, select files in a directory that have been modified in the last 60 days

Building on what you already have with os.path.getmtime(), you can use time.time() to get the current time. Subtracting the modification time from the current time gives the file's age in seconds; dividing by (60*60*24) converts that to days.

The following code does each of those steps:

import glob
import os
import time

files = glob.glob("C:/Folder/*.csv")
modified_files = list()
current_time = time.time()

for csv_file in files:
    time_delta = current_time - os.path.getmtime(csv_file)
    time_delta_days = time_delta / (60 * 60 * 24)
    if time_delta_days < 60:
        modified_files.append(csv_file)

print(modified_files)

Edit:
A more Pythonic way to write this might be:

import glob
import os
import time

def test_modified(filename):
    # True if the file was modified within the last 60 days.
    delta = time.time() - os.path.getmtime(filename)
    delta = delta / (60 * 60 * 24)
    return delta < 60

mfiles = [mfile for mfile in glob.glob("C:/Folder/*.csv") if test_modified(mfile)]
print(mfiles)
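If you prefer to avoid the raw seconds-per-day arithmetic, the same test can be written with datetime and timedelta; this is just a sketch of the equivalent logic:

import glob
import os
from datetime import datetime, timedelta

cutoff = datetime.now() - timedelta(days=60)  # files modified after this count as recent

mfiles = [f for f in glob.glob("C:/Folder/*.csv")
          if datetime.fromtimestamp(os.path.getmtime(f)) >= cutoff]
print(mfiles)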

Python library to detect if a file has changed between different runs?

It's unlikely that anyone has made a library for something this simple. A short solution:

import pickle
import hashlib

# Load previously recorded checksums, if any.
try:
    with open("db", "rb") as f:
        db = dict(pickle.load(f))
except IOError:
    db = {}

path = "/etc/hosts"

# Hash the current contents of the file.
with open(path, "rb") as f:
    checksum = hashlib.md5(f.read()).hexdigest()

if db.get(path) != checksum:
    print("file changed")
db[path] = checksum

# Persist the updated checksums for the next run.
with open("db", "wb") as f:
    pickle.dump(list(db.items()), f)
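For large files, reading the whole file into memory is wasteful. A hypothetical helper (file_checksum is not a standard function) that hashes in chunks:

import hashlib

def file_checksum(path, chunk_size=65536):
    # Hash the file incrementally so it never has to fit in memory at once.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()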

How to get last modified file in a directory?

You can basically do this:

  • get the list of files
  • get the timestamp for each of them (os.path.getctime() for creation time; see also os.path.getmtime() for modifications)
  • use the datetime module to build the value to compare against (one hour ago)
  • compare

Here a dictionary stores the paths and timestamps in a compact format. You can then sort the dictionary by its values (float timestamps), e.g. with the sorted(...) function, to get the files created within the last hour in order:

import os
import glob
from datetime import datetime, timedelta

hour_files = {
    key: val for key, val in {
        path: os.path.getctime(path)
        for path in glob.glob("./*")
    }.items()
    if datetime.fromtimestamp(val) >= datetime.now() - timedelta(hours=1)
}
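The comprehension itself doesn't sort, so here is a small sketch of ordering the result by timestamp, newest first:

latest_first = sorted(hour_files.items(), key=lambda kv: kv[1], reverse=True)
for path, timestamp in latest_first:
    print(path, datetime.fromtimestamp(timestamp))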

Alternatively, without the comprehension:

files = glob.glob("./*")
times = {}
for path in files:
    times[path] = os.path.getctime(path)

hour_files = {}
for key, val in times.items():
    if datetime.fromtimestamp(val) < datetime.now() - timedelta(hours=1):
        continue
    hour_files[key] = val

Or perhaps your folder is just a mess and contains too many files for an intermediate dictionary. In that case, approach it incrementally:

hour_files = {}
for file in glob.glob("./*"):
    timestamp = os.path.getctime(file)
    if datetime.fromtimestamp(timestamp) < datetime.now() - timedelta(hours=1):
        continue
    hour_files[file] = timestamp
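If you only need the single most recent of those files, max() over the dictionary works; guard against the empty case, which would raise ValueError:

if hour_files:
    newest = max(hour_files, key=hour_files.get)
    print(newest)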

