Python FTP get the most recent file by date
With NLST, as shown in Martin Prikryl's response, you can use the sorted function:
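from ftplib import FTP

ftp = FTP(host="127.0.0.1", user="u", passwd="p")
ftp.cwd("/data")
# Sort by the MDTM reply ("213 YYYYMMDDHHMMSS"); the last item is the most recent
file_name = sorted(ftp.nlst(), key=lambda x: ftp.voidcmd(f"MDTM {x}"))[-1]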
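The same idea can be written as an explicit loop, which makes the one-MDTM-command-per-file cost visible and copes with an empty listing. This is only a sketch: the helper name `latest_by_mdtm` is made up for illustration, and it assumes the server answers MDTM with the usual `213 YYYYMMDDHHMMSS` reply.

```python
def latest_by_mdtm(ftp, names):
    """Return the name with the newest MDTM timestamp, or None for an empty list.

    `ftp` is expected to be a connected ftplib.FTP instance (or any object
    with a compatible voidcmd method). Hypothetical helper, not part of ftplib.
    """
    latest_name = None
    latest_stamp = None
    for name in names:
        # A successful reply looks like "213 YYYYMMDDHHMMSS[.sss]"
        stamp = ftp.voidcmd(f"MDTM {name}")[4:].strip()
        # The timestamps have a fixed layout, so they compare chronologically as strings
        if latest_stamp is None or stamp > latest_stamp:
            latest_name, latest_stamp = name, stamp
    return latest_name

# Usage (needs a live connection):
# file_name = latest_by_mdtm(ftp, ftp.nlst())
```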
Downloading the most recent file from FTP with Python
Once you have the list of filenames, you can simply sort on the filename: since the naming convention is S01375T-YYYY-MM-DD-hh-mm.csv, the names naturally sort into date/time order. Note that if the S01375T- part varies, you could sort on the name split at a fixed position or at the first -.
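A minimal sketch of sorting on the part of the name after the first hyphen, assuming the prefix varies but the rest follows the YYYY-MM-DD-hh-mm.csv pattern. The file names and the helper `date_part` are made up for illustration:

```python
def date_part(file_name):
    """Return everything after the first '-' (hypothetical helper)."""
    return file_name.split("-", 1)[1]

# Made-up names with differing prefixes
names = [
    "S01375T-2016-03-01-12-00.csv",
    "A99-2016-03-02-08-30.csv",
    "S01375T-2016-02-28-23-59.csv",
]

# Compare only the date/time portion, ignoring the prefix
latest_name = max(names, key=date_part)
print(latest_name)
```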
If this were not the case, you could use the datetime.datetime.strptime method to parse the filenames into datetime instances.
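For instance, a day-first naming scheme does not sort correctly as plain text. The sketch below assumes a hypothetical pattern report-DD-MM-YYYY.csv (not the one from the question) and a made-up helper `parse_name`:

```python
from datetime import datetime

def parse_name(file_name):
    """Parse a hypothetical 'report-DD-MM-YYYY.csv' name into a datetime."""
    return datetime.strptime(file_name, "report-%d-%m-%Y.csv")

# Made-up names; a plain string sort would wrongly pick report-28-02-2016.csv
names = [
    "report-28-02-2016.csv",
    "report-01-03-2016.csv",
    "report-15-01-2016.csv",
]

latest_name = max(names, key=parse_name)
print(latest_name)
```

A name that does not match the pattern makes strptime raise ValueError, so unexpected files in the listing would need to be filtered out first.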
Of course, if you wished to really simplify things, you could use the PyFileSystem FTPFS and its various methods, which allow you to treat the FTP system as if it were a slow local file system.
Python FTP server download Latest File with specific keywords in filename
resolved.
import ftplib

ftp = ftplib.FTP('test.rebex.net', 'demo', 'password')
ftp.retrlines('LIST')
ftp.cwd("pub")
ftp.cwd("example")
ftp.retrlines('LIST')

names = ftp.nlst()
# Keep only the files with the keyword in their name
final_names = [line for line in names if 'client' in line]

latest_time = None
latest_name = None
for name in final_names:
    # MDTM replies "213 YYYYMMDDHHMMSS", so the replies compare chronologically as strings
    mdtm = ftp.sendcmd("MDTM " + name)
    if (latest_time is None) or (mdtm > latest_time):
        latest_name = name
        latest_time = mdtm

print(latest_name)
with open(latest_name, 'wb') as file:
    ftp.retrbinary('RETR ' + latest_name, file.write)
python how to read latest file in ftp directory
I don't think this question has anything to do with python specifically: you just need to fetch the file the same way you would fetch it with any other FTP client:
latest_time = None
latest_name = None
for name in names:
    time = ftp.sendcmd("MDTM " + name)
    if (latest_time is None) or (time > latest_time):
        latest_name = name
        latest_time = time

with open("myfile.xlsx", "wb") as f:
    ftp.retrbinary(f"RETR {latest_name}", f.write)
As to reading the resulting file into a pandas DataFrame, that's a separate question, but now that you have the file you can do it as you normally would.
Python get recent files FTP
Looking at the documentation for the Python ftplib, it looks like the output from retrlines() will be a line where the file name is the last "column".
-rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX
So a simple split and getting the last field should work. It will however only work if there are no white-space characters in the file/folder name.
name = line.split()[-1]
print(name) # Should be "INDEX"
You might want to employ a more sophisticated parsing if you want to handle names with white-spaces in them.
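One way to keep names with spaces intact is to limit the number of splits, so that everything after the eighth field stays in one piece. This is a sketch that assumes the nine-column *nix layout shown above; the helper name and the sample line with a space in the name are made up, and symbolic links ("name -> target") are not handled:

```python
def name_from_list_line(line):
    """Extract the name from a *nix-style LIST line, keeping embedded spaces."""
    # Eight splits yield nine tokens; the ninth keeps any spaces in the name
    tokens = line.split(maxsplit=8)
    return tokens[8]

line = "-rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 my report.txt"
print(name_from_list_line(line))
```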
Get the latest FTP folder name in Python
If your FTP server supports the MLSD command, a solution is easy:
If you want to base the decision on a modification timestamp:
entries = list(ftp.mlsd())
# Only interested in directories
entries = [entry for entry in entries if entry[1]["type"] == "dir"]
# Sort by timestamp
entries.sort(key = lambda entry: entry[1]['modify'], reverse = True)
# Pick the first one
latest_name = entries[0][0]
print(latest_name)
If you want to use a file name:
# Sort by filename
entries.sort(key = lambda entry: entry[0], reverse = True)
If you need to rely on an obsolete LIST command, you have to parse a proprietary listing it returns.
A common *nix listing is like:
drw-r--r-- 1 user group 4096 Mar 26 2018 folder1-20180326
drw-r--r-- 1 user group 4096 Jun 18 11:21 folder2-20180618
-rw-r--r-- 1 user group 4467 Mar 27 2018 file-20180327.zip
-rw-r--r-- 1 user group 124529 Jun 18 15:31 file-20180618.zip
With a listing like this, this code will do:
If you want to base the decision on a modification timestamp:
from dateutil import parser

lines = []
ftp.dir("", lines.append)

latest_time = None
latest_name = None

for line in lines:
    # maxsplit=8 keeps a name containing spaces whole in tokens[8]
    tokens = line.split(maxsplit=8)
    # Only interested in directories
    if tokens[0][0] == "d":
        time_str = tokens[5] + " " + tokens[6] + " " + tokens[7]
        time = parser.parse(time_str)
        if (latest_time is None) or (time > latest_time):
            latest_name = tokens[8]
            latest_time = time

print(latest_name)
If you want to use a file name:
lines = []
ftp.dir("", lines.append)

latest_name = None

for line in lines:
    # maxsplit=8 keeps a name containing spaces whole in tokens[8]
    tokens = line.split(maxsplit=8)
    # Only interested in directories
    if tokens[0][0] == "d":
        name = tokens[8]
        if (latest_name is None) or (name > latest_name):
            latest_name = name

print(latest_name)
Some FTP servers may return . and .. entries in LIST results. You may need to filter those.
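Filtering those entries could look like the sketch below. The helper name is made up, and it works on names that have already been extracted from the listing lines:

```python
def drop_dot_entries(names):
    """Remove the '.' and '..' pseudo-entries from a list of names."""
    return [name for name in names if name not in (".", "..")]

names = [".", "..", "folder1-20180326", "folder2-20180618"]
print(drop_dot_entries(names))
```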
Partially based on: Python FTP get the most recent file by date.
If the folder does not contain any files, only subfolders, there are other easier options.
If you want to base the decision on a modification timestamp and the server supports the non-standard -t switch, you can use:
lines = ftp.nlst("-t")
latest_name = lines[-1]
See How to get files in FTP folder sorted by modification time.
If you want to use a file name:
lines = ftp.nlst()
latest_name = max(lines)
How to get FTP file's modify time using Python ftplib
MLST or MDTM
While you can retrieve a timestamp of an individual file over FTP with the MLST or MDTM commands, neither is supported by ftplib.
Of course you can implement MLST or MDTM on your own using FTP.voidcmd.
For details, refer to RFC 3659, particularly the:
- 3. File Modification Time (MDTM)
- 7. Listings for Machine Processing (MLST and MLSD)
A simple example for MDTM:
from ftplib import FTP
from dateutil import parser
# ... (connection to FTP)
timestamp = ftp.voidcmd("MDTM /remote/path/file.txt")[4:].strip()
time = parser.parse(timestamp)
print(time)
MLSD
The only command explicitly supported by the ftplib library that can return a standardized file timestamp is MLSD, via the FTP.mlsd method. Though its use makes sense only if you want to retrieve timestamps for multiple files.
- Retrieve a complete directory listing using MLSD
- Search the returned collection for the file(s) you want
- Retrieve the modify fact
- Parse it according to the specification, YYYYMMDDHHMMSS[.sss]
For details, refer to RFC 3659 again, particularly the:
- 7.5.3. The modify Fact section
- 2.3. Times section
from ftplib import FTP
from dateutil import parser

# ... (connection to FTP)

files = ftp.mlsd("/remote/path")

for file in files:
    name = file[0]
    timestamp = file[1]['modify']
    time = parser.parse(timestamp)
    print(name + ' - ' + str(time))
Note that times returned by MLST, MLSD and MDTM are in UTC (unless the server is broken). So you may need to correct them for your local timezone.
Again, refer to RFC 3659 2.3. Times section:
Time values are always represented in UTC (GMT), and in the Gregorian
calendar regardless of what calendar may have been in use at the date
and time indicated at the location of the server-PI.
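A sketch of that correction using only the standard library, assuming the timestamp has already been extracted from the server reply (the helper name `parse_ftp_time` and the sample value are made up for illustration):

```python
from datetime import datetime, timezone

def parse_ftp_time(value):
    """Parse YYYYMMDDHHMMSS[.sss] as a timezone-aware UTC datetime."""
    fmt = "%Y%m%d%H%M%S.%f" if "." in value else "%Y%m%d%H%M%S"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)

utc_time = parse_ftp_time("20180618153100")
# astimezone() without arguments converts to the machine's local timezone
local_time = utc_time.astimezone()

print(utc_time.isoformat())
print(local_time.isoformat())
```

Marking the value as UTC before converting avoids the common mistake of treating the server time as local time.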
LIST
If the FTP server does not support any of MLST, MLSD and MDTM, all you can do is to use an obsolete LIST command. That involves parsing a proprietary listing it returns.
A common *nix listing is like:
-rw-r--r-- 1 user group 4467 Mar 27 2018 file1.zip
-rw-r--r-- 1 user group 124529 Jun 18 15:31 file2.zip
With a listing like this, this code will do:
from ftplib import FTP
from dateutil import parser

# ... (connection to FTP)

lines = []
ftp.dir("/remote/path", lines.append)

for line in lines:
    # maxsplit=8 keeps a name containing spaces whole in tokens[8]
    tokens = line.split(maxsplit=8)
    name = tokens[8]
    time_str = tokens[5] + " " + tokens[6] + " " + tokens[7]
    time = parser.parse(time_str)
    print(name + ' - ' + str(time))
Finding the latest file
See also Python FTP get the most recent file by date.