How to get files in FTP folder sorted by modification time
There's no standard way to have the FTP server sort the files according to your (or any) criteria.
Though some FTP servers, notably ProFTPD and vsftpd, support proprietary flags with the LIST/NLST command to sort the entries.
Both these servers support the -t flag to sort the files by modification time:
LIST -t
Though this is not only non-standard, it actually violates the FTP protocol.
For all options supported by ProFTPD, see its documentation:
http://www.proftpd.org/docs/howto/ListOptions.html
Note that vsftpd supports only -a, -r, -t, -F and -l, with the same meaning as in ProFTPD.
If your server does not support the -t switch (or similar), your only option is to retrieve the listing with file attributes as is and sort it locally.
For this you cannot use ftp_nlist, as it returns file names only.
The ideal solution is to use the MLSD FTP command, which returns a reliable machine-readable directory listing. But PHP supports that only since 7.2, with its ftp_mlsd function. Check the "modify" entry.
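As a rough illustration of sorting locally by the "modify" entry, here is a minimal Python sketch (Python rather than PHP, purely for illustration). It assumes the listing has already been retrieved as (name, facts) pairs, the shape that Python's ftplib FTP.mlsd yields; the sample entries and the sort_by_modify helper are made up for the example.

```python
def sort_by_modify(entries, newest_first=True):
    """Sort (name, facts) pairs from an MLSD listing by their 'modify' fact.

    'modify' has the form YYYYMMDDHHMMSS[.sss], so plain string comparison
    already orders the values chronologically.
    """
    files = [
        (name, facts)
        for name, facts in entries
        # Keep regular files only and skip entries without a timestamp.
        if facts.get("type") == "file" and "modify" in facts
    ]
    return sorted(files, key=lambda entry: entry[1]["modify"], reverse=newest_first)


# Hypothetical sample data in the shape produced by ftplib's FTP.mlsd().
entries = [
    ("a.zip", {"type": "file", "modify": "20180327101500"}),
    ("docs", {"type": "dir", "modify": "20190101000000"}),
    ("b.zip", {"type": "file", "modify": "20180618153100"}),
]
names = [name for name, _ in sort_by_modify(entries)]
```

Passing newest_first=False gives the oldest file first instead.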
Or, there's an implementation of MLSD in the user comments of the ftp_rawlist documentation:
https://www.php.net/manual/en/function.ftp-rawlist.php#101071
First check if your FTP server supports MLSD before taking this approach, as not all FTP servers do (particularly IIS and vsftpd don't).
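One way to check for support is to send the FEAT command and look for the MLST feature, which per RFC 3659 advertises both MLST and MLSD. The following Python sketch only parses a FEAT reply; the sample reply text and the parse_feat helper are assumptions for illustration, and a real reply would come from something like ftplib's FTP.sendcmd("FEAT").

```python
def parse_feat(response):
    """Return the upper-cased feature names from a multi-line FEAT reply."""
    features = set()
    for line in response.splitlines():
        # Feature lines start with a single space (RFC 2389); the first
        # and last lines carry the 211 status code instead.
        if line.startswith(" ") and line.strip():
            features.add(line.split()[0].upper())
    return features


# Hypothetical reply; servers differ in which features they list.
sample = "211-Features:\n MDTM\n MLST type*;size*;modify*;\n SIZE\n211 End"
features = parse_feat(sample)
supports_mlsd = "MLST" in features
```

A server that does not implement FEAT at all replies with an error, which should be treated as "no MLSD" as well.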
Or, you can use ftp_rawlist. It returns a proprietary listing of files, though, which can be difficult to parse. But if you need to support one specific server only, you can hard-code the parsing for that server.
How to sort files by modified date through php
If you only want to sort the files by last modified date, you can use
ftp_nlist($conn, '-t .');
This will not tell you what the date for each file is, though.
If you want to get the modified date as well, you can use ftp_rawlist and parse the output. Here's a quick example I scraped together:
$list = ftp_rawlist($ftp, '.');
$results = array();
foreach ($list as $line) {
    // Split a Unix-style listing line into its nine columns
    list($perms, $links, $user, $group, $size, $d1, $d2, $d3, $name) =
        preg_split('/\s+/', $line, 9);
    $stamp = strtotime(implode(' ', array($d1, $d2, $d3)));
    $results[] = array('name' => $name, 'timestamp' => $stamp);
}
usort($results, function($a, $b) { return $a['timestamp'] - $b['timestamp']; });
At this point $results contains a list sorted by ascending last modified time; reverse the comparison in the sort function to get the most recently modified files first.
Note: ftp_rawlist does not provide exact modification timestamps, so this code might not always work accurately. You should also verify that the output from your FTP server agrees with this algorithm and include some sanity checks to make sure things stay that way in the future.
How to order files received from FTP by the creation date in C#?
So after I found out that I needed to retrieve a detailed list of the files the sort problem was easy to solve. I just needed to call
Array.Sort(arrayOfFiles)
Here is the working code:
try
{
/* Create an FTP Request */
ftpRequest = (FtpWebRequest)FtpWebRequest.Create(URI);
/* Log in to the FTP Server with the User Name and Password Provided */
ftpRequest.Credentials = new NetworkCredential(ftpUsername, ftpPassword);
/* When in doubt, use these options */
ftpRequest.UseBinary = true;
ftpRequest.UsePassive = true;
ftpRequest.KeepAlive = true;
/* Specify the Type of FTP Request */
ftpRequest.Method = WebRequestMethods.Ftp.ListDirectoryDetails;
/* Send the Request and Get the FTP Server's Response */
ftpResponse = (FtpWebResponse)ftpRequest.GetResponse();
/* Get the FTP Server's Response Stream */
ftpStream = ftpResponse.GetResponseStream();
/* Wrap the Response Stream in a Reader */
StreamReader ftpReader = new StreamReader(ftpStream);
/* Store the Raw Response */
string directoryRaw = null;
/* Read Each Line of the Response and Append a Pipe to Each Line for Easy Parsing */
try
{
while (ftpReader.Peek() != -1)
{
directoryRaw += ftpReader.ReadLine() + "|";
}
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
/* Resource Cleanup */
ftpReader.Close();
ftpStream.Close();
ftpResponse.Close();
ftpRequest = null;
/* Return the Directory Listing as a string Array by Parsing 'directoryRaw' with the Delimiter you Append (I use | in This Example) */
try
{
string[] directoryList = directoryRaw.Split("|".ToCharArray());
Array.Sort(directoryList);
return directoryList;
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
}
catch (Exception ex)
{
Console.WriteLine(ex.ToString());
}
/* Return an Empty string Array if an Exception Occurs */
return new string[] { "" };
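Note that sorting the raw listing lines as strings compares them as text, so the result is chronological only when the text order of the lines happens to match the date order. As a hedged illustration (in Python rather than C#), the sketch below sorts by a parsed date instead. It assumes the server emits a DOS-style listing whose lines start with MM-DD-YY and a 12-hour time; the sample lines and the dos_listing_key helper are invented for the example.

```python
from datetime import datetime


def dos_listing_key(line):
    """Parse the leading 'MM-DD-YY HH:MMAM' columns of a DOS-style listing line."""
    date_part, time_part = line.split()[:2]
    # %p (AM/PM) is locale-dependent; this assumes an English/C locale.
    return datetime.strptime(f"{date_part} {time_part}", "%m-%d-%y %I:%M%p")


# Hypothetical listing lines in DOS style.
listing = [
    "09-09-15  10:00AM                 2566 dir.sizes",
    "05-07-16  04:18PM                 4259 README.TXT",
    "12-01-14  09:00PM       <DIR>          releases",
]
ordered = sorted(listing, key=dos_listing_key)           # oldest first
as_text = sorted(listing)                                # plain string order
```

With these sample lines the two orders differ, because the text starts with the month rather than the year.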
Getting the modification time of a file on a FTP server
You want Net::FTP#mtime.
Example from documentation:
Net::FTP.open('ftp.netlab.co.jp') do |ftp|
ftp.login
files = ftp.chdir('pub/lang/ruby/contrib')
files = ftp.list('n*')
ftp.getbinaryfile('nif.rb-0.91.gz', 'nif.gz', 1024)
ftp.mtime('file.pdf')
end
You can use #mtime with #nlst to filter through the list of remote files.
Net::FTP.open('ftp.netlab.co.jp') do |ftp|
  ftp.login
  # nlst returns an array of file names, so iterate over it with each
  ftp.nlst.each do |file|
    if ftp.mtime(file) # ...
    end
  end
end
How to get FTP file's modify time using Python ftplib
MLST or MDTM
While you can retrieve a timestamp of an individual file over FTP with the MLST or MDTM commands, neither is supported by ftplib.
Of course, you can implement MLST or MDTM on your own using FTP.voidcmd.
For details, refer to RFC 3659, particularly the:
- 3. File Modification Time (MDTM)
- 7. Listings for Machine Processing (MLST and MLSD)
A simple example for MDTM:
from ftplib import FTP
from dateutil import parser
# ... (connection to FTP)
timestamp = ftp.voidcmd("MDTM /remote/path/file.txt")[4:].strip()
time = parser.parse(timestamp)
print(time)
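MLST can be handled along the same lines. The sketch below only parses the reply text, assuming the multi-line format described in RFC 3659 (a status line, one entry line starting with a space, and a closing status line); the sample reply and the parse_mlst helper are assumptions for illustration. With ftplib the reply text would come from ftp.voidcmd("MLST /remote/path/file.txt").

```python
def parse_mlst(response):
    """Extract (pathname, facts) from a multi-line MLST reply."""
    for line in response.splitlines():
        # The entry line is the one that starts with a space.
        if line.startswith(" "):
            facts_part, _, name = line[1:].partition(" ")
            facts = {}
            for fact in facts_part.split(";"):
                if "=" in fact:
                    key, _, value = fact.partition("=")
                    facts[key.lower()] = value
            return name, facts
    raise ValueError("no MLST entry line in response")


# Hypothetical reply text for illustration.
sample = (
    "250-Listing file.txt\n"
    " modify=20180327101500;size=4467;type=file; /remote/path/file.txt\n"
    "250 End"
)
name, facts = parse_mlst(sample)
timestamp = facts["modify"]
```

The value of the modify fact has the same YYYYMMDDHHMMSS[.sss] format that MDTM returns.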
MLSD
The only command explicitly supported by the ftplib library that can return a standardized file timestamp is MLSD, via the FTP.mlsd method. Though its use makes sense only if you want to retrieve timestamps for multiple files.
- Retrieve a complete directory listing using MLSD
- Search the returned collection for the file(s) you want
- Retrieve the modify fact
- Parse it according to the specification, YYYYMMDDHHMMSS[.sss]
For details, refer to RFC 3659 again, particularly the:
- 7.5.3. The modify Fact section
- 2.3. Times section
from ftplib import FTP
from dateutil import parser
# ... (connection to FTP)
files = ftp.mlsd("/remote/path")
for file in files:
    name = file[0]
    timestamp = file[1]['modify']
    time = parser.parse(timestamp)
    print(name + ' - ' + str(time))
Note that times returned by MLST, MLSD and MDTM are in UTC (unless the server is broken). So you may need to correct them for your local timezone.
Again, refer to RFC 3659 2.3. Times section:
Time values are always represented in UTC (GMT), and in the Gregorian
calendar regardless of what calendar may have been in use at the date
and time indicated at the location of the server-PI.
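A minimal sketch of that correction, using only the standard library instead of dateutil: parse the value as UTC, then convert to the local timezone of the machine running the script. The parse_ftp_time helper and the sample value are assumptions for illustration.

```python
from datetime import datetime, timezone


def parse_ftp_time(value):
    """Parse an RFC 3659 time value (YYYYMMDDHHMMSS[.sss]) as an aware UTC datetime."""
    whole, _, fraction = value.partition(".")
    parsed = datetime.strptime(whole, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    if fraction:
        # Pad or truncate the fractional part to microseconds.
        parsed = parsed.replace(microsecond=int(fraction[:6].ljust(6, "0")))
    return parsed


utc_time = parse_ftp_time("20180327101500.250")
# astimezone() without arguments converts to the system's local timezone.
local_time = utc_time.astimezone()
print(utc_time.isoformat(), "->", local_time.isoformat())
```

Because both values are timezone-aware, they still compare as the same instant; only the displayed offset changes.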
LIST
If the FTP server does not support any of MLST, MLSD and MDTM, all you can do is use the obsolete LIST command. That involves parsing the proprietary listing it returns.
A common *nix listing is like:
-rw-r--r-- 1 user group 4467 Mar 27 2018 file1.zip
-rw-r--r-- 1 user group 124529 Jun 18 15:31 file2.zip
With a listing like this, this code will do:
from ftplib import FTP
from dateutil import parser
# ... (connection to FTP)
lines = []
ftp.dir("/remote/path", lines.append)
for line in lines:
    # Split into at most 9 fields so a file name containing spaces stays whole
    tokens = line.split(maxsplit=8)
    name = tokens[8]
    time_str = tokens[5] + " " + tokens[6] + " " + tokens[7]
    time = parser.parse(time_str)
    print(name + ' - ' + str(time))
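In a listing like the one above, recent entries show a time instead of a year, so the year has to be inferred, and a parser that simply fills in the current year can place such an entry in the future. The sketch below is one possible way to handle that with the standard library, assuming English month abbreviations and that an entry without a year is never more than a day in the future; the parse_list_time helper is hypothetical.

```python
from datetime import datetime, timedelta


def parse_list_time(month, day, year_or_time, now):
    """Parse the date columns of a Unix-style LIST line, inferring a missing year."""
    if ":" not in year_or_time:
        # Older entries carry an explicit year but no time of day.
        return datetime.strptime(f"{year_or_time} {month} {day}", "%Y %b %d")
    for year in (now.year, now.year - 1):
        try:
            parsed = datetime.strptime(
                f"{year} {month} {day} {year_or_time}", "%Y %b %d %H:%M"
            )
        except ValueError:
            continue  # e.g. Feb 29 in a non-leap year
        # Allow one day of slack for clock and timezone differences.
        if parsed <= now + timedelta(days=1):
            return parsed
    raise ValueError("cannot infer year for listing date")


now = datetime(2018, 9, 10, 12, 0)  # fixed reference time for the example
line = "-rw-r--r-- 1 user group 124529 Jun 18 15:31 file2.zip"
tokens = line.split(maxsplit=8)
modified = parse_list_time(tokens[5], tokens[6], tokens[7], now)
```

In real use, now would be datetime.now(); it is a parameter here so the inference can be tested with a fixed date.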
Finding the latest file
See also Python FTP get the most recent file by date.
Retrieve modified DateTime of a file from an FTP Server
Until we see the output from this particular FTP server (they are all different) for directory listings, here's a path you can follow:
library(curl)
library(stringr)
Get the raw directory listing:
con <- curl("ftp://ftp.FreeBSD.org/pub/FreeBSD/")
dat <- readLines(con)
close(con)
dat
## [1] "-rw-rw-r-- 1 ftp ftp 4259 May 07 16:18 README.TXT"
## [2] "-rw-rw-r-- 1 ftp ftp 35 Sep 09 21:00 TIMESTAMP"
## [3] "drwxrwxr-x 9 ftp ftp 11 Sep 09 21:00 development"
## [4] "-rw-r--r-- 1 ftp ftp 2566 Sep 09 10:00 dir.sizes"
## [5] "drwxrwxr-x 28 ftp ftp 52 Aug 23 10:44 doc"
## [6] "drwxrwxr-x 5 ftp ftp 5 Aug 05 04:16 ports"
## [7] "drwxrwxr-x 10 ftp ftp 12 Sep 09 21:00 releases"
Filter out the directories:
no_dirs <- grep("^d", dat, value=TRUE, invert=TRUE)
no_dirs
## [1] "-rw-rw-r-- 1 ftp ftp 4259 May 07 16:18 README.TXT"
## [2] "-rw-rw-r-- 1 ftp ftp 35 Sep 09 21:00 TIMESTAMP"
## [3] "-rw-r--r-- 1 ftp ftp 2566 Sep 09 10:00 dir.sizes"
Extract just the timestamp and filename:
date_and_name <- sub("^[[:alnum:][:punct:][:blank:]]{43}", "", no_dirs)
date_and_name
## [1] "May 07 16:18 README.TXT"
## [2] "Sep 09 21:00 TIMESTAMP"
## [3] "Sep 09 10:00 dir.sizes"
Put them into a data.frame:
do.call(rbind.data.frame,
lapply(str_match_all(date_and_name, "([[:alnum:] :]{12}) (.*)$"),
function(x) {
data.frame(timestamp=x[2],
filename=x[3],
stringsAsFactors=FALSE)
})) -> dat
dat
## timestamp filename
## 1 May 07 16:18 README.TXT
## 2 Sep 09 21:00 TIMESTAMP
## 3 Sep 09 10:00 dir.sizes
You still need to convert the timestamp to a POSIXct, but that's trivial.
This particular example is dependent on that system's FTP directory listing response. Just change the regexes for yours.
Fetching last modified date of a file in FTP server using FTPClient.getModificationTime yields null
FTPClient.getModificationTime returns null when the server returns an error response to the MDTM command. Typically that would mean either that:
- "File path" does not exist; or
- the FTP server does not support the MDTM command.
Check FTPClient.getReplyString().
If it turns out that the FTP server does not support the MDTM command, you will have to use another method to retrieve the timestamps. If MDTM is not supported, MLSD won't be either.
In that case the only other way is to use the LIST command to retrieve a listing of all files and look up the file you need. Use FTPClient.listFiles().
FTPFile[] remoteFiles = ftpClient.listFiles(remotePath);
Arrays.sort(remoteFiles,
Comparator.comparing((FTPFile remoteFile) -> remoteFile.getTimestamp()).reversed());
FTPFile latestFile = remoteFiles[0];
System.out.println(
"Latest file is " + latestFile.getName() +
" with timestamp " + latestFile.getTimestamp().getTime().toString());
See also Make FTP server return files listed by timestamp with Apache FTPClient.