How to Get Files in FTP Folder Sorted by Modification Time

There's no standard way to have the FTP server sort the files according to your (or any) criteria.

However, some FTP servers, notably ProFTPD and vsftpd, support proprietary flags with the LIST/NLST commands to sort the entries.

Both of these servers support the -t flag to sort the files by modification time:

LIST -t

This is not only non-standard; it actually violates the FTP protocol, which defines the argument of LIST as a path, not a set of switches.

For all options supported by ProFTPD, see its documentation:

http://www.proftpd.org/docs/howto/ListOptions.html

Note that vsftpd supports only -a, -r, -t, -F and -l with the same meaning as ProFTPD.


If your server does not support the -t switch (or similar), your only option is to retrieve the listing with file attributes as is and sort it locally.

For this you cannot use ftp_nlist, as it returns file names only.

The ideal solution is to use the MLSD FTP command, which returns a reliable machine-readable directory listing. PHP supports it only since version 7.2, through its ftp_mlsd function. Check the "modify" entry of each returned item.

Alternatively, there's an implementation of MLSD in the user comments of the ftp_rawlist function:

https://www.php.net/manual/en/function.ftp-rawlist.php#101071

Before taking this approach, check that your FTP server supports MLSD, as not all FTP servers do (IIS and vsftpd in particular don't).
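One way to check is to ask the server for its feature list with the FEAT command and look for MLST, since RFC 3659 advertises MLST and MLSD together under that one feature name. Below is a minimal sketch in Python (used here only for illustration); the helper names parse_feat and supports_mlsd and the sample reply are made up for this example.

```python
def parse_feat(response):
    """Return the set of feature names found in a raw FEAT reply."""
    features = set()
    for line in response.splitlines():
        # Feature lines start with a single space; the first and last
        # lines ("211-..." and "211 End") are status lines.
        if line.startswith(" "):
            parts = line.split()
            if parts:
                features.add(parts[0].upper())
    return features


def supports_mlsd(response):
    # Assumption: the server follows RFC 3659 and lists "MLST" when it
    # implements MLST/MLSD.
    return "MLST" in parse_feat(response)


# Hypothetical reply, in the form ftplib returns it (lines joined by "\n").
sample = "211-Features:\n MDTM\n MLST type*;size*;modify*;\n SIZE\n211 End"
print(supports_mlsd(sample))
```

With ftplib you would pass in the result of ftp.sendcmd("FEAT"). A server that does not implement FEAT at all answers with an error, which ftplib raises as an exception, so treat that case as "not supported" too.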

Or you can use ftp_rawlist. It returns a server-specific listing of files that can be difficult to parse, but if you need to support only one specific server, you can hard-code the parsing for that server.

How to sort files by modified date through php

If you only want to sort the files by last modified date, you can use

ftp_nlist($conn, '-t .');

This will not tell you what the date for each file is, though.

If you want to get the modified date as well, you can use ftp_rawlist and parse the output. Here's a quick example I scraped together:

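$list = ftp_rawlist($conn, '.');

$results = array();
foreach ($list as $line) {
    // Unix-style line: perms links user group size month day time-or-year name
    $parts = preg_split('/\s+/', $line, 9);
    if (count($parts) < 9) {
        continue; // skip lines that are not file entries, e.g. "total 12"
    }
    list($perms, $links, $user, $group, $size, $d1, $d2, $d3, $name) = $parts;
    $stamp = strtotime(implode(' ', array($d1, $d2, $d3)));
    $results[] = array('name' => $name, 'timestamp' => $stamp);
}

// Sort by timestamp, oldest first
usort($results, function($a, $b) { return $a['timestamp'] - $b['timestamp']; });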

At this point $results contains a list sorted by ascending last modified time; swap $a and $b in the comparison function to get the most recently modified files first.

Note: ftp_rawlist does not provide exact modification timestamps, so this code might not always work accurately. You should also verify that the output from your FTP server agrees with this algorithm and include some sanity checks to make sure things stay that way in the future.
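One concrete source of inaccuracy in Unix-style listings is that recent files are shown as "Mon DD HH:MM" with no year, while older files are shown as "Mon DD YYYY" with no time. A parser that simply assumes the current year can place a file from last December in the future. Here is a rough sketch in Python of one way to resolve the year; parse_ls_time is a name invented for this example, and it assumes English month abbreviations and ignores the server's time zone.

```python
from datetime import datetime, timedelta

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def parse_ls_time(month, day, year_or_time, now):
    """Parse the three date tokens of a Unix-style listing line."""
    mon = MONTHS[month]
    dom = int(day)
    if ":" not in year_or_time:
        # Older entry: the third token is a year and the time is unknown.
        return datetime(int(year_or_time), mon, dom)
    hour, minute = map(int, year_or_time.split(":"))
    # Recent entry: try the current year first, then fall back to the
    # previous one if the result would lie in the future.
    for year in (now.year, now.year - 1):
        try:
            stamp = datetime(year, mon, dom, hour, minute)
        except ValueError:
            continue  # e.g. Feb 29 in a non-leap year
        # Allow one day of slack for time zone differences.
        if stamp <= now + timedelta(days=1):
            return stamp
    raise ValueError("cannot resolve year for listing date")


now = datetime(2024, 1, 15, 12, 0)
print(parse_ls_time("Dec", "20", "08:30", now))
print(parse_ls_time("Mar", "27", "2018", now))
```

Passing now in explicitly keeps the function easy to test; in real use you would pass datetime.now().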

How to order files received from FTP by the creation date in C#?

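So after I found out that I needed to retrieve a detailed list of the files, the sort problem was easy to solve. I just needed to call

Array.Sort(arrayOfFiles)

Note that Array.Sort on a string array orders the raw listing lines alphabetically, so the result is in date order only if each line of your server's listing starts with a sortable date. Here is the working code: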

try
{
    /* Create an FTP Request */
    ftpRequest = (FtpWebRequest)FtpWebRequest.Create(URI);

    /* Log in to the FTP Server with the User Name and Password Provided */
    ftpRequest.Credentials = new NetworkCredential(ftpUsername, ftpPassword);

    /* When in doubt, use these options */
    ftpRequest.UseBinary = true;
    ftpRequest.UsePassive = true;
    ftpRequest.KeepAlive = true;

    /* Specify the Type of FTP Request */
    ftpRequest.Method = WebRequestMethods.Ftp.ListDirectoryDetails;

    /* Establish Return Communication with the FTP Server */
    ftpResponse = (FtpWebResponse)ftpRequest.GetResponse();

    /* Get the FTP Server's Response Stream */
    ftpStream = ftpResponse.GetResponseStream();

    /* Wrap the Response Stream in a Reader */
    StreamReader ftpReader = new StreamReader(ftpStream);

    /* Store the Raw Response */
    string directoryRaw = null;

    /* Read Each Line of the Response and Append a Pipe to Each Line for Easy Parsing */
    try
    {
        while (ftpReader.Peek() != -1)
        {
            directoryRaw += ftpReader.ReadLine() + "|";
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.ToString());
    }

    /* Resource Cleanup */
    ftpReader.Close();
    ftpStream.Close();
    ftpResponse.Close();
    ftpRequest = null;

    /* Return the Directory Listing as a string Array by Parsing 'directoryRaw' with the Delimiter you Append (I use | in This Example) */
    try
    {
        string[] directoryList = directoryRaw.Split("|".ToCharArray());
        Array.Sort(directoryList);

        return directoryList;
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.ToString());
    }
}
catch (Exception ex)
{
    Console.WriteLine(ex.ToString());
}

/* Return an Empty string Array if an Exception Occurs */
return new string[] { "" };

Getting the modification time of a file on a FTP server

You want Net::FTP#mtime.

Example from documentation:

Net::FTP.open('ftp.netlab.co.jp') do |ftp|
  ftp.login
  files = ftp.chdir('pub/lang/ruby/contrib')
  files = ftp.list('n*')
  ftp.getbinaryfile('nif.rb-0.91.gz', 'nif.gz', 1024)
  ftp.mtime('file.pdf')
end

You can use #mtime with #nlst to filter through the list of remote files.

Net::FTP.open('ftp.netlab.co.jp') do |ftp|
  ftp.login
  # nlst returns an array of names, so iterate over it with each
  ftp.nlst.each do |file|
    if ftp.mtime(file) # ...
    end
  end
end

How to get FTP file's modify time using Python ftplib

MLST or MDTM

While you can retrieve a timestamp of an individual file over FTP with the MLST or MDTM commands, ftplib has no dedicated method for either.

Of course, you can implement MLST or MDTM on your own using FTP.voidcmd.

For details, refer to RFC 3659, particularly:

  • 3. File Modification Time (MDTM)
  • 7. Listings for Machine Processing (MLST and MLSD)

A simple example for MDTM:

from ftplib import FTP
from dateutil import parser

# ... (connection to FTP)

timestamp = ftp.voidcmd("MDTM /remote/path/file.txt")[4:].strip()

time = parser.parse(timestamp)

print(time)
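MLST takes a little more work, because its reply spans several lines and the facts sit on an indented middle line. The following is a rough sketch of parsing such a reply; parse_mlst is a name invented for this example, the sample reply is hypothetical, and the layout (a leading space, semicolon-terminated facts, another space, then the path) follows my reading of RFC 3659.

```python
def parse_mlst(response):
    """Extract (facts, pathname) from a raw multi-line MLST reply."""
    for line in response.splitlines():
        # The entry line is the one that starts with a space.
        if line.startswith(" "):
            facts_part, _, pathname = line[1:].partition(" ")
            facts = {}
            for fact in facts_part.split(";"):
                if "=" in fact:
                    key, _, value = fact.partition("=")
                    facts[key.lower()] = value  # fact names are case-insensitive
            return facts, pathname
    raise ValueError("no entry line found in MLST reply")


# Hypothetical reply, in the form ftplib returns it (lines joined by "\n").
sample = ("250-Listing file.txt\n"
          " type=file;size=4467;modify=20180327120000; /remote/path/file.txt\n"
          "250 End")
facts, path = parse_mlst(sample)
print(path, facts["modify"])
```

With a live connection you would feed it the result of ftp.voidcmd("MLST /remote/path/file.txt").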


MLSD

The only command explicitly supported by the ftplib library that can return a standardized file timestamp is MLSD, via the FTP.mlsd method. Its use makes sense only if you want to retrieve timestamps for multiple files, though.

  • Retrieve a complete directory listing using MLSD
  • Search the returned collection for the file(s) you want
  • Retrieve modify fact
  • Parse it according to the specification, YYYYMMDDHHMMSS[.sss]

For details, refer to RFC 3659 again, particularly the:

  • 7.5.3. The modify Fact section
  • 2.3. Times section

from ftplib import FTP
from dateutil import parser

# ... (connection to FTP)

files = ftp.mlsd("/remote/path")

for file in files:
    name = file[0]
    timestamp = file[1]['modify']
    time = parser.parse(timestamp)
    print(name + ' - ' + str(time))

Note that times returned by MLST, MLSD and MDTM are in UTC (unless the server is broken). So you may need to correct them for your local timezone.

Again, refer to RFC 3659 2.3. Times section:

Time values are always represented in UTC (GMT), and in the Gregorian
calendar regardless of what calendar may have been in use at the date
and time indicated at the location of the server-PI.
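As a sketch of that correction, the snippet below parses the YYYYMMDDHHMMSS[.sss] value as a timezone-aware UTC datetime and then converts it to the local zone. It uses only the standard library instead of dateutil, which is a substitution on my part; parse_ftp_time is a name made up for this example.

```python
from datetime import datetime, timezone


def parse_ftp_time(value):
    """Parse an RFC 3659 time value (YYYYMMDDHHMMSS[.sss]) as UTC."""
    main, _, fraction = value.partition(".")
    stamp = datetime.strptime(main, "%Y%m%d%H%M%S")
    if fraction:
        # Pad or truncate the fractional part to microseconds.
        stamp = stamp.replace(microsecond=int(fraction[:6].ljust(6, "0")))
    return stamp.replace(tzinfo=timezone.utc)


utc_time = parse_ftp_time("20180327120000.5")
local_time = utc_time.astimezone()  # converts to the system's local zone
print(utc_time.isoformat())
print(local_time.isoformat())
```

Because the result is timezone-aware, it can be compared safely with other aware datetimes regardless of the zone they are expressed in.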



LIST

If the FTP server does not support any of MLST, MLSD and MDTM, all you can do is use the obsolete LIST command. That involves parsing the server-specific listing it returns.

A common *nix listing is like:

-rw-r--r-- 1 user group   4467 Mar 27  2018 file1.zip
-rw-r--r-- 1 user group 124529 Jun 18 15:31 file2.zip

With a listing like this, this code will do:

from ftplib import FTP
from dateutil import parser

# ... (connection to FTP)

lines = []
ftp.dir("/remote/path", lines.append)

for line in lines:
    # Split into at most 9 fields so a file name with spaces stays in one piece
    tokens = line.split(maxsplit = 8)
    name = tokens[8]
    time_str = tokens[5] + " " + tokens[6] + " " + tokens[7]
    time = parser.parse(time_str)
    print(name + ' - ' + str(time))


Finding the latest file

See also Python FTP get the most recent file by date.
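If the server supports MLSD, a minimal sketch of picking the latest file could look like the following. It relies on the modify fact being a fixed-width UTC digit string, so plain string comparison gives chronological order; latest_file and the sample entries are invented for this example, and the entries have the (name, facts) shape that FTP.mlsd yields.

```python
def latest_file(entries):
    """Return the name of the most recently modified file, or None."""
    files = [
        (name, facts["modify"])
        for name, facts in entries
        # Skip directories and the "." / ".." pseudo-entries.
        if facts.get("type", "").lower() == "file" and "modify" in facts
    ]
    if not files:
        return None
    # YYYYMMDDHHMMSS strings sort chronologically as plain text.
    return max(files, key=lambda item: item[1])[0]


# Hypothetical entries standing in for list(ftp.mlsd("/remote/path")).
sample = [
    (".", {"type": "cdir", "modify": "20240101000000"}),
    ("old.zip", {"type": "file", "modify": "20180327120000"}),
    ("new.zip", {"type": "file", "modify": "20240618153100"}),
    ("docs", {"type": "dir", "modify": "20250101000000"}),
]
print(latest_file(sample))
```

For a server without MLSD, the same idea works on the parsed LIST output, but then you have to compare parsed datetimes rather than raw strings.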

Retrieve modified DateTime of a file from an FTP Server

Until we see the output from this particular FTP server (they are all different) for directory listings, here's a path you can follow:

library(curl)
library(stringr)

Get the raw directory listing:

con <- curl("ftp://ftp.FreeBSD.org/pub/FreeBSD/")
dat <- readLines(con)
close(con)
dat

## [1] "-rw-rw-r-- 1 ftp ftp 4259 May 07 16:18 README.TXT"
## [2] "-rw-rw-r-- 1 ftp ftp 35 Sep 09 21:00 TIMESTAMP"
## [3] "drwxrwxr-x 9 ftp ftp 11 Sep 09 21:00 development"
## [4] "-rw-r--r-- 1 ftp ftp 2566 Sep 09 10:00 dir.sizes"
## [5] "drwxrwxr-x 28 ftp ftp 52 Aug 23 10:44 doc"
## [6] "drwxrwxr-x 5 ftp ftp 5 Aug 05 04:16 ports"
## [7] "drwxrwxr-x 10 ftp ftp 12 Sep 09 21:00 releases"

Filter out the directories:

no_dirs <- grep("^d", dat, value=TRUE, invert=TRUE)
no_dirs

## [1] "-rw-rw-r-- 1 ftp ftp 4259 May 07 16:18 README.TXT"
## [2] "-rw-rw-r-- 1 ftp ftp 35 Sep 09 21:00 TIMESTAMP"
## [3] "-rw-r--r-- 1 ftp ftp 2566 Sep 09 10:00 dir.sizes"

Extract just the timestamp and filename:

date_and_name <- sub("^[[:alnum:][:punct:][:blank:]]{43}", "", no_dirs)
date_and_name
## [1] "May 07 16:18 README.TXT"
## [2] "Sep 09 21:00 TIMESTAMP"
## [3] "Sep 09 10:00 dir.sizes"

Put them into a data.frame:

do.call(rbind.data.frame,
        lapply(str_match_all(date_and_name, "([[:alnum:] :]{12}) (.*)$"),
               function(x) {
                 data.frame(timestamp=x[2],
                            filename=x[3],
                            stringsAsFactors=FALSE)
               })) -> dat
dat

## timestamp filename
## 1 May 07 16:18 README.TXT
## 2 Sep 09 21:00 TIMESTAMP
## 3 Sep 09 10:00 dir.sizes

You still need to convert the timestamp to a POSIXct but that's trivial.

This particular example is dependent on that system's FTP directory listing response. Just change the regexes for yours.

Fetching last modified date of a file in FTP server using FTPClient.getModificationTime yields null

FTPClient.getModificationTime returns null when the server returns an error response to the MDTM command. Typically that would mean that either:

  • the "File path" does not exist; or
  • the FTP server does not support the MDTM command.

Check FTPClient.getReplyString().


If it turns out that the FTP server does not support the MDTM command, you will have to use another method to retrieve the timestamps. If MDTM is not supported, MLSD most likely won't be either.

In that case, the only other way is to use the LIST command to retrieve a listing of all files and look up the file you need. Use FTPClient.listFiles():

FTPFile[] remoteFiles = ftpClient.listFiles(remotePath);

Arrays.sort(remoteFiles,
    Comparator.comparing((FTPFile remoteFile) -> remoteFile.getTimestamp()).reversed());

FTPFile latestFile = remoteFiles[0];
System.out.println(
    "Latest file is " + latestFile.getName() +
    " with timestamp " + latestFile.getTimestamp().getTime().toString());

See also Make FTP server return files listed by timestamp with Apache FTPClient.


