Difference Between DirectoryIterator and FilesystemIterator

This is off the top of my head; I got somewhat caught up in the changes made in the run-up to PHP 5.3 concerning the SPL (Standard PHP Library) and the parts that were going to be moved to the (horrible) PECL extensions.

The major change in 5.3 was that the SPL became an extension that can no longer be disabled; see the 5.3 changelog, which notes:

  • Added SPL to list of standard extensions that cannot be disabled.
    (Marcus)

So all the fancy classes like DirectoryIterator or SplDoublyLinkedList were now a fixed suite of classes that came with PHP 5.3.

There was a lot of discussion that DirectoryIterator was still very clumsy at iterating over files and directories, and that its behaviour was not agnostic enough of the filesystem being used. Depending on the filesystem (Windows NTFS / *nix EXTx), the results the iterator returned differed from one another, and *nix environments by default always returned the dot and double-dot directories (. and ..) as valid directories. These dot directories could then be filtered out in the loop using the isDot() method.

$it = new DirectoryIterator(__DIR__);
foreach ($it as $fileinfo) {
    if (!$fileinfo->isDot()) {
        var_dump($fileinfo->getFilename());
    }
}

So FilesystemIterator was introduced in PHP 5.3 and became the new parent class of RecursiveDirectoryIterator, which before that release extended DirectoryIterator directly (FilesystemIterator itself extends DirectoryIterator and implements this interchangeable behaviour by default). The results FilesystemIterator produces are then the same across filesystems and interchangeable, without the need for any filtering overhead in the loop:

$it = new FilesystemIterator(__DIR__);
foreach ($it as $fileinfo) {
    echo $fileinfo->getFilename() . "\n";
}
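
To make the difference concrete, here is a minimal sketch that runs both iterators over the same directory. The scratch directory under sys_get_temp_dir() and the file names a.txt and b.txt are made up for this illustration. DirectoryIterator yields the dot entries, while FilesystemIterator skips them because FilesystemIterator::SKIP_DOTS is among its default flags.

```php
<?php
// Hypothetical scratch directory, created only for this demonstration.
$dir = sys_get_temp_dir() . '/fsit_demo_' . uniqid();
mkdir($dir);
touch($dir . '/a.txt');
touch($dir . '/b.txt');

// DirectoryIterator: the dot entries are part of the iteration.
$withDots = array();
foreach (new DirectoryIterator($dir) as $fileinfo) {
    $withDots[] = $fileinfo->getFilename();
}

// FilesystemIterator: SKIP_DOTS is set by default, so no filtering is needed.
$withoutDots = array();
foreach (new FilesystemIterator($dir) as $fileinfo) {
    $withoutDots[] = $fileinfo->getFilename();
}

// Directory order is not guaranteed, so sort before printing.
sort($withDots);
sort($withoutDots);
print_r($withDots);
print_r($withoutDots);

// Remove the scratch files again.
unlink($dir . '/a.txt');
unlink($dir . '/b.txt');
rmdir($dir);
```

The first list contains . and .. in addition to the two files; the second contains only the two files.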

It's a good question why the documentation was never updated to tell users that FilesystemIterator has in effect superseded DirectoryIterator.

Get the files inside a directory

There are a lot of ways. The older way is scandir, but DirectoryIterator is probably the best way.

There's also readdir (to be used with opendir) and glob.

Here are some examples of how to use each one to print all the files in the current directory:

DirectoryIterator usage: (recommended)

foreach (new DirectoryIterator('.') as $file) {
    if ($file->isDot()) continue;
    print $file->getFilename() . '<br>';
}

scandir usage:

$files = scandir('.');
foreach ($files as $file) {
    if ($file == '.' || $file == '..') continue;
    print $file . '<br>';
}

opendir and readdir usage:

if ($handle = opendir('.')) {
    while (false !== ($file = readdir($handle))) {
        if ($file == '.' || $file == '..') continue;
        print $file . '<br>';
    }
    closedir($handle);
}

glob usage:

foreach (glob("*") as $file) {
    if ($file == '.' || $file == '..') continue;
    print $file . '<br>';
}

As mentioned in the comments, glob is nice because the asterisk I used there can actually be used to match file names, so glob('*.txt') would get you all the text files in the folder and glob('image_*') would get you all files that start with image_.
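
The two patterns mentioned above can be tried out with a short sketch. The scratch directory and the four file names are assumptions made up for this example. Note that glob() returns each match with the directory prefix that was part of the pattern, so basename() is used here to get the bare names.

```php
<?php
// Hypothetical scratch directory with a few made-up files.
$dir = sys_get_temp_dir() . '/glob_demo_' . uniqid();
mkdir($dir);
$names = array('notes.txt', 'readme.txt', 'image_1.png', 'image_2.png');
foreach ($names as $name) {
    touch($dir . '/' . $name);
}

// All text files in the directory.
$textFiles = array_map('basename', glob($dir . '/*.txt'));
// All files whose name starts with image_.
$images = array_map('basename', glob($dir . '/image_*'));

sort($textFiles);
sort($images);
print_r($textFiles);
print_r($images);

// Remove the scratch files again.
foreach ($names as $name) {
    unlink($dir . '/' . $name);
}
rmdir($dir);
```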

Directory iterator and recursive directory iterator

  • 1 - Why is the method described in the other accepted answer not working for me?

As far as I can tell, the code works perfectly, but your implementation is wrong. You are using the following:

Code

   $dir = new RecursiveDirectoryIterator($upload_dir_real);

Instead of

    $dir = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($upload_dir_real));

  • That same answer actually calls RecursiveIteratorIterator twice? I mean, if it is recursive, it cannot be recursive twice... :-))

No, it does not. These are different classes:

RecursiveIteratorIterator != RecursiveDirectoryIterator != FilesystemIterator
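
A minimal sketch of why the wrapper matters, using a made-up scratch tree (the names sub, top.txt and nested.txt are assumptions for this illustration): iterated on its own, RecursiveDirectoryIterator only walks the top level, and it is RecursiveIteratorIterator that actually descends into the children.

```php
<?php
// Hypothetical scratch tree: one file at the top, one inside a subdirectory.
$root = sys_get_temp_dir() . '/rdi_demo_' . uniqid();
mkdir($root . '/sub', 0777, true);
touch($root . '/top.txt');
touch($root . '/sub/nested.txt');

// On its own, RecursiveDirectoryIterator lists only the top level.
$flat = array();
foreach (new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS) as $item) {
    $flat[] = $item->getFilename();
}

// Wrapped in RecursiveIteratorIterator, the whole tree is walked.
// The default mode (LEAVES_ONLY) yields files but not the directories themselves.
$deep = array();
$walker = new RecursiveIteratorIterator(
    new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
);
foreach ($walker as $item) {
    $deep[] = $item->getFilename();
}
unset($walker, $item);

sort($flat);
sort($deep);
print_r($flat);
print_r($deep);

// Remove the scratch tree again.
unlink($root . '/sub/nested.txt');
rmdir($root . '/sub');
unlink($root . '/top.txt');
rmdir($root);
```

The first list holds sub and top.txt; the second holds nested.txt and top.txt. Passing RecursiveIteratorIterator::SELF_FIRST as the second constructor argument would yield the directories as well.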

  • How come FilesystemIterator is not found, even though the PHP manual states (to my understanding) that it is part of what the recursive iterator is built upon?

You already answered that yourself in your comment: you are using PHP 5.2.9, which is no longer supported or recommended. FilesystemIterator was only introduced in PHP 5.3.

  • 3 - Is there a better way to list all folders and files cross-platform?

Once that is resolved, all you need is FilesystemIterator::SKIP_DOTS; you don't have to call $file->isDot().

Example

$fullPath = __DIR__;
$dirs = $files = array();

$directory = new RecursiveDirectoryIterator($fullPath, FilesystemIterator::SKIP_DOTS);
foreach (new RecursiveIteratorIterator($directory, RecursiveIteratorIterator::SELF_FIRST) as $path) {
    if ($path->isDir()) {
        $dirs[] = $path->getPathname();
    } else {
        $files[] = realpath($path->getPathname());
    }
}

var_dump($files, $dirs);

What exactly are the benefits of using a PHP 5 DirectoryIterator over PHP 4 opendir/readdir/closedir?

To understand the difference between the two, let's write two functions that read the contents of a directory into an array, one using the procedural method and the other the object-oriented one:

Procedural, using opendir/readdir/closedir

function list_directory_p($dirpath) {
    if (!is_dir($dirpath) || !is_readable($dirpath)) {
        error_log(__FUNCTION__ . ": Argument should be a path to valid, readable directory (" . var_export($dirpath, true) . " provided)");
        return null;
    }
    $paths = array();
    $dir = realpath($dirpath);
    $dh = opendir($dir);
    while (false !== ($f = readdir($dh))) {
        if ($f != '.' && $f != '..') {
            $paths[] = $dir . DIRECTORY_SEPARATOR . $f;
        }
    }
    closedir($dh);
    return $paths;
}

Object Oriented, using DirectoryIterator

function list_directory_oo($dirpath) {
    if (!is_dir($dirpath) || !is_readable($dirpath)) {
        error_log(__FUNCTION__ . ": Argument should be a path to valid, readable directory (" . var_export($dirpath, true) . " provided)");
        return null;
    }
    $paths = array();
    $dir = realpath($dirpath);
    $di = new DirectoryIterator($dir);
    foreach ($di as $fileinfo) {
        if (!$fileinfo->isDot()) {
            $paths[] = $fileinfo->getRealPath();
        }
    }
    return $paths;
}

Performance

Let's assess their performance first:

// Assumed value: the original snippet does not show how $num_iterations was set.
$num_iterations = 1000;

// Time the object-oriented version.
$start_t = microtime(true);
for ($i = 0; $i < $num_iterations; $i++) {
    $paths = list_directory_oo(".");
}
$end_t = microtime(true);
$time_diff_micro = (($end_t - $start_t) * 1000000) / $num_iterations;
echo "Time taken per call (list_directory_oo) = " . round($time_diff_micro / 1000, 2) . "ms (" . count($paths) . " files)\n";

// Time the procedural version.
$start_t = microtime(true);
for ($i = 0; $i < $num_iterations; $i++) {
    $paths = list_directory_p(".");
}
$end_t = microtime(true);
$time_diff_micro = (($end_t - $start_t) * 1000000) / $num_iterations;
echo "Time taken per call (list_directory_p) = " . round($time_diff_micro / 1000, 2) . "ms (" . count($paths) . " files)\n";

On my laptop (Win 7 / NTFS), the procedural method seems to be the clear winner:

C:\code>"C:\Program Files (x86)\PHP\php.exe" list_directory.php
Time taken per call (list_directory_oo) = 4.46ms (161 files)
Time taken per call (list_directory_p) = 0.34ms (161 files)

On an entry-level AWS machine (CentOS):

[~]$ php list_directory.php
Time taken per call (list_directory_oo) = 0.84ms (203 files)
Time taken per call (list_directory_p) = 0.36ms (203 files)

The results above are from PHP 5.4. You'll see similar results using PHP 5.3 and 5.2, and also when PHP is running under Apache or NGINX.

Code Readability

Although slower, code using DirectoryIterator is more readable.

File reading order

The order in which directory contents are read is exactly the same with either method. That is, if list_directory_oo returns array('h', 'a', 'g'), list_directory_p also returns array('h', 'a', 'g').

Extensibility

The two functions above demonstrate performance and readability. Note that if your code needs to do further operations, code using DirectoryIterator is more extensible.

For example, in the function list_directory_oo above, the $fileinfo object provides you with a bunch of methods such as getMTime(), getOwner(), isReadable() etc. (the return values of most of which are cached and do not require system calls).

Therefore, depending on your use case (that is, what you intend to do with each child element of the input directory), it's possible that code using DirectoryIterator performs as well as, or sometimes better than, code using opendir.
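
As a rough sketch of that kind of extension, the function below collects a few pieces of metadata per entry. The name list_directory_meta, the array keys and the scratch file hello.txt are hypothetical and not part of the original answer; only the SplFileInfo methods themselves are real.

```php
<?php
// Hypothetical helper: returns metadata for every entry, keyed by file name.
function list_directory_meta($dirpath) {
    $rows = array();
    foreach (new DirectoryIterator($dirpath) as $fileinfo) {
        if ($fileinfo->isDot()) {
            continue;
        }
        $rows[$fileinfo->getFilename()] = array(
            'size'     => $fileinfo->getSize(),     // size in bytes
            'mtime'    => $fileinfo->getMTime(),    // last modification, Unix timestamp
            'readable' => $fileinfo->isReadable(),
            'is_dir'   => $fileinfo->isDir(),
        );
    }
    ksort($rows);
    return $rows;
}

// Made-up scratch directory with a single five-byte file.
$dir = sys_get_temp_dir() . '/meta_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/hello.txt', 'hello');

$rows = list_directory_meta($dir);
print_r($rows);

// Remove the scratch file again.
unlink($dir . '/hello.txt');
rmdir($dir);
```

Doing the same with opendir/readdir would need separate filesize(), filemtime(), is_readable() and is_dir() calls on each path.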

You can modify the code of list_directory_oo and test it yourself.

Summary

The decision of which to use depends entirely on the use case.

If I were to write a cronjob in PHP that recursively scans a directory (and its subdirectories) containing thousands of files and performs certain operations on them, I would choose the procedural method.

But if my requirement is to write a sort of web-interface to display uploaded files (say in a CMS) and their metadata, I would choose DirectoryIterator.

You can choose based on your needs.

DirectoryIterator listing in alphabetical order

As you can see, $dir is not an array of strings, so you can't sort it this way.
The strings (names) are available from each element via getFilename().

So you should do the following steps:

  1. Build a $DirNames array

  2. Sort this array

  3. Display this array

    $dir = new DirectoryIterator("PASTAS/");

    // Collect the names of all subdirectories, skipping . and ..
    $DirNames = array();
    foreach ($dir as $fileinfo) {
        if ($fileinfo->isDir() && !$fileinfo->isDot()) {
            $DirNames[] = $fileinfo->getFilename();
        }
    }

    sort($DirNames);

    foreach ($DirNames as $name) {
        echo '<a href="PASTAS/' . $name . '" class="list-group-item">' . $name . '</a>';
    }

Why use clone keyword here

Since PHP 5, assigning an object with = copies only a handle to the object, not the object itself, so without clone you put a reference to the very same object as $file into the array.

But this variable is only meant to be used within the loop, and you should not rely on what it refers to once the foreach has moved on or finished.

This is why you need to create a copy of it to put into the array, so that you still have access to it after the loop.

UPDATE: Actually, the difference is a little bit deeper in this case; check out this article. DirectoryIterator returns the same object on every iteration, which is why you have to clone it (with its current state) during iteration, whereas FilesystemIterator returns a new object each time, which can be put into an array without clone.
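
The behaviour described in the update can be checked with a short sketch. The scratch directory and the file names a.txt and b.txt are assumptions made up for this illustration.

```php
<?php
// Hypothetical scratch directory with two made-up files.
$dir = sys_get_temp_dir() . '/clone_demo_' . uniqid();
mkdir($dir);
touch($dir . '/a.txt');
touch($dir . '/b.txt');

$shared = array(); // entries stored without clone
$cloned = array(); // entries stored with clone
foreach (new DirectoryIterator($dir) as $file) {
    if ($file->isDot()) {
        continue;
    }
    $shared[] = $file;       // the same object on every pass
    $cloned[] = clone $file; // a copy frozen at the current entry
}

// Without clone, both array slots hold one and the same object.
$sameObject = ($shared[0] === $shared[1]);

// With clone, each copy still knows its own file name.
$clonedNames = array();
foreach ($cloned as $copy) {
    $clonedNames[] = $copy->getFilename();
}
sort($clonedNames);

// FilesystemIterator hands out a new SplFileInfo per entry, so no clone is needed.
$fresh = array();
foreach (new FilesystemIterator($dir) as $file) {
    $fresh[] = $file;
}
$distinctObjects = ($fresh[0] !== $fresh[1]);
$freshNames = array($fresh[0]->getFilename(), $fresh[1]->getFilename());
sort($freshNames);

var_dump($sameObject, $clonedNames, $distinctObjects, $freshNames);

// Release the iterator objects, then remove the scratch files again.
unset($shared, $cloned, $fresh, $file, $copy);
unlink($dir . '/a.txt');
unlink($dir . '/b.txt');
rmdir($dir);
```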

DirectoryIterator scan to exclude '.' and '..' directories still including them?

DirectoryIterator::getPath() returns the full path to the directory, not only the last part of it.

If you only want the last portion of that path, you should probably use SplFileInfo::getBasename() in your condition.

Or, for your specific test, you might want to take a look at the DirectoryIterator::isDot() method (quoting):

Determine if current DirectoryIterator item is '.' or '..'
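
A small sketch of the difference between these methods, using a made-up scratch directory containing a single file named file.txt (both names are assumptions for this example):

```php
<?php
// Hypothetical scratch directory with one made-up file.
$dir = sys_get_temp_dir() . '/dot_demo_' . uniqid();
mkdir($dir);
touch($dir . '/file.txt');

$paths = array(); // distinct values returned by getPath()
$names = array(); // base names of the non-dot entries
foreach (new DirectoryIterator($dir) as $item) {
    // getPath() is the directory being iterated, so it is the same for every entry.
    $paths[$item->getPath()] = true;

    // isDot() is the direct way to skip '.' and '..'.
    if (!$item->isDot()) {
        $names[] = $item->getBasename();
    }
}

print_r(array_keys($paths));
print_r($names);

// Remove the scratch file again.
unlink($dir . '/file.txt');
rmdir($dir);
```

Comparing getPath() against '.' or '..' therefore never matches, which is why the dot entries slip through such a check.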
