How to use RegexIterator in PHP
There are a couple of different ways of going about something like this. I'll give two quick approaches for you to choose from: quick and dirty, versus longer and less dirty (though it's a Friday night, so we're allowed to go a little bit crazy).
1. Quick (and dirty)
This involves just writing a regular expression (it could be split into several) to filter the collection of files in one fell swoop.
(Only the two commented lines are really important to the concept.)
$directory = new RecursiveDirectoryIterator(__DIR__);
$flattened = new RecursiveIteratorIterator($directory);
// Make sure the path does not contain "/.Trash*" folders and ends with a .php or .html file
$files = new RegexIterator($flattened, '#^(?:[A-Z]:)?(?:/(?!\.Trash)[^/]+)+/[^/]+\.(?:php|html)$#Di');
foreach ($files as $file) {
    echo $file . PHP_EOL;
}
This approach has a number of issues, though it is quick to implement, being just a one-liner (even if the regex might be a pain to decipher).
2. Less quick (and less dirty)
A more reusable approach is to create a couple of bespoke filters (using regex, or whatever you like!) to whittle the list of available items in the initial RecursiveDirectoryIterator down to only those that you want. The following is only one example, written quickly just for you, of extending the RecursiveRegexIterator.
We start with a base class whose main job is to keep hold of the regex that we want to filter with; everything else is deferred back to the RecursiveRegexIterator. Note that the class is abstract since it doesn't actually do anything useful: the actual filtering is done by the two classes which extend this one. Also, it may be called FilesystemRegexFilter, but there is nothing forcing it (at this level) to filter filesystem-related classes (I'd have chosen a better name, if I weren't quite so sleepy).
abstract class FilesystemRegexFilter extends RecursiveRegexIterator {
    protected $regex;
    public function __construct(RecursiveIterator $it, $regex) {
        $this->regex = $regex;
        parent::__construct($it, $regex);
    }
}
These two classes are very basic filters, acting on the file name and directory name respectively.
class FilenameFilter extends FilesystemRegexFilter {
    // Filter files against the regex; directories always pass so recursion continues
    public function accept(): bool {
        return ( ! $this->isFile() || preg_match($this->regex, $this->getFilename()));
    }
}
class DirnameFilter extends FilesystemRegexFilter {
    // Filter directories against the regex; files always pass
    public function accept(): bool {
        return ( ! $this->isDir() || preg_match($this->regex, $this->getFilename()));
    }
}
To put those into practice, the following iterates recursively over the contents of the directory in which the script resides (feel free to edit this!) and filters out the .Trash folders (by making sure that folder names do match the specially crafted regex), accepting only PHP and HTML files.
// SKIP_DOTS leaves out the "." and ".." entries, which would otherwise pass both filters
$directory = new RecursiveDirectoryIterator(__DIR__, FilesystemIterator::SKIP_DOTS);
// Filter out ".Trash*" folders
$filter = new DirnameFilter($directory, '/^(?!\.Trash)/');
// Filter PHP/HTML files
$filter = new FilenameFilter($filter, '/\.(?:php|html)$/');
foreach (new RecursiveIteratorIterator($filter) as $file) {
    echo $file . PHP_EOL;
}
Of particular note is that since our filters are recursive, we can choose to play around with how to iterate over them. For example, we could easily limit ourselves to only scanning up to 2 levels deep (including the starting folder) by doing:
$files = new RecursiveIteratorIterator($filter);
$files->setMaxDepth(1); // Two levels, the parameter is zero-based.
foreach ($files as $file) {
    echo $file . PHP_EOL;
}
It is also super-easy to add yet more filters (by instantiating more of our filtering classes with different regexes; or, by creating new filtering classes) for more specialised filtering needs (e.g. file size, full-path length, etc.).
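As a rough illustration of the "new filtering class" idea, here is a minimal sketch of a file-size filter. The class name FilesizeFilter, the helper listFilesUpTo() and the $maxBytes parameter are all made up for this example; they are not part of SPL. Because the constructor takes an extra argument, getChildren() is overridden so that the limit is handed down to sub-directories.

```php
<?php
// Hypothetical filter: keeps files of at most $maxBytes.
// Directories always pass so that recursion can continue into them.
class FilesizeFilter extends RecursiveFilterIterator {
    protected $maxBytes;

    public function __construct(RecursiveIterator $it, $maxBytes) {
        $this->maxBytes = $maxBytes;
        parent::__construct($it);
    }

    public function accept(): bool {
        $current = $this->current(); // an SplFileInfo with the default flags
        return ! $current->isFile() || $current->getSize() <= $this->maxBytes;
    }

    // Pass the size limit down to the filter created for each sub-directory.
    public function getChildren(): ?RecursiveFilterIterator {
        return new static($this->getInnerIterator()->getChildren(), $this->maxBytes);
    }
}

// Hypothetical helper: returns the paths of all files under $dir up to $maxBytes.
function listFilesUpTo(string $dir, int $maxBytes): array {
    $directory = new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS);
    $filter = new FilesizeFilter($directory, $maxBytes);
    $paths = [];
    foreach (new RecursiveIteratorIterator($filter) as $file) {
        $paths[] = (string) $file;
    }
    return $paths;
}
```

Since it is just another recursive iterator, it should also be possible to wrap it around an existing filter chain, e.g. new FilesizeFilter($filter, 1024), before handing the result to RecursiveIteratorIterator.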
P.S. Hmm this answer babbles a bit; I tried to keep it as concise as possible (even removing vast swathes of super-babble). Apologies if the net result leaves the answer incoherent.
PHP SPL RegexIterator how files would be ordered?
RecursiveDirectoryIterator uses opendir, which doesn't sort its results.
If you want sorted results, you can use scandir, but it's not recursive or iterative.
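If you need both recursion and a predictable order, one option (not the only one) is to collect the paths from the iterator first and sort them afterwards. The sketch below assumes a plain string sort is what you want; the function name sortedFiles() is made up for this example.

```php
<?php
// Hypothetical helper: collect every file path under $dir, then sort the paths.
function sortedFiles(string $dir): array {
    $flattened = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    $paths = [];
    foreach ($flattened as $file) {
        $paths[] = $file->getPathname();
    }
    sort($paths, SORT_STRING); // the iterator itself gives no ordering guarantee
    return $paths;
}
```

The trade-off is that the whole listing is held in memory before anything is returned, so the lazy, one-at-a-time behaviour of the iterator is lost.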
RecursiveIterator return array with file extension
Try to var_dump $files and you will see. If you don't want to put both elements of the $file array into your $fileList, then don't use array_merge; simply do:
foreach ($files as $file) {
    $fileList[] = $file[0];
}
And for a pretty rough-and-ready fix to the \, do a str_replace or similar. Something like:
foreach ($files as $file) {
    $fileList[] = str_replace('/', '\\', $file[0]);
}
RecursiveDirectoryIterator + RecursiveIteratorIterator + RegexIterator are not working like they should
The comment doesn't tell the whole truth in this sentence:
$Regex will contain a single index array for each PHP file.
You actually need to iterate over $Regex, as a dump won't give you back a usual array:
foreach ($Regex as $file) {
    var_dump($file);
}
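To make that concrete, here is a small sketch that can be run without touching the filesystem. The list of paths and the pattern are assumptions standing in for the question's directory iterator and regex, and GET_MATCH mode is assumed because the quoted comment talks about one array per file.

```php
<?php
// Hypothetical stand-in for the flattened directory iterator.
$paths = new ArrayIterator(['src/a.php', 'src/readme.txt', 'src/lib/b.php']);

// In GET_MATCH mode each element is the array of matches for that entry.
$Regex = new RegexIterator($paths, '/^.+\.php$/i', RegexIterator::GET_MATCH);

// $Regex is an object, not an array, so walk it to get at the matches.
$found = [];
foreach ($Regex as $match) {
    $found[] = $match[0]; // index 0 holds the full match
}
```

After the loop, $found is an ordinary array holding only the entries that matched.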
Using literal numbers seems to break RegexIterator
What looks to be happening here is that the directory itself in $IMAGES_DIR is included in the path returned to $r in your iteration. Using your working pattern, if you print_r($r); inside the loop you'll see the matched patterns:
array(6) {
[0]=>
string(19) "./images/test/4.png"
[1]=>
string(19) "./images/test/6.png"
[2]=>
string(19) "./images/test/5.png"
[3]=>
string(14) "./images/3.png"
[4]=>
string(14) "./images/1.png"
[5]=>
string(14) "./images/2.png"
}
So, you need to construct your expression either to incorporate the directory, or to ignore it and not anchor with ^. Your pattern as attempted matches exactly strings like 1.png, but the input string it is testing is actually ./images/1.png.
Instead I would recommend using
$IMG_MASK = '#/[1-3]\.png$#';
This pattern does not anchor the start of the string with ^, and instead begins matching at the / before the digit.
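The difference can be checked in isolation by feeding a few path strings through RegexIterator. In this sketch the paths are copied from the dump above, while the anchored pattern is only a guess at what the original attempt looked like.

```php
<?php
$paths = ['./images/test/4.png', './images/3.png', './images/1.png', './images/2.png'];

// Anchored at the start (assumed original attempt): nothing matches,
// because every string begins with "./images/".
$anchored = iterator_to_array(
    new RegexIterator(new ArrayIterator($paths), '#^[1-3]\.png$#'),
    false
);

// Unanchored: matching starts at the "/" before the digit.
$unanchored = iterator_to_array(
    new RegexIterator(new ArrayIterator($paths), '#/[1-3]\.png$#'),
    false
);
```

Here $anchored ends up empty, while $unanchored holds the three paths whose file name is 1.png, 2.png or 3.png.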
If you are interested in getting the full paths, restore your .+ to the start, and use DIRECTORY_SEPARATOR just before the digit:
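// preg_quote() stops a backslash separator (Windows) from being read as a regex escape
$IMG_MASK = '#.+' . preg_quote(DIRECTORY_SEPARATOR, '#') . '[1-3]\.png$#';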
This will match anything (.+) up to a / (or your platform's separator), then match the single digit and .png. The result is an array like:
Array
(
[0] => ./images/3.png
[1] => ./images/1.png
[2] => ./images/2.png
)
Of course, if you also want those images in ./images/test/, adjust the regex to use \d\.png to match any digit instead of just [1-3].
The pattern
$IMG_MASK = '#.+' . preg_quote(DIRECTORY_SEPARATOR, '#') . '\d\.png$#';
...produces:
Array
(
[0] => ./images/test/4.png
[1] => ./images/test/6.png
[2] => ./images/test/5.png
[3] => ./images/3.png
[4] => ./images/1.png
[5] => ./images/2.png
)
How to use RecursiveDirectoryIterator with a Modified Date filter?
OK, you may try this example:
class FilesystemDateFilter extends RecursiveFilterIterator
{
    protected $earliest_date;

    public function __construct(RecursiveIterator $it, $earliest_date)
    {
        $this->earliest_date = $earliest_date;
        parent::__construct($it);
    }

    // Directories always pass; files must be modified on or after the cut-off
    public function accept(): bool
    {
        return ( ! $this->isFile() || $this->getMTime() >= $this->earliest_date );
    }

    // Hand the cut-off date down to the filter created for each sub-directory
    public function getChildren(): ?RecursiveFilterIterator
    {
        return new static($this->getInnerIterator()->getChildren(), $this->earliest_date);
    }
}
$directory = new RecursiveDirectoryIterator("c:\\www");
$filter = new FilesystemDateFilter($directory, strtotime('2012-12-31'));
foreach (new RecursiveIteratorIterator($filter) as $filename => $file) {
    echo $filename . PHP_EOL;
}
Note that getMTime() (http://php.net/manual/en/directoryiterator.getmtime.php) returns a Unix timestamp, so the date you compare it against must be a timestamp too; hence the strtotime() call.
What you were missing was overriding getChildren, which passes the parameter down to the children.