DirWalker – A Class for Recursively Traversing Directories in MODX

In a previous article, I promised to discuss writing code to produce reference pages based on the MODX codebase. I’ll do that in a later post, but first, I want to talk about a class that recursively traverses directories. I call it DirWalker, and it’s an adaptation of some code posted by boen dot robot on the php.net scandir manual page.

DirWalker is an essential part of any code that pulls information out of the MODX codebase. It has many other uses as well. Any time you want to process all the files in a directory and, optionally, its descendants, DirWalker will make your life much easier.

DirWalker is used extensively in the MyComponent extra, in Orphans, and in the generation of the MODX object Quick Reference and xtypes pages at Bob’s Guides.

Overview

It’s most efficient if you process each file as you find it, but I find that it’s often easier and almost as fast to create an associative array containing the path to each file and its filename, then process them after the fact. You can easily process the files as they are found with DirWalker by extending the class and overriding the processFile() method.

DirWalker walks through all of the files in the specified directory (and its descendants if the recursive argument is true). It creates an associative array containing key/value pairs where the key is the full path to the file (including the filename) and the value is just the filename. Various filters can be used to exclude directories, and to exclude or include files. You can even use a regex search for including or excluding files based on their filenames.
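To make the shape of that array concrete, here is a minimal sketch of a recursive walk built on scandir(). It is not DirWalker's actual code: the function name collectFiles() and its parameters are made up for illustration, and it only handles directory exclusion, not the include and exclude file filters.

```php
<?php
/**
 * Hypothetical helper (not part of DirWalker): collect the files under $dir
 * into an array keyed by full path, with the bare filename as the value.
 */
function collectFiles(string $dir, bool $recursive = false, array $excludeDirs = []): array
{
    $files = [];
    $dir = rtrim($dir, '/\\');
    $entries = scandir($dir);
    if ($entries === false) {
        return $files; // unreadable directory: nothing to collect
    }
    foreach ($entries as $entry) {
        if ($entry === '.' || $entry === '..') {
            continue; // skip the self and parent pseudo-entries
        }
        $fullPath = $dir . '/' . $entry;
        if (is_dir($fullPath)) {
            if ($recursive && !in_array($entry, $excludeDirs, true)) {
                // Full paths are unique, so the array union never drops a file
                $files += collectFiles($fullPath, true, $excludeDirs);
            }
            continue;
        }
        $files[$fullPath] = $entry;
    }
    return $files;
}
```

A call such as collectFiles(__DIR__, true, ['.git']) would return every file below the current script's directory except those inside directories named .git.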

The resetFiles() method empties the file list (in case you want to call dirWalk() more than once), and the getFiles() method returns the array of files so that code outside the class can use it.

There is more detailed information about DirWalker at Bob’s Guides. See the note below for information on downloading the code.

The Class Code

The class code got a little too long to post here. You can see it at GitHub.

The DirWalker class is now available through Package Manager. You can install it on your site and simply include the file in a snippet or plugin with this line:

include MODX_CORE_PATH . 'components/dirwalker/model/dirwalker/dirwalker.class.php';

DirWalker will run fine outside of MODX (though you’ll need the full path to it in the include statement). DirWalker is remarkably fast, but if your process takes long enough to bump up against PHP’s default 30-second execution time limit, you’ll want to run it from the command line.

Usage

Here’s a typical example that recursively traverses the MODX core directory and all its descendants. It collects all files that have .class in their filenames. It skips the cache, .git, and packages directories. It also excludes minified (-min) and aggregated (-all) files, as well as Git files. The code assumes that you have installed the DirWalker package through Package Manager.

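/* Load the DirWalker class installed by the DirWalker package */
include MODX_CORE_PATH . 'components/dirwalker/model/dirwalker/dirwalker.class.php';
$searchStart = MODX_CORE_PATH;
$output = '';
$dw = new DirWalker();
/* Only collect files with .class in their filenames */
$dw->setIncludes('.class');
/* Skip aggregated (-all), minified (-min), and Git files */
$dw->setExcludes('-all,-min,.git');
/* Don't descend into these directories */
$dw->setExcludeDirs('cache,.git,packages');
/* Second argument true = also walk all descendant directories */
$dw->dirWalk($searchStart, true);
$fileArray = $dw->getFiles();

foreach ($fileArray as $fullPath => $fileName) {
    /* process each file here */
    $output .= "\n<p>FILE: " . $fileName . ' -- PATH: ' . $fullPath . '</p>';
}
return $output;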

After the code above runs, all .class files in or below the MODX core directory will be in the $fileArray associative array and you can process them however you like. If you are producing a report, you’ll probably want to sort them in some way, and you’ll probably want to remove the first part of the full path for each file if the path is being reported.
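As a rough sketch of that post-processing, the snippet below sorts a file array by filename and trims a base path from each full path. The sample paths and the $basePath value are invented for illustration; in a real snippet the array would come from getFiles() and the base path would be whatever you passed to dirWalk().

```php
<?php
// Made-up sample data in the shape described above: full path => filename
$basePath = '/var/www/core/';
$fileArray = [
    '/var/www/core/model/modx/modx.class.php' => 'modx.class.php',
    '/var/www/core/xpdo/xpdo.class.php' => 'xpdo.class.php',
    '/var/www/core/model/modx/modchunk.class.php' => 'modchunk.class.php',
];

// Sort by filename (the value), case-insensitively, keeping each path
// attached to its filename
asort($fileArray, SORT_STRING | SORT_FLAG_CASE);

$report = [];
foreach ($fileArray as $fullPath => $fileName) {
    // Strip the base path only when the full path really starts with it
    $relative = strpos($fullPath, $basePath) === 0
        ? substr($fullPath, strlen($basePath))
        : $fullPath;
    $report[] = $fileName . ' -- ' . $relative;
}
```

Using asort() rather than sort() matters here: sort() would throw away the keys, and the keys are the paths.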

It’s tempting to swap the key and value in the processFile() method so that the filename is the key, which would make searching and sorting easier, but don’t do it. If you swap them, you’ll lose files that share a filename with a file in another directory: each file whose name is already in the array overwrites the existing entry.
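The overwrite problem is easy to demonstrate with plain arrays. This small sketch uses invented paths and does not touch DirWalker itself; it just builds the array both ways from the same three files.

```php
<?php
// Made-up paths: two different files share the filename index.php
$found = [
    ['/site/core/index.php', 'index.php'],
    ['/site/manager/index.php', 'index.php'],
    ['/site/core/config.php', 'config.php'],
];

$byPath = [];
$byName = [];
foreach ($found as list($fullPath, $fileName)) {
    // Path as key: every file keeps its own entry
    $byPath[$fullPath] = $fileName;
    // Filename as key: the second index.php replaces the first
    $byName[$fileName] = $fullPath;
}

$keptByPath = count($byPath);
$keptByName = count($byName);
```

With the path as the key, all three files survive. With the filename as the key, only two entries remain, and nothing tells you that a file was dropped.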

More on Using DirWalker

There is more information on DirWalker at Bob’s Guides. It is also available as a MODX extra that you can install with Package Manager, or it can be downloaded from the MODX Repository or from GitHub. The class itself does not require MODX.

In the next few articles, we’ll look at some working examples that show how to extend DirWalker to process the files as they are found and how to use DirWalker to extract information from the MODX codebase.

 


For more information on how to use MODX to create a web site, see my web site Bob’s Guides, or better yet, buy my book: MODX: The Official Guide.


Author Spotlight

Bob Ray

I am the author of MODX: The Official Guide and over 30 MODX add-on components. I host Bob's Guides, a source of valuable information for MODX users, and I've been very active in the MODX Forums with over 14,000 posts.
