A crazy idea for a (mostly) universal autoloader

I say (mostly) because I’m sure someone can come up with a scheme to break this, but it should be flexible enough to cover 80% of cases, including PSR-0, and if you can come up with a way to store classes this can’t handle you can write your own autoloader.

Well, if this one is writable…

This at the moment is a thought exercise.

The factory of the autoloader is given the directory or directories to search for files. Using glob() it slurps in all files matching a pattern, default *.php (though *.class.php or *.inc will likely be popular).
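A rough sketch of that scanning step, under a few assumptions: the function name collectFiles() is made up for illustration, the scan recurses into subdirectories (the post does not say whether it should), and hidden directories are skipped because a plain * pattern does not match them.

```php
<?php
/**
 * Hypothetical helper: collect every file matching $pattern under the
 * given directories, descending into subdirectories.
 */
function collectFiles(array $dirs, string $pattern = '*.php'): array
{
    $files = [];
    foreach ($dirs as $dir) {
        $dir = rtrim($dir, '/\\');
        // glob() can return false on error, so fall back to an empty list.
        foreach (glob($dir . '/' . $pattern) ?: [] as $file) {
            $files[] = $file;
        }
        // GLOB_ONLYDIR limits the result to directories, which we recurse into.
        foreach (glob($dir . '/*', GLOB_ONLYDIR) ?: [] as $sub) {
            $files = array_merge($files, collectFiles([$sub], $pattern));
        }
    }
    return $files;
}
```

Calling collectFiles(['/path/to/src'], '*.class.php') would switch to one of the other naming conventions mentioned above.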

Then, one file at a time, it uses regex to find the namespace declaration of the file, if any. If there are multiple namespaces (the programmer used the less often invoked namespace { } construction) the file is divided into its component namespaces. Again, using regex, we search for the pattern /class $ /. Each class is appended to the namespace it is found in and added to an array.
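To make the idea concrete, here is a minimal sketch of the regex pass over one file's contents. The function name findClassesInSource() and the exact pattern are my own assumptions, not the pattern from the post. Being regex-based, it only looks at declarations that start a line, and it can still be fooled by the word "class" at the start of a line inside a string or comment.

```php
<?php
/**
 * Hypothetical helper: return the fully qualified class names declared
 * in $source, tracking the current namespace as declarations are met.
 */
function findClassesInSource(string $source): array
{
    $pattern = '/^\s*(?:namespace\s+([A-Za-z_][\w\\\\]*)\s*[;{]'
             . '|namespace\s*\{'
             . '|(?:abstract\s+|final\s+|readonly\s+)*class\s+([A-Za-z_]\w*))/mi';
    preg_match_all($pattern, $source, $matches, PREG_SET_ORDER);

    $namespace = '';
    $classes = [];
    foreach ($matches as $m) {
        if (isset($m[2]) && $m[2] !== '') {
            // A class declaration: prefix it with the namespace in effect.
            $classes[] = ltrim($namespace . '\\' . $m[2], '\\');
        } elseif (isset($m[1]) && $m[1] !== '') {
            // "namespace Foo;" or "namespace Foo {"
            $namespace = $m[1];
        } else {
            // Bare "namespace {" block, i.e. the global namespace.
            $namespace = '';
        }
    }
    return $classes;
}
```

A tokenizer-based scan would be more robust than regex, at the cost of more code.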

If duplicate classes are found the autoloader fails with an error (this is one of the cases in the 20% I’m not touching) telling you which two files have identical classes.
? Or would it be better to continue but to refuse to autoload either class ?
? Or proceed, first class found only ?
In all cases, an error E_USER_WARNING at least should be raised, if not an exception thrown or fatal error.
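As a sketch of the second option (continue, but refuse to autoload either class), something like the following could work. The function name buildClassMap() and the shape of its input and return value are assumptions for illustration. Keys are lowercased here so that Foo and foo count as the same class, which is how PHP itself treats them.

```php
<?php
/**
 * Hypothetical helper: $classesByFile maps a file path to the class
 * names found in it. Conflicting classes are left out of the map and
 * reported separately with every file that declares them.
 */
function buildClassMap(array $classesByFile): array
{
    $map = [];
    $conflicts = [];
    foreach ($classesByFile as $file => $classes) {
        foreach ($classes as $class) {
            $key = strtolower($class); // PHP class names are case-insensitive
            if (isset($conflicts[$key])) {
                // Third or later declaration of an already conflicting class.
                $conflicts[$key][] = $file;
            } elseif (isset($map[$key])) {
                // Second declaration: record both files, drop the map entry.
                $conflicts[$key] = [$map[$key], $file];
                unset($map[$key]);
            } else {
                $map[$key] = $file;
            }
        }
    }
    return ['map' => $map, 'conflicts' => $conflicts];
}
```

The caller can then loop over the conflicts and call trigger_error() with E_USER_WARNING, or throw, depending on which policy wins out.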

Once all the classes are found, the array is passed to the actual autoloader. This object needs to be cacheable to be effective, as no sane live system would be responsive with that much file scanning occurring on each page load.
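One common way to cache such a map is to write it out as a PHP file that returns the array, then require it on later requests. This is only a sketch: saveClassMap() and loadClassMap() are hypothetical names, and the post does not say which caching mechanism is intended.

```php
<?php
/** Hypothetical helper: write the class map to a PHP file that returns it. */
function saveClassMap(string $cacheFile, array $map): void
{
    $code = '<?php return ' . var_export($map, true) . ';' . PHP_EOL;
    file_put_contents($cacheFile, $code, LOCK_EX);
}

/** Hypothetical helper: load the cached map, or null if there is none. */
function loadClassMap(string $cacheFile): ?array
{
    if (!is_file($cacheFile)) {
        return null; // caller should rescan and save
    }
    $map = require $cacheFile;
    return is_array($map) ? $map : null;
}
```

On a cache miss the factory would run the full scan once, call saveClassMap(), and hand the array to the autoloader.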

Note that in the keys the class / namespace names will be stored as typed. When the autoloader compares it will make a case sensitive check first, but since PHP doesn’t require class identifiers and namespaces to be case sensitive it will make a second case insensitive check. If it finds the class this way it will throw an E_STRICT.
? would it be better to forgo doing that and just store the class / namespace paths under strtolower() and make all the checks case insensitive?
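The two-pass lookup could be sketched as below. The names resolveClassFile() and registerClassMap() are invented for illustration. One caveat: trigger_error() only accepts the E_USER_* levels, so user code cannot raise E_STRICT itself; the sketch uses E_USER_NOTICE as a stand-in, which is my assumption rather than anything from the post.

```php
<?php
/**
 * Hypothetical lookup: $map is keyed by class name exactly as declared.
 * Returns [file, wasExactMatch], or null when the class is unknown.
 */
function resolveClassFile(array $map, string $class): ?array
{
    $class = ltrim($class, '\\');
    if (isset($map[$class])) {
        return [$map[$class], true];
    }
    // Second pass: compare lowercased names, since PHP treats them as equal.
    $lowered = array_change_key_case($map, CASE_LOWER);
    $key = strtolower($class);
    return isset($lowered[$key]) ? [$lowered[$key], false] : null;
}

/** Hypothetical registration helper built on the lookup above. */
function registerClassMap(array $map): void
{
    spl_autoload_register(function (string $class) use ($map): void {
        $hit = resolveClassFile($map, $class);
        if ($hit === null) {
            return; // let any other registered autoloader try
        }
        [$file, $exact] = $hit;
        if (!$exact) {
            trigger_error("Class $class requested with non-matching case", E_USER_NOTICE);
        }
        require $file;
    });
}
```

Storing everything under strtolower() up front would reduce the lookup to a single isset(), at the price of losing the "you typed it differently" notice.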

I have a working autoloader that mostly works this way - using a file system scan to find class files - but it doesn’t analyze file contents and depends on the convention that namespace significant directories will have a .ns extension. The system above however is foolproof enough to deal with the poorly thought out PSR-0 standard and also load most class hives that don’t follow the standard.

(The reason I call out that standard as poorly thought out is there is no way to determine, from the file’s position in the file structure, what the class name is going to be - yet changing a file’s position in that structure forces you to edit the class’ name. It’s a double whammy of code inflexibility for no real gain).

That sounds like a classmap autoloader.

For example, http://framework.zend.com/manual/2.0/en/modules/zend.loader.class-map-autoloader.html

EDIT: It looks like Composer can generate the classmap for you.

The class name and the file name always match.

It would force you to edit the class’s namespace. In short, Namespace\ClassName === filepath/filename.php.

I know of a couple people who find this kind of system restrictive. Though, personally, even if the autoloader didn’t require this kind of correspondence, I would do it anyway. If classes can be logically grouped in a namespace, then why wouldn’t I also logically group them in a directory? If I recall correctly, even PEAR worked this way, albeit with underscores as pseudo-namespace separators.

To me a true match must be symmetrical. If A = B then B = A, for all values A and B. This isn’t true with PSR-0, and further there’s the issue of case sensitivity in file systems but not in the language itself. This is why I level the accusation that it’s an ill-thought-out standard.

First, most file systems are case sensitive. PHP class identifiers are not. Therefore to PHP \myNamespace\myClass and \MYNAMESPACE\MYCLASS are the same thing. On Unix systems /myNamespace/myClass.php and /MYNAMESPACE/MYCLASS.php are not the same files.

That said, PHP’s penchant for allowing such nonsense is a design flaw that can’t be corrected due to massive BC breaks anyway, and the argument can be made that using multiple cap styles in code for the same class is bad practice because, well, it is. But that isn’t the largest problem.

The largest problem is the standard doesn’t create a one to one mapping of file structure to class names. Given file structure

/vendor/package/array/model/class

Are we talking about

\vendor\package\array\model\class
\vendor\package\array\model_class
\vendor\package\array_model_class

Or even (unlikely) \vendor\package_array_model_class

In PSR-0 it is impossible to determine the class name for a given file path. But you still have to modify a class’ name if you move it in the file structure because the file path is significant to the class name and the PSR-0 compliant autoloader’s ability to find it.
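The ambiguity is easy to show with a small sketch of the PSR-0 class-to-path rule (namespace separators become directory separators, and underscores in the class name part do too). The function name psr0Path() is made up, and it uses "/" as the separator for readability instead of DIRECTORY_SEPARATOR. The names are only used as strings here, since array and class could not be declared as real identifiers.

```php
<?php
/** Hypothetical helper: map a class name to its PSR-0 relative path. */
function psr0Path(string $class): string
{
    $class = ltrim($class, '\\');
    $path = '';
    $pos = strrpos($class, '\\');
    if ($pos !== false) {
        // Namespace part: separators become directories, underscores are kept.
        $path = str_replace('\\', '/', substr($class, 0, $pos)) . '/';
        $class = substr($class, $pos + 1);
    }
    // Class name part: underscores become directories as well.
    return $path . str_replace('_', '/', $class) . '.php';
}

$candidates = [
    '\\vendor\\package\\array\\model\\class',
    '\\vendor\\package\\array\\model_class',
    '\\vendor\\package\\array_model_class',
    '\\vendor\\package_array_model_class',
];
// All four names collapse onto the same file path.
$paths = array_unique(array_map('psr0Path', $candidates));
```

So the mapping works from class name to path, but cannot be inverted from path to class name.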

It would force you to edit the class’s namespace. In short, Namespace\ClassName === filepath/filename.php.

No. I thought it was that way too until I read it more closely while trying to implement it. There’s a huge issue of underscores in the class names as outlined above.

I know of a couple people who find this kind of system restrictive. Though, personally, even if the autoloader didn’t require this kind of correspondence, I would do it anyway. If classes can be logically grouped in a namespace, then why wouldn’t I also logically group them in a directory? If I recall correctly, even PEAR worked this way, albeit with underscores as pseudo-namespace separators.

True. That said, PSR-0 has you go to the trouble of implementing such a structure, then fails to reward you for it by making it impossible to determine what a class’ name is from looking at its position in the file hierarchy.

Not sure what you’re referring to here. The underscores?

Technically this issue persists all the way back to PEAR standards. But in practice, it’s never really been a problem. Probably because developers adhere to coding standards, which includes casing (e.g., StudlyCaps).

I have to grant you this one. The underscore behavior was included for backward compatibility, so that classes such as Zend_Email_Smtp could still work. Though, mixing real namespaces with pseudo-namespaces is something I’ve only ever seen in hypothetical criticisms. I’ve never seen or heard of this causing any problems in practice. Nonetheless, a new PSR autoloader is being drafted that has dropped support for PEAR-style pseudo-namespaces.

Why would you want to? Autoloaders start with a class name then try to find its file. Not the other way around. PSR-0 never intended nor claimed to provide this “feature.”

Well… yeah. The gist of PSR-0 is namespace == file path. It’s popular because it’s simple, and because it’s what we would do anyway. If a class logically belongs in a namespace, then it logically belongs in a directory of the same name.

Not true. There’s more than a few autoloaders out there that scan the directory the classes should be found in, build an array map, then cache that map for later use so that each class load request doesn’t require a disk scan.

PSR-0 never intended nor claimed to provide this “feature.”

It is a legitimate approach to the problem beyond the scope of the short-sighted PSR-0 standard. As such, it’s a bad idea and a bad standard. I hope it doesn’t infect the core, although if it does it certainly won’t be PHP’s worst mistake (http://php.net/manual/en/security.magicquotes.php) ever (http://php.net/manual/en/features.safe-mode.php).

Those are classmap autoloaders… which PSR-0 isn’t. It seems as though you’re criticizing a non-classmap autoloader for not being a classmap autoloader.

  1. Your spec seems to say that a given class can only exist in one location in the searched directory structure? If that is the case then that puts me in the 20% group right away. I’ll often have several layers where classes in the upper layer will override the classes in the lower layer. I control the order in which layers are searched by the autoloader so it works fine.

  2. Your spec calls for the use of glob to scan files. As a long-time user of Solaris I can assure you that glob is unreliable on many non-GNU Unix systems. Your fix for the case sensitivity issue seems to introduce yet another file system issue. Again, 20% for me.

  3. Your system seems intrusive for regular development in that any change to a class mapping will require regenerating the autoload map. You would really have to produce an incredibly efficient cross-platform scanner that won’t hinder the developer.

  4. The complexity of your regex processing would make it a show stopper for me. Contrast it with the 10 lines or so of code needed to make a minimal psr-0 based autoload implementation. Feel free to prove me wrong on this.

  5. Your spec clearly focuses on PHP classes, which is fine. But PSR-0 can be applied to other resources such as template files. You can get carried away with trying to take it too far, but it actually works quite nicely.

  6. Many PSR-0 based autoloaders currently exist and are very widely used. In fact I would estimate that their applicability rate is close to 95%. Someone would really have to produce some code based on your spec before its advantages will be fully appreciated.
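For reference, points 1 and 4 can be combined in a short sketch: a namespace-to-directory lookup that searches several base directories in priority order, so a class in an upper layer shadows the same class in a lower one. The names findInLayers() and registerLayers() are hypothetical, and the PEAR-style underscore rule is deliberately left out to keep it small.

```php
<?php
/**
 * Hypothetical layered lookup: $layers lists base directories, highest
 * priority first. Returns the first existing file for $class, or null.
 */
function findInLayers(array $layers, string $class): ?string
{
    $relative = str_replace('\\', DIRECTORY_SEPARATOR, ltrim($class, '\\')) . '.php';
    foreach ($layers as $dir) {
        $file = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $relative;
        if (is_file($file)) {
            return $file; // first layer that has the file wins
        }
    }
    return null;
}

/** Hypothetical registration helper built on the lookup above. */
function registerLayers(array $layers): void
{
    spl_autoload_register(function (string $class) use ($layers): void {
        $file = findInLayers($layers, $class);
        if ($file !== null) {
            require $file;
        }
    });
}
```

Note that this costs one file system check per layer on every class load, which is the trade-off a cached class map is meant to avoid.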