[Advanced] Ideas for a universal PHP front controller class in the vein of PDO

Something I think that PHP badly needs in the next release or so (or at least is approaching very high on the want list) is a default centralized front controller. The existence of such a controller, or router, comes into play the moment you turn mod_rewrite rules on. Whether it’s the one found in Zend, Cake, Symfony, or what have you, they all do the same thing. I think a lot of programs could be sped up through the adoption of a common front controller that lives in C as part of the PHP engine rather than being a PHP script.

At a glance there would appear to be some overlap with the auto_prepend_file directive. However, the major difference is that the script that is ‘touched’ by the webserver may never get parsed at all, depending on what you decide to have your front controller logic do - a good example would be any page in the admin directory when authentication fails - you’ll want to redirect to the login page.
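
For reference, the auto_prepend_file directive this overlaps with looks like the following (the path is made up). The prepended script always runs AND the requested file is always parsed afterward - there is no way to short-circuit it, which is exactly what a front controller needs to do.

; php.ini (or php_value auto_prepend_file ... in .htaccess)
auto_prepend_file = /var/www/framework/bootstrap.php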

From a PHP standpoint this class starts as a per-directory ini directive. At its most basic it would look like this.


php_value front_control my/path/to/front/control.ini

By default this value in the .htaccess file loads a front controller ini for the directory, and all requests for PHP files and all requests for UNKNOWN files pass to the front controller. The front controller’s ini will look something like this


[core]
path = my/path/to/framework
router = RoutingClass ; This must extend from PUL\Route
autoloader = my/path/to/autoloader

The core path is where the PHP files are going to live. If not defined, it is set to the webroot. This allows a framework to place its PHP files outside of htdocs.
The core router is simply a class you wrote yourself that extends the core - sort of like extending the core of PHP itself.
The core autoloader should be obvious. If the autoloader key is given multiple times, multiple autoloaders and their order of precedence can be set. If no autoloader is defined, an internal autoloader will fire that follows the rules of PSR-0 (so if your project uses PSR-0 then your autoloading effectively happens in C).
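
For illustration, that internal fallback could behave like this minimal userland PSR-0 autoloader (my sketch; the base directory is assumed to be the [core] path from the ini):

<?php
// Minimal userland PSR-0 autoloader, roughly what the proposed
// internal (C-level) fallback would do when none is configured.
spl_autoload_register(function ($class) {
    $baseDir = '/my/path/to/framework/'; // assumed: the [core] path
    $class = ltrim($class, '\\');
    $file  = '';
    if (false !== ($pos = strrpos($class, '\\'))) {
        $file  = str_replace('\\', '/', substr($class, 0, $pos)) . '/';
        $class = substr($class, $pos + 1);
    }
    // PSR-0: underscores in the class name map to directory separators too.
    $file .= str_replace('_', '/', $class) . '.php';
    if (is_file($baseDir . $file)) {
        require $baseDir . $file;
    }
});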

At the heart of this is PUL - PHP Universal Loader. This class behaves in one VERY different way from any other PHP class you’ve written - it never really dies. Objects and data attached to it persist from page request to page request, though session data for individual users will change. This allows you to create a framework with a persistent heart to cut down on load times. Or at least, that is my conceptual idea - it may be a bad one (and that’s what this thread is for - for experts to point out what is good and bad about all of this).

The pseudocode map of the class is this


<?php
/*
 * PHP Universal Loader
 */
class PUL{
  /*
   * First off, a copy of the super globals that is read only.
   */
  private $post;
  private $get;
  private $cookie;
  private $files;
  private $server;

  /*
   * This class starts the session automatically, and the session hive is
   * attached. By default this is read only, but your extending classes
   * can modify how they work with the session.
   */
  protected $session;

  /*
   * The HTTP headers - this information currently lives in $_SERVER with
   * an HTTP_ prefix on the key.  This is misleading because unlike the
   * other server values these are user provided and therefore forgeable.
   * For example, $_SERVER['HTTP_REFERER'] is frequently forged.
   */
  private $headers;

  /*
   * The config ini, after parsing, ends up here as read only by the child classes.
   */
  private $config;

  /*
   * Buffered output of the resolved page, set by main() and sent
   * by sendResponse().
   */
  protected $output;

  /*
   * Like PDO_Statement, the constructor of PUL is protected - it can only
   * be invoked by the PHP Engine itself which does so the first time a
   * page is requested.
   *
   * This is where you'd put your startup stuff that you don't want running
   * on every page load.  The core construct function starts the autoloaders
   * and parses the configuration INI which can be set in either the htaccess
   * or the PHP ini file.
   */
  protected function __construct( $config ) {

  }

  /*
   * If the webserver encountered a PHP file, it hands it to this class and
   * it gets resolved here, in the context of this function, and within an
   * output buffer.
   */
  protected function resolve( $phpfile = null ) {
    ob_start();
    if (!is_null($phpfile)) {
      require($phpfile);
    }
    // ob_get_clean() returns the buffer contents; ob_end_clean() would
    // discard them and return a bool.
    return ob_get_clean();
  }

  /*
   * By overriding main you can setup your own routing system.
   */
  protected function main( $phpfile = null ) {
    $this->output = $this->resolve($phpfile);
  }

  /*
   * Sends a cleartext or compressed response as determined
   * by server settings.
   */
  protected function sendResponse() {}

  /*
   * Actions to clean the class and prepare to handle the next request
   * go here.
   */
  protected function cleanup() {}
}

The major change that PUL introduces is that you don’t have to have PHP files in your htdocs directory at all, but can instead use it as a cache if you like. When you do write PHP files in there you can bring up a limited version of the framework to handle them, or the full framework if you like. The PHP files you do set in the htdocs directory are effectively wrapped by PUL and execute in its scope. This would allow something like this…


<?php
class MyCore extends PUL {
  /*
   * We'll attach the database object here.
   */
  protected $db;

  /*
   * We'll start the database now.
   */
  protected function __construct($config) {
    parent::__construct($config);
    $this->db = new MyDatabase( $this->config['database'] );
  }
}

Then, our index.php file over in htdocs can get right off to the races…


<!doctype html>
<html>
  <title><?= $this->db->queryFirst("SELECT title FROM pages WHERE page = 1"); ?></title>

Now, keep in mind the above is meant to be illustrative of what this would allow. Like many powerful things, there are good and bad ways to use it, and I’ll admit that quick example isn’t the best one.

Giving a good example would involve poring over a lot of files, the scope of which is largely outside what I’d like to discuss here - a universal load and setup system within the PHP Standard Library, which would be as flexible and powerful as PDO is for database access, but no more mandatory.

Thoughts?

… - it never really dies. Objects and data attached to it persist from page request to page request, though session data for individual users will change.

Sounds like you want PHP to behave like a Java Servlet. What can I say - switch to Java. I don’t think PHP will have this feature.

Definitely not helpful. You created an account specifically to make that post?

Maybe. Maybe I’ll make more posts some day.

But your idea sounds interesting. How do you see using this object? When will it load into memory - on server restart, or the first time it’s used?
How do you plan on making modifications to your custom subclass of this class? Will you need to restart the server after you change something in your class?

All of those questions I’m undecided on. For the moment I just noted that the task of locating the local controller doesn’t vary much between my own framework, Zend, Symfony or Cake. There is a common task here that would be better served at the engine level, a task created by the use of mod_rewrite at the webserver level. The moment such a rewrite is introduced the framework has to begin mediating requests PHP wasn’t originally tasked with.

While MVC architecture is applied to many PHP projects, PHP itself doesn’t have an MVC friendly mode. It is still largely a procedural language that happens to have support for objects. This loader class’ main job is to provide the engine and the coder with an object mode without dictating the exact structure that the rest of the framework will take.

What’s the point of having $post, $get, $cookie member variables in the PUL object if it’s shared between requests? I mean, $cookie for which request? $post for which request?
In order for this sort of thing to even work your PUL object has to create a new thread for each request - basically a copy of the object will have to serve each request, and each copy will have unique $post, $get, $cookie objects.
Also, even with a new thread per request, multiple requests will end up with access to the same member objects, which right away will introduce concurrency problems because PHP does not have thread-safe versions of its classes or arrays.

And how will your PUL object behave in case a member object like $db in your example throws an exception or hits a fatal error? Your whole PUL object may crash, and that would be equal to a server crash because no more requests could be served.
At least with the way PHP currently works a crashed PHP file will not bring down the whole server, but with your PUL object it may.

Interesting concept but how are you planning to make it “never really die”? If that is the case, the superglobals shouldn’t be attached to it because you want those (with the exception of cookies) to expire at the end of each request.

I wouldn’t force people to make their router extend from a base class either; that severely limits flexibility. Especially if your goal is to be unobtrusive.

For me, it depends how you’re handling persistence. For each user who’s on the site, will you have a set of all the framework-level resources (e.g. database connection) loaded in memory and constantly running? Will each of the users use a different instance of this resource or the same one? How will you handle state changes within that object if everyone is using the same one? (consider PDO::lastInsertId as an example)
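
To make that hazard concrete, here is a sketch of one shared PDO instance serving two “requests” (simulated sequentially here, since vanilla PHP has no threads; SQLite in-memory is used purely for illustration):

<?php
// Shared-state hazard: one PDO connection, two interleaved requests.
$db = new PDO('sqlite::memory:');
$db->exec('CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)');

// Request A inserts a row...
$db->exec("INSERT INTO orders (item) VALUES ('book')");

// ...but before A reads lastInsertId, request B inserts as well.
$db->exec("INSERT INTO orders (item) VALUES ('lamp')");

// Request A now sees B's id (2) rather than its own (1).
echo $db->lastInsertId();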

It wouldn’t be. I misspoke while jotting this down, I think.

By ‘never really die’ I want the object to basically remember its state once construct is done. When the PHP engine gets a new request, it clones the object from that ready state instead of building a new one, and sends the clone off on its way to serve that one request. Each clone normally closes with its connection, but if a persistent connection (websockets) is requested, it would stay around. Another possibility is the clone sits around for the life of the session. If a new request against that session ID is made, that clone is woken up instead of a new one being created. In this event it needs to do some connection close cleanup, and perhaps some final session close actions.
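
In userland terms the clone-from-ready-state idea would look roughly like this (my approximation; the real mechanism would live in the engine, and the class and paths are made up):

<?php
// Userland approximation: the engine keeps one fully-constructed
// prototype and hands each new request a clone of it instead of
// rebuilding from scratch.
class RequestHandler {
    private $config;

    public function __construct() {
        // Expensive one-time startup: parse config, warm caches, etc.
        // (Path is illustrative.)
        $this->config = parse_ini_file('/etc/myapp/app.ini', true);
    }

    public function __clone() {
        // Per-request reset goes here: anything request-specific
        // starts fresh while the shared startup state is retained.
    }

    public function serve($uri) {
        return "Handled $uri\n";
    }
}

$prototype  = new RequestHandler();   // built once, stays resident
$perRequest = clone $prototype;       // cheap copy made per request
echo $perRequest->serve('/products/42');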

For end programmer simplicity I’d like this layer of complication to be hidden as much as possible without creating new headaches. I’m also trying to come up with a means to more effectively cache the universal loads and startup sequences all frameworks go through.

I think you’ve highlighted the real issue here: Most framework entry point routines are ridiculously weighty. Simplify those and you don’t need to cache them. All you really need to do is:

- Initialise router
- Find the route
- Create object graph for the module in use
- Dispatch controller

And only the first of those is identical each time and would be a candidate for caching. Everything else should be created on the fly as it’s required. Really, the answer is for frameworks to have minimalist bootstraps.
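
To make that concrete, a bootstrap following those four steps can be as small as this sketch (route table, controller names and layout are all made up):

<?php
// Minimalist front-controller bootstrap. Every name is illustrative.
require __DIR__ . '/autoload.php';

// 1. Initialise router (here, just a pattern => handler table).
$routes = array(
    '#^/$#'               => array('HomeController',    'index'),
    '#^/products/(\d+)$#' => array('ProductController', 'show'),
);

// 2. Find the route.
$uri = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);
foreach ($routes as $pattern => $target) {
    if (preg_match($pattern, $uri, $matches)) {
        // 3. Create the object graph for the module in use.
        list($class, $method) = $target;
        $controller = new $class();
        // 4. Dispatch the controller.
        echo $controller->$method(array_slice($matches, 1));
        return;
    }
}

header('HTTP/1.1 404 Not Found');
echo '404 Not Found';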

Back on topic, the problem with the clone method is that it’s likely slower than the traditional method because you need to read both the class definition from the file and the state from the cache. In some cases, this may be beneficial because it will allow you to retain state. The problem is, how are you mapping state to users? And what kind of state are you planning to store? Everything?

Let’s say you have a database query result. Someone searched the database and was presented with a list of products. Now, it may be useful to keep that list in memory; the user might want to sort them by price, relevance or name… so the user clicks “sort”. Presumably your front controller transparently fetches this list from the cache, mutates the data and returns the sorted data to the user. Without PUL, my controller code simply calls $this->model->setSort(PRODUCT::SORT_NAME); the view then calls $model->getProducts(); and the list is returned to the user. Whether those records come from the database or the cache, I can’t quite see how maintaining a state is beneficial, because the state should be encapsulated within the model.
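
For what it’s worth, the encapsulation I mean looks like this sketch (the class, constants and table are hypothetical):

<?php
// Hypothetical model that owns its own sort state and data access,
// so nothing needs to persist at the engine level to re-sort.
class ProductModel {
    const SORT_NAME  = 'name';
    const SORT_PRICE = 'price';

    private $db;
    private $sort = self::SORT_NAME;

    public function __construct(PDO $db) {
        $this->db = $db;
    }

    public function setSort($sort) {
        // Whitelist the column to keep the ORDER BY safe.
        if (in_array($sort, array(self::SORT_NAME, self::SORT_PRICE), true)) {
            $this->sort = $sort;
        }
    }

    public function getProducts() {
        // Whether this hits the database or a cache is the model's
        // business; the controller and view never know the difference.
        $stmt = $this->db->query(
            'SELECT name, price FROM products ORDER BY ' . $this->sort
        );
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }
}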

Agreed on all points above. Still, I think a generic, “init, find route, create object, dispatch” would be well worth migrating to the engine, if only to encourage a common baseline for the process as PDO provides a common baseline for database interactivity.

It’s an ambitious goal, but I think the generic process you describe may be an oversimplification. Often times, before you can find a route, the framework will first need to load modules/plugins/bundles, and every framework does that differently. Routes are often defined in config files. What format and what parameters should those config files have? Where should they live? After the config files are parsed, should the result be cached? How? Where? Should there be a security authentication and authorization step before dispatching the controller? That opens a whole other bag of worms.

I think there may be too many framework-specific details to nail down a generic process.

Though, for what it’s worth, Symfony’s HttpKernel component sounds reasonably close to what you’re describing.

Info on Symfony’s HttpKernel for anyone interested.
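
For a quick flavour of it, the core HttpKernel cycle looks roughly like this (routing via the event dispatcher is omitted, and the constructor arguments vary a little between Symfony versions):

<?php
// The essential Symfony HttpKernel request/response cycle.
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpKernel\HttpKernel;
use Symfony\Component\HttpKernel\Controller\ControllerResolver;
use Symfony\Component\EventDispatcher\EventDispatcher;

$request = Request::createFromGlobals();

$kernel   = new HttpKernel(new EventDispatcher(), new ControllerResolver());
$response = $kernel->handle($request);   // URI in, Response object out

$response->send();
$kernel->terminate($request, $response);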

I’m not sure I see any benefit in this. In my opinion, your web server (IIS or Apache) IS your front controller. Unless you plan on using something like .NET’s Application scope (which can manage the lifetime of an object and reuse it over the course of an entire application, all sessions, and all requests, which is something PHP doesn’t do), then this idea doesn’t really add anything. Unless I’ve missed something.

mod_rewrite

Obviously. I consider that part of apache though. What I meant was, I don’t see any need for a PHP based script on top of that, which does the same thing. Apache + mod_rewrite and htaccess is already a fully functioning front controller.

Michael, I think there’s some merit in this idea but you need a much clearer, more focussed mission statement.

Essentially what you’re trying to achieve is similar to what Apache does (maps a URI to a local file), but you want to ignore the concept of “files” entirely and go straight for URI => Class->method();

This makes sense, consider the current chain:

- Apache receives the URI
- mod_rewrite rewrites it to a central entry point
- The PHP script analyses the URI for routing (for the third time! Apache has already done this once, and mod_rewrite has done it again)

You want to eliminate the redundancy and simply give PHP the URI to begin with. You could skip Apache entirely and use PHP’s inbuilt web server to do this. You would, however, lose all the power Apache offers and have to add code to handle serving of non-PHP files. I’m not sure you’ll gain much by offering an Apache+PHP solution because all of the redundancy has happened by the time the first PHP code is even executed.
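
For context, the central entry point in that chain is normally created with a catch-all rule like this standard pattern:

# .htaccess - send everything that isn't a real file or directory
# to a single front controller script.
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [QSA,L]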

My terse post is an allusion to this - mod_rewrite pretty much bypasses Apache’s role as a front controller. With it in place PHP must deal with situations like 404 that, prior to mod_rewrite’s deployment, it wasn’t expected to handle. What I’m suggesting is a general structure to use in this environment - not a replacement to Apache outright.

Hmm… I’ve been reading/following this, unsure if I fully grasped the concept you were trying to unload or not (still out on that one), but I think the statement “PHP must deal with situations like 404” is invalid.

The only reason PHP must deal with a 404 is because you failed to use a proper expression in your mod_rewrite and sent too much to PHP to begin with. I believe @dklynn would agree with me on that. You should really only be forwarding what you intend PHP to handle; by passing it more than you expect it to handle, you have set yourself up for that situation. It is by no means the fault of anything else.

I’m not sure I agree with that statement. As a trivial example, let’s say I have a simple site that pulls some HTML into a generic layout. I use /foo.html and it displays the contents of foo.html within an overall layout. I don’t want mod_rewrite to be concerned with whether or not foo.html exists, because it may in the future. It’s up to the PHP script to decide how to handle files which don’t exist in this case. Otherwise you create extra work for yourself because you have to make two changes every time you add a page. Having to do this creates problems because it can get out of sync. KISS and all that!

Even if you were very careful to mod_rewrite only URLs that PHP is expected to handle, the PHP would still have to deal with 404s, because some URLs such as /blog/{slug} would require a database lookup before we could know whether it’s a 404. And if our PHP is already required to handle 404s, then we don’t gain anything by mod_rewriting only URLs that PHP is expected to handle.
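
That case looks something like this inside the front controller ($db is assumed to be a PDO connection; the table and column names are made up):

<?php
// Only after a database lookup do we know whether /blog/{slug}
// exists; mod_rewrite alone can never answer that.
$slug = basename(parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH));

$stmt = $db->prepare('SELECT title, body FROM posts WHERE slug = ?');
$stmt->execute(array($slug));
$post = $stmt->fetch(PDO::FETCH_ASSOC);

if ($post === false) {
    header('HTTP/1.1 404 Not Found');
    echo 'Post not found';
    exit;
}

echo '<h1>' . htmlspecialchars($post['title']) . '</h1>' . $post['body'];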