It's me again, that guy that never stops whining about global variables... :-)
I believe that all of these objects can be seen as 'layers' or 'tiers' (?): a database abstraction layer, user interface, authorization layer and configuration.
It's exactly the opposite: when using global variables, you LOSE the layers. Say you think you have a system of 6 layers, and you put the database object $DB in the first and use it in the sixth (as 'global $DB'). That sixth layer then becomes, by definition, the second layer: in an n-layered system, layer n directly depends on layer n-1, but not on any layer lower than n-1. That's the whole point of layering. When your sixth layer becomes in effect the second layer, the third, fourth and fifth layers no longer exist. And gone is your layered code. Thank you global variables!
On the other hand: how many layers does a typical PHP application have? That depends on how you code it, of course, but most of the code I get to see (in forums and such) has at most two. In that case you can argue that using global variables isn't that bad, because there are no layers to mess up anyway... My answer to that: if an application uses only 2 layers, it has a bad design.
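As a rough illustration of "layer n depends only on layer n-1", here is a minimal sketch with three layers. All class names (ArticleStore, ArticleCatalog, ArticlePage) are hypothetical and made up for this example, and the "database" is just an in-memory array. Each layer receives the layer directly below it through its constructor, so the top layer cannot reach the storage without going through the middle one.

```php
<?php
// Layer 1: storage. A hypothetical stand-in for a real database object.
class ArticleStore
{
    private $rows;

    public function __construct(array $rows) { $this->rows = $rows; }

    // Returns all rows whose 'category' equals $categoryId.
    public function findByCategory($categoryId)
    {
        $found = array();
        foreach ($this->rows as $row)
        {
            if ($row['category'] === $categoryId)
            {
                $found[] = $row;
            }
        }
        return $found;
    }
}

// Layer 2: knows only layer 1.
class ArticleCatalog
{
    private $store;

    public function __construct(ArticleStore $store) { $this->store = $store; }

    // Returns just the article names for a category.
    public function namesIn($categoryId)
    {
        $names = array();
        foreach ($this->store->findByCategory($categoryId) as $row)
        {
            $names[] = $row['name'];
        }
        return $names;
    }
}

// Layer 3: knows only layer 2, never the store itself.
class ArticlePage
{
    private $catalog;

    public function __construct(ArticleCatalog $catalog) { $this->catalog = $catalog; }

    public function render($categoryId)
    {
        return implode(', ', $this->catalog->namesIn($categoryId));
    }
}

$store = new ArticleStore(array(
    array('id' => 1, 'name' => 'PHP basics', 'category' => 12),
    array('id' => 2, 'name' => 'Cooking', 'category' => 7),
    array('id' => 3, 'name' => 'Iterators', 'category' => 12),
));
$page = new ArticlePage(new ArticleCatalog($store));
print $page->render(12) . "\n"; // prints "PHP basics, Iterators"
```

If ArticlePage did 'global $store' instead, it would depend on layer 1 directly and ArticleCatalog could be bypassed at will, which is the collapse described above.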
Let's examine the typical PHP code for printing a list of articles (or books, or whatever):
PHP Code:
function printList($category_id)
{
    // The database object is pulled in from the global scope.
    global $DB;
    $DB->query("select id, name from article where category = $category_id");
    while ($row = $DB->fetchRow())
    {
        print "<a href=\"article.php?id={$row['id']}\">{$row['name']}</a><br>\n";
    }
}
Code like this is very common. Although a very short function, it does many things:
- It executes an SQL query
- It traverses all rows in a query result
- It prints each row in some way
When you think about this for a while, you'll see that there are forced dependencies in this code that needn't be there. You can see these dependencies by looking at the code in 'reverse order':
- Information on a single article is printed. Where does this information come from? It currently comes from a set of rows, and there's no way to change that.
- A set of rows is traversed. Where do these rows come from? They currently come from an executed query, and so it will be forever.
- A query is executed on a database connection. Where does that connection come from? It is currently imported in the function by using 'global', making this function directly dependent on the layer the database object was declared in.
I'm not saying this is all wrong, or that it should be any different. For the sake of the argument, I'm only using a very simple example. Anyway, it's clear that this code, although doing much, has only one layer. And that has its consequences. What if the articles should be printed differently, or the query must be changed, or a different database connection should be used, or a whole different datasource (a flat file)? The only way to change those things now is by updating ALL of the code. Because it's all in the same layer. (Note again: it's a very simple example, hopefully you get the idea.)
Another way of implementing the above example is like this:
PHP Code:
// Objects are passed by handle in PHP 5 and later, so no '&' is needed.
function selectList($database, $category_id)
{
    // The only function here that knows about the database connection.
    return $database->execute("select id, name from article where category = $category_id");
}

function printList($it)
{
    // Works on any iterator; it never sees where the rows come from.
    for ($it->reset(); $it->isValid(); $it->next())
    {
        $row = $it->getCurrent();
        print "<a href=\"article.php?id={$row['id']}\">{$row['name']}</a><br>\n";
    }
}

$result = selectList($database, 12);
$iterator = new QueryIterator($result);
printList($iterator);

// Or, more compact:
printList(new QueryIterator(selectList($database, 12)));
An explanation is in order: the method 'execute' of class Database returns an object of class QueryResult, which can be used to access the rows in the result. The row-access methods are no longer part of the Database class (as before in '$DB->fetchRow()'), because I want to be able to run another query while I'm still processing the first. By separating the database connection from the query result this is suddenly possible (and the code gets simpler as well). To process the rows I could access each one through the interface of class QueryResult (e.g. '$row = $result->getRow(3)'), but in this case I wrap the result in an 'iterator', which is passed along to the function 'printList'. I also have iterators for built-in arrays and strings, lines in files, nodes in XML trees... All iterators have the same interface, so function 'printList' has become a generic function (it works on any data structure I have an iterator for) instead of a specialized one.
Looking at the function calls in the pipeline, you'll see that they are printed in the same order as the dependencies (or: reversed from what they were at first): show results -> get list -> select data. That's not just a coincidence!
Even though this is a very simple example, hopefully you'll notice the following:
- Function 'selectList' knows about the database connection.
- The QueryIterator knows nothing about the database connection, only about the result from some query.
- Function 'printList' knows nothing about the database connection or the fact that the data it processes comes from an executed query.
If I have code like this, and I use a global variable in function 'printList', the nice layering I got is immediately destroyed. And that's definitely not what I aimed for when I designed and coded it this way! Maybe in this particular case it's not much of a problem. But in a larger application, with many layers, it certainly is.
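To show how little the printing code needs to know, here is a minimal sketch of an iterator over a plain array. The method names (reset, isValid, next, getCurrent) are the ones used in the example above; the interface name RowIterator, the class ArrayRowIterator and the function renderList are assumptions made up for this sketch, not the actual classes behind the post. renderList runs the same loop as 'printList' but returns a string, so the result is easy to inspect.

```php
<?php
// Hypothetical interface: only the four method names come from the post.
interface RowIterator
{
    public function reset();
    public function isValid();
    public function next();
    public function getCurrent();
}

// Walks a plain PHP array of rows; no database involved.
class ArrayRowIterator implements RowIterator
{
    private $rows;
    private $pos = 0;

    public function __construct(array $rows)
    {
        // Re-index so positions always run 0..n-1.
        $this->rows = array_values($rows);
    }

    public function reset() { $this->pos = 0; }
    public function isValid() { return $this->pos < count($this->rows); }
    public function next() { $this->pos++; }
    public function getCurrent() { return $this->rows[$this->pos]; }
}

// Same loop as printList, but builds and returns the HTML.
function renderList(RowIterator $it)
{
    $html = '';
    for ($it->reset(); $it->isValid(); $it->next())
    {
        $row = $it->getCurrent();
        $html .= "<a href=\"article.php?id={$row['id']}\">{$row['name']}</a><br>\n";
    }
    return $html;
}

$articles = new ArrayRowIterator(array(
    array('id' => 1, 'name' => 'First article'),
    array('id' => 2, 'name' => 'Second article'),
));
print renderList($articles);
```

Nothing in renderList mentions a connection or a query. Add a 'global $DB' to it and that is no longer true: the function could not be fed from an array (or a file, or a test) without a database object being set up somewhere else first.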
There are many other remarks I could make about this example (why this is so much better ;-)), but this is not the place (nor the time; it's getting late here). The subject was 'global variables'. Once again. <<Sigh>> I'll say it one more time, and I will probably have to repeat this 'til my dying day: "global variables are evil!"
Vincent