Some PHP, Generated with Python

Been looking at some code recently that provides a mechanism to localize a user interface for different human languages. The mechanism for doing so looks something like this (repeated many times over);


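<?php
function drawNameForm() {
    global $lang;
?>
<form>
<?php echo $lang['name']; ?>: <input type="text" name="name" />
<input type="submit" value="<?php echo $lang['btn_ok']; ?>" />
</form>
<?php } ?>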

The global variable $lang is a giant associative array, loaded before this function is called according to the user's preferred language. The language files might look something like these, expressed as PHP arrays;

<?php
// Filename: lang.en.php

$lang = array (
    'name' => 'Name',
    'btn_ok' => 'OK',
);

?>

<?php
// Filename: lang.es.php

$lang = array (
    'name' => 'Nombre',
    'btn_ok' => 'AUTORIZACION',
);

?>

The code in question isn't an exception; I've seen many PHP applications handle localization in this way. A "naive" reader of Internationalization and Localization with PHP might conclude that this is the default best practice. Nothing wrong with the article, but the need for simple examples and short prose leaves other areas to the reader's imagination.

The big problem with localization in this manner is that it's happening at runtime. Every page request results in the localized version of the page being generated dynamically - see here for a rough idea of why that's not so good.

Now it may be possible to cache the output HTML and store it as a static file but in a real example it's likely we'll have truly dynamic data mixed in there, such as something from a database, which makes caching tricky. Without caching the translation adds a significant baseline overhead to the generation of each page, which will increase the more complex the UI becomes.

Meanwhile there will be only a finite number of translations of the user interface, which will vary only when the UI itself changes (rarely - certainly not on a per-request basis). So wouldn't it be better to have multiple versions of the drawNameForm() function, one per language, e.g.;

<?php
// Filename: ui.en.php
function drawNameForm() {
?>
<form>
Name: <input type="text" name="name" />
<input type="submit" value="OK" />
</form>
<?php } ?>

And...

<?php
// Filename: ui.es.php
function drawNameForm() {
?>
<form>
Nombre: <input type="text" name="name" />
<input type="submit" value="AUTORIZACION" />
</form>
<?php } ?>

Of course, who wants to manually maintain multiple versions of the same code?

For this specific problem solutions have already evolved, typically packaged as template engines. Jeff's file schemes implementation in WACT does "JIT" generation on PHP scripts, as you can see here. I believe (haven't looked) Smarty has similar functionality. Fine as long as you're happy with the template engine.

There are other, similar, categories of problem though, which have not been well solved yet. For example application configuration. Here's another suggestive snippet;

if ( $config['allow_bbcode'] ) {

    if ( $user_sig != '' && $user_sig_bbcode_uid != '' ) {
        $user_sig = ( $config['allow_bbcode'] ) ?
            bbencode_second_pass($user_sig, $user_sig_bbcode_uid) :
            preg_replace('/:[0-9a-z:]+]/si', ']', $user_sig);
    }

    if ( $bbcode_uid != '' ) {
        $message = ( $config['allow_bbcode'] ) ?
            bbencode_second_pass($message, $bbcode_uid) :
            preg_replace('/:[0-9a-z:]+]/si', ']', $message);
    }

}

The key line is;

if ( $config['allow_bbcode'] ) {

On every page request this condition (as well as many others) is being re-evaluated. Wouldn't it be better to simply eliminate the block of code if $config['allow_bbcode'] is false? That is the kind of job an installer could do once, at install time. There have been attempts to build PHP installers, some very successful ones designed for a single application (e.g. eZ publish 3), with the most mature generic attempt being Sandro's Zzoss Installer, which I've mentioned before here.
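As a rough illustration of eliminating a block ahead of time, rather than testing $config['allow_bbcode'] on every request, here is a minimal Python sketch. The //#if and //#endif marker comments, the strip_disabled_blocks() name and the sample template are all assumptions made up for this example; they are not taken from any of the installers mentioned here.

```python
import re

# Hypothetical marker syntax, invented for this sketch:
#   //#if <flag>  ...  //#endif
IF_RE = re.compile(r'^\s*//#if\s+(\w+)\s*$')
ENDIF_RE = re.compile(r'^\s*//#endif\s*$')


def strip_disabled_blocks(source, config):
    """Drop marker lines, and drop the body of any block whose flag is false."""
    output = []
    stack = []  # one boolean per open //#if: is that block enabled?
    for line in source.splitlines():
        match = IF_RE.match(line)
        if match:
            stack.append(bool(config.get(match.group(1), False)))
            continue
        if ENDIF_RE.match(line):
            if not stack:
                raise ValueError('//#endif without matching //#if')
            stack.pop()
            continue
        # Keep the line only when every enclosing block is enabled
        if all(stack):
            output.append(line)
    if stack:
        raise ValueError('//#if without matching //#endif')
    return '\n'.join(output) + '\n'


# Illustrative template only; the PHP here is not from a real application
TEMPLATE = """<?php
$message = htmlspecialchars($message);
//#if allow_bbcode
$message = bbencode_second_pass($message, $bbcode_uid);
//#endif
echo $message;
"""

if __name__ == '__main__':
    # With the feature switched off, the generated script has no bbcode call at all
    print(strip_disabled_blocks(TEMPLATE, {'allow_bbcode': False}))
```

The generated script then carries no trace of the disabled feature, so there is nothing left to evaluate at runtime. The trade-off is that changing the setting means regenerating the script.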

Around the point of writing installers in PHP is where I think things start to "fall apart", in general because PHP was designed specifically for web sites - it's not a general purpose solution. That's not to say "never" but rather, by taking other technologies into account, can life be made easier / PHP applications better?

Code generation has come up on this blog before. Another link from http://www.codegeneration.net/ since then, here;

PHP is amazing because it's so inexpensive to run. All you need is Apache, MySQL and PHP and you can run a web site or service.

...and add to that how easy it is to deploy a script - just drop it somewhere on your web server and away you go - something that opens the doors for "just in time" generation, for example.

Code generation often tends to be thought about in terms of building complete applications with some friendly "drag and drop", such as CodeCharge Studio. While that has its place, it tends to be "all or nothing".

An interesting alternative form of code generation that I ran into today is Aspect-Oriented PHP. It uses Java to do some basic parsing of a PHP script and merge it with a second "aspect" script (see here to get a sense of what AOP is about). Right now it probably isn't a realistic proposition in terms of performance, but it's interesting for two reasons: first, they're using Java to do the work, which gives them a solid basis for dealing with things like multibyte characters or more complex parsing operations, and second, they're generating PHP on the fly.

Returning to the localization problem above, I've been looking at empy recently: "A powerful and robust templating system for Python". Some things that make empy attractive so far: it's intended for general purpose templating (not HTML specific), it's mature, the markup is very distinct from PHP and HTML (the only thing you really need to watch is the @ symbol) and it allows you to use Python itself as the template language when needed.

An empy template to generate PHP from the drawNameForm() could look like this;

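<?php
// Filename: ui.tpl.php
function drawNameForm() {
?>
<form>
@(name): <input type="text" name="name" />
<input type="submit" value="@(btn_ok)" />
</form>
<?php } ?>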

The empy markup I've used is: @(name) and @(btn_ok) (empy has a lot more than that but, for now, I'm sticking with some basic variable references).

Viewing the output from this function in a browser results in the following HTML source;

<form>
@(name): <input type="text" name="name" />
<input type="submit" value="@(btn_ok)" />
</form>

In other words I can still use it as a working PHP script (in this example at least).

Meanwhile, if I run it through a Python script (which in turn invokes empy) like;


#!/usr/bin/python
# translate.py

# Load empy
import em

# Use a dictionary (like an associative PHP array)
# for sake of example. In practice use external files...
langs = {}

langs['en'] = {
    'name': 'Name',
    'btn_ok': 'OK',
}

langs['es'] = {
    'name': 'Nombre',
    'btn_ok': 'AUTORIZACION',
}

# Create an empy interpreter
interpreter = em.Interpreter()

# Load the template PHP script
tpl = open('ui.tpl.php').read()

# Loop through the available translations
for lang in langs:

    # Give the interpreter a new file to write output to
    interpreter.output = open('ui.'+lang+'.php','w')

    # Reset for new parse (making it use the output file)
    interpreter.reset()

    # Populate the interpreter's data space with the word list
    # to write into the template
    interpreter.globals = langs[lang]

    # (Re-) Parse the template
    interpreter.string(tpl)

interpreter.shutdown()

It spits out localized PHP scripts with filenames identifying the language e.g.;


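<?php
// Filename: ui.tpl.php
function drawNameForm() {
?>
<form>
Nombre: <input type="text" name="name" />
<input type="submit" value="AUTORIZACION" />
</form>
<?php } ?>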
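For anyone wanting to try the idea without installing empy, the substitution step can be approximated with Python's standard library alone. The sketch below handles only plain @(key) references, which is a tiny fraction of what empy does. The localize() and build_all() names and the cut-down template are assumptions for illustration, not empy's API.

```python
import re

# Matches only the simplest empy-style reference, e.g. @(name).
# Real empy markup is far richer; this is not a replacement for it.
MARKER_RE = re.compile(r'@\((\w+)\)')

# Cut-down stand-in for ui.tpl.php, just enough to show the substitution
TEMPLATE = """<?php
function drawNameForm() {
?>
@(name):
<?php } ?>
"""

LANGS = {
    'en': {'name': 'Name', 'btn_ok': 'OK'},
    'es': {'name': 'Nombre', 'btn_ok': 'AUTORIZACION'},
}


def localize(template, words):
    """Replace each @(key) with words[key]; a missing key raises KeyError."""
    return MARKER_RE.sub(lambda m: words[m.group(1)], template)


def build_all(template, langs):
    """Return a {filename: generated PHP source} mapping, one entry per language."""
    return dict(
        ('ui.%s.php' % code, localize(template, words))
        for code, words in langs.items()
    )


if __name__ == '__main__':
    for filename, source in sorted(build_all(TEMPLATE, LANGS).items()):
        with open(filename, 'w') as handle:
            handle.write(source)
```

Letting a missing key raise an error is deliberate: a build that stops is easier to spot than a generated script with a stray @(name) left in it.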

I can now use it as part of the build process for my application, as it's being prepared for a release. At runtime the code (hand coded) which uses this script might look like;


// Default to English in case of invalid values...
if ( !isset($_GET['lang']) || !preg_match('/^[a-z]{2}$/', $_GET['lang']) ) {
    $_GET['lang'] = 'en';
}

// Load the file of localized functions
require_once MY_LIB . 'ui.'.$_GET['lang'].'.php';

// Call the localized function
drawNameForm();
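One thing the runtime check above does not cover: a two-letter code with no generated file behind it (say ?lang=fr) passes the regular expression, and require_once then fails on the missing file. A build step could at least make sure every word list uses a code the runtime will accept and defines the same keys as the others. A minimal sketch, assuming the word lists are held in a dictionary as in translate.py; check_translations() is a hypothetical helper name.

```python
import re

# Same pattern as the runtime check in the PHP snippet
LANG_CODE_RE = re.compile(r'^[a-z]{2}$')


def check_translations(langs, reference='en'):
    """Return a list of problems found; an empty list means all is well."""
    problems = []
    expected = set(langs[reference])
    for code in sorted(langs):
        if not LANG_CODE_RE.match(code):
            problems.append('%r is not a two-letter lowercase code' % code)
        missing = expected - set(langs[code])
        extra = set(langs[code]) - expected
        if missing:
            problems.append('%s is missing: %s' % (code, ', '.join(sorted(missing))))
        if extra:
            problems.append('%s has unknown keys: %s' % (code, ', '.join(sorted(extra))))
    return problems


if __name__ == '__main__':
    langs = {
        'en': {'name': 'Name', 'btn_ok': 'OK'},
        'es': {'name': 'Nombre', 'btn_ok': 'AUTORIZACION'},
    }
    for problem in check_translations(langs):
        print(problem)
```

The list of codes that pass could also be written out by the build, so the runtime can compare against it instead of trusting any two letters.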

Anyway - just musings and an obvious solution, if you're already doing it. What I can't say is how well this kind of approach scales to a large PHP codebase but, guessing, so long as the scripts involved remain simple, focusing on performing a single task, no real problems.

Some particular points about Python. By taking advantage of things like py2exe or py2app you could distribute executable installers. Throw in some wxPython and you've got a cross platform GUI (which looks like the real thing - uses native widgets) for installing.

Any stories?

  • Alan Knowles

    A lot of this is built into the Flexy template engine. It parses out strings at compile time, stores them so you can write a translation tool (an example is included in the package), and uses PEAR’s Translation2 package to translate the strings to create language-specific compiled templates.

    You can also place translation markers in the HTML to replace specific HTML blocks with translated versions (especially useful when the grammatical mix of variables, text and HTML differs between languages). It also has the long-term benefit of encouraging you to remove all the language-specific code from the application and put all output words etc. into the template.

    Of course this is still wonderfully undocumented, but it has been tested on some very high traffic sites, in a few European and a few double-byte languages.

  • Ren

    I’ve been tempted to experiment with using double XSLT transforms for producing interim templates.
    I guess this would be a good use for it, if it turned out to be a useful method.

    The first transformation would take a language-independent template plus a language XML file and produce a language-specific template, which is then cached. Content XML is then fed through that template to produce the final result.

    Writing an XSLT which produces an XSLT may be a bit too much hassle. It could possibly be automated somewhat (with a few more transforms) though.

  • http://www.procata.com/ Selkirk

    I really like the phrase “Just in Time Code Generation.”

  • patrikG

    Excellent blog entry, Harry. Been thinking about Python and where it makes sense to use it for quite a while now. I’ve been thinking about localization from within MVC, but was somewhat horrified by the number of views necessary. It has boiled down to me using a template engine again (which I don’t generally like, as PHP is/was a templating language). Python, yet again, seems to offer a great solution. Time for me to have a much closer look at it.

  • culley

    I think if you are really going to internationalize an application you need to make peace with a template engine.

    I am a little surprised that the php community hasn’t glommed on to clearsilver templates (http://www.clearsilver.net/) with the php bindings (http://www.geodata.soton.ac.uk/software/php_clearsilver/). Clearsilver forces a clean separation of business logic from presentation logic. You can localize an application by simply switching the hdf data files. Bloglines.com is using this technology for their recent internationalization efforts.

    Everyone has their own preferences for template engines but many huge applications (bloglines, google groups 2, yahoo groups, Plaxo) are using clearsilver. The ability to swap out the back end programming language and leave your templates intact is convenient.

    culley

  • http://www.phppatterns.com HarryF

    I’ve been tempted to experiment with using double XSLT transforms for producing interim templates.

    Could be this makes interesting reading. Markus says good things about the relationship between XSLT and Haskell.

    I am a little surprised that the php community hasn’t glommed on to clearsilver templates

    Thanks for the link. Now glommed.

    I really like the phrase “Just in Time Code Generation.”

    Thanks – hadn’t occurred to me while writing – wonder if it makes the same league as Moore’s Law? Could also make a nice subheading in a book on PHP project management, under “Meeting Client Expectations”…

  • mlemos

    I find it odd, to say the least, to suggest the use of another language to generate PHP. It sounds as if PHP can only be used to generate HTML pages, but not source code in other languages. It is all text after all.

    PHP generating PHP code is an old story. If you just think of Smarty and most other template compiler engines, you realize that using PHP to generate PHP code has been done for many years now.

    The Smarty concept of compiling templates into executable PHP code was started in this thread on the php-template list. If you prefer newsgroup access you may find the thread here:

    news://news.php.net:119/4598.84T2941T13386538mlemos@acm.org

    The idea of anticipating template processing at a compile-time stage and generating executable PHP code was taken from one of the modules of the MetaL engine.

    Smarty generated code is not as efficient as the code generated by the MetaL engine template module but that is another story.

    Anyway, after Smarty many template compiler engines appeared imitating Smarty’s approach, but all were written in PHP.

    As a side note, regarding the use of the MetaL engine for generating PHP code, Metastorage is the largest PHP code generator application that I have written based on MetaL.

    Although it now comprises about 44,000 lines of code, Metastorage is simple to use and thoroughly documented, and the gains in productivity and code efficiency are overwhelming because the developer does not have to spend an eternity on tedious jobs like writing repetitive code, testing and debugging manually.

    I have recently launched the PHP Classes support forums using Metastorage to generate classes that act as interfaces to store the forum, threads, messages, subscriptions objects in a persistent storage container (read the MySQL database of the site). I am very happy with the gains in efficiency.

    Back to the subject, besides template compilers, there are many types of code generators written in PHP. Besides the generators of PHP code listed in CodeGeneration.net, which are not all written in PHP, you may find here several classes written in PHP to generate PHP code for several purposes.

    The bottom line is: if you open your eyes and realize that PHP can also be used to generate PHP code, you do not have to learn yet another language and can probably write PHP generators more proficiently.

  • akie

    I agree with mlemos. Why Python? PHP has enough power to accomplish such tasks.

    One quote from your article, Harry :-)

    Before I go any further I should mention that, as you’re probably aware, PHP is not the only choice for writing command line scripts. Both Perl and Python, to name just two, are widely used for writing command line applications and, in many cases, make a better choice than PHP. They provide a mature set of tools for common problems and, typically, better performance.

    So, why use PHP? An obvious — and very good — reason may simply be that you know PHP better than the alternatives. Less obvious is that, if you’re developing a Web application in PHP, writing supporting scripts in another language can lead to extra headaches — even if you’re confident in both. There’s both the human aspect of having to switch programming “mind sets”, and the overhead of having to support two platforms and the potential missed opportunities for re-use; data access logic may need to be implemented twice, for example.
    http://www.sitepoint.com/article/php-command-line-1

  • http://www.lastcraft.com/ lastcraft

    Hi.

    I actually prefer to write generators in a different language if I can. Otherwise escaping special characters which are also special in its own syntax can become a real nightmare.

    The empy use of “@” not clashing with PHP is a good example.

    Adds to deployment hassles though, and your developers have to be “multilingual” in the programming language sense. Take your pick.

    yours, Marcus

  • http://www.phppatterns.com HarryF

    Can see I’ve suggested sacrilege ;)

    Don’t have the time right now to do this discussion justice so some quickish remarks for the moment, that may clarify where I’m coming from.

    The example here involves templating where the “static” parts of the templates themselves are PHP and may also contain HTML (vs. normal PHP templating, where the static elements are only HTML). Considering existing PHP templating tools, I could see nightmares ahead from the perspective of template syntax.

    I want the templates to have their own, imperative language for this problem, because I want flexibility. That the template language can be Python helps make the problem a lot simpler.

    Also Python already has a bunch of parsing tools (excepting what Marcus and Alan have done, PHP doesn’t – PHP developers aren’t really focusing on that sort of problem and when serious parsing is needed, the core team uses C).

    Even better that there’s empy and it’s ready to go – I don’t even need to think about parsing or even a template language.

    Ultimately think the solution speaks for itself – no effort to implement.

    To re-use that quote of myself;

    Both Perl and Python, to name just two, are widely used for writing command line applications and, in many cases, make a better choice than PHP. They provide a mature set of tools for common problems and, typically, better performance

    For me this was a situation where Python was the better choice.

  • http://www.bitExpert.de Ab7zCh3kR


    Of course, who wants to manually maintain multiple versions of the same code?

    Not me :)

    It’s actually all about creating software system families (aka system-family engineering or product-line engineering). A software system family describes all possible ways in which a specific piece of software can be built (or put together, if you prefer). The customer describes “his software” using a domain-specific language and a software generator automatically creates the software from this specification. The resulting software is optimized for its purpose.


    On every page request this condition (as well as many others) is being re-evaluated. Wouldn't it be better to simply eliminate the block of code, if $config['allow_bbcode'] is false?

    This is also a member of a software system family. In this case we have a family member without the bbcode feature. This might be done by an installer, but as far as I can tell installers work for simple solutions, not for software with many possible features. A software generator with configuration knowledge (that is, knowing which features depend on or exclude other features) can handle it.

  • donald lobo

    I recently did something very similar to the above and used a combination of:

    1. XML for the input using SimpleXML in PHP5
    2. Smarty for the output
    3. PHP_Beautifier to make it look pretty :)

    Worked great and gives me good separation and flexibility.

  • mymame

    I am using Flexy, and when a piece of HTML code is sent to the template from the controller, the tag characters < and > are getting converted to &lt; and &gt; respectively. How do I send the code as it is?