  1. #1
    SitePoint Wizard
    Join Date
    Nov 2000
    Location
    Switzerland
    Posts
    2,479

    parsekit - regenerating PHP with PHP?

    Been looking at parsekit - interesting stuff.

    Unlike the tokenizer extension, which gives you the output from lexical analysis of a PHP script, parsekit spits out OPCODEs - nice diagram here to help understand this.
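    To make the contrast concrete, here is a minimal sketch of what the tokenizer side looks like, using only token_get_all() and token_name() from the tokenizer extension. The sample source string and the describe_tokens() helper are made up for illustration; parsekit's opcode output is not shown because it needs the PECL extension installed.

```php
<?php
// Lexical analysis only: list the tokens of a tiny script.
// describe_tokens() and the sample source are hypothetical, for illustration.
function describe_tokens(string $source): array
{
    $out = [];
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens are [token id, text, line number].
            $out[] = token_name($token[0]) . ' ' . trim($token[1]);
        } else {
            // Single-character tokens such as ( ) ; come back as plain strings.
            $out[] = $token;
        }
    }
    return $out;
}

print_r(describe_tokens('<?php echo strlen("abc");'));
```

    The tokens tell you what the source text looks like, not what the engine will do with it. That second part is what the opcode dump adds.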

    Now there doesn't yet seem to be a way to pass the type of output parsekit delivers back to the PHP VM for execution (perhaps we'll see this come from runkit in due course), which would mean you could write PHP code in OPCODEs. Even so, this output could perhaps be used for code generation.

    Why code generation (and I mean active code generation - you don't manually interact with the generated code)? Because if you can flatten all those requires, includes, classes and functions into a single procedural script, with zero function calls, you could radically improve performance. The knowledge contained in the OPCODEs is enough to make a start on generating code like this.

    Not sure right now whether that would be a major effort, but it could be interesting to try.

    What's obviously missing from parsekit's output is runtime information, which can drastically alter the "code path" through a collection of PHP scripts, based perhaps on values a user entered (i.e. unpredictable).

    At the same time, the more I think about it, I start to wonder if I'm seeing a "big picture".

    That goto business explained. Who's the common author on parsekit, runkit (notice the sandbox?) and classkit?

    It's worth taking a look at psyco, which is a tool for speeding up Python. From the intro:

    Unlike the Java JITs, which writes one machine-code version of each of your function and delivers a constant speed-up (typically around 2x), Psyco uses the actual run-time data that your program manipulates to write potentially several versions of the machine code, each differently specialized for different kinds of data. Depending on how well it can do it, you can get smaller or higher speed-ups. In extreme cases, when all computations can be done in advance, nothing remains to be done at run-time.
    It's also worth checking out the psyco docs, esp. the animated ones (you'll need to install Python and PyGame to run them).

    The point here is that if you want to regenerate some nicely abstracted PHP scripts into a single giant spaghetti monster which uses no functions at all, you'd most likely want a fast goto statement. It would be used to jump between the multiple representations of a function you've generated, which get executed based on runtime types etc.
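    For what it's worth, PHP did later gain a goto statement (from 5.3 onwards), so the idea can at least be sketched. The example below is a hand-written illustration, not generated output: double_flat() and its labels are hypothetical names, and the routine is wrapped in a function only so it can be called for the demonstration.

```php
<?php
// One flat routine holding several specialised bodies for "double a value".
// The body to run is picked from the runtime type, then reached with goto.
// double_flat() and the label names are hypothetical.
function double_flat($value)
{
    if (is_int($value)) {
        goto int_version;
    }
    if (is_string($value)) {
        goto string_version;
    }
    goto generic_version;

    int_version:
    // Specialised for integers: plain addition.
    return $value + $value;

    string_version:
    // Specialised for strings: concatenation.
    return $value . $value;

    generic_version:
    // Fallback for everything else: pair the value up in an array.
    return [$value, $value];
}

var_dump(double_flat(21));
var_dump(double_flat('ab'));
```

    A real generator would presumably emit these labels inline in one long script rather than inside a function, and whether the jumps actually beat ordinary function calls would need measuring.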

    The final piece of this "big picture" (read: I've been watching too much X-Files) is http://www.ning.com (/. announced here). Now Marc Andreessen recently joined Zend and there are some names you'll know to be found here (not Sara's, though George is on the list).

    The buzz with Ning is you can write PHP scripts and execute them within the "Ning Framework". Not sure what the fine print is here, but so far all I've read suggests there are no particular limitations on how you use PHP - much like a typical shared host, but your code is also being mingled in with the Ning framework.

    Now Stefan Esser raises some relevant security questions about Ning. And unless Marc is employing a lot of sysadmins to keep his servers running, there's no way Ning is simply executing vanilla PHP scripts (side note - SourceForge recently made their filesystem read-only to PHP, after years of suffering).

    Perhaps safe mode is being used by Ning, but it's not exactly safe and you could still grind Ning to a halt with an infinite loop, executed enough times. Alternatively, perhaps Ning uses a highly modified PHP install, where a lot of native functions have been either removed or re-implemented to be well sandboxed. But we're talking C here, and without the years of live testing that have gone into PHP's own functions.

    Perhaps instead you'd use a sandbox to test user scripts in. It would also be cool if you could reconstruct those scripts, using a language that's fairly quick to work in (like PHP itself), into their fastest possible runtime form, while replacing potentially problematic calls to things like fopen with calls to your own safe API.
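    As a rough idea of what the call-replacement part might look like, here is a sketch that works on tokenizer output rather than opcodes. rewrite_calls() and the safe_fopen wrapper name are hypothetical; only token_get_all() and the T_* constants come from PHP itself. There's no suggestion this is how Ning does anything.

```php
<?php
// Rewrite direct calls to risky functions so they go through a wrapper.
// rewrite_calls() and the wrapper names passed to it are hypothetical.
function rewrite_calls(string $source, array $replacements): string
{
    $tokens = token_get_all($source);
    $count  = count($tokens);
    $out    = '';
    $prev   = null; // last non-whitespace token seen

    for ($i = 0; $i < $count; $i++) {
        $token = $tokens[$i];
        $text  = is_array($token) ? $token[1] : $token;

        if (is_array($token) && $token[0] === T_STRING) {
            // Look ahead past whitespace: a call is a name followed by "(".
            $j = $i + 1;
            while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
                $j++;
            }
            $isCall = $j < $count && $tokens[$j] === '(';

            // Skip method calls, static calls, declarations and "new Foo(".
            $skip = is_array($prev) && in_array(
                $prev[0],
                [T_OBJECT_OPERATOR, T_DOUBLE_COLON, T_FUNCTION, T_NEW],
                true
            );

            $name = strtolower($text);
            if ($isCall && !$skip && isset($replacements[$name])) {
                $text = $replacements[$name];
            }
        }

        $out .= $text;
        if (!is_array($token) || $token[0] !== T_WHITESPACE) {
            $prev = $token;
        }
    }
    return $out;
}

echo rewrite_calls('<?php $h = fopen("a.txt", "r");', ['fopen' => 'safe_fopen']);
```

    This only catches calls that are spelled out literally. Variable functions, call_user_func, eval and fully qualified names like \fopen all slip past a token-level rewrite, which is part of why the runtime sandbox would still be needed.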

    So perhaps this has already been done. Probably not, but anyway - it's Friday.

  2. #2
    SitePoint Addict
    Join Date
    Aug 2003
    Location
    Toronto
    Posts
    300
    Nice post. I agree that it would be interesting if userland could supply raw opcodes to the runtime. As it happens, I asked the "common" author about this on IRC a little while back: apparently it is true that there is no known existing extension for this purpose, but it was suggested that APC may have relevant code that could be used as a base if someone wants to tackle that. Of course, the whole thing borders on that razor-sharp sanity line.

  3. #3
    SitePoint Guru
    Join Date
    Nov 2002
    Posts
    841
    Nice post, Harry. I had no idea Ning ran in a sandbox. Brings new life to the "PHP is the template" philosophy.

