Results 1 to 8 of 8
  1. #1
    messing with my mind fristi's Avatar
    Join Date
    Feb 2009
    Posts
    292
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    evaluating expression from File

    Hi,

    I have been pondering over this problem for a while and I can't come up with anything.

    The situation is as follows:

    I'm reading a line from a file. This line contains an expression that uses the ternary operator. In the script I need to evaluate this expression to know how to proceed. I can't change the files, as they come from a different source, and the expressions can become quite complex, too complex to split up and rebuild in the script itself.

    I know that the eval() function could be used, but I'm a bit opposed to this method, because then there is no control: what if someone with bad intentions manages to send the script a file whose expression eval()s to a delete or whatever...

    So I was wondering if any of you have any special ideas or techniques for this kind of thing?

    I would appreciate it. Thanks
    To PHP or to Perl, that is the question!
    (Bucket - simpletest) User

  2. #2
    Theoretical Physics Student bronze trophy Jake Arkinstall's Avatar
    Join Date
    May 2006
    Location
    Lancaster University, UK
    Posts
    7,062
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Without eval() there's not much you can do, as the task is to basically parse the statement.

    What you can do, however, is first verify that the statement is safe: check it against a blacklist of keywords (eval, unlink, mysql, etc.) before running eval(). It's not foolproof, but for what you want to do it might be the best option.
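
    A minimal sketch of such a blacklist check, assuming a simple case-insensitive substring match. The function name and the default keyword list are hypothetical, chosen only for illustration:

```php
<?php
// Hypothetical helper: reports whether $expr contains any blacklisted keyword.
function containsBlacklistedKeyword(string $expr, array $blacklist = ['eval', 'unlink', 'mysql']): bool
{
    foreach ($blacklist as $keyword) {
        // stripos() is case-insensitive, so "UNLINK" is caught as well.
        if (stripos($expr, $keyword) !== false) {
            return true;
        }
    }
    return false;
}
```

    This shows why the approach is not foolproof: an expression such as ("un" . "link")("x") never contains the literal word unlink, so it passes the check even though it would call unlink() when evaluated.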
    Jake Arkinstall
    "Sometimes you don't need to reinvent the wheel;
    Sometimes its enough to make that wheel more rounded"-Molona

  3. #3
    SitePoint Addict
    Join Date
    Jan 2005
    Location
    Ireland
    Posts
    349
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Writing a parser is more than possible, although admittedly it could feel like you are re-inventing the wheel (within an already existing wheel... I'm confused).

    You could look for someone who has done this before (http://phpexcel.codeplex.com/SourceC...w/10632#172258 for example).

    If you want to go down this route, you could use the shunting-yard algorithm to get a postfix (reverse Polish notation) expression, and evaluate that.

    There are plenty of articles and source code detailing arithmetic parsing, as it is something often covered in computer science courses, and of course, needed in many applications.
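
    As a rough sketch of that idea, assuming only non-negative numbers, the four left-associative operators + - * / and parentheses (no unary minus, no ternary operator, no function calls). The function names toPostfix and evalPostfix are made up for this example:

```php
<?php
// Convert an infix arithmetic expression to postfix (RPN) using shunting-yard.
function toPostfix(string $expr): array
{
    $prec = ['+' => 1, '-' => 1, '*' => 2, '/' => 2];
    preg_match_all('/\d+(?:\.\d+)?|[-+*\/()]/', $expr, $m);
    // Reject input containing anything the tokenizing regex did not recognise.
    if (implode('', $m[0]) !== preg_replace('/\s+/', '', $expr)) {
        throw new InvalidArgumentException('Unexpected character in expression');
    }
    $output = [];
    $stack = [];
    foreach ($m[0] as $tok) {
        if (is_numeric($tok)) {
            $output[] = (float) $tok;
        } elseif (isset($prec[$tok])) {
            // Pop operators of higher or equal precedence (left-associative).
            while ($stack && isset($prec[end($stack)]) && $prec[end($stack)] >= $prec[$tok]) {
                $output[] = array_pop($stack);
            }
            $stack[] = $tok;
        } elseif ($tok === '(') {
            $stack[] = $tok;
        } else { // the only remaining token is ')'
            while ($stack && end($stack) !== '(') {
                $output[] = array_pop($stack);
            }
            if (!$stack) {
                throw new InvalidArgumentException('Unbalanced parentheses');
            }
            array_pop($stack); // discard the matching '('
        }
    }
    while ($stack) {
        $op = array_pop($stack);
        if ($op === '(') {
            throw new InvalidArgumentException('Unbalanced parentheses');
        }
        $output[] = $op;
    }
    return $output;
}

// Evaluate a postfix token list produced by toPostfix().
function evalPostfix(array $postfix): float
{
    $stack = [];
    foreach ($postfix as $tok) {
        if (is_float($tok)) {
            $stack[] = $tok;
            continue;
        }
        if (count($stack) < 2) {
            throw new InvalidArgumentException('Malformed expression');
        }
        $b = array_pop($stack);
        $a = array_pop($stack);
        switch ($tok) {
            case '+': $stack[] = $a + $b; break;
            case '-': $stack[] = $a - $b; break;
            case '*': $stack[] = $a * $b; break;
            case '/': $stack[] = $a / $b; break;
        }
    }
    if (count($stack) !== 1) {
        throw new InvalidArgumentException('Malformed expression');
    }
    return $stack[0];
}
```

    Because the input is never handed to eval(), nothing outside these four operators can run. The cost is that every extra feature (comparisons, the ternary operator, function calls) has to be added by hand.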

    EDIT: In the user comments of http://php.net/eval you can see a function named matheval. This version uses regular expressions to see if the formula being passed in is valid maths.
    This is a similar concept to what arkinstall described; however, instead of a blacklist you are using a whitelist: that is, you only specify what valid input is, not what invalid input is. This is more secure*, as the whitelist is a smaller set that can be covered completely, whereas a blacklist may be missing values.

    * I haven't looked at the code in depth to see if it holds up in security.
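
    To illustrate the whitelist idea (this is not the matheval function from the php.net comments, just a hypothetical sketch in the same spirit), a regular expression can accept a formula only if every character is one that plain arithmetic needs:

```php
<?php
// Hypothetical helper: true only if $formula consists solely of digits,
// whitespace, decimal points, parentheses and the operators + - * /.
function looksLikeArithmetic(string $formula): bool
{
    // \A and \z anchor the match to the whole string, including any trailing newline.
    return preg_match('/\A[0-9\s.()+\-*\/]+\z/', $formula) === 1;
}
```

    Passing this check only means no unexpected characters are present. It does not guarantee the formula is well formed, so a later evaluation step can still fail on input such as "1 +* 2".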

  4. #4
    Theoretical Physics Student bronze trophy Jake Arkinstall's Avatar
    Join Date
    May 2006
    Location
    Lancaster University, UK
    Posts
    7,062
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Ryan Wray View Post
    If you want to go down this route, you could use the shunting-yard algorithm to get a postfix (reverse Polish notation) expression, and evaluate that.

    There are plenty of articles and source code detailing arithmetic parsing, as it is something often covered in computer science courses, and of course, needed in many applications.
    The problem here is that the ternary operator is in PHP syntax, and could easily contain functions native to PHP - sorting through that mess would be a lot of code and very 'dodgy'. In effect you'd have to write something to fully parse PHP in PHP, which is the same thing as eval anyway.

    But I agree with the whitelist idea, though it'd be one hell of a long list.

  5. #5
    messing with my mind fristi's Avatar
    Join Date
    Feb 2009
    Posts
    292
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I think I will go with the whitelist idea then. I have learned that there are some restrictions on what is allowed in the expression, so I hope it will be a limited list.

    Writing my own parser looks too far-fetched and would take too much time for this one-line problem. I don't plan on inventing my own language.

    Thanks for taking the time to reply, people.

  6. #6
    SitePoint Evangelist
    Join Date
    Jun 2006
    Location
    Wigan, Lancashire. UK
    Posts
    523
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Ryan Wray View Post
    You could look for someone who has done this before (http://phpexcel.codeplex.com/SourceC...w/10632#172258 for example).

    If you want to go down this route, you could use the shunting-yard algorithm to get a postfix (reverse Polish notation) expression, and evaluate that.
    The PHPExcel parser is working with a very limited function set (albeit about 350 of them in all, it's limited in comparison with PHP), a limited operand set (number, string, boolean, matrix, error and null... no objects etc), and a limited set of operators. Even so, it took about 4 months of work to get the formula parser to its current state... and it's still a work in progress (albeit stable enough now for an initial production release)... and I did have previous experience of language parsing.
    While an interesting experience for learning the intricacies of parsers, and an invaluable learning exercise in how computer languages work "under the hood", it's almost certainly overkill for this situation.

    The original PHPExcel parser simply built a PHP function based on the formula string, converting the Excel functions and operands to their PHP equivalents, and calling it with a lambda function. In that regard, it was little different to an eval. Certainly there were some restrictions, but effectively it was just whitelisting the valid operations and functions.
    ---
    Development Projects:
    PHPExcel
    PHPPowerPoint

  7. #7
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    PHP comes with a tokenizer you can access via a PHP script. Parsing PHP with it is very easy.

    http://php.net/tokenizer

    You can then write a parser yourself, or use the tokenizer to validate the expression (and use eval() later).
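
    A sketch of the validation route, assuming the expressions only need literals, variables, comparisons, basic arithmetic, boolean operators and the ternary operator. The function name isSafeExpression and both whitelists are assumptions for illustration; token_get_all() and the T_* constants are the real tokenizer API:

```php
<?php
// Hypothetical helper: true only if every token in $expr is on the whitelist.
// $allowedNames lists the bare identifiers (constants or function names) to accept.
function isSafeExpression(string $expr, array $allowedNames = ['true', 'false', 'null']): bool
{
    $allowedTokens = [
        T_LNUMBER, T_DNUMBER, T_CONSTANT_ENCAPSED_STRING, T_VARIABLE,
        T_IS_EQUAL, T_IS_NOT_EQUAL, T_IS_IDENTICAL, T_IS_NOT_IDENTICAL,
        T_IS_SMALLER_OR_EQUAL, T_IS_GREATER_OR_EQUAL,
        T_BOOLEAN_AND, T_BOOLEAN_OR,
    ];
    $allowedChars = ['?', ':', '(', ')', '<', '>', '+', '-', '*', '/', '!', ','];
    // A "(" directly after one of these would be a dynamic call, e.g. $f() or 'unlink'().
    $callable = [T_VARIABLE, T_CONSTANT_ENCAPSED_STRING, ')'];

    $tokens = token_get_all('<?php ' . $expr);
    array_shift($tokens); // drop the opening tag we prepended ourselves
    $prev = null;

    foreach ($tokens as $token) {
        $id   = is_array($token) ? $token[0] : $token;
        $text = is_array($token) ? $token[1] : $token;

        if ($id === T_WHITESPACE) {
            continue;
        }
        if ($id === '(' && in_array($prev, $callable, true)) {
            return false;
        }
        if ($id === T_STRING) {
            // Bare identifiers such as function names must be explicitly allowed.
            if (!in_array(strtolower($text), $allowedNames, true)) {
                return false;
            }
        } elseif (is_int($id)) {
            if (!in_array($id, $allowedTokens, true)) {
                return false;
            }
        } elseif (!in_array($id, $allowedChars, true)) {
            return false;
        }
        $prev = $id;
    }
    return true;
}
```

    For example, isSafeExpression('strlen($s) > 3 ? 1 : 0', ['strlen']) accepts the expression, while anything containing an unlisted function name, an assignment, a semicolon or backticks is refused. This narrows what can reach eval() but is not a proof of safety: every name added to $allowedNames can then be called with whatever arguments the file supplies.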

  8. #8
    SitePoint Addict
    Join Date
    Jan 2005
    Location
    Ireland
    Posts
    349
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by arkinstall View Post
    The problem here is that the ternary operator is in PHP syntax, and could easily contain functions native to PHP - sorting through that mess would be a lot of code and very 'dodgy'. In effect you'd have to write something to fully parse PHP in PHP, which is the same thing as eval anyway.

    But I agree with the whitelist idea, though it'd be one hell of a long list.
    I apologise: somehow, I got it into my head that he wanted to evaluate mathematical statements.

    I don't know where that notion came from. Reading the post again, my post is effectively useless. :P

