  1. #1
    It's all Geek to me silver trophybronze trophy
    ralph.m's Avatar
    Join Date
    Mar 2009
    Location
    Melbourne, AU
    Posts
    24,338
    Mentioned
    465 Post(s)
    Tagged
    8 Thread(s)

    Preventing malicious code input with clean_html?

    Hi all,

    I'd like people to be able to submit code in a textarea form field but also prevent them from posting malicious code.

    At the moment, I use a regular expression to prevent code being posted, and the only option in my newbie repertoire to allow code to be posted is to ask the poster to convert selected characters into their equivalent entity references--which, of course, is a burden on the poster.

    I stumbled on a reference to clean_html. Would that be a useful option here, and if so, how is it used for this purpose?

    Or is there a better option?

    I stumbled on clean_html while looking through the FormMail.cgi script (which I don't use, BTW):

    Code:
    # This function will convert <, >, & and " to their HTML equivalents. #
    sub clean_html {
        local $value = $_[0];
        $value =~ s/\&/\&amp;/g;
        $value =~ s/</\&lt;/g;
        $value =~ s/>/\&gt;/g;
        $value =~ s/"/\&quot;/g;
        return $value;
    }
    I'm not quite sure how that works, though, or how to use it along with a regular expression.

  2. #2
    PHP Guru lampcms.com's Avatar
    Join Date
    Jan 2009
    Posts
    921
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This is not even PHP code. You can do better in PHP using filter_var.
    Just look up the filter_var function in the PHP manual.
    My project: Open source Q&A
    (similar to StackOverflow)
    powered by php+MongoDB
    Source on github, collaborators welcome!
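
For anyone following the filter_var suggestion above, here is a minimal sketch of two of its built-in filters. The sample strings are made up for illustration, and which filter suits a given form field is an assumption that depends on where the value ends up.

```php
<?php
// Sample input invented for this example.
$comment = '<b>"hi" & bye</b>';

// Converts & < > and quotes to HTML entities
// (the same idea as htmlspecialchars() with ENT_QUOTES).
$safeComment = filter_var($comment, FILTER_SANITIZE_FULL_SPECIAL_CHARS);
echo $safeComment, "\n";

// Validation filters return the value on success and false on failure.
$goodEmail = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$badEmail  = filter_var('not an address', FILTER_VALIDATE_EMAIL);
var_dump($goodEmail, $badEmail);
```

Sanitizing filters change the value, while validating filters only accept or reject it, so the two are usually applied to different fields.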

  3. #3
    . shoooo... silver trophy logic_earth's Avatar
    Join Date
    Oct 2005
    Location
    CA
    Posts
    9,013
    Mentioned
    8 Post(s)
    Tagged
    0 Thread(s)
    Quote: Originally Posted by ralph.m
    Or is there a better option?
    Yes, HTML Purifier. http://htmlpurifier.org/
    Logic without the fatal effects.
    All code snippets are licensed under WTFPL.


  4. #4
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2008
    Posts
    5,757
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Also see htmlspecialchars().
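
As a rough PHP counterpart to the Perl clean_html sub quoted in the first post, htmlspecialchars() performs the same four substitutions when called with ENT_COMPAT. The wrapper name clean_html below is only borrowed from the Perl script for illustration; it is not a PHP built-in.

```php
<?php
// Hypothetical wrapper, named after the Perl sub it imitates.
function clean_html(string $value): string
{
    // ENT_COMPAT converts & < > and double quotes and leaves single
    // quotes alone, matching the four replacements in the Perl version.
    return htmlspecialchars($value, ENT_COMPAT, 'UTF-8');
}

$submitted = '<a href="x.php?a=1&b=2">link</a>';
echo clean_html($submitted), "\n";

// Characters with no special meaning in HTML pass through unchanged.
echo clean_html('$total = {1};'), "\n";
```

Because the angle brackets become entities, the submitted markup is shown as text instead of being interpreted by a browser.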

  5. #5
    ralph.m
    Quote: Originally Posted by Sharedlog.com
    This is not even PHP code.
    So THAT's why I couldn't find it in the manual.

    Thanks for your feedback guys. I'll check out these options.

    Much appreciated!

  6. #6
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It was Perl, for future reference.

  7. #7
    ralph.m
    Quote: Originally Posted by sk89q
    It was Perl, for future reference.
    Ah, thanks for your Perls of wisdom.

    So far I've tried out htmlentities, and it worked nicely, although only if I got rid of the regex for that field. Not sure what I need to check for now. As htmlentities runs before the regex, should I leave out of the regex the characters that htmlentities will (presumably) already have converted?

    Also, htmlentities did not convert characters like { and $. Is there an equivalent function for PHP / CSS characters, by any chance?

  8. #8
    ralph.m
    Hi again. I found that this worked nicely for replacing a dollar sign with a character reference:

    $msg = str_replace('$', '& #36;', $msg);

    (The gap in the character reference is only to allow it to display here.)

    Would I still need to allow for the dollar sign in a regular expression, or have I disabled the use of the dollar sign in malicious code by using the character reference?

    By the way, I've had trouble creating a regular expression that allows the dollar sign. Even if I escape it, a dollar sign does not pass muster.
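
One possible cause of the dollar-sign trouble described above, offered as a guess since the original regex is not shown: in a double-quoted PHP string, "\$" collapses to a bare $ before the regex engine sees it, and a bare $ outside a character class is the end-of-string anchor. The sketch below shows the difference and a whitelist that accepts the dollar sign. The set of allowed characters and the name is_allowed are hypothetical.

```php
<?php
// Double quotes: PHP turns "\$" into "$", so the regex is /$/ (end anchor)
// and it "matches" even though the subject has no dollar sign.
$anchorHit = preg_match("/\$/", 'abc');

// Single quotes: the regex engine receives \$ and looks for a literal dollar.
$literalMiss = preg_match('/\$/', 'abc');
$literalHit  = preg_match('/\$/', 'a$b');

// Hypothetical whitelist of letters, digits, whitespace, and some code
// punctuation. Inside [...] a $ is always a literal character.
const ALLOWED_PATTERN = '/\A[\w\s$<>{}()\[\];:=&#.,\'"\/?!+*-]*\z/';

function is_allowed(string $msg): bool
{
    return preg_match(ALLOWED_PATTERN, $msg) === 1;
}

var_dump($anchorHit, $literalMiss, $literalHit);
var_dump(is_allowed('<?php echo $total; ?>'));
var_dump(is_allowed("price\x00"));
```

Writing patterns in single quotes avoids most of these surprises, because PHP then leaves the backslashes and dollar signs for the regex engine to interpret.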

  9. #9
    logic_earth
    Why do you need to filter out the dollar sign?
    It doesn't have any reserved action in HTML.

    So what are you doing with the submitted content?


  10. #10
    ralph.m
    Quote: Originally Posted by logic_earth
    Why do you need to filter out the dollar sign?
    It doesn't have any reserved action in HTML.

    So what are you doing with the submitted content?
    Yes, I should have specified. This is just a form-to-email issue. No database involved. I want people to be able to post HTML and PHP code without leaving open the risk of malicious email injections etc.

    I'm not really sure what extent I need to go to in order to avoid nasty stuff.
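
For the form-to-email case described above, the usual injection risk is in the header fields (sender address, subject), not in the message body: a line break in a header value lets a sender append extra headers such as Bcc. Below is a minimal sketch of that guard, assuming a plain-text email. The function names, array keys, and addresses are hypothetical, and this is not a complete mailer.

```php
<?php
// Hypothetical helper: true if a header value contains CR or LF.
function has_line_break(string $field): bool
{
    return preg_match('/[\r\n]/', $field) === 1;
}

// Hypothetical helper: returns the pieces for mail(), or null if rejected.
function build_mail(string $fromEmail, string $subject, string $body): ?array
{
    // Reject header values with line breaks instead of trying to repair them.
    if (has_line_break($fromEmail) || has_line_break($subject)) {
        return null;
    }
    if (filter_var($fromEmail, FILTER_VALIDATE_EMAIL) === false) {
        return null;
    }
    return [
        'subject' => $subject,
        // Sent as text/plain, so submitted HTML or PHP is shown as literal
        // characters and does not need entity conversion.
        'body'    => $body,
        'headers' => "Reply-To: {$fromEmail}\r\nContent-Type: text/plain; charset=UTF-8",
    ];
}

$ok  = build_mail('user@example.com', 'Code sample', '<?php echo $x; ?>');
$bad = build_mail("user@example.com\r\nBcc: victim@example.com", 'Hi', 'text');
var_dump($ok !== null, $bad === null);

// Placeholder recipient; uncomment to actually send.
// mail('owner@example.com', $ok['subject'], $ok['body'], $ok['headers']);
```

With this approach the message text can contain $, {, < and similar characters untouched, since none of them has any effect in a plain-text body.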

