  1. #1
    SitePoint Zealot
    Join Date
    Dec 2006
    Location
    Gothenburg, Sweden
    Posts
    135

    Need help cleaning strings for a webpage

    Hi all,

    I need help writing a regular expression that cleans a string before it is published on a website. The database charset is UTF-8, as is all the HTML.

    When a user enters text in a textarea, I want to strip garbage characters from the string to avoid spamming. This is similar to what trim() does, except that trim() only checks the beginning and end of the string, and I want to check everywhere in the string.

    Redundant spaces (more than one in a row) should be deleted. Newlines (\r\n) should be allowed, but if there are more than 2 in a row, the extra ones should be deleted so that only 2 in a row remain.

    Could someone help me with this, please? Thanks in advance.

  2. #2
    SitePoint Evangelist BJ Duncan
    Join Date
    Jun 2007
    Location
    North Richmond
    Posts
    495
    There have been numerous threads on this forum about this:
    Go to this link. This may assist in some way.

    Or maybe this:
    PHP Code:
    <?php
    # Place your $_POST values into variables
    $subject   = $_POST['subject'];
    $titletext = $_POST['titletext'];
    $maintext  = $_POST['maintext'];

    # Clean the variables, removing any errors the humans may have made
    function clean($value) {
        // Strip whitespace from both ends of the string
        $value = trim($value);

        // Remove HTML and PHP tags
        $value = strip_tags($value);

        // Escape for use in a query. This needs the legacy mysql extension
        // (removed in PHP 7) and an open database connection.
        $value = mysql_real_escape_string($value);

        // This could all be done on one line, but for simplicity I separated it all
        return $value;
    }

    $subject   = clean($_POST['subject']);
    $titletext = clean($_POST['titletext']);
    $maintext  = clean($_POST['maintext']);
    ?>
    Regards,
    BJ Duncan

  3. #3
    SitePoint Zealot
    Join Date
    Dec 2006
    Location
    Gothenburg, Sweden
    Posts
    135
    Thanks for your help, appreciated. The link you posted relates more to SQL injection, which is not what my thread is about.

    The code you posted suggests using the trim() function, but as I wrote, it's not sufficient for my needs.

  4. #4
    SitePoint Wizard Hammer65
    Join Date
    Nov 2004
    Location
    Lincoln Nebraska
    Posts
    1,161
    Since you said "spam", I assume that you mean either mail injection or XSS. Those are two different things.

    Search Google for "php mail injection" to get the info on that, because which characters are dangerous depends on the part of the email the text is going into (which header, or the body).

    For XSS read this.

  5. #5
    SitePoint Zealot
    Join Date
    Dec 2006
    Location
    Gothenburg, Sweden
    Posts
    135
    No no no, by spam I mean inserting a lot of gibberish characters in the text field, which would fill the database with gibberish.

    Forget about spam.

    This is what I'm truly looking for:

    A function that will clean up a string by removing redundant space chars (one space between words is allowed, of course), tab chars, vertical tab chars, NUL byte chars and finally excess newline chars (1 or 2 newline chars in a row are allowed, but any beyond 2 should be removed).
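
    A minimal sketch of a function along those lines. The name clean_whitespace() is made up for illustration, and two things are assumptions rather than part of the request: tabs and vertical tabs are turned into a single space instead of being deleted outright (so that two words separated only by a tab do not run together), and \r\n line endings are normalised to \n before the newline runs are counted. No /u modifier should be needed for UTF-8 input, since the bytes matched here never occur inside a multibyte UTF-8 sequence.

```php
<?php
// Hypothetical helper, not from the thread: collapses whitespace as described above.
function clean_whitespace(string $text): string
{
    // Normalise Windows (\r\n) and lone \r line endings to \n (assumption).
    $text = str_replace(["\r\n", "\r"], "\n", $text);

    // Drop NUL bytes outright.
    $text = str_replace("\0", '', $text);

    // Collapse each run of spaces, tabs and vertical tabs into one space.
    $text = preg_replace('/[ \t\x0B]+/', ' ', $text);

    // Remove the single space that may be left at the start or end of a line.
    $text = preg_replace('/^ | $/m', '', $text);

    // Allow at most two newlines in a row.
    $text = preg_replace('/\n{3,}/', "\n\n", $text);

    // Finally trim both ends of the whole string, as trim() does.
    return trim($text);
}
```

    Called on "  Hello \t world\r\n\r\n\r\n\r\nBye  ", this should return "Hello world", two newlines, then "Bye". The line-edge step runs before the newline step on purpose, so that lines containing only spaces count as empty when the newline runs are shortened.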

