  1. #1
    SitePoint Wizard DoubleDee's Avatar
    Join Date
    Aug 2010
    Location
    Arizona
    Posts
    3,931
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)

    When to use HTMLENTITIES

    When are you supposed to use code like this...
    PHP Code:
    htmlentities($name, ENT_QUOTES);
    Looking back at my recent code, I think I forgot to put it in some places, and now I'm confused about when and where to use it!


    Debbie

  2. #2
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Join Date
    Jul 2009
    Posts
    1,314
    Mentioned
    19 Post(s)
    Tagged
    1 Thread(s)
    You'd use it just before you echo content for an HTML page.

  3. #3
    SitePoint Wizard DoubleDee's Avatar
    Quote Originally Posted by Jeff Mott View Post
    You'd use it just before you echo content for an HTML page.
    So I just do it before displaying it, but not necessarily while I am working with variables from my database or in general?

    --------
    Also, what happens if a variable/field is Blank/Empty/Null with respect to HTMLENTITIES?

    Do I need to always use something like this...
    PHP Code:
    $answerEnt = (isset($name) ? htmlentities($name, ENT_QUOTES) : '');
    Thanks,


    Debbie

  4. #4
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Quote Originally Posted by DoubleDee View Post
    So I just do it before displaying it, but not necessarily while I am working with variables from my database or in general?
    Correct. The main reason is that there may be several ways that you present your data. HTML is certainly the most common, but you might also present your data as JSON, or you might use it to compose an email, or for a private administrative or reporting task. Only the presentation layer of your application should know the specifics of how the data is being rendered, so only the presentation layer should handle escaping.
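
    To make that concrete, here is a minimal sketch of the same raw value rendered two ways. The sample value and variable names are made up for illustration, and the explicit 'UTF-8' charset is an assumption about your page encoding:

```php
<?php
// Raw value, stored and passed around unescaped (hypothetical sample data).
$name = 'Tom "T-Bone" <O\'Neil>';

// HTML presentation: escape for HTML at the moment the markup is built.
$html = '<p>' . htmlentities($name, ENT_QUOTES, 'UTF-8') . '</p>';

// JSON presentation: json_encode() applies JSON's own escaping rules,
// so htmlentities() is not used here.
$json = json_encode(['name' => $name]);

echo $html, "\n";
echo $json, "\n";
```

    If the value had been HTML-escaped before it was stored, the JSON output would carry entities such as &quot; that a JSON consumer has no use for.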

    Quote Originally Posted by DoubleDee View Post
    Also, what happens if a variable/field is Blank/Empty/Null with respect to HTMLENTITIES?
    I didn't have the answer to that off the top of my head, so I double checked the docs, and it made me think that you would get back an empty string. I also ran a quick test script and confirmed that behavior.

    $x = null;
    $y = htmlentities($x);
    var_dump($y);
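
    One caveat, as far as I know: on PHP 8.1 and later, passing null to htmlentities() still returns an empty string but also raises a deprecation notice. A small sketch of one way around that, coalescing null to an empty string first (the 'UTF-8' charset is an assumption):

```php
<?php
$x = null;

// Turn null into '' before escaping, so htmlentities() always gets a string.
$y = htmlentities($x ?? '', ENT_QUOTES, 'UTF-8');

var_dump($y); // string(0) ""
```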

  5. #5
    SitePoint Wizard DoubleDee's Avatar
    Quote Originally Posted by Jeff Mott View Post
    Correct. The main reason is that there may be several ways that you present your data. HTML is certainly the most common, but you might also present your data as JSON, or you might use it to compose an email, or for a private administrative or reporting task. Only the presentation layer of your application should know the specifics of how the data is being rendered, so only the presentation layer should handle escaping.



    I didn't have the answer to that off the top of my head, so I double checked the docs, and it made me think that you would get back an empty string. I also ran a quick test script and confirmed that behavior.

    $x = null;
    $y = htmlentities($x);
    var_dump($y);
    Jeff,

    Sorry for the late reply.

    So let me ask this...

    I'm trying to be a "good girl" and use htmlentities($variable, ENT_QUOTES) on all of my outputted variables, but I honestly find that a DRAG from both a repetitiveness standpoint and from a Code Prettiness standpoint.

    Isn't there a way to do something like this before outputting things...

    PHP Code:
    function createSafeOutput($x){
        $safe = htmlentities($x, ENT_QUOTES);

        return $safe;
    }

    $firstname = createSafeOutput($firstName);
    $address = createSafeOutput($address);
    and so on...


    Debbie

  6. #6
    SitePoint Wizard bronze trophy Jeff Mott's Avatar
    Quote Originally Posted by DoubleDee View Post
    I'm trying to be a "good girl" and use htmlentities($variable, ENT_QUOTES) on all of my outputted variables, but I honestly find that a DRAG from both a repetitiveness standpoint and from a Code Prettiness standpoint.

    Isn't there a way to do something like this before outputting things...
    In fact that's a very good and smart change, and the way you wrote it is just fine too. The only change I might make would be to rename the function escapeHTML, because I think that would be a bit more clear and specific about what it does.
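
    For illustration, a sketch of what the renamed wrapper might look like in use. The type hints, the explicit 'UTF-8' charset, and the sample values are my own assumptions, not something your code needs to match exactly:

```php
<?php
// Wrapper around htmlentities(), named escapeHTML as suggested above.
// The 'UTF-8' charset is an assumption about the page encoding.
function escapeHTML(string $text): string
{
    return htmlentities($text, ENT_QUOTES, 'UTF-8');
}

// Made-up sample values standing in for data loaded elsewhere.
$firstName = 'Debbie <script>';
$address   = '12 "Main" St & 3rd';

// Escape at the point of output, leaving the original variables untouched.
echo '<p>', escapeHTML($firstName), '</p>', "\n";
echo '<p>', escapeHTML($address), '</p>', "\n";
```

    Calling the function right where you echo, instead of overwriting the variables, keeps the raw values available for any non-HTML use.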

  7. #7
    SitePoint Wizard DoubleDee's Avatar
    Quote Originally Posted by Jeff Mott View Post
    In fact that's a very good and smart change, and the way you wrote it is just fine too. The only change I might make would be to rename the function escapeHTML, because I think that would be a bit more clear and specific about what it does.
    Jeff, I have no problems renaming things. (I just threw that name together on a whim last night.)

    So, I have one vote "Yes" for my proposal above.

    Are there some more PHP gurus out there who would like to chime in and let me know if they think my proposal is a good or bad idea?

    Thanks,


    Debbie

  8. #8
    Hosting Team Leader silver trophybronze trophy
    cpradio's Avatar
    Join Date
    Jun 2002
    Location
    Ohio
    Posts
    5,234
    Mentioned
    154 Post(s)
    Tagged
    0 Thread(s)
    Debbie,

    It is definitely a good thing. I usually abstract my data escaping like you outlined as well, simply because if I need to alter it in the future, I would rather update one location instead of thousands.

  9. #9
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,191
    Mentioned
    17 Post(s)
    Tagged
    4 Thread(s)
    So long as the data entry forms do not allow HTML, you're good with that approach. Things become about 1000 times more complex once you allow HTML or an abstraction of it like BBCode. The most secure method, though, will be stripping HTML on user input, or going as far as making "no HTML" a validation requirement before allowing data entry. If you do those things, which are probably more reliable anyway, entity conversion matters little apart from producing valid HTML. Not that valid HTML isn't important, but just about all modern browsers are forgiving with the common character entity conversion cases.
    The only code I hate more than my own is everyone else's.

  10. #10
    SitePoint Addict kduv's Avatar
    Join Date
    May 2012
    Location
    Atlanta, GA
    Posts
    244
    Mentioned
    5 Post(s)
    Tagged
    0 Thread(s)
    +1 for validating data on input. Ideally, you would validate/sanitize any data from an external source, whether it be from a user, an external API, an XML file from another site, or whatever. If you validate/sanitize all external data before you work with it, then you don't need to be "as" worried about how you display it.

    Don't get me wrong, you still should be mindful of how you output data but it's always a good idea to sanitize input before working with it.
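
    As a rough sketch of what validating external data can look like, here is PHP's built-in filter_var() applied to a couple of fields. The field names, sample values, and age range are invented for the example:

```php
<?php
// Hypothetical external input, e.g. from $_POST or an API response.
$input = ['email' => 'debbie@example.com', 'age' => '42x'];

// filter_var() returns the validated value on success, or false on failure.
$email = filter_var($input['email'], FILTER_VALIDATE_EMAIL);
$age   = filter_var(
    $input['age'],
    FILTER_VALIDATE_INT,
    ['options' => ['min_range' => 1, 'max_range' => 120]]
);

$errors = [];
if ($email === false) {
    $errors['email'] = 'Enter a valid email address.';
}
if ($age === false) {
    $errors['age'] = 'Age must be a whole number from 1 to 120.';
}

var_dump($email, $age, $errors);
```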

  11. #11
    SitePoint Wizard DoubleDee's Avatar
    Quote Originally Posted by oddz View Post
    So long as the data entry forms do not allow HTML, you're good with that approach. Things become about 1000 times more complex once you allow HTML or an abstraction of it like BBCode. The most secure method, though, will be stripping HTML on user input, or going as far as making "no HTML" a validation requirement before allowing data entry. If you do those things, which are probably more reliable anyway, entity conversion matters little apart from producing valid HTML. Not that valid HTML isn't important, but just about all modern browsers are forgiving with the common character entity conversion cases.
    And how would you go about "sanitizing" Form data?

    In my "create_account.php", I do this...
    PHP Code:
    // ************************
    // Validate Form Data.    *
    // ************************

    // Check First Name.
    if (empty($trimmed['firstName'])){
        // No First Name.
        $errors['firstName'] = 'Enter your First Name.';
    }else{
        // First Name Exists.
        if (preg_match('#^[A-Z \'.-]{2,30}$#i', $trimmed['firstName'])){
            // Valid First Name.
            $firstName = $trimmed['firstName'];
        }else{
            // Invalid First Name.
            $errors['firstName'] = 'First Name must be 2-30 characters (A-Z \' . -)';
        }
    }//End of CHECK FIRST NAME


    // Check Username.
    if (empty($trimmed['username'])){
        // No Username.
        $errors['username'] = 'Enter your Username.';
    }else{
        // Username Exists.

        // ************************
        // Check Username Format. *
        // ************************
        if (preg_match('~(?x)                # Comments Mode
                    ^                # Beginning of String Anchor
                    (?=.{8,30}$)        # Ensure Length is 8-30 Characters
                    [a-z0-9_.-]*        # Match only certain Characters
                    $                # End of String Anchor
                    ~i',
                    $trimmed['username'])){
            // Valid Username.

            // Check Username Availability.

        }else{
            // Invalid Username.
            $errors['username'] = 'Username must be 8-30 characters (A-Z 0-9 _ - .)';
        }//End of CHECK USERNAME FORMAT
    }//End of CHECK USERNAME

    Debbie

  12. #12
    SitePoint Addict kduv's Avatar
    Those two methods of filtering name/username work. If they have passed that test, it's probably pretty safe to display firstname and username without additional filtering.

  13. #13
    SitePoint Wizard bronze trophy
    Join Date
    Jul 2006
    Location
    Augusta, Georgia, United States
    Posts
    4,191
    Mentioned
    17 Post(s)
    Tagged
    4 Thread(s)
    Quote Originally Posted by DoubleDee
    And how would you go about "sanitizing" Form data?
    When NO HTML is allowed, everything can be as simple as comparing the user-supplied value to the same value passed through strip_tags. When the two are not equal, the input contains HTML, in which case cancel form processing and show a message to the user. That is probably the simplest method, though such a simple approach does have its pitfalls, like the false positives discussed in the PHP docs. In probably 90% of cases, though, the simple approach with tag stripping will be adequate when no HTML or abstraction of it is allowed. What can be done is to wrap the strip_tags call in another function so it can be extended as you run into edge cases with false positives or exceptions.
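
    A minimal sketch of that comparison, wrapped in a function so it can be extended later. The function name, field name, and sample inputs are placeholders I made up:

```php
<?php
// Hypothetical helper: true when strip_tags() would change the input,
// which is treated here as "the input contains HTML".
function containsHtml(string $value): bool
{
    return strip_tags($value) !== $value;
}

$errors = [];
$comment = '<b>Hello</b> there'; // stand-in for a submitted form field

if (containsHtml($comment)) {
    // Cancel processing and report back to the user.
    $errors['comment'] = 'HTML is not allowed in this field.';
}

var_dump(containsHtml('Debbie'));        // bool(false)
var_dump(containsHtml('<b>Debbie</b>')); // bool(true)

// A '<' followed directly by text can also be flagged even though it is
// not markup: the kind of false positive mentioned above.
var_dump(containsHtml('1 <3 cats'));
```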

