  1. #1
    SitePoint Wizard Wolf_22's Avatar
    Join Date
    Jul 2005
    Posts
    1,700
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Question Escape function...

    Is it a good idea to write a function that can escape any type of input passed into it? I understand that various PHP-native functions exist that do things like this, such as addslashes(), htmlentities(), and strip_tags(), but I was thinking (possibly naively) of a single function that takes an input item (e.g. a file or a variable), escapes everything it contains, and returns it.

    Is this silly? I find it hard to believe that something like this hasn't been done before, but ironically, I'm having a hard time finding one like the one I envision that can accept anything from a string, file, or array and return the fully-escaped data for use.

    Anyone know of a script or function that does this?

  2. #2
    Theoretical Physics Student bronze trophy Jake Arkinstall's Avatar
    Join Date
    May 2006
    Location
    Lancaster University, UK
    Posts
    7,062
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    The problem there is that escaping can be very different depending on what you're escaping it for.

    For example, escaping it for database input could involve mysql_real_escape_string(). Escaping it for output to the browser may involve htmlentities and strip_tags, and escaping it from jail would involve a crowbar. (Please excuse, or escape if you will, that comment; my sense of humour is terrible at this time of night.)

    I suppose the first stage of making something like this, in that case, is with a class. Allow it to accept a string variable and include different functions for different kinds of escaping. Then extend that class to accept different types of variables.

    Then you can simply write a function that will route the given variable to the correct object and run the correct method to return the data you want.
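
    A minimal sketch of the class idea described above, assuming a modern PHP version. The class name ContextEscaper and its method names are hypothetical, not an existing library. Database values are left out on purpose, since escaping them needs a live connection.

```php
<?php
// Hypothetical helper: one method per output context instead of one
// "universal" escape function. All names here are illustrative.
final class ContextEscaper
{
    // For text placed inside HTML element content or quoted attributes.
    public static function html(string $value): string
    {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }

    // For a value embedded as a JavaScript literal inside a <script> block.
    public static function js($value): string
    {
        $flags = JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT;
        return json_encode($value, $flags | JSON_THROW_ON_ERROR);
    }

    // For a single URL component (a path segment or a query value).
    public static function url(string $value): string
    {
        return rawurlencode($value);
    }
}

$name = '<b>Tom & "Jerry"</b>';
echo ContextEscaper::html($name), "\n";
echo ContextEscaper::js($name), "\n";
echo ContextEscaper::url($name), "\n";
```

    The same input produces three different results, which is why the caller has to say which context the value is going into.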
    Jake Arkinstall
    "Sometimes you don't need to reinvent the wheel;
    Sometimes its enough to make that wheel more rounded"-Molona

  3. #3
    SitePoint Wizard Wolf_22's Avatar
    Join Date
    Jul 2005
    Posts
    1,700
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Jake Arkinstall View Post
    The problem there is that escaping can be very different depending on what you're escaping it for.

    For example, escaping it for database input could involve mysql_real_escape_string(). Escaping it for output to the browser may involve htmlentities and strip_tags, and escaping it from jail would involve a crowbar. (Please excuse, or escape if you will, that comment; my sense of humour is terrible at this time of night.)

    I suppose the first stage of making something like this, in that case, is with a class. Allow it to accept a string variable and include different functions for different kinds of escaping. Then extend that class to accept different types of variables.

    Then you can simply write a function that will route the given variable to the correct object and run the correct method to return the data you want.
    You say write a class with different methods, but to my simple mind, it seems as if the same effect could be achieved using a single function with a few different arguments.

    For example (generic, I know):

    Code:
    function some_escape_function($input, $type) {
        if ($type == 'database') {
            // escape for the database here
        } elseif ($type == 'whatever') {
            // escape for some other context here
        }
        // and so on...
    }
    Why do you prefer a class? What am I missing here?

    What you say makes perfect sense concerning what it's escaped for, but with that being said, I could see data being escaped for the following "facets":

    1.) Input
    2.) Database
    3.) Output

    Am I being too narrow-minded about this?

  4. #4
    SitePoint Guru risoknop's Avatar
    Join Date
    Feb 2008
    Location
    end($world)
    Posts
    834
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I don't understand the benefits of having a function like this. Different situations require different escaping, and usually escaping at different levels.

    For example, if you are using MVC, you would usually escape database variables in models, view variables in templates, and probably some other variables in controllers. So you would have to access this function at three different levels of the application, which would usually lead to having three identical helpers.

    1.) Input
    2.) Database
    3.) Output
    How would you escape inputs, for instance? Not all inputs need escaping (the majority don't), and those that do might need different ways of escaping.

    Output would use htmlentities() and the database mysql_real_escape_string(); that much is clear. But inputs?

  5. #5
    SitePoint Enthusiast
    Join Date
    Jan 2010
    Posts
    61
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    In general, I use this:

    Code PHP:
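    function mysql_prep($string) {
    	// Note: get_magic_quotes_gpc() and the mysql_* functions only exist in old PHP versions.
    	$magic_quotes_on = get_magic_quotes_gpc();
    	$php_is_recent = function_exists("mysql_real_escape_string");
    	if ($php_is_recent == true) {
    		// Undo the slashes magic quotes added, then escape properly for MySQL.
    		if ($magic_quotes_on) { $string = stripslashes($string); }
    		$string = mysql_real_escape_string($string);
    	}
    	else {
    		// Fallback: add slashes by hand unless magic quotes already did.
    		if (!$magic_quotes_on) { $string = addslashes($string); }
    	}
    	return $string;
    }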

  6. #6
    SitePoint Wizard Wolf_22's Avatar
    Join Date
    Jul 2005
    Posts
    1,700
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by risoknop View Post
    I don't understand the benefits of having a function like this. Different situations require different escaping, and usually escaping at different levels.

    For example, if you are using MVC, you would usually escape database variables in models, view variables in templates, and probably some other variables in controllers. So you would have to access this function at three different levels of the application, which would usually lead to having three identical helpers.

    How would you escape inputs, for instance? Not all inputs need escaping (the majority don't), and those that do might need different ways of escaping.

    Output would use htmlentities() and the database mysql_real_escape_string(); that much is clear. But inputs?
    I think I made a mental mistake, not only in using the word "escape" but also in relating it to something a bit different. To me, at least when I posted this, it seemed logical to think that "escape" meant "make safe". In this light, the input would be made safe at the form and so on, while the data going into the database and coming out for output would be made safe using the functions you and I referenced (such as htmlentities, mysql_real... and so on). Again, I'm probably being narrow-minded here, but it is what it is.

    I guess I was just hoping for one single function that could do all this escaping / sanitization / filtering for me, and deep down I was hoping I could make it myself, but as usual, it seems my ideas got the best of moi.

  7. #7
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    A "universal" escape function would be base64!

    Kekekeke.

    (Assuming that the letters of the alphabet in ASCII are acceptable.)

  8. #8
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    alexp91k, yours is the wrong way.
    You shouldn't use addslashes for MySQL, and there must be no magic quotes stuff in this function.

    Wolf_22, look.
    Washing hands, using condoms and keeping your wallet deep in your pocket are all for safety. Does a thoroughly washed wallet, wrapped in a condom and put into an inner pocket, make you safe from anything?
    That's your "universal" function.

  9. #9
    Theoretical Physics Student bronze trophy Jake Arkinstall's Avatar
    Join Date
    May 2006
    Location
    Lancaster University, UK
    Posts
    7,062
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    Actually, Shrapnel, I think Alex's function is perfectly good.

    It checks whether mysql_real_escape_string is available. If it isn't, the only real alternative is addslashes, so it falls back to that.

    And as for the magic_quotes, it simply removes them if they were put on.

    Of course, this function is purely for form input - otherwise I'd agree with your point about magic_quotes.

    @Wolf - the reason I suggested using a few classes rather than just one function is more tidiness and organisation than anything else.

  10. #10
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    And as for the magic_quotes, it simply removes them if they were put on.
    1. They should be removed much earlier. There may be no MySQL actions at all, but magic quotes must be removed anyway.
    2. A database quoting function should not take magic quotes into account, because the data may never have been touched by magic quotes, e.g. when you're reading it from a file. So this function can strip the wrong slashes.
    Of course, this function is purely for form input
    That's one of the most terrible misbeliefs.
    It shouldn't be for form input by any means. MySQL functions should be used for MySQL only.

  11. #11
    Theoretical Physics Student bronze trophy Jake Arkinstall's Avatar
    Join Date
    May 2006
    Location
    Lancaster University, UK
    Posts
    7,062
    Mentioned
    2 Post(s)
    Tagged
    0 Thread(s)
    If you reread my post you'll see I was referring to magic quotes.

    Well, not just form input but anything that can be modified by the user: cookies, POST and GET.

  12. #12
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It doesn't matter.
    I've just explained why Alex's function is perfectly wrong.
    There must be two separate functions, totally independent of each other.
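
    A rough sketch of that two-function split. The function names undo_magic_quotes and quote_for_sql are hypothetical, and PDO::quote() stands in for mysql_real_escape_string() as an assumption, since the mysql_* functions are gone from current PHP. The magic quotes setting is passed in as a flag because the feature itself was removed in PHP 5.4.

```php
<?php
// Step 1 (input boundary): undo magic-quotes slashes once, as early as
// possible, whether or not a database is ever touched.
function undo_magic_quotes($value, bool $magicQuotesOn)
{
    if (!$magicQuotesOn) {
        return $value;
    }
    if (is_array($value)) {
        $result = [];
        foreach ($value as $key => $item) {
            $result[$key] = undo_magic_quotes($item, true);
        }
        return $result;
    }
    return is_string($value) ? stripslashes($value) : $value;
}

// Step 2 (database boundary): quote a value for SQL. It knows nothing about
// magic quotes, so data read from a file is handled like any other data.
function quote_for_sql(PDO $pdo, string $value): string
{
    return $pdo->quote($value);
}

// Simulated request data as magic quotes would have delivered it.
$post = ['name' => "O\\'Reilly", 'tags' => ["a\\\\b"]];
$clean = undo_magic_quotes($post, true);
```

    Neither function calls the other, so a value read from a file can go straight to quote_for_sql() without any slashes being stripped by mistake.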

  13. #13
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Shrapnel_N5 View Post
    Wolf_22, look.
    Washing hands, using condoms and keeping your wallet deep in your pocket are all for safety. Does a thoroughly washed wallet, wrapped in a condom and put into an inner pocket, make you safe from anything?
    That's your "universal" function.
    That analogy makes absolutely no sense. Your wallet will still be safe. The analogy also is completely irrelevant.

    You can make a "universal" "escaping" function. If you are only working with MySQL, HTML, and JavaScript, then yes, you can chain the individual escaping operations in a certain order that will make it safe for all cases. Primarily it depends on whether the sets of allowable characters are subsets of the allowable characters of the parent coding. The issue is that you cannot make a universal unescaping function, because you do not know how many levels to unescape, unless you are in full control of the end systems that will process the unescaped data. If you do not unescape, then your output is no longer the same as the input, and further escaping will create further layers of escaping.

    As an extreme case, base64 encoding a character sequence will make it safe for the majority of systems that you may come across. It would be that holy grail of a make-safe algorithm that you are looking for.
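
    To illustrate the layering problem and the base64 case, here is a small sketch. The sample string is made up; the functions are standard PHP.

```php
<?php
$raw = "Tom & Jerry's <b>\"show\"</b>";

// Escaping twice adds a second layer: a single decode step no longer
// gets back to the original input.
$once  = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');
$decodedOnce = htmlspecialchars_decode($twice, ENT_QUOTES);

// base64 maps arbitrary bytes onto A-Z, a-z, 0-9, "+", "/" and "=",
// and decoding restores the original exactly.
$encoded  = base64_encode($raw);
$restored = base64_decode($encoded, true);

var_dump($decodedOnce === $once); // one layer removed, one still left
var_dump($restored === $raw);     // base64 round-trips
```

    Whoever decodes has to know how many layers were applied, which is the reason a universal unescaping function cannot exist.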

  14. #14
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Hehe, I should have developed a more relevant analogy to satisfy such a thorough investigation, my bad.
    Yes, my wallet's safe, but I'm not safe in the other ways.
    If I am only working with MySQL, HTML, and JavaScript. But I am not.

    Your ideal escape algorithm reminds me of the ideal compression algorithm: MD5.
    Unfortunately, you cannot undo strip_tags, especially if those tags were meant to format your own article posted to the site.

    You're good with theoretical matters but what's your practical recommendation?

  15. #15
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    (1) If you just want to store data safely without caring about using it, then base64 it is!
    (2) If you do care about using your data again, then there is no universal escaping function.

  16. #16
    SitePoint Enthusiast
    Join Date
    Sep 2008
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    http://php.net/manual/fr/book.filter.php

    If you have access to an up-to-date PHP version, this should help you.
    As for mysql_real_escape_string: it is unnecessary if you use parameterized queries (and you should).
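
    A minimal sketch of a parameterized query with PDO. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database in place of MySQL so that it runs on its own; the table and column names are made up.

```php
<?php
// Assumption: pdo_sqlite is installed. With MySQL only the DSN would differ.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');

$name = "O'Reilly"; // the quote would break a naively concatenated query

// The value travels separately from the SQL text, so no manual escaping
// is needed.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->execute([':name' => $name]);

$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute([':name' => $name]);
$found = $select->fetchColumn();
```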

  17. #17
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Filtering is another world.
    One should never mix filtering, database escaping and XSS safety in their mind.

    Also, I have to state that a function by itself will never help anyone. Only understanding does.

  18. #18
    SitePoint Wizard
    Join Date
    Mar 2008
    Posts
    1,149
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Filtering gets rid of data that you do not want, whereas escaping keeps the integrity of the data.
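
    A short sketch of that difference, using strip_tags() as the filter and htmlspecialchars() as the escape. The sample comment is made up.

```php
<?php
$comment = 'I <b>love</b> PHP & MySQL';

// Filtering: strip_tags() throws the markup away; it cannot be restored.
$filtered = strip_tags($comment);

// Escaping: htmlspecialchars() keeps every character, encoded for HTML,
// and htmlspecialchars_decode() gets the original back.
$escaped   = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');
$roundTrip = htmlspecialchars_decode($escaped, ENT_QUOTES);

echo $filtered, "\n";
echo $escaped, "\n";
```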

  19. #19
    SitePoint Enthusiast
    Join Date
    Sep 2008
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Wait, the php "filter" functions can be used for filtering and validating, in case you don't know them.
    http://www.php.net/manual/en/filter....s.validate.php
    http://www.php.net/manual/en/filter....s.sanitize.php

  20. #20
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Another bunch of useless functions, doubling existing ones.
    I wish I knew why I should use something like
    filter_var($a, FILTER_SANITIZE_SPECIAL_CHARS);
    instead of
    htmlspecialchars($a);

  21. #21
    SitePoint Enthusiast
    Join Date
    Sep 2008
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Because of these kinds of mass sanitizers: http://www.php.net/manual/en/functio...-var-array.php and http://www.php.net/manual/en/functio...nput-array.php

    And to be able to use the mail filters instead of yet another broken regexp.
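
    For instance, a sketch of filter_var_array() applied to a whole set of fields at once. The field names and limits are made up for illustration.

```php
<?php
// Made-up request data; with a real form, filter_input_array(INPUT_POST, $spec)
// would read the values directly.
$input = [
    'age'   => '42',
    'email' => 'not-an-email',
    'name'  => '<b>Ann</b>',
];

$spec = [
    'age' => [
        'filter'  => FILTER_VALIDATE_INT,
        'options' => ['min_range' => 0, 'max_range' => 150],
    ],
    'email' => FILTER_VALIDATE_EMAIL,
    'name'  => FILTER_SANITIZE_SPECIAL_CHARS,
];

// Validators return the converted value or false; sanitizers return a
// modified string.
$clean = filter_var_array($input, $spec);
var_dump($clean);
```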

  22. #22
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    I can understand email validation, but I can't understand "sanitizing" email
    I would never "sanitize" email by removing any character of it.
    I think it makes no sense.
    Another example please.

  23. #23
    SitePoint Enthusiast
    Join Date
    Sep 2008
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Shrapnel_N5 View Post
    I can understand email validation, but I can't understand "sanitizing" email
    I would never "sanitize" email by removing any character of it.
    I think it makes no sense.
    Mail header injection, anyone? If some character cannot, by the RFC, be part of an email address, it should be deleted. But yeah, these filters are usually used more for validating than sanitizing.
    I know, using the same kind of functions for two different things is evil to you. Well, I think the names of the filters are enough (and the all caps are like a blinking sign in the night) to see the difference.

    Quote Originally Posted by Shrapnel_N5 View Post
    Another example please.
    Check the filter documentation for some examples.

  24. #24
    Non-Member
    Join Date
    Oct 2009
    Posts
    1,852
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If some character can not, by the RFC, be part of an email address it should be deleted.
    Nope. Whole address should be rejected.
    Check the filter documentation for some examples.
    Unfortunately, it has the same useless email example.

  25. #25
    SitePoint Enthusiast
    Join Date
    Sep 2008
    Posts
    68
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by Shrapnel_N5 View Post
    Nope. Whole address should be rejected.
    Then use a validating filter instead of a sanitizing one. I even encourage you to bring the danger of FILTER_SANITIZE_EMAIL to the PHP team's attention.
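
    A small sketch of the two filters side by side. The helper accept_email is a hypothetical name, and the sample address is only an illustration.

```php
<?php
$candidate = '(bogus@example.org)';

// Validation answers yes or no: the address comes back unchanged, or false.
$validated = filter_var($candidate, FILTER_VALIDATE_EMAIL);

// Sanitization silently removes disallowed characters, so the result is a
// different address from the one that was typed.
$sanitized = filter_var($candidate, FILTER_SANITIZE_EMAIL);

// Rejecting the whole address instead of repairing it.
function accept_email(string $input): ?string
{
    $result = filter_var($input, FILTER_VALIDATE_EMAIL);
    return $result === false ? null : $result;
}

var_dump($validated, $sanitized);
```

    With accept_email(), an address carrying injected header lines is refused outright and never reaches the mail function.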

    Quote Originally Posted by Shrapnel_N5 View Post
    Unfortunately, there are same useless email example
    http://fr.php.net/manual/en/function...array.examples
    There are examples (even some user examples) for almost all functions. You should be able to find something useful. If not, just continue using stripslashes, htmlentities, mysql_real_escape_string, and some preg_match.

