Escape function

Is it a good idea to write a function that can escape any type of input passed into it? I understand that various PHP-native functions exist that do things like this, such as addslashes(), htmlentities(), and strip_tags(), but I was thinking (possibly naively) of having a single function that is given an input item (e.g. a file or variable), escapes everything it contains, and returns it.

Is this silly? I find it hard to believe that something like this hasn’t been done before, but ironically, I’m having a hard time finding one like the one I envision that can accept anything from a string, file, or array and return the fully-escaped data for use.

Anyone know of a script or function that does this?

The problem there is that escaping can be very different depending on what you’re escaping it for.

For example, escaping it for database input could involve mysql_real_escape_string(). Escaping it for output to the browser may involve htmlentities() and strip_tags(), and escaping it from jail would involve a crowbar. (Please excuse (or escape, if you will) that comment; my sense of humour is terrible at this time of night :p)
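To illustrate how much the result depends on the destination, here is a minimal sketch that escapes one value for three contexts. The sample string and variable names are made up for illustration; the database case is left out because it needs a live connection.

```php
<?php
// One raw value, three destinations, three different escaped forms.
$raw = 'Tom & "Jerry" <b>';

// For HTML output: encode the characters that are special in markup.
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// For a URL query string: percent-encode everything that is not unreserved.
$forUrl = rawurlencode($raw);

// For a JSON/JavaScript string literal: let json_encode() add quotes and backslashes.
$forJson = json_encode($raw);

echo $forHtml, "\n", $forUrl, "\n", $forJson, "\n";
```

None of the three results is valid in the other two contexts, which is why the caller has to say what the value is being escaped for.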

I suppose the first stage of making something like this, in that case, is with a class. Allow it to accept a string variable and include different functions for different kinds of escaping. Then extend that class to accept different types of variables.

Then you can simply write a function that will route the given variable to the correct object and run the correct method to return the data you want.
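A rough sketch of that layout might look like the following. The class names (StringEscaper, ArrayEscaper) and the routing function escape_for() are hypothetical, and only two contexts are covered to keep it short.

```php
<?php
// Hypothetical base class: one method per output context, strings only.
class StringEscaper
{
    public function forHtml($value)
    {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }

    public function forUrl($value)
    {
        return rawurlencode($value);
    }
}

// Hypothetical extension: the same contexts applied to each element of an array.
class ArrayEscaper extends StringEscaper
{
    public function forHtml($values)
    {
        $out = array();
        foreach ($values as $key => $item) {
            $out[$key] = parent::forHtml($item);
        }
        return $out;
    }

    public function forUrl($values)
    {
        $out = array();
        foreach ($values as $key => $item) {
            $out[$key] = parent::forUrl($item);
        }
        return $out;
    }
}

// Routing function: pick the object by variable type, the method by context.
function escape_for($value, $context)
{
    $escaper = is_array($value) ? new ArrayEscaper() : new StringEscaper();
    $method = 'for' . ucfirst($context);
    if (!method_exists($escaper, $method)) {
        throw new InvalidArgumentException("Unknown context: $context");
    }
    return $escaper->$method($value);
}

echo escape_for('<b>', 'html'), "\n";
```

The caller still has to name the context, so this organises the escaping functions rather than making the choice for you.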

You say write a class with different methods, but to my simple mind, it seems as if the same effect could be achieved using a single function with a few different arguments.

For example (generic, I know):

function some_escape_function($input, $type) {
    if ($type == 'database') {
        // escape for the database here
    } elseif ($type == 'whatever') {
        // escape for some other context here
    }
    // and so on...
}

Why do you prefer a class? What am I missing here?

What you say makes perfect sense concerning what it’s escaped for, but with that being said, I could see data being escaped for the following “facets”:

1.) Input
2.) Database
3.) Output

Am I being too narrow-minded about this?

I don’t understand the benefits of having a function like this. Different situations require different escaping, and usually escaping at different levels.

For example, if you are using MVC, you would usually escape database variables in models, view variables in templates, and probably some other variables in controllers. So you would have to access this function at three different levels of the application, which would usually lead to having three identical helpers.

1.) Input
2.) Database
3.) Output

How would you escape inputs, for instance? Not all inputs need escaping (the majority don’t), and those that do might need different ways of escaping.

Output would use htmlentities() and the database would use mysql_real_escape_string(); that much is clear. But inputs?

In general, I use this:

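function mysql_prep($string) {
	// Was the incoming data already slashed by magic quotes?
	$magic_quotes_on = get_magic_quotes_gpc();
	// Is mysql_real_escape_string() available in this PHP build?
	$php_is_recent = function_exists("mysql_real_escape_string");
	if ($php_is_recent) {
		// Undo magic quotes first so the value is not escaped twice
		if ($magic_quotes_on) { $string = stripslashes($string); }
		$string = mysql_real_escape_string($string);
	}
	else {
		// Otherwise fall back to addslashes(), unless magic quotes already did it
		if (!$magic_quotes_on) { $string = addslashes($string); }
	}
	return $string;
}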

I think I made the mental mistake of not only using the word “escape” but also relating it to something a bit different. To me, at least when I posted this, it seemed logical to think that “escape” meant “make safe”. In this light, the input would be made safe at the form, while the data going into the database and coming out for output would be made safe using the functions we referenced (such as htmlentities, mysql_real… and so on). Again, I’m probably being narrow-minded here, but it is what it is. :)

I guess I was just hoping for one single function that could do all this escaping / sanitization / filtering for me, and deep down I was hoping I could make it, but as usual, it seems my ideas got the better of moi.

A “universal” escape function would be base64!

Kekekeke.

(Assuming that the letters of the alphabet in ASCII are acceptable.)

alexp91k, yours is the wrong way.
You shouldn’t use addslashes() for MySQL, and there must be no magic quotes handling in this function.

Wolf_22, look.
Washing hands, using condoms and keeping your wallet deep in your pocket are all for safety. Does a thoroughly washed wallet, wrapped in a condom and put into an inner pocket, make you safe from anything?
That’s your “universal” function :)

Actually, Shrapnel, I think Alex’s function is perfectly good.

It checks whether mysql_real_escape_string() is available. If it isn’t, the only real alternative is addslashes(), so it falls back to that.

And as for the magic_quotes, it simply removes them if they were put on.

Of course, this function is purely for form input - otherwise I’d agree with your point about magic_quotes.

@Wolf - the reason I suggested using a few classes rather than just one function is more tidiness and organisation than anything else.

And as for the magic_quotes, it simply removes them if they were put on.

  1. They should be removed much earlier. A request may involve no MySQL actions at all, but magic quotes must be removed anyway.
  2. A database quoting function should not take magic quotes into account, because the data may never have been touched by magic quotes, e.g. when you are reading it from a file. So this function can strip the wrong slashes.
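The second point can be seen in a small sketch. The file path is a made-up example of data that never went through magic quotes; stripslashes() stands in for what the quoting function does whenever magic quotes are switched on.

```php
<?php
// Data that never passed through magic quotes, e.g. a line read from a file.
// In a single-quoted string, \t and \r are literal backslash sequences.
$fromFile = 'C:\temp\report.txt';

// A database helper that "undoes" magic quotes on everything damages it:
$damaged = stripslashes($fromFile);

echo $damaged, "\n"; // the backslashes that belonged to the data are gone
```

Keeping the magic-quotes cleanup at the point where request data enters the script, separate from database quoting, avoids this.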

Of course, this function is purely for form input

That’s one of the most terrible misbeliefs.
It shouldn’t be for form input by any means. MySQL functions should be used for MySQL only.

If you reread my post you’ll see I was referring to magic quotes.

Well, not just form input but anything that can be modified by the user: cookies, POST and GET.

It doesn’t matter.
I’ve just explained why Alex’s function is perfectly wrong.
It must be two separate functions, totally independent of each other.

That analogy makes absolutely no sense. Your wallet will still be safe. The analogy also is completely irrelevant.

You can make a “universal” “escaping” function. If you are only working with MySQL, HTML, and JavaScript, then yes, you can chain the individual escaping operations in a certain order that will make it safe for all cases. Primarily it depends on whether the sets of allowable characters are subsets of the allowable characters of the parent coding. The issue is that you cannot make a universal unescaping function, because you do not know how many levels to unescape, unless you are in full control of the end systems that will process the unescaped data. If you do not unescape, then your output is no longer the same as the input, and further escaping will create further layers of escaping.
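The layering problem is easy to reproduce with HTML escaping alone. A minimal sketch, with a made-up sample string:

```php
<?php
$input = 'Fish & Chips';

// Escaping once is correct for HTML output.
$once = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Escaping the already-escaped value adds a second layer.
$twice = htmlspecialchars($once, ENT_QUOTES, 'UTF-8');

// Decoding once removes only one layer; the code cannot know how many there are.
$decodedOnce = htmlspecialchars_decode($twice, ENT_QUOTES);

echo $once, "\n", $twice, "\n", $decodedOnce, "\n";
```

After one decode the value is still escaped, so a generic unescape routine would have to guess the number of layers.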

As an extreme case, base64 encoding a character sequence will make it safe for the majority of systems that you may come across. It would be that holy grail of a make-safe algorithm that you are looking for.
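As a quick sketch of that idea (the sample string is invented), base64 reduces any input to letters, digits, "+", "/" and "=", and the original comes back unchanged on decoding:

```php
<?php
// Quotes, tags and backslashes that would be risky in several contexts.
$raw = "O'Reilly <script>alert(1)</script> \\ \"quoted\"";

$encoded = base64_encode($raw);

// Strict mode (second argument true) returns false on invalid input.
$decoded = base64_decode($encoded, true);

echo $encoded, "\n";
```

Note that "+", "/" and "=" still have a meaning in URLs, and the encoded text cannot be searched or displayed without decoding it first, so this stores data safely rather than making it usable.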

Hehe, I should have developed a more relevant analogy to satisfy such a thorough investigation, my bad :)
Yes, my wallet’s safe, but I’m not safe in other ways.
And that only holds if I am working with just MySQL, HTML, and JavaScript. But I am not.

Your ideal escape algorithm reminds me of the ideal compression algorithm: MD5 :)
Unfortunately, you cannot unescape strip_tags(), especially if those tags were meant to format your own article posted to the site.

You’re good with theoretical matters but what’s your practical recommendation?

(1) If you just want to store data safely without caring about using it, then base64 it is!
(2) If you do care about using your data again, then there is no universal escaping function.

http://php.net/manual/fr/book.filter.php

If you have access to an up-to-date PHP version, this should help you.
About mysql_real_escape_string(): it is useless if you use parameterized queries (and you should).
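A minimal sketch of a parameterized query with PDO follows. It assumes the pdo_sqlite extension is available and uses an in-memory SQLite database so it stays self-contained; the table and the sample value are made up, and the same pattern applies to a MySQL connection.

```php
<?php
// Assumption: pdo_sqlite is installed. The in-memory database exists only for this script.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (name TEXT)');

// A value that would break a query built by string concatenation.
$name = "Robert'); DROP TABLE users;--";

// The value is sent separately from the SQL text, so no manual escaping is needed.
$insert = $pdo->prepare('INSERT INTO users (name) VALUES (:name)');
$insert->execute(array(':name' => $name));

$select = $pdo->prepare('SELECT name FROM users WHERE name = :name');
$select->execute(array(':name' => $name));
$stored = $select->fetchColumn();

echo $stored, "\n";
```

The value is stored and read back unchanged, with no quoting function involved.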

Filtering is another world.
One should never mix filtering, database escaping and XSS safety in their mind.

Also, I have to state that a function by itself will never help anyone. Only understanding does.

Filtering gets rid of data that you do not want, whereas escaping keeps the integrity of the data.

Wait, the php “filter” functions can be used for filtering and validating, in case you don’t know them.
http://www.php.net/manual/en/filter.filters.validate.php
http://www.php.net/manual/en/filter.filters.sanitize.php

Another bunch of useless functions, doubling existing ones.
I wish I knew why I should use something like
filter_var($a, FILTER_SANITIZE_SPECIAL_CHARS);
instead of
htmlspecialchars($a);