Escaping numerical values in MySQL

felgall · May 3, 2013, 2:15am

“But I don’t see how real_escape_string when used properly is less secure.”

The text in bold is what makes the difference between the two. With prepare/bind you don’t need to look at anything other than the prepare and bind calls to know whether it is 100% secure or not. With real_escape_string you need to look at the query and then track back all of the data references to where they were validated in order to see whether they might contain something that needs to be escaped and then examine the intervening code to confirm that those fields that need to be escaped are escaped.

Checking that real_escape_string is used properly can take several minutes as compared to the one second glance that checking prepare/bind requires - and that is assuming that you don’t misread something during the checking - if you misread something then there might be a security hole that you have missed.

DrQuincy · May 3, 2013, 5:12am

Lemon_Juice:

I would add one more to this list, simply validating the number:
// 5)
if (!is_numeric($number)) {
  $number = 0;
}
This has the advantage in some edge cases where the number to be passed to SQL is outside of possible storage range in PHP:

INT, BIGINT and DECIMAL numbers that exceed 32/64-bit

large DOUBLE numbers

even INT UNSIGNED may be too much for 64-bit PHP since PHP only supports signed integers, which makes the range 2x smaller for positive numbers

Aside: Could you elaborate on this? If a number is outside PHP’s storage range, what happens?

DrQuincy · May 3, 2013, 5:16am

I see your point and actually think we’re all agreeing with each other. Lemon Juice and I are saying that real escape is 100% secure if used on string. Your point is that you can execute a query and not run escape for whatever reason (you forget, code patch, whatever) whereas with prepared statements you must bind the params before executing so injection cannot occur.

To be pedantic you could “forget” to escape a dynamic table name generated from user input when using prepared statements and still suffer an injection.

I’m new to prepared statements. Out of interest what happens if you set the incorrect type in bind_param (E.g. i instead of s)? Does the query just fail?

Lemon_Juice · May 3, 2013, 7:50am

Oh, now I see where our assumptions are different. When real_escape_string is used like you described above, that is only when necessary at all costs, then yes, it can become harder to spot security holes. But by proper use I mean escaping values always, no matter whether they come from a sanitized source or not. In such case there’s no difference because we are doing the same amount of work - you have to make sure that all your variables are bound, I have to make sure that all of my variables are escaped. Therefore, I don’t need to track my code back to where the values come from, I just know that all of them must be escaped right where I inject them into my query.

Sorry, I wasn’t clear enough, I was specifically commenting on this method:


$number = (int)$_GET['myvalue'];

Imagine that myvalue is 18446744073709551610. This number is within mysql BIGINT UNSIGNED range but is too large for PHP to handle so after casting to INT this is what you will get on a 32-bit system:

2147483647

It will be larger on 64-bit systems but still the value will be changed because it is too large to be handled as a 64-bit signed integer, which PHP uses. Similar problems arise when you use $number = (float) $_GET[‘myvalue’];. Aside from the same problem of ‘truncating’ the number to another within the allowed range you take the risk of losing precision. Imagine you have this number:

1844674407370.123456789012

which mysql can store just fine in a DECIMAL field with perfect precision. PHP doesn’t support precision decimal numbers so casting to a float is as good as you can get. But then the number will be changed to

1844674407370.1

Of course, PHP can handle such numbers just fine when you store them as strings so that is what you should do in such cases.

Also beware of filter_var - while the numbers won’t be changed the filtering mechanisms are very dumb as they apply some very simple character removal. So a number like this will pass the filtering without problems:

filter_var('---184467440++++7370123456789012---', FILTER_SANITIZE_NUMBER_INT);

Using such a number will result in SQL syntax error - you will prevent SQL injection in this way but having syntax errors is something you would probably also want to avoid. This is why I generally dislike the filter_ functions in PHP - while the idea is nice, the implementation is far from satisfactory…

When you sanitize your numbers with is_numeric() then you avoid such problems. Prepared statement should also work fine.

Yes, it basically comes to coding standards and for security reasons you always have to do something with the values, whether you are binding them or escaping. When you do that always then you are safe.

Moreover, I believe that prepared statements are not the right tool for avoiding sql injections. While they have this nice side effect I believe it is a kind of workaround when they are used only for that reason. I can see a few downsides of using prepared statements in this manner - but maybe I’m going too much off-topic here :).

DrQuincy · May 3, 2013, 7:54am

Thanks, so I guess is_numeric is the way to go then.

Do you see prepared statements as only useful if you are reusing the queries then?

Lemon_Juice · May 3, 2013, 8:24am

is_numeric() is the simplest general check that will work fine, you can also use regular expressions (preg_match) when you want to target some more specific number formats. ctype_digit() is also worth considering.

The one weird thing about is_numeric() is that it will accept a number if it contains any number of white space characters in front of the number. However, this should not cause any issues in sql. If you want to avoid this just remember to use trim() on the value.

Basically, yes. In most cases there some trade-offs I don’t think are worth accepting:

with prepared statements it is difficult to log queries with actual data (for any purpose) because you send queries with placeholders only and data is sent separately
binding values break the linear flow of code as compared to injecting values to sql by the use of concatenating - the sql is in a different place than the bind() statements, so when you look at sql you don’t see where the values come from because there are only placeholders. This becomes tedious if a query is larger and you have to scroll your code to see what value/variable is behind the placeholder. Placeholders look cleaner at first glance but after a few times of using them I find the ‘uglier’ concatenation to be actually easier to read and debug.

Jeff_Mott · May 3, 2013, 8:26am

Hi, folks. Sorry I’m late to the party. I wanted to ask about the original issue.

Has anyone provided a testable and repeatable demonstration of this? Because if your values – numeric or otherwise – are quoted and escaped, then the vulnerability isn’t at all obvious to me.

DrQuincy · May 3, 2013, 8:58am

While you could quote and escape a numerical as MySQL will convert it to a number if it can, I would consider this bad practice—and doesn’t it prevent indexing doing it this way?

The example I read would go something like:

$safe = mysql_real_escape_string($_GET[‘page’]);

In this case $_GET[‘page’] = “0 = 0”

So the query could be:

SELECT … WHERE somefield = 0 = 0

Which would return every record.

DrQuincy · May 3, 2013, 9:01am

Lemon_Juice:

is_numeric() is the simplest general check that will work fine, you can also use regular expressions (preg_match) when you want to target some more specific number formats. ctype_digit() is also worth considering.

The one weird thing about is_numeric() is that it will accept a number if it contains any number of white space characters in front of the number. However, this should not cause any issues in sql. If you want to avoid this just remember to use trim() on the value.

Basically, yes. In most cases there some trade-offs I don’t think are worth accepting:

with prepared statements it is difficult to log queries with actual data (for any purpose) because you send queries with placeholders only and data is sent separately

binding values break the linear flow of code as compared to injecting values to sql by the use of concatenating - the sql is in a different place than the bind() statements, so when you look at sql you don’t see where the values come from because there are only placeholders. This becomes tedious if a query is larger and you have to scroll your code to see what value/variable is behind the placeholder. Placeholders look cleaner at first glance but after a few times of using them I find the ‘uglier’ concatenation to be actually easier to read and debug.

You’ve got me questioning whether I should moved to prepared statements now! I tend to agree with everything you’ve said. With MySQLi is there anyway to get the actual query executed as a string in case of error? If you can’t I would have this down as a disadvantage since logging the query upon error is very useful for debugging.

Jeff_Mott · May 3, 2013, 9:17am

Either PHP or MySQL will eventually have to convert the value to a number. Is it so bad if we let MySQL handle that job? It comes with the added benefit that you can escape those values just like any other value.

I don’t see how this would prevent indexing.

Lemon_Juice · May 3, 2013, 10:10am

It shouldn’t prevent indexing but again I’ve come across some edge cases involving very large numbers where quotes actually changed what the query did, unfortunately I can’t remember the details. Quoting numbers is non-standard and although nothing wrong should happen in mysql I prefer to always use them unquoted. MySQLi’s real_escape_string() is very basic and not very universal so what I do is I use my own escaping function that can handle strings (even those from objects with __toString()), numbers, NULLs and also arrays for IN() clause. Actually, this is a method in an extended MySQLi class:


class Db extends mysqli {
	/**
	 * Escape value for sql query
	 * @param mixed $value
	 * @param string $type 's'=string, 'n'=number, null=automatic detection based on $value PHP type
	 * @return string 
	 */
	public function quote($value, $type = null) {
		if ($value === null) {
			return "NULL";
			
		} elseif (is_array($value)) {
			// comma-separated list of values for IN()
			$items = array();
			
			foreach ($value as $item) {
				if ($item === null) {
					$items[] = "NULL";
				} elseif ($type === "s" || ($type === null && (is_string($item) || is_bool($value)))) {
					// string
					$items[] = "'". $this->real_escape_string($item). "'";
					
				} elseif ($type === "n") {
					// number
					if (is_numeric($item)) {
						$items[] = $item;
					} elseif (is_object($item)) {
						$items[] = (float) $item;
					} elseif ($item === true) {
						$items[] = '1';
					} else {
						$items[] = '0';
					}
				
				} elseif ($type === null && (is_int($item) || is_float($item))) {
					// auto-detected number
					$items[] = $item;

				} else {
					throw new Exception("Db: Wrong IN() parameters for quote()");
				}
			}
			
			return implode(",", $items);
		
		} elseif ($type === "s" || ($type === null && (is_string($value) || is_bool($value) || is_object($value)))) {
			// string
			return "'". $this->real_escape_string($value). "'";
			
		} elseif ($type === "n") {
			// number
			if (is_numeric($value)) {
				return $value;
			} elseif (is_object($value)) {
				return (float) $value;
			} elseif ($value === true) {
				return '1';
			} else {
				return '0';
			}
			
		} elseif ($type === null && (is_int($value) || is_float($value))) {
			// auto-detected number
			return $value;
		}
			
		throw new Exception("Db: Wrong parameters for quote(): $value,$type");
	}	
}

This may not be exactly what you want so take is as an inspiration. For PDO a similar method could be constructed.

I don’t think you can get the query with data either in Mysqli or in PDO. Even MySQL slow query log will not show the data, as far as I know. You’d have to construct your own mechanisms that would capture data from all bind…() calls along with the prepared statement and reconstruct the query by replacing the placeholders with their corresponding values. While certainly feasible, I think it’s too complicated to bother doing this.

DrQuincy · May 3, 2013, 10:14am

Thanks. I meant you can do:

$query = “SELECT * FROM table”;
$result = mysqli_query($link, $query);

if(!$result) {

log_query($query); // This just writes $query to a text file

}

My point was if you using prepared statements and aren’t storing the query as a string as above, can you just as easily log the error?

Lemon_Juice · May 3, 2013, 10:22am

DrQuincy:

Thanks. I meant you can do:

$query = “SELECT * FROM table”;
$result = mysqli_query($link, $query);

if(!$result) {
log_query($query); // This just writes $query to a text file
}

My point was if you using prepared statements and aren’t storing the query as a string as above, can you just as easily log the error?

Sure you can, the difference is you will want to catch errors in mysqli_stmt_prepare() and mysqli_stmt_execute() instead of query(). You would then log what you have passed to mysqli_stmt_prepare(). Of course, your logged query will be without data.

DrQuincy · May 3, 2013, 10:46am

That’s what I meant so to me that’s a bit of a disadvantage.

logic_earth · May 3, 2013, 10:54am

Try using:
http://www.php.net/manual/en/mysqli-driver.report-mode.php
http://www.php.net/manual/en/class.mysqli-sql-exception.php

It should give more information when there is an error. Including the failing SQL query. Maybe.

However, you know. If you don’t store the query even if you use the old interface there is no way to get the query unless you have it saved in a variable. But using the exceptions should provide some insight.

felgall · May 3, 2013, 8:39pm

No - even if you always use real_escape_string it is still not as secure as using prepare/bind.

As an analogy consider the query as a table and the sql as ribbons with the data as presents your visitors place on the table to be wrapped by the ribbons. Escaping puts the presents in bags so that any presents that are ribbons (and a ribbon is a perfectly valid present) don’t get mixed up with the ones on the table. The person attempting sql injection is trying to get a ribbon onto the table without it ending up in a bag - their ribbon is not intended as a present. With your suggestion of escaping everything all the tools, toys, furniture, used chewing gum, dead mice etc also get put in bags even though there’s no way they can be confused with ribbons just to ensure that you can tell the difference between the ribbons you put on the table and those put there by your visitors as presents and those put there as an injection attempt. It doesn’t mean that the bags don’t have holes in them or some other way for ribbons to get out of the bags even if you bag everything. It also doesn’t mean that all the ribbons in the bags are injection attempts - depending on what you actually accept as valid data.

Prepare/bind is two tables where you put all the ribbons on one and lock it away in the back room and put the second table where your visitors can put the presents. Now it doesn’t matter whether you put anything in bags or not because they are not even in the same room as the ribbons you need to keep separate.

In the first case a successful sql injection simply requires that the ribbon somehow not end up in a bag or somehow gets out of the bag - unlikely but at least possible for the person placing the ribbon to achieve. In the second case no matter what they do they can’t possibly get the ribbon onto the table with the other ribbons because they don’t have any access whatever to that table.

That’s the difference between escaping data in the hope that it doesn’t get confused with code and keeping the two completely separate so that they can’t possibly get confused.

Now if you add an extra step at the start to validate the data then the fields can’t contain junk - if John wants tools as presents then you validate his presents and reject all the toys, furniture, ribbons, dead mice etc. and the bagging of his presents just uses up extra bags without achieving anything else, if Jane only wants ribbons and hairclips then everything except those can be rejected as presents for her but for her presents if you only use one table then the bagging/escaping is essential as her presents can be confused with the ribbons on the table (sql). If you use two separate tables (prepare/bind) then her presents are still on the data table not the sql table so bagging them also just uses up extra bags.

Without validation you can not only get ribbons being given as presents for John, you can also get billions of dead mice and used chewing gum as well - ie your database will get filled with meaningless junk.

Have you tested every possible character combination via real_escape_string to check that there are no possible combinations that would allow sql intended to be stored as data to be processed instead (that is that there is no possibility whatsoever that the bags might have a hole in them to allow a carefully designed ribbon to escape onto the table)?

Jeff_Mott · May 3, 2013, 9:06pm

This might be one of the more confusing analogies I’ve ever read.

I’m a guy who favors prepared statements, but I think I agree with Lemon and the Dr on this one. I agree that it’s easier to make mistakes with escaping than with preparing, but I also agree that if both are used properly, then both are equally secure.

Lemon_Juice · May 3, 2013, 9:34pm

Why not? How is that they don’t have access to that table? Have you locked part of your code so that you can’t get to it anymore? I am the programmer and I am the one who prepares the query, what force will ban me from injecting a potentialy unescaped variable into the statement apart from my own sanity? Nobody forces me to use bind() as I should, it’s just that I know that I should be using bind() and try to remember that.

I wouldn’t say SQL containing data is that confusing. SQL has been designed as human-readable language and having data inside the statements is what makes it easily comprehensible. It’s like you wanted someone to bring you dinner and beer you would be making two phone calls - in the first one you say “Dinner and beer”, hung up and call again and say “Bring me at 4pm, please”.

Sorry if this analogy doesn’t make 100% sense but having read yours about tables, ribbons and dead mice I let my imagination run wild

I wouldn’t mix validation into the whole thing as I think it should be kept separate and totally independend from both binding and escaping.

The extra bags are completely free (or the cost is negligible), 100% eco-friendly and disappear after usage without leaving any waste, so I really couldn’t care less

Why would I have to? Escaping is a pretty simple algorythm and I trust that whoever wrote the native function implemented it well and that is enough. Let’s not go paranoid here. By the same analogy I could ask you - have you tested adding every possible combination of numbers to check there are no errors in your programming language + operator? If not, then your application is insecure because you may never know how it will behave when faulty addition breaks computing a critical hash!

I think I have no more arguments in this case anymore so I’ll simply agree to disagree!

Lemon_Juice · May 3, 2013, 9:46pm

Haha, this wasn’t easy but after reading it for a second time I think I grasped the gist more or less

Exactly, it’s a little bit easier to keep good security discipline when coding binding than escaping but I don’t think the difference is big. It’s more because of some of the trade-offs of prepared statements that I mentioned earlier that I tend to favour escaping.

But it’s all down to personal preference. Some people prefer the idea of wrapping up all presents in a paper bag, putting them on table and then getting two guys move the whole table with presents from one room to another, whereas some prefer to mark the spots on the table where each present should be placed and then getting the guys move the table first and come back and get the presents separately and leaving it up to a lady in the other room to place the presents in their proper places marked on the table :D.

felgall · May 3, 2013, 10:07pm

No you haven’t locked it so you can’t get to it - you have locked it so that your visitors can’t get to it.

The prepare statement does not contain ANY variables - it just contains SQL - therefor no matter what your visitor manages to put into any variable in your code it cannot get run as SQL because the only SQL that gets run is in the prepare statement and there are no variables there.

One last try to explain this. Let’s look at the following code which uses prepare/bind to update a database table.

if ($p = $db->prepare('UPDATE memaddr SET missing=?, mail_centre=?, address=?, suburb=?, state=?, postcode=?, country=?, phone=? WHERE member_number=? AND from_date=?')) {
   $p->bind_param('ssssssssss',$missing,$mail_centre,$address,$suburb,$state,$postcode,$country,$phone,$mn,$date);
   $p->execute();

Now imagine that there is no validation at all and no escaping of any data prior to that code running.

Now tell me how anything can be supplied in any of the variables that will change the SQL in the prepare statement given that there are no variables in the prepare statement? All the SQL that gets run by the execute call is in the prepare statement so it doesn’t matter what the variables in the bind statement contain as whatever is passed there is data - injection into the prepare statement is impossible because it doesn’t contain any variables.

A query call would be just as secure provided that you don’t include any variables in the SQL. Of course that would mean that it can only do what you have predefined it to do and it cannot use any data supplied by visitors as they’d need to be supplied in variables.

Once you include variables in the same call as the SQL you introduce the possibility of that variable changing the SQL. While you escape the variables to try to prevent that confusion they are still in the same statement and so there is at least a possibility of finding a value that will get past the escape call and modify the SQL. While the escape function is supposed to prevent injection how do you know that it doesn’t contain a security hole that will allow injection to occur? You are trusting someone else’s code doesn’t have a security hole - which is potentially less secure than using an alternative that doesn’t depend on code not having security holes.