Writing Safe SQL Queries

Hi all, just looking for a bit of direction in writing safe SQL queries e.g. single quotes, double quotes, etc. What’s the best practice?

What do you mean by “safe”? As in ensuring no SQL injection? If so, just use this function: mysql_real_escape_string().

Or take a look at [fphp]PDO[/fphp] and prepared statments.

mysql_real_escape_string also has it’s flaws

Hey ScallioXTX, I’ll take a look at prepared statements - sounds interesting. In your opinion, would you use mysql_real_escape_string or prepared statements?

I used mysql_* for quite a long time, but switched to PDO with prepared statements about a year ago or so. I’m never going back! :slight_smile:

+1 what he said, I sleep better at night too :wink:

I don’t believe many people modify the character set. I’ve never modified the character set myself.

I don’t (yet) use PDO but I have a mechanism for emulating prepared statements so I can use :named placeholder just like PDO. The weird thing is that I hardly ever use them. I don’t know if it’s out of habit - but I have tried many times and then again I come back to queries like this:


$db->query("UPDATE table SET
    id="        .(int) $id. "
  , name="      .$db->quote($name). "
  , edit_time=" .$db->quote($time). "
  WHERE
   status="     .$db->quote($status));

I really tried to like the pretty queries with placeholders and I agree they are very readable but the above style is much less typing: same work in less time. I don’t have to contruct a statement object, I don’t have to use a separate method for each placeholder to bind the value - these things are abstract because they don’t add any functional value to the code. Although this query is a little less readable it’s counterbalanced by the fact that I have the origin of each variable visible in the query itself - I don’t have to look several lines around the code to find out where all the bound values come from. Besides, it’s often 2x fewer lines of code. In many cases this means I can a have a method that is visible in whole on my screen versus occupying two screen heights and having to scroll. I prefer the first one.

So to me prepared statements are just a coding style (or perhaps standard?) that doesn’t thrill me in any way. I have a deep rooted habit of escaping every value I inject in queries so the security aspect is of no concern to me. It’s just as much effort to get in the habit of using prepared statements as it is to get in the habit of escaping every value.

Don’t get me wrong - I’m not against prepared statements at all. It’s just that I see many people seem to recommend and praise them as something really great - but I just can’t see the benefit, or the whole benefit.

There are two main benefits really:

  1. PHP first sends the query to MySQL (or any DBMS for that matter) without any variables filled in. The DBMS can than discern what part of the queries is fixed, and what part(s) of are variables. So after the values of the variables are send to the DBSM, the DBMS know exactly which parts of the query are variables and it can threat those parts as such. If you escape variables in PHP you just send one big string to the DBMS and it has no idea what part of the string is supposed to be what.

  2. Because the string is sent to the DBMS beforehand it can work out an execution plan for that query before you send the variables. That allows you to send multiple sets of variables to perform the same query multiple times with different values and the DBMS only has to work out the execution plan once, and not for every different set of variables as it has to when you don’t use prepared statements.

Does that make sense?

Hey guys, thanks for all the posts, they make very interesting reading. So I’m presuming if I use prepared statements I would not need to use the mysql_real_escape_string?

Indeed. It’s one or the other, not both :slight_smile:

Ok perfect, I’ll try to get to grip with using prepared statements! :smiley:

But how is that a benefit? Mysql does something differently in its internals but there’s actually no practical difference for me - the one who sends the query. What will I lose if I send all queries as one big string? I don’t care much about differences in microseconds - and in most cases there are actually none - see below.

  1. Because the string is sent to the DBMS beforehand it can work out an execution plan for that query before you send the variables. That allows you to send multiple sets of variables to perform the same query multiple times with different values and the DBMS only has to work out the execution plan once, and not for every different set of variables as it has to when you don’t use prepared statements.

I know about this but again how does that benefit me or my application? I’ve studied a few benchmarks and they reported some 10% speed increase when preparing a statement and running the query a large number of times in a loop. But to me it’s completely irrelevant because I believe it’s really bad design if I have to run >1000 queries in a loop! I never do this, why would I? And for single queries there is no performance gain, so I don’t care if mysql will prepare the ‘execution plan’ beforehand or after I send the whole string in plain text.

Does that make sense?

Well, it does make sense but I think these advantages are very weak and insignificant. However, there are some disadvantages to prepared statements that are much more serious because they impact my coding and my time:

  1. I cannot log problematic queries in my application for debugging because I send queries with placeholders instead of real values. Either I need to make my own system for reconstructing expanded queries (not 100% reliable) or I spend more time debugging the possible problems later on.

  2. I spend more time typing all the code to prepare the statement object and bind the values.

Here is some good info about prepared statements: http://blog.ulf-wendel.de/?p=187 - they have their disadvantages, too.

I have also read a comment from a programmer who knows the internals of mysql code for prepared statements and he was explaining why they were not really as good for performance as some people might think. Unfortunately, I don’t have the link, if I find the page I’ll post it here.

I’m on same bloat as Lemon Juice used prepare statements for a while than reverted back. Its so much less convoluted to create reusable data access methods when using strings rather than bound variables. Not to mention much easier to debug by just printing out the query with all values in tact. I’m not against prepared statements but I don’t agree that argument binding is superior to embedding values if you know what you are doing.

That’s quite a nice article, although I have to disagree with some of the reasoning outlined there. I won’t go through all of it, but just as a few examples

* Parse SQL only once and “cache” parse tree etc.
* MySQL: Parsing can take up to ~25% of the run-time of a short query

vs

* A prepared statement and its parse tree occupies limited server resources
* Multiply the number of concurrent connections by the number of prepared statements per connection to know why your server eats hardware - read Brian’s post
* To to scale a “cache” move it from the server to the client. Scaling servers is tricky
* For statements executed only once prepare() causes an unnecessary round-trip and “caching” work
* MySQL: no “caching” of the execution plan, you won’t save the SQL optimization step

The way that’s written it looks like 2 pro’s against five cons, but if you look closely at the cons there are actually only 2, “takes more server resources” and “problems when implementing caching”. Writing stuff this way to make something look bad is easy …

Futhermore

Preventing SQL injections should be an application task not a task of the (interchargable) storage layer

Why? According to whom?

mysqli_real_escape_string() is available to build secure SQL statements

How is that a con of prepared statements or PDO? It’s like saying apples are bad because I like lemons better.

Input validation requires application logic not available in the database

That’s not a con of prepared statements or PDO either. You need to do input validation either way; using prepared statements or not. Saying this is a con of prepared statements is rubbish.

That being said it very much looks to me this author was biased against prepared statements from the start.

Beside that, the benchmark he’s doing is emperical at best; using just a few queries on just one table a bunch of times without testing different data types, different storage engines, well, without testing much at all really, doesn’t prove anything. He might just as well have set up a different benchmark that would have shown that prepared statements are faster.
I reject to take this over simplified benchmark as prove of anything.

That being said, I never said that prepared statements are superior over mysql_real_escape_string, I just said I like them better. That says more about me than the techniques.

As for the problem of the code being too elaborate, I don’t have that problem. I’m not using PDO myself directly, but a class provided in the Yii framework I use, and that allows me to just bind parameters using an array, like


$models=User::model()->findAll('LOWER(username)=? AND password=?', array($username, $password));

And because this array is supplied it can also debug the query with the supplied parameters.

That looks something like :


Executing query SELECT * FROM user WHRERE LOWER(username)=? AND password=?. Bound with 0=ScallioXTX, 1=some-password.

Works fine for what I need :slight_smile:

This is an interesting thread.

No doubt you concur then, that in light of the OPs question, then suggesting PDO is then a reasonable suggestion alongside mysql_real_escape_string.

My advocation of PDO was borne from the sigh of relief I let out when I realized I could stop assessing and learning all the different DBAL (Data Base Access Layers) which were washing around PHP circles - and rely on something which I was sure would evolve into almost a standard.

I find I very rarely write any straight sql these days, relying solely on mappers and crud classes which probably do contain more LOC than if I had my own escaping methods, but as I say, I have now forgotten about the amount of brain cycles I lost worrying about “did I, didnt I?” escape, on which layer? whose responsibility is to escape? etc.

Sure that article is biased, I don’t mean to argue that all his points are valid, my purpose was to simply say that prepared statements have their drawbacks, too, and I don’t think it’s a good idea to always recommend prepared statements - while they won’t hurt their use most of the time is purely cosmetic for a developer who likes to use this kind of style.

That being said, I never said that prepared statements are superior over mysql_real_escape_string, I just said I like them better. That says more about me than the techniques.

All right, then I think we are in agreement on this point! :smiley:

As for the problem of the code being too elaborate, I don’t have that problem. I’m not using PDO myself directly, but a class provided in the Yii framework I use, and that allows me to just bind parameters using an array, like


$models=User::model()->findAll('LOWER(username)=? AND password=?', array($username, $password));

This also looks like a stylistic preference. I also have a class that extends the built-in extension (currently mysqli but that’s not the point) - in this way we can design any API regardless of PS and be happy with our style.

And because this array is supplied it can also debug the query with the supplied parameters.

That looks something like :


Executing query SELECT * FROM user WHRERE LOWER(username)=? AND password=?. Bound with 0=ScallioXTX, 1=some-password.

Okay, then I guess it wouldn’t be too difficult to emulate PS just for the sake of logging. Still, it doesn’t convince me to use them - they are an alternative style and are simply okay for me.

Here is another biased article: Prepared statements are dead, long live prepared statements?. There is one point there that seems logical to me - preparing a statement makes sense when I need to use this statement many times and so I prepare it in order to have it ready for use whenever I need and thus improving performance. Preparing a statement just to use it once and ditch it doesn’t make much sense apart from personal stylistic preference. And one-time usage is 99% of cases in the stateless php language. Escaping values is just a side effect and has never been the intended reason for their usage (actually I should say avoiding sql injection as escaping doesn’t occur with PS).

Maybe it looks like I’m trying to defend a point but I think we are actually in agreement :).

Just curious - how is it much less convoluted?

At first I also thought this kind of unification was great, however later on I realized that even PDO lacks many convenience methods I often used, like fetching one value/one column/etc. with a single method and I needed to wrap PDO in my own class because the build-in methods took too many lines to type. Therefore, it’s almost the same as using dedicated extensions for each database type and writing my own wrapper around them and unifying the API. PDO makes it a little bit easier but not as much as I first hoped.

Another alternative to consider is to use the mysqli_ interface instead of mysql_

Then you have acces both to be able to use classes for your database calls and also use prepare statements without needing anything extra like PDO.