Stripping and slashing

When I receive payment details from PayPal, names like “O’Toole” have backslashes added: “O\'Toole”.

Which practice is better: to remove the slashes 1) before inserting in the db or 2) before displaying the name in the browser? (And why?)

Do you have magic quotes on? It’s been a while, but I don’t think PayPal is shipping you the quotes if you don’t send them first.

You should never have magic quotes on. It is one of the most horrible ideas in the string of horrible ideas that went into PHP.

It is important that the slashes remain when inserted into the database. In the example you give, the apostrophe could be read as ending the string, causing an error. In general, slashes are also added to prevent malicious code from being executed.

Here is a bit of PHP code I use to process the data before displaying:

$row = mysql_fetch_array($result);

// Remove the stored backslashes from every column before display.
foreach ($row as $k => $v) {
    $row[$k] = stripslashes($v);
}

E

Thanks, that’s what I thought.

I think that the data in the database should not be escaped, so in the database it should be O’Toole.

Of course, when the data is going to the database, it should be escaped, as it’ll wreck your query otherwise (as Eruna says). Your database abstraction layer should be taking care of this for you though (e.g. PDO if you’re using PHP).

So in the PayPal example:

  • Receive data from PayPal
  • Remove slashes from the name
  • Process the order (send e-mails etc.)
  • Insert into / update in the db (letting your abstraction layer take care of quoting, assuming you’re using something like PDO)
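
The steps above can be sketched roughly as follows. This is a minimal illustration, not anyone's production code: clean_incoming() is a hypothetical helper, the orders table and payer_name column are made up, and an in-memory SQLite database (which needs the pdo_sqlite extension) stands in for whatever database you actually use.

```php
<?php
// Hypothetical helper: undo magic-quotes slashing only when it is active.
function clean_incoming(string $value, bool $magicQuotesOn): string
{
    return $magicQuotesOn ? stripslashes($value) : $value;
}

// Simulated PayPal field as it arrives with magic quotes enabled: O\'Toole
$received = "O\\'Toole";
$name = clean_incoming($received, true);

// Assumption: in-memory SQLite stands in for the real database.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE orders (payer_name TEXT)');

// The prepared statement handles quoting; no manual escaping is needed.
$stmt = $pdo->prepare('INSERT INTO orders (payer_name) VALUES (:name)');
$stmt->execute([':name' => $name]);

// Read the value back to see what was actually stored.
$stored = $pdo->query('SELECT payer_name FROM orders')->fetchColumn();
echo $stored, PHP_EOL;
```

The value read back is the raw name with no backslash, which is what the rest of the application can then escape as needed for each output.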

When displaying the information, the data should be escaped before it’s rendered into HTML. Not so much for the quotes, but more for HTML tags and JS injection etc.
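
As a small sketch of escaping at output time, htmlspecialchars() covers the HTML-tag and script-injection case mentioned above. The h() wrapper is a hypothetical convenience name, not a built-in function.

```php
<?php
// Hypothetical helper for escaping text that is placed into HTML content.
function h(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// A raw value as it might be stored: an apostrophe plus an injected tag.
$name = "O'Toole <script>alert(1)</script>";
$safe = h($name);

// The browser shows the text literally instead of running the script.
echo '<p>', $safe, '</p>', PHP_EOL;
```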

Please take care not to confuse stripping and slashing with slashing strippers!
You can be sent to jail!

In my particular case, I’ve got this order:

  • Receive data from PayPal
  • Check to make sure this isn’t a duplicate
  • Insert into (flat-file) database
  • Process the order

There’s no abstraction layer, but I understand the concept. I just want to make sure I do this the same way I would if I were using, say, MySQL.

So I’m leaving the slashes in place and using stripslashes() whenever the data is being displayed. Doing a search for “O’Toole” when it’s in the db as “O\'Toole” seems to work just fine. Any problems with that?

No problems there in terms of functionality, but there are concept issues.

I’m guessing the problem is that, if you’re using PHP, you have magic_quotes enabled. This means that everything is slashed before being put into the $_POST and $_GET arrays. It’s also very bad coding practise to have it enabled - escaping should be done by the developer for the correct circumstance.

If you are using MySQL, you’d use mysql_real_escape_string() on a string. That would convert “O’Toole” into “O\'Toole”, and the database reads that as “O’Toole” (the \ acts as an escape character, so the ’ is not read as the end of the string value).

So if you have O\'Toole in the database, it means that “O’Toole” is being escaped twice - most likely by magic_quotes being enabled.
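
The double escaping can be imitated without a database. In this sketch addslashes() stands in both for magic_quotes_gpc and for mysql_real_escape_string(), and stripslashes() stands in for the SQL parser reading the string literal; those substitutions are assumptions made only to keep the example self-contained.

```php
<?php
$original = "O'Toole";

// What magic_quotes_gpc would have done to the incoming value: O\'Toole
$afterMagicQuotes = addslashes($original);

// The developer's own escaping applied on top of that: O\\\'Toole
$inQuery = addslashes($afterMagicQuotes);

// The SQL parser removes one level of escaping when it reads the literal;
// stripslashes() imitates that step here. Result: O\'Toole
$storedInDb = stripslashes($inQuery);

echo $storedInDb, PHP_EOL;
```

With only one level of escaping, the same round trip gives back the original name, which is why a stored backslash points to the value having been escaped twice.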

Of course, stripslashes() would convert the \' back to ', but it makes no sense, in programming terms, to have to do that in the first place.

wwb_99 and Jake, you both are correct: magic_quotes_gpc is enabled. So, I guess I should be checking for this. Or should I?

[By way of explanation, this is a script that I inherited from a programmer who was, shall I say, loose in his coding. I’ve already dealt with bitwise OR that should have been logical OR, coding as if register_globals was on, code that resulted in a ton of notice-level errors,…]

The script is used by a lot of people on a lot of different servers. Up until now, magic quotes haven’t been an issue; even now, the only problem seems to be that names with apostrophes in them are displayed with a preceding backslash.

[As I mentioned before, I’m not using a “real” database. The information from PayPal is written to a text file. For various reasons, that won’t change soon.]

What’s the appropriate course of action given these circumstances? stripslashes() on all incoming data regardless of magic_quotes? Check for magic_quotes and do something different based on the setting? Or simply save the data as-is and stripslashes() when it’s used?

BTW, no strippers were harmed during the creation of this reply.

I have this in my bootstrap files:

  if (function_exists('get_magic_quotes_gpc') && get_magic_quotes_gpc()) {
    $_GET    = array_map('stripslashes', $_GET);
    $_POST   = array_map('stripslashes', $_POST);
    $_COOKIE = array_map('stripslashes', $_COOKIE);
  }//if
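
One caveat with array_map('stripslashes', ...) is that it only handles flat arrays; form fields such as name="items[]" arrive as nested arrays. A recursive variant might look like the sketch below, where stripslashes_deep() is a hypothetical helper name and the input is simulated rather than taken from $_POST.

```php
<?php
// Hypothetical helper: strip slashes from strings at any depth of an array.
function stripslashes_deep($value)
{
    return is_array($value)
        ? array_map('stripslashes_deep', $value)
        : stripslashes($value);
}

// Simulated slashed input with a nested array, as name="items[]" would produce.
$input = ['name' => "O\\'Toole", 'items' => ["Bob\\'s mug", 'plain']];
$clean = stripslashes_deep($input);

print_r($clean);
```

It could replace the three array_map() calls inside the same magic-quotes check, for example $_POST = stripslashes_deep($_POST);.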

Everyone above is correct as well - you should store ‘raw’ data ‘as-is’ in the database, and only escape it on output. Different applications using the data may require different escaping rules (or none at all maybe) so it’s better to have the original data in the database.

Of course, you should escape the data as appropriate when inserting into the database to ward off SQL injection attacks & invalid SQL, but that is a different issue.

There it is: clean and neat. Thanks, everybody, for your input. (Now if I could just get some love on this post.)

Quite frankly, if you are dealing with it in the database using a MODERN API like mysqli or PDO, OR in the markup, it should make ZERO difference as single quotes and double quotes outside of tags or inside queries SHOULD have zero effect.

The ONLY three places it would be a concern are if you are using said values as the value of an attribute (in which case I’d probably strip it altogether), as part of a $_GET value or URL, or if you are not using prepared queries.
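
For the attribute and URL cases, escaping rather than stripping is also an option. The sketch below is only an illustration: the sample name and the /search.php path are made up.

```php
<?php
// Sample value containing both quote styles.
$name = 'Fred "The Hammer" O\'Toole';

// Attribute value: ENT_QUOTES escapes double and single quotes alike.
$attr = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
echo '<input type="text" value="', $attr, '">', PHP_EOL;

// URL query value: percent-encode the value first, then HTML-escape the
// finished URL because it is itself placed inside an attribute.
$url = '/search.php?name=' . rawurlencode($name);
echo '<a href="', htmlspecialchars($url, ENT_QUOTES, 'UTF-8'), '">search</a>', PHP_EOL;
```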

There is NO legitimate reason for doing it as content in your markup, and no legitimate reason for doing it in your database unless you’re building your query strings on the fly like the pinnacle of 1998 PHP coding.

But then, there’s a reason I laugh hysterically when I see people still talking about sanitizing values and magic quotes.