Turn this in to a positive

I need to vent a little guys!

I have just been asked to look at a PHP script that takes a number of arguments via $_GET, loads some data from the database, and then checks that a checksum (also passed via $_GET) matches, before deciding to display the information or not. Fairly straightforward (though I do NOT approve of only using an auth key in the URL) but then I stumbled upon something like this:

if ($_GET['hash'] !== $calculatedHash){
    die('Incorrect hash '.$_GET['hash'].' - Should be '.$calculatedHash);
} ...

Now, apart from the fact that the guy who wrote this is clearly a tool, how’s about we turn this in to a positive for new developers to teach them why NOT to do this? How about one point per person?

I’ll start with why I disagree with passing any kind of auth-token in a URL when it’s the only method required:

It’s far too easy to allow people access who shouldn’t be able to. Your URLs could be recorded on a proxy or a firewall log. They could be captured in the history of the browser that you just used at the coffee shop. If someone can work out the logic involved in generating the hash (ie it’s something as simple as md5($firstname.$lastname)) then they’ll be able to get in to your system and look at pretty much anything that they want. If you send the link by email then what happens if it’s intercepted somehow? Gets redirected or sent to the wrong address? If it’s sensitive data then even more so than ever, you should verify that a person is who they claim to be. So even if you do pass some easily cracked hash via URL, then have the user confirm that they are who they say they are with a password, or a one-time-key that you can email or SMS to them when the page is loaded (so only genuine users intended to receive the email can view it).

Anyone else care to add comment?

Off Topic:

They used an en dash in the error message when the situation clearly calls for an unspaced em dash.

It resembles a line of debug which someone forgot to comment out/replace.

I mean, echoing out the actual required hash is simply thick, so I’ll take side with the view that the developer just forgot s/he was echoing variables out just so s/he could see what was going on.

Notwithstanding all the security points being made here, there are things you can do to avoid leaving lines of debug code as easter eggs for opportunists .

I am taking liberties by calling it debug code, but in effect we are talking about mechanisms to look into variables in order to understand what variables contain, but debug is the term I will use.

I am not talking about trapping Exceptions or even much about error handling… just the kind of basic debug/echo advice we give out to new users here daily in order for us (and them) to fully grok what is going in inside some of the vars.

Here is a really simple trick I learned early on:

Lets say your entire application is just one file.

Right at the top do this:


<?php
$debug  = 1 ;

// .. rest of you code

You can then simply turn your local var $debug off and an one again by changing it to 0 (equates to true/false in an if() condition).

Then as you are coding you can choose to echo or var_dump variables out onto the page as you code.



// a) what is ACTUALLY being passed to PHP in terms of values and types?

if($debug) var_dump($_POST) ;

// b ) you are about to fork your code depending on a condition

// what is in setting is it a string "1" or an int 1 or is it boolean true?
if($debug) var_dump($setting) ;

if( $setting == 1 ){
// fork direction a

}else{
// fork direction b

}

// complex sql statement building going on here
$sql = $what . $where . $limit ;

// c) take a look at it on screen, copy the output into your db admin tool, check you have
// some matching data and that the statement is assembled correctly

if($debug) echo $sql . '<hr />' . PHP_EOL;

// send your sql to database down here

// carry on


So your code can be littered with lines of debug such as that and you can turn the echo/dump statements off at the top of the file so you can start to work out what went wrong and where.

No magic is involved. it is very easy method for new coders to adopt and if you really want to remove all the lines of debug before going live you can easily grep through the file looking for the word “debug”.

The lines are easy to spot, they are ugly – but it is immediately obvious what they do.

You can then take that basic idea and include the instruction in a single config file – although I rarely use this method now, there are much better ways of finding out what your program is up to – but these all involve some amount of “magic” (or abstraction) which is fine once you are up to speed with the basics, you will be able to adopt better practices.

From memory I think I then wrote a function with dumped variables and included that in startup files, then I auto included it depending on which server I was on, I went on and tried stuff like that … then started throwing errors into log files so I could pick up the vars … Exceptions, XDebug, TDD yada yada…

I digress.

As I say, you can start to work out what went wrong and where, you should then be starting to ask yourself, before you writing code :

“I wonder what could go wrong here? In fact, how can I prove to myself it is all going to plan?”.

Then you will make real progress because you are starting to think defensively about your code – and is likely to be an underlying cause of the original developers error.

You’re wrong on one point - The security hole is not that he was using $_GET. $_POST is equally vulnerable to being sniffed in transit. While posts do not get cached the way get strings do, they are not a replacement for using an HTTPS connection for security and further increased security is not the intent of POST. The intent is that GET is for read operations, POST is for write operations. The initial log in should still be a POST since it presumably writes a log in state to the server, but that has nothing to do with security.

Also, if they know the hash and the hash is being transmitted explicitly they can use a rainbow table attack. They don’t have to have the precise origin password if salt wasn’t used to generate the hash (it likely wasn’t).

Finally, bear in mind that a $_SESSION key is nothing more than an md5 hash as well. Before you lay into this guy too hard keep in mind that PHP itself does transmit a hash back and forth with the browser to maintain session state. These also can get caught up in proxy logs and caches, but they expire frequently enough to prevent that from being valuable.

It is possible, perhaps even likely, this code is old enough to predate PHP 4.3. Internal session handling did not stabilize before then. If that isn’t the case then this is just an example of a programmer who didn’t read the manual.

@Cups - That is a very good point. I tend to set a DEBUG constant or similar and when enabled have it log values to a file outside of the public webroot via a debug function. However, there’s always the problem with that in that I end up having to pick through loads of code to find what I’m looking for. My biggest issue with this error was the fact that yes, it probably was done for debugging, but in that case why was such an English error required? Why not just echo the computed hash? The developer really ought to be able to copy and paste the two and compare them, or to copy the one on the page and put it in the URL. It ought to be obvious to him and not so much to other users.

@Michael Morris - Who are you saying is wrong? Me? If so then I may not have been clear enough. I wasn’t condoning revealing secure data via a simple passing of $_POST data instead of $_GET, but $_POST is definitely preferable. I completely agree that HTTPS should be used whenever such data is being transmitted, but HTTPS + hashes passed by $_GET = bad still. HTTPS is not a magic answer to web-security; more security is still required! Also, while $_POST data is just as vulnerable to being sniffed as $_GET data, $_GET data is far more likely to be logged inadvertently by logging mechanisms than $_POST. People sniffing for $_POST are much more likely to be doing so for malicious reasons, sure, but the increased likelihood of $_GET data being captured favours the opportunistic hacker.

PS; don’t get me started on “rainbow tables”. They’re unlikely to be of any use if a hash is composed of $firstname.$lastname since that would produce a string that would be a lot longer than most people generate in rainbow tables. If it’s as obvious as that, and there’s no salt involved, then all you need to do is to find the names of other users and you can just construct the hash yourself. Easy. Besides, I’d be far more concerned about collisions with MD5 and SHA1 than rainbow tables. Pants encryption for anything actually secure and shouldn’t be used for security full stop. Anyway, that’s another rant altogether and I don’t want to go down that heated route again :smiley:

I was just explaining why the hash might be there. If the code was written prior to the stabilization of the PHP sessions API (4.3 or so) then there’s good reason for it to be there - it’s the only way to maintain state. At the end of the day, a session ID is only a hash, and it is transmitted as part of the headers of every request, Encryption is the only way to protect that session key - post or get isn’t really relevant.

Attaching a hash to the request when sessions aren’t working is an old trick. Some real old browsers don’t support sessions and so attaching a hash to maintain state was required. vBulletin still has support for doing this even now.

So a bit of devil’s advocacy here - the original coder might not be an idiot, he was just coding for a different circumstance than is dealt with these days. Though leaving a debug statement in the code like that is rather foolish.

@Michael_Morris, I hear what you’re saying, but this is basically a URL that is emailed out to customers for them to directly view their invoices out without needing to log in, which I think is a terrible practice. It’s like an infinite session that anyone can use to view anyone else’s data. Today, I needed to view a particular invoice and I couldn’t find a link to generate it, so I very quickly created the URL from scratch and lo, it worked. Suffice to say that this kind of access will not be around for long now I’ve found it!

Wow… Just… wow… Yeah. I thought we where talking about an antiquated session maintenance between page refreshes, not something that boneheaded.

Yeah, I know I can be a harsh critic (and I don’t even consider myself an expert, despite many years working professionally in enterprise PHP/MySQL dev) but I was starting to think that I was being too harsh for a while there, until your last post :smiley: Made my day, cheers :wink:

I’m thinking about assembling all of these kinds of things in to book that I’ll write after I’ve rid us of this God-awful code (don’t get me started how the latest app that these guys developed was based on a 2-year old version of Code Igniter and includes a 5-year old version of TinyMCE) and I shall call this book “PHP: Every mistake under the sun, made on one system”. I have recently posted about a classic character replacement that instead of using something like str_replace() or preg_replace(), they looped through every single character in a list of allowed characters, for every single character in the string that needed to be sanitised. It timed out at 30+ seconds trying to deal with 3000 customers. After I’d updated it to use preg_replace() it took less than a second. Sigh.

This does also bring me on to something that I’ve long thought, and having recently read “PHP Objects, Practice and Patterns” I see I’m not alone. It basically says that PHP is easy and it flatters you, as you appear to be getting the desired results and you have no idea that you’re actually making every mistake under the sun. That’s what has happened here IMHO. Just because it looks like it’s working, it doesn’t mean that it’s right

That’s not unique to PHP. If it wasn’t for people making stupid mistakes, the rest of us wouldn’t get work from ‘fixing’ (… rewriting) these people’s code.

Firstly I’d point the finger at the person who gave this developer a list of things to do. Not only did they hire someone with no common sense without getting a second opinion, but they gave the request of a button to log the user into the website without any requirements to login… from a link… which tells you what your hash should be? And the developer agreed to that?

A pair of idiots if you ask me; If a client requests that I develop an application with flaws, I decline with a bulletproof reason and an alternative. If they persist, I tell them to find another developer to do that part.

Firstly, bypassing login is always going to have theoretical consequences, so that’s a dodgy one. There are ways to make it more secure: what springs to mind would be a handshake between software on the client PC (downloadable once login is ensured, and after a change in password online this application would have to ask for a new password to use, rather than automatic login).


Though its not the most stupid thing I’ve seen recently, unfortunately. Reviewing a new developer’s code for a company I’m contracted to (to review and debug code written by new employees after their first task, as part of the recruitment procedure), I found that their latest developer was lazy about reading the custom framework’s manual. They had merely copied modified an existing form class to create a new one. Existing form: Login, new form: Captcha.

The form handling was essentially handled in 4 stages in methods inside a form class. They had done well with the form output - it was, indeed, a captcha form. The validation - yes, it validated that the data was of the correct form. The processing - yep, the entered words were verified correctly and the application was lead off to a standard route of offering a new captcha. Awesome. The code wasn’t brilliant and had several questionable sections of code, but it worked to this point - and thats all that matters to this company. Sigh…

Now, for the final stage - post-processing. This is essentially a method to make any session/database updates, redirect pages etc. The user had not modified this from the login code, out of probable clumsiness.

The chain of events was rather remarkable as to how this turned into an immense f*** up:

  • The following code remained:

    php $this->setSession('user', (int)$this->getUserID());
  • setSession was a method from the parent Form class.
  • The parent Form class also had a __call method so that set[…] and get[…] could be called without explicit variable/method creation in the child classes of Form. (YUCK)
  • If get[…] requested a variable which did not exist, false was returned.
  • False, when cast to an integer, is 0.
  • The default admin ID was 0.
  • The new programmer and junior tester were both using accounts already loaded with administrative powers.

This was a very, very unfortunate set of circumstances.

By correctly entering a captcha image, any user was instantly an administrator.

The next task that specific team in the company is going for is a university portal - which involves everything from coursework uploading and accessible private information / personal forms. So I say, with no jest, that I hope the guy lost his job or was moved to a more basic team.

Wow. That’s pretty bad!

To be honest, I dislike the method of casting data that is returned from a function anyway. why would $this->getUserID() return anything other than either an int or a bool? I’ve seen many functions that prepare a value for use further on. I remember previously seeing an age value in a form being passed through a prepare_int() function, which essentially just cast it as an integer. If I put in a postcode by accident (ie SW1 1AA) it would come out of that function as 11. The developer was under the impression that this was preferable to trying to put SW11AA in to the database, whereas I suggested that maybe it’s better to return to the form with an error that the age value was invalid, rather than putting erroneous data in to the database. Typically, in my own functions, the first few lines are checking that the arguments passed to the function are valid, so in this example:

function nameAndAge($name, $age){}

The first things that I’d be doing would be to check that $name is valid (ie doesn’t have numbers or obscure symbols in it; basically only has the characters that we’ve defined application-wide as acceptable), and then that $age is an integer (no, I won’t accept 24.5 as your age ;)). What I absolutely wouldn’t be doing would be running $age through any function that prepares (casts) it as an integer. However, that’s not to say that I don’t approve of a prepare_int() function that will actually trigger an error, throw an exception or at the very least return false if the argument passed to it was not actually an integer, but I would still be doing some additional logic in the function to say “if the age (not) is valid then…”

function nameAndAge($name, $age){
    if (!$this->validName($name)) throw NEW Exception('Invalid name');
    if (!$this->validAge($age)) throw NEW Exception('Invalid age');

    // Rest of function...

}

…assuming that the validName() and validAge() functions don’t throw exceptions themselves, that is. I would probably have the validName() and validAge() functions merely return a BOOL to be honest, as they are essentially like isset(), is_int() etc and just confirming that an argument passed to them is valid. Whether an exception is necessary or not would depend on the function calling them.

I’m not alone there, am I?

So what does validAge() do?

Cast the incoming var to an integer, then check the bounds?

Say between 13 and 110?

No, there is no casting. If you need to cast an age to an INT then the input is incorrect. But yes, I would check that it’s between various limits. For example, in my previous job we sold (essentially) credit contracts and as a result we would reject anyone under the age of 18 and over another age that I forget. Basically it was just because they would fail a credit check anyway, so we just didn’t bother accepting them before we even passed their details on for a credit check. If that level of functionality isn’t required then something like an is_int() or ctype_digit() would probably suffice, depending on your needs, obviously.O

Oh, and just quickly for the record; that was completely a made-up example off the top of my head and not actual code :wink:

Right, so you are saying you already filtered the input of age?

I feel like you’re trying to make a point but I have no idea what it is. No, it is not already filtered. If it was already filtered then there’d be no need to run it through any kind of validation function, shirley?

So we’re clear, this is not real code, I’m simply stating that just casting something to the type that you want it to be doesn’t mean that it was suitable in the first place or that the result of the cast will actually be what you expect. If you know that a function will return a number as a string and you need it to be an INT then fine, but if you could potentially receive completely unsuitable data from a form (especially if someone has mangled your form) such as text where you should receive numbers then casting it is useless. You need to know whether the data that you received is of the expected type, as if you were type hinting:

function nameAndAge(string $name, int $age){
    // No type validation required
}

Languages that support this syntax will effectively throw an exception if the type is incorrect here, as PHP does if you use it to hint for the class of an object passed to the function. I’m merely replicating that in my own code

No, not trying to make any particular point, you started this thread appealing for it to turn up some positive discussion. I think this will be revealing for anyone reading it.

I rely on typecasting quite frequently, and am wary of the unexpected, or so I thought.

So if someone entered ‘bollx’ as an age, you mean you would just see if it is between 18 and 90 without typecasting it, or do you check to see if it is an integer first, then fail, then go on and check the bounds?

Well if it’s passed from a form then it will be a string unless you either cast it to a numeric type or perform some kind of arithmetic on it (ie $var + 0). Take a couple of examples:

$age1 = 35;
$age2 = "35";
$age3 = "hello";
$age4 = "h3ll0";

Now let’s see what happens when we check for validity:

<pre>
<?php
// Base test
$age1 = 35;
$age2 = "35";
$age3 = "hello";
$age4 = "h3ll0";

var_dump(ctype_digit($age1));   // bool(false)  35
var_dump(ctype_digit($age2));   // bool(true)   "35"
var_dump(ctype_digit($age3));   // bool(false)  "hello"
var_dump(ctype_digit($age4));   // bool(false)  "h3ll0"
                             
var_dump(is_int($age1));        // bool(true)   35
var_dump(is_int($age2));        // bool(false)  "35"
var_dump(is_int($age3));        // bool(false)  "hello"
var_dump(is_int($age4));        // bool(false)  "h3ll0"
                                    
// Integer test                     
$age1i = (int)$age1;                
$age2i = (int)$age2;                
$age3i = (int)$age3;                
$age4i = (int)$age4;                
                                    
var_dump(ctype_digit($age1i));  // bool(false)  35
var_dump(ctype_digit($age2i));  // bool(false)  35
var_dump(ctype_digit($age3i));  // bool(false)  0
var_dump(ctype_digit($age4i));  // bool(false)  0
                              
var_dump(is_int($age1i));       // bool(true)   35
var_dump(is_int($age2i));       // bool(true)   35
var_dump(is_int($age3i));       // bool(true)   0
var_dump(is_int($age4i));       // bool(true)   0
                              
// String test                
$age1s = (string)$age1;       
$age2s = (string)$age2;       
$age3s = (string)$age3;       
$age4s = (string)$age4;       
                              
var_dump(ctype_digit($age1s));  // bool(true)   "35"
var_dump(ctype_digit($age2s));  // bool(true)   "35"
var_dump(ctype_digit($age3s));  // bool(false)  "hello"
var_dump(ctype_digit($age4s));  // bool(false)  "h3ll0"
                              
var_dump(is_int($age1s));       // bool(false)  "35"
var_dump(is_int($age2s));       // bool(false)  "35"
var_dump(is_int($age3s));       // bool(false)  "hello"
var_dump(is_int($age4s));       // bool(false)  "h3ll0"
?>
</pre>

Apologies for the scruffy code, but put together very quickly as an example.

As you can see, taking the original input which could be an actual numeric type, or a string (we don’t know at this point) then ctype_digit() says that a string with numbers ONLY in it is TRUE, whereas is_int() only says that the numeric type integer is an integer, as expected. Neither recognise the strings that contain non-numeric characters as an integer.

Then I cast all four as integers and ctype_digit no longer recognises any of them as strings containing numbers (as it shouldn’t). However, is_int() finds all of them to be TRUE, because the non-numerics become 0

Then I cast all four as strings and this time, ctype_digit() correctly identifies the fully numeric values and not the others. is_int() finds them all to be FALSE, as you would expect.

Which you use would be completely dependant on your requirements, but the prepareInt() function that I alluded to earlier was using a preg_replace to remove any non-numeric characters, something like this:

// Preg_Replace test
$age1p = preg_replace("/[^\\d]+/", "", $age1);
$age2p = preg_replace("/[^\\d]+/", "", $age2);
$age3p = preg_replace("/[^\\d]+/", "", $age3);
$age4p = preg_replace("/[^\\d]+/", "", $age4);

var_dump(ctype_digit($age1p));  // bool(true)   "35"
var_dump(ctype_digit($age2p));  // bool(true)   "35"
var_dump(ctype_digit($age3p));  // bool(false)  ""
var_dump(ctype_digit($age4p));  // bool(true)   "30"

var_dump(is_int($age1p));  // bool(false)  "35"
var_dump(is_int($age2p));  // bool(false)  "35"
var_dump(is_int($age3p));  // bool(false)  ""
var_dump(is_int($age4p));  // bool(false)  "30"

So that one was taking the “h” and “ll” out of “h3ll0” and leaving us with 30. That would be a bit awkward when I put in postcode in the wrong field and “SW1 1AA” becomes “11” years old!

Lastly, when it doesn’t matter whether it be a string or an actual int, a simple function does the job and you can cast it as necessary if you really need to (and you don’t for most PHP purposes, though I accept that it’s good practice):

// Simple Int Function
function isInt($var){
    return (is_int($var) || ctype_digit($var)) ? TRUE : FALSE;
}

var_dump(isInt($age1));  // bool(true)   35
var_dump(isInt($age2));  // bool(true)   "35"
var_dump(isInt($age3));  // bool(false)  "hello"
var_dump(isInt($age4));  // bool(false)  "h3ll0"

That would probably actually meet most people’s requirements (discounting the fact that 35.0 would return FALSE despite the fact that technically it IS an integer)