Preventing Cross-Site Request Forgeries (CSRF)

Cross-site request forgery (CSRF) is a common and serious exploit where a user is tricked into performing an action he didn’t explicitly intend to do. This can happen when, for example, the user is logged in to one of his favorite websites and proceeds to click a seemingly harmless link. In the background, his profile information is silently updated with an attacker’s e-mail address. The attacker can then use the website’s password reset feature to e-mail herself a new password and she’s just successfully stolen the account. Any action that a user is allowed to perform while logged in to a website, an attacker can perform on his/her behalf, whether it’s updating a profile, adding items to a shopping cart, posting messages on a forum, or practically anything else.

If you’ve never heard of CSRF before or you haven’t written your code with prevention in mind, then I hate to break it to you but more than likely you’re vulnerable. In this guide I will show you exactly how CSRF attacks work and what you can do to protect your users.

How It Works

To understand how a CSRF attack works, it’s best to see it in action. To illustrate an attack, I’d like to create a simple example that can log a user out of an active session. We will need a login page (login.php), a processing script to handle logging in and out of the session (process.php), and finally an example attack (harmless.html).

First, here’s the code for login.php:

<?php
session_start();
?>
<html>
 <body>
<?php
if (isset($_SESSION["user"])) {
    echo "<p>Welcome back, " . $_SESSION["user"] . "!<br>";
    echo '<a href="process.php?action=logout">Logout</a></p>';
}
else {
?>
  <form action="process.php?action=login" method="post">
   <p>The username is: admin</p>
   <input type="text" name="user" size="20">
   <p>The password is: test</p>
   <input type="password" name="pass" size="20">
   <input type="submit" value="Login">
  </form>
<?php
}
?>
 </body>
</html>

The login.php script begins by initializing the session data. It then checks to see if $_SESSION["user"] has been set, and if so displays a welcome message along with a link to logout. Otherwise it displays the login form.

This is the processing script, process.php:

<?php
session_start();

switch($_GET["action"]) {
    case "login":
        if ($_SERVER["REQUEST_METHOD"] == "POST") {
            $user = (isset($_POST["user"]) &&
                ctype_alnum($_POST["user"])) ? $_POST["user"] : null;
            $pass = (isset($_POST["pass"])) ? $_POST["pass"] : null;
            $salt = '$2a$07$my.s3cr3t.SalTY.str1nG$';

            if (isset($user, $pass) && (crypt($user . $pass, $salt) ==
                crypt("admintest", $salt))) {
                $_SESSION["user"] = $_POST["user"];
            }
        }
        break;

    case "logout":
        $_SESSION = array();
        session_destroy();
        break;
}

header("Location: login.php");
?>

The process.php script also begins by initializing the session data, and then checks to see if there is an action to work with. We perform some basic input validation using PHP’s ternary operator along with the ctype_alnum() and crypt() functions, and then set or destroy the session variable accordingly. The user is redirected back to login.php at the end of the script.

Now let’s focus on the file an attacker might create to exploit the code in our previous examples. This is the exploit code, harmless.html:

<html>
 <body>
  <p>This page is harmless... Or is it?</p>
  <!-- Address to target website -->
  <img src="process.php?action=logout" style="display: none;">
 </body>
</html>

If you visit login.php and log in to your account, and then while logged in you proceed to visit the attacker’s page, you will be automatically logged out even though you didn’t click the logout link. The browser sends a request to the server to access the process.php script, expecting it to be an image file. The processing script has no way of differentiating between a valid request initiated by a user clicking on the logout link and a cleverly-crafted request the browser was tricked into sending.

The harmless.html file could be hosted on an entirely different server than the one you’re logged into (the image address would simply be the full URL of the target site’s process.php), and it would still work. Browsers have traditionally attached your session cookie to every request sent to that site, no matter which page triggered it, so the attacker’s page is effectively making a request on your behalf using the session you have open in the background. It doesn’t even matter if the website you’re logged into is on a private network: the request will be submitted from your IP address as if you made it yourself, making a trace to the source of the attack nearly impossible.

Additionally, if you allow your users to link to images as a profile avatar or the like, without proper escaping and sanitizing of the user supplied data the attack may even be possible within your own web domain.

While logging someone out of a website isn’t that impressive, harmless.html could have just as easily contained a hidden inline frame (as opposed to an image tag) with a form that automatically submits when the page is loaded, which would make any of the attacks mentioned at the beginning of this guide fair game.

Hopefully now you understand just how serious CSRF attacks can be, so let’s take a look at how you can prevent them.

Protecting Your Users

In order to ensure that an action is actually being performed by the user rather than a third party, you need to associate it with some sort of unique identifier which can then be verified. To prevent the attack, we can modify login.php as follows:

<?php
// generate an unpredictable token and store it in the session (random_bytes() requires PHP 7+)
$_SESSION["token"] = bin2hex(random_bytes(32));
echo '<a href="process.php?action=logout&csrf=' . $_SESSION["token"] . '">Logout</a></p>';

Then to verify the identifier, we can modify process.php as follows:

case "logout":
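    // compare in constant time, and only when both values are present
    if (isset($_GET["csrf"], $_SESSION["token"]) && is_string($_GET["csrf"])
        && hash_equals($_SESSION["token"], $_GET["csrf"])) {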
        $_SESSION = array();
        session_destroy();
    }
    break;

With these simple modifications, harmless.html will no longer work because the attacker now has the additional task of guessing a random token value. To protect forms, you can also include the identifier in a hidden field, as follows, so it is submitted along with the rest of the form data.

<input type="hidden" name="csrf" value="<?php echo $_SESSION["token"]; ?>">
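
When the form is submitted with POST, the identifier arrives in $_POST rather than $_GET, so the check in process.php changes accordingly. The following is only a minimal sketch, assuming PHP 5.6 or later for hash_equals(); the helper name csrf_token_is_valid() is hypothetical and not part of the scripts above. It takes plain arrays so that it can be tried without a live session.

```php
<?php
/**
 * Hypothetical helper: returns true only when the submitted token matches
 * the one stored in the session.
 */
function csrf_token_is_valid(array $session, array $post)
{
    // Both values must exist and be strings before they are compared.
    if (!isset($session["token"], $post["csrf"])
        || !is_string($session["token"])
        || !is_string($post["csrf"])) {
        return false;
    }
    // hash_equals() compares in constant time, which avoids timing leaks.
    return hash_equals($session["token"], $post["csrf"]);
}

// Example: a matching token passes, a wrong or missing one does not.
$session = array("token" => "abc123");
var_dump(csrf_token_is_valid($session, array("csrf" => "abc123"))); // bool(true)
var_dump(csrf_token_is_valid($session, array("csrf" => "guess")));  // bool(false)
var_dump(csrf_token_is_valid($session, array()));                   // bool(false)
```

In process.php this would be called as csrf_token_is_valid($_SESSION, $_POST) before the form data is acted on.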

In my own opinion, despite the best-intentioned harassing of my esteemed friends and colleagues, I prefer to use PHP’s session_id() rather than generating a random token since I’m not particularly fond of the “security through obscurity” approach. In addition to using session_id(), I also use session_regenerate_id() whenever logging in or updating sensitive information in order to mitigate the risk of session fixation attacks, and I never append the id to a URL that will be stored in the browser’s history. Exposing the session id more than necessary is never a good idea, but so long as you’re careful I think it’s a more elegant approach. Of course, if your website uses some type of authentication that doesn’t use sessions, then you’d need to generate your own id anyway.
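
The session-based variant described above might look roughly like the following. This is a sketch under stated assumptions rather than a definitive implementation: the helper names complete_login(), csrf_field() and csrf_matches_session() are hypothetical, and hash_equals() requires PHP 5.6 or later.

```php
<?php
session_start();

// Hypothetical helper: call once the credentials have been verified.
function complete_login($user)
{
    // Issue a fresh session id and discard the old one, so an id planted
    // before login (session fixation) stops being valid.
    session_regenerate_id(true);
    $_SESSION["user"] = $user;
}

// Hypothetical helper: hidden field carrying the session id as the identifier.
// It belongs in POST forms only, never in a URL.
function csrf_field()
{
    return '<input type="hidden" name="csrf" value="'
        . htmlspecialchars(session_id(), ENT_QUOTES, "UTF-8") . '">';
}

// Hypothetical helper: compare the submitted value with the current session id.
function csrf_matches_session(array $post)
{
    return isset($post["csrf"]) && is_string($post["csrf"])
        && hash_equals(session_id(), $post["csrf"]);
}
```

A form would print csrf_field() among its inputs, and process.php would call csrf_matches_session($_POST) before acting on the request.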

Conclusion

By now you should understand the basic principles underlying a CSRF attack and what steps you can take to protect your site and your users. As Ben Franklin said, “an ounce of prevention is worth a pound of cure.” I’m sure all of us would rather take the time to make sure the code we write is secure than deal with the stress, headaches and possible lawsuits surrounding a compromise.

Image via Blazej Lyjak / Shutterstock

  • http://www.5jc.net Huarong

    Could you give an example of this case :
    ‘ In the background, his profile information is silently updated with an attacker’s e-mail address.’ ?

    I am much more interested in this case

    • http://www.psinas.com Martin Psinas

      <html>
       <head>
        <script language="javascript">
        <!--
        function autoSubmit() {
          document.myform.submit();
        }
        //-->
        </script>
       </head>
       <body onload="autoSubmit();">
        <p>This page is harmless... Or is it?</p>
        <form name="myform" action="..." method="post" target="myiframe">
         <input type="hidden" name="email" value="attacker@example.com" />
        </form>
        <iframe name="myiframe" style="display: none;"></iframe>
       </body>
      </html>
      • Dave

        Wouldn’t this attack only work if you don’t require password re-confirmation when a user changes their email address?

        • http://www.psinas.com Martin Psinas

          This particular example would not work if you’re asking the user to verify their password prior to updating their email address, correct.

  • Benoit

    Thanks for this tutorial. There is a little error : “and crpyt() functions” => ” and crypt() functions” :)

    • http://zaemis.blogspot.com Timothy Boronczyk

      Thanks Benoit for pointing that out. I’ve fixed the typo in the post.

  • Helen Natasha Moore (@helennatasha)

    Really useful tutorial. I would love to see more security tutorials. Thanks so much.

  • Paul Landerman

    Thanks for the article. I can’t get my head around one thing. You say “If you visit login.php and log in to your account, and then while logged in you proceed to visit the attacker’s page.” If my login.php is on my server (server A), it doesn’t sound like the attacker’s page is on server A (maybe this is where I’m lost). Here is my understanding, which is obviously wrong, so hopefully you can clear my thinking: I log into my login page, let’s say the url is http://www.mysite.com/login.php, which runs on my server A. While I’m on my site I decide to go to another site, the attacker’s site (www.attacker.com/harmless.html). In the harmless.html it has <img src="process.php?action=logout">. Hitting the process.php wouldn’t go to server A, so how does the attack occur?

    • Brendan

      The attacker would just change img src="process.php?action=logout" to img src="http://www.mysite.com/process.php?action=logout" – it doesn’t matter that the two pages are on two different servers / domains.

      • http://www.psinas.com Martin Psinas

        This is correct.

    • JAson

      I was thinking the same thing. The only way I can see this actually happening is if the attacker has already surreptitiously gained access to your webspace.

      • http://www.psinas.com Martin Psinas

        This is incorrect.

  • ThePostie

    Why use $_GET instead of $_POST? Wouldn’t $_POST data negate the issues you have with $_GET data?

    • http://www.psinas.com Martin Psinas

      No, it wouldn’t make any difference if I’m using _GET or _POST.

    • http://emokemi.com frostymarvelous

      As demonstrated above, post data can easily be sent using the hidden iframe method.

  • http://onespirit.wetpaint.com/ Anil G

    I believe you’ve misused the term “security through obscurity”.
    This term refers to the reliance on keeping the code or algorithm secret, rather than exchanging a secret token that is different per user and mathematically improbable to obtain.
    Providing a long random key is precisely not security through obscurity.
    http://users.softlab.ntua.gr/~taver/security/secur3.html

    • http://www.psinas.com Martin Psinas

      Semantics aside, the entire point of the CSRF token is to create a unique identifier to be associated with a specific user which is then stored in the session. The session id already does that, therefore the only reason to create a unique token yourself is to prevent the session id from being inadvertently exposed; hence, obscure.

  • http://gilbert.im/ Gilberto Ramos

    Thanks Martin! Good tutorial

  • http://sentinel-soft.com James

    For those looking up CSRF for the first time, there are a few problems with this method. If the user opens another form that uses this method, the previous form becomes invalid. It can be quite frustrating for some, as the page would need to be reloaded without the POST data being sent for a new token to be generated. There are other ways of preventing some of the attack examples you have given in these comments. You can eliminate the threat from iframes by sending an X-Frame-Options: SAMEORIGIN header from your pages; that will prevent most modern browsers from loading them in an iframe from an external site, and it is a better version of the JS alternative. Using a form to log out is overkill; the image linking to a logout page can be prevented using link jacking prevention techniques.

    • http://www.clearsitecreative.com/ Rhett Waldock

      @James: The problem scenarios you describe strike me as implementation problems that wouldn’t apply to well-considered CSRF prevention code. For example, using the Session ID instead of a single-use token (as the author suggests) is one way to avoid the problem of invalid tokens when working with multiple forms (the session ID provides a token that’s valid for all forms for the length of the session). Sticking with the single-use token scheme, it’s also relatively easy to store multiple tokens in the session and keep them valid for their original form, even as additional forms are loaded. If you know your users’ usage patterns, and plan accordingly, there’s usually no reason that effective CSRF inspection can’t be implemented in a non-disruptive way.

      Relying on user agent features for security (like expected behavior in response to the X-Frame-Options header or cross-linking prevention techniques that rely heavily on the HTTP Referer header) is a dangerous proposition. Having CSRF awareness built directly into your application is a much better bet!

  • Andrew Forth

    Nice post, Martin. Thank You! Would you mind explaining how the approach using session_id() differs from generating your own unique token? Surely it’s not safe to embed the session id in any markup like you do when you generate your own unique token, so I’m confused as to how the processing code is able to validate the request when using the session id. Seems to me like it would always validate since it’s the logged in user performing all the actions. Clearly, I’m missing some piece of the puzzle.

    • http://www.psinas.com Martin Psinas

      Andrew, there is no difference, really. When a CSRF attack occurs, the logged in user is performing the actions of a forged request, and that forged request is based on public html. The token, whether generated or a session_id, is not likely to be forged by an attacker as it is unique to each user (the attacker can only see his/her own session id). In either case, extra caution should be taken when using the session id as the more it is exposed, the more likely it is an attacker might get a hold of it. I hope that makes more sense.

      • http://onespirit.wetpaint.com/ Anil G

        I think you’re missing something here, Martin. My (admittedly vague) understanding is that the whole point of a CSRF token is that it’s needed *in addition* to the session ID. The CSRF token is an additional check for the web server to be able to identify attacks that DO have a valid session ID but have NOT been able to replicate the CSRF.

        • http://www.psinas.com Martin Psinas

          The token represents a unique string which is tied to a specific action, that string is then validated before the action can be processed. The session ID is a unique string. Use the example provided in the article, replace the random token with the session ID and you will see that the attack is still prevented.

          The keyword in CSRF is “forgery.” The attacker is attempting to forge an action on your website, and they cannot forge another user’s session ID unless you’re exposing that ID improperly.

          • http://onespirit.wetpaint.com/ AnilG

            I thought that the difference is that the session ID is probably cached in a cookie and the attacker actually co-opts the session ID without actually knowing what it is. This is done because the attacker manages to inject HTML in the user’s web page unexpectedly, perhaps by crafting a response on a public forum. (Of course forums try to prevent this but that’s the difficult thing that they don’t always achieve).

            So the attack HTML appears in the user’s page, and when the user is tricked into submitting it, or perhaps an automatic submission is triggered on roll over or page load, the attack is submitted with the Session ID, because it’s submitted with the cookie.

            That’s why the CSRF works. The CSRF is not cached in the browser and does not automatically get submitted with every request. The attacking form needs to replicate a CSRF that it cannot possibly know (although I can conceive of a JavaScript attack that would search the page for valid CSRF tokens to use before submitting the attack).

            The new feature in HTML 5 where form fields don’t have to be inside the form tags in the HTML structure is possibly then an example of how these attacks may be made easier.

            I know there’s been a lot of discussion on this, but I think it’s a pivotal point. Why do web developers use CSRF tokens if a session ID is sufficient?

          • http://www.psinas.com Martin Psinas

            What you’re describing with the forum sounds more like a CSRF attack ON TOP OF an XSS vulnerability, which doesn’t make much sense.

            1. You’re assuming that the session ID stored in the cookie is enough to validate the request when that isn’t the case; the ID must be explicitly included with the request.
            2. If the attacker has access to your session ID then they might as well just hijack your session instead of wasting their time trying to dupe you into a forged request.
            3. If they are for some silly reason using a CSRF attack on top of an XSS vulnerability, then it isn’t going to make a difference if you’re using a session ID or not because the ID would be available to them either way.

            Why do web developers use CSRF tokens if a session ID is sufficient?

            Because some developers find it easier to keep CSRF related issues separate from session related issues, and because using a random token is the recommended way to do it.

  • http://onespirit.wetpaint.com/ AnilG

    Thanks Martin, I think I need to do some more work on this to clarify my own understanding. I hope I can get an opportunity.

    However, on your comments:

    1. Thanks, that makes sense. Now I know why to include the session ID as a parameter required in all requests.

    2. Attacker does not have access to the session ID, but by making a forged request they can submit their attack from the user and the forged request carries the session ID, which they do not otherwise have access to.

    3. I thought CSRF tokens were implemented to close XSS vulnerabilities. The session ID alone does not prevent the XSS. Thus the app has a session ID anyway and is vulnerable despite the session ID to XSS and the CSRF token is added to close the gap. A CSRF attack is one step further, to defeat the one step further hardening that the CSRF token provides on top of the session ID. Apps already use Session IDS for normal request handling housekeeping and therefore do not drop the session ID implementation when they add the CSRF tokens to close the XSS vulnerability that’s there despite the session ID.

    I thought session IDs and CSRF tokens are both normally random tokens. It’s just that if your session ID is vulnerable to an XSS attack you need to close that hole by adding a CSRF token.

    • http://www.psinas.com Martin Psinas

      CSRF and XSS are two entirely different types of attacks.

      • http://onespirit.wetpaint.com/ AnilG

        Thanks for your replies, Martin. Some time later I’ve now had an opportunity to re-read your article. I realise I did not have full clarity about what you were saying. You are simply using the session ID as the number for the CSRF token. You’re still using a CSRF token, but it’s the same string as the session ID.

        You also mention that XSS is a different attack to CSRF. I’d like to know more about that. Can you indicate a good source for further info?

        I’d also like to ask about the vulnerability of the CSRF token to Javascript search. Could an attacker include Javascript in his attack code to scan other pages in the user’s browser, identify the victim site page, and then search the page for the right node containing the CSRF token? Would that be XSS now though?

        I presume the session ID, held in a cookie is not available, since the attack code comes from a different domain and the browser will therefore prevent visibility of the session cookie to the attacker. However, if you now store the session ID in the HTML of the page, is that not now available, using the (XSS?) method I mention, and therefore you have exposed the session ID unnecessarily (should have used a separate CSRF)?

        Thanks for your advice.

  • Saad Shaukat

    I want to ask whether it is good practice to block your files or resources from being accessed from a server other than yours. E.g. if harmless.html is sitting on another server, the request is coming from a server other than the one hosting your site, so if we block access from other domains, will this stop the CSRF attack?

    Thanks

    • http://www.psinas.com Martin Psinas

      Saad Shaukat: I don’t believe that will work, but you’re welcome to try it using the example provided and see what happens.

  • Tom

    Great tut Martin… I actually discovered that using the same method worked great at preventing CSRF. I actually encrypted/decrypted the session ID to create my key which works across the entire site so as not to have to re-validate on different forms.. works flawlessly ;)

  • MrBombastic

    Thanks Good Guy Martin :)

  • Gaurav

    Can I use a $_COOKIE instead of a $_SESSION? Is it still safe enough?

  • Anupy

    @Gaurav: Non-persistent cookies will be safer. Persistent cookies can be hacked from local drives.

  • Nickelback

    1. Somebody will send first request with file_get_contents(or other).
    2. He will parse and get the csrf code (mf8i234…)
    3. He will make second post request with this value and it will be valid?

    How to protect from this?

    • http://www.psinas.com Martin Psinas

      1. The request with file_get_contents won’t work, because the token you parse out of that response will be your own and not the target user’s.