Building Your Own URL Shortener

Alex Fraundorf

Most of us are familiar with seeing URLs like bit.ly or t.co on our Twitter or Facebook feeds. These are examples of shortened URLs, which are a short alias or pointer to a longer page link. For example, I can send you the shortened URL http://bit.ly/SaaYw5 that will forward you to a very long Google URL with search results on how to iron a shirt. It would be much easier to text the 20-character bit.ly URL to your son who is in college and preparing for his first big job interview. In this article you’ll learn how to create a fully functional URL shortener for your website that will work whether you use a front controller/framework or not. If you use a front controller, I’ll show you how to easily integrate this URL shortener without having to dig into the controller’s programming.

Answering Some Common Questions

So with bit.ly and many other URL shorteners like it out there and freely available, why should we bother building our own? Most of these shortening services even have an easy-to-use API so that we can programmatically generate a shortened URL and use it within our PHP scripts. The best reasons are convenience, aesthetics, and brand recognition. If, for example, your website has an application that creates a large number of reports, a very active blog, or a large photo album, there will be a lot of links. A URL shortener will allow you to programmatically create a clean, simple link that can be emailed to your readers or published on your website. The obvious advantage to having your own is that your readers have instant brand recognition with your website.

You may wonder why you always see letters mixed with numbers in shortened URLs. By having more than ten options (0-9) per digit, we are able to have dramatically more combinations while keeping the code as short as possible. The characters we’ll be using are the digits 1-9 along with various upper/lowercase letters. I have removed all of the vowels to prevent links from being created that spell unintended bad words, and I have removed any characters that could be confused with each other. This gives us a list of 50 characters available for each digit, which means that with two characters we have 2,500 possible combinations, 125,000 possibilities with three characters, and a whopping 6.25 million combinations with just four characters!
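If you’d like to check the arithmetic yourself, here is a minimal sketch. The function name codesOfLength() is my own and not part of the ShortUrl class; it simply counts every possible string of a given length over the 50-character alphabet used later in this article.

```php
<?php
// Alphabet taken from the ShortUrl class shown later in this article.
$chars = "123456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ";
$base = strlen($chars); // 50 characters

// Hypothetical helper: number of distinct strings that are exactly
// $length characters long when each position has $base options.
function codesOfLength(int $base, int $length): int {
    return $base ** $length;
}

foreach ([2, 3, 4] as $length) {
    printf("%d characters: %s combinations\n", $length,
        number_format(codesOfLength($base, $length)));
}
```

Swapping in a different alphabet only changes $base, so you can see quickly how much each extra character buys you.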

Planning the Database

Let’s set up the short_urls table. It’s a simple table and the create statement is found below:
CREATE TABLE IF NOT EXISTS short_urls (
  id INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
  long_url VARCHAR(255) NOT NULL,
  short_code VARBINARY(6) DEFAULT NULL, -- filled in after the row ID is known
  date_created INTEGER UNSIGNED NOT NULL,
  counter INTEGER UNSIGNED NOT NULL DEFAULT '0',

  PRIMARY KEY (id),
  KEY short_code (short_code)
)
ENGINE=InnoDB;
We have our standard auto-incrementing primary key and fields for the full URL, the shortened code for the URL (indexed for faster retrieval), a timestamp when the row was created, and the number of times the short URL has been accessed. Note that the long_url field has a maximum length of 255 characters, which should be sufficient for most applications. If you need to store longer URLs then you’ll need to change its definition to TEXT. Now on to the PHP!

Creating a URL Short Code

The code to create and decode short URL codes will be in a class named ShortUrl. First, let’s look at the code responsible for creating the short codes:
<?php
class ShortUrl
{
    protected static $chars = "123456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ";
    protected static $table = "short_urls";
    protected static $checkUrlExists = true;

    protected $pdo;
    protected $timestamp;

    public function __construct(PDO $pdo) {
        $this->pdo = $pdo;
        $this->timestamp = $_SERVER["REQUEST_TIME"];
    }

    public function urlToShortCode($url) {
        if (empty($url)) {
            throw new Exception("No URL was supplied.");
        }

        if ($this->validateUrlFormat($url) == false) {
            throw new Exception(
                "URL does not have a valid format.");
        }

        if (self::$checkUrlExists) {
            if (!$this->verifyUrlExists($url)) {
                throw new Exception(
                    "URL does not appear to exist.");
            }
        }

        $shortCode = $this->urlExistsInDb($url);
        if ($shortCode == false) {
            $shortCode = $this->createShortCode($url);
        }

        return $shortCode;
    }

    protected function validateUrlFormat($url) {
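        // FILTER_VALIDATE_URL already requires a scheme and host; the
        // FILTER_FLAG_HOST_REQUIRED flag was removed in PHP 8.
        return filter_var($url, FILTER_VALIDATE_URL) !== false;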
    }

    protected function verifyUrlExists($url) {
        $ch = curl_init();
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_NOBODY, true);
        curl_setopt($ch,  CURLOPT_RETURNTRANSFER, true);
        curl_exec($ch);
        $response = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);

        return (!empty($response) && $response != 404);
    }

    protected function urlExistsInDb($url) {
        $query = "SELECT short_code FROM " . self::$table .
            " WHERE long_url = :long_url LIMIT 1";
        $stmt = $this->pdo->prepare($query);
        $params = array(
            "long_url" => $url
        );
        $stmt->execute($params);

        $result = $stmt->fetch();
        return (empty($result)) ? false : $result["short_code"];
    }

    protected function createShortCode($url) {
        $id = $this->insertUrlInDb($url);
        $shortCode = $this->convertIntToShortCode($id);
        $this->insertShortCodeInDb($id, $shortCode);
        return $shortCode;
    }

    protected function insertUrlInDb($url) {
        $query = "INSERT INTO " . self::$table .
            " (long_url, date_created) " .
            " VALUES (:long_url, :timestamp)";
        $stmnt = $this->pdo->prepare($query);
        $params = array(
            "long_url" => $url,
            "timestamp" => $this->timestamp
        );
        $stmnt->execute($params);

        return $this->pdo->lastInsertId();
    }

    protected function convertIntToShortCode($id) {
        $id = intval($id);
        if ($id < 1) {
            throw new Exception(
                "The ID is not a valid integer");
        }

        $length = strlen(self::$chars);
        // make sure length of available characters is at
        // least a reasonable minimum - there should be at
        // least 10 characters
        if ($length < 10) {
            throw new Exception("Length of chars is too small");
        }

        $code = "";
        while ($id > $length - 1) {
            // look up the character for the lowest remaining
            // base-$length digit and prepend it to the code
            $code = self::$chars[$id % $length] .
                $code;
            // reset $id to the remaining value to be converted,
            // keeping it an integer so it stays a valid string offset
            $id = (int) floor($id / $length);
        }

        // remaining value of $id is less than the length of
        // self::$chars
        $code = self::$chars[$id] . $code;

        return $code;
    }

    protected function insertShortCodeInDb($id, $code) {
        if ($id == null || $code == null) {
            throw new Exception("Input parameter(s) invalid.");
        }
        $query = "UPDATE " . self::$table .
            " SET short_code = :short_code WHERE id = :id";
        $stmnt = $this->pdo->prepare($query);
        $params = array(
            "short_code" => $code,
            "id" => $id
        );
        $stmnt->execute($params);

        if ($stmnt->rowCount() < 1) {
            throw new Exception(
                "Row was not updated with short code.");
        }

        return true;
    }
...
When we instantiate our ShortUrl class, we’ll pass it our PDO object instance. The constructor stores this reference and sets the $timestamp member.

We call the urlToShortCode() method, passing it the long URL that we wish to shorten. The method wraps up everything needed to create the short URL code, which we will append to our domain name. urlToShortCode() calls validateUrlFormat(), which simply uses a PHP filter to make sure that the URL is properly formatted. Then, if the static variable $checkUrlExists is true, verifyUrlExists() will be called, which uses cURL to contact the URL and make sure that it doesn’t return a 404 (Not Found) error. You could alternatively check for a 200 (OK) status, but this could cause issues if the page were to unexpectedly return a 301 (Moved) or 401 (Unauthorized) response code.

It doesn’t make sense to have duplicate entries, so the code checks for that with urlExistsInDb(), which queries the database for the long URL. If it finds the URL, it will return the corresponding short code; otherwise it returns false so we know we need to create it. Note that http://www.example.com and http://example.com are different URLs, so if you want to prevent this kind of duplication then you will have to normalize URLs (for example, with some regular expressions) before looking them up.

createShortCode() delegates the following tasks to specific methods:
  1. insertUrlInDb() to insert the long URL into the database and return the new row’s ID.
  2. convertIntToShortCode() to convert the new row’s ID to our base-50 number scheme.
  3. insertShortCodeInDb() to update the row with the newly created short code.
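To make the second step more concrete, here is a standalone sketch of the ID-to-code conversion using the same alphabet. The function names intToShortCode() and shortCodeToInt() are my own; the inverse function is not part of the ShortUrl class and is included only to show that the conversion round-trips.

```php
<?php
// Convert a positive row ID to a short code, following the same steps
// as convertIntToShortCode() but with integer arithmetic throughout.
function intToShortCode(int $id): string {
    $chars = "123456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ";
    if ($id < 1) {
        throw new InvalidArgumentException("ID must be a positive integer.");
    }
    $base = strlen($chars);
    $code = "";
    while ($id > $base - 1) {
        // lowest remaining digit becomes the next character on the left
        $code = $chars[$id % $base] . $code;
        $id = intdiv($id, $base);
    }
    return $chars[$id] . $code;
}

// Hypothetical inverse: turn a short code back into the row ID.
function shortCodeToInt(string $code): int {
    $chars = "123456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ";
    $base = strlen($chars);
    $id = 0;
    foreach (str_split($code) as $char) {
        $position = strpos($chars, $char);
        if ($position === false) {
            throw new InvalidArgumentException("Invalid character in code.");
        }
        $id = $id * $base + $position;
    }
    return $id;
}

// intToShortCode(1)    returns "2"
// intToShortCode(49)   returns "Z"
// intToShortCode(50)   returns "21"
// intToShortCode(2500) returns "211"
```

Notice that the first character of the alphabet, "1", plays the role of zero, which is why ID 50 becomes "21" rather than "11".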
When we want to create a short URL, all we have to do is instantiate the class, passing a PDO instance to the constructor, call the urlToShortCode() method with the long URL we wish to shorten, and append the returned short code to the domain and pass it back to the controller that requested it.
<?php
include "../include/config.php";
include "../include/ShortUrl.php";

try {
    $pdo = new PDO(DB_PDODRIVER . ":host=" . DB_HOST .
        ";dbname=" . DB_DATABASE,
        DB_USERNAME, DB_PASSWORD);
}
catch (PDOException $e) {
    trigger_error("Error: Failed to establish connection to database.");
    exit;
}

$shortUrl = new ShortUrl($pdo);
try {
    $code = $shortUrl->urlToShortCode($_POST["url"]);
    printf('<p><strong>Short URL:</strong> <a href="%s">%1$s</a></p>',
        SHORTURL_PREFIX . $code);
    exit;
}
catch (Exception $e) {
    // log exception and then redirect to error page.
    header("Location: /error");
    exit;
}
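As mentioned above, http://www.example.com and http://example.com are treated as different URLs. If that bothers you, one option is to normalize each URL before passing it to urlToShortCode(). The sketch below is only one possible approach: normalizeUrl() is a hypothetical helper, it assumes the www and non-www hosts serve the same site, and it drops any user name or password embedded in the URL.

```php
<?php
// Hypothetical helper: reduce equivalent URLs to one canonical form
// before they are looked up or stored.
function normalizeUrl(string $url): string {
    $parts = parse_url(trim($url));
    if ($parts === false || empty($parts["scheme"]) || empty($parts["host"])) {
        throw new InvalidArgumentException("URL could not be parsed.");
    }

    // Scheme and host are case-insensitive, so lowercase them.
    $scheme = strtolower($parts["scheme"]);
    $host = strtolower($parts["host"]);

    // Assumption: www.example.com and example.com are the same site.
    if (strpos($host, "www.") === 0) {
        $host = substr($host, 4);
    }

    $normalized = $scheme . "://" . $host;
    if (isset($parts["port"])) {
        $normalized .= ":" . $parts["port"];
    }
    // An empty path and "/" point at the same resource.
    $normalized .= $parts["path"] ?? "/";
    if (isset($parts["query"])) {
        $normalized .= "?" . $parts["query"];
    }
    if (isset($parts["fragment"])) {
        $normalized .= "#" . $parts["fragment"];
    }

    return $normalized;
}
```

With this in place you would call something like $shortUrl->urlToShortCode(normalizeUrl($_POST["url"])) so that both spellings of the address end up sharing one short code.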

Decoding a Short Code

The code to decode a short code and obtain the long URL is part of the ShortUrl class too. We call the shortCodeToUrl() method and pass it the short code we have extracted from the URI. shortCodeToUrl() also accepts an optional parameter, $increment, which defaults to true. It then delegates the following:
  1. validateShortCode() makes sure that the provided short code only contains characters from our allowed set of letters and numbers.
  2. getUrlFromDb() queries the database for the supplied short code and returns the record’s id and long_url fields.
  3. If the $increment parameter is true, incrementCounter() is called to increment the row’s counter field.
Here’s the rest of the class:
...
    public function shortCodeToUrl($code, $increment = true) {
        if (empty($code)) {
            throw new Exception("No short code was supplied.");
        }

        if ($this->validateShortCode($code) == false) {
            throw new Exception(
                "Short code does not have a valid format.");
        }

        $urlRow = $this->getUrlFromDb($code);
        if (empty($urlRow)) {
            throw new Exception(
                "Short code does not appear to exist.");
        }

        if ($increment == true) {
            $this->incrementCounter($urlRow["id"]);
        }

        return $urlRow["long_url"];
    }

    protected function validateShortCode($code) {
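        // anchor the pattern so the whole code must match, not just
        // one character somewhere inside it
        return preg_match('|\A[' . self::$chars . ']+\z|', $code);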
    }

    protected function getUrlFromDb($code) {
        $query = "SELECT id, long_url FROM " . self::$table .
            " WHERE short_code = :short_code LIMIT 1";
        $stmt = $this->pdo->prepare($query);
        $params=array(
            "short_code" => $code
        );
        $stmt->execute($params);

        $result = $stmt->fetch();
        return (empty($result)) ? false : $result;
    }

    protected function incrementCounter($id) {
        $query = "UPDATE " . self::$table .
            " SET counter = counter + 1 WHERE id = :id";
        $stmt = $this->pdo->prepare($query);
        $params = array(
            "id" => $id
        );
        $stmt->execute($params);
    }
}
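A format check like the one in validateShortCode() only rejects bad input if the pattern has to match the whole string. Here is a minimal standalone sketch of that idea; isValidShortCode() is a hypothetical function name, and the alphabet is passed in as a parameter purely for illustration.

```php
<?php
// Returns true only when every character of $code is in $chars.
function isValidShortCode(
    string $code,
    string $chars = "123456789bcdfghjkmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ"
): bool {
    // \A and \z anchor the match to the entire string. preg_quote()
    // guards against alphabets that contain regex metacharacters.
    $pattern = "/\\A[" . preg_quote($chars, "/") . "]+\\z/";
    return preg_match($pattern, $code) === 1;
}

// isValidShortCode("X4c")  returns true
// isValidShortCode("X4c!") returns false ("!" is not in the alphabet)
// isValidShortCode("a1")   returns false (vowels were removed)
```

Without the anchors, a string like "a1" would pass because the pattern would be satisfied by the single "1" inside it.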

Bringing It All Together

Building/altering a front controller or tailoring this package to an existing framework are outside the scope of this article, and so I’ve opted to contain our decoding logic in a file named r.php (r standing for redirect). We can write our shortened URLs as http://example.com/r/X4c where r.php (or r/index.php depending on your design) will be the controller. This format will be easy to integrate into just about any framework without touching the existing front controller. On a related note, if you would like to learn how to build your own front controllers, check out the excellent series An Introduction to the Front Controller Pattern.

One advantage of this design is that, if you wanted to, you could have a separate controller for different parts of your site using different tables to keep the short codes organized and as short as possible. http://example.com/b/ could be for blog posts, and http://example.com/i/ could be for images.

“But what if I don’t use a front controller or framework?” you ask, “Did I just read this whole article for nothing?” Although it’s not as pretty, you can use the format http://example.com/r?c=X4c where r/index.php contains the decoding script. Here’s what r.php looks like:
<?php
include "../include/config.php";
include "../include/ShortUrl.php";

// How are you getting your short code?

// from framework or front controller using a URL format like
// http://example.com/r/X4c
// $code = $uri_data[1];

// from the query string using a URL format like
// http://example.com/r?c=X4c where this file is index.php in the
// directory http_root/r/index.php
$code = isset($_GET["c"]) ? $_GET["c"] : "";

try {
    $pdo = new PDO(DB_PDODRIVER . ":host=" . DB_HOST .
        ";dbname=" . DB_DATABASE,
        DB_USERNAME, DB_PASSWORD);
}
catch (PDOException $e) {
    trigger_error("Error: Failed to establish connection to database.");
    exit;
}

$shortUrl = new ShortUrl($pdo);
try {
    $url = $shortUrl->shortCodeToUrl($code);
    header("Location: " . $url);
    exit;
}
catch (Exception $e) {
    // log exception and then redirect to error page.
    header("Location: /error");
    exit;
}
Depending on how you are getting the short code, the variable $code is set along with your other configuration settings. We establish our PDO connection, create an instance of ShortUrl, and call shortCodeToUrl(), passing it the short code and leaving the counter setting at its default value. If the short code is valid, you’ll have a long URL which you can redirect the user to.
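If you don’t have a framework handing you the URI segments, you can pull the short code out of the request URI yourself. The following is a rough sketch under a few assumptions: extractShortCode() is a hypothetical helper, the "r" prefix and "c" parameter follow the two URL formats shown above, and your web server is already routing both kinds of request to this script.

```php
<?php
// Hypothetical helper: pull the short code out of a request URI such
// as "/r/X4c" or "/r?c=X4c". Returns "" when no code is present.
function extractShortCode(string $requestUri, string $prefix = "r"): string {
    $path = (string) parse_url($requestUri, PHP_URL_PATH);
    $query = (string) parse_url($requestUri, PHP_URL_QUERY);

    // Query-string format: /r?c=X4c
    parse_str($query, $params);
    if (isset($params["c"]) && is_string($params["c"])) {
        return $params["c"];
    }

    // Path format: /r/X4c
    $segments = explode("/", trim($path, "/"));
    if (count($segments) === 2 && $segments[0] === $prefix) {
        return $segments[1];
    }

    return "";
}
```

In r.php you could then write $code = extractShortCode($_SERVER["REQUEST_URI"]); and let shortCodeToUrl() reject an empty or malformed value.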

In Closing

So there you have it, your very own URL shortener that is incredibly easy to add to your existing site. Of course, there are plenty of ways that this package could be improved, such as:
  • Abstract your database interaction to remove redundant code.
  • Add a way to cache shortened URL requests.
  • Add some analytics to the requested short URLs beyond the counter field.
  • Add a way to filter out malicious pages.
I’d like to take this opportunity to thank Timothy Boronczyk for his patient advice throughout my writing process. It was an honor to write this article for SitePoint and to work with him. Feel free to fork this article’s sample code on GitHub and share your contributions and improvements. Thanks for reading and happy PHPing!

Frequently Asked Questions (FAQs) on Building Your Own URL Shortener

What is a URL shortener and why would I need one?

A URL shortener is a tool that converts a regular URL into a condensed format, typically consisting of a random combination of letters and numbers. The primary reason for using a URL shortener is to make long URLs more manageable and shareable, especially on social media platforms where character count may be limited. It also helps in tracking and analyzing data associated with the URL, such as click-through rates or geographic information of visitors.

How secure is a self-built URL shortener?

The security of a self-built URL shortener largely depends on the coding practices followed during its development. It’s crucial to implement proper validation and sanitization of inputs to prevent SQL injection attacks. Additionally, using HTTPS for your URL shortener can help protect the data from being intercepted during transmission.

Can I customize the shortened URL?

Yes, most URL shorteners, including self-built ones, allow for customization of the shortened URL. This can be particularly useful for branding purposes or making the URL more meaningful to users.

How can I track the performance of my shortened URLs?

Many URL shorteners come with built-in analytics capabilities. You can track various metrics such as the number of clicks, the geographic location of the users, the referral sources, and more. This data can be invaluable for understanding your audience and improving your marketing strategies.

Can a self-built URL shortener handle high traffic?

The ability of a self-built URL shortener to handle high traffic depends on the efficiency of the underlying code and the server resources. It’s important to ensure that the shortener is built to handle potential high loads and that the server has enough capacity to manage the traffic.

What programming languages can I use to build a URL shortener?

You can use a variety of programming languages to build a URL shortener. The choice of language will depend on your comfort level and the specific requirements of your project. Some popular choices include PHP, Python, and JavaScript.

How can I prevent spam or misuse of my URL shortener?

There are several strategies to prevent spam or misuse of your URL shortener. These include implementing CAPTCHA tests, monitoring for suspicious activity, and even blacklisting certain IP addresses if necessary.

Can I make my URL shortener public for others to use?

Yes, you can make your URL shortener public for others to use. However, keep in mind that this will require additional considerations around security, scalability, and abuse prevention.

How long does it take to build a URL shortener?

The time it takes to build a URL shortener can vary greatly depending on your coding skills and the complexity of the project. A basic URL shortener can be built in a few hours, while a more complex one with additional features may take several days or even weeks.

Can I monetize my URL shortener?

Yes, it’s possible to monetize a URL shortener. Some common methods include displaying ads when the shortened URL is clicked, offering premium features for a fee, or using the shortener as part of a larger marketing or SEO service.