Taming the Snoo: Playing with the Reddit API

Claudio Ribeiro

Reddit is a social networking, entertainment, and news website where the content is almost exclusively submitted by users. According to this report, in February 2016 Reddit had 36 million user accounts, 231 million unique monthly visits, and 11,464 active communities. A recent study also showed that 80% of Reddit users get their news from there.

Reddit also offers its own API. This way, we can use all the information available on Reddit to enrich our own websites or build our own Reddit clients. In this article, we will tackle some basic Reddit API usage with PHP.

[Image: realistic Reddit alien painting]

The Reddit API

The Reddit API is extensive and very well documented, from private methods that are only accessible through authentication (Reddit uses OAuth2), to public methods that we can use with a basic HTTP call.

In this article, we’ll first focus on the search method. While this is a public call (it does not require authentication), it is also one of the most powerful ones, since it allows us to access all of the history of Reddit posts in every subreddit.

The search method

The search method is available through a basic HTTP request and has a lot of properties. Looking at the documentation, we can see that it supports the HTTP GET method and is available through

https://www.reddit.com[/r/subreddit]/search

We also have the following arguments available: after, before, count, include_facets, limit, q, restrict_sr, show, sort, sr_detail, syntax, t, and type. The table below can be found in the documentation, and shows every argument with more detail.

Argument         Receives
after            full name of a thing
before           full name of a thing
count            a positive integer (default: 0)
include_facets   boolean value
limit            the maximum number of items desired (default: 25, maximum: 100)
q                a string no longer than 512 characters
restrict_sr      boolean value
show             (optional) the string all
sort             one of (relevance, hot, top, new, comments)
sr_detail        (optional) expand subreddits
syntax           one of (cloudsearch, lucene, plain)
t                one of (hour, day, week, month, year, all)
type             (optional) comma-delimited list of result types (sr, link)

We will focus on the q, limit, sort and restrict_sr arguments.

The q argument is the most important one and indicates the query for which we will search the subreddit in question. An example of usage would be:

https://www.reddit.com/r/php/search.json?q=oop

This particular call will search for the oop expression in the php subreddit. If you try to make the call using your browser, you will see the results (just copy and paste the link in your browser).

The limit argument limits the number of posts that the returned list will have. An example of usage would be:

https://www.reddit.com/r/php/search.json?q=oop&limit=5

This particular search would return the first 5 results of searching for the oop expression in the php subreddit.

The sort argument will sort the searched posts using one of the five Reddit order properties: hot, new, comments, relevance and top. So, if we want to search the php subreddit for newer oop posts we could make the following call:

https://www.reddit.com/r/php/search.json?q=oop&sort=new

The restrict_sr argument is a boolean value that indicates whether or not we want to restrict our search to the current subreddit. If we pass 0, we will be searching all of Reddit. In the call below we pass 1, so the search stays inside the php subreddit.

https://www.reddit.com/r/php/search.json?q=oop&sort=new&restrict_sr=1

All the properties can be combined to make more refined searches.
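As a rough illustration of combining these arguments, the sketch below assembles a search URL from its parts. The buildSearchUrl function is a hypothetical helper written for this example, not something the Reddit API provides; it only relies on PHP's built-in http_build_query and rawurlencode.

```php
<?php

/**
 * Hypothetical helper: builds a Reddit search URL from the four
 * arguments discussed above (q, sort, limit, restrict_sr).
 */
function buildSearchUrl(string $subreddit, string $query, string $sort = 'relevance', int $limit = 25, bool $restrictSr = true): string
{
    // The documentation gives 100 as the maximum for limit
    $limit = max(1, min(100, $limit));

    $params = [
        'q'           => $query,
        'sort'        => $sort,
        'limit'       => $limit,
        'restrict_sr' => $restrictSr ? 1 : 0,
    ];

    // http_build_query URL-encodes every value for us
    return 'https://www.reddit.com/r/' . rawurlencode($subreddit)
        . '/search.json?' . http_build_query($params, '', '&');
}

echo buildSearchUrl('php', 'oop', 'new', 5), PHP_EOL;
// prints https://www.reddit.com/r/php/search.json?q=oop&sort=new&limit=5&restrict_sr=1
```

Building the query string from an array rather than by hand means that search terms containing spaces or ampersands are encoded correctly.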

Adding PHP

Being able to call the API through our browser is fun, but we want something more powerful. We want to be able to parse and use the information we get in a lot of different ways.

For this example on using the Reddit API with PHP we could use cURL, but while having cURL skills can be useful, nowadays it is a rather outdated tool. We will use a library called Guzzle and install it with Composer.

composer require guzzlehttp/guzzle

For this part of the project, not only will we use Guzzle, we will also use the rest of the arguments we discussed earlier.

<?php

namespace ApiReddit;

use GuzzleHttp\Client;

class Searcher
{
    /**
     * This method queries the Reddit API for searches
     *
     * @param string $subreddit The subreddit to search
     * @param string $query     The term to search for
     * @param string $options   The sort order (relevance, hot, top, new, comments)
     * @param int    $results   The number of results to return
     *
     * @return string The raw JSON response body
     */
    public function execSearch($subreddit, $query, $options, $results = 10)
    {
        // Executes an HTTP request using Guzzle. TLS verification is left
        // at its default (enabled) rather than being switched off.
        $client = new Client([
            'headers' => ['User-Agent' => 'testing/1.0'],
        ]);

        // Passing the query as an array lets Guzzle URL-encode each value
        $response = $client->request('GET', 'https://www.reddit.com/r/' . $subreddit . '/search.json', [
            'query' => [
                'q'           => $query,
                'sort'        => $options,
                'restrict_sr' => 1,
                'limit'       => $results,
            ],
        ]);

        // Cast the response stream to a string so it can be passed to json_decode()
        return (string) $response->getBody();
    }
}

In this search, we added more arguments to further refine our search. Now we have subreddit, options, and results (which is set to 10 by default).

Next we will create an index.php file that will query the Reddit API. We will use Twig to render our view and show the results in a table. Then we will create a /templates folder in the root of our project. This folder will hold our Twig templates.

composer require twig/twig

The index.php file:

<?php
require __DIR__.'/vendor/autoload.php';

use ApiReddit\Searcher;
use Twig\Environment;
use Twig\Loader\FilesystemLoader;

//Executes a new search
$search = new Searcher();
$json = $search->execSearch('php', 'composer', 'hot', 5);
$data = json_decode((string) $json);

//Loads the results in Twig (namespaced class names, as used by current Twig releases)
$loader = new FilesystemLoader(__DIR__.'/templates');
$twig = new Environment($loader);

//Renders our view
echo $twig->render('index.twig', array(
    'results' => $data->data->children,
));

After loading Twig, we tell it where we store our templates and to render index.twig.
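To make the shape of the decoded response more concrete, here is a minimal sketch that walks the same data->children path that index.php hands to Twig. The JSON string is a heavily trimmed, made-up stand-in for a real search response, and extractTitles is a hypothetical helper used only for illustration.

```php
<?php

// Illustrative sample only: a real response carries many more fields per post
$json = '{"kind":"Listing","data":{"after":null,"children":['
    . '{"kind":"t3","data":{"title":"First post","num_comments":3}},'
    . '{"kind":"t3","data":{"title":"Second post","num_comments":0}}]}}';

/**
 * Hypothetical helper: returns the title of every post in a listing.
 */
function extractTitles(string $json): array
{
    $data = json_decode($json);

    // json_decode() returns null when the input is not valid JSON
    if ($data === null || !isset($data->data->children)) {
        return [];
    }

    $titles = [];
    foreach ($data->data->children as $child) {
        // Each child wraps the actual post in its own "data" property
        $titles[] = $child->data->title;
    }

    return $titles;
}

print_r(extractTitles($json));
```

Checking the decoded value before using it avoids a fatal error in index.php if the request returns something other than the expected listing.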

We also want to create a resources/style.css file to style our results. This file contains the following:

#posts td, #posts th {
    border: 1px solid #ddd;
    text-align: left;
    padding: 8px;
}

#posts th {
    padding-top: 11px;
    padding-bottom: 11px;
    background-color: #4CAF50;
    color: white;
}

Finally, we will create our template file keeping in mind both our results and the CSS. Inside the /templates folder, let’s create an index.twig file:

<!DOCTYPE html>

<html>
<head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>Reddit Search Table</title>

    <link rel="stylesheet" type="text/css" href="resources/style.css">
</head>
<body>

    <table id='posts'>
        <tr>
            <th>Title</th>
            <th>Ups</th>
            <th>Downs</th>
            <th>Created At</th>
            <th>Subreddit</th>
            <th>Number of Comments</th>
            <th>Link</th>
        </tr>

        {% for item in results %}

            <tr>
                <td>{{ item.data.title }}</td>
                <td>{{ item.data.ups }}</td>
                <td>{{ item.data.downs }}</td>
                <td>{{ item.data.created_utc }}</td>
                <td>{{ item.data.subreddit }}</td>
                <td>{{ item.data.num_comments }}</td>
                <td>{{ item.data.permalink }}</td>
            </tr>

        {% endfor %}

    </table>

</body>
</html>
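The template prints created_utc and permalink exactly as they arrive: the former is a Unix timestamp in seconds and the latter is a path relative to the Reddit domain. As a sketch, the two hypothetical helpers below show one way to prepare both values in PHP before handing them to the view; the permalink path in the usage lines is made up.

```php
<?php

/**
 * Hypothetical helper: turns a created_utc value into a readable UTC date.
 */
function formatCreated(float $createdUtc): string
{
    // gmdate() formats in UTC regardless of the server's time zone
    return gmdate('Y-m-d H:i', (int) $createdUtc) . ' UTC';
}

/**
 * Hypothetical helper: turns a relative permalink into a full URL.
 */
function absolutePermalink(string $permalink): string
{
    return 'https://www.reddit.com' . $permalink;
}

echo formatCreated(86400.0), PHP_EOL;
// prints 1970-01-02 00:00 UTC
echo absolutePermalink('/r/php/comments/example/'), PHP_EOL;
// prints https://www.reddit.com/r/php/comments/example/
```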

Our final result is here:

[Image: results table]

Adding authentication

While the search method can be very powerful, the Reddit API has dozens more features we can explore, but most of them require authentication. To be able to access all that functionality, we first need a Reddit account, so please make sure you have one before continuing.

After we have an account and before we are able to access the API, there’s some configuration work to be done. Log into your account, and in the top right corner you’ll see the “preferences” button. Click it and then go to the “Apps” tab, then click “Create”.

Fill in the information (and be sure to remember that the Redirect URI will have to be exactly the one we are going to use in our application). We can leave the about url blank.

After that, click Create App. This will give us access to a client ID and a secret token. We will be using this information to authenticate via OAuth2 to the API. For the OAuth2 authentication we will use a very well known package, adoy/oauth2. This package is a light PHP wrapper for the OAuth 2.0 protocol (based on OAuth 2.0 Authorization Protocol draft-ietf-oauth-v2-15).

composer require adoy/oauth2

Now that we have the tools for using OAuth2, let’s create a file called Oauth.php in the root of our project with the following content:

<?php

require_once './vendor/autoload.php';

use ApiReddit\Authenticator;

$authorizeUrl = 'https://www.reddit.com/api/v1/authorize';
$accessTokenUrl = 'https://www.reddit.com/api/v1/access_token';
$clientId = 'Your Client Id';
$clientSecret = 'Your Secret';
$userAgent = 'RedditApiClient/0.1 by YourName';

$redirectUrl =  "Your redirect url";

$auth = new Authenticator( $authorizeUrl, $accessTokenUrl, $clientId, $clientSecret, $userAgent, $redirectUrl );
$auth->authenticate();

We are creating an instance of Authenticator and calling the authenticate() method. The class is autoloaded through Composer, which assumes the ApiReddit namespace is mapped to our /src folder in composer.json (for example, with a PSR-4 autoload entry). For this to work, let’s create the Authenticator.php class file in our /src folder.

<?php

namespace ApiReddit;

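class Authenticator
{
    // Declared explicitly, since dynamic properties are deprecated as of PHP 8.2
    private $authorizeUrl;
    private $accessTokenUrl;
    private $clientId;
    private $clientSecret;
    private $userAgent;
    private $redirectUrl;

    public function __construct($authorizeUrl, $accessTokenUrl, $clientId, $clientSecret, $userAgent, $redirectUrl)
    {
        $this->authorizeUrl   = $authorizeUrl;
        $this->accessTokenUrl = $accessTokenUrl;
        $this->clientId       = $clientId;
        $this->clientSecret   = $clientSecret;
        $this->userAgent      = $userAgent;
        $this->redirectUrl    = $redirectUrl;
    }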

    public function authenticate()
    {

        $client = new \OAuth2\Client($this->clientId, $this->clientSecret, \OAuth2\Client::AUTH_TYPE_AUTHORIZATION_BASIC);
        $client->setCurlOption(CURLOPT_USERAGENT, $this->userAgent);

        if (!isset($_GET["code"])) {
            $authUrl = $client->getAuthenticationUrl($this->authorizeUrl, $this->redirectUrl, array(
                "scope" => "identity",
                "state" => "SomeUnguessableValue"
            ));
            header("Location: " . $authUrl);
            die("Redirect");
        } else {
            //$this->getUserPreferences($client, $this->accessTokenUrl);
        }
    }
}

In the Oauth.php file, we are initializing our project variables with all the data needed to authenticate through the API. Authenticator.php is where the magic happens.

We are creating a new OAuth2 client instance with our ID and secret using adoy. The rest is basic OAuth logic: we send the user to an authorization URL to perform the authentication, and we provide a redirect URL to which the user will be sent back afterwards.

One important thing to notice is the use of a scope in our call. The scope defines which parts of the API's functionality we want to have access to. In this case, we are using identity because, in this example, we want to fetch our own user information. All the possible actions and respective scopes are explained in the API documentation.
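The authenticate() method also sends a state value, which is there so that we can check that the redirect we receive really belongs to the request we started. The listing uses a fixed string for it. As a sketch of a stricter approach, the two hypothetical helpers below generate a random value and compare it safely; neither is part of adoy/oauth2, and both use only PHP built-ins.

```php
<?php

/**
 * Hypothetical helper: creates a random, hard-to-guess state value.
 */
function generateState(): string
{
    // 16 random bytes become a 32-character hexadecimal string
    return bin2hex(random_bytes(16));
}

/**
 * Hypothetical helper: checks the state that came back on the redirect
 * against the one we stored before sending the user to Reddit.
 */
function isValidState(?string $expected, ?string $received): bool
{
    if ($expected === null || $expected === '' || $received === null) {
        return false;
    }

    // hash_equals() compares in constant time
    return hash_equals($expected, $received);
}

$state = generateState();
var_dump(isValidState($state, $state));      // bool(true)
var_dump(isValidState($state, 'tampered'));  // bool(false)
```

In practice, the generated value could be kept in $_SESSION before the redirect and passed as the "state" option, and then compared with $_GET["state"] before the authorization code is exchanged for a token.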

Last but not least, we have a line that is commented. This was on purpose. The getUserPreferences method will make the call to the API method we want to use. So let’s uncomment that line, and implement the method below.

/**
* This function exchanges the authorization code for an access token
* and then requests the user information from the Reddit API
*
* @param \OAuth2\Client $client The OAuth2 client instance
* @param string $accessTokenUrl The URL of the access token endpoint
*/
public function getUserPreferences( $client, $accessTokenUrl )
{

    $params = array("code" => $_GET["code"], "redirect_uri" => $this->redirectUrl);
    $response = $client->getAccessToken($accessTokenUrl, "authorization_code", $params);

    $accessTokenResult = $response["result"];
    $client->setAccessToken($accessTokenResult["access_token"]);
    // The leading backslash is needed because we are inside the ApiReddit namespace
    $client->setAccessTokenType(\OAuth2\Client::ACCESS_TOKEN_BEARER);

    $response = $client->fetch("https://oauth.reddit.com/api/v1/me.json");

    echo('<strong>Response for fetch me.json:</strong><pre>');
    print_r($response);
    echo('</pre>');
}

We are obtaining the access token and using it in our call to fetch information from https://oauth.reddit.com/api/v1/me.json. This GET method will return the identity of the user currently authenticated. The response will be in JSON format.
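Instead of dumping the whole response with print_r, we may want to pull out a single value. The sketch below assumes that fetch() returns an array with 'result' and 'code' keys, in line with how the listing reads $response["result"] after getAccessToken(), and that the user data contains a name field. Both points are assumptions worth checking against a real response; extractUserName is a hypothetical helper and the sample array is made up.

```php
<?php

/**
 * Hypothetical helper: pulls the username out of a fetch() response.
 * Assumes an array with 'result' and 'code' keys.
 */
function extractUserName(array $response): ?string
{
    // Treat anything other than HTTP 200 as a failure
    if ((int) ($response['code'] ?? 0) !== 200) {
        return null;
    }

    $result = $response['result'] ?? null;

    // We expect decoded JSON here; anything else is treated as a failure
    if (!is_array($result) || !isset($result['name'])) {
        return null;
    }

    return (string) $result['name'];
}

// Illustrative stand-in for a real response; the values are made up
$sample = [
    'result'       => ['name' => 'example_user'],
    'code'         => 200,
    'content_type' => 'application/json',
];

echo extractUserName($sample) ?? 'request failed', PHP_EOL;
// prints example_user
```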

If we run it, we will be taken to a login page, as expected with OAuth. After logging in, we will be prompted with this page:

[Image: allow access prompt]

Just click allow, and if everything went right, we should see the JSON data containing all the information about the user we just authenticated with.

This is how we authenticate with the Reddit API. We can now take a deeper look at the documentation and check all the great things we can make with it.

All the code can be found in this GitHub repository.

Conclusion

We learned how to use the search functionality of the Reddit API and took a first look at authenticating and using the methods that require a logged-in user. But we have only scratched the surface.

The possibilities are huge because the Reddit API gives us access to an almost endless knowledge pool. Be sure to check the entire Reddit API documentation as it offers much more, and do let us know if you build anything interesting!