Efficient User Timelines in a PHP Application with Neo4j

Christophe Willemsen

Any social application you encounter nowadays features a timeline showing the statuses of your friends or followers, generally in descending order of time. Implementing such a feature has never been easy with common SQL or NoSQL databases.

Complex queries, performance that degrades as the number of friends or followers grows, and the difficulty of evolving your social model are all problems that graph databases help to eliminate.

In this tutorial, we’re going to extend the demo application used by the two introductory articles about Neo4j and PHP.

The application is built on Silex and has users following other users. The goal throughout this article will be to model feeds efficiently, so that we can retrieve the last two posts of the people you follow and order them by time.

You’ll discover a particular modeling technique called a linked list and some advanced queries with Cypher.

The source code for this article can be found in its own GitHub repository.

Modeling a timeline in a graph database

People who are used to other database modeling techniques tend to relate each post directly to the user. A post would have a timestamp property, and the posts would be ordered by this property.

Here is a simple representation:

Basic user->post relationship

While such a model will work without any problems, there are some downsides to it:

  • For each user, you’ll need to order their posts by time to get the last one
  • The cost of the ordering grows with the number of posts and the number of users you follow
  • It forces the database to execute operations for the ordering

Leverage the power of a graph database

A node in a graph database holds a reference to the connections it has, providing fast performance for graph traversals.

A common modeling technique for user feeds is called a linked list. In our application, the user node will have a relationship named LAST_POST to the last post created by the user. This post will have a PREVIOUS_POST relationship to the one before it, which in turn has a PREVIOUS_POST relationship to the one before that, and so on.

Linked list

With this model, you have immediate access to the latest post of a user. In fact, you don’t even need a timestamp at all to retrieve a user’s posts in order (we will keep it, though, in order to sort the posts across different users).
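To make the idea concrete, here is a minimal sketch in plain PHP that imitates the linked list with arrays instead of a real graph. The array layout, the sample data and the latestPosts() function are all made up for illustration; they are not part of the demo application.

```php
<?php
// In-memory stand-in for the graph: every name here is illustrative only.
// 'last_post' plays the role of LAST_POST, 'previous' the role of PREVIOUS_POST.
$posts = [
    'p3' => ['title' => 'Third post',  'previous' => 'p2'],
    'p2' => ['title' => 'Second post', 'previous' => 'p1'],
    'p1' => ['title' => 'First post',  'previous' => null],
];
$users = ['ada' => ['last_post' => 'p3']];

/**
 * Walk the chain from the user's latest post, newest first,
 * stopping after $limit posts. No sorting is involved.
 */
function latestPosts(array $users, array $posts, string $login, int $limit): array
{
    $titles = [];
    $current = $users[$login]['last_post'] ?? null;
    while ($current !== null && count($titles) < $limit) {
        $titles[] = $posts[$current]['title'];
        $current = $posts[$current]['previous'];
    }

    return $titles;
}

$latestTwo = latestPosts($users, $posts, 'ada', 2);
print_r($latestTwo);
```

The walk touches only as many posts as you ask for, however long the user's history is, which is the property the graph model gives us for free.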

More importantly, what the user does over time is modeled in a natural way in a graph database. Being able to store the data in a manner that corresponds to how it exists outside the database is a real benefit for analysis, lookups and understanding your data.

Initial setup

I suggest you download the repository used for the introduction articles and rename it to social-timeline for example:

git clone git@github.com:sitepoint-editors/social-network
mv social-network social-timeline

cd social-timeline
rm -rf .git
composer install
bower install

As in the previous articles, we’re going to load the database with a generated dummy dataset with the help of Graphgen.

You’ll need to have a running database (local or remote). Then go to this link, click on Generate and then on “Populate your database”.

If you use Neo4j 2.2, you’ll need to provide the neo4j username and your password in the graphgen populator box:

Graphgen population

This will import 50 users with a login, first name and last name. Each user will have two blog posts: the latest one is attached to the user by a LAST_POST relationship, and it points to the older one through a PREVIOUS_POST relationship.

If you now open the Neo4j browser, you can see how the users and posts are modeled:

Neo4j users and posts relationships

Displaying the user feeds

The application already has a set of controllers and templates. You can pick one user by clicking on them and it will display their followers and some suggestions of people to follow.

The user feeds route

First, we will add a route for displaying the feeds of a specific user. Add this portion of code to the end of the web/index.php file:

$app->get('/users/{user_login}/posts', 'Ikwattro\\SocialNetwork\\Controller\\WebController::showUserPosts')
    ->bind('user_post');

The user feeds controller and the Cypher query

We will map the route to an action in the src/Controller/WebController.php file.

In this action, we will fetch the feeds of the given user from the Neo4j database and pass them to the template along with the user node.

public function showUserPosts(Application $application, Request $request)
    {
        $login = $request->get('user_login');
        $neo = $application['neo'];
        // Latest post plus up to two older ones
        $query = 'MATCH (user:User) WHERE user.login = {login}
        MATCH (user)-[:LAST_POST]->(latest_post)-[:PREVIOUS_POST*0..2]->(post)
        RETURN user, collect(post) as posts';
        $params = ['login' => $login];
        $result = $neo->sendCypherQuery($query, $params)->getResult();

        if (null === $result->get('user')) {
            // sprintf, because single-quoted strings do not interpolate $login
            $application->abort(404, sprintf('The user "%s" was not found', $login));
        }

        $posts = $result->get('posts');

        return $application['twig']->render('show_user_posts.html.twig', array(
            'user' => $result->getSingle('user'),
            'posts' => $posts,
        ));
    }

Some explanations:

  • We first MATCH a user by their login name.
  • We then MATCH the user’s latest post and expand along the PREVIOUS_POST relationships. The *0..2 depth starts at zero hops, so the latest_post node itself is included in the post collection, and the maximum depth of 2 limits the result to the latest post plus its two predecessors.
  • We return the found posts in a collection.
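If you want the depth to be configurable, keep in mind that Cypher expects literal bounds in a variable-length pattern rather than a parameter, so the number has to be written into the query text. The sketch below shows one cautious way of doing that; buildUserPostsQuery() is a hypothetical helper and not part of the demo application.

```php
<?php
/**
 * Build the user-posts query for a given maximum number of PREVIOUS_POST hops.
 * The int type hint and the range check keep anything but a safe number
 * out of the query text. Hypothetical helper, for illustration only.
 */
function buildUserPostsQuery(int $maxDepth): string
{
    if ($maxDepth < 0) {
        throw new InvalidArgumentException('The depth must be zero or greater');
    }

    return sprintf(
        'MATCH (user:User) WHERE user.login = {login}
        MATCH (user)-[:LAST_POST]->(latest_post)-[:PREVIOUS_POST*0..%d]->(post)
        RETURN user, collect(post) as posts',
        $maxDepth
    );
}

$query = buildUserPostsQuery(2);   // latest post plus up to two older ones
$params = ['login' => 'ada'];      // the login still travels as a parameter
echo $query, PHP_EOL;
```

Only the depth is interpolated. Values coming from the user, such as the login, should keep going through the parameters array.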

Displaying the feeds in the template

We will first add a link in the user profile to access their feeds, by adding this line at the end of the user information block:

<p><a href="{{ path('user_post', {user_login: user.property('login') }) }}">Show posts</a></p>

We will now create our template showing the user’s posts. We set a heading and loop over our feeds collection, displaying each post in a dedicated HTML div:

{% extends "layout.html.twig" %}

{% block content %}
    <h1>Posts for {{ user.property('login') }}</h1>

    {% for post in posts %}
        <div class="row">
        <h4>{{ post.properties.title }}</h4>
        <div>{{ post.properties.body }}</div>
        </div>
        <hr/>
    {% endfor %}

{% endblock %}

If you now choose a user and click on the “Show posts” link, you can see that the posts are displayed in descending order of time without us specifying a date property.

A user's feed

Displaying the timeline

If you’ve imported the sample dataset with Graphgen, each of your users will follow approximately 40 other users.

To display a user timeline, you need to fetch all the users they follow and expand the query to the LAST_POST relationship from each of these users.

Once you have all these posts, you need to sort them by timestamp to order them across users.
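Before writing the query, it can help to see the same steps in plain PHP. This is only a sketch of the logic with made-up data and a hypothetical buildTimeline() function; in the application, the database does this work for us.

```php
<?php
// Illustrative data only: each followed user with their most recent posts,
// already newest-first, as they would come out of the linked list.
$postsByFriend = [
    'bob'   => [['title' => 'Bob 2',   'timestamp' => 1400000300],
                ['title' => 'Bob 1',   'timestamp' => 1400000100]],
    'carol' => [['title' => 'Carol 2', 'timestamp' => 1400000400],
                ['title' => 'Carol 1', 'timestamp' => 1400000200]],
];

/**
 * Flatten the per-friend lists, sort by timestamp (newest first)
 * and keep the first $limit entries.
 */
function buildTimeline(array $postsByFriend, int $limit): array
{
    $timeline = [];
    foreach ($postsByFriend as $friend => $posts) {
        foreach ($posts as $post) {
            $timeline[] = ['friend' => $friend, 'post' => $post];
        }
    }

    usort($timeline, function (array $a, array $b) {
        return $b['post']['timestamp'] <=> $a['post']['timestamp'];
    });

    return array_slice($timeline, 0, $limit);
}

$timeline = buildTimeline($postsByFriend, 3);
foreach ($timeline as $entry) {
    echo $entry['friend'], ': ', $entry['post']['title'], PHP_EOL;
}
```

Because each friend contributes only a handful of recent posts, the sort works on a small set instead of on every post ever written.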

The user timeline route

The process is the same as the previous one: we add the route to index.php, create our controller action, add a link to the timeline in the user profile template and create our user timeline template.

Add the route to the web/index.php file:

$app->get('/user_timeline/{user_login}', 'Ikwattro\\SocialNetwork\\Controller\\WebController::showUserTimeline')
    ->bind('user_timeline');

The controller action:

public function showUserTimeline(Application $application, Request $request)
    {
        $login = $request->get('user_login');
        $neo = $application['neo'];
        // Up to three recent posts per followed user, then the newest 20 overall
        $query = 'MATCH (user:User) WHERE user.login = {user_login}
        MATCH (user)-[:FOLLOWS]->(friend)-[:LAST_POST]->(latest_post)-[:PREVIOUS_POST*0..2]->(post)
        WITH user, friend, post
        ORDER BY post.timestamp DESC
        SKIP 0
        LIMIT 20
        RETURN user, collect({friend: friend, post: post}) as timeline';
        $params = ['user_login' => $login];
        $result = $neo->sendCypherQuery($query, $params)->getResult();

        if (null === $result->get('user')) {
            // sprintf, because single-quoted strings do not interpolate $login
            $application->abort(404, sprintf('The user "%s" was not found', $login));
        }

        $user = $result->getSingle('user');
        $timeline = $result->get('timeline');

        return $application['twig']->render('show_timeline.html.twig', array(
            'user' => $user,
            'timeline' => $timeline,
        ));
    }

Explanations about the query:

  • First we match our user.
  • Then we match the path between this user, the users they follow and those users’ latest posts (note how expressive Cypher is about what you want to retrieve).
  • We order the posts by their timestamp, newest first.
  • We limit the result to 20 posts.
  • We return the posts in a collection of maps, each containing the author and the post.
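The fixed SKIP 0 and LIMIT 20 are a natural place to plug in pagination, since Cypher accepts parameters for both clauses. Here is a small sketch, assuming the query is changed to use SKIP {skip} LIMIT {limit}; timelinePageParams() and the parameter names skip and limit are hypothetical and not part of the demo application.

```php
<?php
/**
 * Translate a 1-based page number into parameters for a timeline query
 * whose fixed "SKIP 0 LIMIT 20" has been replaced by "SKIP {skip} LIMIT {limit}".
 * Hypothetical helper, for illustration only.
 */
function timelinePageParams(string $login, int $page, int $perPage = 20): array
{
    $page = max(1, $page);          // treat anything below 1 as the first page
    $perPage = max(1, $perPage);    // never ask for fewer than one post

    return [
        'user_login' => $login,
        'skip' => ($page - 1) * $perPage,
        'limit' => $perPage,
    ];
}

$params = timelinePageParams('ada', 3);
print_r($params);
```

The resulting array can be passed as the second argument of sendCypherQuery() in place of the $params array used in the action.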

Add a link to the user profile template, just after the user feeds link:

<p><a href="{{ path('user_timeline', {user_login: user.property('login') }) }}">Show timeline</a></p>

And create the timeline template:

{% extends "layout.html.twig" %}

{% block content %}
    <h1>Timeline for {{ user.property('login') }}</h1>

    {% for friendFeed in timeline %}
        <div class="row">
        <h4>{{ friendFeed.post.title }}</h4>
        <div>{{ friendFeed.post.body }}</div>
        <p>Written by: {{ friendFeed.friend.login }} on {{ friendFeed.post.timestamp | date('Y-m-d H:i:s') }}</p>
        </div>
        <hr/>
    {% endfor %}

{% endblock %}

We now have a pretty cool timeline that shows the last 20 posts of the people you follow and is efficient for the database.

Timeline implemented

Adding a post to the timeline

Adding posts to a linked list makes the Cypher query a bit trickier. You need to create the post node, remove the LAST_POST relationship from the user to the old latest post, create a new LAST_POST relationship between the user and the new post, and finally create a PREVIOUS_POST relationship from the new post to the old latest post.

Simple, isn’t it? Let’s go!

As usual, we’ll create the POST route for the form pointing to the WebController action:

$app->post('/new_post', 'Ikwattro\\SocialNetwork\\Controller\\WebController::newPost')
    ->bind('new_post');

Next, we will add a basic HTML form for inserting the post title and text in the user template:

{# show_user.html.twig #}
<div class="row">
    <div class="col-sm-6">
        <h5>Add a user status</h5>
        <form id="new_post" method="POST" action="{{ path('new_post') }}">
            <div class="form-group">
                <label for="form_post_title">Post title:</label>
                <input type="text" minlength="3" name="post_title" id="form_post_title" class="form-control"/>
            </div>
            <div class="form-group">
                <label for="form_post_body">Post text:</label>
                <textarea name="post_body" id="form_post_body" class="form-control"></textarea>
            </div>
            <input type="hidden" name="user_login" value="{{ user.property('login') }}"/>
            <button type="submit" class="btn btn-success">Submit</button>
        </form>
    </div>
</div>

And finally, we create our newPost action:

public function newPost(Application $application, Request $request)
    {
        $title = $request->get('post_title');
        $body = $request->get('post_body');
        $login = $request->get('user_login');
        // The timeline orders posts by post.timestamp, so new posts get one too.
        // Assumption: the generated dataset stores Unix timestamps in seconds.
        $query = 'MATCH (user:User) WHERE user.login = {user_login}
        OPTIONAL MATCH (user)-[r:LAST_POST]->(oldPost)
        DELETE r
        CREATE (p:Post)
        SET p.title = {post_title}, p.body = {post_body}, p.timestamp = {post_timestamp}
        CREATE (user)-[:LAST_POST]->(p)
        WITH p, collect(oldPost) as oldLatestPosts
        FOREACH (x in oldLatestPosts|CREATE (p)-[:PREVIOUS_POST]->(x))
        RETURN p';
        $params = [
            'user_login' => $login,
            'post_title' => $title,
            'post_body' => $body,
            'post_timestamp' => time(),
        ];
        $result = $application['neo']->sendCypherQuery($query, $params)->getResult();

        if (null !== $result->getSingle('p')) {
            $redirectRoute = $application['url_generator']->generate('user_post', array('user_login' => $login));

            return $application->redirect($redirectRoute);
        }

        $application->abort(500, sprintf('There was a problem inserting a new post for user "%s"', $login));
    }

Some explanations:

  • We first MATCH the user, then we optionally match the post at the end of their LAST_POST relationship.
  • We delete the LAST_POST relationship between the user and that post, if there is one.
  • We create our new post (which is in fact the user’s latest post in their real-life timeline).
  • We create the LAST_POST relationship between the user and this new post.
  • We break the query with WITH and pass along the new post and a collection of the old latest posts.
  • We then iterate over the collection and create a PREVIOUS_POST relationship from the new post to each element, that is, to the old latest post.

The trick here is that the oldLatestPosts collection will always contain 0 or 1 elements, so the FOREACH behaves like a conditional: the PREVIOUS_POST relationship is only created when the user already had a post.
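The same sequence of steps can be imitated in plain PHP to see why a collection of zero or one elements is convenient. This is a sketch with an in-memory array standing in for the graph; the $graph layout and the addPost() function are made up for illustration.

```php
<?php
/**
 * In-memory imitation of the newPost query. The $graph layout is illustrative only:
 * 'last_post' stands for LAST_POST and 'previous' for PREVIOUS_POST.
 */
function addPost(array &$graph, string $login, string $postId, string $title): void
{
    // OPTIONAL MATCH + collect(): an array holding zero or one old head
    $oldLatestPosts = isset($graph['users'][$login]['last_post'])
        ? [$graph['users'][$login]['last_post']]
        : [];

    // CREATE the post and point LAST_POST at it (replacing the old relationship)
    $graph['posts'][$postId] = ['title' => $title, 'previous' => null];
    $graph['users'][$login]['last_post'] = $postId;

    // FOREACH: runs once if there was an old head, not at all otherwise
    foreach ($oldLatestPosts as $oldPostId) {
        $graph['posts'][$postId]['previous'] = $oldPostId;
    }
}

$graph = ['users' => ['ada' => []], 'posts' => []];
addPost($graph, 'ada', 'p1', 'First post');   // no old head: the loop body is skipped
addPost($graph, 'ada', 'p2', 'Second post');  // p1 becomes the previous post of p2
print_r($graph);
```

The loop replaces an explicit if statement, which is exactly the role FOREACH plays in the Cypher query.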

New post form

Conclusion

In this article, we discovered a modeling technique called a linked list, learned how to implement it in a social application and saw how to retrieve nodes and relationships in an efficient way. We also learned some new Cypher clauses like SKIP and LIMIT, which are useful for pagination.

While real world timelines are quite a bit more complex than what we’ve seen here, I hope it’s obvious how graph databases like Neo4j really are the best choice for this type of application.