Discover Graph Databases with Neo4j and PHP

By Christophe Willemsen

Graph Databases in PHP with Neo4j

In this post, we’ll be learning about Neo4j, the leading graph database, and ways to use it with PHP. In a followup post, we’ll be building a proper graph application powered by Silex.

Graph databases are now one of the core technologies of companies dealing with highly connected data.

Business graphs, social graphs, knowledge graphs, interest graphs and media graphs are frequently in the (technology) news – and for a reason. The graph model represents a very flexible way of handling relationships in your data, and graph databases provide fast and efficient storage, retrieval and querying for it.

Neo4j, the most popular graph database, has proven its ability to deal with massive amounts of highly connected data in many use-cases.

During the last GraphConnect conference, TomTom and eBay’s Shutl demonstrated the value a graph database can add to a company, for instance by providing fantastic customer experiences or by enabling complex route-map editing. Neo4j is developed and supported by Neo Technology, a startup which has grown into a well-respected database company.

A short Introduction

For the newcomers, here is a short introduction to graph databases and Neo4j, apart from the theoretical glance we threw at it last year.

What is a Graph?

A graph is a generic data structure, composed of nodes (entities) connected by relationships. Sometimes, those are also called vertices and edges. In the property graph model, each node and relationship can be labeled and hold any number of properties describing it.


image via Wikipedia

What is a Graph Database?

A graph database is a database optimized for operations on connected data.
Graph databases provide high performance suitable for online operations by using dedicated storage structures for both nodes and relationships.
They don’t need to compute relationships (JOINS) at query time but store them efficiently as part of your data.

Let’s take a simple social application as an example, where users follow other users.

A user will be represented as a Node and can have a label and properties. Labels depict various roles for your nodes.

A Node

The link between these two users will be represented as a Relationship, which can also have properties and a Type to identify the nature of the relationship. Relationships add semantic meaning to your data.

Nodes with Relationship

Looking at the graph shows how natural it is to represent data in a graph and store it in a graph database.
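To make the property graph model a bit more concrete, here is a minimal sketch of the same idea in plain PHP arrays. It is illustrative only: the node ids, the second user's name and the followersOf helper are made up for this example, and a graph database stores these structures natively rather than in arrays.

```php
<?php
// Illustrative only: a tiny property graph held in plain PHP arrays.
// The ids and the second user's name are assumptions made for this sketch.
$nodes = [
    1 => ['labels' => ['User'], 'properties' => ['name' => 'Hannah Hilpert']],
    2 => ['labels' => ['User'], 'properties' => ['name' => 'Example Follower']],
];

// Each relationship has a type, a direction (start -> end) and optional properties.
$relationships = [
    ['type' => 'FOLLOWS', 'start' => 2, 'end' => 1, 'properties' => []],
];

// Hypothetical helper: names of the users following a given node,
// found by walking the incoming FOLLOWS relationships.
function followersOf(int $nodeId, array $nodes, array $relationships): array
{
    $names = [];
    foreach ($relationships as $rel) {
        if ($rel['type'] === 'FOLLOWS' && $rel['end'] === $nodeId) {
            $names[] = $nodes[$rel['start']]['properties']['name'];
        }
    }
    return $names;
}

print_r(followersOf(1, $nodes, $relationships));
```

Note that the relationship is stored as data of its own, with a start and an end node, instead of being recomputed from foreign keys at query time.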


Cypher, the Neo4j Graph Query Language

Querying a graph may not appear to be straightforward. To make it easy, Neo4j developed Cypher, a declarative graph query language focused on readability and expressiveness for humans such as developers, administrators and domain experts.

Being declarative, Cypher focuses on expressing what to retrieve from a graph, rather than how to retrieve it.

The query language is composed of several distinct clauses. You can read more details about them in the Neo4j manual.

Here are a few clauses used to read and update the graph:

  • MATCH: Finds the “example” graph pattern you provide in the graph and returns one path per found match.
  • WHERE: Filters results with predicates, much like in SQL. There are many more predicates in Cypher though, including collection operations and graph matches.
  • RETURN: Returns your query result in the form you need, as scalar values, graph elements or paths, or collections or even documents.
  • CREATE: Creates graph elements (nodes and relationships) with labels and properties.
  • MERGE: Matches existing patterns or creates them. It’s a combination of MATCH and CREATE.

Cypher is all about patterns: it describes the visual representation you’ve already seen as textual patterns (using ASCII-art).
It uses round parentheses to depict nodes (like (m:Movie) or (me:Person:Developer)) and arrows (like --> or -[:LOVES]->) for relationships.

Looking at our last graph of users, a query that retrieves Hannah Hilpert and the users following her is written as follows:

MATCH (user:User {name:'Hannah Hilpert'})<-[:FOLLOWS]-(follower) 
RETURN user, follower


Neo4j and PHP

After this quick introduction to the Neo4j graph database (more here), let’s see how we can use it from PHP.

Neo4j is installed as a database server.
An HTTP-API is accessible for manipulating the database and issuing Cypher queries.

If you want to install and run the Neo4j graph database, you can download the latest version here, extract the archive on your computer and run the ./bin/neo4j start command. Note that this only applies to *nix-based systems.

Neo4j comes with a cool visual interface, the Neo4j Browser available at http://localhost:7474.
Just try it! There are some guides to get started within the browser, but more information can be found online.

If you don’t want to install it on your machine, you can always create a free instance on GrapheneDB, a Neo4j As A Service provider.

The Neoxygen Components

Neoxygen is a set of open-source components, most of them in PHP, for the Neo4j ecosystem, available on GitHub. Currently, I’m the main developer. If you are interested in contributing as well, just ping me.

NeoClient is a powerful client for the Neo4j HTTP API, with multi-database support and built-in high availability management.

Installation and configuration

The installation is trivial, just add the neoclient dependency in your composer.json file :

  "require": {

You configure your connection when building the client :

use Neoxygen\NeoClient\ClientBuilder;

$client = ClientBuilder::create()
  ->addConnection('default', 'http', 'localhost', 7474)

If you created an instance on GrapheneDB, you need to configure a secure connection with credentials. This is done by passing true (to enable the auth mode) followed by your credentials as additional arguments to the addConnection method:


use Neoxygen\NeoClient\ClientBuilder;

$connUrl = parse_url('');
$user = 'master';
$pwd = 's3cr3tP@ssw0rd';

$client = ClientBuilder::create()
  ->addConnection('default', $connUrl['scheme'], $connUrl['host'], $connUrl['port'], true, $user, $pwd)
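As a small sketch of what parse_url contributes to that call, the snippet below splits a connection URL into the scheme, host and port pieces. The URL is a placeholder invented for illustration, not a real GrapheneDB address; use the one shown for your own instance.

```php
<?php
// Placeholder URL, assumed for illustration only.
$connUrl = parse_url('https://db.example.com:24789');

$scheme = $connUrl['scheme']; // "https"
$host   = $connUrl['host'];   // "db.example.com"
$port   = $connUrl['port'];   // 24789, returned as an integer

// The 'port' key is absent when the URL does not contain one,
// so a fallback avoids an undefined index notice.
$noPort = parse_url('http://localhost');
$fallbackPort = $noPort['port'] ?? 7474;

echo "$scheme://$host:$port", PHP_EOL;
```

Guarding the port with a default is a design choice for this sketch; a hosted instance will normally include the port in its URL.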

You now have full access to your Neo4j database with the client connecting to the HTTP API.

The library provides handy methods to access the different endpoints. However, the most frequently used method is sending a Cypher query.

Handling graph results in a raw JSON response is a bit cumbersome. That’s why the library comes with a handy result formatter that transforms the response into node and relationship objects. The formatter is disabled by default, and you can enable it by adding a single line of code to your client building process:

$client = ClientBuilder::create()
  ->addConnection('default', 'http', 'localhost', 7474)

Let’s build something cool

We’re going to build a set of User nodes and FOLLOWS relationships incrementally. Then, we’ll be able to query friend-of-a-friend information to provide friendship suggestions.

The query to create a User is the following :

CREATE (user:User {name:'Kenneth'}) RETURN user

The query is composed of 5 parts :


  • The CREATE clause (in blue), indicating we want to create a new element.
  • The identifier (in orange), used to identify your node in the query.
  • The label (in red), used to add the user to the User labelled group.
  • The node properties (in green), which are specific to that node.
  • The RETURN clause, indicating what you want to return, here the created user.

You can also try to run that query in the Neo4j Browser.

No need to wait, let’s create this user with the client :

$query = 'CREATE (user:User {name:"Kenneth"}) RETURN user';
$result = $client->sendCypherQuery($query)->getResult();

You can visualize the created node in your browser (open the starred tab and run “Get some data”), or get the graph result with the client.

$user = $result->getSingleNode();
$name = $user->getProperty('name');

We will do the same for another user, now with query parameters. Parameters are passed along with the query, which allows Neo4j to cache the query execution plan and makes subsequent identical queries faster:

$query = 'CREATE (user:User {name: {name} }) RETURN user';
$parameters = array('name' => 'Maxime');
$client->sendCypherQuery($query, $parameters);

As you can see, parameters are enclosed in {} in the query, and passed in an array as the second argument of the sendCypherQuery method.
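The client takes care of the HTTP exchange for you. As a rough sketch of why parameters help with plan caching, the snippet below builds a request body in the shape used by Neo4j's transactional Cypher HTTP endpoint, where the statement text and its parameters travel as separate fields. Whether NeoClient builds its requests exactly this way is an assumption, and buildPayload is a hypothetical helper, not a library function.

```php
<?php
// Hypothetical helper: builds a request body in the shape of Neo4j's
// transactional Cypher endpoint ("statements" with "statement" + "parameters").
function buildPayload(string $query, array $parameters): array
{
    return [
        'statements' => [
            ['statement' => $query, 'parameters' => $parameters],
        ],
    ];
}

$query = 'CREATE (user:User {name: {name} }) RETURN user';

$first  = buildPayload($query, ['name' => 'Kenneth']);
$second = buildPayload($query, ['name' => 'Maxime']);

// The statement text is identical in both requests; only the parameters differ.
echo json_encode($second), PHP_EOL;
```

Because the statement string never changes, the server can recognise it and reuse the execution plan, which would not be possible if the name were concatenated into the query text.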

If you look at the graph now, you’ll see the two User nodes, but they feel quite alone :( , no ?


Creating relationships

In order to create the relationships between our nodes, we’ll use Cypher again.

$query = 'MATCH (user1:User {name:{name1}}), (user2:User {name:{name2}}) CREATE (user1)-[:FOLLOWS]->(user2)';
$params = ['name1' => 'Kenneth', 'name2' => 'Maxime'];
$client->sendCypherQuery($query, $params);

Some explanations :

We first match for existing users named Kenneth and Maxime (names provided as parameters), and then we create a FOLLOWS relationship between the two.

Kenneth will be the start node of the FOLLOWS relationship and Maxime the end node.
The relationship type will be FOLLOWS.
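In queries like this one it is easy to confuse node identifiers (user1, user2) with parameter names (name1, name2). The sketch below shows one way to catch such a mismatch before sending a query. Both functions are hypothetical helpers written for this post, not part of NeoClient, and the regular expression only understands the simple {name} placeholder syntax used here.

```php
<?php
// Hypothetical helper: lists the {param} placeholders found in a Cypher string.
// A property map such as {name: ...} is not matched, because it contains a colon.
function cypherPlaceholders(string $query): array
{
    preg_match_all('/\{\s*(\w+)\s*\}/', $query, $matches);
    return array_values(array_unique($matches[1]));
}

// Hypothetical helper: placeholders that have no matching key in $params.
function missingParameters(string $query, array $params): array
{
    return array_values(array_diff(cypherPlaceholders($query), array_keys($params)));
}

$query = 'MATCH (user1:User {name:{name1}}), (user2:User {name:{name2}}) '
       . 'CREATE (user1)-[:FOLLOWS]->(user2)';

// Keys named after the node identifiers leave both placeholders unfilled.
print_r(missingParameters($query, ['user1' => 'Kenneth', 'user2' => 'Maxime']));

// Keys named after the placeholders leave nothing missing.
print_r(missingParameters($query, ['name1' => 'Kenneth', 'name2' => 'Maxime']));
```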

Looking at the graph again shows that the relationship has been created.


Creating a bunch of users

Manually writing all the creation statements for a set of 100 users and the relationships would be boring.
I want to introduce a very useful tool called Graphgen (one of the Neoxygen components) for generating graph data with ease.

It uses a specification that is very close to Cypher to describe the graph you want.
Here we’re going to create a set of 50 users and the corresponding FOLLOWS relationships.

Go to the Graphgen website, copy and paste the following pattern in the editor area, and click on Generate:

(user:User {login: userName, firstname: firstName, lastname: lastName} *50)-[:FOLLOWS *n..n]->(user)


You can see that it automatically generates a graph with 50 users, the relationships, and realistic values for login, firstname and lastname. Impressive, no?

To import this graph into our local graph database, click on Populate your database and use the default settings.


In no time, the database will be populated with the data.

If you open the Neo4j browser, and run “Get some data” again, you can see all the user nodes and their relationships.


Getting suggestions

Getting suggestions with Neo4j is simple. You match one user, follow the FOLLOWS relationships to the users they follow, then for each of those find the users they follow in turn, and return the ones the first user does not follow already. A suggestion must also not be the user for whom we are looking for suggestions.

In a typical application, there will be a login system and a user will only be allowed to see the users they follow. For the sake of this post, which is only an introduction to Neo4j, you’ll be able to play with all the users.

Let’s write it in Cypher :

$query = 'MATCH (user:User {firstname: {firstname}})-[:FOLLOWS]->(followed)-[:FOLLOWS]->(suggestion)
WHERE user <> suggestion 
  AND NOT (user)-[:FOLLOWS]->(suggestion)
RETURN user, suggestion, count(*) as occurrence
ORDER BY occurrence DESC
LIMIT 10';
$params = ['firstname' => 'Francisco'];
$result = $client->sendCypherQuery($query, $params)->getResult();

$suggestions = $result->get('suggestion'); // Returns a set of nodes
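To see the logic of this query without a database, here is a sketch of the same friend-of-a-friend rule in plain PHP over an in-memory adjacency list. The data and the suggestFor function are invented for illustration. It is not how Neo4j executes the query, only a restatement of what the pattern, the WHERE filters and the count express.

```php
<?php
// Hypothetical in-memory data: follower => list of users they follow.
$follows = [
    'Francisco' => ['Anna', 'Ben'],
    'Anna'      => ['Carla', 'Ben', 'Francisco'],
    'Ben'       => ['Carla', 'Dario'],
    'Carla'     => [],
    'Dario'     => [],
];

// Hypothetical helper mirroring the Cypher query: candidate => occurrence count.
function suggestFor(string $user, array $follows, int $limit = 10): array
{
    $alreadyFollowed = $follows[$user] ?? [];
    $occurrences = [];

    foreach ($alreadyFollowed as $followed) {
        foreach ($follows[$followed] ?? [] as $candidate) {
            // Same filters as the WHERE clause: skip the user themselves
            // and anyone they already follow.
            if ($candidate === $user || in_array($candidate, $alreadyFollowed, true)) {
                continue;
            }
            $occurrences[$candidate] = ($occurrences[$candidate] ?? 0) + 1;
        }
    }

    arsort($occurrences); // ORDER BY occurrence DESC
    return array_slice($occurrences, 0, $limit, true); // LIMIT
}

print_r(suggestFor('Francisco', $follows));
```

With this sample data, Carla is reached through both Anna and Ben and so ranks ahead of Dario, who is reached only through Ben.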

If you run this query in the Neo4j Browser, you’ll get your first matched user and the suggestions:



In this part:

  • You’ve discovered graph databases and Neo4j
  • You learned the basics of the Cypher Query Language
  • You’ve seen how to connect to and run queries on a Neo4j database with PHP

In a followup article we’ll use everything we’ve learned so far and make a real Neo4j-powered Silex PHP application.


Hi Christophe

Good articles


There is a missing argument in the listing that creates a user with $parameters. It should be:

$query = 'CREATE (user:User {name: {name} }) RETURN user';
$parameters = array('name' => 'Maxime');
$client->sendCypherQuery($query, $parameters);

Thanks, fixed!


Good catch. Thanks.


Also worth pointing out that other types of graph database exist, for example those that follow the Resource Description Framework (RDF) model, commonly known as Triple/Quad stores.

They have the benefit of a W3C standard query language behind them (SPARQL[1]), a defined RESTful protocol[2] and quite a few databases available, both open source and commercial, that have some benefits over Neo4J - for example the ability to scale out with a shared nothing architecture (MarkLogic, Virtuoso Cluster Edition and 4Store).

It all depends on your workload, of course, and for some people the RDF model won't provide all they need, however, it's a powerful model and worth exploring if you're thinking of using a graph database in an upcoming project.



I'd love it if you could write a simplified explanation on these not unlike this one, to introduce people to these concepts in a manner less convoluted than typical W3 specs tend to be. This post will have a part 2 in which we'll be building a friend-recommendations powered Silex app, so showing such approaches from different angles with different technologies would be priceless. Let me know if you're interested or know someone who is.


Sure, I'd be happy to do that. I agree, the W3C specs can be a bit impenetrable, but the basic concepts behind RDF are quite simple when explained well.


Excellent! Do get in touch via and we'll discuss further!


Of course there exist multiple graph databases, like there exist multiple sql databases or key-value stores.

RDF is in my opinion a subset of the possibilities a graph database can offer.
There are SPARQL plugins for Neo4j and many tutorials on the internet about it.

Now as mentioned by Bruno, the goal of the article is to go deeper with a simple demo application providing social network features and timelines.


Hi, thanks for the great introduction !

I don't know anything about Cypher, but shouldn't $params here :

$query = 'MATCH (user1:User {name:{name1}}), (user2:User {name:{name2}}) CREATE (user1)-[:FOLLOWS]->(user2)';
$params = ['user1' => 'Kenneth', 'user2' => 'Maxime'];
$client->sendCypherQuery($query, $params);

be like :

$params = ['name1' => 'Kenneth', 'name2' => 'Maxime'];



I wouldn't say subset per se, more tangential.

Looking at experiences with the SPARQL plugin for Neo4J, it's much slower than a dedicated RDF Triple Store.

The other niceties you get with the RDF model are identifying everything with a URI, which makes the merging of datasets trivial (in the best case), and formal ontologies which provide various predefined models for building your data, e.g. FOAF, Dublin Core Terms, SIOC etc.


Hi jhuet,

Thanks for reading the article. No the parameters are name1 and name2.

Parameters in Cypher are enclosed in {} , while user1 and user2 here are node identifiers.


