Build a REST API from Scratch: Implementation
We ended the first part of this tutorial with all the basic layers of our API in place. We have our server setup, authentication system, JSON input/output, error management and a couple of dummy routes. But, most importantly, we wrote the README file that defines resources and actions. Now it’s time to deal with these resources.
Creating and updating contacts
We have no data right now, so we can start with contact creation. Current REST best practices suggest that create and update operations should return a resource representation. Since the core of this article is the API, the code that deals with the database is very basic and could be done better. In a real-world application you would probably use a more robust ORM/Model and validation library.
$app->post(
    '/contacts',
    function () use ($app, $log) {
        $body = $app->request()->getBody();
        $errors = $app->validateContact($body);
        if (empty($errors)) {
            $contact = \ORM::for_table('contacts')->create();
            if (isset($body['notes'])) {
                $notes = $body['notes'];
                unset($body['notes']);
            }
            $contact->set($body);
            if (true === $contact->save()) {
                // Insert notes
                if (!empty($notes)) {
                    $contactNotes = array();
                    foreach ($notes as $item) {
                        $item['contact_id'] = $contact->id;
                        $note = \ORM::for_table('notes')->create();
                        $note->set($item);
                        if (true === $note->save()) {
                            $contactNotes[] = $note->asArray();
                        }
                    }
                }
                $output = $contact->asArray();
                if (!empty($contactNotes)) {
                    $output['notes'] = $contactNotes;
                }
                echo json_encode($output, JSON_PRETTY_PRINT);
            } else {
                throw new Exception("Unable to save contact");
            }
        } else {
            throw new ValidationException("Invalid data", 0, $errors);
        }
    }
);
We are in the /api/v1 group route here, dealing with the /contacts resource with the POST method. First we need the body of the request. Our middleware ensures that it is valid JSON, or we would not be at this point in the code. The method $app->validateContact() ensures that the provided data is sanitized and performs basic validation; it makes sure that we have at least a first name and a unique, valid email address. We can reasonably expect that the JSON payload could contain both contact and notes data, so I’m processing both. I’m creating a new contact with my ORM-specific code, and in case of success I insert the linked notes, if present. The ORM provides me with objects for both contact and notes containing the ID from the database, so finally I produce a single array to encode in JSON. The JSON_PRETTY_PRINT option is available from version 5.4 of PHP; for older versions you can ask Google for a replacement.
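The body of $app->validateContact() is not shown in this article, so here is a minimal sketch of the checks described above, written as a standalone function. The function name validateContactData() and the $emailExists callback (which would normally query the contacts table) are assumptions made for illustration, not part of the sample application.

```php
<?php
// Hypothetical stand-in for $app->validateContact(); not the article's actual method.
// $emailExists is an assumed callback that reports whether an email is already stored.
function validateContactData(array $data, callable $emailExists): array
{
    $errors = array();

    // A first name is mandatory and must be a non-empty string.
    $firstname = is_string($data['firstname'] ?? null) ? trim($data['firstname']) : '';
    if ($firstname === '') {
        $errors['firstname'] = 'A first name is required';
    }

    // The email must be syntactically valid and not already in use.
    $email = is_string($data['email'] ?? null) ? trim($data['email']) : '';
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        $errors['email'] = 'A valid email address is required';
    } elseif ($emailExists($email)) {
        $errors['email'] = 'This email address is already in use';
    }

    // An empty array means the payload passed validation.
    return $errors;
}
```

Returning an array of errors keyed by field name fits the way the route above passes $errors to ValidationException when the array is not empty.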
The code for updating a contact is pretty similar; the only differences are that we test for the existence of the contact and notes before processing data, and the validation differs slightly.
$contact = \ORM::forTable('contacts')->findOne($id);
if ($contact) {
    $body = $app->request()->getBody();
    $errors = $app->validateContact($body, 'update');
    // other stuff here...
}
We can optimize further by mapping the same code to more than one method. For example, I’m mapping the PUT and PATCH methods to the same code:
$app->map(
    '/contacts/:id',
    function ($id) use ($app, $log) {
        // Update code here...
    }
)->via('PUT', 'PATCH');
Listing contacts
Now that we have some contacts in our database it’s time to list and filter. Let’s start simple:
// Get contacts
$app->get(
    '/contacts',
    function () use ($app, $log) {
        $contacts = array();
        $results = \ORM::forTable('contacts');
        $contacts = $results->findArray();
        echo json_encode($contacts, JSON_PRETTY_PRINT);
    }
);
The statement that retrieves the data depends on your ORM. Idiorm makes it simple and returns an associative array or an empty one, that is encoded in JSON and displayed. In case of an error or exception, the JSON middleware that we wrote earlier catches the exception and converts it into JSON. But let’s complicate it a bit…
Fields, filters, sorting and searching
A good API should allow us to limit the fields retrieved, sort the results, and use basic filters or search queries. For example, the URL:
/api/v1/contacts?fields=firstname,email&sort=-email&firstname=Viola&q=vitae
This should return all the contacts named “Viola” where the first name OR email address contains the string vitae. They should be ordered by alphabetically descending email address (-email), and I want only the firstname and email fields. How do we do this?
$app->get(
    '/contacts',
    function () use ($app, $log) {
        $contacts = array();
        $filters = array();
        $total = 0;
        // Default resultset
        $results = \ORM::forTable('contacts');
        // Get and sanitize filters from the URL
        if ($rawfilters = $app->request->get()) {
            unset(
                $rawfilters['sort'],
                $rawfilters['fields'],
                $rawfilters['page'],
                $rawfilters['per_page']
            );
            foreach ($rawfilters as $key => $value) {
                $filters[$key] = filter_var($value, FILTER_SANITIZE_STRING);
            }
        }
        // Add filters to the query
        if (!empty($filters)) {
            foreach ($filters as $key => $value) {
                if ('q' == $key) {
                    $results->whereRaw(
                        '(`firstname` LIKE ? OR `email` LIKE ?)',
                        array('%'.$value.'%', '%'.$value.'%')
                    );
                } else {
                    $results->where($key, $value);
                }
            }
        }
        // Get and sanitize field list from the URL
        if ($fields = $app->request->get('fields')) {
            $fields = explode(',', $fields);
            $fields = array_map(
                function($field) {
                    $field = filter_var($field, FILTER_SANITIZE_STRING);
                    return trim($field);
                },
                $fields
            );
        }
        // Add field list to the query
        if (is_array($fields) && !empty($fields)) {
            $results->selectMany($fields);
        }
        // Manage sort options
        if ($sort = $app->request->get('sort')) {
            $sort = explode(',', $sort);
            $sort = array_map(
                function($s) {
                    $s = filter_var($s, FILTER_SANITIZE_STRING);
                    return trim($s);
                },
                $sort
            );
            foreach ($sort as $expr) {
                if ('-' == substr($expr, 0, 1)) {
                    $results->orderByDesc(substr($expr, 1));
                } else {
                    $results->orderByAsc($expr);
                }
            }
        }
        // Pagination logic
        $page = filter_var(
            $app->request->get('page'),
            FILTER_SANITIZE_NUMBER_INT
        );
        if (!empty($page)) {
            $perPage = filter_var(
                $app->request->get('per_page'),
                FILTER_SANITIZE_NUMBER_INT
            );
            if (empty($perPage)) {
                $perPage = 10;
            }
            // Total after filters and
            // before pagination limit
            $total = $results->count();
            // Pagination "Link" headers go here...
            $results->limit($perPage)->offset($page * $perPage - $perPage);
        }
        $contacts = $results->findArray();
        // ORM fix needed
        if (empty($total)) {
            $total = count($contacts);
        }
        $app->response->headers->set('X-Total-Count', $total);
        echo json_encode($contacts, JSON_PRETTY_PRINT);
    }
);
First I define a default result set (all contacts), then I extract the full query string parameters into the $rawfilters array, unsetting the keys fields, sort, page and per_page; I’ll deal with them later. I sanitize the values to obtain the final $filters array. The filters are then applied to the query using the ORM-specific syntax. I do the same for the field list and sort options, adding the pieces to our result set query. Only then can I run the query with findArray() and return the results.
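The listing code hands filter keys, field names and sort expressions to the ORM as column names, and column names cannot be bound as query parameters the way values can. One possible hardening, sketched here under assumptions, is to accept only names from a known list. The helper names parseFields() and parseSort() and the example column list are hypothetical; adjust them to the real schema.

```php
<?php
// Hypothetical helpers, not part of the sample application.
// $allowed is an assumed whitelist of column names that clients may reference.

// Turn "firstname,email,secret" into the list of allowed columns it names.
function parseFields(string $fields, array $allowed): array
{
    $requested = array_map('trim', explode(',', $fields));
    // Keep the requested order, drop anything not whitelisted.
    return array_values(array_intersect($requested, $allowed));
}

// Turn "-email,firstname" into array(array('email', 'DESC'), array('firstname', 'ASC')).
function parseSort(string $sort, array $allowed): array
{
    $orders = array();
    foreach (explode(',', $sort) as $expr) {
        $expr = trim($expr);
        $direction = 'ASC';
        // A leading minus sign requests descending order.
        if (strncmp($expr, '-', 1) === 0) {
            $direction = 'DESC';
            $expr = (string) substr($expr, 1);
        }
        if (in_array($expr, $allowed, true)) {
            $orders[] = array($expr, $direction);
        }
    }
    return $orders;
}
```

With these in place, the route could loop over the result of parseSort() and call orderByAsc() or orderByDesc() only for names that passed the check, and the same list could guard the keys used in where().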
Pagination logic and headers
It’s a good idea to provide a way to limit the returned data. In the code above I provide the page and per_page parameters. After validation they can be passed to the ORM to limit the results:
$results->limit($perPage)->offset(($page * $perPage) - $perPage);
Before that I obtain a count of the total results, so I can set the X-Total-Count HTTP header. Now I can compute the Link header to publish the pagination URLs like this:
Link: <https://mycontacts.dev/api/v1/contacts?page=2&per_page=5>; rel="next",<https://mycontacts.dev/api/v1/contacts?page=20&per_page=5>; rel="last"
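The numbers in a header like this come from simple arithmetic on the total count, the page and the page size. Here is a small sketch of that calculation; the function name paginate() is hypothetical, and clamping the page to the valid range is a design choice of this sketch rather than something the route above does.

```php
<?php
// Hypothetical helper: derives the page count and the query offset.
function paginate(int $total, int $page, int $perPage = 10): array
{
    $perPage = max(1, $perPage);
    // Number of pages needed to show $total rows, at least one.
    $pages = max(1, (int) ceil($total / $perPage));
    // Keep the requested page inside 1..$pages (assumed policy).
    $page = min(max(1, $page), $pages);

    return array(
        'page' => $page,
        'per_page' => $perPage,
        'pages' => $pages,
        // Same value as ($page * $perPage) - $perPage.
        'offset' => ($page - 1) * $perPage,
    );
}
```

For example, 100 contacts at 5 per page give 20 pages, which is where a rel="last" link to page=20 would come from, and page 2 starts at offset 5.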
The pagination URLs are calculated using the actual sanitized parameters:
$linkBaseURL = $app->request->getUrl()
    . $app->request->getRootUri()
    . $app->request->getResourceUri();
// Adding fields
if (!empty($fields)) {
    $queryString[] = 'fields='
        . join(
            ',',
            array_map(
                function($f){
                    return urlencode($f);
                },
                $fields
            )
        );
}
// Adding filters
if (!empty($filters)) {
    $queryString[] = http_build_query($filters);
}
// Adding sort options
if (!empty($sort)) {
    $queryString[] = 'sort='
        . join(
            ',',
            array_map(
                function($s){
                    return urlencode($s);
                },
                $sort
            )
        );
}
if ($page < $pages) {
    $next = $linkBaseURL . '?' . join(
        '&',
        array_merge(
            $queryString,
            array(
                'page=' . (string) ($page + 1),
                'per_page=' . $perPage
            )
        )
    );
    $links[] = sprintf('<%s>; rel="next"', $next);
}
First I calculate the current base URL for the resource, then I add the fields, filters and sort options to the query string. In the end I build the full URLs by joining the pagination parameters.
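The excerpt only shows the rel="next" link. As a rough sketch of how the whole header value could be assembled, the hypothetical function below takes the base URL, the already sanitized query parameters and the page numbers, and emits first, prev, next and last links where they apply. The function name and its signature are assumptions, and it uses http_build_query() for every parameter, so commas in values are percent-encoded.

```php
<?php
// Hypothetical helper: builds the value of a pagination "Link" header.
function buildLinkHeader(string $baseUrl, array $query, int $page, int $pages, int $perPage): string
{
    // Decide which relations make sense for the current page.
    $targets = array();
    if ($page > 1) {
        $targets['first'] = 1;
        $targets['prev'] = $page - 1;
    }
    if ($page < $pages) {
        $targets['next'] = $page + 1;
        $targets['last'] = $pages;
    }

    $links = array();
    foreach ($targets as $rel => $target) {
        // Pagination parameters are appended after fields, filters and sort.
        $params = array_merge($query, array('page' => $target, 'per_page' => $perPage));
        $links[] = sprintf('<%s?%s>; rel="%s"', $baseUrl, http_build_query($params, '', '&'), $rel);
    }

    return implode(',', $links);
}
```

Called with the contacts URL, no extra parameters, page 1 of 20 and 5 items per page, it produces the same next and last pair as the header shown earlier.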
Contact details and autoloading
At this point fetching the details of a single contact is really easy:
$app->get(
    '/contacts/:id',
    function ($id) use ($app, $log) {
        // Validate input code here...
        $contact = \ORM::forTable('contacts')->findOne($id);
        if ($contact) {
            echo json_encode($contact->asArray(), JSON_PRETTY_PRINT);
            return;
        }
        $app->notFound();
    }
);
We try a simple ORM query and encode the result, if any, or return a 404 error. But we could go further. As with contact creation, it’s reasonable that we may want the contact and its notes together, so instead of making multiple calls we can trigger this option using a query string parameter, for example:
https://mycontacts.dev/api/v1/contacts/1?embed=notes
We can edit the code to:
// ...
if ($contact) {
    $output = $contact->asArray();
    if ('notes' === $app->request->get('embed')) {
        $notes = \ORM::forTable('notes')
            ->where('contact_id', $id)
            ->orderByDesc('id')
            ->findArray();
        if (!empty($notes)) {
            $output['notes'] = $notes;
        }
    }
    echo json_encode($output, JSON_PRETTY_PRINT);
    return;
}
// ...
If we have a valid contact and an embed parameter that requests the notes, we run another query, searching for linked notes, in reverse order by ID (or date or whatever we want). With a full-featured ORM/Model structure we could, and should, make a single query to our database, in order to improve performance.
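To give an idea of what the single-query approach involves, here is a sketch of the step that follows a LEFT JOIN between contacts and notes: folding the flat rows back into one contact with a nested notes array. The function name foldContactRows(), the note_id and note_body aliases, and the body column are assumptions for illustration; the real notes table may have different columns.

```php
<?php
// Rows are assumed to come from a query along these lines (hypothetical aliases):
//   SELECT c.*, n.id AS note_id, n.body AS note_body
//   FROM contacts c LEFT JOIN notes n ON n.contact_id = c.id
//   WHERE c.id = ? ORDER BY n.id DESC
function foldContactRows(array $rows): ?array
{
    if (empty($rows)) {
        return null; // No such contact.
    }

    // Contact columns repeat on every row, so take them from the first one.
    $contact = $rows[0];
    unset($contact['note_id'], $contact['note_body']);

    $notes = array();
    foreach ($rows as $row) {
        // A LEFT JOIN yields NULL note columns when the contact has no notes.
        if ($row['note_id'] !== null) {
            $notes[] = array('id' => $row['note_id'], 'body' => $row['note_body']);
        }
    }
    if (!empty($notes)) {
        $contact['notes'] = $notes;
    }

    return $contact;
}
```

The output has the same shape as the two-query version above: the contact fields, plus a notes key only when at least one note exists.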
Caching
Caching is important for our application’s performance. A good API should at least allow client-side caching using the HTTP protocol’s caching framework. In this example I’ll use ETag, and in addition to this we will add a simple internal cache layer using APC. All these features are powered by a middleware. A year ago Tim wrote about Slim Middleware here on SitePoint, coding a Cache Middleware as an example. I’ve expanded his code for our API\Middleware\Cache object. The middleware is added the standard way during our application’s bootstrap phase:
$app->add(new API\Middleware\Cache('/api/v1'));
The Cache constructor accepts a root URI as a parameter, so we can activate the cache from /api/v1 and its subpaths in the main method.
public function __construct($root = '')
{
    $this->root = $root;
    $this->ttl = 300; // 5 minutes
}
We also set a default TTL of 5 minutes, which can be overridden later with the $app->config() utility method.
// Cache middleware
public function call()
{
    $key = $this->app->request->getResourceUri();
    $response = $this->app->response;
    if ($ttl = $this->app->config('cache.ttl')) {
        $this->ttl = $ttl;
    }
    if (preg_match('|^' . $this->root . '.*|', $key)) {
        // Process cache here...
    }
    // Pass the game...
    $this->next->call();
}
The initial cache key is the resource URI. If it does not match with our root we pass the action to the next middleware. The next crossroad is the HTTP method: we want to clean the cache on update methods (PUT, POST and PATCH) and read from it on GET requests:
$method = strtolower($this->app->request->getMethod());
if ('get' === $method) {
    // Process cache here...
} else {
    if ($response->status() == 200) {
        $response->headers->set(
            'X-Cache',
            'NONE'
        );
        $this->clean($key);
    }
}
If a successful write action has been performed we clean the cache for the matching key. Actually the clean() method will clean all objects whose key starts with $key. If the request is a GET the cache engine starts working.
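The middleware’s fetch(), save() and clean() methods are not listed here. To illustrate the behavior just described, this is a minimal sketch of a store with the same three operations, backed by a plain PHP array instead of APC so that it runs anywhere. The class name ArrayCache, the method signatures and the optional $now argument (used to make expiry testable) are all assumptions.

```php
<?php
// Hypothetical in-memory stand-in for the APC-backed storage used by the middleware.
class ArrayCache
{
    private $items = array();

    // Store a value that expires $ttl seconds after $now.
    public function save(string $key, $value, int $ttl, ?int $now = null): void
    {
        $now = $now ?? time();
        $this->items[$key] = array('value' => $value, 'expires' => $now + $ttl);
    }

    // Return the stored value, or false when the key is missing or expired.
    public function fetch(string $key, ?int $now = null)
    {
        $now = $now ?? time();
        if (!isset($this->items[$key]) || $this->items[$key]['expires'] <= $now) {
            unset($this->items[$key]);
            return false;
        }
        return $this->items[$key]['value'];
    }

    // Remove every entry whose key starts with $prefix; returns how many were removed.
    public function clean(string $prefix): int
    {
        $removed = 0;
        foreach (array_keys($this->items) as $key) {
            if (strncmp((string) $key, $prefix, strlen($prefix)) === 0) {
                unset($this->items[$key]);
                $removed++;
            }
        }
        return $removed;
    }
}
```

Prefix matching is what lets one write request invalidate a resource together with all of its query string variants. It is also deliberately coarse: cleaning /api/v1/contacts/1 would remove entries for /api/v1/contacts/10 as well, which costs a few extra cache misses but never serves stale data.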
if ('get' === $method) {
    $queryString = http_build_query($this->app->request->get());
    if (!empty($queryString)) {
        $key .= '?' . $queryString;
    }
    $data = $this->fetch($key);
    if ($data) {
        // Cache hit... return the cached content
        $response->headers->set(
            'Content-Type',
            'application/json'
        );
        $response->headers->set(
            'X-Cache',
            'HIT'
        );
        try {
            $this->app->etag($data['checksum']);
            $this->app->expires($data['expires']);
            $response->body($data['content']);
        } catch (\Slim\Exception\Stop $e) {
        }
        return;
    }
    // Cache miss... continue on to generate the page
    $this->next->call();
    if ($response->status() == 200) {
        // Cache result for future look up
        $checksum = md5($response->body());
        $expires = time() + $this->ttl;
        $this->save(
            $key,
            array(
                'checksum' => $checksum,
                'expires' => $expires,
                'content' => $response->body(),
            )
        );
        $response->headers->set(
            'X-Cache',
            'MISS'
        );
        try {
            $this->app->etag($checksum);
            $this->app->expires($expires);
        } catch (\Slim\Exception\Stop $e) {
        }
        return;
    }
} else {
    // other methods...
}
First I’m computing the full key, which is the resource URI including the query string, then I search for it in the cache. If there is cached data (cache hit), it is in the form of an associative array made up of the md5 checksum, the expiration date and the actual content. The checksum and the expiration date are used for the ETag and Expires headers, the content fills the response body and the method returns. In Slim, the $app->etag() method takes care of If-None-Match headers sent by the client, returning a 304 Not Modified status code when they match.
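Slim does the ETag comparison for us, so none of this needs to be written by hand. Purely to show what the check amounts to, here is a sketch of the decision under the assumption that the ETag is the quoted md5 of the body, as in our middleware. The function name isNotModified() is hypothetical and this is not Slim’s implementation.

```php
<?php
// Hypothetical illustration of the If-None-Match comparison; not Slim's code.
function isNotModified(string $body, ?string $ifNoneMatch): bool
{
    if ($ifNoneMatch === null || trim($ifNoneMatch) === '') {
        return false; // No validator sent: always deliver the full response.
    }

    // ETag values are quoted strings; here the tag is the md5 of the body.
    $etag = '"' . md5($body) . '"';

    // The header may carry several comma-separated tags, or a lone asterisk.
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        if ($candidate === '*') {
            return true;
        }
        // If-None-Match uses weak comparison, so a W/ prefix is ignored.
        if (strncmp($candidate, 'W/', 2) === 0) {
            $candidate = (string) substr($candidate, 2);
        }
        if ($candidate === $etag) {
            return true;
        }
    }

    return false;
}
```

When the function returns true the server can answer 304 Not Modified with an empty body, and the client reuses the copy it already holds.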
If there are no cached data (cache miss) the action is passed to the other middleware and the response is processed normally. Our cache middleware is called again before rendering (like an onion, remember?), this time with the processed response. If the final response is valid (status 200) it gets saved in the cache for reuse and then sent to the client.
REST Rate limit
Before it’s too late we should have a way to limit clients’ calls to our API. Another middleware comes to our help here.
$app->add(new API\Middleware\RateLimit('/api/v1'));
public function call()
{
    $response = $this->app->response;
    $request = $this->app->request;
    if ($max = $this->app->config('rate.limit')) {
        $this->max = $max;
    }
    // Activate on given root URL only
    if (preg_match('|^' . $this->root . '.*|', $this->app->request->getResourceUri())) {
        // Use API key from the current user as ID
        if ($key = $this->app->user['apikey']) {
            $data = $this->fetch($key);
            if (false === $data) {
                // First time or previous period expired,
                // initialize and save a new entry
                $remaining = ($this->max - 1);
                $reset = 3600;
                $this->save(
                    $key,
                    array(
                        'remaining' => $remaining,
                        'created' => time()
                    ),
                    $reset
                );
            } else {
                // Take the current entry and update it
                $remaining = (--$data['remaining'] >= 0)
                    ? $data['remaining'] : -1;
                $reset = (($data['created'] + 3600) - time());
                $this->save(
                    $key,
                    array(
                        'remaining' => $remaining,
                        'created' => $data['created']
                    ),
                    $reset
                );
            }
            // Set rating headers
            $response->headers->set(
                'X-Rate-Limit-Limit',
                $this->max
            );
            $response->headers->set(
                'X-Rate-Limit-Reset',
                $reset
            );
            $response->headers->set(
                'X-Rate-Limit-Remaining',
                $remaining
            );
            // Check if the current key is allowed to pass
            if (0 > $remaining) {
                // Rewrite remaining headers
                $response->headers->set(
                    'X-Rate-Limit-Remaining',
                    0
                );
                // Exits with status "429 Too Many Requests" (see doc below)
                $this->fail();
            }
        } else {
            // Exits with status "429 Too Many Requests" (see doc below)
            $this->fail();
        }
    }
    $this->next->call();
}
We allow for a root path to be passed and, like the cache middleware, we can set other parameters like rate.limit from our application config. This middleware uses the $app->user context created by the authentication layer; the user API key is used as the key for the APC cache. If we don’t find data for the given key we generate it: I’m storing the remaining calls and the creation timestamp of the value, and I give it a TTL of an hour. If there is data in APC, I recalculate the remaining calls and save the updated values.
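The counting logic is easier to follow when separated from the storage and the headers. Here is a sketch of it as a pure function that takes the stored entry (or false when there is none) and returns the values the middleware saves and publishes. The function name nextRateState() and the $window parameter are assumptions; the arithmetic mirrors the middleware code above.

```php
<?php
// Hypothetical extraction of the counter arithmetic used by the rate limit middleware.
// $data is the stored entry, or false when nothing is stored for the API key.
function nextRateState($data, int $max, int $now, int $window = 3600): array
{
    if (!is_array($data)) {
        // First call in this window: consume one call and start the clock.
        return array('remaining' => $max - 1, 'created' => $now, 'reset' => $window);
    }

    return array(
        // Consume one call; -1 signals that the limit has been exceeded.
        'remaining' => max(-1, $data['remaining'] - 1),
        // The window keeps its original start time.
        'created' => $data['created'],
        // Seconds left until the window ends, also used as the new TTL.
        'reset' => ($data['created'] + $window) - $now,
    );
}
```

Keeping the original created timestamp and shrinking the TTL on every save is what makes this a fixed window: the counter disappears exactly one hour after the first call, no matter how many calls follow.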
Then I’m setting the X-Rate-Limit-* headers (it’s a convention, not a standard) and if the user has no remaining calls left I’m resetting X-Rate-Limit-Remaining to zero and failing with a 429 Too Many Requests status code. There’s a little workaround here: I cannot use Slim’s $app->halt() method to output the error because Apache versions prior to 2.4 don’t support the status code 429 and convert it silently into a 500 error. So the middleware uses its own fail() method:
protected function fail()
{
    header('HTTP/1.1 429 Too Many Requests', false, 429);
    // Write the remaining headers
    foreach ($this->app->response->headers as $key => $value) {
        header($key . ': ' . $value);
    }
    exit;
}
The method outputs a raw header to the client and, since the standard response flow is interrupted, it outputs all the headers that were previously generated by the response.
Where do we go from here?
We’ve covered a lot of stuff here, and we have our basic API that respects common best practices, but there are still many improvements we can add. For example:
- use a more robust ORM/Model for data access
- use a separate validation library that injects into the model
- use dependency injection to take advantage of other key/value storage engines instead of APC
- build a discovery service and playground with Swagger and similar tools
- build a test suite layer with Codeception
The full source code can be found here. As always I encourage you to experiment with the sample code to find and, hopefully, share your solutions. Happy coding!