In parts one and two, we built some very basic functionality and used TDD with PHPUnit to make sure our classes are well tested. We also learned how to test an abstract class in order to make sure its concrete methods worked. Now, let’s continue building our library.
Catch up
I took the liberty of implementing the functionality and the test for the abstract API class’ constructor, requiring the URL to be passed in. It’s very similar to what we did with the Diffbot and DiffbotTest classes.
I also added some more simple methods, along with tests for the different API instantiations and for the APIs’ custom fields, which are handled by dynamic setters and getters using __call. This seemed like too menial work to bother you with, as it’s highly repetitive and ultimately futile at this point, but if you’re curious, please leave a comment below and we’ll go through the part2-end > part3-start differences in another post. You can also diff the various files and ask about specific differences in the forums; I’d be happy to answer to the best of my knowledge, and to take some advice regarding their design. Additionally, I have moved the “runInSeparateProcess” directive from the entire DiffbotTest class to just the test that needs an empty static class, which reduced the duration of the entire testing phase to mere seconds.
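For readers who skipped the part 3 start branch, dynamic setters and getters built on __call could look roughly like the sketch below. This is not the library’s actual code: the class name FieldBag and its field list are made up for illustration, and only the setX/getX naming convention mirrors how the API classes are used later in this post.

```php
<?php

/**
 * Hypothetical stand-in, not part of the Diffbot library: stores boolean
 * flags for optional fields and exposes them through __call.
 */
class FieldBag
{
    /** @var array Names of the optional fields this object accepts (illustrative) */
    protected static $optionalFields = ['sku', 'offerPriceDetails'];

    /** @var array Field name => bool */
    protected $fieldSettings = [];

    public function __call($name, array $arguments)
    {
        $prefix = substr($name, 0, 3);
        $field = lcfirst((string)substr($name, 3));

        if (!in_array($field, static::$optionalFields, true)) {
            throw new \BadMethodCallException("No such method: $name");
        }

        if ($prefix === 'set') {
            // Fluent setter: setSku(true) stores the flag and returns $this.
            // A missing argument is treated as false.
            $this->fieldSettings[$field] = !empty($arguments[0]);
            return $this;
        }

        if ($prefix === 'get') {
            // Getter: fields that were never set read as false
            return isset($this->fieldSettings[$field]) ? $this->fieldSettings[$field] : false;
        }

        throw new \BadMethodCallException("No such method: $name");
    }
}

$bag = new FieldBag();
$bag->setSku(true)->setOfferPriceDetails(false);
```

The trade-off is that such methods are invisible to IDE autocompletion unless they are also documented with @method annotations.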
If you’re just now joining us, please download the part 3 start branch and catch up.
Data Mocking
We mentioned before that we would be mocking data in this part. This might sound more confusing than it is, so allow me to clarify. When we request a URL through Diffbot, we expect a certain result. For example, when requesting a specific Amazon product, we expect to get the parsed values for that product. However, if we rely on this live data in our tests, we face two problems:
- The tests become slower, because every run has to wait for the data to be fetched from Amazon
- The data can change and break our tests. Values our tests relied upon before can suddenly come back different.
Because of this, it’s best if we cache the entire response to a given API call offline, headers and all, and use it to fake a response to Guzzle (functionality Guzzle has built in). This way, we can feed Diffbot a fake every time during tests and make sure it gets the same data, thereby giving us consistent results. Matthew Setter has written about data mocking with Guzzle and PHPUnit before, if you’d like to take a look.
To get to the testing level we need, we’ll be faking the data that Diffbot returns. Doesn’t this mean that we aren’t effectively testing Diffbot itself but only our ability to parse the data? Exactly, it does. It’s not on us to test Diffbot – Diffbot’s crew does that. What we’re testing here is the ability to initiate API calls and parse the data they return – that’s all.
Updating the Main Class
First, we need to update the Diffbot class. Our API subclasses will need to know about the token, and about the HTTP client we’re using. To make this possible, we’ll be registering the Diffbot instance within the API subclasses upon their instantiation.
Start by adding the following property to the Api abstract class:
/** @var Diffbot The parent class which spawned this one */
protected $diffbot;
and then add the following method:
/**
* Sets the Diffbot instance on the child class
* Used to later fetch the token, HTTP client, EntityFactory, etc
* @param Diffbot $d
* @return $this
*/
public function registerDiffbot(Diffbot $d) {
$this->diffbot = $d;
return $this;
}
Then, we need to update the Diffbot class slightly.
Add the following content to Diffbot.php:
// At the top:
use GuzzleHttp\Client;
// As a property:
/** @var Client The HTTP client to perform requests with */
protected $client;
// Methods:
/**
* Sets the client to be used for querying the API endpoints
*
* @param Client $client
* @return $this
*/
public function setHttpClient(Client $client = null)
{
if ($client === null) {
$client = new Client();
}
$this->client = $client;
return $this;
}
/**
* Returns either the instance of the Guzzle client that has been defined, or null
* @return Client|null
*/
public function getHttpClient()
{
return $this->client;
}
/**
* Creates a Product API interface
*
* @param $url string Url to analyze
* @return Product
*/
public function createProductAPI($url)
{
$api = new Product($url);
if (!$this->getHttpClient()) {
$this->setHttpClient();
}
return $api->registerDiffbot($this);
}
/**
* Creates an Article API interface
*
* @param $url string Url to analyze
* @return Article
*/
public function createArticleAPI($url)
{
$api = new Article($url);
if (!$this->getHttpClient()) {
$this->setHttpClient();
}
return $api->registerDiffbot($this);
}
/**
* Creates an Image API interface
*
* @param $url string Url to analyze
* @return Image
*/
public function createImageAPI($url)
{
$api = new Image($url);
if (!$this->getHttpClient()) {
$this->setHttpClient();
}
return $api->registerDiffbot($this);
}
/**
* Creates an Analyze API interface
*
* @param $url string Url to analyze
* @return Analyze
*/
public function createAnalyzeAPI($url)
{
$api = new Analyze($url);
if (!$this->getHttpClient()) {
$this->setHttpClient();
}
return $api->registerDiffbot($this);
}
We added the ability to set a client, and we made it default to a new instance of Guzzle Client. We also improved direct methods for instantiation of API subtypes, all of which are nearly identical. This is on purpose – we might need specific configuration per API type later on, so separating them out like this will benefit us in the long run. Likewise, it contains some identical code which we’ll later detect and fix with PHPCPD (PHP Copy Paste Detector). The Diffbot class now also injects itself into spawned API classes.
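To see the self-registration and the lazily created default client in isolation, here is a stripped-down sketch. It deliberately avoids Guzzle so it can run on its own; SketchClient, SketchApi, SketchDiffbot, createApi and getDiffbot are hypothetical names used only for this illustration.

```php
<?php

/** Hypothetical stand-in for the HTTP client; this is not Guzzle. */
class SketchClient
{
}

/** Hypothetical stand-in for an API subclass such as Product. */
class SketchApi
{
    /** @var SketchDiffbot The parent instance which spawned this one */
    protected $diffbot;

    /** @var string */
    protected $url;

    public function __construct($url)
    {
        $this->url = $url;
    }

    public function registerDiffbot(SketchDiffbot $d)
    {
        $this->diffbot = $d;
        return $this;
    }

    public function getDiffbot()
    {
        return $this->diffbot;
    }
}

/** Hypothetical stand-in for the Diffbot class. */
class SketchDiffbot
{
    /** @var SketchClient|null */
    protected $client;

    /**
     * @param SketchClient|null $client
     * @return $this
     */
    public function setHttpClient($client = null)
    {
        // Fall back to a fresh client when none is supplied
        $this->client = ($client === null) ? new SketchClient() : $client;
        return $this;
    }

    public function getHttpClient()
    {
        return $this->client;
    }

    public function createApi($url)
    {
        $api = new SketchApi($url);
        if (!$this->getHttpClient()) {
            $this->setHttpClient();
        }
        // The spawned API keeps a reference to the object that created it
        return $api->registerDiffbot($this);
    }
}

$diffbot = new SketchDiffbot();
$api = $diffbot->createApi('https://example.com');
```

A client that was set explicitly beforehand is left untouched, which is what lets a test swap in a fake before any API object is created.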
Factories and Entities
You might be wondering – isn’t this a good place for the factory pattern? Making an object that’s strictly in charge of creating the APIs and nothing more? Sure, that might be so – but in my opinion, that’s over-engineering it. The Diffbot class was always meant to return new instances of the various APIs, and as such it performs its function through these methods. Likewise, our library has a very specific purpose, and was, from the ground-up, intended to depend heavily on Guzzle. Abstracting too much would get us lost in the complexity of something overly simple and would waste our time. Diffbot is our factory.
But there’s one thing it cannot and should not do. What we ultimately want the library to be able to do is give us back an object, for example a “Product” object, with accessors that let us read fetched and parsed fields in a fluent, object oriented manner. In other words:
// ... set URL etc
$product = $productApi->call();
echo $product->getOfferPrice();
This time, to make this possible, we’ll need a factory. Why a factory and not just have the API classes instantiate the entities like “Product” themselves? Because someone coming into contact with our library might want to parse the JSON results returned by Diffbot in a different manner – they might want to modify the output so that it matches their database and is compatible with direct insertion, or they might want an easy way to compare it with their own products, for example.
To be able to return such entities, they need an interface if they’re to be interchangeable. However, seeing as we know some of their always-on functionality, let’s make an abstract class instead. Create src/Abstracts/Entity.php:
<?php
namespace Swader\Diffbot\Abstracts;
use GuzzleHttp\Message\Response;
abstract class Entity
{
/** @var Response */
protected $response;
/** @var array */
protected $objects;
public function __construct(Response $response)
{
$this->response = $response;
$this->objects = $response->json()['objects'][0];
}
/**
* Returns the original response that was passed into the Entity
* @return Response
*/
public function getResponse()
{
return $this->response;
}
}
The Diffbot API returns a JSON object with two subobjects, request and objects, as evident from the JSON output of the test drive on the landing page. Guzzle’s Response Message supports outputting JSON data to arrays, but that’s about it. So, this class only has a constructor into which it accepts the response object and then binds the first element of the “objects” field (the one with the meaningful data) to another protected property.
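As a rough illustration of that shape, the sketch below decodes a hand-written JSON string with plain json_decode and picks out the first element of objects. The payload is invented for this example and is far smaller than a real Diffbot response; only the top-level request and objects keys, and the api key inside request, are taken from this article.

```php
<?php

// Hand-written sample, NOT real Diffbot output: only the "request" and
// "objects" keys (and "request.api") mirror the structure described above.
$json = '{
    "request": {"api": "product"},
    "objects": [
        {"title": "Sample Brush", "offerPrice": "$4.99"}
    ]
}';

// Passing true as the second argument yields nested associative arrays
$decoded = json_decode($json, true);

// The entity only keeps the first element of "objects"
$firstObject = $decoded['objects'][0];

echo $firstObject['title'], PHP_EOL;
```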
In order for the Factory to be interchangeable, let’s give it an interface. Create src/Interfaces/EntityFactory.php:
<?php
namespace Swader\Diffbot\Interfaces;
use GuzzleHttp\Message\Response;
interface EntityFactory
{
/**
* Returns the appropriate entity as built by the contents of $response
*
* @param Response $response
* @return \Swader\Diffbot\Abstracts\Entity
*/
public function createAppropriate(Response $response);
}
Now, we can implement it. Create src/Factory/Entity.php:
<?php
namespace Swader\Diffbot\Factory;
use GuzzleHttp\Message\Response;
use Swader\Diffbot\Exceptions\DiffbotException;
use Swader\Diffbot\Interfaces\EntityFactory;
class Entity implements EntityFactory
{
protected $apiEntities = [
'product' => '\Swader\Diffbot\Entity\Product',
'article' => '\Swader\Diffbot\Entity\Article',
'image' => '\Swader\Diffbot\Entity\Image',
'analyze' => '\Swader\Diffbot\Entity\Analyze',
'*' => '\Swader\Diffbot\Entity\Wildcard',
];
/**
* Creates an appropriate Entity from a given Response
* If no valid Entity can be found for the type of API, the Wildcard entity is selected
*
* @param Response $response
* @return \Swader\Diffbot\Abstracts\Entity
* @throws DiffbotException
*/
public function createAppropriate(Response $response)
{
$this->checkResponseFormat($response);
$arr = $response->json();
if (isset($this->apiEntities[$arr['request']['api']])) {
$class = $this->apiEntities[$arr['request']['api']];
} else {
$class = $this->apiEntities['*'];
}
return new $class($response);
}
/**
* Makes sure the Diffbot response has all the fields it needs to work properly
*
* @param Response $response
* @throws DiffbotException
*/
protected function checkResponseFormat(Response $response)
{
$arr = $response->json();
if (!isset($arr['objects'])) {
throw new DiffbotException('Objects property missing - cannot extract entity values');
}
if (!isset($arr['request'])) {
throw new DiffbotException('Request property not found in response!');
}
if (!isset($arr['request']['api'])) {
throw new DiffbotException('API property not found in request property of response!');
}
}
}
This is our basic Entity factory. It checks if the response is valid, and then based on that response creates an entity, pushes the response into it, and returns it. If it cannot find a valid entity (for example, the “api” field in the response doesn’t match a key as defined in the apiEntities property), it picks a wildcard entity. Later, we might even choose to upgrade this factory with the ability to change just some or all of the apiEntities pairs, so that users don’t have to write a whole new factory just to try out a different Entity, but let’s leave that be for now. The Factory class needs testing too, of course. To see the test, refer to the source code on Github (link at the bottom of the post).
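One possible shape for that upgrade is sketched below. It is only an assumption about how such a feature might look, not something the library provides: MapSketchFactory, setEntity and resolveClass are hypothetical names, and the sketch resolves class names only, without touching Guzzle or instantiating entities.

```php
<?php

/**
 * Hypothetical sketch of a factory whose API-to-entity map can be changed
 * at runtime. It only resolves class names.
 */
class MapSketchFactory
{
    /** @var array API name => entity class name (same defaults as in the article) */
    protected $apiEntities = [
        'product' => '\Swader\Diffbot\Entity\Product',
        'article' => '\Swader\Diffbot\Entity\Article',
        'image' => '\Swader\Diffbot\Entity\Image',
        'analyze' => '\Swader\Diffbot\Entity\Analyze',
        '*' => '\Swader\Diffbot\Entity\Wildcard',
    ];

    /**
     * Overrides (or adds) the entity class used for one API type
     *
     * @param string $api
     * @param string $class
     * @return $this
     */
    public function setEntity($api, $class)
    {
        $this->apiEntities[$api] = $class;
        return $this;
    }

    /**
     * Returns the class name for an API type, falling back to the wildcard
     *
     * @param string $api
     * @return string
     */
    public function resolveClass($api)
    {
        return isset($this->apiEntities[$api])
            ? $this->apiEntities[$api]
            : $this->apiEntities['*'];
    }
}

$factory = new MapSketchFactory();
// The replacement class name is made up for this example
$factory->setEntity('product', '\My\App\DatabaseReadyProduct');
```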
Finally, we need to build some of those Entities, else it’s all been for naught. For starters, create src/Entity/Product.php. What can we expect from our Product API call? Let’s take a look.
Focusing only on the “root” values inside the “objects” property, we can see we instantly get Title, Text, Availability, OfferPrice, and Brand, among others. We could use something like the __call magic method here to automatically discern what we’re looking for in the array, but for clarity, IDE autocompletion, and the possibility of additional parsing, let’s do it manually. Let’s just build those few now; I’ll leave the rest up to you as an exercise. If you’d like to see how I finished it, refer to the source code at the end of the article.
<?php
namespace Swader\Diffbot\Entity;
use Swader\Diffbot\Abstracts\Entity;
class Product extends Entity
{
/**
* Checks if the product has been determined available
* @return bool
*/
public function isAvailable()
{
return (bool)$this->objects['availability'];
}
/**
* Returns the product offer price, in USD, as a floating point number
* @return float
*/
public function getOfferPrice()
{
return (float)trim($this->objects['offerPrice'], '$');
}
/**
* Returns the brand, as determined by Diffbot
* @return string
*/
public function getBrand()
{
return $this->objects['brand'];
}
/**
* Returns the title, as read by Diffbot
* @return string
*/
public function getTitle()
{
return $this->objects['title'];
}
}
You can see here that a potentially good option would be implementing a currency converter into the Product entity, maybe even support injecting different converters. This would then allow users to read back the item price in currencies other than the default.
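A minimal sketch of that idea follows, assuming a converter is injected through the constructor. Nothing here exists in the library: PriceConverter, FixedRateConverter and PricedProduct are hypothetical names, and the exchange rate is an arbitrary number chosen to keep the arithmetic obvious.

```php
<?php

/** Hypothetical interface; the library in this article does not define it. */
interface PriceConverter
{
    /**
     * @param float $amountInUsd
     * @return float
     */
    public function fromUsd($amountInUsd);
}

/** Converts using a fixed rate supplied by the caller. */
class FixedRateConverter implements PriceConverter
{
    /** @var float */
    protected $rate;

    public function __construct($rate)
    {
        $this->rate = (float)$rate;
    }

    public function fromUsd($amountInUsd)
    {
        return round($amountInUsd * $this->rate, 2);
    }
}

/** Stand-in for the Product entity, reduced to the price logic. */
class PricedProduct
{
    /** @var array */
    protected $objects;

    /** @var PriceConverter|null */
    protected $converter;

    /**
     * @param array $objects
     * @param PriceConverter|null $converter
     */
    public function __construct(array $objects, $converter = null)
    {
        $this->objects = $objects;
        $this->converter = $converter;
    }

    public function getOfferPrice()
    {
        // Same parsing as in the Product entity above
        $usd = (float)trim($this->objects['offerPrice'], '$');
        // Without a converter, the price stays in USD
        return $this->converter ? $this->converter->fromUsd($usd) : $usd;
    }
}

// The rate of 2.0 is arbitrary and does not reflect any real currency
$product = new PricedProduct(['offerPrice' => '$4.99'], new FixedRateConverter(2.0));
```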
Naturally, the Product entity needs testing, too. For the sake of brevity, I’ll just refer you to the source code at the end of the post.
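Finally, we need to add the EntityFactory to the Diffbot instance, similar to what we did with the Guzzle Client. For the type hint and the default value to resolve, Diffbot.php needs to import Swader\Diffbot\Interfaces\EntityFactory and Swader\Diffbot\Factory\Entity at the top: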
/**
* Sets the Entity Factory which will create the Entities from Responses
* @param EntityFactory $factory
* @return $this
*/
public function setEntityFactory(EntityFactory $factory = null)
{
if ($factory === null) {
$factory = new Entity();
}
$this->factory = $factory;
return $this;
}
/**
* Returns the Factory responsible for creating Entities from Responses
* @return EntityFactory
*/
public function getEntityFactory()
{
return $this->factory;
}
Don’t forget to add the factory protected property:
/** @var EntityFactory The Factory which creates Entities from Responses */
protected $factory;
Now, let’s talk some more about mocking the resources.
Creating Mocks
Creating mock response files to be used with Guzzle is very simple (and necessary to be able to test our classes). They need to look like a raw HTTP response: the status line and headers first, then the body, which in our case is JSON rather than XML. This is easily accomplished with cURL. First, create the folder tests/Mocks/Products. Use the terminal to enter it, and execute the following command:
curl -i "http://api.diffbot.com/v3/product?url=http%3A%2F%2Fwww.petsmart.com%2Fdog%2Fgrooming-supplies%2Fgrreat-choice-soft-slicker-dog-brush-zid36-12094%2Fcat-36-catid-100016&token=demo&fields=saveAmount,mpn,prefixCode,meta,sku,queryString,saveAmountDetails,shippingAmount,productOrigin,regularPriceDetails,offerPriceDetails" > dogbrush.json
Opening the dogbrush.json file will show you the full content of the response, both headers and body.
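To get a feel for what such a file contains, the sketch below splits a raw response into status line, headers and body using plain PHP. It is a simplified illustration only: splitRawResponse is a hypothetical helper, the sample response is hand-written rather than copied from Diffbot, and Guzzle does its own parsing when it loads a mock.

```php
<?php

/**
 * Splits a raw HTTP response (as saved by `curl -i`) into its status line,
 * headers and body. A simplified sketch: repeated headers overwrite each
 * other and interim responses such as "100 Continue" are not handled.
 *
 * @param string $raw
 * @return array with keys "status", "headers" and "body"
 */
function splitRawResponse($raw)
{
    // Headers and body are separated by the first blank line
    $parts = preg_split('/\r?\n\r?\n/', $raw, 2);
    $headerLines = preg_split('/\r?\n/', $parts[0]);
    $status = array_shift($headerLines);

    $headers = [];
    foreach ($headerLines as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip anything that is not a "Name: value" pair
        }
        list($name, $value) = explode(':', $line, 2);
        $headers[trim($name)] = trim($value);
    }

    return [
        'status' => $status,
        'headers' => $headers,
        'body' => isset($parts[1]) ? $parts[1] : '',
    ];
}

// Hand-written miniature response; a real dogbrush.json is much longer
$raw = "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n"
    . '{"request":{"api":"product"},"objects":[{"title":"Sample"}]}';

$parsed = splitRawResponse($raw);
$data = json_decode($parsed['body'], true);
```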
Testing the Call
Now that we have our dogbrush response, and all our classes are ready, we can use it to test the Product API. For that, we need to develop the call method. Once the call is executed, we expect to get back some values that Diffbot parsed, and we expect them to be correct.
We start with the test. Edit ProductApiTest.php to look like this:
<?php
namespace Swader\Diffbot\Test\Api;
use GuzzleHttp\Client;
use GuzzleHttp\Subscriber\Mock;
use Swader\Diffbot\Diffbot;
class ProductApiTest extends \PHPUnit_Framework_TestCase
{
protected $validMock;
protected function getValidDiffbotInstance()
{
return new Diffbot('demo');
}
protected function getValidMock(){
if (!$this->validMock) {
$this->validMock = new Mock(
[file_get_contents(__DIR__.'/../Mocks/Products/dogbrush.json')]
);
}
return $this->validMock;
}
public function testCall() {
$diffbot = $this->getValidDiffbotInstance();
$fakeClient = new Client();
$fakeClient->getEmitter()->attach($this->getValidMock());
$diffbot->setHttpClient($fakeClient);
$diffbot->setEntityFactory();
$api = $diffbot->createProductAPI('https://dogbrush-mock.com');
/** @var Product $product */
$product = $api->call();
$targetTitle = 'Grreat Choice® Soft Slicker Dog Brush';
$this->assertEquals($targetTitle, $product->getTitle());
$this->assertTrue($product->isAvailable());
$this->assertEquals(4.99, $product->getOfferPrice());
$this->assertEquals('Grreat Choice', $product->getBrand());
}
}
We get a valid Diffbot instance first, then create and inject a faked Guzzle client with our previously downloaded response. We set the Entity Factory, and after we invoke the call method, we expect a specific title to be returned. Should this assertion not hold, the test will fail. We also test other properties for which we know the true values.
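The principle behind the mock, canned responses served in order instead of real network traffic, can be shown with a hand-rolled stand-in. QueuedFakeClient below is hypothetical and much cruder than Guzzle’s Mock subscriber; it only illustrates why the URL passed to the API no longer matters once a fake is in place.

```php
<?php

/**
 * Hypothetical fake client: returns queued canned bodies in order and
 * never touches the network. Not how Guzzle implements its Mock subscriber.
 */
class QueuedFakeClient
{
    /** @var array Canned response bodies, served first-in first-out */
    protected $queue;

    /** @var array URLs that were requested, kept for later inspection */
    public $requested = [];

    public function __construct(array $cannedBodies)
    {
        $this->queue = $cannedBodies;
    }

    /**
     * @param string $url Recorded, but otherwise ignored
     * @return string The next canned body
     */
    public function get($url)
    {
        if (empty($this->queue)) {
            throw new \UnderflowException('No canned responses left');
        }
        $this->requested[] = $url;
        return array_shift($this->queue);
    }
}

$client = new QueuedFakeClient([
    '{"objects":[{"title":"First"}]}',
    '{"objects":[{"title":"Second"}]}',
]);

// Whatever URL is requested, the next queued body comes back
$first = json_decode($client->get('https://dogbrush-mock.com'), true);
$second = json_decode($client->get('https://any-other-url.example'), true);
```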
Now we need to develop the call method. It will be identical for all our APIs (they all do the same thing, essentially: a remote request to a given URL), so we put it into the abstract Api class:
public function call()
{
$response = $this->diffbot->getHttpClient()->get($this->buildUrl());
return $this->diffbot->getEntityFactory()->createAppropriate($response);
}
You can see here we’re using a buildUrl method. This method will use all the custom fields of an API to enhance a default URL and pass the fields along with the request, in order to return the additional values we’ve requested.
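First, we write some tests for it in ProductApiTest, so that we know the build works for the given API type. The tests rely on an apiWithMock property that isn’t shown in this post; judging by the expected URLs, it holds a Product API for https://dogbrush-mock.com created with the demo token (for example, in setUp). Here are the tests: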
public function testBuildUrlNoCustomFields() {
$url = $this
->apiWithMock
->buildUrl();
$expectedUrl = 'http://api.diffbot.com/v3/product/?token=demo&url=https%3A%2F%2Fdogbrush-mock.com';
$this->assertEquals($expectedUrl, $url);
}
public function testBuildUrlOneCustomField() {
$url = $this
->apiWithMock
->setOfferPriceDetails(true)
->buildUrl();
$expectedUrl = 'http://api.diffbot.com/v3/product/?token=demo&url=https%3A%2F%2Fdogbrush-mock.com&fields=offerPriceDetails';
$this->assertEquals($expectedUrl, $url);
}
public function testBuildUrlTwoCustomFields() {
$url = $this
->apiWithMock
->setOfferPriceDetails(true)
->setSku(true)
->buildUrl();
$expectedUrl = 'http://api.diffbot.com/v3/product/?token=demo&url=https%3A%2F%2Fdogbrush-mock.com&fields=sku,offerPriceDetails';
$this->assertEquals($expectedUrl, $url);
}
Here is the method in full. It is declared public, because the tests above call it from outside the class:
public function buildUrl()
{
$url = rtrim($this->apiUrl, '/') . '/';
// Add Token
$url .= '?token=' . $this->diffbot->getToken();
// Add URL
$url .= '&url='.urlencode($this->url);
// Add Custom Fields
$fields = static::getOptionalFields();
$fieldString = '';
foreach ($fields as $field) {
$methodName = 'get' . ucfirst($field);
$fieldString .= ($this->$methodName()) ? $field . ',' : '';
}
$fieldString = trim($fieldString, ',');
if ($fieldString != '') {
$url .= '&fields=' . $fieldString;
}
return $url;
}
If we run the tests now, everything should pass.
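If you’d like to experiment with the URL-building steps outside the class, a standalone version might look like the sketch below. The function name buildDiffbotUrl and its parameter list are illustrative only; instead of calling getters, it takes the custom fields as an array of name => bool pairs.

```php
<?php

/**
 * Standalone version of the URL-building steps, for experimenting outside
 * the Api class. Name and parameters are hypothetical.
 *
 * @param string $apiUrl Base endpoint, with or without a trailing slash
 * @param string $token  Diffbot token
 * @param string $url    URL to analyze
 * @param array  $fields Field name => bool (true means "request this field")
 * @return string
 */
function buildDiffbotUrl($apiUrl, $token, $url, array $fields = [])
{
    $result = rtrim($apiUrl, '/') . '/';
    $result .= '?token=' . $token;
    $result .= '&url=' . urlencode($url);

    // Keep only the names of the fields that were switched on
    $enabled = array_keys(array_filter($fields));
    if (!empty($enabled)) {
        $result .= '&fields=' . implode(',', $enabled);
    }

    return $result;
}

$built = buildDiffbotUrl(
    'http://api.diffbot.com/v3/product',
    'demo',
    'https://dogbrush-mock.com',
    ['sku' => true, 'offerPriceDetails' => true, 'mpn' => false]
);
```

As in the method above, the order of the names in the fields parameter follows the order in which the fields are listed, not the order in which they were switched on.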
Conclusion
In this part, we did some more TDD, moving in the direction of completion. Due to the extensiveness of the content that can be written on the topic of testing, I’ve decided to cut the story short here. We should definitely test for whether or not the custom fields work, good URLs are built, and so on and so forth – the areas you can test are nigh infinite, but it would unnecessarily extend the tutorial beyond one’s attention span. You can always see the fully finished result on Github. If, however, you’re interested in reading a tutorial on the rest of the logic, please do let me know and I’ll do my best to continue explaining, piece by piece.
Don’t forget to implement the other entities and tests as “homework”! It seems like a waste of time, but in time you’ll grow to love the security tests give you in the long run, I promise! Likewise, if you have any ideas on improving my approaches, I’m always open to learning something new and will gladly take a look at pull requests and constructive feedback.
In the next and final part, we’ll wrap things up and deploy our package to Packagist.org so everyone can install it at will via Composer.
Bruno is a blockchain developer and technical educator at the Web3 Foundation, the foundation that's building the next generation of the free people's internet. He runs two newsletters you should subscribe to if you're interested in Web3.0: Dot Leap covers ecosystem and tech development of Web3, and NFT Review covers the evolution of the non-fungible token (digital collectibles) ecosystem inside this emerging new web. His current passion project is RMRK.app, the most advanced NFT system in the world, which allows NFTs to own other NFTs, NFTs to react to emotion, NFTs to be governed democratically, and NFTs to be multiple things at once.