Using the RulerZ Rule Engine to Smarten up Playlist Building
Rule engines are funny things. They’re typically complex, and meant to replace simple conditional logic. The problem they solve is one of scale.
When your application grows so large that the logic to display or enable functionality is spread over a large area, conditional logic leads to bugs and missed edge cases: if statements that don’t cover every case they need to, or every path through your application.
This is when good rule engines shine. Perhaps I’m being a little too abstract here. Let’s look at an example…
You can find the example code at https://github.com/assertchris-tutorials/rulerz.
The Problem
I listen to music all the time. I prefer iTunes, over other media players, for many reasons. But one of the reasons that stands out is that I can use iTunes to build large, complex playlists for me.
I can give iTunes a few rules, and it will update the list of playlist tracks based on those rules, without me having to think about how it is doing that.
But how does it do this? How does it build my simple rules into a filter for tracks? When I tell it things like “give me everything from The Glitch Mob, produced before 2015, where the play count is less than 20”, it understands what I mean.
Now, we could make these Smart Playlists with many conditionals. If you’re anything like me though, you just cringed at that thought.
Enter RulerZ
RulerZ is a rule engine. It’s an implementation of the Specification pattern. You know where else you’ve seen the Specification pattern? In database abstraction layers like Eloquent and Doctrine!
The basic idea is that you start with some sort of list. It could be users in a database, or expenses in a CSV file. Then you read them into memory (or even filter before that) and filter them according to some chain-based logic. You know the kind:
$list
    ->whereArtist("The Glitch Mob")
    ->whereYearLessThan(2015)
    ->wherePlayCountLessThan(20)
    ->all();
In database abstraction layers this is usually done by generating SQL. We send that SQL to the database server, where the records are brought into memory and then filtered. What we get is the already-filtered list, but the idea is still the same.
We wouldn’t want to write those filters as hand-made conditionals in PHP. The Specification pattern (and by extension SQL) is great for applying this kind of boolean logic.
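To make the Specification idea a little more concrete, here is a minimal sketch in plain PHP. It is not RulerZ code: the helper names (fieldEquals, fieldLessThan, allOf) are made up for illustration. Each specification is just a small predicate, and bigger rules are built by combining them.

```php
<?php

// Hypothetical helpers for illustration only; none of these come from RulerZ.

// A specification is a predicate: it takes a track and returns a boolean.
function fieldEquals(string $field, $expected): callable
{
    return function (array $track) use ($field, $expected): bool {
        return $track[$field] === $expected;
    };
}

function fieldLessThan(string $field, $limit): callable
{
    return function (array $track) use ($field, $limit): bool {
        return $track[$field] < $limit;
    };
}

// Combine any number of specifications with a logical AND.
function allOf(callable ...$specs): callable
{
    return function (array $track) use ($specs): bool {
        foreach ($specs as $spec) {
            if (!$spec($track)) {
                return false;
            }
        }

        return true;
    };
}

$spec = allOf(
    fieldEquals("artist", "The Glitch Mob"),
    fieldLessThan("year", 2015),
    fieldLessThan("plays", 20)
);

$sampleTracks = [
    ["title" => "Animus Vox", "artist" => "The Glitch Mob", "plays" => 36, "year" => 2010],
    ["title" => "Bad Wings", "artist" => "The Glitch Mob", "plays" => 12, "year" => 2010],
];

// array_filter() keeps the tracks for which the combined specification is true.
$matches = array_values(array_filter($sampleTracks, $spec));
```

The point is that each rule stays tiny and reusable, and the combination logic lives in one place instead of being scattered through nested if statements.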
Let’s take a look at how to use RulerZ:
use RulerZ\Compiler;
use RulerZ\Parser;
use RulerZ\RulerZ;

$compiler = new Compiler\EvalCompiler(
    $parser = new Parser\HoaParser()
);

$rulerz = new RulerZ(
    $compiler, [
        $visitor = new Compiler\Target\ArrayVisitor(),
    ]
);

$tracks = [
    [
        "title" => "Animus Vox",
        "artist" => "The Glitch Mob",
        "plays" => 36,
        "year" => 2010
    ],
    [
        "title" => "Bad Wings",
        "artist" => "The Glitch Mob",
        "plays" => 12,
        "year" => 2010
    ],
    [
        "title" => "We Swarm",
        "artist" => "The Glitch Mob",
        "plays" => 28,
        "year" => 2010
    ]
    // ...
];

$filtered = $rulerz->filter(
    $tracks,
    "artist = :artist and year < :year and plays < :plays",
    [
        "artist" => "The Glitch Mob",
        "year" => 2015,
        "plays" => 20
    ]
);
In this example, we have a list of tracks. This could be something we export from iTunes…
We create a rule compiler, and a new RulerZ instance. We can then use the RulerZ instance to filter our track list. We combine the textual rules with the parameter list to create the boolean filter logic.
Like SQL but in PHP, against records stored in memory. It’s simple and elegant!
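If you want to sanity-check what that rule should return, here is a hand-written equivalent in plain PHP. It is only a sketch: it assumes the rule's = and < behave like PHP's own comparison operators, and it doesn't touch RulerZ at all.

```php
<?php

$tracks = [
    ["title" => "Animus Vox", "artist" => "The Glitch Mob", "plays" => 36, "year" => 2010],
    ["title" => "Bad Wings", "artist" => "The Glitch Mob", "plays" => 12, "year" => 2010],
    ["title" => "We Swarm", "artist" => "The Glitch Mob", "plays" => 28, "year" => 2010],
];

$parameters = [
    "artist" => "The Glitch Mob",
    "year" => 2015,
    "plays" => 20,
];

// Hand-written version of:
// artist = :artist and year < :year and plays < :plays
$expected = array_values(array_filter(
    $tracks,
    function (array $track) use ($parameters): bool {
        return $track["artist"] == $parameters["artist"]
            && $track["year"] < $parameters["year"]
            && $track["plays"] < $parameters["plays"];
    }
));

foreach ($expected as $track) {
    echo $track["title"], PHP_EOL; // prints "Bad Wings"
}
```

Of the three sample tracks, only "Bad Wings" has fewer than 20 plays, so that is the single track we'd expect back.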
Building Smart Playlists
Let’s put this knowledge to use! We’ll begin by extracting an iTunes library:
Open iTunes, click “File” → “Library” → “Export Library…”
Save the XML file as library.xml, in your working directory.
Depending on the size of your library, this file may be large. My library.xml file is about 46k lines long…
This XML file can be difficult to work with. It’s in an odd key/value format. So we’re going to convert it to a JSON file, containing track data only:
$document = new DOMDocument();
// The export is XML, so we load it with the XML parser
$document->load("library.xml");

$tracks = $document
    ->getElementsByTagName("dict")->item(0) // root node
    ->getElementsByTagName("dict")->item(0) // track container
    ->getElementsByTagName("dict"); // track nodes

$clean = [];

foreach ($tracks as $track) {
    $key = null;
    $all = [];

    foreach ($track->childNodes as $node) {
        // Skip the whitespace text nodes between elements
        if ($node->nodeType !== XML_ELEMENT_NODE) {
            continue;
        }

        if ($node->tagName == "key") {
            // Strip spaces, so "Play Count" becomes "PlayCount"
            $key = str_replace(" ", "", $node->nodeValue);
        } elseif ($key !== null) {
            $all[$key] = $node->nodeValue;
            $key = null;
        }
    }

    $clean[] = $all;
}

file_put_contents(
    "tracks.json", json_encode($clean)
);
We create a DOMDocument object, to allow us to step through the XML nodes. There are three levels to this file: root dict node → library dict node → track dict nodes.
For each track node, we step through each child node. Half of them are key nodes (with dictionary key strings) and the other half are value nodes. So we store each key until we get a value to go with it. This is a bit of a hack, but it does the job. We only need to run this once to get a nice track list, and RulerZ will use it thereafter!
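The key/value pairing is easier to follow on a tiny input. Here is a minimal sketch that runs the same idea against a made-up three-field fragment (the XML string is invented for this example, not taken from a real export):

```php
<?php

// A tiny, made-up fragment in the same key/value style as the iTunes export.
$xml = <<<XML
<dict>
    <key>Name</key><string>Bad Wings</string>
    <key>Artist</key><string>The Glitch Mob</string>
    <key>Play Count</key><integer>12</integer>
</dict>
XML;

$document = new DOMDocument();
$document->loadXML($xml);

$track = [];
$key = null;

foreach ($document->documentElement->childNodes as $node) {
    // Skip the whitespace text nodes between elements.
    if ($node->nodeType !== XML_ELEMENT_NODE) {
        continue;
    }

    if ($node->tagName === "key") {
        // Remember the key (minus spaces) until its value node arrives.
        $key = str_replace(" ", "", $node->nodeValue);
    } elseif ($key !== null) {
        // Any other element is the value for the key we just saw.
        $track[$key] = $node->nodeValue;
        $key = null;
    }
}

echo json_encode($track), PHP_EOL;
// {"Name":"Bad Wings","Artist":"The Glitch Mob","PlayCount":"12"}
```

Note that every value comes out as a string, including the play count. That matters later, when we compare numbers.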
If you want to debug this code, I suggest you export playlists (as XML files) instead. That way you can have a much smaller library.xml file to work with. You don’t want to repeat this extraction many times, on a large list. Trust me…
Then we need to create a form, for the filters:
$filterCount = 0;
$filtered = [];

function option($value, $label, $selected = null) {
    // Quote the value so it stays a single attribute
    $parameters = "value='{$value}'";

    if ($value == $selected) {
        $parameters .= " selected='selected'";
    }

    return "<option {$parameters}>{$label}</option>";
}
We begin with $filterCount, which is the number of filters applied so far. We’re not persisting any filters yet, so this will always be 0. We also create an array of filtered tracks, though this will also be empty for now.
Then we define a function for rendering option elements. This cuts down on the work we have to do later. Yay! Next up is the markup:
<form method="post">
    <div>
        <select name="field[<?= $filterCount ?>]">
            <?= option("Name", "Name") ?>
            <?= option("Artist", "Artist") ?>
            <?= option("Album", "Album") ?>
            <?= option("Year", "Year") ?>
        </select>
        <select name="operator[<?= $filterCount ?>]">
            <?= option("contains", "contains") ?>
            <?= option("begins", "begins with") ?>
            <?= option("ends", "ends with") ?>
            <?= option("is", "is") ?>
            <?= option("not", "is not") ?>
            <?= option("gt", "greater than") ?>
            <?= option("lt", "less than") ?>
        </select>
        <input type="text" name="query[<?= $filterCount ?>]" />
    </div>
    <input type="submit" value="filter" />
</form>
<?php foreach ($filtered as $track): ?>
    <div>
        <?= htmlspecialchars($track["Artist"] ?? "") ?>,
        <?= htmlspecialchars($track["Album"] ?? "") ?>,
        <?= htmlspecialchars($track["Name"] ?? "") ?>
    </div>
<?php endforeach; ?>
Here we’ve created markup for adding a single filter. The fields are named field[0], operator[0] and query[0], which will make sense the more we work on this.
We also step through the array of filtered tracks, displaying the artist, album and name of each. This array is empty right now, but we’ll add tracks to it shortly.
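To see what the option() helper in the markup above actually produces, here is a stand-alone sketch. The name renderOption is hypothetical (so it won't clash with the article's function), and unlike the original it also escapes its arguments:

```php
<?php

// Hypothetical variant of option() that also escapes its output.
function renderOption(string $value, string $label, ?string $selected = null): string
{
    $attributes = sprintf("value='%s'", htmlspecialchars($value, ENT_QUOTES));

    // Mark the option as selected when it matches the posted value.
    if ($value === $selected) {
        $attributes .= " selected='selected'";
    }

    return sprintf(
        "<option %s>%s</option>",
        $attributes,
        htmlspecialchars($label, ENT_QUOTES)
    );
}

echo renderOption("begins", "begins with"), PHP_EOL;
// <option value='begins'>begins with</option>

echo renderOption("lt", "less than", "lt"), PHP_EOL;
// <option value='lt' selected='selected'>less than</option>
```

The third argument is what lets us re-select posted values later on.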
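We’ve created a small subset of the filter options we could create. Each track has the following kinds of data (the keys are shown as iTunes exports them; the conversion script above strips the spaces, so “Play Count” is stored as “PlayCount”):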
{
"Track ID": "238",
"Name": "Broken Bones (Bonus Track)",
"Artist": "CHVRCHES",
"Album Artist": "CHVRCHES",
"Composer": "CHVRCHES",
"Album": "The Bones of What You Believe (Special Edition)",
"Genre": "Alternative",
"Kind": "Purchased AAC audio file",
"Size": "7872373",
"Total Time": "224721",
"Disc Number": "1",
"Disc Count": "1",
"Track Number": "14",
"Track Count": "16",
"Year": "2013",
"Date Modified": "2014-05-21T09:45:09Z",
"Date Added": "2013-11-24T22:18:35Z",
"Bit Rate": "256",
"Sample Rate": "44100",
"Play Count": "133",
"Play Date": "3513745347",
"Play Date UTC": "2015-05-05T20:22:27Z",
"Skip Count": "1",
"Skip Date": "2014-01-30T21:44:20Z",
"Release Date": "2013-09-24T07:00:00Z",
"Normalization": "1979",
"Artwork Count": "1",
"Sort Album": "Bones of What You Believe (Special Edition)",
"Persistent ID": "B05B025A46F6F2BB",
"Track Type": "File",
"Purchased": "",
"Location": "file://.../track.m4a",
"File Folder Count": "5",
"Library Folder Count": "1"
}
Aside from the textual filters we’ve already added, we can add our own custom operators:
$visitor->setOperator("my_is", function($field, $value) {
    return $field == $value;
});

$visitor->setOperator("my_not", function($field, $value) {
    return $field != $value;
});

$visitor->setOperator("my_contains", function($field, $value) {
    // Case-insensitive search; return a real boolean
    return stripos($field, $value) !== false;
});

$visitor->setOperator("my_begins", function($field, $value) {
    // preg_quote() stops characters like "." acting as regex syntax
    return preg_match("/^" . preg_quote($value, "/") . "/i", $field) == 1;
});

$visitor->setOperator("my_ends", function($field, $value) {
    return preg_match("/" . preg_quote($value, "/") . "$/i", $field) == 1;
});

$visitor->setOperator("my_gt", function($field, $value) {
    return $field > $value;
});

// Named my_lt so it matches the "my_" prefix used when building queries
$visitor->setOperator("my_lt", function($field, $value) {
    return $field < $value;
});
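These operators are ordinary closures, so we can poke at them without RulerZ. Here is a small sketch with stand-alone copies of three of them (keeping them in a plain array is just an assumption of this example). The "contains" and "begins" copies escape the search value; the "less than" copy shows that numeric strings, like the ones in our JSON file, are compared as numbers:

```php
<?php

// Stand-alone copies of three operators, kept in a plain array for testing.
// RulerZ itself is not involved here.
$operators = [
    "my_contains" => function ($field, $value) {
        return stripos($field, $value) !== false;
    },
    "my_begins" => function ($field, $value) {
        return preg_match("/^" . preg_quote($value, "/") . "/i", $field) === 1;
    },
    "my_lt" => function ($field, $value) {
        return $field < $value;
    },
];

var_dump($operators["my_contains"]("The Glitch Mob", "glitch")); // bool(true)
var_dump($operators["my_begins"]("The Glitch Mob", "glitch"));   // bool(false)
var_dump($operators["my_begins"]("The Glitch Mob", "the"));      // bool(true)

// Thanks to preg_quote(), the "." only matches a literal dot.
var_dump($operators["my_begins"]("AxB Records", "A.B"));         // bool(false)

// Both sides are numeric strings, so PHP compares them as numbers.
var_dump($operators["my_lt"]("9", "10"));                        // bool(true)
```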
We can use these in other textual queries, like: my_contains(Artist, 'Glitch'). In fact, we can begin to stitch the form filters together, using these:
// Defaults, so the form still renders before anything is posted
$fields = [];
$operators = [];
$values = [];
$query = "";

if (isset($_POST["field"])) {
    $fields = $_POST["field"];
    $operators = $_POST["operator"];
    $values = $_POST["query"];

    foreach ($fields as $i => $field) {
        $operator = $operators[$i];
        $value = $values[$i];

        if (trim($field) && trim($operator) && trim($value)) {
            if ($query) {
                $query .= " and ";
            }

            $query .= "my_{$operator}({$field}, '{$value}')";
        }
    }

    $filterCount = count($fields);
}
This code checks if there are posted filters. For each posted filter, we get the field, operator and query value. If none of these are empty (which is what we use trim to check) then we add a clause to the query string.
We also adjust $filterCount so new filter fields are added to the end of the list. Finally, we need to filter the exported track list:
$tracks = json_decode(
    file_get_contents("tracks.json"), true
);

// Only filter once at least one complete filter has been posted
if (!empty($query)) {
    $filtered = $rulerz->filter($tracks, $query);
}
This takes the iTunes export we made earlier and filters it according to the dynamic query we just made.
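It helps to see the query string that comes out of the form. Here is a sketch of the same stitching logic as a function, with one addition: it only accepts fields and operators that appear in the form's option lists. The function name buildQuery is hypothetical, and the sketch does not escape single quotes inside the value, so that part is still an open problem.

```php
<?php

// Hypothetical helper: builds the rule string from the three posted arrays.
function buildQuery(array $fields, array $operators, array $values): string
{
    // The same options the form offers.
    $allowedFields = ["Name", "Artist", "Album", "Year"];
    $allowedOperators = ["contains", "begins", "ends", "is", "not", "gt", "lt"];

    $clauses = [];

    foreach ($fields as $i => $field) {
        $operator = $operators[$i] ?? "";
        $value = trim($values[$i] ?? "");

        // Skip incomplete rows and anything that isn't one of the form's options.
        if ($value === ""
            || !in_array($field, $allowedFields, true)
            || !in_array($operator, $allowedOperators, true)
        ) {
            continue;
        }

        $clauses[] = "my_{$operator}({$field}, '{$value}')";
    }

    return implode(" and ", $clauses);
}

echo buildQuery(
    ["Artist", "Year", "Album"],
    ["contains", "lt", "is"],
    ["Glitch", "2015", ""]
), PHP_EOL;
// my_contains(Artist, 'Glitch') and my_lt(Year, '2015')
```

Collecting the clauses in an array and joining them with implode() also removes the need for the "do we already have a query?" check.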
Displaying Posted Filters
Let’s display posted filters in the form, so we can see which filters are being applied to the current result-set:
<form method="post">
    <?php if (!empty($fields)): ?>
        <?php for ($i = 0; $i < $filterCount; $i++): ?>
            <div>
                <select name="field[<?= $i ?>]">
                    <?= option("Name", "Name", $fields[$i]) ?>
                    <?= option("Artist", "Artist", $fields[$i]) ?>
                    <?= option("Album", "Album", $fields[$i]) ?>
                    <?= option("Year", "Year", $fields[$i]) ?>
                </select>
                <select name="operator[<?= $i ?>]">
                    <?= option("contains", "contains", $operators[$i]) ?>
                    <?= option("begins", "begins with", $operators[$i]) ?>
                    <?= option("ends", "ends with", $operators[$i]) ?>
                    <?= option("is", "is", $operators[$i]) ?>
                    <?= option("not", "is not", $operators[$i]) ?>
                    <?= option("gt", "greater than", $operators[$i]) ?>
                    <?= option("lt", "less than", $operators[$i]) ?>
                </select>
                <input
                    type="text"
                    name="query[<?= $i ?>]"
                    value="<?= htmlspecialchars($values[$i]) ?>" />
            </div>
        <?php endfor; ?>
    <?php endif; ?>
    <div>
        <select name="field[<?= $filterCount ?>]">
            <?= option("Name", "Name") ?>
            <?= option("Artist", "Artist") ?>
            <?= option("Album", "Album") ?>
            <?= option("Year", "Year") ?>
        </select>
        <select name="operator[<?= $filterCount ?>]">
            <?= option("contains", "contains") ?>
            <?= option("begins", "begins with") ?>
            <?= option("ends", "ends with") ?>
            <?= option("is", "is") ?>
            <?= option("not", "is not") ?>
            <?= option("gt", "greater than") ?>
            <?= option("lt", "less than") ?>
        </select>
        <input type="text" name="query[<?= $filterCount ?>]" />
    </div>
    <input type="submit" value="filter" />
</form>
This is much like the previous form we had. Now we’re basing option selection on posted values.
We’re not removing empty filters. Consider that an exercise left to the reader!
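If you'd like a head start on that exercise, here is one possible sketch. The function name removeEmptyFilters is made up; it drops any row with an empty query value and reindexes what's left, so the form can be rendered from the cleaned arrays.

```php
<?php

// Hypothetical helper: drops rows whose query value is empty and reindexes the rest.
function removeEmptyFilters(array $fields, array $operators, array $values): array
{
    $kept = ["field" => [], "operator" => [], "query" => []];

    foreach ($values as $i => $value) {
        // Skip empty values and rows that are missing a field or an operator.
        if (trim((string) $value) === "" || !isset($fields[$i], $operators[$i])) {
            continue;
        }

        $kept["field"][] = $fields[$i];
        $kept["operator"][] = $operators[$i];
        $kept["query"][] = $value;
    }

    return $kept;
}

$kept = removeEmptyFilters(
    ["Artist", "Name", "Year"],
    ["contains", "contains", "lt"],
    ["Glitch", "", "2015"]
);

// The middle row had no query value, so two filters remain.
$remaining = count($kept["field"]);

print_r($kept["field"]);
```

You would call this on the posted arrays before building the query, then use the count of the remaining rows as the new $filterCount.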
Conclusion
This was an interesting project for me. It’s not often I get to really think about how something was implemented, through code of my own. RulerZ provided me with the tools I needed to do it!
Can you think of other interesting uses for a rule engine? Let me know in the comments!