PINQ – Querify Your Datasets – Faceted Search

Share this article

In part 1, we briefly covered the installation and basic syntax of PINQ, a PHP LINQ port. In this article, we will see how to use PINQ to mimic a faceted search feature with MySQL.

We are not going to cover the full aspect of faceted search in this series. Interested parties can refer to relevant articles published on Sitepoint and other Internet publications.

A typical faceted search works like this in a website:

  • A user provides a keyword or a few keywords to search for. For example, “router” to search for products which contain “router” in the description, keyword, category, tags, etc.
  • The site will return the products matching the criteria.
  • The site will provide some links to fine tune the search. For example, it may prompt that there are different brands for a router, and there may be different price ranges and different features.
  • The user can further screen the results by clicking the different links provided and eventually gets a more customized result set.

Faceted search is so popular and powerful and you can experience it in almost every e-Commerce site.

Unfortunately, faceted search is not a built-in feature provided by MySQL yet. What can we do if we are using MySQL but also want to provide our users with such a feature?

With PINQ, we’ll see there is an equally powerful and straightforward approach to achieving this as when we are using other DB engines – at least in a way.

Extending Part 1 Demo

NOTE: All code in this part and the part 1 demo can be found in the repo.

In this article, we will extend the demo we have shown in Part 1 and add in some essential faceted search features.

Let’s start with index.php by adding the following few lines:

$app->get('demo2', function () use ($app)
{
    global $demo;
    $test2 = new pinqDemo\Demo($app);
    return $test2->test2($app, $demo->test1($app));
}
);

$app->get('demo2/facet/{key}/{value}', function ($key, $value) use ($app)
{
    global $demo;
    $test3 = new pinqDemo\Demo($app);
    return $test3->test3($app, $demo->test1($app), $key, $value);
}
);

We just created two more routes in our demo application (using Silex).

The first route is to bring us to the page showing all the records that match our first search behavior, i.e., search by providing a keyword. To keep the demo simple, we select all books from the sample book_book table. It will also display the result set and faceted links for further navigation.

The second route brings us to another page showing the records matching further facet search criteria in the result set produced in the above step. It will display the faceted search links too.

In a real world implementation, after a faceted link is clicked, any faceted filtering in the result page will be adjusted to reflect the statistical information of the result data set. By doing this, the user can apply “add-on” screenings, adding “brand” first then “price range”, etc.

But in this simple demo, we will skip this approach, all faceted search and the links will only reflect the information on the original data set. This is the first restriction and the first area for improvement in our demo.

As we see from the code above, the real functions reside in another file called pinqDemo.php. Let’s see the relevant code that provides the faceted search feature.

A facet class

First, we create a class to represent a facet. Generally, a facet should have a few properties:

  • The data it operates on ($data)
  • The key it groups on ($key)
  • The key type ($type). It can be one of the below:
    • specify a full string to make an exact match
    • specify partial (normally beginning) of a string to make a pattern match
    • specify a value range to group by a value range
  • If the key type is a range, there is a need to specify a value step to determine the upper/lower bound of the range; or if the key type is a partial string, we need to provide a number to specify how many first letters shall be used to group ($range)

The grouping is the most critical part in a facet. All the aggregating information that a facet could possibly return depends on the “grouping” criteria. Normally, “Full String”, “Partial String” and “Value Range” are the most commonly used ones.

namespace classFacet
{
    use Pinq\ITraversable,
        Pinq\Traversable;

    class Facet
    {

        public $data; // Original data
        public $key; // the field to be grouped on
        public $type; // F: full string; S: start of a string; R: range;
        public $range; // Only valid if $type is not F

		...

        public function getFacet()
        {
            $filter = '';

            if ($this->type == 'F') // Full string 
            {
				...
            }
            elseif ($this->type == "S") //Start of string
            {
				...
            }
            elseif ($this->type == "R") // A value range
            {
                $filter = $this->data
                        ->groupBy(function($row)
                        {
                            return floor($row[$this->key] / $this->range) * $this->range;
                        })
                        ->select(function (ITraversable $data)
                {
                    return ['key' => $data->last()[$this->key], 'count' => $data->count()];
                });
            }

            return $filter;
        }
    }
}

In this class, the key function is to return the faceted result set based on the data and the facet key properties. We noticed that for different types of keys, there are different ways to group the data. In the above, we have shown what the code will look like if we are grouping the data by a value range in a step specified by $range.

Making facets and displaying the original data

public function test2($app, $data)
        {
            $facet = $this->getFacet($data);
            return $app['twig']->render('demo2.html.twig', array('facet' => $facet, 'data' => $data));
        }

		private function getFacet($originalData)
        {
            $facet = array();

            $data = \Pinq\Traversable::from($originalData);

            // 3 samples on constructing different Facet objects and return the facet
            $filter1 = new \classFacet\Facet($data, 'author', 'F');
            $filter2 = new \classFacet\Facet($data, 'title', 'S', 6);
            $filter3 = new \classFacet\Facet($data, 'price', 'R', 10);

            $facet[$filter1->key] = $filter1->getFacet();
            $facet[$filter2->key] = $filter2->getFacet();
            $facet[$filter3->key] = $filter3->getFacet();
            return $facet;
        }

In the getFacet() function, we do the following steps:

  • Convert the original data to a Pinq\Traversable object for further processing.
  • We create 3 facets. The ‘author’ facet will group on the field author and it is a full string grouping; ‘title’ facet on field title and a partial string grouping (the starting 6 letters count); ‘price’ facet on field price and a range grouping (by a step of 10).
  • Finally, we get the facets and return them back to test2 function so that the template can render the data and the facets.

Displaying the facets and the filtered data

Most of the time, facets will be displayed as a link and bring us to a filtered data set.

We have already created a route ('demo2/facet/{key}/{value}') to display the faceted search results and the facet links.

The route takes two parameters, reflecting the key we facet on and the value of that key. The test3 function that eventually gets invoked from that route is excerpted below:

public function test3($app, $originalData, $key, $value)
        {
            $data = \Pinq\Traversable::from($originalData);
            $facet = $this->getFacet($data);

            $filter = null;

            if ($key == 'author')
            {
                $filter = $data
                        ->where(function($row) use ($value)
                        {
                            return $row['author'] == $value;
                        })
                        ->orderByAscending(function($row) use ($key)
                {
                    return $row['price'];
                })
                ;
            }
            elseif ($key == 'price')
            {
				...
            }
            else //$key==title
            {
                ...
            }

            return $app['twig']->render('demo2.html.twig', array('facet' => $facet, 'data' => $filter));
        }

Basically, depending on the key, we apply filtering (the anonymous function in where clause) corresponding to the value passed in and get the further screened data. We can also specify the order of the faceted data.

Finally, we display the data (along with the facets) in a template. This route renders the same template as that which is used by route 'demo2').

Next, let’s take a look at the template and see how the facet links are displayed. I am using Bootstrap so the CSS components used here should be quite familiar:

<div class="col col-md-4">
                <h4>Search Bar</h4>
                <ul>
                    {% for k, v in facet %}
                        <li><h5><strong>{{k|capitalize}}</strong></h5></li>
                        <ul class="list-group">
                            {% for vv in v %}
                                <li class="list-group-item"><span class="badge">{{vv.count}}</span><a href="/demo2/facet/{{k}}/{{vv.key}}">{{vv.key}}</a></li>
                            {%endfor%}
                        </ul>
                    {%endfor%}
                </ul>
</div>

We have to remember that the facet generated by our app is a nested array. In the first layer, it is an array of all the facets, and in our case, we have a total of 3 (for author, title, author, respectively).

For each facet, it is a “key-value” paired array so that we can iterate in a traditional way.

Please note how we construct the URIs of the links. We used both the outer loop’s key (k) and inner loops key (vv.key) to be the parameters in the route ('demo2/facet/{key}/{value}'). The count of the key (vv.count) is used to touch up the display in the template (as a Bootstrap badge).

The template will be rendered as shown below:


(The first shows the initial entry page and the second shows a faceted result with price between $0 to $10 and ordered by author)

All right, so far we have managed to mimic a faceted search feature in our web app!

Before we conclude this series, we shall take a final look at this demo and see what can be done to improve it and what are the limitations.

Improvements to be made

Overall, this is a quite rudimentary demo. We just ran through the basic syntax and concepts and forged them into a can-run example. As we saw earlier, a few areas can be improved on to make it more flexible.

We need to consider providing “add-on” criteria searching capability. Our current implementation limits the facet search to be applied on the original only, instead of the screened data. This is the most important improvement I can think of.

Limitations

The faceted search implemented here has a deep-rooted limitation (and probably true for other faceted search implementations):

We are retrieving data from the MySQL server every time.

This app uses Silex as the framework. For any single-entrance framework like Silex, Symfony, Laravel, its index.php (or app.php) gets called every time a route is to be analyzed and a controller’s function is to be invoked.

Looking at the code in our index.php, we will see that this also means the below line of code:

$demo = new pinqDemo\Demo($app);

gets called every time a page in the app is displayed, which then means the following lines are executed every time:

class Demo
    {

        private $books = '';

        public function __construct($app)
        {
            $sql = 'select * from book_book order by id';
            $this->books = $app['db']->fetchAll($sql);
        }

Will it be better if we avoid using a framework? Well, besides the fact that it is not really a very good idea to develop an app without a framework, we are still facing the same issue: data (and status) are not persistent from one HTTP call to another. This is the fundamental characteristic of HTTP. This should be avoided with the use of a caching engine.

We do save some SQL statements being executed at the server side when we are constructing the facets. Instead of passing 1 select query AND 3 different group by queries with the same where statement, we just issue one select query with the where statement and use PINQ to provide the aggregating information.

Conclusion

In this part, we managed to mimic a facet search capability for our book collection site. As I said, it is merely a can-run demo and has plenty of room of improvement and some default limitations. Let us know if you build on this example and can show us some more advanced use cases!

The author of PINQ is now working on the next major version release (version 3). I do hope it can get more powerful.

Feel free to leave your comments and thoughts below!

What is Pinq and how does it relate to faceted search?

Pinq is a PHP library that provides a unique, intuitive, and powerful query language to manipulate arrays and other data sets. It is designed to simplify the process of querying and manipulating data. In relation to faceted search, Pinq can be used to create complex queries that can filter and sort data based on multiple criteria, which is the core concept of faceted search.

How does Pinq’s approach to faceted search differ from other methods?

Pinq’s approach to faceted search is unique because it uses a query language that is based on PHP, which is a widely used programming language. This makes it easier for developers who are already familiar with PHP to implement faceted search. Additionally, Pinq’s query language is designed to be intuitive and easy to use, which can simplify the process of creating complex queries.

Can Pinq be used with other databases or is it limited to MySQL?

Pinq is not limited to MySQL. It can be used with any data set, including arrays and other databases. This flexibility makes Pinq a versatile tool for developers who need to work with different types of data.

How does Pinq handle large data sets?

Pinq is designed to handle large data sets efficiently. It does this by using a lazy evaluation strategy, which means that it only processes data when it is actually needed. This can significantly improve performance when working with large data sets.

What are the benefits of using Pinq for faceted search?

Using Pinq for faceted search has several benefits. First, it simplifies the process of creating complex queries, which can save developers time and effort. Second, it provides a powerful and flexible query language that can handle a wide range of data types and structures. Finally, it is based on PHP, which is a widely used programming language, making it easier for developers to learn and use.

Is Pinq suitable for beginners or is it more suited to experienced developers?

Pinq is designed to be intuitive and easy to use, making it suitable for both beginners and experienced developers. However, some knowledge of PHP and query languages is beneficial when using Pinq.

How does Pinq ensure the accuracy of search results?

Pinq ensures the accuracy of search results by using a powerful and flexible query language that can accurately filter and sort data based on multiple criteria. This allows it to provide precise and relevant search results.

Can Pinq be used for real-time search?

Yes, Pinq can be used for real-time search. Its efficient handling of large data sets and its ability to create complex queries make it suitable for real-time search applications.

How does Pinq compare to other PHP libraries for faceted search?

Pinq stands out from other PHP libraries for faceted search due to its unique, intuitive, and powerful query language. It also offers flexibility in terms of the types of data it can handle, and its efficient handling of large data sets makes it a strong choice for developers.

Is Pinq open source and can it be customized?

Yes, Pinq is an open-source library, which means that developers can customize it to suit their specific needs. This flexibility is another advantage of using Pinq for faceted search.

Taylor RenTaylor Ren
View Author

Taylor is a freelance web and desktop application developer living in Suzhou in Eastern China. Started from Borland development tools series (C++Builder, Delphi), published a book on InterBase, certified as Borland Expert in 2003, he shifted to web development with typical LAMP configuration. Later he started working with jQuery, Symfony, Bootstrap, Dart, etc.

databasefaceted searchfiltergroupLINQOOPHPormPHPpinqsortsql
Share this article
Read Next
Get the freshest news and resources for developers, designers and digital creators in your inbox each week