In this tutorial, we’re going to take a look at Elasticsearch and how we can use it in PHP. Elasticsearch is an open-source search server based on Apache Lucene. We can use it to perform very fast full-text and other complex searches. It also includes a REST API which allows us to easily issue requests for creating, deleting, updating and retrieving data.
Installing Elasticsearch
The installation instructions below assume you’re using a Debian-based environment such as Ubuntu.
To install Elasticsearch we first need to install Java. By default, Oracle’s Java is not available in the repositories that Ubuntu uses, so we need to add one.
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
Once that’s done, we can install Java.
sudo apt-get install oracle-java8-installer
Next, let’s download Elasticsearch using wget.
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.5.2.tar.gz
Currently, the most recent stable version is 1.5.2 so that is what we used above. If you want to make sure you get the most recent version, take a look at the Elasticsearch downloads page.
Then, we extract the archive and start the server.
mkdir es
tar -xf elasticsearch-1.5.2.tar.gz -C es
cd es/elasticsearch-1.5.2
./bin/elasticsearch
When we access http://localhost:9200 in the browser, we get something similar to the following:
{
"status" : 200,
"name" : "Rumiko Fujikawa",
"cluster_name" : "elasticsearch",
"version" : {
"number" : "1.5.2",
"build_hash" : "62ff9868b4c8a0c45860bebb259e21980778ab1c",
"build_timestamp" : "2015-04-27T09:21:06Z",
"build_snapshot" : false,
"lucene_version" : "4.10.4"
},
"tagline" : "You Know, for Search"
}
Using Elasticsearch
Now we can start playing with Elasticsearch. First, let’s install the official Elasticsearch client for PHP.
composer require elasticsearch/elasticsearch
Next, let’s create a new PHP file for testing and add the following code so that we can use the Elasticsearch client.
<?php
require 'vendor/autoload.php';
$client = new Elasticsearch\Client();
Indexing Documents
Indexing new documents can be done by calling the index method on the client. This method accepts an array as its argument. The array should contain body, index and type as its keys. The body is an array containing the data that you want to index. The index is the location where you want to store the document (it corresponds to a database in a traditional RDBMS). Lastly, the type is how you want to categorize the document. It’s like a table in RDBMS land. Here’s an example:
$params = array();
$params['body'] = array(
'name' => 'Ash Ketchum',
'age' => 10,
'badges' => 8
);
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$result = $client->index($params);
If you print out the $result you get something similar to the following:
Array
(
[_index] => pokemon
[_type] => pokemon_trainer
[_id] => AU1Bn51W5l_vSaLQKPOy
[_version] => 1
[created] => 1
)
In the example above, we haven’t specified an ID for the document. Elasticsearch automatically assigns a unique ID if nothing is specified. Let’s try assigning an ID to another document:
$params = array();
$params['body'] = array(
'name' => 'Brock',
'age' => 15,
'badges' => 0
);
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['id'] = '1A-001';
$result = $client->index($params);
When we print the $result:
Array
(
[_index] => pokemon
[_type] => pokemon_trainer
[_id] => 1A-001
[_version] => 1
[created] => 1
)
When indexing documents, we’re not limited to a single-dimensional array. We can also index multi-dimensional ones:
$params = array();
$params['body'] = array(
'name' => 'Misty',
'age' => 13,
'badges' => 0,
'pokemon' => array(
'psyduck' => array(
'type' => 'water',
'moves' => array(
'Water Gun' => array(
'pp' => 25,
'power' => 40
)
)
)
)
);
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['id'] = '1A-002';
$result = $client->index($params);
We can go as deep as we want, but we still need to observe proper storage of data (not going too deep, keeping it structured and logical, etc) when we index it with Elasticsearch, just like we do in an RDBMS setting.
Searching for Documents
We can search for existing documents within a specific index using either the get or search method. The main distinction between the two is that the get method is used when you already know the ID of the document, and it only ever returns a single document. The search method, on the other hand, is used for finding multiple documents, and you can use any field in the document for your query.
Get
First, let’s start with the get method. Just like the index method, this one accepts an array as its argument. The array should contain the index, type and id of the document that you want to find.
$params = array();
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['id'] = '1A-001';
$result = $client->get($params);
The code above would return the following:
Array
(
[_index] => pokemon
[_type] => pokemon_trainer
[_id] => 1A-001
[_version] => 1
[found] => 1
[_source] => Array
(
[name] => Brock
[age] => 15
[badges] => 0
)
)
Search with Specific Fields
The array argument for the search method needs to have the index, type and body keys. The body is where we specify the query. To start, here’s an example of how we use it to return all the documents which have an age of 15.
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['body']['query']['match']['age'] = 15;
$result = $client->search($params);
This returns the following:
Array
(
[took] => 177
[timed_out] =>
[_shards] => Array
(
[total] => 5
[successful] => 5
[failed] => 0
)
[hits] => Array
(
[total] => 1
[max_score] => 1
[hits] => Array
(
[0] => Array
(
[_index] => pokemon
[_type] => pokemon_trainer
[_id] => 1A-001
[_score] => 1
[_source] => Array
(
[name] => Brock
[age] => 15
[badges] => 0
)
)
)
)
)
Let’s break the results down:
took – number of milliseconds it took for the request to finish.
timed_out – returns true if the request timed out.
_shards – by default, Elasticsearch distributes the data into 5 shards. If you get 5 as the value for total and successful then every shard is currently healthy. You can find a more detailed explanation in this Stack Overflow thread.
hits – contains the results.
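To make the breakdown above concrete, here is a minimal sketch of pulling the documents out of a search result. The extractSources function is a hypothetical helper of our own, not part of the Elasticsearch client, and the sample array is a trimmed copy of the output shown earlier.

```php
<?php
// Hypothetical helper (not part of the Elasticsearch client): collects the
// _source of every hit, keyed by document ID.
function extractSources(array $result): array
{
    $documents = [];
    // hits.hits holds the matching documents; fall back to an empty list.
    $hits = $result['hits']['hits'] ?? [];
    foreach ($hits as $hit) {
        $documents[$hit['_id']] = $hit['_source'];
    }
    return $documents;
}

// Sample response trimmed from the search output shown earlier.
$result = [
    'took' => 177,
    'timed_out' => false,
    'hits' => [
        'total' => 1,
        'hits' => [
            [
                '_index' => 'pokemon',
                '_type' => 'pokemon_trainer',
                '_id' => '1A-001',
                '_score' => 1,
                '_source' => ['name' => 'Brock', 'age' => 15, 'badges' => 0],
            ],
        ],
    ],
];

$trainers = extractSources($result);
print_r($trainers);
```

Keying the documents by _id is just one possible choice; a plain list works equally well if the IDs are not needed.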
The query above targets a first-level field. To query a field further down, we address it using dot notation, starting from the first-level field and moving down to the field we want to use in the query. In the example below, we also wrap the match in a bool query: we specify bool as an item of the query and add the match as one of its must clauses.
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['body']['query']['bool']['must'][]['match']['pokemon.psyduck.type'] = 'water';
$result = $client->search($params);
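If it is unclear which dotted names a nested document produces, a small sketch like the following can list them. The dottedPaths function is a hypothetical helper of our own, and it is only meant for associative arrays like the ones indexed in this tutorial.

```php
<?php
// Hypothetical helper: lists the dotted path of every leaf value in a
// nested associative array, mirroring the names used in queries.
function dottedPaths(array $document, string $prefix = ''): array
{
    $paths = [];
    foreach ($document as $key => $value) {
        $path = $prefix === '' ? (string) $key : $prefix . '.' . $key;
        if (is_array($value)) {
            // Descend into nested arrays, carrying the path so far.
            $paths += dottedPaths($value, $path);
        } else {
            $paths[$path] = $value;
        }
    }
    return $paths;
}

// The same document we indexed for Misty earlier.
$misty = [
    'name' => 'Misty',
    'age' => 13,
    'badges' => 0,
    'pokemon' => [
        'psyduck' => [
            'type' => 'water',
            'moves' => ['Water Gun' => ['pp' => 25, 'power' => 40]],
        ],
    ],
];

$paths = dottedPaths($misty);
print_r($paths);
```

The pokemon.psyduck.type entry in the output is the same name used in the query above.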
Searching with Arrays
We can search using arrays as the query (to match several values) by specifying the bool item, followed by must, terms and then the field we want to use for the query. We specify an array containing the values that we want to match. In the example below we’re selecting documents which have an age that is equal to either 10 or 15.
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['body']['query']['bool']['must']['terms']['age'] = array(10, 15);
$result = $client->search($params);
This method only accepts one-dimensional arrays.
Filtered Search
Next, let’s do a filtered search. To use filtered search, we have to specify the filtered item and set the range that we want to return for a specific field. In the example below, we’re using the age as the field. We’re selecting documents which have ages greater than or equal to (gte) 11 but less than or equal to (lte) 20.
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['body']['query']['filtered']['filter']['range']['age']['gte'] = 11;
$params['body']['query']['filtered']['filter']['range']['age']['lte'] = 20;
$result = $client->search($params);
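When several range searches are needed, the nesting can be wrapped in a small function. This is only a sketch: rangeSearchParams is a hypothetical helper of our own, and it assumes the range filter accepts the gt, gte, lt and lte operators.

```php
<?php
// Hypothetical helper: builds search parameters for a filtered range query.
function rangeSearchParams(string $index, string $type, string $field, array $bounds): array
{
    // Keep only the comparison operators we expect the range filter to accept.
    $allowed = ['gt', 'gte', 'lt', 'lte'];
    $range = array_intersect_key($bounds, array_flip($allowed));

    return [
        'index' => $index,
        'type' => $type,
        'body' => [
            'query' => [
                'filtered' => [
                    'filter' => [
                        'range' => [$field => $range],
                    ],
                ],
            ],
        ],
    ];
}

$params = rangeSearchParams('pokemon', 'pokemon_trainer', 'age', ['gte' => 11, 'lte' => 20]);
// $result = $client->search($params); // needs a running Elasticsearch server
```

The resulting array has the same shape as the one built line by line above.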
OR and AND
In RDBMS land we are used to using the AND and OR keywords to specify two or more conditions. We can also do that with Elasticsearch using filtered search. In the example below we’re using the and filter to select documents which have an age of 10 and a badge count of 8. Only the documents which match both criteria are returned.
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['body']['query']['filtered']['filter']['and'][]['term']['age'] = 10;
$params['body']['query']['filtered']['filter']['and'][]['term']['badges'] = 8;
$result = $client->search($params);
If you want to select documents that match either of those conditions, you can use or instead.
$params['body']['query']['filtered']['filter']['or'][]['term']['age'] = 10;
$params['body']['query']['filtered']['filter']['or'][]['term']['badges'] = 8;
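Since the and and or versions differ only in one key, the conditions can also be generated from a simple field-to-value array. The combineTerms function below is a hypothetical helper of our own, sketched under the assumption that every condition is an exact term match.

```php
<?php
// Hypothetical helper: turns ['age' => 10, 'badges' => 8] into a list of
// term filters combined with the given operator ('and' or 'or').
function combineTerms(string $operator, array $conditions): array
{
    if (!in_array($operator, ['and', 'or'], true)) {
        throw new InvalidArgumentException("Unsupported operator: $operator");
    }
    $terms = [];
    foreach ($conditions as $field => $value) {
        $terms[] = ['term' => [$field => $value]];
    }
    return ['filtered' => ['filter' => [$operator => $terms]]];
}

$params = [
    'index' => 'pokemon',
    'type' => 'pokemon_trainer',
    'body' => ['query' => combineTerms('or', ['age' => 10, 'badges' => 8])],
];
// $result = $client->search($params); // needs a running Elasticsearch server
```

Switching between the two behaviours is then a matter of changing the first argument.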
Limiting Results
Results can be limited to a specific number by specifying the size field. Here’s an example:
$params['body']['query']['filtered']['filter']['and'][]['term']['age'] = 10;
$params['body']['query']['filtered']['filter']['and'][]['term']['badges'] = 8;
$params['size'] = 1;
This returns the first result since we limited the results to just one document.
Pagination
In RDBMS land we have limit and offset. In Elasticsearch we have size and from. The from field allows us to specify the index of the first result in the result set. Results are zero-indexed, so for 10 results per page we set size to 10 and add 10 to the from value every time the user navigates to the next page.
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['size'] = 10;
$params['from'] = 10; // <-- will return second page
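The arithmetic behind from can be sketched as a small function. The pageParams name and its 1-based page number are our own assumptions for illustration.

```php
<?php
// Hypothetical helper: converts a 1-based page number into size/from values.
function pageParams(int $page, int $perPage = 10): array
{
    $page = max(1, $page); // treat anything below 1 as the first page
    return [
        'size' => $perPage,
        'from' => ($page - 1) * $perPage,
    ];
}

// Second page with 10 results per page: size = 10, from = 10.
$params = ['index' => 'pokemon', 'type' => 'pokemon_trainer'] + pageParams(2);
print_r($params);
```

Because the first page starts at from = 0, the offset is always one page less than the page number multiplied by the page size.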
Updating a Document
To update a document, we first fetch its current data. To do that, we specify the index, type and id like we did earlier and then we call the get method. The current data can be found in the _source item. All we have to do is update the current fields with new values or add new fields to that item. Finally, we call the update method with the same parameters used for the get method, plus a body that holds the updated data under the doc key.
$params = array();
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['id'] = '1A-001';
$result = $client->get($params);
$result['_source']['age'] = 21; //update existing field with new value
//add new field
$result['_source']['pokemon'] = array(
'Onix' => array(
'type' => 'rock',
'moves' => array(
'Rock Slide' => array(
'power' => 100,
'pp' => 40
),
'Earthquake' => array(
'power' => 200,
'pp' => 100
)
)
)
);
$params['body']['doc'] = $result['_source'];
$result = $client->update($params);
This returns something similar to the following:
Array
(
[_index] => pokemon
[_type] => pokemon_trainer
[_id] => 1A-001
[_version] => 2
)
Note that the _version is incremented every time you call the update method, regardless of whether things have actually been updated.
You might be wondering why we have a version in the document or even be tempted to think that there’s a functionality in Elasticsearch that allows us to fetch a previous version of a document. Unfortunately, that isn’t so. The version merely serves as a counter as to how many times a document was updated.
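As the FAQ example further down suggests, the doc item does not have to contain the whole document. As far as we know, Elasticsearch merges it into the stored document, so sending only the changed fields should be enough. Here is a minimal sketch; partialUpdateParams is a hypothetical helper of our own.

```php
<?php
// Hypothetical helper: builds the parameters for a partial update.
function partialUpdateParams(string $index, string $type, string $id, array $changes): array
{
    return [
        'index' => $index,
        'type' => $type,
        'id' => $id,
        'body' => ['doc' => $changes], // only the fields that change
    ];
}

$params = partialUpdateParams('pokemon', 'pokemon_trainer', '1A-001', ['age' => 21]);
// $result = $client->update($params); // needs a running Elasticsearch server
```

This skips the extra get call, at the cost of not seeing the current values before changing them.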
Deleting a Document
Deleting a document can be done by calling the delete method. This method accepts an array containing the index, type and id as its argument.
$params = array();
$params['index'] = 'pokemon';
$params['type'] = 'pokemon_trainer';
$params['id'] = '1A-001';
$result = $client->delete($params);
This returns the following:
Array
(
[found] => 1
[_index] => pokemon
[_type] => pokemon_trainer
[_id] => 1A-001
[_version] => 7
)
Note that you will get an error if you try to fetch a deleted document using the get method.
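One way to deal with that error is to catch it and treat the document as missing. The sketch below assumes the client signals a missing document with Elasticsearch\Common\Exceptions\Missing404Exception, which is our understanding of the 1.x client; fetchSourceOrNull is a hypothetical helper of our own.

```php
<?php
// Hypothetical helper: returns the document's _source, or null when the
// client reports that the document does not exist.
// Assumption: the client throws Missing404Exception for a missing document.
function fetchSourceOrNull($client, array $params): ?array
{
    try {
        $result = $client->get($params);
    } catch (\Elasticsearch\Common\Exceptions\Missing404Exception $e) {
        return null; // document was deleted or never indexed
    }
    return $result['_source'] ?? null;
}

// Usage, given the $client and $params from the delete example:
// $source = fetchSourceOrNull($client, $params);
// if ($source === null) { echo "Document not found\n"; }
```

Other failures, such as connection problems, are deliberately left uncaught here so they are not mistaken for a missing document.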
Conclusion
In this article, we looked at how we can work with Elasticsearch in PHP using the official Elasticsearch client. Specifically, we’ve taken a look at how to index new documents, search for documents, paginate results, and update and delete documents.
Overall, Elasticsearch is a nice way to add search functionality to your PHP applications. If you want to learn more about how to integrate Elasticsearch into your PHP applications, you can check out Daniel Sipos’ series on how to integrate Elasticsearch with Drupal and Silex.
If, however, you prefer more automatic solutions to adding in-depth search functionality to your applications, see this series.
Frequently Asked Questions (FAQs) about Elasticsearch in PHP
What is the basic structure of an Elasticsearch query in PHP?
An Elasticsearch query in PHP is structured as an associative array. The array contains key-value pairs that define the index you’re querying, the type of search you’re performing, and the actual search parameters. For instance, a simple match query might look like this:
$params = [
'index' => 'my_index',
'type' => 'my_type',
'body' => [
'query' => [
'match' => [
'testField' => 'abc'
]
]
]
];
$response = $client->search($params);
In this example, ‘my_index’ is the index you’re searching, ‘my_type’ is the type of document you’re looking for, and ‘testField’ => ‘abc’ is the actual search query.
How can I handle errors in Elasticsearch PHP?
Error handling in Elasticsearch PHP can be done using try-catch blocks. When an operation fails, the Elasticsearch client will throw an exception that you can catch and handle. For example:
try {
$response = $client->search($params);
} catch (Elasticsearch\Common\Exceptions\BadRequest400Exception $e) {
// handle exception...
}
In this example, if Elasticsearch rejects the search request with a 400 Bad Request response, an exception of type BadRequest400Exception is thrown. You can catch this exception and handle it as needed.
How can I index a document in Elasticsearch using PHP?
Indexing a document in Elasticsearch using PHP involves creating an array that represents the document and then passing that array to the index() method of the Elasticsearch client. Here’s an example:
$doc = [
'id' => '1',
'title' => 'Elasticsearch: cool. bonsai cool.',
'name' => 'bonsai tree',
'age' => 13,
'lives' => 'Australia',
'about' => 'Bonsai artist'
];
$params = [
'index' => 'my_index',
'type' => 'my_type',
'id' => 'my_id',
'body' => $doc
];
$response = $client->index($params);
In this example, the $doc array represents the document to be indexed. The ‘index’, ‘type’, and ‘id’ keys in the $params array specify the index, type, and ID of the document.
How can I delete a document from Elasticsearch using PHP?
Deleting a document from Elasticsearch using PHP can be done using the delete() method of the Elasticsearch client. You need to specify the index, type, and ID of the document you want to delete. Here’s an example:
$params = [
'index' => 'my_index',
'type' => 'my_type',
'id' => 'my_id'
];
$response = $client->delete($params);
In this example, the ‘index’, ‘type’, and ‘id’ keys in the $params array specify the index, type, and ID of the document to be deleted.
How can I update a document in Elasticsearch using PHP?
Updating a document in Elasticsearch using PHP can be done using the update() method of the Elasticsearch client. You need to specify the index, type, and ID of the document you want to update, as well as the new data for the document. Here’s an example:
$params = [
'index' => 'my_index',
'type' => 'my_type',
'id' => 'my_id',
'body' => [
'doc' => [
'name' => 'new name'
]
]
];
$response = $client->update($params);
In this example, the ‘index’, ‘type’, and ‘id’ keys in the $params array specify the index, type, and ID of the document to be updated. The ‘doc’ key in the ‘body’ array contains the new data for the document.
Wern is a web developer from the Philippines. He loves building things for the web and sharing the things he has learned by writing in his blog. When he's not coding or learning something new, he enjoys watching anime and playing video games.