Monitoring WordPress Apps with the ELK Stack
WordPress is an amazing piece of engineering. It’s little wonder that more than a quarter of all CMS-based websites use it. In reality, though, WordPress sites crash just like any other site. Bad plugins or themes triggering the WordPress “white screen of death”, or WordPress updates going south, are an all too frequent occurrence.
When something does go wrong, one of the first things you’ll want to look at is the log files. Not because you enjoy reading them — log files are not easy to decipher — but because they contain valuable information that can shed light on what exactly occurred.
In modern environments, however, this task is a challenge. While WordPress admins might never need to hear the word “log”, the web developers and DevOps crews running the site will often need to go through line after line of log files to understand what went wrong.
“So, what’s new?” you might ask. After all, there are plenty of WordPress plugins such as WP Log Viewer that enable you to view these logs easily from the WordPress admin panel.
While this is true, analyzing WordPress and PHP logs is simply not enough. There are also web server and database logs to sift through. To successfully query huge volumes of log messages coming in from various sources and identify correlations, a more solid solution is required.
Enter the ELK Stack. The most popular and fastest-growing open source log analytics platform, ELK allows you to build a centralized logging system that can pull logs from as many sources as you define and then analyze and visualize this data.
To show an example of using ELK, this article will go through the steps of establishing a pipeline of logs from your WordPress application into the Logz.io ELK Stack. You can, if you like, use any instance of the stack to perform the exact same procedures.
Enabling Logging for WordPress Apps
The first step is to configure WordPress to write logs. To do this, we are going to add some definitions to our wp-config.php file.

First, we will change the WP_DEBUG value to true:

define( 'WP_DEBUG', true );
You’ll now start seeing PHP errors, notices and warnings, as well as WordPress debug messages, on your app’s pages.
Next, we will enable the WordPress logging feature:

define( 'WP_DEBUG_LOG', true );

This will save all of the error messages to a debug.log file located in the /wp-content/ directory of your app.
If you don’t want the error messages to be displayed on your app’s pages, use WP_DEBUG_DISPLAY, another constant that controls whether WP_DEBUG messages are shown inside the HTML of your site. To override the default behavior of displaying errors on-screen, set the value to false to hide all messages:

define( 'WP_DEBUG_DISPLAY', false );
Another useful option is the SAVEQUERIES definition, which saves database queries to an array. You can then use this array to help analyze the issues:
define( 'SAVEQUERIES', true );
Each query will be saved along with how long it took to execute and the function that called it.
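To see what SAVEQUERIES captures, you can dump the saved queries into the debug log at the end of a request. Here is a minimal sketch: the shutdown hook and the $wpdb->queries array are standard WordPress, but the output format is an arbitrary choice of ours.

add_action( 'shutdown', function () {
    global $wpdb;
    // $wpdb->queries is only populated when SAVEQUERIES is true.
    if ( defined( 'SAVEQUERIES' ) && SAVEQUERIES && ! empty( $wpdb->queries ) ) {
        foreach ( $wpdb->queries as $query ) {
            // Each entry holds the SQL, the execution time in seconds,
            // and the call stack that triggered the query.
            error_log( sprintf( 'Query (%.4fs, %s): %s', $query[1], $query[2], $query[0] ) );
        }
    }
} );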
Save your configuration.
You’re all set! To verify the creation of the debug.log file, simulate an error (you can use the error_log() function) and then locate and open the file. If the file is not there, you have not triggered an error yet.
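If you need to force an entry, a throwaway line like the following (dropped into a theme’s functions.php, for example) will write straight to debug.log:

error_log( 'ELK pipeline test: this message should end up in debug.log' );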
Of course, using WP_DEBUG is not recommended in production, so be careful with how you use it and what definitions you are using (the SAVEQUERIES definition, for example, can slow down your site considerably).
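A common way to stay safe is to toggle these definitions per environment. The sketch below assumes a WP_ENV environment variable that your hosting setup would have to provide; adjust the condition to however you distinguish production from staging.

// Assumed convention: a WP_ENV environment variable set by your hosting setup.
$wp_env = getenv( 'WP_ENV' ) ?: 'production'; // default to production if unset

if ( 'production' !== $wp_env ) {
    // Non-production: log everything, but keep errors out of rendered pages.
    define( 'WP_DEBUG', true );
    define( 'WP_DEBUG_LOG', true );
    define( 'WP_DEBUG_DISPLAY', false );
} else {
    define( 'WP_DEBUG', false );
}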
Shipping the Logs to ELK
Now that we’ve enabled logging for our WordPress app, the next step is to take our new log file and ship it, together with the Apache logs, to the ELK Stack for analysis. To do this, we will use Filebeat, a log shipper by Elastic that tails log files and sends the collected data to Logstash or Elasticsearch.
Installing Filebeat
I’m running Ubuntu 14.04, and I’m going to install Filebeat from the repository (if you’re using a different OS, here are additional installation instructions).
First, I’m going to download and install the Public Signing Key:
curl https://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -
Next, I’m going to save the repository definition to /etc/apt/sources.list.d/beats.list:
echo "deb https://packages.elastic.co/beats/apt stable main" | sudo tee -a /etc/apt/sources.list.d/beats.list
Finally, I’m going to run apt-get update and install Filebeat:
sudo apt-get update && sudo apt-get install filebeat
Logz.io uses TLS as an added security layer, so our next step before configuring the data pipeline is to download a certificate and move it to the correct location:
wget https://raw.githubusercontent.com/cloudflare/cfssl_trust/master/intermediate_ca/COMODORSADomainValidationSecureServerCA.crt
sudo mkdir -p /etc/pki/tls/certs
sudo cp COMODORSADomainValidationSecureServerCA.crt /etc/pki/tls/certs/
Configuring Filebeat
Our next step is to open the Filebeat configuration file at /etc/filebeat/filebeat.yml and configure Filebeat to track specific log files and output them to the Logz.io Logstash instance.
In the Prospectors section, we will define the files we want Filebeat to tail: in this case, our Apache log files (/var/log/apache2/*.log) as well as the WordPress debug file (/var/www/html/wordpress/wp-content/debug.log). If you’re using Nginx, alter the paths accordingly.
For each prospector we will define a log type (Apache, WP) — this helps to differentiate between the various log messages as they begin to pile up in our ELK system and will allow us to analyze them more easily.
We will also add some additional Logz.io-specific fields (codec and user token) to each prospector.
The configuration looks as follows:
################### Filebeat Configuration Example #########################

############################# Filebeat #####################################
filebeat:
  # List of prospectors to fetch data.
  prospectors:
    # This is a text lines files harvesting definition
    -
      paths:
        - /var/www/html/wordpress/wp-content/debug.log
      fields:
        logzio_codec: plain
        token: tWMKrePSAcfaBSTPKLZeEXGCeiVMpuHb
      fields_under_root: true
      ignore_older: 24h
      document_type: WP
    -
      paths:
        - /var/log/apache2/*.log
      fields:
        logzio_codec: plain
        token: tWMKrePSAcfaBSTPKLZeEXGCeiVMpuHb
      fields_under_root: true
      ignore_older: 24h
      document_type: apache

  registry_file: /var/lib/filebeat/registry
Next, in the Output section of the configuration file, we will define the Logz.io Logstash host (listener.logz.io:5015) as the output destination for our logs, and the location of the TLS certificate used for authentication.
############################# Output ########################################

# Configure what outputs to use when sending the data collected by the beat.
output:
  logstash:
    # The Logstash hosts
    hosts: ["listener.logz.io:5015"]
    tls:
      # List of root certificates for HTTPS server verifications
      certificate_authorities: ['/etc/pki/tls/certs/COMODORSADomainValidationSecureServerCA.crt']
Now, if you are using the open source ELK Stack, you can ship directly to Elasticsearch or go through Logstash. The configuration for either output is straightforward:
output:
  # Use one of the following:
  logstash:
    hosts: ["localhost:5044"]
  elasticsearch:
    hosts: ["localhost:9200"]
Save your Filebeat configuration.
Configuring Logstash
Logstash, the component of the stack in charge of parsing the logs before forwarding them to Elasticsearch, can be configured to manipulate the data and make the logs more readable and easier to analyze.
In our case, we’re going to use the grok plugin to parse our WordPress logs. If we’re using Logz.io, grokking is taken care of for us. But if you’re using the open source ELK Stack, apply the following filter directly in your Logstash configuration file (/etc/logstash/conf.d/xxxx.conf):
filter {
  if [type] == "WP" {
    grok {
      match => [
        "message", "\[%{MONTHDAY:day}-%{MONTH:month}-%{YEAR:year} %{TIME:time} %{WORD:zone}\] PHP %{DATA:level}\: %{GREEDYDATA:error}"
      ]
    }
    mutate {
      add_field => [ "timestamp", "%{year}-%{month}-%{day} %{time}" ]
      remove_field => [ "zone", "month", "day", "time", "year" ]
    }
    date {
      match => [ "timestamp", "yyyy-MMM-dd HH:mm:ss" ]
      remove_field => [ "timestamp" ]
    }
  }
}
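After saving the file, restart Logstash so the new filter takes effect (on Ubuntu this is typically sudo service logstash restart, depending on how Logstash was installed).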
Verifying the Pipeline
It’s time to make sure the log pipeline into ELK is working as expected.
First, make sure Filebeat is running:
cd /etc/init.d
./filebeat status
And if not, enter:
sudo ./filebeat start
Next, open up Kibana (integrated into the Logz.io user interface). Apache logs and WordPress errors will begin to show up in the main display area.
Analyzing the Logs
ELK is designed for big data. As such, the platform allows you to sift through large volumes of messages being ingested by querying the storage component of the stack — Elasticsearch.
To start making sense of the data, select one of the messages in the main display area — this will give you an idea of what information is available. Remember the different types we defined for the Filebeat prospectors? To make this list of messages more understandable, select the type, response, level and error fields from the list of mapped fields on the left.
Now, say you’d like to filter the results to see only messages coming in from the WordPress debug.log file. There are a number of ways to do this, the easiest being to enter the following field-level query in the Kibana query field at the top of the page:

type:WP
Again, open one of the messages and view the information that has been shipped into the system. Here’s an example of a database error logged by PHP into the debug.log file and forwarded into the ELK Stack:

[01-Jun-2016 14:03:11 UTC] PHP Warning: mysqli_real_connect(): (28000/1045): Access denied for user 'ro'@'localhost' (using password: YES) in /var/www/html/wordpress/wp-includes/wp-db.php on line 1490
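Run through the grok filter above, this line is broken into structured fields. The indexed document would look roughly like the following (a sketch for illustration, not literal Logz.io output):

{
  "type": "WP",
  "level": "Warning",
  "error": "mysqli_real_connect(): (28000/1045): Access denied for user 'ro'@'localhost' (using password: YES) in /var/www/html/wordpress/wp-includes/wp-db.php on line 1490",
  "@timestamp": "2016-06-01T14:03:11.000Z"
}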
Save the search. We will use it to create a visualization in the next step.
Visualizing the Logs
Our next step is to create a graphic depiction of the data with a new Kibana visualization. As an example, we’re going to build a pie chart that breaks down the different PHP and WordPress errors logged.
Select the Visualize tab in Kibana, and from the selection of available visualizations, select the Pie Chart visualization type.
Next, choose to base the visualization on the saved search from the previous step, and configure it as follows:
We’re using a simple terms aggregation on the level field to show a count of the top five error types. Hit the green Play button to see a preview of the visualization:
This is a simple example of how your WordPress log data can be visualized in Kibana. The same applies to your Apache logs and any other data source you configure to integrate with ELK. Once you have a series of visualizations for monitoring your WordPress app, you can combine them into a dashboard that gives you a general overview of your environment.
Writing Custom Logs
Another option is to write your own custom entries to the log file.

Log-driven development (LDD) is a methodology, embraced by the DevOps culture, in which developers write and monitor logs as an integral part of the development process.

Using the error_log() function, you can write custom messages into the WordPress log file for future ingestion into ELK. You could use this, for example, to record when a certain line of code is executed or when a specific page is visited.
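For instance, the sketch below logs every visit to a particular page. The template_redirect hook and the is_page() check are standard WordPress; the 'checkout' slug is just a hypothetical example.

add_action( 'template_redirect', function () {
    // Log each visit to the (hypothetical) checkout page.
    if ( is_page( 'checkout' ) ) {
        error_log( 'Custom log: checkout page visited from ' . $_SERVER['REMOTE_ADDR'] );
    }
} );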
Final Note
Being able to centralize the logs of all the components and services your app relies on is key to keeping tabs on your environment, especially if your app runs on a cloud infrastructure in which multiple services are at work behind the scenes.
While WordPress supports various logging plugins, none offer the ability to correlate logs with additional data sources such as the web server, database, or load balancer. Centralized logging with ELK allows you to do just that — together with the ability to analyze the data and create monitoring dashboards to help you visualize it.