Error Logging with MongoDB and Analog
With the hype around document databases continuing to grow, and around MongoDB in particular, we get a lot of questions about how people should move their applications over to using it. The advice is usually the same – especially for existing applications – take one step at a time. With that said, we’d like to show off what we consider to be an excellent place to start: using MongoDB for error logging.
MongoDB is an excellent fit for logging (and of course other things as well) for many reasons. For one, it is very fast at writing data. It can perform writes asynchronously, so your application won’t hang because your logging routines are blocked. That speed makes it practical to centralize your logs, which in turn makes it easier to query them to find issues. Also, its query interface is flexible and easy to work with. You can query against any of the field names or perform aggregate functions either with map/reduce or with the aggregation framework coming in MongoDB 2.2.
This article shows how you can add an existing logging library to your code and use it to log errors to MongoDB. You’ll also see how to query MongoDB from the command line and how to filter those queries to find the information you are interested in.
Setting Up The Logger
Before writing a MongoDB logging class for PHP, we took a quick look to see what else was out there already and found a nice micrologging tool on GitHub called Analog. In the interest of not reinventing the wheel, we’ll use this in our examples.
What we really liked about Analog is the simplicity of its code and the number of things you can log to! It’s designed to be extensible, so you should be able to easily build on it to add anything specific you may need for your own project.
The logger is fairly self-contained, so all you’ll need to do to make its functionality available is to include its main file, Analog.php. This takes care of the autoloading and namespace registration needed for it to find its dependencies. Since it uses spl_autoload_register(), it will happily co-exist alongside any other autoloading arrangements you already have in place.
To start using the logger, you’ll need to initialize the logging handler you want to use and then pass it to the main logging class. There are some examples included with the project which make it easy to see what you need for a specific platform. For MongoDB, we have the following:
<?php
Analog::handler(Analog\Handler\Mongo::init(
    "localhost:27017",
    "testing",
    "log"));
All we have to do here is to point Analog at our MongoDB installation (ours is on the same machine as the web server and uses the default port), tell it to use the testing database, and write to the log collection. With this included somewhere at the top of our script, probably along with various other bootstrapping tasks, we’re ready to go.
Logging Errors
At this point we can use the logging functionality anywhere we want it in our application. To log an error, simply do:
<?php
Analog::log("Oh noes! Something went wrong!");
To see what’s in the database, open the mongo shell.
lorna@taygete:~$ mongo
type "help" for help
> use testing
> db.log.find();
{ "_id" : ObjectId("4f268e9dd8562fc817000000"), "machine" : "localhost", "date" : "2012-02-29 11:11:16", "level" : 3, "message" : "Oh noes! Something went wrong!" }
As you can see, this gives us the error message, the severity, the date and time that the error was created, and the machine from which it came. The machine identifier comes from $_SERVER["SERVER_ADDR"] if set, otherwise “localhost” is used.
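To make the shape of that document concrete, here is a minimal sketch in JavaScript (the language of the mongo shell) of how such an entry could be assembled. It is not Analog’s actual code: buildLogEntry, formatDate and their parameters are hypothetical names chosen for illustration. The only details taken from the article are the four fields, the level of 3 seen in the sample entry, and the fallback to "localhost" when no server address is available.

```javascript
// Hypothetical helpers: they build a document shaped like the log entry shown above.
// This illustrates the structure only; it is not taken from the Analog source.
function pad(n) {
  return String(n).padStart(2, "0");
}

// Format a Date as "YYYY-MM-DD HH:MM:SS" in local time, matching the sample output.
function formatDate(d) {
  return d.getFullYear() + "-" + pad(d.getMonth() + 1) + "-" + pad(d.getDate()) +
    " " + pad(d.getHours()) + ":" + pad(d.getMinutes()) + ":" + pad(d.getSeconds());
}

function buildLogEntry(message, level, serverAddr, now) {
  return {
    machine: serverAddr || "localhost",       // fall back when no address is set
    date: formatDate(now || new Date()),
    level: level === undefined ? 3 : level,   // 3 is the level in the sample entry
    message: message
  };
}

const entry = buildLogEntry("Oh noes! Something went wrong!");
console.log(entry);
```

Because every entry carries the same field names, each of them can later be used in a query.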
Logging Levels
The Analog library comes with a great set of constants that you can use to set the level of each error. Here’s a snippet from the class showing them:
<?php
...
class Analog {
/**
* List of severity levels.
*/
const URGENT = 0; // It's an emergency
const ALERT = 1; // Immediate action required
const CRITICAL = 2; // Critical conditions
const ERROR = 3; // An error occurred
const WARNING = 4; // Something unexpected happening
const NOTICE = 5; // Something worth noting
const INFO = 6; // Information, not an error
const DEBUG = 7; // Debugging messages
...
The default is level 3, which denotes an error. To log an error of any other level, pass the desired level as a second parameter to the log() method:
<?php
Analog::log("FYI, a log entry", Analog::INFO);
Looking in the database now, we can see how our collection of log messages grows.
> db.log.find();
{ "_id" : ObjectId("4f268e9dd8562fc817000000"), "machine" : "localhost", "date" : "2012-02-29 11:11:16", "level" : 3, "message" : "Oh noes! Something went wrong!" }
{ "_id" : ObjectId("4f268e9dd8562fc817000001"), "machine" : "localhost", "date" : "2012-02-29 12:35:41", "level" : 6, "message" : "FYI, a log entry" }
Although (as with all logs) in a real application we’ll be building up a large set of data, using a database means we can easily generate summary information or filter the data to find only the important entries.
Filtering And Summarizing MongoDB Logs
Using database storage means the ability to search results, and MongoDB is designed to be easy for developers to use even with large datasets. The days of grep’ing enormous flat-file logs are over! We can very easily filter the data to show only what we’re interested in.
> db.log.find({level: 3});
{ "_id" : ObjectId("4f268e9dd8562fc817000000"), "machine" : "localhost", "date" : "2012-02-29 11:11:16", "level" : 3, "message" : "Oh noes! Something went wrong!" }
Since we log at many different levels, the database also contains entries that are more severe than a plain error. To show everything of error severity and above (that is, with a lower level constant), we can query with the $lte operator:
> db.log.find({level: {$lte: 3}});
{ "_id" : ObjectId("4f268e9dd8562fc817000000"), "machine" : "localhost", "date" : "2012-02-29 11:11:16", "level" : 3, "message" : "Oh noes! Something went wrong!" }
{ "_id" : ObjectId("4f26aaafd8562fcb27000009"), "machine" : "localhost", "date" : "2012-02-29 13:01:04", "level" : 0, "message" : "To the lifeboats!" }
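Because lower numbers mean higher severity, it is easy to reach for the wrong comparison operator. The following sketch wraps the mapping in a small helper. The level names and numbers come from the Analog snippet earlier in the article; atLeastAsSevereAs and matchesLevel are hypothetical names, and matchesLevel is only an in-memory stand-in for what the database does with the filter.

```javascript
// Severity names and numbers as listed in the Analog class snippet.
const LEVELS = {
  URGENT: 0, ALERT: 1, CRITICAL: 2, ERROR: 3,
  WARNING: 4, NOTICE: 5, INFO: 6, DEBUG: 7
};

// Hypothetical helper: build a filter for "this severity or worse".
// Lower numbers are more severe, hence $lte rather than $gte.
function atLeastAsSevereAs(name) {
  if (!Object.prototype.hasOwnProperty.call(LEVELS, name)) {
    throw new Error("Unknown level: " + name);
  }
  return { level: { $lte: LEVELS[name] } };
}

// In-memory stand-in for how such a filter selects documents.
function matchesLevel(doc, filter) {
  return doc.level <= filter.level.$lte;
}

const sample = [
  { level: 3, message: "Oh noes! Something went wrong!" },
  { level: 6, message: "FYI, a log entry" },
  { level: 0, message: "To the lifeboats!" }
];

const filter = atLeastAsSevereAs("ERROR");
console.log(JSON.stringify(filter));
console.log(sample.filter(doc => matchesLevel(doc, filter)).map(doc => doc.message));
```

Since the mongo shell is a JavaScript environment, a helper like this could be defined in a shell session and its result passed to db.log.find().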
We can also look for date ranges, for example, using a $gt comparison to pull the most recent few log entries from our database:
> db.log.find({date: {$gt: "2012-02-29 14:35:30"}});
{ "_id" : ObjectId("4f26aaafd8562fcb2700000a"), "machine" : "localhost", "date" : "2012-02-29 14:35:31", "level" : 4, "message" : "Empty variable $a on line 127" }
{ "_id" : ObjectId("4f26aaafd8562fcb2700000b"), "machine" : "localhost", "date" : "2012-02-29 14:35:35", "level" : 4, "message" : "Empty variable $a on line 93" }
{ "_id" : ObjectId("4f26aaafd8562fcb2700000c"), "machine" : "localhost", "date" : "2012-02-29 14:35:40", "level" : 4, "message" : "Empty variable $a on line 277" }
{ "_id" : ObjectId("4f26aaafd8562fcb2700000d"), "machine" : "localhost", "date" : "2012-02-29 14:35:45", "level" : 6, "message" : "FYI, it seems to be snowing" }
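Note that the date values in this output are strings, so the $gt comparison here is a string comparison. It gives chronological results because the format is fixed-width with the most significant part first. The sketch below demonstrates the same idea on plain JavaScript strings; newerThan is a hypothetical helper that stands in for the database filter, and the sample entries are copied from the output above.

```javascript
// Dates in the sample documents are "YYYY-MM-DD HH:MM:SS" strings.
const entries = [
  { date: "2012-02-29 11:11:16", message: "Oh noes! Something went wrong!" },
  { date: "2012-02-29 14:35:31", message: "Empty variable $a on line 127" },
  { date: "2012-02-29 14:35:45", message: "FYI, it seems to be snowing" }
];

// Hypothetical helper: in-memory equivalent of {date: {$gt: since}} on string dates.
function newerThan(docs, since) {
  return docs.filter(doc => doc.date > since);
}

const recent = newerThan(entries, "2012-02-29 14:35:30");
console.log(recent.map(doc => doc.message));

// Caveat: without zero padding the ordering breaks. 9 February is earlier
// than 29 February, yet the unpadded string compares as greater.
console.log("2012-2-9 09:00:00" > "2012-02-29 14:35:30");
```

The practical consequence is that range queries like this stay reliable only as long as every entry is written in the same zero-padded format.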
If you commonly query data on a particular field, you can speed up your queries by adding an index. For example, if you frequently query on level and date you can create a compound index:
> db.log.ensureIndex({ date : -1, level : 1 } );
The above line will create a single compound index if it doesn’t already exist. There are a couple of things worth noting here, however. First, we placed date first as it will have the largest variation and therefore the index will do the most good. We also made date a descending index, as we commonly want to query for the most recent entries. Second, we added level as part of the index. This compound index will make any query on date, and any query on date and level together, more efficient. It cannot be used for queries on level alone.
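The rule at work is that a compound index helps only with queries that filter on its leading field or fields. The sketch below is a deliberately simplified model of that rule, not a description of how MongoDB’s query planner is implemented; usablePrefixLength is a hypothetical name.

```javascript
// Simplified model of the prefix rule: a compound index helps a query
// only when the query filters on the index's leading field(s).
// Returns how many leading index fields the query can use (0 = index not usable).
function usablePrefixLength(indexFields, queryFields) {
  let n = 0;
  for (const field of indexFields) {
    if (!queryFields.includes(field)) break;
    n++;
  }
  return n;
}

const index = ["date", "level"]; // field order of the compound index

console.log(usablePrefixLength(index, ["date"]));          // 1: uses the date part
console.log(usablePrefixLength(index, ["date", "level"])); // 2: uses both parts
console.log(usablePrefixLength(index, ["level"]));         // 0: index not usable
```

To see what the database actually does with a given query, append explain() to it in the shell, for example db.log.find({level: 3}).explain().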
Sometimes you’ll want to look for overall trends in your logs, so you’ll count how many times each kind of error occurs. In this example, we’ve grouped the log entries by error level to show how many there are of each:
> db.log.group({key: {level: true}, initial: {count: 0}, reduce: function (obj, prev){prev.count++}});
[
{ "level" : 3, "count" : 1 },
{ "level" : 6, "count" : 4 },
{ "level" : 4, "count" : 8 },
{ "level" : 0, "count" : 1 }
]
You can use the group() function to count errors per day, or from a particular machine, as you so choose. Do take care though, as this approach is only useful on small data sets. If you have over 10,000 results then you’ll want to use map/reduce to generate the results.
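To show how the key, initial and reduce pieces fit together, here is an in-memory imitation of the pattern in plain JavaScript that counts entries per day. It runs against an array rather than a collection; groupBy and its parameter names are hypothetical, and the day is taken from the first ten characters of the string date.

```javascript
// In-memory imitation of the key / initial / reduce pattern used by group().
// groupBy, keyOf, initial and reduce are hypothetical names for this sketch.
function groupBy(docs, keyOf, initial, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    const key = keyOf(doc);
    const id = JSON.stringify(key);
    if (!groups.has(id)) {
      // Each group starts as its key fields plus a copy of the initial values.
      groups.set(id, Object.assign({}, key, initial));
    }
    reduce(doc, groups.get(id));
  }
  return Array.from(groups.values());
}

const logs = [
  { date: "2012-02-28 09:15:02", level: 4 },
  { date: "2012-02-29 11:11:16", level: 3 },
  { date: "2012-02-29 12:35:41", level: 6 }
];

const perDay = groupBy(
  logs,
  doc => ({ day: doc.date.slice(0, 10) }),   // "YYYY-MM-DD" part of the date string
  { count: 0 },
  (obj, prev) => { prev.count++; }
);
console.log(perDay);
```

In the shell itself, group() accepts a keyf function in place of key for computed keys like this one; check the documentation for your MongoDB version before relying on it.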
Summary
It makes sense to start small when looking at adding MongoDB to an existing application, and logging is an ideal candidate. Different types of errors can include different types of information and you can also save the current object or any other information to MongoDB since it has a flexible schema. Any new technology can be a bit of a learning curve but hopefully the command line examples help you to get quite close to what you are working on. Implementing just one piece of functionality in something new can be a great way to get your feet wet – hope you enjoy MongoDB as much as we do!