Consuming Feeds with SimplePie
If you’re an avid feed consumer like I am, you might be a bit sad to see Google’s recent announcement of Reader reaching end of life. Reader was simple and easy to use, but there’s no reason you can’t have the same functionality with your own homegrown project. The PHP library SimplePie allows for quick and easy feed consumption and display. Here’s how you can get started on your own feed reader.
SimplePie is installable via Composer. Add the following to your composer.json file. After Composer downloads the library and you include its autoloader file in your PHP script, you’re ready to begin writing your very own reader.
{
    "require": {
        "simplepie/simplepie": "dev-master"
    }
}
Basic Functionality
To work with SimplePie, you’ll first need to pick an RSS or Atom feed you’d like to manipulate and grab its URL. I’ll be using the New York Times for my examples. Here’s what you should have to start:
<?php
// Composer's autoloader; adjust the path if your vendor directory lives elsewhere.
require_once 'vendor/autoload.php';
$url = 'http://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml';
$feed = new SimplePie();
$feed->set_feed_url($url);
$feed->init();
You can see in the above code the URL to the NY Times’ feed and the beginning of our SimplePie-based reader. As of SimplePie 1.3, the latest version at the time of writing, the constructor no longer takes any arguments, so we use the set_feed_url() method to tell it where to pull the feed data from. Once the URL is set, we call init() and the reader is ready to go. You won’t see anything with just the above, though; we still need to go out and grab the information.
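Before going further, it can help to confirm that the fetch actually worked. The following is a minimal sketch, not part of the original example: loadFeedOrError() is a hypothetical helper name, and it relies on init() returning false on failure and on error() returning the failure message. It takes the place of the bare init() call above.

```php
<?php
// Hypothetical helper: initialise the feed and return an error message,
// or null when everything went fine. $feed is expected to be a configured
// SimplePie instance (anything exposing init() and error() will do).
function loadFeedOrError($feed): ?string
{
    // init() returns true on success and false on failure.
    if ($feed->init()) {
        return null;
    }

    $error = $feed->error();

    // Assumption: with several feeds configured, error() may hand back
    // an array of messages, so flatten it just in case.
    if (is_array($error)) {
        $error = implode('; ', array_filter($error));
    }

    return 'Could not load feed: ' . ($error ?: 'unknown error');
}

// Usage, replacing the plain $feed->init() call:
// $problem = loadFeedOrError($feed);
// if ($problem !== null) {
//     echo '<p>' . htmlspecialchars($problem) . '</p>';
//     exit;
// }
```

Returning a message rather than echoing it keeps the decision about how to display the failure in your page template.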
SimplePie provides many classes and methods to extract information from any given section of a feed. Building on the code above, let’s add the following:
<?php
echo '<h1>' . $feed->get_title() . '</h1>';
echo '<p>' . $feed->get_description() . '</p>';
The methods get_title() and get_description() do exactly what they say; they return the title and the description of the entire feed. You can get the feed author (in some instances, but not with the NY Times feed), feed contributors, feed link(s), copyright, and of course items with similar methods.
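To illustrate those other feed-level methods, here is a small sketch that gathers a few of them into an array. feedSummary() is a hypothetical helper name; get_link(), get_copyright(), and get_author() are SimplePie methods, and the author is guarded because get_author() returns null when the feed doesn't name one.

```php
<?php
// Hypothetical helper: collect feed-level metadata in one place.
// $feed is expected to be an initialised SimplePie instance.
function feedSummary($feed): array
{
    // null when the feed names no author (as with the NY Times feed).
    $author = $feed->get_author();

    return [
        'title'       => $feed->get_title(),
        'description' => $feed->get_description(),
        'link'        => $feed->get_link(),
        'copyright'   => $feed->get_copyright(),
        'author'      => $author ? $author->get_name() : null,
    ];
}

// Usage:
// $summary = feedSummary($feed);
// echo '<h1><a href="' . $summary['link'] . '">' . $summary['title'] . '</a></h1>';
```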
In order to actually display any stories posted by the feed, you need to get at least one item using the get_item() or get_items() methods. For now, let’s just grab a single item and get some information from it.
<?php
$item = $feed->get_item(0);
echo '<article>';
echo '<header>';
echo '<p>Title: <a href="' . $item->get_link() . '">' . $item->get_title() . '</a></p>';
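// get_author() returns null when the item names no author, so guard the call.
$author = $item->get_author();
echo '<p>Author: ' . ($author ? $author->get_name() : 'Unknown') . '</p>';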
echo '<p>Date: ' . $item->get_date('Y-m-d H:i:s') . '</p>';
echo '<p>Description: ' . $item->get_description() . '</p>';
echo '</header>';
echo $item->get_content(true);
echo '</article>';
This certainly won’t look pretty, but it shows you the basics necessary for getting information from a feed item. So, let’s go over what some of this means.
The get_item() method retrieves a single item from the feed. The integer argument is the item’s position in the feed and, as with an array, zero is the index of the first item.
The get_link() method returns a URL to the feed item itself, allowing you to open the article, video, or whatever else the feed publishes. The get_title() and get_description() methods should look familiar; they work the same way as the feed-level methods we called earlier.
The author line looks a little funny, but as I’m sure you’ve figured out, the get_author() method returns an author object (or null when the item doesn’t name one), and we call the get_name() method on that object to find out who wrote the article.
The get_date() method takes any standard PHP date format string to display the date however you like.
Finally, the get_content() method takes a Boolean argument that controls its fallback behavior: true returns only the item’s full content, while false (the default) tries the full content first but falls back to the description if none is provided.
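The difference between the two item body methods is easy to mix up, so here is a short sketch. itemBody() is a hypothetical helper name; as I understand SimplePie's API, get_content() prefers the full content and get_description() prefers the summary, and each falls back to the other unless you pass true.

```php
<?php
// Hypothetical helper: pick the body text for an item.
// $item is expected to be a SimplePie item object.
function itemBody($item, bool $preferSummary = false)
{
    if ($preferSummary) {
        // Summary first; falls back to the full content if no summary exists.
        return $item->get_description();
    }

    // Full content first; falls back to the summary if no full content exists.
    return $item->get_content();
}

// Usage:
// echo itemBody($feed->get_item(0));        // full article where available
// echo itemBody($feed->get_item(0), true);  // short summary for a listing page
```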
Selecting Items
So we’ve seen the basics and have displayed a feed title, description, and the relevant information for a single item posted to the feed. That’s all well and good, but it would be frustrating to have to refresh the page constantly to see if there is a newer item, and even more frustrating that we can’t see any past items!
SimplePie provides for that very easily, and I’ve already briefly touched on our options: get_item() and get_items(). These two methods retrieve feed items and their content in two different ways.
The first option that we demoed takes a single integer that specifies which item in the “array” of feed items we want to grab. Using this in a loop in conjunction with the method get_item_quantity() allows us to display all of or a subset of the items in the feed.
<?php
$itemQty = $feed->get_item_quantity();
for ($i = 0; $i < $itemQty; $i++) {
...
}
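As one way to fill in the loop body, the sketch below collects item titles. latestTitles() is a hypothetical helper name; it relies on get_item_quantity() accepting an optional maximum, where zero means no cap.

```php
<?php
// Hypothetical helper: return the titles of the first $max items.
// $feed is expected to be an initialised SimplePie instance.
function latestTitles($feed, int $max = 5): array
{
    $titles = [];

    // get_item_quantity($max) never returns more than $max (0 means "all").
    $count = $feed->get_item_quantity($max);

    for ($i = 0; $i < $count; $i++) {
        $titles[] = $feed->get_item($i)->get_title();
    }

    return $titles;
}

// Usage:
// foreach (latestTitles($feed, 10) as $title) {
//     echo '<li>' . $title . '</li>';
// }
```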
This solution, however usable, is not very elegant if you want to paginate feed items. For that, we can use the get_items() method. It accepts two integers: an offset and an item count. Passing zero for both returns every item, and, more usefully, other values return a small subset from anywhere in the list of available items.
<?php
foreach ($feed->get_items(3, 3) as $item) {
...
}
This call displays the second page at three items per page: it skips the first three items (indexes 0 to 2) and returns the next three.
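The offset arithmetic is worth spelling out. Here is a minimal sketch that turns a one-based page number into the arguments get_items() expects; pageOffset(), pageCount(), and itemsForPage() are hypothetical helper names, not part of SimplePie.

```php
<?php
// Hypothetical helper: offset of the first item on a one-based page.
function pageOffset(int $page, int $perPage): int
{
    if ($page < 1 || $perPage < 1) {
        throw new InvalidArgumentException('Page and page size must be at least 1.');
    }

    return ($page - 1) * $perPage;
}

// Hypothetical helper: number of pages needed to show $totalItems items.
function pageCount(int $totalItems, int $perPage): int
{
    return (int) ceil($totalItems / $perPage);
}

// Hypothetical helper: fetch one page of items from an initialised feed.
function itemsForPage($feed, int $page, int $perPage): array
{
    return $feed->get_items(pageOffset($page, $perPage), $perPage);
}

// Usage: page 2 at three per page is the same as get_items(3, 3).
// foreach (itemsForPage($feed, 2, 3) as $item) { ... }
// $pages = pageCount($feed->get_item_quantity(), 3);
```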
Caching
With all of this data flowing, it would be quite taxing to process the entire feed every time you load the page, but don’t worry… SimplePie has you covered. There are several options for caching feed data built-in so you don’t have to pull the entire feed every time. SimplePie uses a Conditional GET to determine if a feed has been updated since the last retrieval time. To store the content, your cache storage options are the file system, MySQL, Memcache, or – if you’re feeling up to it – you can write your own handler.
The easiest to implement is caching to a file. Here’s how:
<?php
$feed = new SimplePie();
$feed->set_feed_url($url);
$feed->enable_cache();
$feed->init();
And that’s it! The newly added line turns on caching for SimplePie and writes the data out to a file. SimplePie’s default cache location is ./cache, relative to the running script, so for a script in your document root you’ll need a writeable directory named cache there and you’re all set. If you prefer to specify a different location, that’s easy too. Simply use the method set_cache_location() and pass it a string pointing to the location you want cache files written to. The same method can also be used in the case of caching with MySQL and Memcache.
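Here is a sketch of a custom file cache setup. configureCache() is a hypothetical helper name and the directory you pass in is your own choice; set_cache_location(), set_cache_duration(), and enable_cache() are SimplePie methods, with the duration given in seconds.

```php
<?php
// Hypothetical helper: point SimplePie's file cache at $dir and keep
// entries for $seconds. Call it before $feed->init().
function configureCache($feed, string $dir, int $seconds = 3600): void
{
    // Create the directory if needed; the second is_dir() check covers the
    // case where another request created it in the meantime.
    if (!is_dir($dir) && !mkdir($dir, 0775, true) && !is_dir($dir)) {
        throw new RuntimeException("Cannot create cache directory: $dir");
    }

    if (!is_writable($dir)) {
        throw new RuntimeException("Cache directory is not writable: $dir");
    }

    $feed->set_cache_location($dir);
    $feed->set_cache_duration($seconds);
    $feed->enable_cache(true);
}

// Usage:
// $feed = new SimplePie();
// $feed->set_feed_url($url);
// configureCache($feed, __DIR__ . '/cache', 1800); // half an hour
// $feed->init();
```

Failing early with an exception makes a missing or read-only directory obvious, rather than leaving you to wonder why every page load still fetches the feed.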
Conclusion
These are the most basic principles of using the SimplePie library. You’ve learned how to set up and initialize a feed, grab one or more items from the feed, and parse the information contained in those items to display.
Of course, SimplePie’s functionality doesn’t end there. It also gives you the ability to pull items and parse information from multiple feeds at once, allowing for a mix of information displayed at your fingertips, all controlled by you. Dive into the API documentation; it’s an excellent resource and will guide you well.
I look forward to seeing the next best Google Reader replacement powered by SimplePie, written by you! You can find a brief sample to accompany this article on GitHub to get you started.
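As a starting point for the multiple-feed case, here is a rough sketch. configureMultifeed() is a hypothetical helper name; it assumes set_feed_url() accepts an array of URLs and that set_item_limit() caps how many items each feed contributes.

```php
<?php
// Hypothetical helper: merge several feeds into one reader.
// Call it before $feed->init(); $urls is a list of feed URLs you choose.
function configureMultifeed($feed, array $urls, int $perFeed = 5): void
{
    if ($urls === []) {
        throw new InvalidArgumentException('At least one feed URL is required.');
    }

    // Passing an array tells SimplePie to fetch and merge all of the feeds.
    $feed->set_feed_url(array_values($urls));

    // Limit how many items each individual feed contributes.
    $feed->set_item_limit($perFeed);
}

// Usage:
// $feed = new SimplePie();
// configureMultifeed($feed, [$url, $anotherUrl], 3);
// $feed->init();
// foreach ($feed->get_items() as $item) { ... }
```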