Content Syndication With ColdFusion MX
By David Medlock
Content syndication has become a massively popular approach to providing content to a Website’s visitors, as well as providing a means for many people to gather information in one place for quick and easy viewing.
For example, if you get your ColdFusion news from Macromedia, your technical articles from SitePoint, and your digital security news from CNET, you might want to create a single Web page that allows you to see all the news at one time. In this article, we’re going to talk about how you can do this very quickly (and I mean very quickly!) and easily in ColdFusion MX.
It’s important to note that, because RSS is an XML-based technology, you must use ColdFusion MX in order to follow along with the examples in this article. There are ways of using XML with older versions, but they are overly complicated and not always as stable as MX’s built-in functionality.
What is RSS?
First, let’s talk about what RSS is. I’m not going to give you the entire history here, but I will provide links to very important information that will help you to get started with the concept behind RSS. To sum it up very simply, RSS (Really Simple Syndication or RDF Site Summary) is a clean, straightforward method of sharing content between Websites. RSS can work for you in several ways:
- RSS can allow you to keep news items updated on your site without having to manually change or update the news. If you run a health-related Website, you might want to put health-related news on the site. In this case, you could use MedicineNet’s RSS feed to keep the news updated.
- You can create your own RSS feed that will allow visitors to your Website to syndicate your content. This can increase your visibility among sites of a similar genre, increase traffic to your site, and help you to build a recognizable name.
- You can use RSS to create your own personal home page for keeping track of news items you usually have to visit several sites to get. This also gives you the ability to manipulate the way the information is presented and how often it’s updated.
The possibilities for RSS feeds are unlimited. Before we move too much further, though, I’m going to list some resources you may want to read before delving into the rest of the article.
- Get Off Your RSS!
- Introduction to RSS at WebReference
- XML.com – What is RSS?
- RSS 2.0 Specification (Technology @ Harvard Law)
- RSS 1.0 Specification
Retrieving and Caching RSS Feeds
In my article on HTTP, I discussed how to retrieve an RSS feed using HTTP. To recap that, we’ll go ahead and talk about how to retrieve the RSS feed and cache it for later use. First, we’re going to use a very simple HTTP call to retrieve the feed. We’ll work with the RSS feed for my blog here at SitePoint as our example. To retrieve the blog feed, we’ll use an HTTP call that looks like this:
<cfhttp url="http://www.sitepoint.com/blog.rdf?blogid=7" method="get" timeout="5" path="C:\temp" file="dave_blog.rdf" />
This tag retrieves the RSS feed from SitePoint and stores it in a file called "dave_blog.rdf". It’s important to remember that, because we specify the path and file, there will be no FileContent variable in the cfhttp struct. This means we’re going to have to read the file into a variable:
<cffile action="read" file="C:\temp\dave_blog.rdf" variable="FileContent">
Now we have a variable called FileContent that contains the RDF document for the feed. We’re eventually going to parse this XML document and output the information. But, before we do so, let’s discuss caching.
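Before moving on to caching, a point of comparison: here is a minimal sketch of the same request without the path and file attributes. In that case ColdFusion keeps the response body in cfhttp.FileContent, so no cffile read is needed. The trade-off is that nothing is saved to disk, which is why the caching code in this article keeps the path and file.

```cfml
<!--- Sketch only: no path/file attributes, so the body is returned in memory --->
<cfhttp url="http://www.sitepoint.com/blog.rdf?blogid=7" method="get" timeout="5"></cfhttp>
<!--- cfhttp.FileContent holds the raw RDF text of the response --->
<cfset FileContent = cfhttp.FileContent>
```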
If you’re going to put my blog feed on your site and you have 1,000 visitors a day then, chances are, you’re going to have to make an HTTP call to SitePoint 1,000 times a day. To prevent overloading their server, and to speed up the display of this feed on your own site, you should definitely cache the file on your server. This is why we’ve specified a path and file name to which we’re going to save the feed. I’m thinking that I want to update the feed every 24 hours so that I’ll be sure to get fresh content, but I won’t hit SitePoint very often at all. Here’s the complete code (cache.cfm) I’ll use to fetch the feed as needed and cache it to a file:
<cfset SavePath = ExpandPath("./cache.cfm")>
<cfset SavePath = Replace(SavePath, "cache.cfm", "blog.rdf")>
<cfdirectory action="list" directory="#GetDirectoryFromPath(SavePath)#" name="blogRdf" filter="blog.rdf">
<!--- Delete the cached copy if it exists and is more than 24 hours old --->
<cfif blogRdf.RecordCount gt 0 and DateCompare(blogRdf.DateLastModified, DateAdd("h", -24, Now())) eq -1>
<cffile action="delete" file="#SavePath#">
</cfif>
<!--- Fetch a fresh copy only when no cached file is present --->
<cfif not FileExists(SavePath)>
<cfhttp url="http://www.sitepoint.com/blog.rdf?blogid=7" method="get" timeout="10" path="#GetDirectoryFromPath(SavePath)#" file="#GetFileFromPath(SavePath)#"></cfhttp>
</cfif>
<!--- Read the cached feed into a variable for parsing --->
<cffile action="read" file="#SavePath#" variable="FileContent">
This is a pretty big chunk of code, so I’ll break it down line by line for you. It’s really very simple.
- I create the SavePath variable, which is simply the path to the current template file.
- I replace the name of my current template with the name of the file in which I want to store the feed.
- I get a directory listing of the directory in which the feed is stored.
- If the file exists in the directory and it is older than 24 hours, I delete it.
- If the file does not exist (remember, it’s deleted every 24 hours), I do an HTTP call to retrieve it and I store it in the file.
- I read in the file and store it as "FileContent" for parsing later.
And that is pretty much all that has to be done for our retrieval and caching mechanism. That was easy, wasn’t it? The rest is easy, too.
Parsing and Displaying the RSS Feed
If you look at the raw XML output of the RSS feed, you’ll be able to see that its structure contains a root node called "rdf:RDF". We then have a "channel" node with a title, link, description, and image. We then have an "items" node that contains an array of "li" nodes ("rdf:li"). We have an "image" node that contains information about the SitePoint logo, including the title of the image, the link to the blog (to go into the href of your anchor tag), and the URL of the image itself. Then, we have an array of "item" nodes, each of which contains a title, a direct link to the blog post, a description, and the date. Notice that the date is in the "dc" namespace and it’s not in a format that ColdFusion handles well. I’ll show you a brute force method of displaying the date as well.
The first thing you’ll have to do is decide what you want to display and how you want to display it. I’m going to have a single page that displays the title and description of the blog as well as a link to it and the SitePoint logo. I’ll then display each blog entry — the title will be linked to the entry on SitePoint’s Website and, beneath it, the description and date will appear. Here’s the code I use to do this:
<cfset xmlDoc = XMLParse(FileContent)>
<cfoutput>
<img src="#xmlDoc.rdf.image.url.xmlText#" align="left">
Source Title: #xmlDoc.rdf.channel.title.xmlText#<br>
Link: <a href="#xmlDoc.rdf.channel.link.xmlText#">#xmlDoc.rdf.channel.link.xmlText#</a><br />
<cfloop from="1" to="#ArrayLen(xmlDoc.rdf.item)#" index="i">
<!--- Strip the "T" and "Z" so DateFormat can read the dc:date value --->
<cfset dt = Replace(xmlDoc.rdf.item[i].date.xmlText, "T", " ")>
<cfset dt = Replace(dt, "Z", " ")>
<a href="#xmlDoc.rdf.item[i].link.xmlText#">#xmlDoc.rdf.item[i].title.xmlText#</a><br />
#xmlDoc.rdf.item[i].description.xmlText#<br />
<em>(Posted on #DateFormat(dt, "mm/dd/yyyy")#)</em><br />
</cfloop>
</cfoutput>
I’ll walk you through the code we’re using to parse and display the feed. First, we use the XMLParse(string) function to parse the document. Note that the parser requires well-formed XML, so it will choke if you pass it a malformed document. When you use this function, it creates an XML Document object. Essentially, this is a series of structures and arrays all nested inside one another.
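Because a bad feed makes XMLParse throw an error, you may want to guard the call. The following is a minimal sketch, not part of the original listing; feedOk is a hypothetical flag introduced here for illustration.

```cfml
<!--- Sketch: trap parse errors so a malformed feed doesn't break the page --->
<cfset feedOk = true>
<cftry>
  <cfset xmlDoc = XMLParse(FileContent)>
  <cfcatch type="any">
    <!--- feedOk is a hypothetical flag, not part of the article's code --->
    <cfset feedOk = false>
  </cfcatch>
</cftry>
<cfif not feedOk>
  <p>The feed is temporarily unavailable.</p>
</cfif>
```

With this in place, the display code would only run when feedOk is true.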
Next, we’ll open our cfoutput and we’ll begin to display information. My image source is going to be the value contained in xmlDoc.rdf.image.url.xmlText. If you look carefully, you’ll see that you can access values simply by "walking" down the document tree to the value you need. We start with the xmlDoc variable, then go to the root node of rdf. Our image information is in the image node, under the url node. The value is contained in the xmlText element of the url node.
As a side point, I should explain how to access attributes in an XML document. Let’s access the "about" attribute of the image node. Here’s the path to it: xmlDoc.rdf.image.xmlAttributes.about. To access an attribute, refer to the xmlAttributes struct that is attached to the node in which the attribute is stored; then, simply access the attribute name.
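As a small sketch of attribute access, the snippet below lists whatever attribute names the image node actually carries and then reads "about". The check for a prefixed "rdf:about" key is an assumption about how the feed names the attribute; a name containing a colon needs bracket notation rather than dot notation. The attrs variable is introduced here for illustration.

```cfml
<!--- attrs is a hypothetical helper variable pointing at the attribute struct --->
<cfset attrs = xmlDoc.rdf.image.xmlAttributes>
<cfoutput>
<!--- Show every attribute name found on the image node --->
Attributes on the image node: #StructKeyList(attrs)#<br>
<cfif StructKeyExists(attrs, "about")>
  about = #attrs.about#<br>
<cfelseif StructKeyExists(attrs, "rdf:about")>
  <!--- Assumed case: the key includes the namespace prefix --->
  about = #attrs["rdf:about"]#<br>
</cfif>
</cfoutput>
```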
You can apply the above logic to display the title, description, and link to the blog. Then, we want to display the items. To do this, I’m simply going to loop from 1 to the length of the array found at xmlDoc.rdf.item. As I’m looping over it, I’ll access the item[i].variableName variable to display the desired information. Note that you must fully qualify the path to the value, otherwise ColdFusion won’t know where to look for the variable.
That was pretty easy, too, right? We now have a total of about 37 lines of code that we’re using to display the blog feed itself. Your assignment is to modify the code to make it prettier, display only x rows on one page, and allow for premature refreshing of the cache (before the 24 hours is up).
Before I move on, let me talk about the date, though. This date is not presented in a format that ColdFusion understands. Therefore, our script crashes pretty hard when it tries to format it using the DateFormat(date, mask) function. To fix this, you simply need to replace the "T" and "Z" with spaces, as you see I’ve done above. Then, you can display it however you like.
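Here is the date cleanup in isolation, as a rough sketch. The sample value is an assumed example of the timestamp style, not taken from the feed. Keeping only the first 19 characters is a small variation on the replacement above: it drops a trailing "Z" and would also drop a time zone offset if the feed supplied one.

```cfml
<!--- rawDate is a made-up sample in the assumed yyyy-mm-ddThh:mm:ssZ layout --->
<cfset rawDate = "2003-10-21T14:05:00Z">
<!--- Keep "yyyy-mm-ddThh:mm:ss", then swap the "T" for a space --->
<cfset dt = Replace(Left(rawDate, 19), "T", " ")>
<cfoutput>
Posted on #DateFormat(dt, "mm/dd/yyyy")# at #TimeFormat(dt, "h:mm tt")#
</cfoutput>
```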
Creating an RSS Feed for Your Site
If you run a content site and you’d like to syndicate your content to other sites, the easiest way to do so is to create an RSS feed. I’m going to use RSS version 0.91 to create a content feed of my own. You can use RSS 1.0, as SitePoint does, if you’d prefer. Just make a few minor modifications so that the document is compliant with the standard.
My approach to delivering the content feed is that I’ll have a scheduled task that runs every 24 hours. (If you have a site that is very dynamic and constantly changing, you may want to decrease this interval.) This scheduled task will create the RSS feed and write it to a file on the server, which can then easily be accessed by all. It’s a very simple process and the code looks like this:
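<cfquery name="GetArticles" datasource="#dsn#">
SELECT TOP 5 Title, Description, ID
FROM Articles <!--- placeholder table name; the FROM clause is missing from this listing --->
ORDER BY ID DESC
</cfquery>
<cfxml variable="rssFeed">
<rss version="0.91"><channel>
<title>My Content Site – Recent Articles</title>
<description>The best place to get content.</description>
<!--- the channel link and image nodes belong here; example.com below is a placeholder --->
<cfoutput query="GetArticles">
<cfset Desc = REReplaceNoCase(Description, "<[^>]*>", "", "ALL")>
<cfset Desc = Replace(Desc, "&nbsp;", " ", "ALL")>
<item><title>#XMLFormat(Title)#</title><link>http://www.example.com/article.cfm?id=#ID#</link><description>#XMLFormat(Desc)#</description></item>
</cfoutput></channel></rss></cfxml>
<cffile action="write" file="#ExpandPath("./myArticles.xml")#" output="#ToString(rssFeed)#">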
Don’t be concerned with the query or the database structure at this point. Just know that the information I’m getting from the database is the title, description, and link (in this case, the ID for the link) to be used in the RSS feed.
The next notable, and the most important, thing in this file is the use of cfxml. This tag is being used to create an XML document that will later be converted to a string and written to a file. Notice that we specify the variable attribute, which will be the ColdFusion variable that holds the XML document. The variable name is not the top level node! Don’t let that fool you. Next, we simply insert the RSS nodes with which we want to provide our users, including the title, link, URL, and image information. We then loop over the query and add each item node with the title, link, and description.
Finally, we use the ToString(string) function to convert the XML document created in cfxml into a string that can be written to a file. Run the template and then point your browser at the "./myArticles.xml" file in the same directory as the template. (You can move this anywhere on your server as long as it is accessible via the Web. Otherwise, it does no good.)
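The scheduled task mentioned earlier can be set up in the ColdFusion Administrator or registered from code with cfschedule. What follows is a minimal sketch, assuming the feed-building template is saved as buildFeed.cfm on www.example.com; the task name, host, and template name are all hypothetical.

```cfml
<!--- Task name, URL, and template name are placeholders for illustration --->
<cfschedule action="update"
    task="BuildRssFeed"
    operation="HTTPRequest"
    url="http://www.example.com/buildFeed.cfm"
    startDate="#DateFormat(Now(), 'mm/dd/yyyy')#"
    startTime="12:00 AM"
    interval="daily">
```

For a site that changes more often, the interval attribute also accepts a number of seconds in place of "daily".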
For another way to create an RSS feed, you may be interested in this tutorial by Pablo Varando. It’s a different approach that works just as well. Note that it’s been tested on BlueDragon as well. My example has not, but the BlueDragon server supports all the tags and methods I used, so, technically, it should work fine.
We’ve now seen how to retrieve and display an RSS/RDF feed. The example we used was RSS 1.0, but the principles are the same for using RSS 0.91. We’ve also seen how to create our own RSS feed for our users to access so they can display our content on their sites.
Now, I’m going to introduce something new to my articles: assignments. Here are a few assignments that you can complete to enhance your skills in this area. If you need help with these assignments, please feel free to discuss them below or on the SitePoint Forums. I’m generally available to help and, on the forums, there are many very skilled ColdFusion developers, who offer their advice and viewpoints readily.
- Modify the RSS reader to use RSS 0.91.
- Create a custom tag that accepts a URL for an RSS feed and then retrieves it, caches it, parses it, and displays it. Set it up to accept either RSS 0.91 or RSS 1.0.
- Modify the RSS creation script above to create an RSS 1.0 feed.
- Create an RSS feed for your site (if you have a content or news based site).
There are many other ways that RSS can be used on the Web. In fact, the more you think about it, the more ideas you’ll find, I’m sure. Feel free to share those ideas with us using the discussion box below. I hope to see many ColdFusion developers using RSS feeds in the near future. Until next time, keep it Cold!