Like most developers, you’ve probably heard a great deal about XML and content management systems. It’s likely, however, that you’ve only been exposed to theoretical discussions that haven’t been grounded in practical knowledge.
This step-by-step tutorial will get you up and running with a very basic XML-based content management system (or CMS). I don’t have the space here to get into a very complex example, but with any luck, the concepts and ideas presented here will provide you with the necessary springboard for your own exploration.
Some notes before we get started. I’m using DOMXML functions and sessions to make this application work, so you’ll need to use PHP 4.2.1 or higher, and turn on DOM support.
Also, don’t forget to download all the code for this tutorial. It’ll come in handy!
Key Takeaways
- Utilize PHP 4.2.1 or higher and enable DOM support to effectively use DOMXML functions and sessions in building an XML-based CMS.
- Download all code provided in the tutorial to assist in setting up the basic structure and functionality of the CMS.
- Understand the flexible nature of XML, which allows for the definition of custom tags, making it ideal for managing diverse data types in a CMS.
- Recognize the structure of the CMS, which includes a data backend using XML files, a display component for rendering content, and an administrative component for content management.
- Follow the provided PHP code examples closely to set up the administrative tools for creating, editing, publishing, and deleting XML content.
- Consider future expansions and improvements of the CMS by exploring more dynamic XML structures and robust search algorithms as outlined in the tutorial.
A Short Introduction to XML and CMSs
Let me first give you a little background on CMSs and XML. I assume that you’ve read or heard about both of these technologies elsewhere, so I’ll keep this discussion brief.
XML stands for eXtensible Markup Language, and is a subset of SGML (Standard Generalized Markup Language). XML is very much like HTML, except that in XML you can define your own tags. This ability to produce custom documents comes in very handy when you need to track certain types of data very closely, particularly in the worlds of publishing and ecommerce.
For example, for any given article you publish in an online magazine, you can create tags for author’s name, byline (if it’s different from the author’s name), word length, date of publication, title or headline, story body, keywords, and so on. As you’ll see later in this article, breaking your article down into these XML tags or nodes allows the CMS to do useful things with all the articles it holds.
Essentially, XML allows you to make a mini-database out of each document, without the overhead and expense that many databases bring to a Web project.
A CMS is used to create, publish, and maintain content on a Website. It usually consists of the following pieces:
- A data backend (XML or database tables) that contains all of your articles, news stories, images, and other content.
- A data display component, usually templates or other pages, onto which your articles, images, etc. are “painted” by the CMS for site visitors.
- A data administration component, usually easy-to-use HTML forms that allow site administrators to create, edit, publish, and delete articles in some kind of secure workflow. The data administration portion of a CMS is usually the most complicated, and where you’ll likely spend most of your development time.
Over the past decade, different scripting languages have been used to create CMSs, including Perl/CGI, ASP, TCL, JSP, Python, and PHP. Each of these languages has its own pros and cons, but I’m going to focus on using PHP with XML to build a simple content management system.
Requirements
Building any kind of CMS, whether database- or XML-backed, involves the gathering of information that defines the basic requirements for the project. Although many developers groan at the thought of this kind of exercise, a set of well-defined requirements can make your life a lot easier.
Because this is a fairly simple project, and because you’re going to do it for yourself, a simple requirements list will do.
What kind of requirements do we need to gather? Essentially, requirements fall into three major categories:
- What kind of content will the CMS handle? Furthermore, how is each type of content broken down? (The more complete your understanding of this issue, the easier it’ll be to create your XML files.)
- Who will be visiting the site, and what behaviors do these users expect to find? (For example, will they want to browse a hierarchical list of articles, search for articles by keyword, and see links of related articles?)
- What do the site administrators need to do? (For example, log in securely, create content, edit content, publish content, and delete content. If your content management system will have roles for administrative users – such as site admin, editor, writer – then your system becomes more complex).
In the interests of keeping this article from becoming complicated, I will choose very basic requirements for my simple XML-based CMS:
- The CMS will handle the management of articles only. Each article will have a:
- unique ID
- headline
- author name
- author email
- abstract
- article body that can contain paragraphs and one level of subheading
- status (either “in progress” or “live”)
- keyword listing
- Site visitors will be able to view “live” articles listed by author name. They will also be able to perform a search on headline and keywords.
- The site itself will consist of the following pages:
- A home page that lists five articles published on the site and a search function.
- An article detail page that displays one article at a time.
- A search result page that will list all articles by an author, keyword, or string entered into the search engine.
- Site administrators get a secure login, a way to add more administrators, and easy screens from which to add new XML files, edit existing files, publish files, and delete files.
Defining the XML Files
Whenever I build a CMS, I try to define the data backend first, because I find that all the other elements cascade from there. In this case, our data backend is an XML file repository, so we need to define how our files should be structured.
XML files are made up of nested start and end tags, each of which defines some chunk of information. XML files must also contain a “root” start and end tag that includes all the other tags.
Because we are only going to be dealing with articles in this example, our “root” start and end tag should be:
<article>
</article>
All other tags that we identified during our discovery phase must go in between these two tags. Based on that list, our article files will likely be structured like this:
<?xml version="1.0"?>
<article id="xml-howto-1">
<headline>Writing XML Articles</headline>
<status>in progress</status>
<author>Joe Author</author>
<email>jauthor@example.com</email>
<abstract>A short article about writing XML articles.</abstract>
<keywords>XML,articles,how to</keywords>
<para-intro>Intro paragraph here.</para-intro>
<para-main>Main paragraph.</para-main>
<para-conclusion>Conclusion paragraph.</para-conclusion>
</article>
Several things to note about our article example:
- Usually, you would create a DTD or Schema to define how an article would look. Creating effective DTDs or Schemas is an entire tutorial unto itself, so here, I used a shortcut method involving a sample case. This is faster than developing a schema, but be aware that you may run into problems because your sample case may be too simple. Also, if you want to validate your XML document, you will need to create a DTD.
- Did you notice the “id=” part in the article tag? This is called an attribute. We’ll talk more later about why it’s important to have a unique id attribute for each article we create in the system.
- Because we want to keep this example simple, I’m going to assume that our articles will comprise only three paragraphs each, and the forms we build later on will accommodate this structure. In future tutorials, we will build a more dynamic structure in which we nest the paragraph tags into a <body> tag.
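To make the structure concrete, here is a minimal sketch that reads a cut-down version of the sample article. Note the assumption: it uses the DOM extension that ships with PHP 5 and later (DOMDocument), not the PHP 4 DOMXML functions used in the rest of this tutorial, so treat it as an illustration rather than part of the CMS code.

```php
<?php
// Sketch only: uses PHP 5+'s DOM extension (an assumption), not the
// PHP 4 DOMXML functions the rest of this tutorial relies on.
$xmlString = <<<XML
<?xml version="1.0"?>
<article id="xml-howto-1">
  <headline>Writing XML Articles</headline>
  <status>in progress</status>
  <keywords>XML,articles,how to</keywords>
</article>
XML;

$doc = new DOMDocument();
$doc->loadXML($xmlString);

$root = $doc->documentElement;                 // the <article> root node
$id = $root->getAttribute('id');               // value of the id attribute
$headline = $root->getElementsByTagName('headline')->item(0)->textContent;
$keywords = explode(',', $root->getElementsByTagName('keywords')->item(0)->textContent);

echo $id . ': ' . $headline . ' (' . count($keywords) . " keywords)\n";
// prints "xml-howto-1: Writing XML Articles (3 keywords)"
```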
Building the Admin Tool
The admin tool for our XML-based CMS will be just a few PHP pages that will allow administrators to log in and create, edit, publish, and delete XML articles. Administrators will also be able to create, edit, and delete other administrators.
The Login Page
The login page is very simple. It involves a simple HTML form that allows administrators to enter a username and password. The PHP logic on this page needs to check the entered values against a list of administrators. If we had enough time, I’d walk you through the building of an admin.xml file that holds these values. But for now, we’ll take the shortcut of embedding values in our PHP.
Here is the code for the login.php page:
<?php
session_start();
?>
<html>
<title>Please Log In</title>
<body>
<form name="login" method="post" action="verify.php">
<table width="290" border="0" align="center" cellpadding="4" cellspacing="1">
<tr>
<td colspan="2"><div align="center">Please log in</div>
</td>
</tr>
<tr>
<td width="99" bgcolor="#CCCCCC"> <div align="right">login</div></td>
<td width="181" bgcolor="#CCCCCC"> <div align="left">
<input name="username" type="text" id="username">
</div></td>
</tr>
<tr>
<td bgcolor="#CCCCCC"> <div align="right">password</div></td>
<td bgcolor="#CCCCCC"> <div align="left">
<input name="password" type="password" id="password">
</div></td>
</tr>
<tr>
<td colspan="2"><div align="center">
<input type="submit" name="Submit" value="Submit">
<input name="reset" type="reset" id="reset" value="Reset">
</div></td>
</tr>
<tr>
<td colspan=2 align=center>
<?php if (isset($_SESSION["error"])) { echo $_SESSION["error"]; } ?>
</td>
</tr>
</table>
</form>
</body>
</html>
Notice that the form’s action is set to a page called verify.php. The verify.php page is extremely simple. All it does is check that the passed-in values for username and password match the stored username/password values.
If there’s a match for both, PHP sets a session variable and redirects the user to the admin page. If not, PHP sends the user back to the login.php page, and a special session variable containing an error message is displayed. Here is the code for the verify.php page:
<?php
session_start();
$user = 'tom';
$passw = 'test';
if (($_POST["username"] == $user) and ($_POST["password"] == $passw)){
$_SESSION["login"] = "true";
header("Location:adminindex.php");
exit;
} else {
$_SESSION["error"] = "<font color=red>Wrong username or password. Try again.</font>";
header("Location:login.php");
exit;
}
?>
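If you later move the administrator list out of the script, the check might look something like the sketch below. Everything here is an assumption on my part: the checkCredentials() helper and the $admins array are hypothetical, and password_hash()/password_verify() require PHP 5.5 or later, which is much newer than the PHP 4 setup this tutorial targets.

```php
<?php
// Hypothetical helper: $admins maps usernames to password_hash() output.
function checkCredentials(array $admins, $username, $password)
{
    if (!is_string($username) || !is_string($password) || !isset($admins[$username])) {
        return false;
    }
    return password_verify($password, $admins[$username]);
}

// Example data only; in practice each hash would be generated once and stored.
$admins = array('tom' => password_hash('test', PASSWORD_DEFAULT));

var_dump(checkCredentials($admins, 'tom', 'test'));   // bool(true)
var_dump(checkCredentials($admins, 'tom', 'wrong'));  // bool(false)
```

Storing hashes rather than plain passwords means that anyone who reads the script (or a future admin.xml file) does not learn the passwords themselves.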
Because anyone can enter a URL for the admin pages, we have to add an extra piece of security. At the top of each page, we need to check to see if the value of the session variable “login” is set to “true.” If it isn’t, send folks back to the login.php page; if it is, show them the admin page.
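Since that check is repeated at the top of every admin page, it can be convenient to wrap it in one small function. The following is only a sketch; the isLoggedIn() name is hypothetical and does not appear in the downloadable code.

```php
<?php
// Hypothetical helper: decides whether a session array represents a
// logged-in administrator, mirroring the "login" == "true" convention.
function isLoggedIn(array $session)
{
    return isset($session['login']) && $session['login'] === 'true';
}

// At the top of an admin page, the check might be used like this:
// session_start();
// if (!isLoggedIn($_SESSION)) {
//     $_SESSION['error'] = 'Please log in first.';
//     header('Location: login.php');
//     exit;
// }
```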
The Admin Index Page
The first page of our admin tool is the adminindex.php. This page lists all XML articles currently displayed on the site, allowing you to edit or delete them and change their status (to publish them, for example). It also allows you to create new XML articles.
The code for this page is very simple and compact. We want to include a link to the createArticle.php page. We want to open up the xml/ directory (where we’ll store all the articles), pull out the names of each file and pass those names into the links to the editArticle.php page. We’ll do the same for our delArticle.php page.
Here’s the code for adminindex.php:
<?php
session_start();
if ($_SESSION["login"] != "true"){
header("Location:login.php");
$_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";
exit;
}
?>
<h1>Welcome to the Admin Index Page</h1>
<a href="createArticle.php">Create New XML Article</a><br><br>
<table border=0 cellspacing=0 cellpadding=3 width="85%">
<tr valign=top>
<td width="75%">
<table border=1 cellspacing=0 cellpadding=2>
<?php
$dh = opendir('./xml/');
while ($file = readdir($dh)){
if (eregi("^\.\.?$", $file)) {
continue;
}
echo "<tr valign=top><td width=\"80%\">";
echo "<a href=\"editArticle.php?file=".$file . "\">".$file . "</a></td>";
echo "<td width=\"20%\">";
echo "<a href=\"delArticle.php?file=" .$file . "\">delete</a>";
echo " </td></tr>";
}
?>
</table>
</td></tr></table>
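The directory loop above can also be written without regular expressions. The sketch below is one possible variation, not the tutorial's own code: listArticleFiles() and articleRow() are hypothetical names, and the filtering by ".xml" extension is an extra assumption. It also escapes the file name before placing it in a link.

```php
<?php
// Hypothetical helper: returns the names of the .xml files in $dir, sorted.
function listArticleFiles($dir)
{
    if (!is_dir($dir)) {
        return array();
    }
    $files = array();
    foreach (scandir($dir) as $name) {
        // Skips "." and ".." as well as anything that is not an .xml file.
        if (substr($name, -4) === '.xml' && is_file($dir . '/' . $name)) {
            $files[] = $name;
        }
    }
    return $files;
}

// Hypothetical helper: builds one table row per file, escaping the name
// for use in a URL and in HTML.
function articleRow($file)
{
    $link = 'editArticle.php?file=' . rawurlencode($file);
    return '<tr><td><a href="' . htmlspecialchars($link) . '">'
        . htmlspecialchars($file) . '</a></td></tr>';
}
```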
The Create Article Page
The createArticle.php page is very important – it allows the site administrator to create new XML articles on the site. It’s just a simple form that allows site administrators to enter pertinent information. Each of the form fields maps to the XML document structure we figured out earlier.
The form will also check to make sure that users enter an ID – without an Article ID, most of the site’s functionality won’t work.
Here’s the code for that page:
<?php
session_start();
if ($_SESSION["login"] != "true"){
header("Location:login.php");
$_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";
exit;
}
?>
<html>
<head>
<title>Create an XML Article</title>
<script>
function isReady(form){
if(form.id.value == "") {
alert("Please enter an ID!");
return false;
}
}
</script>
</head>
<body>
<h1>Create an XML Article</h1>
<a href="adminindex.php">Cancel</a><br><br>
<form name="createArticle" action="addArticle.php" method="post" onSubmit="return isReady(this)">
<table border=1 cellspacing=0 cellpadding=3>
<tr valign=top>
<td width="135">Article ID</td>
<td width="634"> <input name="id" type="text" id="id"> <br> <font size="-1">(no
spaces, must be unique)</font></td>
</tr>
<tr valign=top>
<td>Status</td>
<td>In Progress <input type="hidden" name="status" value="in progress"></td>
</tr>
<tr valign=top>
<td>Headline</td>
<td> <input name="headline" type="text" id="headline" size="60"></td>
</tr>
<tr valign=top>
<td>Author Name</td>
<td> <input name="name" type="text" id="name" size="30"></td>
</tr>
<tr valign=top>
<td>Author Email</td>
<td> <input name="email" type="text" id="email" size="30"></td>
</tr>
<tr valign=top>
<td>Keywords</td>
<td> <p>
<input name="keywords" type="text" id="keywords">
<font size="-1"><br>
</font><font size="-1">(separate keywords with commas)</font> </p></td>
</tr>
<tr valign=top>
<td>Abstract</td>
<td><textarea name="abstract" cols="50" rows="5" id="abstract"></textarea></td>
</tr>
<tr valign=top>
<td> <p>Article Body<br>
</p></td>
<td> <p>Intro paragraph:</p>
<p>
<textarea name="body[intro]" cols="70" rows="10" wrap="soft" id="body[intro]"></textarea>
</p>
<p>Main paragraph:</p>
<p>
<textarea name="body[main]" cols="70" rows="10" wrap="soft" id="body[main]" ></textarea>
</p>
<p> </p>
<p>Conclusion paragraph:</p>
<p>
<textarea name="body[conclusion]" cols="70" rows="10" wrap="soft"></textarea>
</p></td>
</tr>
<tr valign=top>
<td colspan=2> <div align="center">
<input type="submit" name="Add Article" value="Add Article">
<input name="reset" type="reset" id="reset" value="Reset">
</div></td>
</tr>
</table>
</form>
</body></html>
This form’s action is set to the addArticle.php page, which uses DOMXML functions to create an XML article from the information in the form. Because this is a little complex, I’ll go over the code section by section.
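One caveat before we start: the addArticle.php listing refers to the form values as plain variables such as $id and $headline, which only works when PHP's register_globals setting is on. The sketch below shows one way to read the same values explicitly from the submitted form and to repeat the ID check on the server. Both function names, and the rule about which characters an ID may contain, are my own assumptions for illustration.

```php
<?php
// Hypothetical helper: copies the expected form fields out of $post (normally
// $_POST), defaulting missing ones to '' and the body to an empty array.
function readArticleForm(array $post)
{
    $fields = array();
    foreach (array('id', 'status', 'headline', 'name', 'email', 'keywords', 'abstract') as $key) {
        $fields[$key] = isset($post[$key]) && is_string($post[$key]) ? trim($post[$key]) : '';
    }
    $fields['body'] = isset($post['body']) && is_array($post['body']) ? $post['body'] : array();
    return $fields;
}

// Hypothetical rule: an ID may contain only letters, digits, hyphens, and
// underscores, which also keeps it safe to use as a file name.
function isValidArticleId($id)
{
    return is_string($id) && preg_match('/\A[A-Za-z0-9_-]+\z/', $id) === 1;
}
```

The JavaScript check on the form only runs in the browser, so a server-side check like isValidArticleId() is what actually protects the xml/ directory.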
The first part of the code initializes our new XML file, setting the version and creating the root node, which is <article>.
<?php
//create document root
$doc = domxml_new_doc("1.0");
$root = $doc->create_element("article");
$root = $doc->append_child($root);
Next, we add an id attribute to the <article> node. First, however, we need to make sure that users have chosen a unique value, as the id will be used as the file name. We perform this check by looking at all the articles in the xml directory. If we find a filename that contains the id from the form (stored in the incoming $id variable), then we add a “-” and the number of seconds since the beginning of the UNIX epoch to our id. Although it’s not considered good form to change user input without a warning, this will do for now. Finally, we add the id attribute to the <article> node.
//add ID attribute
//FIRST, let's make sure that the id they chose isn't going to overwrite a file!
$dh = opendir('./xml/');
while ($file = readdir($dh)){
$string = $id . ".xml";
if (eregi("^\.\.?$", $file)) {
continue;
}
if (eregi($string, $file)){
$time = date("U"); //num of seconds since unix epoch
$id = $id . "-" . $time;
}
}
$root->set_attribute('id', $id);
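The same uniqueness rule can be expressed more compactly. This sketch swaps in an exact file name test with file_exists() in place of the pattern match used above, so it behaves slightly differently (an id of "howto" no longer collides with "xml-howto-1.xml"). The uniqueArticleId() name is hypothetical.

```php
<?php
// Hypothetical helper: returns $id unchanged if "$dir/$id.xml" is free,
// otherwise appends "-" plus the current Unix timestamp.
function uniqueArticleId($id, $dir)
{
    if (file_exists($dir . '/' . $id . '.xml')) {
        return $id . '-' . time();
    }
    return $id;
}
```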
Now that we’ve created the root, it’s time to create each of that node’s children in order. The first is <headline>. Notice that the <headline> node is a child of <article>, and that the headline text is a child of <headline>.
//create headline
$head = $doc->create_element("headline");
$head = $root->append_child($head);
$htext = $doc->create_text_node($headline);
$htext = $head->append_child($htext);
The same is true of the <author>, <email>, <abstract>, and <status> nodes:
//create author name
$aname = $doc->create_element("author");
$aname = $root->append_child($aname);
$atext = $doc->create_text_node($name);
$atext = $aname->append_child($atext);
//create author email
$mail = $doc->create_element("email");
$mail = $root->append_child($mail);
$mtext = $doc->create_text_node($email);
$mtext = $mail->append_child($mtext);
//create abstract
$abs = $doc->create_element("abstract");
$abs = $root->append_child($abs);
$abstext = $doc->create_text_node($abstract);
$abstext = $abs->append_child($abstext);
//create status, always in progress when first created
$stat = $doc->create_element("status");
$stat = $root->append_child($stat);
$stat_text = $doc->create_text_node($status);
$stat_text = $stat->append_child($stat_text);
Next come the keywords:
//create keyword listing
$keylisting = $doc->create_element("keywords");
$keylisting = $root->append_child($keylisting);
$ktext = $doc->create_text_node($keywords);
$ktext = $keylisting->append_child($ktext);
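You may have noticed that every node so far follows the same four-line pattern: create an element, append it, create a text node, append that. As a rough sketch of how the pattern could be factored out, here is a version written against the PHP 5+ DOM extension (an assumption; the tutorial's own code uses PHP 4 DOMXML). The appendTextElement() helper is hypothetical.

```php
<?php
// Hypothetical helper: appends <$name>$text</$name> to $parent and
// returns the newly created element.
function appendTextElement(DOMDocument $doc, DOMNode $parent, $name, $text)
{
    $element = $parent->appendChild($doc->createElement($name));
    $element->appendChild($doc->createTextNode($text));
    return $element;
}

$doc = new DOMDocument('1.0');
$root = $doc->appendChild($doc->createElement('article'));
$root->setAttribute('id', 'xml-howto-1');
appendTextElement($doc, $root, 'headline', 'Writing XML Articles');
appendTextElement($doc, $root, 'author', 'Joe Author');

echo $doc->saveXML();
```

Because the text goes in through a text node, characters such as & and < are escaped when the document is serialized.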
The paragraphs can be handled as an array, as they are being passed in as body[intro], body[main], and so on. Our PHP code is set up to create para tags using these passed-in keys; we’ll end up with tags named <para-intro>, <para-main>, and so on.
//create paras
if (is_array($body)){
foreach ($body as $K => $V){
if ($V != ""){
$para = $doc->create_element("para-$K");
$para = $root->append_child($para);
$ptext = $doc->create_text_node($V);
$ptext = $para->append_child($ptext);
//$para->set_attribute('order', $K);
}
}
}
Finally, we can write this entire XML tree to a file, using id as a filename, and send the user back to adminindex.php, where they should see the file they just created added to the list of XML articles.
//write to the file
$filename = "./xml/".$id . ".xml";
$doc->dump_file($filename, false, true);
//send user back to adminindex
header("Location:adminindex.php");
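For comparison, here is how the paragraph loop and the final write might look with the PHP 5+ DOM extension. This is a sketch under that assumption, not a drop-in replacement: appendParagraphs() is a hypothetical helper, and the temporary file path is used purely for illustration.

```php
<?php
// Hypothetical helper: turns array('intro' => '...', 'main' => '...') into
// <para-intro>, <para-main>, ... children of $root, skipping empty entries.
function appendParagraphs(DOMDocument $doc, DOMNode $root, array $body)
{
    foreach ($body as $key => $text) {
        if ($text === '') {
            continue;
        }
        $para = $root->appendChild($doc->createElement('para-' . $key));
        $para->appendChild($doc->createTextNode($text));
    }
}

$doc = new DOMDocument('1.0');
$doc->formatOutput = true;   // indent the serialized XML for readability
$root = $doc->appendChild($doc->createElement('article'));
appendParagraphs($doc, $root, array(
    'intro' => 'Intro text.',
    'main' => '',
    'conclusion' => 'The end.',
));

// save() writes the tree to disk; this path is for illustration only.
$filename = sys_get_temp_dir() . '/example-article.xml';
$doc->save($filename);
```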
Editing an XML Article
Generally speaking, editing an XML article is very much the same as creating an article, except that you have to load an existing article’s nodes into the edit form, then write your changes out to the file.
In this example, we’re not going to allow any changes to the article’s id attribute, as this would make it very difficult for the rest of the application to function properly. Restricting changes to the id attribute also allows us to perform an easy short cut when updating an XML file – all we have to do is delete the old file and create a new file with our new information. This is crude, but fast and effective.
Here is the code for the editArticle.php page. Note that the form’s action is set to updateArticle.php. Also note the use of the extractText() function to extract content from an XML object and place it into a variable that can then be assigned to a form element.
<?php
session_start();
function extractText($array){
if(count($array) <= 1){
//we only have one tag to process!
for ($i = 0; $i<count($array); $i++){
$node = $array[$i];
$value = $node->get_content();
}
return $value;
}
}
if ($_SESSION["login"] != "true"){
header("Location:login.php");
$_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";
exit;
}
//pull in the XML file
if ($file == ""){
echo "<h2>You didn't choose a file to edit!</h2>";
echo "<a href=\"adminindex.php\">Go back to index and choose a file</a>";
exit;
} else {
$filename = "./xml/".$file;
$xml = domxml_open_file($filename);
$root = $xml->root();
$id = $root->get_attribute("id");
$h_array = $root->get_elements_by_tagname("headline");
$headline = extractText($h_array);
$stat_array = $root->get_elements_by_tagname("status");
$status = extractText($stat_array);
$a_array = $root->get_elements_by_tagname("author");
$author = extractText($a_array);
$e_array = $root->get_elements_by_tagname("email");
$email = extractText($e_array);
$ab_array = $root->get_elements_by_tagname("abstract");
$abstract = extractText($ab_array);
$kl_array = $root->get_elements_by_tagname("keywords");
$keywords = extractText($kl_array);
$lead_array = $root->get_elements_by_tagname("para-intro");
$plead = extractText($lead_array);
$second_array = $root->get_elements_by_tagname("para-main");
$pmain = extractText($second_array);
$con_array = $root->get_elements_by_tagname("para-conclusion");
$pcon = extractText($con_array);
$statusList = array("live","in progress");
?>
<html>
<title>Edit an XML Article</title>
<body>
<h1>Edit an XML Article</h1>
<a href="adminindex.php">Cancel</a><br><br>
<form name="createArticle" action="updateArticle.php" method="post">
<table border=1 cellspacing=0 cellpadding=3>
<tr valign=top>
<td width="135">Article ID</td>
<td width="634"> <?php echo htmlspecialchars($id); ?> <input type="hidden" name="id" value="<?php echo $id; ?>">
</td>
</tr>
<tr valign=top>
<td>Status</td>
<td>
<select name="status">
<?php
foreach ($statusList as $stat){
if($stat == $status){
echo "<option value=\"".$stat."\" selected>$stat</option>";
} else {
echo "<option value=\"".$stat."\">$stat</option>";
}
}
?>
</select>
</td>
</tr>
<tr valign=top>
<td>Headline</td>
<td> <input name="headline" type="text" id="headline" value="<?php echo htmlspecialchars($headline); ?>" size="60"></td>
</tr>
<tr valign=top>
<td>Author Name</td>
<td> <input name="name" type="text" id="name" value="<?php echo htmlspecialchars($author); ?>" size="30"></td>
</tr>
<tr valign=top>
<td>Author Email</td>
<td> <input name="email" type="text" id="email" value="<?php echo htmlspecialchars($email); ?>" size="30"></td>
</tr>
<tr valign=top>
<td>Keywords</td>
<td> <input name="keywords" type="text" value="<?php echo htmlspecialchars($keywords); ?>">
<br> <font size="-1">(separate keywords with commas)</font> </td>
</tr>
<tr valign=top>
<td>Abstract</td>
<td><textarea name="abstract" cols="50" rows="5" id="abstract"><?php echo htmlspecialchars($abstract); ?></textarea></td>
</tr>
<tr valign=top>
<td> <p>Article Body<br>
</p></td>
<td> <p>Intro paragraph:</p>
<p>
<textarea name="body[intro]" cols="70" rows="10" wrap="soft" id="body[intro]"><?php echo htmlspecialchars($plead); ?></textarea>
</p>
<p>Main paragraph:</p>
<p>
<textarea name="body[main]" cols="70" rows="10" wrap="soft" id="body[main]" ><?php echo htmlspecialchars($pmain); ?></textarea>
</p>
<p>Conclusion paragraph:</p>
<p>
<textarea name="body[conclusion]" cols="70" rows="10" wrap="soft"><?php echo htmlspecialchars($pcon); ?></textarea>
</p></td>
</tr>
<tr valign=top>
<td colspan=2> <div align="center">
<input type="submit" name="Add Article" value="Add Article">
<input name="reset" type="reset" id="reset" value="Reset">
</div></td>
</tr>
</table>
</form>
</body>
</html>
<?php
}//end if-else
?>
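The extractText() function has a close counterpart in the PHP 5+ DOM extension. The sketch below assumes that newer extension rather than DOMXML, and firstText() is a hypothetical name. Unlike extractText(), it returns an empty string when the tag is missing and the first match when there are several.

```php
<?php
// Hypothetical helper: text of the first <$tag> under $root, or '' if absent.
function firstText(DOMElement $root, $tag)
{
    $nodes = $root->getElementsByTagName($tag);
    return $nodes->length > 0 ? $nodes->item(0)->textContent : '';
}

$doc = new DOMDocument();
$doc->loadXML('<article id="a1"><headline>Hello</headline><status>live</status></article>');
$root = $doc->documentElement;

echo firstText($root, 'headline') . ' [' . firstText($root, 'status') . "]\n";
// prints "Hello [live]"
```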
The updateArticle.php page is very similar to the addArticle.php page, but there’s no need to check for a unique id attribute at the front. Also, at the end, the PHP code will delete the existing file and then dump the new XML information structure into a newly created file name with the same name. This saves us a lot of time trying to insert edits into the proper nodes.
<?php
//create document root
$doc = domxml_new_doc("1.0");
$root = $doc->create_element("article");
$root = $doc->append_child($root);
//add ID attribute
$root->set_attribute('id', $id);
//create headline
$head = $doc->create_element("headline");
$head = $root->append_child($head);
$htext = $doc->create_text_node($headline);
$htext = $head->append_child($htext);
//create author name
$aname = $doc->create_element("author");
$aname = $root->append_child($aname);
$atext = $doc->create_text_node($name);
$atext = $aname->append_child($atext);
//create author email
$mail = $doc->create_element("email");
$mail = $root->append_child($mail);
$mtext = $doc->create_text_node($email);
$mtext = $mail->append_child($mtext);
//create abstract
$abs = $doc->create_element("abstract");
$abs = $root->append_child($abs);
$abstext = $doc->create_text_node($abstract);
$abstext = $abs->append_child($abstext);
//create keyword listing
$keylisting = $doc->create_element("keywords");
$keylisting = $root->append_child($keylisting);
$ktext = $doc->create_text_node($keywords);
$ktext = $keylisting->append_child($ktext);
//create status (taken from the edit form, so it may now be "live")
$stat = $doc->create_element("status");
$stat = $root->append_child($stat);
$stat_text = $doc->create_text_node($status);
$stat_text = $stat->append_child($stat_text);
//create paras
if (is_array($body)){
foreach ($body as $K => $V){
if ($V != ""){
$V = stripslashes($V);
$para = $doc->create_element("para-$K");
$para = $root->append_child($para);
$ptext = $doc->create_text_node($V);
$ptext = $para->append_child($ptext);
}
}
}
//write to the file (first delete existing one!)
$filename = "./xml/".$id . ".xml";
unlink($filename);
$doc->dump_file($filename, false, true);
//send user back to adminindex
header("Location:adminindex.php");
?>
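Deleting the old file before writing the new one leaves a short window in which the article does not exist, and a failed write would lose it entirely. One common alternative, sketched here as a swapped-in technique rather than the tutorial's method, is to write to a temporary file and then rename it over the original. The replaceFile() name and the ".tmp" suffix are assumptions.

```php
<?php
// Hypothetical helper: writes $xml to a temporary file next to $filename,
// then renames it over $filename so a failed write leaves the old file alone.
function replaceFile($filename, $xml)
{
    $tmp = $filename . '.tmp';
    if (file_put_contents($tmp, $xml) === false) {
        return false;
    }
    return rename($tmp, $filename);
}
```

With this approach the call to unlink() is no longer needed, because rename() replaces the existing file.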
Deleting an XML Article
Deleting an XML file is very simple. All you have to do is pass a filename to the delArticle.php page, unlink the file, and send the user back to the adminindex.php page:
<?php
session_start();
if ($_SESSION["login"] != "true"){
header("Location:login.php");
$_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";
exit;
}
$dir = "./xml/";
$filetoburn = $dir . basename($file); //basename() keeps the path inside xml/
unlink($filetoburn);
header("Location: adminindex.php");
?>
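Because the file name arrives in the URL, it is worth checking it before calling unlink(). Here is a minimal sketch of such a check; articlePath() is a hypothetical helper, and the rule that only plain ".xml" names are accepted is my own assumption.

```php
<?php
// Hypothetical helper: maps a user-supplied file name to a path inside $dir,
// or returns null when the name is not a plain .xml file name.
function articlePath($dir, $file)
{
    $name = basename($file);   // drops any directory components
    if ($name !== $file || substr($name, -4) !== '.xml') {
        return null;
    }
    return rtrim($dir, '/') . '/' . $name;
}

// Possible use in delArticle.php:
// $path = articlePath('./xml/', $_GET['file']);
// if ($path !== null && is_file($path)) {
//     unlink($path);
// }
```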
At this point, we’ve just completed the article management portion of our XML-powered CMS. We’ve built a login page, an administrative index, and pages for adding, editing, and deleting articles.
In the next part of our article, we’ll build the display side of the Website so that visitors can read and search for articles.
Building the Display Side
Now that we’ve defined the XML article structure and built a very simple, secure administration tool to help us create, edit, delete, and publish files, it’s time to build that part of the site that displays articles for site visitors.
Let’s recap our display-side requirements:
- Site visitors will be able to view “live” articles listed by headline. They will also be able to perform a search on headline and keywords.
- The site itself will consist of the following pages:
- A home page that lists five articles published on the site and a search function. Furthermore, site visitors can click a link to show all articles.
- An article detail page that displays one article at a time.
- A search result page that will list all articles by an author, keyword, or string entered into the search engine.
The Home Page
The home page, index.php, largely repeats what we built on the admin side. It loops through the articles in the xml/ directory, opens each file, and reads its status, headline, and abstract. If the status is anything other than “live,” the loop skips to the next article.
The result is a list of article headlines and abstracts displayed to the home page. Each link leads the site visitor to the showArticle.php page.
Here’s the code (note that I’ve included a search widget also):
<?php
function extractText($array){
if(count($array) <= 1){
//we only have one tag to process!
for ($i = 0; $i<count($array); $i++){
$node = $array[$i];
$value = $node->get_content();
}
return $value;
}
}
?>
<html>
<head>
<title>Welcome to XMLTEST</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<h1>Welcome to the XMLTEST site</h1>
<p><br>
<a href="adminindex.php">Admin Login</a> </p>
<form name="search" method="post" action="searchArticles.php">
Search articles:
<input name="search" type="text" id="search">
<input name="Search" type="submit" id="Search" value="Search">
</form>
<p>The following articles are available: </p>
<table border=1 cellspacing=0 cellpadding=2 width=500>
<?php
$dh = opendir('./xml/');
$fileCount = 0;
while ($file = readdir($dh) and $fileCount < 5){
if (eregi("^\.\.?$", $file)) {
continue;
}
$open = "xml/".$file;
$xml = domxml_open_file($open);
//we need to pull out all the things from this file that we will need to
//build our links
$root = $xml->root();
$stat_array = $root->get_elements_by_tagname("status");
$status = extractText($stat_array);
$ab_array = $root->get_elements_by_tagname("abstract");
$abstract = extractText($ab_array);
$h_array = $root->get_elements_by_tagname("headline");
$headline = extractText($h_array);
if ($status != "live"){
continue;
}
echo "<tr valign=top><td>";
echo "<a href=\"showArticle.php?file=".$file . "\">".$headline . "</a><br>";
echo $abstract;
echo "</td></tr>";
$fileCount++;
}
?>
</table>
<br><a href="adminindex.php">Admin Login</a>
</body>
</html>
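The selection logic on the home page (skip anything that is not live, stop after five) can be pulled out into a function that returns plain arrays, which makes it easier to reuse. This sketch assumes the PHP 5.3+ DOM extension and closures rather than DOMXML, and liveArticles() is a hypothetical name.

```php
<?php
// Hypothetical helper: returns up to $limit live articles from $dir as
// arrays with 'file', 'headline', and 'abstract' keys.
function liveArticles($dir, $limit)
{
    $articles = array();
    $paths = glob(rtrim($dir, '/') . '/*.xml') ?: array();
    foreach ($paths as $path) {
        if (count($articles) >= $limit) {
            break;
        }
        $doc = new DOMDocument();
        if (!@$doc->load($path)) {
            continue;   // skip files that are not well-formed XML
        }
        // Reads the text of the first matching tag, or '' if there is none.
        $field = function ($tag) use ($doc) {
            $nodes = $doc->getElementsByTagName($tag);
            return $nodes->length > 0 ? $nodes->item(0)->textContent : '';
        };
        if ($field('status') !== 'live') {
            continue;
        }
        $articles[] = array(
            'file' => basename($path),
            'headline' => $field('headline'),
            'abstract' => $field('abstract'),
        );
    }
    return $articles;
}
```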
Show Article Page
The showArticle.php page is where individual articles are displayed.
Everything you learned about pulling text out of an XML structure applies here, too. Basically, you open the XML article based on the filename that you pass to the page. Then you pull out each appropriate node that you need and store them as variables. In the case of our paragraphs, we’ll store them in an array so we can print them out with a simple foreach loop. Then we print the variables to the screen.
Here’s the code:
<?php
session_start();
function extractText($array){
if(count($array) <= 1){
//we only have one tag to process!
for ($i = 0; $i<count($array); $i++){
$node = $array[$i];
$value = $node->get_content();
}
return $value;
}
}
//pull in the XML file
if ($file == ""){
echo "<h2>You didn't choose a file to view!</h2>";
echo "<a href=\"index.php\">Go back to index and choose a file</a>";
exit;
} else {
$open = "./xml/" . $file;
$xml = domxml_open_file($open);
$root = $xml->root();
$id = $root->get_attribute("id");
$h_array = $root->get_elements_by_tagname("headline");
$headline = extractText($h_array);
$stat_array = $root->get_elements_by_tagname("status");
$status = extractText($stat_array);
$a_array = $root->get_elements_by_tagname("author");
$author = extractText($a_array);
$e_array = $root->get_elements_by_tagname("email");
$email = extractText($e_array);
$ab_array = $root->get_elements_by_tagname("abstract");
$abstract = extractText($ab_array);
$kl_array = $root->get_elements_by_tagname("keywords");
$keywords = extractText($kl_array);
$lead_array = $root->get_elements_by_tagname("para-intro");
$para["intro"] = extractText($lead_array);
$second_array = $root->get_elements_by_tagname("para-main");
$para["main"] = extractText($second_array);
$con_array = $root->get_elements_by_tagname("para-conclusion");
$para["con"] = extractText($con_array);
?>
<html>
<head>
<title><?php echo $headline; ?></title>
<meta name="keywords" content="<?php echo $keywords; ?>">
<meta name="description" content="<?php echo $abstract; ?>">
</head>
<body>
<h1><?php echo $headline; ?></h1>
<a href="index.php">back to main</a>
<p>by <a href="mailto:<?php echo $email; ?>"><?php echo $author; ?></a></p>
<p><small><?php echo $abstract; ?></small></p>
<?php
foreach ($para as $k => $v){
echo "<p>".$v."</p>\n";
}
?>
</body>
</html>
<?php
}
?>
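The listing above prints article text straight into the page, so a stray < or & in an article would be treated as HTML. If you would rather show the text literally, something like the following sketch could replace the foreach loop. The renderParagraphs() name is hypothetical, and escaping everything is my assumption about the desired behavior; it would also disable any intentional markup inside articles.

```php
<?php
// Hypothetical helper: wraps each non-empty paragraph in <p> tags,
// escaping characters that would otherwise be treated as HTML.
function renderParagraphs(array $paragraphs)
{
    $html = '';
    foreach ($paragraphs as $text) {
        if ($text === '' || $text === null) {
            continue;
        }
        $html .= '<p>' . htmlspecialchars($text, ENT_QUOTES) . "</p>\n";
    }
    return $html;
}

echo renderParagraphs(array('intro' => 'Tags like <b> & friends.', 'main' => ''));
// prints "<p>Tags like &lt;b&gt; &amp; friends.</p>"
```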
Searching Through Articles
Searching through an XML article archive is a lot like searching through a database table, except that with XML, you need to open each file and compare the search term with each of the nodes you want to search on, collect pertinent data (like filename, headline, and abstract) and then display these on a page.
In the case of our example, I’m going to use the quick and easy route: open every file in order, skip any file that isn’t live, and, for each remaining file whose headline or keywords match the search term, store the headline, abstract, and filename in a multidimensional array.
When I send the results to the screen, I make sure that I count the number of records in this array. If the count is 0, then I display a message saying that no files matched the search term. Otherwise, I display a linked headline and abstract for each article I find.
This search algorithm is the result of three minutes of thinking and about ten minutes of implementing. It is fast and dirty, and works like a charm if you have a small number of articles. In future articles, we’ll look at more robust methods for searching XML files.
Here’s the code:
<?php
session_start();
$results = array();
//this is a very simple, potentially very slow search
function extractText($array){
if(count($array) <= 1){
//we only have one tag to process!
for ($i = 0; $i<count($array); $i++){
$node = $array[$i];
$value = $node->get_content();
}
return $value;
}
}
$dh = opendir('./xml/');
while ($file = readdir($dh)){
if (eregi("^\.\.?$", $file)) {
continue;
}
$open = "./xml/".$file;
$xml = domxml_open_file($open);
//we need to pull out all the things from this file that we will need to
//build our links
$root = $xml->root();
$stat_array = $root->get_elements_by_tagname("status");
$status = extractText($stat_array);
$k_array = $root->get_elements_by_tagname("keywords");
$keywords = extractText($k_array);
$h_array = $root->get_elements_by_tagname("headline");
$headline = extractText($h_array);
$ab_array = $root->get_elements_by_tagname("abstract");
$abstract = extractText($ab_array);
if ($status != "live"){
continue;
}
if (eregi($searchTerm, $keywords) or eregi($searchTerm,$headline)){
$list['abstract'] = $abstract;
$list['headline'] = $headline;
$list['file'] = $file;
$results[] = $list;
}
}
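//each file is read only once, so $results cannot contain duplicates.
//(array_unique() compares elements as strings, so it would collapse these nested arrays into a single result)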
?>
<h1>Search Results</h1>
<a href="index.php">back to main</a>
<p>You searched for: <i><?php echo $searchTerm ?></i></p>
<hr>
<?php
if (count($results)>0){
echo "<p>Your search results:</p>";
foreach ($results as $key => $listing){
echo "<br><a href=\"showArticle.php?file=".$listing["file"]."\">" . $listing["headline"]."</a>\n";
echo "<br>". $listing["abstract"];
echo "<br>";
}
} else {
echo "<p>Sorry, no articles matched your search term.</p>";
}
?>
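The search above uses domxml_open_file() and eregi(), neither of which is available in current PHP (DOMXML is PHP 4 only, and eregi() was removed in PHP 7). As a minimal sketch of the same open-every-file approach on a newer installation, the version below uses DOMDocument and stripos(). The names tagText() and searchArticles() are hypothetical, and I'm assuming the articles live in files ending in .xml. One behavioral difference: stripos() treats the search term as literal text, where eregi() treated it as a regular expression.

```php
<?php
// Sketch only: tagText() and searchArticles() are hypothetical names,
// not functions from the tutorial's download.

// Return the trimmed text of the first element named $tag, or "" if there is none.
function tagText(DOMDocument $doc, string $tag): string
{
    $nodes = $doc->getElementsByTagName($tag);
    return $nodes->length > 0 ? trim($nodes->item(0)->textContent) : "";
}

// Scan every .xml file in $dir and return the live articles whose
// headline or keywords contain $term (case-insensitive, literal match).
function searchArticles(string $dir, string $term): array
{
    $results = array();
    if ($term === "") {
        return $results; // nothing to search for
    }
    $files = glob(rtrim($dir, "/") . "/*.xml");
    foreach ($files ?: array() as $path) {
        $doc = new DOMDocument();
        if (!@$doc->load($path)) {
            continue; // skip files that are not well-formed XML
        }
        if (tagText($doc, "status") !== "live") {
            continue; // only published articles are searchable
        }
        $headline = tagText($doc, "headline");
        $keywords = tagText($doc, "keywords");
        if (stripos($headline, $term) !== false || stripos($keywords, $term) !== false) {
            $results[] = array(
                "file"     => basename($path),
                "headline" => $headline,
                "abstract" => tagText($doc, "abstract"),
            );
        }
    }
    return $results;
}
```

The results page would then loop over searchArticles("./xml", $searchTerm) exactly as the listing above loops over $results, ideally passing the headline and abstract through htmlspecialchars() and the file name through urlencode() before printing them. Like the original, this still parses every file on every search, so it only suits a small archive.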
Conclusion
That’s it! Well done — we’ve covered a lot of territory. We learned:
- Some rudimentary tools for developing XML structures
- How to build a very simple and secure admin tool to take care of basic site functions
- How to build a home page, article view page, and a simple search engine
In each case, there’s room for improvement, refinement, and robustness. The key is, while working with the tools, you were exposed to some of the basic concepts of XML data structures, application workflow, and simple CMS design.
Frequently Asked Questions about XML in Content Management Systems
What is the role of XML in a Content Management System (CMS)?
XML plays a crucial role in a CMS by providing a standard way to store, transport, and organize data. It allows for the separation of content from its presentation, making it easier to manage and update. XML also enables interoperability between different systems and platforms, making it a versatile choice for content management.
How does XML improve the efficiency of a CMS?
XML improves the efficiency of a CMS by enabling the reuse of content across multiple platforms and formats. This means that you can create content once and publish it in various places without having to recreate it each time. This not only saves time but also ensures consistency across all platforms.
What are the benefits of using an XML-based CMS?
An XML-based CMS offers several benefits. It provides a high level of flexibility and customization, allowing you to tailor the system to your specific needs. It also supports multilingual content, making it an excellent choice for businesses operating in multiple countries. Additionally, XML-based CMSs are typically more secure and reliable than other types of CMSs.
How does XML support multilingual content in a CMS?
XML supports multilingual content in a CMS by allowing for the use of different character sets and languages. This means that you can create content in any language and have it displayed correctly on your website. XML also supports right-to-left languages, making it a versatile choice for international businesses.
What is the difference between an XML-based CMS and a traditional CMS?
The main difference between an XML-based CMS and a traditional CMS is how they handle content. A traditional CMS typically stores content in a database, while an XML-based CMS stores content in XML files. This allows for greater flexibility and control over your content, as well as improved performance and scalability.
How does an XML-based CMS improve website performance?
An XML-based CMS can improve performance for simple page views: because content is read directly from XML files, the server doesn’t have to run database queries to retrieve it. The gain is not automatic, though. Operations that touch many files, such as the search in this tutorial, open and parse every file on each request and slow down as the archive grows.
Is an XML-based CMS suitable for large-scale websites?
It can be, but not without extra work. The file-based approach shown in this tutorial works well for a small number of articles; a large site would typically need indexing or caching so that listing and searching articles doesn’t require opening every XML file.
How secure is an XML-based CMS?
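An XML-based CMS doesn’t rely on a database, so it avoids database-specific attacks such as SQL injection, but it is not secure by default. File names and search terms taken from user input still need to be validated, and any content written into a page should be escaped. XML files can also be encrypted to further enhance security.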
Can I customize an XML-based CMS to fit my needs?
Yes, an XML-based CMS is highly customizable. You can modify the structure of your XML files to fit your specific content needs. You can also use XSLT (Extensible Stylesheet Language Transformations) to control how your content is presented on your website.
What skills do I need to use an XML-based CMS?
To use an XML-based CMS, you need to have a basic understanding of XML and how it works. You should also be familiar with XSLT, as it is often used to customize the presentation of content. Some knowledge of PHP may also be helpful, as many XML-based CMSs use PHP for server-side processing.
Tom is the founder of Triple Dog Dare Media, an Austin, TX-based professional services consultancy that specializes in designing, building, and deploying ecommerce, database, and XML systems. He's spent the last 7 years working in various areas of XML development, including XML document analysis, DTD creation and validation, XML-based taxonomies, and XML-powered content and knowledge management systems.