Build an XML-Based Content Management System with PHP

Like most developers, you’ve probably heard a great deal about XML and content management systems. It’s likely, however, that you’ve only been exposed to theoretical discussions that haven’t been grounded in practical knowledge.

This step-by-step tutorial will get you up and running with a very basic XML-based content management system (or CMS). I don’t have the space here to get into a very complex example, but with any luck, the concepts and ideas presented here will provide you with the necessary springboard for your own exploration.

Some notes before we get started. I’m using DOMXML functions and sessions to make this application work, so you’ll need to use PHP 4.2.1 or higher, and turn on DOM support.

Also, don’t forget to download all the code for this tutorial. It’ll come in handy!

A Short Introduction to XML and CMSs

Let me first give you a little background on CMSs and XML. I assume that you’ve read or heard about both of these technologies elsewhere, so I’ll keep this discussion brief.

XML stands for eXtensible Markup Language, and is a subset of SGML (Standard Generalized Markup Language). XML is very much like HTML, except that in XML you can define your own tags. This ability to produce custom documents comes in very handy when you need to track certain types of data very closely, particularly in the worlds of publishing and ecommerce.

For example, for any given article you publish in an online magazine, you can create tags for author’s name, byline (if it’s different from the author’s name), word length, date of publication, title or headline, story body, keywords, and so on. As you’ll see later in this article, breaking your article down into these XML tags or nodes allows the CMS to do useful things with all the articles it holds.

Essentially, XML allows you to make a mini-database out of each document, without the overhead and expense that many databases bring to a Web project.

A CMS is used to create, publish, and maintain content on a Website. It usually consists of the following pieces:

  • A data backend (XML or database tables) that contains all of your articles, news stories, images, and other content.
  • A data display component, usually templates or other pages, onto which your articles, images, etc. are “painted” by the CMS for site visitors.
  • A data administration component, usually easy-to-use HTML forms that allow site administrators to create, edit, publish, and delete articles in some kind of secure workflow. The data administration portion of a CMS is usually the most complicated, and where you’ll likely spend most of your development time.

Over the past decade, different scripting languages have been used to create CMSs, including Perl/CGI, ASP, TCL, JSP, Python, and PHP. Each of these languages has its own pros and cons, but I’m going to focus on using PHP with XML to build a simple content management system.

Requirements

Building any kind of CMS, whether database- or XML-backed, involves the gathering of information that defines the basic requirements for the project. Although many developers groan at the thought of this kind of exercise, a set of well-defined requirements can make your life a lot easier.

Because this is a fairly simple project, and because you’re going to do it for yourself, a simple requirements list will do.

What kind of requirements do we need to gather? Essentially, requirements fall into three major categories:

  1. What kind of content will the CMS handle? Furthermore, how is each type of content broken down? (The more complete your understanding of this issue, the easier it’ll be to create your XML files.)

  2. Who will be visiting the site, and what behaviors do these users expect to find? (For example, will they want to browse a hierarchical list of articles, search for articles by keyword, and see links of related articles?)

  3. What do the site administrators need to do? (For example, log in securely, create content, edit content, publish content, and delete content. If your content management system will have roles for administrative users – such as site admin, editor, writer – then your system becomes more complex).

In the interests of keeping this article from becoming complicated, I will choose very basic requirements for my simple XML-based CMS:

  • The CMS will handle the management of articles only. Each article will have a:
    • unique ID
    • headline
    • author name
    • author email
    • abstract
    • article body that can contain paragraphs and one level of subheading
    • status (either “in progress” or “live”)
    • keyword listing

  • Site visitors will be able to view “live” articles listed by author name. They will also be able to perform a search on headline and keywords.

  • The site itself will consist of the following pages:

    • A home page that lists five articles published on the site and a search function.
    • An article detail page that displays one article at a time.
    • A search result page that will list all articles by an author, keyword, or string entered into the search engine.

  • Site administrators will get a secure login, a way to add more administrators, and easy screens from which to add new XML files, edit existing files, publish files, and delete files.

Defining the XML Files

Whenever I build a CMS, I try to define the data backend first, because I find that all the other elements cascade from there. In this case, our data backend is an XML file repository, so we need to define how our files should be structured.

XML files are made up of nested start and end tags, each of which defines some chunk of information. XML files must also contain a “root” start and end tag that includes all the other tags.

Because we are only going to be dealing with articles in this example, our “root” start and end tag should be:

<article>  
 
</article>

All other tags that we identified during our discovery phase must go in between these two tags. Based on that list, our article files will likely be structured like this:

<?xml version="1.0"?>  
<article id="xml-howto-1">  
<headline>Writing XML Articles</headline>  
<status>in progress</status>  
<author>Joe Author</author>  
<email>jauthor@example.com</email>  
<abstract>A short article about writing XML articles.</abstract>  
<keywords>XML,articles,how to</keywords>  
<para-intro>Intro paragraph here.</para-intro>  
<para-main>Main paragraph.</para-main>  
<para-conclusion>Conclusion paragraph.</para-conclusion>  
</article>

Several things to note about our article example:

  • Usually, you would create a DTD or Schema to define how an article would look. Creating effective DTDs or Schemas is an entire tutorial unto itself, so here, I used a shortcut method involving a sample case. This is faster than developing a schema, but be aware that you may run into problems because your sample case may be too simple. Also, if you want to validate your XML document, you will need to create a DTD.
  • Did you notice the “id=” part in the article tag? This is called an attribute. We’ll talk more later about why it’s important to have a unique id attribute for each article we create in the system.
  • Because we want to keep this example simple, I’m going to assume that our articles will comprise only three paragraphs each, and the forms we build later on will accommodate this structure. In future tutorials, we will build a more dynamic structure in which we nest the paragraph tags into a <body> tag.
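
For readers who do want validation, a DTD for the sample article might look something like the sketch below. It is an illustration only: the DTD is embedded in the document itself, the content model (element order, optional paragraphs) is an assumption based on the sample above, and the check uses the DOMDocument class from PHP 5 and later rather than the DOMXML functions used in the rest of this tutorial.

```php
<?php
// Sample article with an inline DTD (internal subset).
// The content model is an assumption drawn from the sample article.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE article [
  <!ELEMENT article (headline, status, author, email, abstract, keywords,
                     para-intro?, para-main?, para-conclusion?)>
  <!ATTLIST article id CDATA #REQUIRED>
  <!ELEMENT headline (#PCDATA)>
  <!ELEMENT status (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT email (#PCDATA)>
  <!ELEMENT abstract (#PCDATA)>
  <!ELEMENT keywords (#PCDATA)>
  <!ELEMENT para-intro (#PCDATA)>
  <!ELEMENT para-main (#PCDATA)>
  <!ELEMENT para-conclusion (#PCDATA)>
]>
<article id="xml-howto-1">
<headline>Writing XML Articles</headline>
<status>in progress</status>
<author>Joe Author</author>
<email>jauthor@example.com</email>
<abstract>A short article about writing XML articles.</abstract>
<keywords>XML,articles,how to</keywords>
<para-intro>Intro paragraph here.</para-intro>
<para-main>Main paragraph.</para-main>
<para-conclusion>Conclusion paragraph.</para-conclusion>
</article>
XML;

// Parse the document, then validate it against its own DTD
$doc = new DOMDocument();
$doc->loadXML($xml);
$isValid = $doc->validate();

echo $isValid ? "Article is valid\n" : "Article is NOT valid\n";
```

A DTD like this is strict about element order, so it would have to list the elements in whatever order your own code writes them.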

Building the Admin Tool

The admin tool for our XML-based CMS will be just a few PHP pages that will allow administrators to log in and create, edit, publish, and delete XML articles. Administrators will also be able to create, edit, and delete other administrators.

The Login Page

The login page is very simple. It involves a simple HTML form that allows administrators to enter a username and password. The PHP logic on this page needs to check the entered values against a list of administrators. If we had enough time, I’d walk you through the building of an admin.xml file that holds these values. But for now, we’ll take the shortcut of embedding values in our PHP.

Here is the code for the login.php page:

<?php  
session_start();  
?>  
<html>  
<title>Please Log In</title>  
<body>  
<form name="login" method="post" action="verify.php">  
<table width="290" border="0" align="center" cellpadding="4" cellspacing="1">  
   <tr>  
     <td colspan="2"><div align="center">Please log in</div>  
     </td>  
   </tr>  
   <tr>  
     <td width="99" bgcolor="#CCCCCC"> <div align="right">login</div></td>  
     <td width="181" bgcolor="#CCCCCC"> <div align="left">  
         <input name="username" type="text" id="username">  
       </div></td>  
   </tr>  
   <tr>  
     <td bgcolor="#CCCCCC"> <div align="right">password</div></td>  
     <td bgcolor="#CCCCCC"> <div align="left">  
         <input name="password" type="password" id="password">  
       </div></td>  
   </tr>  
   <tr>  
     <td colspan="2"><div align="center">  
         <input type="submit" name="Submit" value="Submit">  
         &nbsp;  
         <input name="reset" type="reset" id="reset" value="Reset">  
       </div></td>  
   </tr>  
 
 <tr>  
 <td colspan=2 align=center>  
 <?php if (isset($_SESSION["error"])) { echo $_SESSION["error"]; } ?>  
 </td>  
 </tr>  
 </table>  
</form>  
</body>  
</html>

Notice that the form’s action is set to a page called verify.php. The verify.php page is extremely simple. All it does is check that the passed-in values for username and password match the stored username/password values.

If there’s a match for both, PHP sets a session variable and redirects the user to the admin page. If not, PHP sends the user back to the login.php page, and a special session variable containing an error message is displayed. Here is the code for the verify.php page:

<?php  
session_start();  
 
$user = 'tom';  
$passw = 'test';  
 
if (($_POST["username"] == $user) and ($_POST["password"] == $passw)){  
 $_SESSION["login"] = "true";  
 header("Location:adminindex.php");  
 exit;  
} else {  
 $_SESSION["error"] = "<font color=red>Wrong username or password. Try again.</font>";  
 header("Location:login.php");  
}  
 
?>

Because anyone can enter a URL for the admin pages, we have to add an extra piece of security. At the top of each page, we need to check to see if the value of the session variable “login” is set to “true.” If it isn’t, send folks back to the login.php page; if it is, show them the admin page.
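
Since the same check appears on every admin page, it could be collected into a small include file. The following is only a sketch: the function names isLoggedIn() and requireLogin() are hypothetical and do not appear in the tutorial's download.

```php
<?php
// Returns true only when the session data marks the user as logged in.
// $session is passed in (normally $_SESSION) so the check is easy to test.
function isLoggedIn($session) {
    return isset($session["login"]) && $session["login"] === "true";
}

// Hypothetical helper: call this at the very top of each admin page,
// before any output is sent.
function requireLogin() {
    if (session_id() === "") {
        session_start();
    }
    if (!isLoggedIn($_SESSION)) {
        $_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";
        header("Location: login.php");
        exit;
    }
}
```

Each admin page would then include this file and call requireLogin() on its first line, instead of repeating the if block.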

The Admin Index Page

The first page of our admin tool is adminindex.php. This page lists all the XML articles currently stored in the xml/ directory, allowing you to edit or delete them and change their status (to publish them, for example). It also allows you to create new XML articles.

The code for this page is very simple and compact. We want to include a link to the createArticle.php page. We want to open up the xml/ directory (where we’ll store all the articles), pull out the names of each file and pass those names into the links to the editArticle.php page. We’ll do the same for our delArticle.php page.

Here’s the code for adminindex.php:

<?php   
session_start();  
if ($_SESSION["login"] != "true"){  
 header("Location:login.php");  
 $_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";  
 exit;  
}  
?>  
<h1>Welcome to the Admin Index Page</h1>  
<a href="createArticle.php">Create New XML Article</a><br><br>  
<table border=0 cellspacing=0 cellpadding=3 width="85%">  
<tr valign=top>  
<td width="75%">  
<table border=1 cellspacing=0 cellpadding=2>  
<?php  
$dh = opendir('./xml/');  
 
while ($file = readdir($dh)){  
 //skip the "." and ".." directory entries  
 if (eregi("^\.\.?$", $file)) {  
   continue;  
 }  
 echo "<tr valign=top><td width=\"80%\">";  
 echo "<a href=\"editArticle.php?file=".$file . "\">".$file . "</a></td>";  
 echo "<td width=\"20%\">";  
 echo "<a href=\"delArticle.php?file=" .$file . "\">delete</a>";  
 echo "&nbsp;</td></tr>";  
}  
?>  
</table>  
</td></tr></table>
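
The directory listing step can also be pulled out into a function of its own. This is a sketch under a few assumptions: listArticleFiles() and articleRow() are hypothetical names, scandir() requires PHP 5 or later, and only files ending in .xml are treated as articles.

```php
<?php
// Hypothetical helper: return the names of all .xml files in $dir.
// scandir() returns entries in alphabetical order.
function listArticleFiles($dir) {
    $files = array();
    if (!is_dir($dir)) {
        return $files;
    }
    foreach (scandir($dir) as $entry) {
        $path = rtrim($dir, "/") . "/" . $entry;
        if (is_file($path) && substr($entry, -4) === ".xml") {
            $files[] = $entry;
        }
    }
    return $files;
}

// Build one table row for an article, escaping the file name
// for use in a URL and in HTML.
function articleRow($file) {
    $link = "editArticle.php?file=" . urlencode($file);
    return "<tr valign=top><td><a href=\"" . htmlspecialchars($link) . "\">"
         . htmlspecialchars($file) . "</a></td></tr>";
}
```

The loop in adminindex.php would then reduce to a foreach over listArticleFiles('./xml/') that echoes articleRow($file) for each entry.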

The Create Article Page

The createArticle.php page is very important – it allows the site administrator to create new XML articles on the site. It’s just a simple form that allows site administrators to enter pertinent information. Each of the form fields maps to the XML document structure we figured out earlier.

The form will also check to make sure that users enter an ID – without an Article ID, most of the site’s functionality won’t work.

Here’s the code for that page:

<?php   
session_start();  
 
if ($_SESSION["login"] != "true"){  
 header("Location:login.php");  
 $_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";  
 exit;  
}  
 
?>  
<html>  
<head>  
<title>Create an XML Article</title>  
<script>  
function isReady(form){  
 if(form.id.value == "") {  
   alert("Please enter an ID!");  
   return false;  
 }  
}  
</script>  
</head>  
<body>  
<h1>Create an XML Article</h1>  
 
<a href="adminindex.php">Cancel</a><br><br>  
<form name="createArticle" action="addArticle.php" method="post" onSubmit="return isReady(this)">  
 <table border=1 cellspacing=0 cellpadding=3>  
   <tr valign=top>    
     <td width="135">Article ID</td>  
     <td width="634"> <input name="id" type="text" id="id"> <br> <font size="-1">(no    
       spaces, must be unique)</font></td>  
   </tr>  
   <tr valign=top>  
     <td>Status</td>  
     <td>In Progress <input type="hidden" name="status" value="in progress"></td>  
   </tr>  
   <tr valign=top>    
     <td>Headline</td>  
     <td> <input name="headline" type="text" id="headline" size="60"></td>  
   </tr>  
   <tr valign=top>    
     <td>Author Name</td>  
     <td> <input name="name" type="text" id="name" size="30"></td>  
   </tr>  
   <tr valign=top>    
     <td>Author Email</td>  
     <td> <input name="email" type="text" id="email" size="30"></td>  
   </tr>  
   <tr valign=top>    
     <td>Keywords</td>  
     <td> <p>    
         <input name="keywords" type="text" id="keywords">  
         <font size="-1"><br>  
         </font><font size="-1">(separate keywords with commas)</font> </p></td>  
   </tr>  
   <tr valign=top>    
     <td>Abstract</td>  
     <td><textarea name="abstract" cols="50" rows="5" id="abstract"></textarea></td>  
   </tr>  
   <tr valign=top>    
     <td> <p>Article Body<br>  
       </p></td>  
     <td> <p>Intro paragraph:</p>  
       <p>    
         <textarea name="body[intro]" cols="70" rows="10" wrap="soft" id="body[intro]"></textarea>  
       </p>  
       <p>Main paragraph:</p>  
       <p>    
         <textarea name="body[main]" cols="70" rows="10" wrap="soft" id="body[main]" ></textarea>  
       </p>  
       <p>&nbsp;</p>  
       <p>Conclusion paragraph:</p>  
       <p>    
         <textarea name="body[conclusion]" cols="70" rows="10" wrap="soft"></textarea>  
       </p></td>  
   </tr>  
   <tr valign=top>    
     <td colspan=2> <div align="center">    
         <input type="submit" name="Add Article" value="Add Article">  
         &nbsp;    
         <input name="reset" type="reset" id="reset" value="Reset">  
       </div></td>  
   </tr>  
 </table>  
</form>  
</body></html>

This form’s action is set to the addArticle.php page, which uses DOMXML functions to create an XML article from the information in the form. Because this is a little complex, I’ll go over the code section by section.

The first part of the code initializes our new XML file, setting the version and creating the root node, which is <article>.

<?php   
//create document root  
$doc = domxml_new_doc("1.0");  
$root = $doc->create_element("article");  
$root = $doc->append_child($root);

Next, we add an id attribute to the <article> node. First, however, we need to make sure that users have chosen a unique value, as the id will be used as the file name. We perform this check by looking at all the articles in the xml directory. If we find a filename that contains the id from the form (stored in the incoming $id variable), then we append a hyphen and the number of seconds since the beginning of the UNIX epoch to our id. Although it’s not considered good form to change user input without a warning, this will do for now. Finally, we add the id attribute to the <article> node. Note that the bare $id, $headline, and similar variables in this script rely on PHP’s register_globals setting being on; PHP 4.2.0 and later turn it off by default, in which case you would read the values from $_POST instead (for example, $_POST["id"]).

//add ID attribute   
//FIRST, let's make sure that the id they chose isn't going to overwrite a file!  
$dh = opendir('./xml/');  
 
while ($file = readdir($dh)){  
 $string = $id . ".xml";  
   
 if (eregi("^\.\.?$", $file)) {  
   continue;  
 }  
 if (eregi($string, $file)){  
   $time = date("U"); //num of seconds since unix epoch  
   $id = $id . "-" . $time;  
 }  
}  
 
$root->set_attribute('id', $id);
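
One thing to be aware of is that the eregi() test matches substrings, so an id of howto-1 would be treated as clashing with an existing xml-howto-1.xml. An exact check can be sketched with file_exists(). The function name uniqueArticleId() is hypothetical, and the timestamp suffix simply mirrors the approach described above.

```php
<?php
// Hypothetical helper: return $id unchanged if "$dir/$id.xml" does not
// exist yet; otherwise append "-" plus the current UNIX timestamp.
function uniqueArticleId($id, $dir) {
    $path = rtrim($dir, "/") . "/" . $id . ".xml";
    if (file_exists($path)) {
        $id = $id . "-" . time(); // seconds since the UNIX epoch
    }
    return $id;
}
```

Calling uniqueArticleId($id, './xml/') before set_attribute() would replace the whole directory loop.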

Now that we’ve created the root, it’s time to create each of that node’s children in order. The first is <headline>. Notice that the <headline> node is a child of <article>, and that the headline text is a child of <headline>.

//create headline   
$head = $doc->create_element("headline");  
$head = $root->append_child($head);  
$htext = $doc->create_text_node($headline);  
$htext = $head->append_child($htext);

The same is true of the <author>, <email>, <abstract> and <status> nodes:

//create author name   
$aname = $doc->create_element("author");  
$aname = $root->append_child($aname);  
$atext = $doc->create_text_node($name);  
$atext = $aname->append_child($atext);  
 
//create author email  
$mail = $doc->create_element("email");  
$mail = $root->append_child($mail);  
$mtext = $doc->create_text_node($email);  
$mtext = $mail->append_child($mtext);  
 
//create abstract  
$abs = $doc->create_element("abstract");  
$abs = $root->append_child($abs);  
$abstext = $doc->create_text_node($abstract);  
$abstext = $abs->append_child($abstext);  
 
//create status, always in progress when first created  
$stat = $doc->create_element("status");  
$stat = $root->append_child($stat);  
$stat_text = $doc->create_text_node($status);  
$stat_text = $stat->append_child($stat_text);

Next come the keywords:

//create keyword listing   
$keylisting = $doc->create_element("keywords");  
$keylisting = $root->append_child($keylisting);  
$ktext = $doc->create_text_node($keywords);  
$ktext = $keylisting->append_child($ktext);
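
The four-line pattern repeated for each node above lends itself to a helper function. The sketch below shows the idea using the DOMDocument class from PHP 5 and later, not the DOMXML functions this tutorial is written against; appendTextElement() is a hypothetical name.

```php
<?php
// Hypothetical helper: create <$name>$text</$name> as a child of $parent.
function appendTextElement($doc, $parent, $name, $text) {
    $element = $doc->createElement($name);
    $element->appendChild($doc->createTextNode($text));
    $parent->appendChild($element);
    return $element;
}

// Build the document root with its id attribute
$doc = new DOMDocument("1.0");
$root = $doc->createElement("article");
$doc->appendChild($root);
$root->setAttribute("id", "xml-howto-1");

// One call per child node
appendTextElement($doc, $root, "headline", "Writing XML Articles");
appendTextElement($doc, $root, "author", "Joe Author");
appendTextElement($doc, $root, "keywords", "XML,articles,how to");

$xml = $doc->saveXML();
echo $xml;
```

Because the text goes in through a text node, characters such as & and < in the form input are escaped when the document is serialized.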

The paragraphs can be handled as an array, as they are being passed in as body[intro], body[main], and body[conclusion]. Our PHP code is set up to create para tags using these passed-in keys; we’ll end up with tags named <para-intro>, <para-main>, and <para-conclusion>.

//create paras   
if (is_array($body)){  
 foreach ($body as $K => $V){  
     
   if ($V != ""){  
     $para = $doc->create_element("para-$K");  
     $para = $root->append_child($para);  
     $ptext = $doc->create_text_node($V);  
     $ptext = $para->append_child($ptext);  
     //$para->set_attribute('order', $K);  
   }  
 }  
}
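
To see how the body[...] field names turn into an array, the short sketch below simulates a form submission with parse_str(), which parses a query string the same way PHP parses incoming form data. The sample values are made up for illustration.

```php
<?php
// Simulate the submitted form: PHP turns fields named body[...] into one array.
parse_str("body[intro]=Hello&body[main]=&body[conclusion]=Goodbye", $fields);
$body = $fields["body"];

// Collect tag name => text for every non-empty paragraph,
// mirroring the foreach loop above.
$paras = array();
foreach ($body as $key => $text) {
    if ($text != "") {
        $paras["para-" . $key] = $text;
    }
}

print_r($paras);
```

The empty main paragraph is skipped, so the resulting article would contain only <para-intro> and <para-conclusion>.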

Finally, we can write this entire XML tree to a file, using id as a filename, and send the user back to adminindex.php, where they should see the file they just created added to the list of XML articles.

//write to the file   
$filename = "./xml/".$id . ".xml";  
$doc->dump_file($filename, false, true);  
 
//send user back to adminindex  
header("Location:adminindex.php");

Editing an XML Article

Generally speaking, editing an XML article is very much the same as creating an article, except that you have to load an existing article’s nodes into the edit form, then write your changes out to the file.

In this example, we’re not going to allow any changes to the article’s id attribute, as this would make it very difficult for the rest of the application to function properly. Restricting changes to the id attribute also allows us to take an easy shortcut when updating an XML file: all we have to do is delete the old file and create a new file with our new information. This is crude, but fast and effective.

Here is the code for the editArticle.php page. Note that the form’s action is set to updateArticle.php. Also note the use of the extractText() function to extract content from an XML object and place it into a variable that can then be assigned to a form element.

<?php    
session_start();    
   
   
   
function extractText($array){    
 if(count($array) <= 1){    
   //we only have one tag to process!    
   for ($i = 0; $i<count($array); $i++){    
     $node = $array[$i];    
     $value = $node->get_content();    
   }    
   return $value;    
 }    
     
}      
   
if ($_SESSION["login"] != "true"){    
 header("Location:login.php");    
 $_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";    
 exit;    
}    
//pull in the XML file    
   
if ($file == ""){    
   
 echo "<h2>You didn't choose a file to edit!</h2>";    
 echo "<a href=\"adminindex.php\">Go back to index and choose a file</a>";    
 exit;    
} else {    
 $filename = "./xml/".$file;    
 $xml = domxml_open_file($filename);    
 $root = $xml->root();    
 $id = $root->get_attribute("id");    
   
 $h_array = $root->get_elements_by_tagname("headline");    
 $headline = extractText($h_array);    
   
 $stat_array = $root->get_elements_by_tagname("status");    
 $status = extractText($stat_array);    
   
       
 $a_array = $root->get_elements_by_tagname("author");    
 $author = extractText($a_array);    
     
 $e_array = $root->get_elements_by_tagname("email");    
 $email = extractText($e_array);    
     
 $ab_array = $root->get_elements_by_tagname("abstract");    
 $abstract = extractText($ab_array);    
   
 $kl_array = $root->get_elements_by_tagname("keywords");    
 $keywords = extractText($kl_array);    
     
 $lead_array = $root->get_elements_by_tagname("para-intro");    
 $plead = extractText($lead_array);    
   
 $second_array = $root->get_elements_by_tagname("para-main");    
 $pmain = extractText($second_array);    
   
 $con_array = $root->get_elements_by_tagname("para-conclusion");    
 $pcon = extractText($con_array);    
     
 $statusList = array("live","in progress");    
     
?>    
<html>    
<title>Edit an XML Article</title>    
<body>    
<h1>Edit an XML Article</h1>    
   
<a href="adminindex.php">Cancel</a><br><br>    
<form name="createArticle" action="updateArticle.php" method="post">    
 <table border=1 cellspacing=0 cellpadding=3>    
   <tr valign=top>    
     <td width="135">Article ID</td>    
     <td width="634"> <?php echo htmlspecialchars($id); ?> <input type="hidden" name="id" value="<?php echo htmlspecialchars($id); ?>">    
     </td>    
   </tr>    
   <tr valign=top>    
     <td>Status</td>    
     <td>    
   <select name="status">    
   <?php    
     foreach ($statusList as $stat){    
     if($stat == $status){    
       echo "<option value=\"".$stat."\" selected>$stat";    
     } else {    
       echo "<option value=\"".$stat."\">$stat";    
   
     }    
   }    
   ?>    
   </select>    
   </td>    
   </tr>    
   <tr valign=top>    
     <td>Headline</td>    
     <td> <input name="headline" type="text" id="headline" value="<?php echo htmlspecialchars($headline); ?>" size="60"></td>    
   </tr>    
   <tr valign=top>    
     <td>Author Name</td>    
     <td> <input name="name" type="text" id="name" value="<?php echo htmlspecialchars($author); ?>" size="30"></td>    
   </tr>    
   <tr valign=top>    
     <td>Author Email</td>    
     <td> <input name="email" type="text" id="email" value="<?php echo htmlspecialchars($email); ?>" size="30"></td>    
   </tr>    
   <tr valign=top>    
     <td>Keywords</td>    
     <td> <input name="keywords" type="text" value="<?php echo htmlspecialchars($keywords); ?>">    
       <br> <font size="-1">(separate keywords with commas)</font> </td>    
   </tr>    
   <tr valign=top>    
     <td>Abstract</td>    
     <td><textarea name="abstract" cols="50" rows="5" id="abstract"><?php echo htmlspecialchars($abstract); ?></textarea></td>    
   </tr>    
   <tr valign=top>    
     <td> <p>Article Body<br>    
       </p></td>    
     <td> <p>Intro paragraph:</p>    
       <p>    
         <textarea name="body[intro]" cols="70" rows="10" wrap="soft" id="body[intro]"><?php echo htmlspecialchars($plead); ?></textarea>    
       </p>    
       <p>Main paragraph:</p>    
       <p>    
         <textarea name="body[main]" cols="70" rows="10" wrap="soft" id="body[main]" ><?php echo htmlspecialchars($pmain); ?></textarea>    
       </p>    
       <p>Conclusion paragraph:</p>    
       <p>    
         <textarea name="body[conclusion]" cols="70" rows="10" wrap="soft"><?php echo htmlspecialchars($pcon); ?></textarea>    
       </p></td>    
   </tr>    
   <tr valign=top>    
     <td colspan=2> <div align="center">    
         <input type="submit" name="Add Article" value="Add Article">    
         &nbsp;    
         <input name="reset" type="reset" id="reset" value="Reset">    
       </div></td>    
   </tr>    
 </table>    
</form>    
</body>    
</html>    
<?php    
}//end if-else    
?>
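
Note that extractText() as written returns nothing when a tag is missing or appears more than once. For comparison, here is a sketch of the same extraction step using the DOMDocument class from PHP 5 and later (not the DOMXML functions used above). The helper name firstTagText() is hypothetical, and it returns an empty string when the tag is absent.

```php
<?php
// Hypothetical helper: text of the first <$tag> element, or "" if absent.
function firstTagText($doc, $tag) {
    $nodes = $doc->getElementsByTagName($tag);
    if ($nodes->length === 0) {
        return "";
    }
    return $nodes->item(0)->textContent;
}

// A cut-down article, loaded from a string to keep the sketch self-contained
$doc = new DOMDocument();
$doc->loadXML('<?xml version="1.0"?><article id="xml-howto-1">'
    . '<headline>Writing XML Articles</headline>'
    . '<status>in progress</status></article>');

$id       = $doc->documentElement->getAttribute("id");
$headline = firstTagText($doc, "headline");
$status   = firstTagText($doc, "status");
$missing  = firstTagText($doc, "para-main"); // not present in this article
```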

The updateArticle.php page is very similar to the addArticle.php page, but there’s no need to check for a unique id attribute at the front. Also, at the end, the PHP code will delete the existing file and then dump the new XML structure into a newly created file with the same name. This saves us the trouble of inserting edits into the proper nodes.

<?php    
   
//create document root    
$doc = domxml_new_doc("1.0");    
$root = $doc->create_element("article");    
$root = $doc->append_child($root);    
   
//add ID attribute    
$root->set_attribute('id', $id);    
   
//create headline    
$head = $doc->create_element("headline");    
$head = $root->append_child($head);    
$htext = $doc->create_text_node($headline);    
$htext = $head->append_child($htext);    
   
//create author name    
$aname = $doc->create_element("author");    
$aname = $root->append_child($aname);    
$atext = $doc->create_text_node($name);    
$atext = $aname->append_child($atext);    
   
//create author email    
$mail = $doc->create_element("email");    
$mail = $root->append_child($mail);    
$mtext = $doc->create_text_node($email);    
$mtext = $mail->append_child($mtext);    
   
//create abstract    
$abs = $doc->create_element("abstract");    
$abs = $root->append_child($abs);    
$abstext = $doc->create_text_node($abstract);    
$abstext = $abs->append_child($abstext);    
   
//create keyword listing    
$keylisting = $doc->create_element("keywords");    
$keylisting = $root->append_child($keylisting);    
$ktext = $doc->create_text_node($keywords);    
$ktext = $keylisting->append_child($ktext);    
   
//create status (the value comes from the edit form's select list)    
$stat = $doc->create_element("status");    
$stat = $root->append_child($stat);    
$stat_text = $doc->create_text_node($status);    
$stat_text = $stat->append_child($stat_text);    
   
   
//create paras    
if (is_array($body)){    
 foreach ($body as $K => $V){    
       
   if ($V != ""){    
     $V = stripslashes($V);    
     $para = $doc->create_element("para-$K");    
     $para = $root->append_child($para);    
     $ptext = $doc->create_text_node($V);    
     $ptext = $para->append_child($ptext);    
   }    
 }    
}    
   
//write to the file (first delete existing one!)    
$filename = "./xml/".$id . ".xml";    
unlink($filename);    
$doc->dump_file($filename, false, true);    
   
//send user back to adminindex    
header("Location:adminindex.php");    
?>
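
For comparison, editing a single node in place does not take much code either. The sketch below changes only the <status> node of an existing file, which is essentially what publishing an article amounts to. It assumes the DOMDocument class from PHP 5 and later instead of DOMXML, and setArticleStatus() is a hypothetical name.

```php
<?php
// Hypothetical helper: set the <status> text of the article stored at $path.
// Returns true on success, false if the file or the node cannot be found.
function setArticleStatus($path, $status) {
    $doc = new DOMDocument();
    if (!$doc->load($path)) {
        return false;
    }
    $nodes = $doc->getElementsByTagName("status");
    if ($nodes->length === 0) {
        return false;
    }
    $node = $nodes->item(0);
    // Remove the old text, then add the new value as a text node
    while ($node->firstChild) {
        $node->removeChild($node->firstChild);
    }
    $node->appendChild($doc->createTextNode($status));
    // save() returns the number of bytes written, or false on failure
    return $doc->save($path) !== false;
}
```

A call such as setArticleStatus("./xml/xml-howto-1.xml", "live") would publish the sample article while leaving every other node untouched.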

Deleting an XML Article

Deleting an XML file is very simple. All you have to do is pass a filename to the delArticle.php page, unlink the file, and send the user back to the adminindex.php page:

<?php     
session_start();    
   
if ($_SESSION["login"] != "true"){    
 header("Location:login.php");    
 $_SESSION["error"] = "<font color=red>You don't have privileges to see the admin page.</font>";    
 exit;    
}    
$dir = "./xml/";    
//basename() strips any directory parts, so only files inside xml/ can be deleted    
$filetoburn = $dir . basename($file);    
unlink($filetoburn);    
header("Location: adminindex.php");    
?>
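
Because the file name arrives in the URL, it is worth checking it before deleting anything. One possible check is sketched below. It assumes that article ids contain only letters, digits, hyphens, and underscores (the create form only says "no spaces"), and safeArticlePath() is a hypothetical name.

```php
<?php
// Hypothetical helper: return the path for $file inside $dir, or null
// when $file is not a plain "<id>.xml" file name.
function safeArticlePath($dir, $file) {
    // Letters, digits, "_" and "-" only, followed by ".xml";
    // this rejects slashes and ".." sequences.
    if (!preg_match('/^[A-Za-z0-9_-]+\.xml$/', $file)) {
        return null;
    }
    return rtrim($dir, "/") . "/" . $file;
}
```

delArticle.php could call unlink() only when this function returns a path, and send the user back to adminindex.php otherwise.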

At this point, we’ve just completed the article management portion of our XML-powered CMS. We’ve built a login page, an administrative index, and pages for adding, editing, and deleting articles.

In the next part of our article, we’ll build the display side of the Website so that visitors can read and search for articles.

Building the Display Side

Now that we’ve defined the XML article structure and built a very simple, secure administration tool to help us create, edit, delete, and publish files, it’s time to build that part of the site that displays articles for site visitors.

Let’s recap our display-side requirements:

  • Site visitors will be able to view “live” articles listed by headline. They will also be able to perform a search on headline and keywords.
  • The site itself will consist of the following pages:
    • A home page that lists five articles published on the site and a search function. Furthermore, site visitors can click a link to show all articles.
    • An article detail page that displays one article at a time.
    • A search result page that will list all articles by an author, keyword, or string entered into the search engine.

The Home Page

The home page, index.php, contains code that largely repeats what we created on the admin side. It loops through the articles in the xml/ directory, opens each file, and looks for a status, headline, and abstract. If the status is set to something other than “live,” the loop skips to the next article.

The result is a list of article headlines and abstracts displayed on the home page. Each headline links to the showArticle.php page.

Here’s the code (note that I’ve included a search widget also):

<?php     
   
function extractText($array){    
 if(count($array) <= 1){    
   //we only have one tag to process!    
   for ($i = 0; $i<count($array); $i++){    
     $node = $array[$i];    
     $value = $node->get_content();    
   }    
   return $value;    
 }      
     
}      
?>    
   
<html>    
<head>    
<title>Welcome to XMLTEST</title>    
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">    
</head>    
   
<body>    
<h1>Welcome to the XMLTEST site</h1>    
<p><br>    
 <a href="adminindex.php">Admin Login</a> </p>    
<form name="search" method="post" action="searchArticles.php">    
 Search articles:      
 <input name="search" type="text" id="search">    
 <input name="Search" type="submit" id="Search" value="Search">    
</form>    
<p>The following articles are available: </p>    
<table border=1 cellspacing=0 cellpadding=2 width=500>    
<?php    
$dh = opendir('./xml/');    
   
$fileCount = 0;    
//stop after five live articles have been listed    
while (($file = readdir($dh)) !== false and $fileCount < 5){    
 if (eregi("^\.\.?$", $file)) {    
   continue;    
 }    
 $open = "xml/".$file;    
 $xml = domxml_open_file($open);    
   
 //we need to pull out all the things from this file that we will need to      
 //build our links    
 $root = $xml->root();    
 $stat_array = $root->get_elements_by_tagname("status");    
 $status = extractText($stat_array);    
     
 $ab_array = $root->get_elements_by_tagname("abstract");    
 $abstract = extractText($ab_array);    
   
 $h_array = $root->get_elements_by_tagname("headline");    
 $headline = extractText($h_array);    
   
 if ($status != "live"){    
   continue;    
 }    
 echo "<tr valign=top><td>";    
 echo "<a href=\"showArticle.php?file=".$file . "\">".$headline . "</a><br>";    
 echo $abstract;    
 echo "</td></tr>";    
     
 $fileCount++;    
}    
?>    
</table>    
   
<br><a href="adminindex.php">Admin Login</a>    
</body>    
</html>
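
The selection logic on this page (skip anything that isn't live, stop at five) can be separated from the HTML. The following sketch does that with the DOMDocument class from PHP 5 and later, not DOMXML; the names nodeText() and liveArticles() are hypothetical.

```php
<?php
// Hypothetical helper: text of the first <$tag> element, or "" if absent.
function nodeText($doc, $tag) {
    $nodes = $doc->getElementsByTagName($tag);
    return $nodes->length > 0 ? $nodes->item(0)->textContent : "";
}

// Hypothetical helper: collect up to $limit articles whose status is "live".
function liveArticles($dir, $limit) {
    $articles = array();
    $paths = glob(rtrim($dir, "/") . "/*.xml");
    if ($paths === false) {
        $paths = array();
    }
    foreach ($paths as $path) {
        if (count($articles) >= $limit) {
            break;
        }
        $doc = new DOMDocument();
        if (!@$doc->load($path)) {
            continue; // skip files that are not well-formed XML
        }
        if (nodeText($doc, "status") !== "live") {
            continue; // skip articles that are still in progress
        }
        $articles[] = array(
            "file"     => basename($path),
            "headline" => nodeText($doc, "headline"),
            "abstract" => nodeText($doc, "abstract"),
        );
    }
    return $articles;
}
```

The page itself would then loop over liveArticles('./xml/', 5) and print one table row per entry.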

Show Article Page

The showArticle.php page is where individual articles are displayed.

Everything you learned about pulling text out of an XML structure applies here, too. Basically, you open the XML article based on the filename that you pass to the page. Then you pull out each appropriate node that you need and store them as variables. In the case of our paragraphs, we’ll store them in an array so we can print them out with a simple foreach loop. Then we print the variables to the screen.

Here’s the code:

<?php     
session_start();    
   
function extractText($array){    
 if(count($array) <= 1){    
   //we only have one tag to process!    
   for ($i = 0; $i<count($array); $i++){    
     $node = $array[$i];    
     $value = $node->get_content();    
   }    
   return $value;    
 }      
     
}      
   
   
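//pull in the XML file
//read the filename from the query string (this also works when
//register_globals is off); basename() drops any directory part
$file = isset($_GET["file"]) ? basename($_GET["file"]) : "";

if ($file == ""){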
   
 echo "<h2>You didn't choose a file to view!</h2>";
 echo "<a href=\"index.php\">Go back to index and choose a file</a>";
 exit;    
} else {    
 $open = "./xml/" . $file;    
 $xml = domxml_open_file($open);    
 $root = $xml->root();    
     
 $id = $root->get_attribute("id");    
   
 $h_array = $root->get_elements_by_tagname("headline");    
 $headline = extractText($h_array);    
   
 $stat_array = $root->get_elements_by_tagname("status");    
 $status = extractText($stat_array);    
   
       
 $a_array = $root->get_elements_by_tagname("author");    
 $author = extractText($a_array);    
     
 $e_array = $root->get_elements_by_tagname("email");    
 $email = extractText($e_array);    
     
 $ab_array = $root->get_elements_by_tagname("abstract");    
 $abstract = extractText($ab_array);    
   
 $kl_array = $root->get_elements_by_tagname("keywords");    
 $keywords = extractText($kl_array);    
     
 $lead_array = $root->get_elements_by_tagname("para-intro");    
 $para["intro"] = extractText($lead_array);    
   
 $second_array = $root->get_elements_by_tagname("para-main");    
 $para["main"] = extractText($second_array);    
   
 $con_array = $root->get_elements_by_tagname("para-conclusion");    
 $para["con"] = extractText($con_array);    
     
   
?>    
<html>    
<head>    
<title><?php echo $headline; ?></title>    
<meta name="keywords" content="<?php echo $keywords; ?>">    
<meta name="description" content="<?php echo $abstract; ?>">    
   
</head>    
<body>    
<h1><?php echo $headline; ?></h1>    
<a href="index.php">back to main</a>    
<p>by <a href="mailto:<?php echo $email; ?>"><?php echo $author; ?></a></p>    
<p><small><?php echo $abstract; ?></small></p>    
<?php    
foreach ($para as $k => $v){    
 echo "<p>".$v."</p>\n";
}    
   
?>    
</body>    
</html>    
   
<?php    
}    
?>
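
Because the filename arrives in the URL and the node text is printed straight into HTML, it's worth checking the one and escaping the other. Here's a minimal sketch of how that might look. Some assumptions to flag: resolveArticlePath() and h() are hypothetical helper names of my own and are not part of the tutorial's download; the pattern assumes article files use letters, digits, hyphens, and underscores with an .xml extension; and the code is written with current PHP releases in mind rather than PHP 4.

```php
<?php
// Hypothetical helper (not part of the tutorial's code): turn the "file"
// request parameter into a path inside the article directory, or return
// null when the value does not look like a plain article filename.
function resolveArticlePath($file, $dir = './xml/')
{
    // Assumption: articles are named like "article1.xml". Anything with
    // slashes, dots in the name part, or another extension is rejected.
    if (!is_string($file) || !preg_match('/^[A-Za-z0-9_-]+\.xml\z/', $file)) {
        return null;
    }
    return rtrim($dir, '/') . '/' . $file;
}

// Hypothetical helper: escape a value before printing it into HTML text
// or an attribute. ISO-8859-1 matches the charset declared on index.php.
function h($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'ISO-8859-1');
}
```

In showArticle.php you could pass the request value to resolveArticlePath() and show the "choose a file" message whenever it returns null, then wrap each echoed variable in h(), for example h($headline).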

Searching Through Articles

Searching through an XML article archive is a lot like searching through a database table. The difference is that with XML, you need to open each file, compare the search term with each of the nodes you want to search on, collect the pertinent data (like filename, headline, and abstract), and then display it on a page.

In the case of our example, I’m going to use the quick and easy route: opening all files in order, checking to see if they are live (if they aren’t live, I go on to the next file), comparing the search term with the keywords and headline, and storing the headline, abstract, and filename of each match in a multidimensional array.

When I send the results to the screen, I make sure that I count the number of records in this array. If the count is 0, then I display a message saying that no files matched the search term. Otherwise, I display a linked headline and abstract for each article I find.

This search algorithm is the result of three minutes of thinking and about ten minutes of implementing. It is fast and dirty, and works like a charm if you have a small number of articles. In future articles, we’ll look at more robust methods for searching XML files.

Here’s the code:

<?php      
session_start();      
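$results = array();
//the search form on index.php posts its text field as "search"
$searchTerm = isset($_POST["search"]) ? trim($_POST["search"]) : "";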
     
//this is a very simple, potentially very slow search      
function extractText($array){      
 if(count($array) <= 1){      
   //we only have one tag to process!      
   for ($i = 0; $i<count($array); $i++){      
     $node = $array[$i];      
     $value = $node->get_content();      
   }      
   return $value;      
 }      
       
}        
     
$dh = opendir('./xml/');      
     
while ($file = readdir($dh)){      
 if (eregi("^\.\.?$", $file)) {      
   continue;      
 }      
 $open = "./xml/".$file;      
 $xml = domxml_open_file($open);      
     
 //we need to pull out all the things from this file that we will need to      
 //build our links      
 $root = $xml->root();      
 $stat_array = $root->get_elements_by_tagname("status");      
 $status = extractText($stat_array);      
       
 $k_array = $root->get_elements_by_tagname("keywords");      
 $keywords = extractText($k_array);      
     
 $h_array = $root->get_elements_by_tagname("headline");      
 $headline = extractText($h_array);      
       
 $ab_array = $root->get_elements_by_tagname("abstract");      
 $abstract = extractText($ab_array);      
     
 if ($status != "live"){      
   continue;      
 }      
     
 if (eregi($searchTerm, $keywords) or eregi($searchTerm,$headline)){      
   $list['abstract'] = $abstract;      
   $list['headline'] = $headline;      
   $list['file'] = $file;      
   $results[] = $list;      
 }      
     
}      
     
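//each file is read exactly once, so $results cannot contain duplicates;
//array_unique() is meant for flat arrays and is not needed here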
     
?>      
<h1>Search Results</h1>      
<a href="index.php">back to main</a>      
<p>You searched for: <i><?php echo htmlspecialchars($searchTerm); ?></i></p>
<hr>      
     
     
<?php      
     
if (count($results)>0){      
 echo "<p>Your search results:</p>";      
 foreach ($results as $key => $listing){      
   echo "<br><a href=\"showArticle.php?file=".$listing["file"]."\">".$listing["headline"]."</a>\n";
   echo "<br>". $listing["abstract"];      
   echo "<br>";      
 }      
} else {      
     
 echo "<p>Sorry, no articles matched your search term.</p>";
}      
     
?>
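
To see the matching step on its own, here's a small sketch that separates it from the file reading. Treat it as an illustration under a few assumptions: searchArticles() is a hypothetical function name, the sample records are made up and stand in for values pulled out of the XML files, and I've swapped eregi() for stripos(), which needs PHP 5 or later and treats the search term as literal text instead of a regular expression.

```php
<?php
// Hypothetical helper: return the live records whose keywords or headline
// contain $term, ignoring case. Each record is an array with the keys
// 'headline', 'abstract', 'keywords', 'status' and 'file'.
function searchArticles(array $articles, $term)
{
    $results = array();
    $term = trim($term);
    if ($term === '') {
        return $results; // an empty search matches nothing
    }
    foreach ($articles as $article) {
        if ($article['status'] != 'live') {
            continue; // skip anything that isn't published
        }
        if (stripos($article['keywords'], $term) !== false
            || stripos($article['headline'], $term) !== false) {
            $results[] = array(
                'abstract' => $article['abstract'],
                'headline' => $article['headline'],
                'file'     => $article['file'],
            );
        }
    }
    return $results;
}

// Made-up sample data standing in for values read from the XML files.
$articles = array(
    array('headline' => 'XML Basics', 'abstract' => 'An intro.',
          'keywords' => 'xml, markup', 'status' => 'live', 'file' => 'a1.xml'),
    array('headline' => 'PHP Sessions', 'abstract' => 'Sessions 101.',
          'keywords' => 'php, sessions', 'status' => 'live', 'file' => 'a2.xml'),
    array('headline' => 'Draft on XML', 'abstract' => 'Unfinished.',
          'keywords' => 'xml', 'status' => 'in progress', 'file' => 'a3.xml'),
);

$found = searchArticles($articles, 'xml');
if (count($found) > 0) {
    foreach ($found as $listing) {
        echo $listing['headline'] . ' (' . $listing['file'] . ")\n";
    }
} else {
    echo "Sorry, no articles matched your search term.\n";
}
```

Keeping the comparison in a function like this makes it easy to try the search against an array of test data before pointing it at the real article directory.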

Conclusion

That’s it! Well done — we’ve covered a lot of territory. We learned:

  • Some rudimentary tools for developing XML structures
  • How to build a very simple and secure admin tool to take care of basic site functions
  • How to build a home page, article view page, and a simple search engine

In each case, there’s room to improve, refine, and harden the code. The key is that, while working with these tools, you were exposed to some of the basic concepts of XML data structures, application workflow, and simple CMS design.

  • Bruce

    Thanks for writing such a handy tutorial :)

  • http://www.doglegdesign.co.uk Ben Duffin

    Nice Tutorial! I have written two applications that use XML as their model, a simple License Server and a full-blown CMS – people following this tutorial would do well to keep in mind that while XML was very fast at READ INTENSIVE applications it did start to fall down with WRITE INTENSIVE stuff – even with a non-blocking lock mechanism I would get errors if there were requests being made while the XML was being written out – doing the old .xml.tmp then rename to real name trick did not always prevent this!

    Other than that I found XML to be a wonderful medium to read data from – and XPath’s capabilities for searching bordered on the complexity of MySQL fulltext, bar the relevance bit!