XML will save the world!
…At least, that’s what I was told. But somehow, several years later, most of us are not using eXtensible Markup Language (XML) in our everyday programming.
All the same, from time to time, you may be asked to provide a service that will consume or output XML, so it’s a good idea to get comfortable with the technology.
Two examples probably dominate the experience I’ve had with XML. The first is a share price service that a number of companies provide for businesses that want to display their share price information on their Websites. The second is the outputting of companies’ press releases as an XML feed for interested parties.
Here, I’ll outline the techniques we use to consume XML in a JSP environment, and briefly comment on outputting XML in a JSP environment. In order to do that, I’ll explain how XML is handled in JSP, and we can then go on to try it out in code.
Key Takeaways
- XML is often underutilized in day-to-day programming, but understanding its consumption and output in JSP is beneficial for specific services like share price updates or press release feeds.
- SAX (Simple API for XML) is recommended for high-volume and time-sensitive XML processing in JSP due to its efficiency in handling data streams and minimal memory usage.
- The InvestorInformation class and InvestorInfoLoader class demonstrate how to encapsulate and manage share-related data retrieved from XML, using SAX for parsing.
- Custom JSP tags, such as the ShowSharePrices tag, facilitate the integration of XML data into web applications, allowing for simplified XML processing within JSP pages.
- The article emphasizes practical, straightforward approaches to XML processing in JSP, including error handling and debugging tips, to enhance web application performance and maintainability.
The XML Share Document
As part of my job, I often have to provide share price information for companies who wish to add the service to their Websites.
Here’s a typical share information document in XML. The service is usually provided over HTTP — that is, you can find it simply by pointing a browser at a URL such as http://www.myinvestorinfo.com/sharefeed.jsp?company=XXX
A share price document might look like this:
<shares>
<share>
<code>CSW</code>
<price>502.25</price>
<change>2.25</change>
<percentageChange>0.45</percentageChange>
<open>500</open>
<high>506.32</high>
<low>499.85</low>
<yearLow>423.45</yearLow>
<yearHigh>586.92</yearHigh>
<volume>12486123</volume>
<date>12.45 12/06/2003</date>
</share>
</shares>
If you subscribe to more than one offering, multiple <share> elements are provided. For now, let’s assume that we only need statistics for a single share, and that we specifically need the Price, Change and Percentage Change figures, together with the date the price was set, for that share.
These statistics represent, in order:
- the current value of each share
- the amount by which this value has changed since the last share value in dollars
- the amount by which this value has changed since the last share value as a percentage
- the date and time this value was set
Usually, there is a high cost involved in presenting to-the-minute share updates, so they are often 20 minutes or more out of date.
Java and SAX
For high-volume, time-critical processing of XML, the Simple API for XML (SAX) is recommended. SAX is event-driven: the parser passes each element and piece of text, in the order in which it occurs, to a processor or handler that is programmed to respond in a predefined way to the elements in the document.
In this case, we have a potentially high volume of data being presented on the site. Any content is time-critical on the Web, and while people will usually forgive site owners for slight delays in financial data updates, there’s no point in unnecessarily testing their patience. Ironically, the expectation of up-to-date information seems to be higher in the instance of free content. As there’s no cost to use free services, users will quickly give up on these facilities in favor of the more current information that’s so widely available on the Web.
Another benefit that should not be overlooked is that the amount of memory required for document processing in SAX is quite small. Also, our task is very simple (we merely want to extract the information to show it), so there’s no need to process or otherwise manipulate the data.
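To make the event-driven model concrete, here is a minimal sketch that records each SAX callback in the order the parser fires it. The class name SaxEventTrace and its trace() method are my own inventions for illustration; only the javax.xml.parsers and org.xml.sax calls come from the standard library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX callbacks in document order.
class SaxEventTrace extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        events.add("start:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Skip whitespace-only chunks so the trace stays readable.
        String text = new String(ch, start, length).trim();
        if (!text.isEmpty()) {
            events.add("text:" + text);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    // Parses the given XML string and returns the recorded events.
    static List<String> trace(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        SaxEventTrace handler = new SaxEventTrace();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shares><share><code>CSW</code><price>502.25</price></share></shares>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

Nothing is kept in memory beyond what the handler chooses to store, which is where the small footprint comes from.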
The Data Class
As we’ve already discussed, the information we’re interested in is the share’s:
- name
- current price
- dollar change
- time
- date
- percentage change
The price, time, and all other values except date will be stored as Strings.
The data class for this example would therefore be as shown below. Note that I’ve omitted the obvious pieces of code such as the constructors and straight forward getters-and-setters:
package com.clearlysomethingwrong;
import java.util.Date;
import java.text.SimpleDateFormat;
import java.text.ParseException;
public class InvestorInformation {
String name;
String currentPrice;
String change;
String time;
Date date;
String percentage;
public String getFormattedDate() {
SimpleDateFormat formatter = new SimpleDateFormat("dd-MMM-yyyy");
if(date!=null) {
return formatter.format(date);
} else {
return formatter.format(new Date());
}
}
public void setDate(String strdate) {
// Pattern matches the sample feed's <date> value, e.g. "12.45 12/06/2003"
SimpleDateFormat formatter = new SimpleDateFormat("HH.mm dd/MM/yyyy");
try {
date = formatter.parse(strdate);
} catch (ParseException e) {
date = new Date();
}
}
}
The String formats for the incoming and outgoing dates are hard-coded here, but it would not be difficult to change that. The check for null in the getter ensures that we never pass null to the formatter, which would throw a NullPointerException.
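As a rough sketch of how the hard-coded formats could be lifted out, the class below takes both patterns through its constructor. ConfigurableDateField is a hypothetical name, and the two pattern strings in main() are assumptions chosen to match the sample <date> value shown earlier. The fallback to the current date mirrors the behaviour of the data class.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical variant of the date handling with configurable patterns.
class ConfigurableDateField {
    private final String inputPattern;
    private final String outputPattern;
    private Date date;

    ConfigurableDateField(String inputPattern, String outputPattern) {
        this.inputPattern = inputPattern;
        this.outputPattern = outputPattern;
    }

    // Parses the incoming text; falls back to "now" if it does not match the pattern.
    void setDate(String text) {
        // SimpleDateFormat is not thread-safe, so a new one is created per call.
        SimpleDateFormat parser = new SimpleDateFormat(inputPattern, Locale.ENGLISH);
        try {
            date = parser.parse(text);
        } catch (ParseException e) {
            date = new Date();
        }
    }

    // Formats the stored date, or the current date if none has been set.
    String getFormattedDate() {
        Date value = (date != null) ? date : new Date();
        return new SimpleDateFormat(outputPattern, Locale.ENGLISH).format(value);
    }

    public static void main(String[] args) {
        ConfigurableDateField field = new ConfigurableDateField("HH.mm dd/MM/yyyy", "dd-MMM-yyyy");
        field.setDate("12.45 12/06/2003");
        System.out.println(field.getFormattedDate()); // 12-Jun-2003
    }
}
```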
Loading the XML Document
Well, that was straightforward, as data classes should be! The next class we’ll write is a kind of factory. We give it a java.io.InputStream of the XML source, and it returns a populated InvestorInformation class.
SAX parsing usually involves the use of identical base code; you create a parser to read the incoming XML, and you pass that parser a Handler, which handles each of the types of content in the XML file.
As the combined code (that which creates the parser and handles the XML elements) is quite simple, the next class we’ll create will achieve both tasks. It will act as the handler, and provide a method for processing an XML source.
To start with, we extend the org.xml.sax.helpers.DefaultHandler class:
package com.clearlysomethingwrong;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.SAXException;
import javax.xml.parsers.ParserConfigurationException;
import java.io.InputStream;
import java.io.IOException;
public class InvestorInfoLoader extends DefaultHandler {
static InvestorInformation info;
static String code;
static String name;
private boolean foundCode;
private boolean setName;
private boolean setCurrentPrice;
private boolean setChange;
private boolean setPercentage;
private boolean setTime;
private boolean setDate;
Each of these flags indicates a state of the XML document in which we’re interested. As the program enters each of the appropriate elements, the matching flag will be set. Then, when character content is sent to the program, it can assign this content to the appropriate property of the InvestorInformation class.
The foundCode property notes that the share with the given code has been found. The name property allows the user of the class to substitute a user-friendly name for the company, as usual share names (such as IBM, SUNW, XOM, and FNM) can be cryptic for those not in the know. If the value is null, we can simply return the code as the name of the share information object.
Next, we define the static processor method, loadInvestorInfo(). This method instantiates a parser via the factory method, creates an instance of itself to act as a Handler, and then passes the input stream and itself in. The result is stored in the static InvestorInformation property for this class, which is then returned:
public static InvestorInformation loadInvestorInfo(InputStream is,
String code, String name)
throws SAXException, IOException, ParserConfigurationException {
// detect invalid code, invalid input stream will be detected by parser.
if(code==null || code.length() == 0) {
throw new IllegalArgumentException("Code is null. Please provide a " +
"share code.");
}
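// Static state is shared by every caller, so clear any result left by a previous call.
InvestorInfoLoader.info = null;
InvestorInfoLoader.code = code;
InvestorInfoLoader.name = name;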
InvestorInfoLoader iil = new InvestorInfoLoader();
// load the XML file
// create a new sax parser factory
javax.xml.parsers.SAXParserFactory factory =
javax.xml.parsers.SAXParserFactory.newInstance();
// turn off namespaces for now
factory.setNamespaceAware(false);
factory.setValidating(true);
javax.xml.parsers.SAXParser parser = factory.newSAXParser();
parser.parse(is, iil);
// the handler callbacks have populated the static info field during parsing
return info;
}
Fairly standard stuff. The handler methods are pretty basic, as well. Notice that I’ve turned off namespace awareness here. A DTD is usually provided, which is why validation is switched on, but for applications as simple as this one (and I find this is quite typical), namespace awareness may not be vital.
The trade-off is usually performance versus security, and the decision is yours. No doubt your choice will depend on the level of relationship you have with the service’s supplier (many change their configuration without notice), and how critical the application is.
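To see what the namespace switch changes in practice, the following sketch parses the same prefixed document twice and prints what the handler receives for each start tag. NamespaceDemo, the f: prefix and the urn:example:feed URI are all assumptions made up for the demonstration; the sample feed in this article uses no namespaces at all.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: compares handler arguments with namespace awareness on and off.
class NamespaceDemo {
    // Returns one "uri|localName|qName" entry per start tag.
    static List<String> startTags(String xml, boolean namespaceAware) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(namespaceAware);
        final List<String> seen = new ArrayList<>();
        factory.newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    seen.add(uri + "|" + localName + "|" + qName);
                }
            });
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<f:shares xmlns:f=\"urn:example:feed\"><f:price>502.25</f:price></f:shares>";
        System.out.println("aware:   " + startTags(xml, true));
        System.out.println("unaware: " + startTags(xml, false));
    }
}
```

With awareness off, the prefix is simply part of qName, which is why comparing qName against plain element names works for a feed that declares no namespaces.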
Detecting the Elements
The fully qualified name of each element is the name as defined above, <price> for the price, and so on. The share element is detected in order to create a new InvestorInformation instance.
public void startElement(String uri, String localName, String qName,
org.xml.sax.Attributes attributes) {
if(qName.equals("share") && info==null) {
info=new InvestorInformation();
} else if (qName.equals("code")) {
setName=true;
} else if(qName.equals("price")) {
setCurrentPrice = true;
} else if(qName.equals("change")) {
setChange = true;
} else if(qName.equals("time")) {
setTime = true;
} else if(qName.equals("date")) {
setDate = true;
} else if(qName.equals("percentageChange")) {
setPercentage = true;
}
}
public void endElement(String uri, String localName, String qName) {
if(qName.equals("share")) {
foundCode = false;
} else if(qName.equals("code")) {
setName = false;
} else if(qName.equals("price")) {
setCurrentPrice = false;
} else if(qName.equals("change")) {
setChange = false;
} else if(qName.equals("time")) {
setTime = false;
} else if(qName.equals("date")) {
setDate = false;
} else if(qName.equals("percentageChange")) {
setPercentage = false;
}
}
Setting Share Values
The characters() method then collects the share information.
public void characters(char[] ch, int start, int length){
if(setName) {
String localName = new String(ch, start, length);
if(localName.equals(code)) {
this.foundCode = true;
if(name==null) {
info.setName(localName);
} else {
info.setName(name);
}
} else {
// reset the found code class in case there are multiple share entries
foundCode = false;
}
} else if(foundCode && setCurrentPrice) {
info.setCurrentPrice(new String(ch, start, length));
} else if(foundCode && setChange) {
info.setChange(new String(ch, start, length));
} else if(foundCode && setPercentage) {
info.setPercentage(new String(ch, start, length));
} else if(foundCode && setTime) {
info.setTime(new String(ch, start, length));
} else if(foundCode && setDate) {
info.setDate(new String(ch, start, length));
}
}
The only thing that’s particularly notable here is the handling of the code element. Since the document may contain multiple share offerings, we need to detect the correct share offering, but we can do so only after a new InvestorInformation instance has been created, and we’re inside a share element. This is where the foundCode flag comes in.
The funny thing about this code is that almost all SAX processing code looks exactly like that above. It’s actually quite straightforward, so that most of the programming required to use the information gathered by the service happens elsewhere.
At this point I would strongly recommend testing this class, as its simplicity might fool you into assuming it will work first time. Try out some range testing to make sure you’ve covered all the possible values.
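One case worth covering in those tests: SAX allows a parser to deliver the text of a single element in more than one characters() call, so a handler that assigns on each call can end up with a partial value. The sketch below is one way to guard against that, not the article’s loader. It buffers the text and assigns it in endElement(), and it keeps its state in instance fields rather than statics. BufferingShareHandler and its Map-based result are assumptions made for the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the child elements of the <share> whose <code> matches.
class BufferingShareHandler extends DefaultHandler {
    private final String wantedCode;
    private final StringBuilder text = new StringBuilder();
    private Map<String, String> current; // fields of the <share> being read
    private Map<String, String> match;   // fields of the first matching <share>

    BufferingShareHandler(String wantedCode) {
        this.wantedCode = wantedCode;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        text.setLength(0); // start a fresh buffer for this element
        if ("share".equals(qName)) {
            current = new LinkedHashMap<>();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // may be called several times per element
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("share".equals(qName)) {
            if (match == null && current != null && wantedCode.equals(current.get("code"))) {
                match = current;
            }
            current = null;
        } else if (current != null) {
            current.put(qName, text.toString().trim());
        }
        text.setLength(0);
    }

    // Returns the matching share's fields, or null if the code is not in the document.
    static Map<String, String> load(String xml, String code) throws Exception {
        BufferingShareHandler handler = new BufferingShareHandler(code);
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.match;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shares>"
            + "<share><code>ABC</code><price>1.00</price></share>"
            + "<share><code>CSW</code><price>502.25</price></share>"
            + "</shares>";
        System.out.println(load(xml, "CSW"));
    }
}
```

Feeding it a document with several share elements, a missing code, and an unknown code makes a reasonable first set of range tests.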
Designing the Web Component
Until this point, the coding has been quite simple. There have been relatively few decisions to make, namely whether or not we’ll turn on the various optional features of a parser.
If, upon testing, the class doesn’t work as expected, you may find it easier to break out the handler from the work class. Although the classes are tiny, the separation of roles usually helps make clear where you’ve been going wrong. It depends on how confident you are at juggling static and instance roles.
In addition, it wasn’t absolutely necessary to make the loadInvestorInfo() method static. It could have worked just as well to make the instantiation of the InvestorInfoLoader class mandatory. I guess in the back of my mind somewhere I considered the possibility of caching the result, and a static, factory-like class is one of the few ways we could achieve this.
I considered 10 different designs before settling on this as the solution I’d use; sometimes you just have to get on and write the code, or nothing gets done. Other considerations included keeping a Hashtable of results that could be cached (and flushed) and indexed according to code at this level, versus holding the data in the context, and so on.
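For the record, the cached-and-flushed idea could look something like the sketch below. This is not what the tag in this article does; ShareInfoCache, its loader function and the expiry period are all hypothetical, and the code assumes Java 8 or later. Since share prices are typically delayed by 20 minutes anyway, a maximum age of a few minutes would lose little.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical time-based cache of results, indexed by share code.
class ShareInfoCache<V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAt;

        Entry(V value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long maxAgeMillis;
    private final Function<String, V> loader;

    ShareInfoCache(long maxAgeMillis, Function<String, V> loader) {
        this.maxAgeMillis = maxAgeMillis;
        this.loader = loader;
    }

    V get(String code) {
        return get(code, System.currentTimeMillis());
    }

    // Overload that takes the clock reading explicitly, which makes expiry testable.
    V get(String code, long now) {
        Entry<V> entry = entries.get(code);
        if (entry == null || now - entry.loadedAt >= maxAgeMillis) {
            // Missing or stale: reload and remember when we did so.
            entry = new Entry<>(loader.apply(code), now);
            entries.put(code, entry);
        }
        return entry.value;
    }

    // Discards everything, forcing a reload on the next request.
    void flush() {
        entries.clear();
    }

    public static void main(String[] args) {
        ShareInfoCache<String> cache =
            new ShareInfoCache<>(5 * 60 * 1000L, code -> "loaded " + code);
        System.out.println(cache.get("CSW"));
    }
}
```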
In the end, I chose to implement the main functionality as a Tag. This allows me to teach designers (non-programmers) how to use the tag, and provides reasonable performance. Additionally, this approach requires no intervention on my part when combined in Ant with a standard build script that copies the necessary jar file and configuration for the tag to work. But even without that, the designers at work were OK about making sure the required assets were present in the script.
The alternative solutions of using a Servlet or even a Struts class, in my opinion, introduce too much configuration for what is often the only dynamic feature on the site. They also require actual programming on my part each time the service is used, which was inappropriate for this application.
Writing the Tag
Writing the tag should now be fairly simple. We need to pick up the following attributes from the designer:
- the URL to the XML source
- the code for the company (let’s use CSW, the friendly name for my mythical company – Clearly Something Wrong Company Ltd)
- the attribute name under which we would like the result stored
The tag doesn’t need to have any body content, so I extend TagSupport.
Straight off, that gives us:
package com.clearlysomethingwrong;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.jsp.tagext.TagSupport;
import javax.servlet.jsp.tagext.Tag;
import javax.servlet.jsp.JspException;
import org.xml.sax.SAXException;
import javax.xml.parsers.ParserConfigurationException;
import java.io.IOException;
import java.net.URL;
public class ShowSharePrices extends TagSupport {
public String investorURL;
public String code;
public String friendlyName;
public String attName;
First, we get the request. We’re going to store the attribute here, so we’ll need it. In the real application, we could also check that all the values are valid and have been supplied in a meaningful way to provide the user (of the tag, and of the system) with useful error messages. We create a URL object with the given URL string. This URL object simply gives us access to a URL, including providing a number of details about it, and opening an InputStream to it.
Provided the contents of the URL are what we want, this value can then be passed to our investor information loader, together with the code that will extract the appropriate share details and the optional user-friendly name for our company.
If the result is not NULL, we set the attribute using the given attribute name, and exit:
public int doStartTag() throws JspException {
HttpServletRequest request = (HttpServletRequest) pageContext.getRequest();
InvestorInformation info = null;
try{
URL url = new URL(investorURL);
info = InvestorInfoLoader.loadInvestorInfo(url.openStream(),
code, friendlyName);
} catch(IOException e) {
throw new JspException("Error reading XML source");
} catch (SAXException e) {
throw new JspException("Sax exception occurred: " + e.getMessage());
} catch (ParserConfigurationException e) {
throw new JspException("Configuration error:" + e.getMessage());
}
if(info!=null) {
request.setAttribute(attName, info);
}
return Tag.SKIP_BODY;
}
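One thing the tag above leaves to chance is the stream itself: it is never closed, and a slow feed can hold up the page. A possible refinement, sketched here outside the tag so it runs without a servlet container, reads the feed with timeouts and always closes the stream. FeedFetcher and the timeout value are assumptions, and the try-with-resources syntax needs Java 7 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper: reads a feed fully, with timeouts, and closes the stream.
class FeedFetcher {
    static byte[] fetch(URL url, int timeoutMillis) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(timeoutMillis);
        connection.setReadTimeout(timeoutMillis);
        // try-with-resources closes the stream even if reading fails part-way.
        try (InputStream in = connection.getInputStream()) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

The returned bytes can be wrapped in a java.io.ByteArrayInputStream and handed to the loader in place of url.openStream().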
Except for the self-evident mutators and accessors, we can now create the tag library descriptor:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE taglib
PUBLIC "-//Sun Microsystems, Inc.//DTD JSP Tag Library 1.2//EN"
"http://java.sun.com/dtd/web-jsptaglibrary_1_2.dtd">
<taglib>
<tlib-version>1.0</tlib-version>
<jsp-version>1.2</jsp-version>
<short-name>xmlInvestor</short-name>
<uri>/xmlInvestor</uri>
<description>
Investor Information Loader Tag
</description>
<tag>
<name>loader</name>
<tag-class>com.clearlysomethingwrong.ShowSharePrices</tag-class>
<body-content>empty</body-content>
<description>
Loads investor information given a URL of the XML source, a company investor code, and an attribute name to
store the resultant InvestorInformation class. Optionally, a friendly name can be provided for the company.
</description>
<attribute>
<name>investorURL</name>
<required>true</required>
<rtexprvalue>false</rtexprvalue>
</attribute>
<attribute>
<name>code</name>
<required>true</required>
<rtexprvalue>false</rtexprvalue>
</attribute>
<attribute>
<name>attName</name>
<required>true</required>
<rtexprvalue>false</rtexprvalue>
</attribute>
<attribute>
<name>friendlyName</name>
<required>false</required>
<rtexprvalue>false</rtexprvalue>
</attribute>
</tag>
</taglib>
When packaged according to the standard JSP Tag requirements, using the tag is as simple as writing the following JSP file:
<%@ page import="com.clearlysomethingwrong.InvestorInformation"%>
<%@ taglib uri="/xmlInvestor" prefix="xmlInvestor"%>
<xmlInvestor:loader friendlyName="Clearly Something Wrong"
code="CSW" attName="InvestInfo"
investorURL="http://localhost:8080/source.xml"/>
<%
InvestorInformation info = (InvestorInformation)request.getAttribute("InvestInfo");
if(info!=null) {
%><%=info.getName()%><br>
<%=info.getCurrentPrice()%><br>
<%=info.getChange()%><br>
<%=info.getTime()%><br>
<%=info.getFormattedDate()%><br>
<%=info.getPercentage()%><br>
<%
}
%>
The only possible complaint might be the lack of another tag or set of tags for outputting the data. However, I’ve found that tags still confuse programs like Dreamweaver.
The scriptlets are used because Dreamweaver tends to deal more gracefully with them than with tags it doesn’t know about. A significant amount of support is required to write the plug-in that will prevent the average designer wrecking the code when redesign is necessary.
A Note About Outputting XML In JSP
It is my opinion that, unless the project warrants serious investment in software development that’s not core to the average Web developer, writing complex code to output XML using DOM in a JSP based application is counter-productive, and introduces expensive maintenance requirements.
On the whole, where a reasonably flat structure is all that’s needed, the intuitive approach to providing XML output is to merely copy the format of the document into a JSP page and insert scriptlets that will insert the necessary data.
The document we’ve just designed to provide share information would look like this:
<shares>
<share>
<code><%=info.getName()%></code>
<price><%=info.getCurrentPrice()%></price>
... etc
</share>
</shares>
In many cases, this has saved a great deal of time, is very easy to check (because loading it into Internet Explorer will also check that it is well-formed), and does the job nicely.
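The one trap with the template approach is data that contains markup characters: a company name with an ampersand in it will break the document. A small escaping helper along these lines takes care of that. XmlText is a hypothetical class name, not part of the downloadable code.

```java
// Hypothetical helper: escapes the five characters that are special in XML text and attributes.
class XmlText {
    static String escape(String value) {
        if (value == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Marks & Spencer <PLC>")); // Marks &amp; Spencer &lt;PLC&gt;
    }
}
```

In the page, each expression would then be wrapped in the helper, for example <%=XmlText.escape(info.getName())%>, assuming the class has been imported.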
To Run This Code
My primary development environment is J2SDK1.4, in which XML processing is included as standard. Those on JDK1.3 will need to download the XML pack available from http://java.sun.com/. Otherwise, the code includes only standard Java classes.
To make it easy for you to run the code, I’ve packaged the Tag in a jar file according to the JSP tag specifications. This makes for a very simple way of including the functionality provided by the tag in a JSP file. Merely drop the jar file in the lib directory within WEB-INF. I recommend you spend some time to learn this simple method, as it can save the usual configuration rush at the end of a project, and is more reusable.
Conclusion
The code is quite straightforward, but this is as it should be, if possible. Each class is quite simple and nicely encapsulates its role without too much overlap in roles.
The most important thing is the SAX processing — using the template above, it is simple to modify the program to meet whatever requirements you have. One use to which we’ve put XML is for reliable communication between ASP and JSP, and as we make information available over XML, we usually find it can be used elsewhere.
Download the sample files for this tutorial here.
Frequently Asked Questions (FAQs) about Consuming XML with JSP
How can I parse XML data in JSP?
Parsing XML data in a JSP page can be done with the JSTL XML tag library. The first step is to declare the library in your JSP file with a taglib directive. After that, you can use the ‘parse’ tag to parse the XML data. The ‘parse’ tag takes the XML as a String or a Reader, either through its ‘doc’ attribute (‘xml’ in JSTL 1.0) or as its body, so a document that lives at a URL is usually fetched first with the core ‘import’ tag. After parsing, you can use other tags like ‘forEach’, ‘if’, ‘choose’, ‘when’ and ‘otherwise’ to iterate over the XML data and extract the information you need.
What is the role of JSTL in handling XML data?
JavaServer Pages Standard Tag Library (JSTL) plays a crucial role in handling XML data in JSP. It provides a collection of tags that simplify the JSP code. The XML tags provided by JSTL allow you to parse XML data, perform conditional processing, and iterate over the XML data. This makes it easier to handle XML data in JSP.
How can I handle errors while parsing XML data in JSP?
Errors raised while parsing XML data in JSP can be handled with the ‘catch’ tag from the JSTL core library. The ‘catch’ tag traps any Throwable thrown by the tags nested inside it; it cannot be restricted to one type of exception. If you give it a ‘var’ attribute, the exception is stored in that variable, which you can then test with an ‘if’ tag in order to show an error message.
Can I use XPath expressions in JSP to query XML data?
Yes, you can use XPath expressions in JSP to query XML data. JSTL has no separate ‘xpath’ tag; instead, XML tags such as ‘out’, ‘set’, ‘if’ and ‘forEach’ each take a ‘select’ attribute that holds the XPath expression. With ‘set’ and ‘forEach’, a ‘var’ attribute names the variable that will hold the result.
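The same kind of query can be tried in plain Java with the javax.xml.xpath package, which is handy for checking an expression before it goes into a page. In this sketch, SharePriceXPath and priceFor() are hypothetical names, and the expression assumes the shares/share/code/price layout of the sample document. The share code is bound as an XPath variable rather than concatenated into the expression.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: looks up one share's price with an XPath expression.
class SharePriceXPath {
    // Returns the price text, or an empty string if no share has the given code.
    static String priceFor(String xml, final String code) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Resolve $code to the method argument instead of building the expression by hand.
        xpath.setXPathVariableResolver(name -> "code".equals(name.getLocalPart()) ? code : null);
        return xpath.evaluate("/shares/share[code=$code]/price",
                              new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shares><share><code>CSW</code><price>502.25</price></share></shares>";
        System.out.println(priceFor(xml, "CSW")); // 502.25
    }
}
```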
How can I convert XML data to other formats in JSP?
Converting XML data to other formats in JSP can be done using XSLT. XSLT is a language for transforming XML documents into other formats such as HTML, plain text, or a different XML vocabulary; PDF output normally needs an extra step through XSL-FO and a formatter. JSTL provides the ‘transform’ tag that allows you to apply an XSLT stylesheet to an XML document.
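For comparison, here is roughly what such a transformation looks like in plain Java using javax.xml.transform. The stylesheet is a made-up example that turns the sample share document into an HTML list, and ShareFeedToHtml is a hypothetical class name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: applies a small inline stylesheet to a share document.
class ShareFeedToHtml {
    // Example stylesheet (an assumption): one list item per <share>.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html'/>"
      + "<xsl:template match='/'>"
      + "<ul><xsl:for-each select='shares/share'>"
      + "<li><xsl:value-of select='code'/>: <xsl:value-of select='price'/></li>"
      + "</xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String toHtml(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shares><share><code>CSW</code><price>502.25</price></share></shares>";
        System.out.println(toHtml(xml));
    }
}
```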
How can I validate XML data in JSP?
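Validating XML data can be done using XML Schema, a language for expressing constraints on the structure and content of XML documents. JSTL does not provide a ‘validate’ tag, so the check has to happen in Java code, for example in a custom tag or a servlet. From Java 5 onwards, the javax.xml.validation package can validate a document against an XML Schema; for DTDs, a validating parser does the job.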
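In Java code, a schema check can be sketched with the javax.xml.validation package, available from Java 5 onwards. The schema below is an assumption written for this example and covers only the code and price elements; a real feed supplier would publish their own. ShareFeedValidator is a hypothetical class name.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: validates a share document against a small inline schema.
class ShareFeedValidator {
    // Example schema (an assumption): <shares> holds one or more <share> with code and price.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='shares'><xs:complexType><xs:sequence>"
      + "<xs:element name='share' maxOccurs='unbounded'><xs:complexType><xs:sequence>"
      + "<xs:element name='code' type='xs:string'/>"
      + "<xs:element name='price' type='xs:decimal'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();
        try {
            // With no error handler set, the first validation error is thrown.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid(
            "<shares><share><code>CSW</code><price>502.25</price></share></shares>"));
    }
}
```

A compiled Schema object is safe to share between threads, so in a web application it would normally be built once and reused.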
Can I generate XML data from JSP?
Yes, you can generate XML data from JSP. JSP provides several ways to generate XML data. One way is to use JSP scriptlets and expressions to generate the XML tags and content. Another way is to use JSTL and EL to generate the XML data in a more structured and maintainable way.
How can I handle namespaces in XML data in JSP?
The JSTL XML tags have no ‘namespace’ attribute, and their XPath support for documents that use namespaces is limited. A common workaround is to match elements with local-name() in the XPath expression. In Java code, call setNamespaceAware(true) on the parser factory and compare the namespace URI and local name that SAX passes to the handler.
Can I use JSP to process large XML files?
Yes, you can use JSP to process large XML files. However, processing large XML files can be memory-intensive and slow. To improve performance, you can use SAX (Simple API for XML) which is a streaming API for XML. SAX allows you to process XML data as it is being read, which is more efficient for large XML files.
How can I debug issues while processing XML data in JSP?
Debugging issues while processing XML data in JSP can be done using various techniques. One technique is to use the ‘out’ object to print out the XML data and any intermediate results. Another technique is to use a debugger that supports JSP. A debugger allows you to step through the JSP code and inspect the variables and expressions.
Chanoch is a Senior Developer for Kiwi Media Productions Ltd, which specialises in corporate communications. Chanoch has previously written in a number of books including Professional Apache Tomcat.