Ajax and Web Service Data Formats Part 1: XML, SOAP, and HTML

When the AJAX acronym was devised by Jesse James Garrett, its original meaning was “Asynchronous JavaScript and XML.” In essence, you follow this process:

  1. Create a web service, e.g. a PHP ‘page’ which is passed HTTP GET/POST arguments and returns a response in XML.
  2. Write client-side JavaScript code to consume the web service, i.e. pass arguments and retrieve the XML response. The call is handled asynchronously so the browser isn’t locked while it waits for data to arrive.
  3. Parse the XML and update the HTML document accordingly.
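
The three steps above can be sketched in client-side JavaScript. This is a minimal sketch rather than a definitive implementation: the `loadXml` and `getTitles` helper names are assumptions made for illustration, and the `createXhr` parameter is a hypothetical hook that exists only so a stand-in request object can be supplied outside a browser.

```javascript
// Step 2: call the web service asynchronously.
// createXhr is a hypothetical hook for supplying a stand-in object;
// by default a real XMLHttpRequest is created.
function loadXml(url, onSuccess, onError, createXhr) {
  var xhr = createXhr ? createXhr() : new XMLHttpRequest();
  xhr.open("GET", url, true); // true = asynchronous
  xhr.onreadystatechange = function () {
    if (xhr.readyState !== 4) return; // 4 = request complete
    if (xhr.status === 200 && xhr.responseXML) {
      onSuccess(xhr.responseXML);
    } else if (onError) {
      onError(xhr.status);
    }
  };
  xhr.send(null);
}

// Step 3: parse the XML and return the text of every <title> element.
function getTitles(xml) {
  var nodes = xml.getElementsByTagName("title");
  var titles = [];
  for (var i = 0; i < nodes.length; i++) {
    titles.push(nodes[i].textContent);
  }
  return titles;
}
```

In a browser you might call `loadXml` with the URL of your own service (step 1) and, in the success callback, pass the result of `getTitles` to whatever code updates the page.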

The AJAX name stuck and the term was used and abused by developers and marketing-types alike. Today, the uppercase acronym has evolved into the term “Ajax”, a name for any technique where data is sent between the browser and the server without requiring a full page reload. The reasons:

  1. It’s not essential to use asynchronous methods (although it’s usually desirable).
  2. You don’t necessarily require JavaScript.
  3. You certainly don’t need XML.

Ultimately, whatever technology or technique you’re using, you must still pass data between two devices. This is the first part in a series of three articles which discuss various formatting options and their pros and cons.

XML

In the beginning, XML was the logical choice. Few other data exchange formats had been formalized and most languages provided libraries for creating, validating and parsing XML data. Even if your language didn’t directly support XML, it’s essentially plain text:


<?xml version="1.0"?>
<products>
	<book>
		<title>The Principles of Beautiful Web Design, 2nd Edition</title>
		<url>http://www.sitepoint.com/books/design2/</url>
		<author>Jason Beaird</author>
		<publisher>SitePoint</publisher>
		<price currency="USD">39.95</price>
	</book>
	<book>
		<title>jQuery: Novice to Ninja</title>
		<url>http://www.sitepoint.com/books/jquery1/</url>
		<author>Earle Castledine &amp; Craig Sharkie</author>
		<publisher>SitePoint</publisher>
		<price currency="USD">29.95</price>
	</book>
	<book>
		<title>Build Your Own Database Driven Website</title>
		<url>http://www.sitepoint.com/books/phpmysql4/</url>
		<author>Kevin Yank</author>
		<publisher>SitePoint</publisher>
		<price currency="USD">39.95</price>
	</book>
</products>
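
Because XML is essentially plain text, a document like the one above can be produced with simple string concatenation, provided reserved characters are escaped. The following is a minimal sketch; the `escapeXml` and `bookToXml` helper names and the shape of the `book` object are assumptions made for illustration, not part of any library.

```javascript
// Replace the five characters that have special meaning in XML.
// The ampersand is handled first so later entities are not double-escaped.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build one <book> element as a string (hypothetical helper).
function bookToXml(book) {
  return "<book>" +
    "<title>" + escapeXml(book.title) + "</title>" +
    "<author>" + escapeXml(book.author) + "</author>" +
    '<price currency="' + escapeXml(book.currency) + '">' +
    escapeXml(book.price) + "</price>" +
    "</book>";
}
```

Escaping is what turns the `&` in an author list into `&amp;`, as in the second book above.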

The benefits of XML include:

  • XML can be read by humans and is easier to understand than some other formats (assuming you use understandable tags). In my previous series of articles, How to Create Your Own Twitter Widget in PHP, I used the XML feed for reference even though the application didn’t use it.
  • Most languages provide excellent support for XML including, crucially, JavaScript.
  • XML offers reasonable security. Data must be extracted and parsed so it’s not easy to send a malicious payload.

Unfortunately, there are several disadvantages when using XML:

  • There won’t always be an industry-approved XML format (schema) for the data you’re publishing. You may be able to adapt a format such as RSS but, even then, the JavaScript client must be programmed to understand it.
  • XML can be verbose, with a low data-to-structure ratio. Ideally, an Ajax response should be small in order to minimize bandwidth and lessen the burden on the browser.
  • XML can be a little ambiguous. Should an item of data be a new element or an attribute for an existing one? You can reduce an XML document size by choosing attributes, but that’s not necessarily a good reason to adopt them.
  • XML parsing in JavaScript is tedious. XPath support is patchy at best, so you typically have to walk the DOM to extract each item of data and convert the string to a real value before it can be used, e.g.

// grab the value in the first <data> element (null if it is missing or empty)
var xml = xhr.responseXML;
var nodes = xml ? xml.getElementsByTagName("data") : [];
var data = (nodes.length > 0 && nodes[0].firstChild ? nodes[0].firstChild.nodeValue : null);
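
Applying the same approach to the product feed shown earlier illustrates the conversion work: every value arrives as a string and must be converted before it can be used. This sketch assumes the response has the `<products>` structure from the XML example; `readBooks` is a hypothetical helper name.

```javascript
// Convert every <book> element into a plain object with typed values.
function readBooks(xml) {
  var books = [];
  var nodes = xml.getElementsByTagName("book");
  for (var i = 0; i < nodes.length; i++) {
    var title = nodes[i].getElementsByTagName("title")[0];
    var price = nodes[i].getElementsByTagName("price")[0];
    books.push({
      title: title ? title.textContent : null,
      // the price arrives as text such as "39.95", so convert it to a number
      price: price ? parseFloat(price.textContent) : null,
      // the currency is an attribute rather than a child element
      currency: price ? price.getAttribute("currency") : null
    });
  }
  return books;
}
```

Each field needs its own lookup and its own missing-value check, which is the tedium described above.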

Many developers consider XML to be all but dead. I disagree. It may not be the best choice for Ajax clients, but you won’t always know how a web service will be consumed. XML’s ubiquity makes it a great choice — don’t ignore it.

SOAP

SOAP is a standardized format for web service data exchange. The full technical details run to hundreds of pages but, ultimately, SOAP relies on well-defined XML schemas.

Few developers use SOAP directly (the smell gives them away!). The beauty of SOAP is that client libraries automatically parse the XML response into native objects. For example, .NET developers can create SOAP-based web services and clients with very little effort. As far as the developer is concerned, they’re simply instantiating a C# object; it doesn’t matter that it was created on a remote machine.

Unfortunately, SOAP’s XML roots show:

  • SOAP is even more verbose than a typical XML response.
  • Parsing SOAP messages in JavaScript remains difficult. It is possible and SOAP libraries can help, but it’s a lot of effort for the coder and the browser.
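
To give a sense of the overhead, the sketch below wraps a single value in a SOAP 1.1 envelope. The `GetBookPrice` operation, its `urn:example:books` namespace and the `soapEnvelope` helper are all hypothetical; a real service defines its own operations and namespaces.

```javascript
// Wrap an operation and its parameters in a SOAP 1.1 envelope.
// Parameter values are assumed to be XML-escaped already.
function soapEnvelope(operation, namespace, params) {
  var body = "";
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      body += "<" + name + ">" + params[name] + "</" + name + ">";
    }
  }
  return '<?xml version="1.0"?>' +
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">' +
    "<soap:Body>" +
    "<" + operation + ' xmlns="' + namespace + '">' + body +
    "</" + operation + ">" +
    "</soap:Body>" +
    "</soap:Envelope>";
}

// Hypothetical request carrying one short string.
var payload = "jQuery: Novice to Ninja";
var message = soapEnvelope("GetBookPrice", "urn:example:books", { title: payload });
```

Comparing `message.length` with `payload.length` shows how much of the message is structure rather than data.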

SOAP remains a viable choice for server-to-server communications — especially if they’re within the same network. However, it’s too unwieldy if the majority of calls are from Ajax requests.

HTML

HTML is an easy format to use if you want to insert the Ajax response directly into the page without further analysis. For example, assume you have a small shopping cart widget which appears on every page. You already have server-side code which creates that HTML, so it could be adapted to return the same HTML as an Ajax response when an item is purchased.

The benefits include:

  • It’s easy to reuse existing code and create a web service.
  • There’s no need for complex data parsing on the client.
  • The HTML can be quickly added to the page using innerHTML.

But there are disadvantages:

  • It may be difficult to extract useful data. For example, if you wanted to show the cart total elsewhere, it may not be easy to identify that value within the HTML.
  • The message is more verbose than necessary and will probably be larger than an equivalent XML message.
  • Injecting HTML into a page risks breaking the current layout.
  • Security could be an issue — the response could contain a malicious script.
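
The trade-off can be sketched as follows. Inserting the response takes one line, but any text that did not come from your own trusted server code should be escaped first so it is displayed literally instead of being parsed as markup or script. The `updateCart` and `escapeHtml` helper names are assumptions made for illustration.

```javascript
// Insert an HTML fragment returned by the server into a container element.
function updateCart(container, html) {
  container.innerHTML = html;
}

// Escape text so it is displayed literally rather than parsed as HTML.
// The ampersand is handled first so later entities are not double-escaped.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

In a browser, `updateCart` would be passed the cart widget's element and the `responseText` of the Ajax request. Escaping only suits values you want shown as text; it is not a substitute for trusting the source of a whole HTML fragment.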

In my next post, we’ll discuss a couple of data formats which are far more efficient for Ajax use: JSON and JSONP.


  • ricktheartist

    Great write-up and agree with all your points. I have always found xml difficult to deal with, but it is getting better, especially server-side. I love using JSON so I am looking forward to your next post and I am curious about JSONP.

  • Just a Joe

    JSON: Pass the object and Parse it, this is how I do it now

    You can have arrays help you organize the data and then use eval to take care of the data result with javascript

  • BrenFM

    Attention! Broken link alert!

    Just been trying to get to part two as featured in the Sitepoint email newsletter (http://blogs.sitepoint.com/2011/02/09/ajax-data-formats-json-jsonp/)… and the link is broken. Also can’t find the content anywhere on the site. I assume this means it’s not yet published? Dead keen to read more, Craig!

  • Tejas Patel

    Agree with your view. I always end up using HTML as its easy to use. Will be waiting for your next post as more curious with JSON and JSONP

  • http://www.timbaxter.co.uk that_tim_fella

    Interesting. Look forward to the next articles.