OSCON 2006: Cross-site Ajax

This week, Kevin Yank is reporting from OSCON 2006 in Portland, OR.

Paralyzed by indecision (at any given time at OSCON, there are literally three different sessions I would consider “must-see” going on), I went to Plaxo developer Joseph Smarr’s Cross-site Ajax: Challenges and Techniques for Building Rich Web 2.0 Mashups, mainly because I am giving a talk on the same subject at Web Directions in September. Slides will be available on the Plaxo blog.

Mashups, if you’ve been living under a rock, are web applications built by combining services provided by several specialized web applications, typically using AJAX as the glue. One of the main challenges faced by developers of mashups is the same-origin policy, which prevents JavaScript on one site from contacting other sites as a security measure. For mashups to really work, developers need to find a way around that restriction.

The story so far…

A number of well-understood solutions to this problem exist, but each has drawbacks of its own. Because the same-origin policy only affects the browser, the most obvious approach is a server-side proxy: your JavaScript contacts your own web application, which then makes the request to the service in the other domain. Unfortunately, this makes your server the bottleneck, rather than allowing each client to contact the service directly.
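To make the proxy idea concrete, here is a minimal sketch of the server-side half. The /proxy/ path prefix, the UPSTREAMS table, and the api.example.com origin are all made up for illustration, and the HTTP client is passed in as fetchImpl so the sketch does not assume any particular server framework.

```javascript
// Hypothetical map of short service names to the remote origins
// this proxy is willing to contact.
const UPSTREAMS = {
  contacts: "https://api.example.com",
};

// Translate a same-origin path such as "/proxy/contacts/list?q=a"
// into the cross-domain URL the server should fetch on the page's behalf.
// Returns null for paths that do not name a known service.
function resolveUpstream(path) {
  const match = /^\/proxy\/([^\/?]+)(\/[^?]*)?(\?.*)?$/.exec(path);
  if (!match || !Object.prototype.hasOwnProperty.call(UPSTREAMS, match[1])) {
    return null;
  }
  return UPSTREAMS[match[1]] + (match[2] || "/") + (match[3] || "");
}

// Server-side handler: the browser only ever talks to our own origin,
// and the server relays the request to the other domain.
// fetchImpl is any fetch-compatible function (an assumption of this sketch).
async function handleProxy(path, fetchImpl) {
  const target = resolveUpstream(path);
  if (target === null) {
    return { status: 404, body: "Unknown service" };
  }
  const upstream = await fetchImpl(target);
  return { status: upstream.status, body: await upstream.text() };
}
```

Every request from every client passes through handleProxy, which is exactly the bottleneck described above.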

Another option is to use a Flash movie to act as the proxy. Flash has a similar same-origin policy about the requests it makes, but that policy can be suppressed if the destination service provides a crossdomain.xml file on its domain. If Flash sees this file, and it contains a listing for the domain wanting to make the request, the cross-domain request is allowed. Of course, Flash is a proprietary, non-standards-based technology, which may not be an option for you.

A more intricate solution, called JSON-P, dynamically adds a script element to the page in order to make the request. <script> tags aren’t subject to the same-origin policy, so the src attribute can point to another domain and the request will go through without complaint.

Since the browser expects to receive and execute JavaScript code from this request, that’s exactly what a service called in this manner needs to return. For that code to do something meaningful in your application, the usual approach is to pass the name of a callback function from your own code in the query string of the request; the service then wraps its response data in a call to that function.
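A rough sketch of the client side of JSON-P might look like the following. The callback query parameter name and the generated jsonpCallbackN function names are assumptions (each service picks its own convention), and doc and globalObj stand in for the browser's document and window so the function can be exercised outside a browser.

```javascript
let callbackCounter = 0;

// Append the callback name to the service URL's query string.
function buildJsonpUrl(serviceUrl, callbackName) {
  const separator = serviceUrl.indexOf("?") === -1 ? "?" : "&";
  return serviceUrl + separator + "callback=" + encodeURIComponent(callbackName);
}

// Make a cross-domain request by injecting a <script> element.
// In a browser, pass `document` as doc and `window` as globalObj.
function jsonpRequest(serviceUrl, onData, doc, globalObj) {
  const callbackName = "jsonpCallback" + (callbackCounter++);
  const script = doc.createElement("script");

  // The service's response is expected to call this global function.
  globalObj[callbackName] = function (data) {
    // Clean up the one-shot callback and the script element.
    delete globalObj[callbackName];
    if (script.parentNode) {
      script.parentNode.removeChild(script);
    }
    onData(data);
  };

  script.src = buildJsonpUrl(serviceUrl, callbackName);
  doc.getElementsByTagName("head")[0].appendChild(script);
  return callbackName;
}
```

A service that honors the callback parameter would then respond with JavaScript along the lines of jsonpCallback0({"emails": ["someone@example.com"]}), which the browser executes as soon as the script loads. Note that nothing here fires if the request fails, and whatever code the service returns runs with full access to your page.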

Problems with JSON-P as it currently exists include a lack of error handling, the requirement that your request be made via HTTP GET, and the possibility that the service might return malicious code that could execute within your application (for example, stealing security-sensitive cookie values).

What’s New?

Plaxo wanted to build a system that would allow you to pick email addresses out of your Gmail address book and have them inserted into a text field on a third-party site, simply by clicking a button on that third-party site (and providing your Gmail login credentials the first time). That would require the third-party site to have cross-domain access to Gmail, and none of the above solutions would do the trick.

“The JavaScript Wormhole” is what Plaxo calls the solution it has developed, and although this technique is ingenious, it’s also rather twisty, so much so that it’s difficult to convey in words—but I’ll give it a shot:

The provider of a service such as Plaxo’s address book selector provides an HTML file that must be added to any site that consumes the service. This file provides the callback mechanism for the service to return data to the requesting site.

When the service is invoked (e.g. the browser opens the address book selector from Plaxo in a pop-up window), and the user finishes interacting with it (e.g. the user selects one or more contacts and clicks “OK”), the service’s pop-up window loads the callback HTML file from the requesting site into an iframe. The callback HTML file, in turn, contains a <script> tag that requests the results of the service invocation (e.g. the contacts that the user selected) from Plaxo using the JSON-P technique discussed above. Because the HTML file is hosted on the requesting site, the JavaScript returned in the response is able to write those results into the original page on that site (e.g. insert the selected email addresses into the appropriate form field).
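The talk did not spell out the DOM plumbing, so the following is only my guess at how the last hop could work, not Plaxo's actual code. It assumes the callback file's iframe sits directly inside the pop-up, that the pop-up was opened by the original page (so the page is reachable as parent.opener), and that the target field is identified by a hypothetical fieldId.

```javascript
// Runs inside the callback HTML file, which is served from the requesting
// site but loaded in an iframe within the service's pop-up window.
// Assumption: frameWindow.parent is the pop-up, and the pop-up's opener
// is the original page on the requesting site.
function findOriginalPage(frameWindow) {
  const popup = frameWindow.parent;
  return popup && popup.opener ? popup.opener : null;
}

// Intended to be called from the JSON-P response with the selected contacts.
// Because the callback file and the original page share an origin,
// the same-origin policy allows this write.
function insertContacts(frameWindow, fieldId, emails) {
  const page = findOriginalPage(frameWindow);
  if (page === null) {
    return false;
  }
  const field = page.document.getElementById(fieldId);
  if (!field) {
    return false;
  }
  field.value = emails.join(", ");
  return true;
}
```

In the callback file itself, the JSON-P callback would call something like insertContacts(window, "recipients", emails), where "recipients" is again a made-up field id.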

Smarr indicated that Plaxo was putting some work into creating a “Generalized JavaScript Wormhole” standard that other services could easily adopt, eliminating the need for every service to provide its own callback HTML file for site owners to install. This solution would include the ability for site owners to provide a list of trusted domains whose services would be able to use the callback to update the site’s pages (e.g. insert email addresses into a form) upon request.
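No format for that trusted-domain list was presented, so purely as an illustration of the idea, a site owner's check could be as simple as the sketch below. The isTrustedHost name and the list contents are hypothetical.

```javascript
// Return true if hostname is one of the trusted domains or a subdomain of one.
// Matching on a leading "." avoids treating "evilplaxo.com" as "plaxo.com".
function isTrustedHost(hostname, trustedDomains) {
  const host = hostname.toLowerCase();
  return trustedDomains.some(function (domain) {
    return host === domain || host.endsWith("." + domain);
  });
}

// Hypothetical list a site owner might maintain.
const TRUSTED_DOMAINS = ["plaxo.com"];
```

The callback would then refuse to touch the page unless isTrustedHost reports that the requesting service is on the list.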

The JavaScript Wormhole is no panacea—it suffers from the same drawbacks as JSON-P, since it is built around that technique—but it’s certainly the slickest (if not the only) solution for services that need to modify a page of the requesting site following some user interaction with the service.

Where to from here?

If we think about the future now, maybe we can get a platform that will not require us to “break into our own house” to build mashups. Smarr believes (and I wholeheartedly agree) that developers should be discussing these issues now to shape the platforms of the future so that they support the needs of mashup developers without compromising the strict—yet invisible to the user—security model that is in place today.

There are a number of proposed solutions already on the table. One possibility would be to set up a standard way to define trust relationships between sites. Flash does it with crossdomain.xml, web services do it with certificates and IP whitelists, and the generalized JavaScript wormhole solution does it too. But would this quash the freedom of mashups, which currently allow developers to dream up an idea and “just do it”? Formalized trust relationships will breed service contracts and licensing agreements, which will kill the spirit of mashups.

Other ideas on the table include proposed browser enhancements such as Chris Holland’s ContextAgnosticXmlHttpRequest, which unlike the current XMLHttpRequest object would permit cross-domain requests, but would not send any cookie or HTTP authentication headers, and might only connect to servers that send the X-Allow-Foreign-Hosts header in their HTTP responses. Douglas Crockford’s proposed JSONRequest would work under similar restrictions, but would be specific to JSON-formatted services. But the restrictions proposed with these solutions might be going too far.

Whichever solution you prefer, the real message is that the future of AJAX in general and mashups specifically is wide open, and you need to get involved in the process of defining its evolution. There are no easy answers yet, and these problems aren’t going away anytime soon. But perhaps most importantly, the browser developers are listening more than ever, and we need to tell them what we, as developers, want.
