WebSockets: Stable and Ready for Developers

WebSockets are stable and ready for developers to start creating innovative applications and services. This tutorial provides a simple introduction to the W3C WebSocket API and its underlying WebSocket protocol. The updated Flipbook demo uses the latest version of the API and protocol.

Working groups have made significant progress and the WebSocket API is a W3C Candidate Recommendation. Internet Explorer 10 implements this version of the spec. You can learn about the evolution of the spec here.

WebSockets enable Web applications to deliver real-time notifications and updates in the browser. Developers have faced problems in working around the limitations in the browser’s original HTTP request-response model, which was not designed for real-time scenarios. WebSockets enable browsers to open a bidirectional, full-duplex communication channel with services. Each side can then use this channel to immediately send data to the other. Now, sites from social networking and games to financial sites can deliver better real-time scenarios, ideally using the same markup across different browsers.

Introduction to the WebSocket API Using an Echo Example

The code snippets below use a simple echo server created with ASP.NET’s System.Web.WebSockets namespace to echo back text and binary messages that are sent from the application. The application allows the user to type in text to be sent and echoed back as a text message, or draw a picture that can be sent and echoed back as a binary message.

For a more complex example that allows you to experiment with latency and performance differences between WebSockets and HTTP polling, see the Flipbook demo.

Details of Connecting to a WebSocket Server

This simple explanation is based on a direct connection between the application and the server. If a proxy is configured, then IE10 starts the process by sending an HTTP CONNECT request to the proxy.

When a WebSocket object is created, a handshake is exchanged between the client and the server to establish the WebSocket connection.

IE10 starts the process by sending an HTTP request to the server:

GET /echo HTTP/1.1
Host: example.microsoft.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Origin: http://microsoft.com
Sec-WebSocket-Version: 13

Let’s look at each part of this request. The connection process starts with a standard HTTP GET request which allows the request to traverse firewalls, proxies, and other intermediaries:

GET /echo HTTP/1.1
Host: example.microsoft.com

The HTTP Upgrade header requests that the server switch the application-layer protocol from HTTP to the WebSocket protocol.

Upgrade: websocket
Connection: Upgrade

The server transforms the value in the Sec-WebSocket-Key header in its response to demonstrate that it understands the WebSocket protocol:

Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==

The Origin header is set by IE10 to allow the server to enforce origin-based security.

Origin: http://microsoft.com

The Sec-WebSocket-Version header identifies the requested protocol version. Version 13 is the final version in the IETF proposed standard:

Sec-WebSocket-Version: 13
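To make the header checks concrete, here is a minimal sketch of how a server might validate the upgrade request described above. It is not the ASP.NET echo server's code: the function name and the shape of `headers` (a plain object with lower-cased header names) are assumptions for illustration, and `Buffer` assumes a Node.js runtime.

```javascript
// Sketch only: checks the request headers walked through above.
// `headers` is a hypothetical plain object keyed by lower-cased header name.
function isWebSocketUpgradeRequest(method, headers) {
	if (method !== "GET") {
		return false;
	}
	var upgrade = (headers["upgrade"] || "").toLowerCase();
	// Connection may carry several comma-separated tokens, e.g. "keep-alive, Upgrade"
	var connectionTokens = (headers["connection"] || "")
		.toLowerCase()
		.split(",")
		.map(function (token) { return token.trim(); });
	// The key is a base64-encoded nonce that decodes to 16 bytes
	var keyBytes = Buffer.from(headers["sec-websocket-key"] || "", "base64");
	return upgrade === "websocket" &&
		connectionTokens.indexOf("upgrade") !== -1 &&
		headers["sec-websocket-version"] === "13" &&
		keyBytes.length === 16;
}
```

A real server would also inspect the Origin header at this point if it enforces origin-based security.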

If the server accepts the request to upgrade the application-layer protocol, it returns an HTTP 101 Switching Protocols response:

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade

To demonstrate that it understands the WebSocket Protocol, the server performs a standardized transformation on the Sec-WebSocket-Key from the client request and returns the results in the Sec-WebSocket-Accept header:

Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

IE10 then checks that Sec-WebSocket-Accept matches the value it expects for the Sec-WebSocket-Key it sent, to validate that the server is a WebSocket server and not an HTTP server with delusions of grandeur.
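The "standardized transformation" is defined in RFC 6455: the server appends a fixed GUID to the key, hashes the result with SHA-1, and base64-encodes the digest. The sketch below shows that calculation; it assumes a Node.js runtime for the `crypto` module, and the function name is made up for this example.

```javascript
var crypto = require("crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake
var WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Derives the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key
function computeAcceptValue(secWebSocketKey) {
	return crypto
		.createHash("sha1")
		.update(secWebSocketKey + WEBSOCKET_GUID)
		.digest("base64");
}
```

Applied to the key from the sample request, `computeAcceptValue("dGhlIHNhbXBsZSBub25jZQ==")` yields the `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=` value shown in the response.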

The client handshake established an HTTP-on-TCP connection between IE10 and the server. After the server returns its 101 response, the application-layer protocol switches from HTTP to WebSockets, which uses the previously established TCP connection.

HTTP is completely out of the picture at this point. Using the lightweight WebSocket wire protocol, messages can now be sent or received by either endpoint at any time.

Programming Connecting to a WebSocket Server

The WebSocket protocol defines two new URI schemes which are similar to the HTTP schemes.

  • "ws:" "//" host [ ":" port ] path [ "?" query ] is modeled on the "http:" scheme. Its default port is 80. It is used for unsecure (unencrypted) connections.
  • "wss:" "//" host [ ":" port ] path [ "?" query ] is modeled on the "https:" scheme. Its default port is 443. It is used for secure connections tunneled over Transport Layer Security.

When proxies or network intermediaries are present, there is a higher probability that secure connections will be successful, as intermediaries are less inclined to attempt to transform secure traffic.

The following code snippet establishes a WebSocket connection:

var host = "ws://example.microsoft.com";
var socket = new WebSocket(host);
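Because the two schemes mirror "http:" and "https:", an application can pick the matching one from the page it is running in. The helper below is a hypothetical sketch, not part of the WebSocket API; it takes any object with `protocol` and `host` properties, such as `window.location`.

```javascript
// Hypothetical helper: choose "wss:" for pages served over HTTPS, "ws:" otherwise.
// `pageLocation` is any object with `protocol` and `host`, e.g. window.location.
function toWebSocketUrl(pageLocation, path) {
	var scheme = pageLocation.protocol === "https:" ? "wss:" : "ws:";
	// `host` already includes the port when it is not the default one
	return scheme + "//" + pageLocation.host + path;
}
```

In a browser this could be used as `new WebSocket(toWebSocketUrl(window.location, "/echo"))`, assuming the WebSocket service is hosted alongside the page.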

ReadyState – Ready … Set … Go …

The WebSocket.readyState attribute represents the state of the connection: CONNECTING, OPEN, CLOSING, or CLOSED. When the WebSocket is first created, the readyState is set to CONNECTING. When the connection is established, the readyState is set to OPEN. If the connection fails to be established, then the readyState is set to CLOSED.
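The readyState attribute is a number; the API defines the constants CONNECTING (0), OPEN (1), CLOSING (2), and CLOSED (3). For status displays or logging, a small lookup can turn it into a readable name. The helper name below is an assumption for illustration.

```javascript
// Names indexed by the numeric readyState values defined by the WebSocket API
var READY_STATE_NAMES = ["CONNECTING", "OPEN", "CLOSING", "CLOSED"];

// Returns a readable name for the socket's current state
function describeReadyState(socket) {
	return READY_STATE_NAMES[socket.readyState] || "UNKNOWN";
}
```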

Registering for Open Events

To receive notifications when the connection has been created, the application must register for open events.

socket.onopen = function (openEvent) {
	document.getElementById("serverStatus").innerHTML = 'Web Socket State::' + 'OPEN';
};

Details Behind Sending and Receiving Messages

After a successful handshake, the application and the WebSocket server may exchange WebSocket messages. A message is composed of a sequence of one or more message fragments or data “frames.”

Each frame includes information such as:

  • Frame length
  • Type of message (binary or text) in the first frame in the message
  • A flag (FIN) indicating whether this is the last frame in the message

IE10 reassembles the frames into a complete message before passing it to the script.
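The browser handles framing for you, but seeing how the fields listed above sit in the first bytes of a frame can help when reading a network trace. The sketch below follows the frame layout in RFC 6455; the function name and the returned object are assumptions for illustration, and extension (RSV) bits are ignored.

```javascript
// Sketch: reads the header fields of one WebSocket frame.
// `bytes` is an array or Uint8Array holding at least the full header.
function parseFrameHeader(bytes) {
	var fin = (bytes[0] & 0x80) !== 0;    // last frame in the message?
	var opcode = bytes[0] & 0x0f;         // 0 = continuation, 1 = text, 2 = binary, 8 = close
	var masked = (bytes[1] & 0x80) !== 0; // frames sent by a client are masked
	var payloadLength = bytes[1] & 0x7f;
	var offset = 2;
	if (payloadLength === 126) {
		// The next 2 bytes hold the length (big-endian)
		payloadLength = (bytes[2] << 8) | bytes[3];
		offset = 4;
	} else if (payloadLength === 127) {
		// The next 8 bytes hold the length; exact only up to 2^53 in a Number
		payloadLength = 0;
		for (var i = 0; i < 8; i++) {
			payloadLength = payloadLength * 256 + bytes[2 + i];
		}
		offset = 10;
	}
	return {
		fin: fin,
		opcode: opcode,
		masked: masked,
		payloadLength: payloadLength,
		headerLength: offset + (masked ? 4 : 0) // a 4-byte masking key follows when masked
	};
}
```

For example, the bytes `0x81 0x05` start a single-frame, unmasked text message with a 5-byte payload.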

Programming Sending and Receiving Messages

The send API allows applications to send messages to a WebSocket server as UTF-8 text, ArrayBuffers, or Blobs.

For example, this snippet retrieves the text entered by the user and sends it to the server as a UTF-8 text message to be echoed back. It first verifies that the WebSocket is in the OPEN readyState:

function sendTextMessage() {
	// Only send once the connection is open
	if (socket.readyState !== WebSocket.OPEN)
		return;
	var e = document.getElementById("textmessage");
	socket.send(e.value);
}

This snippet retrieves the image drawn by the user in a canvas and sends it to the server as a binary message:

function sendBinaryMessage() {
	// Only send once the connection is open
	if (socket.readyState !== WebSocket.OPEN)
		return;
	var sourceCanvas = document.getElementById('source');
	// msToBlob (Microsoft-prefixed) returns a blob object from a canvas image or drawing
	socket.send(sourceCanvas.msToBlob());
	// ...
}

Registering for Message Events

To receive messages, the application must register for message events. The event handler receives a MessageEvent which contains the data in MessageEvent.data. Data can be received as text or binary messages.

When a binary message is received, the WebSocket.binaryType attribute controls whether the message data is returned as a Blob or an ArrayBuffer datatype. The attribute can be set to either “blob” or “arraybuffer.” The examples below use the default value which is “blob.”

This snippet receives the echoed image or text from the WebSocket server. If the data is a Blob, then an image was returned and is drawn in the destination canvas; otherwise, a UTF-8 text message was returned and is displayed in a text field.

socket.onmessage = function (messageEvent) {
	if (messageEvent.data instanceof Blob) {
		// Binary message: draw the echoed image on the destination canvas
		var destinationCanvas = document.getElementById('destination');
		var destinationContext = destinationCanvas.getContext('2d');
		var image = new Image();
		image.onload = function () {
			destinationContext.clearRect(0, 0, destinationCanvas.width, destinationCanvas.height);
			destinationContext.drawImage(image, 0, 0);
			// Release the object URL once the image has been drawn
			URL.revokeObjectURL(image.src);
		};
		image.src = URL.createObjectURL(messageEvent.data);
	} else {
		// Text message: show the echoed text
		document.getElementById("textresponse").value = messageEvent.data;
	}
};
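If the application sets `socket.binaryType = "arraybuffer"` instead of relying on the default, binary messages arrive as ArrayBuffers, so a handler may need to distinguish three cases. The classifier below is a small sketch with a made-up name; it guards the Blob check so that it also runs where Blob is not defined.

```javascript
// Sketch: reports which kind of payload a message event carried
function describeMessageData(data) {
	if (typeof data === "string") {
		return "text";
	}
	if (data instanceof ArrayBuffer) {
		return "arraybuffer (" + data.byteLength + " bytes)";
	}
	if (typeof Blob !== "undefined" && data instanceof Blob) {
		return "blob (" + data.size + " bytes)";
	}
	return "unknown";
}
```

Inside a message handler this would be called as `describeMessageData(messageEvent.data)`.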

Details of Closing a WebSocket Connection

Similar to the opening handshake, there is a closing handshake. Either endpoint (the application or the server) can initiate this handshake.
A special kind of frame – a close frame – is sent to the other endpoint. The close frame may contain an optional status code and reason for the close. The protocol defines a set of appropriate values for the status code. The sender of the close frame must not send further application data after the close frame.
When the other endpoint receives the close frame, it responds with its own close frame. It may send pending messages before doing so.
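On the wire, the optional status code and reason travel in the close frame's payload: per RFC 6455, a 2-byte big-endian status code followed by the reason as UTF-8 text. The sketch below decodes such a payload; the function name is an assumption, and `TextDecoder` assumes a runtime that provides it.

```javascript
// Sketch: decodes the payload of a close frame.
// `payload` is a Uint8Array containing only the frame's payload bytes.
function parseClosePayload(payload) {
	if (payload.length < 2) {
		// No status code was sent
		return { code: null, reason: "" };
	}
	var code = (payload[0] << 8) | payload[1];
	var reason = new TextDecoder("utf-8").decode(payload.subarray(2));
	return { code: code, reason: reason };
}
```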

Programming Closing a WebSocket and Registering for Close Events

The application initiates the close handshake on an open connection with the close API:

socket.close(1000, "normal close");

To receive notifications when the connection has been closed, the application must register for close events.

socket.onclose = function (closeEvent) {
	document.getElementById("serverStatus").innerHTML = 'Web Socket State::' + 'CLOSED';
};

The close API accepts two optional parameters: a status code as defined by the protocol and a description. The status code must be either 1000 or in the range 3000 to 4999. When close is executed, the readyState attribute is set to CLOSING. After IE10 receives the close response from the server, the readyState attribute is set to CLOSED and a close event is fired.
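Calling close with an out-of-range code throws an exception, so an application that builds these arguments dynamically may want to check them first. The sketch below encodes the rule stated above, plus the API specification's limit of 123 bytes for the UTF-8 encoded reason. The function name and return strings are made up for illustration, and `TextEncoder` assumes a runtime that provides it.

```javascript
// Sketch: checks arguments before passing them to socket.close(code, reason)
function checkCloseArguments(code, reason) {
	var codeIsValid = code === 1000 || (code >= 3000 && code <= 4999);
	if (code !== undefined && !codeIsValid) {
		return "invalid code";
	}
	// The reason is limited to 123 bytes once encoded as UTF-8
	if (reason !== undefined && new TextEncoder().encode(reason).length > 123) {
		return "reason too long";
	}
	return "ok";
}
```

For the call shown earlier, `checkCloseArguments(1000, "normal close")` returns "ok".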

Using Fiddler to See WebSockets Traffic

Fiddler is a popular HTTP debugging proxy. The latest versions include some support for the WebSocket protocol, so you can inspect the headers exchanged in the WebSocket handshake.

All the WebSocket messages are also logged. For example, when “spiral” is sent to the server as a UTF-8 text message and echoed back, both messages appear in the log.

Conclusion

If you want to learn more about WebSockets, you may watch these sessions from the Microsoft //Build/ conference from September 2011:

If you’re curious about using Microsoft technologies to create a WebSocket service, these posts are good introductions:

Start developing with WebSockets today!

  • http://www.leggetter.co.uk Phil Leggetter

    Great to see WebSockets covered here in a very comprehensive way.

    Realtime bi-directional functionality has been achievable for quite some time via the Comet paradigm and using HTTP Long-Polling or HTTP Streaming for server -> client connection, with the addition of second HTTP short-lived connections for client -> server communication. It’s great that we’ve finally now got a full duplex bi-directional method of doing this.

    My personal opinion about how WebSockets are going to be used is that they won’t be directly used too often by developers. How often do web developers directly use `document.getElementById`? Even if they write their own code that uses this function they’ll frequently wrap that function to do a bit more. Many of us auto-include jQuery in our apps, and use `$('#someId').doStuff().doMoreStuff().wow()`.

    My point is that as developers we’ll end up using a library that ‘under the hood’ uses WebSockets and, whilst it is important to know how they work, I don’t think it’ll be something we worry about all that often.

    WebSockets offer:

    * Connect via `new WebSocket`
    * `onopen`
    * `onclose`
    * `onerror`
    * `onmessage`
    * `ws.send('data');`
    * `ws.close(code, reason)`

    But it’s unlikely that this alone will be enough for use in an application. Like `document.getElementById` we generally need more.

    The ‘more’ in the majority of (all?) Comet servers (note: Comet servers also use WebSockets) for the past 10 years has been a layer of connection handling – auto reconnection is an absolute minimum – and PubSub (http://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern) to manage requesting and receiving data. At Pusher (http://pusher.com – where I work) we provide channels (http://pusher.com/docs/channels) as a way of filtering data, e.g. subscribing to #tech tweets on Twitter. We also expose a way of binding to events on channels, e.g. `new_tweet`, `tweet_updated` and `tweet_deleted`. When the event is triggered the associated data is also passed.

    So, with any solution that uses WebSockets we’re likely to see that layer of abstraction which offers the real minimum requirement of functionality that a developer needs. WebSockets offer that first layer of amazing long-awaited functionality, but most of the time we’ll need that additional layer to help us build truly amazing interactive and engaging realtime web applications.

  • Patrick

    Ready for developers? Um, not really. Support is limited to IE10, FF11+ and Chrome 16+. Nothing for Safari or Opera. I’m guessing that no mobile browser supports them. So yeah, if you have an up-to-date browser and are developing something for personal use, or if you want to showcase the technology, you can use websockets today. For any application developed for consumption by the general public, we’ll be waiting at least a few years.

    Great technology, but like so many new web innovations, its main purpose right now is just to tease developers and show them how much better things could be.

    • http://hello10.com Stephen
    • http://www.leggetter.co.uk Phil Leggetter

      As Stephen’s link shows support for WebSockets is actually:

      * Firefox
      * Chrome
      * Safari
      * Safari mobile
      * Firefox mobile
      * Chrome mobile (will become the default browser for Android)
      * Opera (but support is disabled by default)
      * Opera mobile
      * IE10

      Also, by using the Flash Socket fallback, which adds a WebSocket object to the JavaScript runtime, support catches much older browsers as long as Flash is installed.

      WebSockets are actively used in many websites and mobile web apps in production today e.g. SlideShare, Mailchimp, UserVoice, Gauges, ITV News, Nurph, Blether.co, CloudApp to name just a few.

  • Frank

    I’m curious if it is possible to use sockets to stream audio and/or video? Or do we need to wait for the elusive “device” element?

  • Coco

    Thank you for the great article. I have to say, though, I’ve only heard about WebSockets very briefly in the past so I am still trying to absorb the whole concept. A few questions if I may:

    1) You say that both text and binary content can be sent and received with the WebSocket. How do you handle security on the server side? What would stop someone from writing their own client to connect to your WebSocket server and uploading some sort of rogue executable, for example? Can you incorporate traditional methods of web based password protection? Is there a way to validate (on the server side) what is being sent down the WebSocket connection? Are these concerns even relevant?

    2) Does the WebSocket connection pass through a web server (like Apache or IIS) or is the WebSocket server a standalone beastie on its own port? Or does it pass through a web server only for the first part where it does the HTTP GET request?

    Thanks again.

    • http://www.leggetter.co.uk Phil Leggetter

      1) You can do this in a few ways. At Pusher the URL that the WebSocket connects to has an application key which initially identifies that the connection is at least valid.

      Once the connection is established you can add your own authentication as you require by checking the messages that are sent from client to server. Part of that could be an initial username/password message. Messages could all be signed for authenticity.

      The executable example isn’t very likely unless you take the data sent over the WebSocket connection, create a file out of it and then run it (or similar). So, you have the same control over the WebSocket connection as you do an HTTP connection.

      2) IIS doesn’t natively support WebSockets yet. We’ll be waiting for Windows Server 8 for that. Apache has a module for WebSocket support. You can also get standalone WebSocket servers.

      There are a whole bunch of options available see: http://www.leggetter.co.uk/real-time-web-technologies-guide

  • Litost

    Nice article! I am still weighing arguments about websockets, though. Should I go like “So now, we’re back to where we were 15 years ago – yet with less powerful graphical controls”, or like “Deployed as served, runs almost everywhere”…? Kind of reminds me of Java applets. Anyway, I am not sure that we’ll have lots of relevant use cases for sockets in HTML, but I guess it’s good to see some innovation. Cheers.

    • http://www.leggetter.co.uk Phil Leggetter

      Java applets were the first way that realtime server push could be achieved. Then we moved to native solutions such as a hidden IFRAME then solutions that used the XMLHttpRequest object. WebSockets are a native browser solution that solves the ‘hacked’ HTTP-based solutions.

      The use cases are the same that the previous solutions were used for in addition to new innovative uses:

      * Realtime data – simple updates such as share prices or sports data
      * Notifications – Usually used along with Create, Update and Delete server/database events. e.g. a new news item is available or a new comment being added to a blog post.
      * Chat – interaction between multiple users. I’m sure we’ve all seen chat before. But WebSockets are a perfect solution to this feature.
      * Collaborative applications – e.g. Google docs where multiple users are editing the same document.
      * 2nd Screen experiences – applications which provide realtime information that complements or adds additional value to something you are watching on your main screen (your TV).
      * Multiplayer games – we’ve all played multiplayer games but WebSockets make building multiplayer HTML5 games easier.
      * Cool Arduino stuff – think Internet of Things. See: http://twitter.com/ninjablocks and http://twitter.com/readiymate

      The latter is quite interesting as it clearly demonstrates that WebSockets aren’t just about server to browser communication. It’s server to any client – web, mobile, desktop, server or even robot/fridge/cooker/garage door/plug socket.

      Hope this clarifies some of the use cases.