Visualizing Data with D3

D3.js is a relatively new addition to the JavaScript toolbox. The three D’s stand for Data-Driven Documents. You may have heard that D3 is just another JavaScript graphing library, but that’s only partially true. D3 does produce excellent graphics, but its real value lies in its ability to respond dynamically to changes in your data.

In this article, we’ll take a quick look at D3 and focus on a few basic ideas that make D3 such an interesting approach to JavaScript-based graphics. We’ll look at enough code snippets to give you a sense of how the D3 library works.

The Basics

Many people say that the D3 learning curve is steep, but that all depends on your perspective. It can be difficult to learn the intricacies of any library, but if you’ve been down the path with jQuery, you’ve picked up a lot of the same concepts that are used in D3. And, if you’re familiar with the SVG (Scalable Vector Graphics) format, then you’re even further along on your journey.

As an example, consider this line of D3 code, and see if you can guess what it does:

d3.selectAll("p").style("color", "red");

If you guessed it does essentially the same thing as the following jQuery statement, give yourself a pat on the back!

$("p").css("color", "red");

The selectAll() function selects all of the elements that match the given selector, while the style() function applies a styling change to every element in the selection.

So where does D3 differ from jQuery? For starters, it is very good at creating elements on the fly, and not just HTML elements and CSS styles: it can also build and navigate through SVG elements. For example, the following code selects a div element with the ID test, and appends an SVG element with a specific width and height:

var testBox = d3.select("#test")
  .append("svg")
  .attr("width", 400)
  .attr("height", 150);

This code carves out a box on the page and reserves it for SVG. Notice how the commands are chained together, similar to jQuery. However, unlike jQuery, some of the chained commands in D3 return a reference to a new element, rather than the original selected element. In the previous example, the append() function creates a new SVG element and returns a reference to it. The subsequent chained attr() calls then set attributes on that new element, not on the div.

Now that you have a reference to the new SVG box, you can draw something inside it.

testBox.append("circle")
  .style("stroke", "black")
  .style("fill", "green")
  .attr("r", 50)
  .attr("cx", 100)
  .attr("cy", 75);

As you might have deduced, the previous code draws a circle with a radius of 50, centered at (100, 75) in the SVG coordinate space. The circle is drawn with a black stroke and filled with green.

D3 — It’s Data Driven!

D3 really shines when it comes to implementing data-driven graphics. Unfortunately, this is also where the difficult part starts. As a D3 programmer, you have to understand how data enters the D3 application and what happens to it once it gets there. You also have to think about how data leaves the application.

Let’s return to the testBox SVG element created above. Think of this box as a system that automatically adjusts to the data you put into it. Data works with the box using one of three mechanisms:

Data enters the box.
Data updates while it’s in the box.
Data leaves the box.

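These stages are commonly named enter, update, and exit. D3 provides enter() and exit() as functions on a data-joined selection; the update stage has no function of its own, as we’ll see later.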

Imagine the testBox above as a container that shows data in the form of circles. Each circle represents a data point, and each data point has three attributes. These three attributes could be rendered as a position on the x-axis, a position on the y-axis, and a radius. The data set could look something like this:

var bubbleChart = [[43, 54, 23], [97, 15, 14], [114, 100, 20]];

Obviously, this example is far from real-world data. To be more realistic, we would store the data in some sort of JSON structure that looks like the output of a real database. But we’ll keep it simple for this example by sticking with this three-column matrix. Later, we’ll add and remove rows from the matrix while the program runs. D3 contains some powerful mechanisms to handle your data, including the ability to load data from an external source. This is very useful when tracking dynamic values like the weather, the stock market, or earthquakes.
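As a rough illustration of the more realistic shape mentioned above, the same three bubbles could be written as an array of objects. The property names x, y, and r and the helper names getX, getY, and getR are assumptions made for this sketch; D3 does not require any particular names. The accessors are ordinary function(d) callbacks of the kind that attr() accepts.

```javascript
// Hypothetical record layout: the property names x, y, and r are
// assumptions for this sketch, not names that D3 requires.
var bubbleRecords = [
  { x: 43,  y: 54,  r: 23 },
  { x: 97,  y: 15,  r: 14 },
  { x: 114, y: 100, r: 20 }
];

// Accessors of the form function(d) { ... } receive one record at a time.
var getX = function(d) { return d.x; };
var getY = function(d) { return d.y; };
var getR = function(d) { return d.r; };

// Convert the records back to the three-column matrix used in this article.
var asMatrix = bubbleRecords.map(function(d) {
  return [getX(d), getY(d), getR(d)];
});

console.log(JSON.stringify(asMatrix));
```

With records like these, an accessor reads a named property (d.x) instead of an array position (d[0]), which tends to be easier to follow as the data grows more columns.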

Let’s start over with the testBox example from above. We’ll get rid of the circle we drew, and in its place we’ll let the data draw the circles for us.

var bubbleChart = [[43, 54, 23], [97, 15, 14], [114, 100, 20]];
var testBox = d3.select("#test")
  .append("svg")
  .attr("width", 400)
  .attr("height", 150);
var tb = testBox.selectAll("circle").data(bubbleChart);

tb.enter()
  .append("circle")
  .style("stroke", "black")
  .style("fill", "green")
  .attr("cx", function(d) { return d[0]; })
  .attr("cy", function(d) { return d[1]; })
  .attr("r", function(d) { return d[2]; });

You can see the declaration of the data in the bubbleChart array, and the testBox variable simply carves out an SVG space with the dimensions 400×150. The “joining” of the data with the SVG takes place as we define the tb variable:

var tb = testBox.selectAll("circle").data(bubbleChart);

This line looks bizarre, because the SVG box doesn’t contain any circle elements yet, so the selection starts out empty. That’s fine, because the subsequent data() call joins the selection to the bubbleChart data, and D3 works out which data items do not yet have a circle.

Keep in mind that when the application is initially run, there is no data in the box. When the joining takes place, the data, as contained in bubbleChart, suddenly “enters” the box. Afterwards, the enter() function is called. The tb.enter() call appends circle elements to the SVG box and styles each with a stroke and fill color.

Next, the individual rows of the data structure are broken out for each circle. For example, the y-position information is set by this attr() function call:

.attr("cy", function(d) { return d[1]; })

The attr() function takes two parameters: the name of the attribute being set (in this case, the y-position), and the value of that attribute. Because the selection has been joined with a data structure, the second parameter can be a function that D3 calls with each member of that data structure. D3 encourages a declarative programming style, so you don’t program the looping yourself; the function is called once for each first-level element in the data structure. In this case, we have a two-dimensional matrix, so on each call, a different row (itself an array) is handed to the function as d. All we have to do is pull out the individual elements of that array and use them to set the x, y, and radius of each circle.
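To make the implicit looping concrete, here is a plain JavaScript sketch of roughly what happens when a function is passed as the attribute value. It does not use D3 or the DOM and is not D3’s implementation: the circles array of plain objects and the setAttr helper are stand-ins invented for this illustration.

```javascript
var bubbleChart = [[43, 54, 23], [97, 15, 14], [114, 100, 20]];

// Stand-ins for the three joined circle elements (plain objects, not DOM nodes).
var circles = bubbleChart.map(function() { return {}; });

// Hypothetical helper: calls the accessor once per element, passing the
// datum and its index, and stores the result under the attribute name.
function setAttr(elements, data, name, accessor) {
  elements.forEach(function(element, i) {
    element[name] = accessor(data[i], i);
  });
}

setAttr(circles, bubbleChart, "cx", function(d) { return d[0]; });
setAttr(circles, bubbleChart, "cy", function(d) { return d[1]; });
setAttr(circles, bubbleChart, "r",  function(d) { return d[2]; });

console.log(circles);
```

The loop lives inside the helper, which is the point: the calling code only states how a row maps to an attribute.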

Dynamics

So far, we’ve looked at rendering graphics based on data, but we haven’t looked at the dynamic aspect of D3. As previously mentioned, data is entering, updating, or leaving the system. In the above example, a matrix with three columns represented the data. D3 considers that matrix to be the data, where each row of the matrix is an additional data element. To illustrate how the data changes, we would have to encapsulate most of the above logic in a function, and then run the function each time the data changes.

For example, with each run of the function, we select new random values for the rows in bubbleChart. To take it one step further, we either add or remove rows from bubbleChart with each change. When rows are added, the enter() selection holds the new rows so that elements can be created for them. When rows are removed, the exit() selection holds the leftover elements so that they can be removed. Finally, when an existing element is paired with a new value, the update selection is used to apply that value. Note that there is no update() function per se. When the data() function is called to join the data with the graphical elements, the selection it returns is the update selection.
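The bookkeeping behind these three cases can be sketched in plain JavaScript. This is not D3’s implementation; joinByIndex is a hypothetical helper that mimics the default behavior of data(), which pairs existing elements with data rows by index when no key function is given.

```javascript
// Hypothetical helper, not part of D3: splits a join into the three cases
// by comparing how many elements already exist with how many rows arrive.
function joinByIndex(existingCount, data) {
  var shared = Math.min(existingCount, data.length);
  return {
    update: data.slice(0, shared),     // rows paired with existing elements
    enter: data.slice(shared),         // rows that still need an element
    exitCount: existingCount - shared  // leftover elements with no row
  };
}

// First run: the box is empty, so every row enters.
var first = joinByIndex(0, [[43, 54, 23], [97, 15, 14], [114, 100, 20]]);

// Later run: three circles exist but only one row arrives, so two circles exit.
var second = joinByIndex(3, [[10, 20, 30]]);

console.log(first.enter.length, first.update.length, first.exitCount);    // 3 0 0
console.log(second.enter.length, second.update.length, second.exitCount); // 0 1 2
```

Because the pairing is by position, a circle counts as an update whenever its index is still covered by the new data, even if every value in its row has changed.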

The final JavaScript code appears in the listing below. Note that the update selection (simply tb) colors the existing circles red, whereas the enter() selection colors the new circles green. The exit() selection simply removes the circles that no longer have data. Also note that a “run” button was added so that new data can be generated with each push of the button.

var root = d3.select("#test");
var testBox = root.append("svg")
  .attr("width", 400)
  .attr("height", 150);

// Generate a fresh random data set and redraw the circles.
var runCircles = function() {
  var bubbleChart = [];
  var numCircles = Math.floor(Math.random() * 11); // select 0 - 10 circles

  for (var i = 0; i < numCircles; i++) {
    bubbleChart.push([
      Math.floor(10 + Math.random() * 390), // x-position
      Math.floor(10 + Math.random() * 140), // y-position
      Math.floor(10 + Math.random() * 40)   // radius
    ]);
  }

  // Join the new data to whatever circles are already in the box.
  var tb = testBox.selectAll("circle").data(bubbleChart);

  // Update: circles that already existed turn red and take the new values.
  tb.style("stroke", "black").style("fill", "red")
    .attr("cx", function(d) { return d[0]; })
    .attr("cy", function(d) { return d[1]; })
    .attr("r", function(d) { return d[2]; })
    .attr("opacity", 0.5);

  // Enter: rows without a circle get a new green one.
  tb.enter()
    .append("circle")
    .style("stroke", "black")
    .style("fill", "green")
    .attr("cx", function(d) { return d[0]; })
    .attr("cy", function(d) { return d[1]; })
    .attr("r", function(d) { return d[2]; })
    .attr("opacity", 0.5);

  // Exit: circles without a row are removed.
  tb.exit().remove();
};

// The "run" button generates new data on each click.
root.append("button").text("run").on("click", runCircles);

In the following figures, you can see what happens between two subsequent runs. In the first run, there were four elements in bubbleChart, and therefore, four circles on the screen. The one red circle is an update from the previous run, and there were three new data elements, denoted by the color green.

On the next run, the previous four elements now show up in red. They have changed position and size, but they’re still updates, so they appear in red. Meanwhile, four new elements were added to the data set, showing up in green.

As a final note, D3 provides some fancy ways to animate the transitions of data. So, the above example could have faded and/or moved the existing graphical elements from one state to another as they updated, while the new elements could have faded in. There are a number of impressive transition effects available through the tutorials on the D3 website.
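As a minimal sketch of what that could look like, the function below adds transitions to the update, enter, and exit cases. It assumes tb is the selection returned by testBox.selectAll("circle").data(bubbleChart); the name animateCircles and the 750 ms duration are choices made for this illustration. It relies on transition(), duration(), and remove(), which D3 provides on selections and transitions.

```javascript
// Hypothetical helper: tb is assumed to be the result of
// testBox.selectAll("circle").data(bubbleChart).
function animateCircles(tb) {
  var ms = 750; // arbitrary duration chosen for this sketch

  // Update: existing circles glide to their new position and size.
  tb.transition().duration(ms)
    .style("fill", "red")
    .attr("cx", function(d) { return d[0]; })
    .attr("cy", function(d) { return d[1]; })
    .attr("r", function(d) { return d[2]; });

  // Enter: new circles start fully transparent, then fade in.
  tb.enter()
    .append("circle")
    .style("stroke", "black")
    .style("fill", "green")
    .attr("cx", function(d) { return d[0]; })
    .attr("cy", function(d) { return d[1]; })
    .attr("r", function(d) { return d[2]; })
    .attr("opacity", 0)
    .transition().duration(ms)
    .attr("opacity", 0.5);

  // Exit: departing circles fade out and are removed when the fade ends.
  tb.exit()
    .transition().duration(ms)
    .attr("opacity", 0)
    .remove();
}
```

Inside runCircles, a call to animateCircles(tb) placed right after the data() join would stand in for the three separate tb blocks.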

Conclusion

D3.js is a powerful graphics library for JavaScript. Rather than simply rendering graphics, however, it can join a data set with a set of graphic elements and provide a true data-driven graphical environment. This article touches upon some of the main concepts of D3. Though D3 has a fairly steep learning curve, if you are already familiar with jQuery and SVG, you will find it reasonably straightforward to learn. You can find complete details and a number of helpful tutorials on the D3 site.