Editorial: To Benchmark, or Not to Benchmark?

Nilson Jacques

You might have seen some headlines recently about Google’s plans to retire their Octane JavaScript benchmark suite. If you’re not aware of this or didn’t read past the headline, let me briefly recap. Google introduced Octane to replace the industry-standard SunSpider benchmark. SunSpider was created by Apple’s Safari team and was one of the first JavaScript benchmarks.

There were two problems with SunSpider. First, it was based on microbenchmarks (think testing the creation of a new array thousands of times), which didn’t reflect real-world usage very accurately. Second, SunSpider rankings came to carry a lot of weight among browser makers, resulting in some optimizing their JavaScript engines for better benchmark scores rather than for the needs of real programs. In some instances, these tweaks even led to production code running slower than before!
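
To see why that happens, here’s a contrived sketch of what a SunSpider-style microbenchmark looks like. The operation and iteration count are made up for illustration, not taken from the actual suite:

```js
// A microbenchmark in the SunSpider style: time one tiny operation
// repeated a huge number of times.
function benchArrayCreation(iterations) {
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    const arr = new Array(100); // the "work" being measured
  }
  return performance.now() - start;
}

console.log(benchArrayCreation(1000000).toFixed(2) + 'ms');
// A JIT compiler can notice that `arr` is never used and optimize the
// loop away entirely, producing a great score that says nothing about
// how a real application behaves.
```

A hot loop like this is exactly the kind of code an engine can be specially tuned for, which is how benchmark scores and real-world performance drift apart.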

Octane focused on trying to create tests that more accurately simulated real workloads, and became a standard against which JavaScript implementations were measured. However, browser makers have once again caught up and we’re seeing optimizations tailored to Octane’s tests. That’s not to say that benchmarks haven’t been useful. The competition between browsers has resulted in massive improvements to JavaScript performance across the board.

Vaguely interesting, you might say, but how does this affect my day-to-day job as a developer? Benchmarks are often cited when trying to convince people of the benefits of framework Y over framework X, and some people place a lot of weight on these numbers. Last week I noticed a new UI library called MoonJS doing the rounds on some of the news aggregators. MoonJS positions itself as a ‘minimal, blazing fast’ library, and cites some benchmark figures to try to back that up.

To be clear, I’m not picking on MoonJS here. This focus on speed is quite common, especially among UI libraries (take a look at any of the React clones as an example). As we saw above with the examples of SunSpider and Octane, though, benchmarks can be misleading. Many modern JavaScript view libraries and frameworks utilize some form of virtual DOM to render output. In the process of researching different implementations, Boris Kaul spent some time looking at ways of benchmarking virtual DOM performance and found it was relatively easy to tweak VDOM performance to do well on the benchmarks. His conclusion? “Don’t use numbers from any web framework benchmarks to make a decision when you are choosing a framework or a library.”
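
If you’re unfamiliar with the concept, here’s a deliberately minimal sketch of the idea behind a virtual DOM: the UI is described as a tree of plain JavaScript objects, and a diff against the previous tree decides which parts of the real DOM actually need touching. This is illustrative only, and not how MoonJS or any particular library implements it:

```js
// A tiny virtual DOM: plain objects describe the UI, and a diff against
// the previous tree decides which real nodes to update.
function h(tag, ...children) {
  return { tag, children }; // virtual nodes are cheap to create and compare
}

function render(node) {
  if (typeof node === 'string') return document.createTextNode(node);
  const el = document.createElement(node.tag);
  node.children.forEach(child => el.appendChild(render(child)));
  return el;
}

function patch(parent, oldNode, newNode, index = 0) {
  const el = parent.childNodes[index];
  if (typeof oldNode === 'string' || typeof newNode === 'string') {
    if (oldNode !== newNode) el.replaceWith(render(newNode)); // text changed
  } else if (oldNode.tag !== newNode.tag ||
             oldNode.children.length !== newNode.children.length) {
    el.replaceWith(render(newNode)); // structure changed: rebuild the subtree
  } else {
    // Same shape: recurse, so unchanged branches are never touched
    newNode.children.forEach((child, i) =>
      patch(el, oldNode.children[i], child, i));
  }
}

// Usage: render once, then patch on each state change
// document.body.appendChild(render(h('ul', h('li', 'one'), h('li', 'two'))));
```

Benchmarks of this layer typically hammer a function like patch with thousands of synthetic updates in a tight loop, which, as Kaul found, is precisely the kind of workload an implementation can be tuned to win at.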

There are other reasons to be cautious when comparing libraries based on their claimed speed. It’s important to remember that, like SunSpider, many benchmarks are microbenchmarks: they measure repeated operations on a scale that you’re unlikely to match when creating interfaces for your applications.

It’s also worth asking how important speed is for your particular use case. Building a bread-and-butter CRUD app is unlikely to bring any UI library to its knees, and factors such as the learning curve, available talent pool, and developer happiness are also important considerations. I’ve seen many discussions in the past on whether Ruby was too slow for building web applications but, despite the existence of faster options, a good many apps have been, and continue to be, written in Ruby.

Speed metrics can be misleading, but they may also simply be of limited relevance to what you’re building. As with all rules of thumb and good practices, it’s always worth stopping to think about how (or whether) they apply to your situation. I’m interested to hear your experiences: Have you used software that didn’t live up to its benchmark claims in practice? Have you built apps where that difference in speed was important? Leave me a comment and let me know!