Editorial: To Benchmark, or Not to Benchmark?

Nilson Jacques

You might have seen some headlines recently about Google’s plans to retire their Octane JavaScript benchmark suite. If you’re not aware of this or didn’t read past the headline, let me briefly recap. Google introduced Octane to replace the industry-standard SunSpider benchmark. SunSpider was created by Apple’s Safari team and was one of the first JavaScript benchmarks.


There were two problems with SunSpider. First of all, it was based on microbenchmarks (think testing the creation of a new array thousands of times), which didn’t reflect real-world usage very accurately. Secondly, SunSpider rankings came to carry a lot of weight among browser makers, resulting in some optimizing their JavaScript engines for better benchmark scores rather than the needs of real programs. In some instances, these tweaks even led to production code running slower than before!

Octane focused on trying to create tests that more accurately simulated real workloads, and became a standard against which JavaScript implementations were measured. However, browser makers have once again caught up and we’re seeing optimizations tailored to Octane’s tests. That’s not to say that benchmarks haven’t been useful. The competition between browsers has resulted in massive improvements to JavaScript performance across the board.

Vaguely interesting, you might say, but how does this affect my job day-to-day as a developer? Benchmarks are often cited when trying to convince people of the benefits of framework y over framework x, and some people attach a lot of weight to these numbers. Last week I noticed a new UI library called MoonJS doing the rounds on some of the news aggregators. MoonJS positions itself as a ‘minimal, blazing fast’ library, and cites some benchmark figures to try to back that up.

To be clear, I’m not picking on MoonJS here. This focus on speed is quite common, especially among UI libraries (take a look at any of the React clones as an example). As we saw above with the examples of SunSpider and Octane, though, benchmarks can be misleading. Many modern JavaScript view libraries and frameworks utilize some form of virtual DOM to render output. In the process of researching different implementations, Boris Kaul spent some time looking at ways of benchmarking virtual DOM performance and found it was relatively easy to tweak VDOM performance to do well on the benchmarks. His conclusion? “Don’t use numbers from any web framework benchmarks to make a decision when you are choosing a framework or a library.”

There are other reasons to be cautious when comparing libraries based on their claimed speed. It’s important to remember that, like SunSpider, many benchmarks are microbenchmarks: they measure repeated operations on a scale that you’re unlikely to match when creating interfaces for your applications.

It’s also worth asking how important speed is for your particular use-case. Building a bread-and-butter CRUD app is unlikely to bring any UI library to its knees, and factors such as the learning curve, available talent pool, and developer happiness are also important considerations. I’ve seen many discussions in the past on whether Ruby was too slow for building web applications but, despite faster options existing, a good many apps have been and continue to be written in Ruby.

Speed metrics can be misleading, but they may also be of limited use depending on what you’re building. As with all rules of thumb and good practices, it’s always good to stop and think how (or if) it applies to your situation. I’m interested to hear your experiences: Have you used software that didn’t live up to its benchmark claims in practice? Have you built apps where that difference in speed was important? Leave me a comment and let me know!

Frequently Asked Questions (FAQs) about Benchmarking in JavaScript

What is the purpose of benchmarking in JavaScript?

Benchmarking in JavaScript is a process that measures the performance of a specific piece of code or a function. It helps developers understand how efficient their code is and identify areas for improvement. By comparing the execution time of different code snippets, developers can choose the most efficient solution for their needs. Benchmarking is crucial in JavaScript development as it directly impacts the user experience, especially in terms of speed and responsiveness of web applications.

How does SunSpider benchmarking tool work?

SunSpider is a popular JavaScript benchmarking tool developed by Apple’s WebKit team. It runs a series of tests on a JavaScript engine and measures the time it takes to complete each test. The tests cover various aspects of JavaScript, including control flow, string processing, and mathematical calculations. The lower the total time taken, the better the performance of the JavaScript engine.

What is the difference between SunSpider and other benchmarking tools?

While all benchmarking tools aim to measure JavaScript performance, they differ in the types of tests they run and how they calculate the results. SunSpider is largely composed of micro-benchmarks that each exercise a narrow feature, a limitation that motivated later suites like Octane, which aim to simulate real-world workloads. Other tools like jsben.ch and jsbench.me allow developers to create and run their own tests, providing more flexibility.

How can I interpret the results of a JavaScript benchmark?

Benchmark results usually provide a time measurement, which indicates how long a specific operation took to complete. The lower the time, the better the performance. However, interpreting these results requires understanding the context. For example, a difference of a few milliseconds might not be significant in a user interface, but it could be crucial in a high-performance server application.

Can I use benchmarking to compare different JavaScript engines?

Yes, benchmarking is a common way to compare the performance of different JavaScript engines. By running the same tests on different engines, you can get a sense of their relative performance. However, keep in mind that real-world performance can be influenced by many factors, and benchmark results are just one piece of the puzzle.

How can I create my own benchmarks?

Tools like jsben.ch and jsbench.me allow you to write and run your own JavaScript benchmarks. You can use these tools to test specific pieces of code or compare different approaches to solving a problem. When creating a benchmark, it’s important to make the test as realistic as possible and to run it multiple times to get an accurate measurement.

Are there any limitations to JavaScript benchmarking?

While benchmarking is a powerful tool, it has its limitations. It can be difficult to create realistic tests, and the results can be influenced by many factors, including the specific hardware and software environment. Also, focusing too much on benchmark results can lead to over-optimization, where developers spend too much time improving code that has little impact on overall performance.

What is the role of benchmarking in the development process?

Benchmarking is an important part of the development process, as it helps developers identify performance bottlenecks and verify that their changes have improved performance. However, it should not be the only tool used to evaluate code quality. Other factors, such as readability, maintainability, and functionality, are also important.

Can benchmarking help improve the performance of my web application?

Yes, benchmarking can help you identify areas of your code that are slowing down your application. By optimizing these areas, you can improve the overall performance of your application. However, remember that performance is just one aspect of a quality web application. Usability, functionality, and design are also important.

How often should I benchmark my code?

The frequency of benchmarking can depend on the nature of your project. For performance-critical applications, you might want to benchmark regularly, even after small changes. For less critical applications, benchmarking can be done less frequently, such as after major changes or before a new release.