How Browser Market Share is Calculated

    Craig Buckler

    I post a browser market share report every month or two. I hope you find them interesting or at least revel in the news that IE6 usage is dropping. However, they often result in more questions than they answer. This is my attempt at explaining how browser market share is calculated. It’s a theoretical overview rather than a mathematical thesis and calculation methods will differ from system to system.

    Lies, damn lies and statistics

    …and then there’s web statistics. Collating information from web browsers is notoriously difficult — see Why Your Website Statistics Reports Are Wrong. StatCounter and Google Analytics produce great-looking reports with figures to several decimal places but you should be aware that analysis is based on a hierarchy of assumptions.

    That’s not to say web statistics are useless. They’re great for spotting trends, but attempting to map results onto the activities of individual users is often futile.

    What market share isn’t…

    Before we discuss what market share is, perhaps it’s best to determine what it’s not:

    1. Browser downloads
    The number of browser downloads is not a viable method of comparing market share:

    • All operating systems are provided with a default browser and many users will never consider an alternative.
    • You could download a browser many times but never install it.
    • You could download a browser once and install it on hundreds of PCs.
    • Browsers which implement automatic updates would be disadvantaged.

    2. Browser installations
    Browser installations are similarly flawed:

    • You may never use the browser that came with your OS. IE is installed on all Windows PCs but it doesn’t follow that IE has the same 90% market share as Windows itself.
    • You could install another browser and never use it.
    • You could install and uninstall the same browser multiple times.

    That’s not to say the figures aren’t useful, but they’re not necessarily indicative of market share.

    So what is market share?

    The first cause of confusion is that market share tables appear to show a percentage of users. In reality, market share is determined from actual browser usage, so the figures are best read as probabilities.

    Assume browserX has a 50% market share. If you examine a random file hit on a random website, the chance that browserX was used is 1 in 2. It doesn’t matter whether you look at a random hit, visitor session, or an individual user — the same probability will apply.

    Since we’re calculating browser usage proportions, the underlying data does not need to record individuals. However, the results retain a direct correlation to users. We could conclude that:

    • everyone uses browserX 50% of the time, or
    • more realistically, 50% of users use browserX all the time

    The result is somewhere between those two extremes. Ultimately, it doesn’t matter — we’re analyzing the usage patterns of a group.
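    As a rough illustration of why the two extremes are indistinguishable in the totals, the sketch below builds two made-up populations and measures browserX's share of all hits in each. The user lists and the share helper are hypothetical and exist only for this example.

```python
def share(users, browser):
    """Fraction of all hits, across all users, made with `browser`."""
    hits = [b for user in users for b in user]
    return hits.count(browser) / len(hits)

# Hypothetical data: each user is a list of the browser used for each hit.

# Extreme 1: every user splits their hits evenly between two browsers.
everyone_splits = [["browserX", "other"] * 5 for _ in range(4)]

# Extreme 2: half the users use browserX exclusively, the rest never do.
half_exclusive = [["browserX"] * 10] * 2 + [["other"] * 10] * 2

share_split = share(everyone_splits, "browserX")
share_exclusive = share(half_exclusive, "browserX")
print(share_split, share_exclusive)  # both populations give the same share
```

    Both populations produce a 50% share, which is why the aggregate figure alone cannot tell you how individuals behave.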

    How market share is calculated

    When you visit a website, every file request (hit) is logged and your browser is identified from the user agent string passed in the HTTP header. Essentially, if 50% of hit requests are from browserX during period P, it has a 50% market share at that time.
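    A minimal sketch of that basic calculation follows, assuming the log has already been reduced to one user agent string per hit. The identify_browser function is a deliberately crude substring check written for this example; real analytics systems use far more thorough user agent parsing, and the sample strings are only illustrative.

```python
from collections import Counter

def identify_browser(user_agent):
    """Crude, illustrative classifier; not how any real analytics product works."""
    ua = user_agent.lower()
    # Order matters: Chrome's user agent string also contains "Safari".
    if "chrome" in ua:
        return "Chrome"
    if "firefox" in ua:
        return "Firefox"
    if "msie" in ua:
        return "IE"
    if "opera" in ua:
        return "Opera"
    if "safari" in ua:
        return "Safari"
    return "Other"

def market_share(user_agents):
    """Return each browser's percentage of the logged hits."""
    counts = Counter(identify_browser(ua) for ua in user_agents)
    total = sum(counts.values())
    return {browser: 100 * n / total for browser, n in counts.items()}

# Hypothetical log: one user agent string per hit during period P.
hits = [
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1)",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0",
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/534.16 (KHTML, like Gecko) "
    "Chrome/10.0.648.204 Safari/534.16",
]
shares = market_share(hits)
print(shares)
```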

    The reality is a little more complex. File hits can be ambiguous because different browsers can download different resources, e.g. IE conditional stylesheets or pre-caching linked pages. Therefore, systems may only analyze the actual page view or make other adjustments.
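    To see why raw file hits can mislead, consider this small sketch. It assumes a hypothetical log of (path, browser) pairs and an assumed rule that only HTML documents count as page views; actual systems may draw that line differently.

```python
from collections import Counter

# Hypothetical hit log: (requested path, identified browser).
# IE fetches an extra conditional stylesheet for the same page.
hits = [
    ("/index.html", "IE"), ("/style.css", "IE"), ("/ie-only.css", "IE"),
    ("/index.html", "Firefox"), ("/style.css", "Firefox"),
]

def is_page_view(path):
    """Assumed rule for this example: only HTML documents are page views."""
    return path.endswith("/") or path.endswith(".html")

def share_of(records, browser):
    """Fraction of the given records attributed to `browser`."""
    counts = Counter(b for _, b in records)
    return counts[browser] / sum(counts.values())

page_views = [h for h in hits if is_page_view(h[0])]

ie_by_hits = share_of(hits, "IE")         # 3 of 5 file hits
ie_by_pages = share_of(page_views, "IE")  # 1 of 2 page views
print(ie_by_hits, ie_by_pages)
```

    Each browser viewed one page, yet counting every file hit would credit IE with 60% rather than 50%.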

    The next important consideration is the sample size — how many sites and hits are analyzed. There’s no such thing as an “average” website:

    • a site with a technical audience has higher Firefox and Chrome usage than others
    • the audience’s country has an impact, e.g. Opera is more popular in Europe and Russia than the US
    • the day and time affect usage patterns. For example, IE use is normally higher during weekday office hours than evenings or weekends.

    Statistical anomalies reduce if you analyze a wide range of websites from many different countries. In essence, more data results in more accurate browser usage figures. StatCounter analyzes traffic from 3 million websites throughout the world — that appears to be a healthy sample size.
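    One way to put a rough number on "more data is more accurate" is the textbook standard error of a sampled proportion. This is my own illustration rather than a method any statistics provider is known to use, and it assumes independent random hits, which real web traffic is not.

```python
import math

def standard_error(share, sample_size):
    """Standard error of a proportion estimated from independent random hits."""
    return math.sqrt(share * (1 - share) / sample_size)

# Hypothetical 50% share measured from samples of two different sizes.
small = standard_error(0.5, 100)        # about 0.05, i.e. 5 percentage points
large = standard_error(0.5, 1_000_000)  # about 0.0005
print(small, large)
```

    The uncertainty shrinks with the square root of the sample size, so a hundred times more hits gives roughly ten times less noise.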

    Ahh, but what if…

    There now follows a list of frequently-asked browser market share questions. If I haven’t answered your query, please leave a comment below.

    Q: Internet usage is growing.
    The number of internet users increases every day. Therefore, it’s possible for a browser’s market share to drop while the actual number of users increases.

    Multiply the number of web users by the browser proportion to estimate changes in population … assuming you can find reasonable net usage figures.
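    The arithmetic can be sketched as follows. The user totals and percentages are invented purely for illustration and are not real measurements.

```python
def estimated_users(total_web_users, share_percent):
    """Estimate a browser's user count from total web users and market share."""
    return total_web_users * share_percent / 100

# Hypothetical figures: the share falls while the web population grows.
before = estimated_users(1_000_000_000, 50)  # 50% of 1.0 billion users
after = estimated_users(1_200_000_000, 45)   # 45% of 1.2 billion users
print(before, after)
```

    In this made-up case the share drops five points while the estimated user count rises by 40 million.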

    Q: Would visitor or user sessions be more accurate than page/file hits?
    No. It wouldn’t result in better data because you’re reducing the sample size and introducing unexpected issues. If you only had session data, 3 hours of recreational browsing on a single site would count the same as 30 seconds of browsing for a work-related topic. Since many people use IE at work, it would be given an unfair bias over another browser used at home.
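    The bias is easier to see with numbers. The sketch below uses a hypothetical list of sessions, each with a browser and a page count, and compares the share measured per session with the share measured per page view.

```python
# Hypothetical sessions: (browser, pages viewed in that session).
sessions = [
    ("Firefox", 90),  # long recreational session at home
    ("IE", 5),        # quick work-related lookup
    ("IE", 5),        # another quick lookup
]

def share_by_sessions(sessions, browser):
    """Every session counts once, however long it lasted."""
    return sum(1 for b, _ in sessions if b == browser) / len(sessions)

def share_by_pages(sessions, browser):
    """Every page view counts once, so long sessions weigh more."""
    total = sum(pages for _, pages in sessions)
    return sum(pages for b, pages in sessions if b == browser) / total

ie_sessions = share_by_sessions(sessions, "IE")  # 2 of 3 sessions
ie_pages = share_by_pages(sessions, "IE")        # 10 of 100 page views
print(ie_sessions, ie_pages)
```

    Counting sessions gives IE about 67% in this invented sample, while counting page views gives it 10%, which is closer to how much each browser was actually used.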

    Remember we’re analyzing browser usage: it’s not necessary to understand individual user behavior.

    Q: I use more than one browser. Am I counted multiple times?
    It doesn’t matter. Individuals often have complex browsing patterns, e.g. you may use Firefox 80% of the time and Chrome 20% of the time. That usage is recorded; if you were the only person sampled, Firefox would have an 80% market share.

    Q: What about geeks using the net for 18 hours a day?
    Market share is a record of browser usage. A heavy user’s browsing carries more weight than someone running IE6 once every month.

    However, assuming the sample size is large enough, the effect of an individual or group’s browsing habits is negligible and will not skew the results. For every geek or technophobe, there are thousands of people using the net for an hour or two per day.

    Q: My browser uses an incorrect user agent string.
    It won’t be identified correctly but, again, you’re in a minority and it’s unlikely to affect the results by a significant margin.

    Q: How accurate are results for regions or individual countries?
    A smaller sample size results in less robust data. I would have more faith in US-only figures than those for Antarctica.

    Q: My site’s statistics are different.
    They will be. Many factors influence browser usage and few sites can be compared against the global average. Always check your own figures first.

    Q: I don’t believe any of these numbers!
    A healthy dose of skepticism is good for you. Blindly using a report without understanding the underlying data or analysis is dangerous.

    Coming soon — Browser Trends, May 2011.