Opera: Just 4.13% of Web’s Code is Valid

By Josh Catone

A new study from Opera finds that the overwhelming majority of web sites don’t adequately support web standards. The good news is that more web sites are valid today than in previous studies. The bad news is that just 4.13% of the URLs in the study’s sample, which covered more than 3.5 million web pages, passed the W3C validator.

The results came from Opera’s “Metadata Analysis and Mining Application” (MAMA), a search engine that “indexes the markup, style, scripting and the technology used while creating Web pages.” Opera engineers and data miners can then ask the search engine questions like, “how many sites use CSS?” (the answer is 80.4%) or, “how many markup errors does the average site have?” (it’s 47).

Opera is vague about what it will do with this tool, other than continue to analyze the data and post additional findings, but a press release hints at the possibility of making it publicly available to web developers.

“MAMA will help Web developers find examples of usage of features and functions, look at trends and gather data to justify technology to their clients or managers,” wrote Opera. “This will also encourage standards bodies to take into account developers’ suggestions about what is happening on the Web in reality and will eventually raise the quality and interoperability of specifications, the Web and browsers.”

Ars Technica, which posted an analysis of the findings today, also seems to think that the search tool Opera writes about will be made public.

The MAMA study findings released today also revealed that ~50% of pages sporting “W3C Valid” buttons aren’t actually valid. The reason, surmises Opera, is that keeping pages valid is not easy for many developers, and that many struggle to keep up with evolving standards. The key takeaway, says Opera, “is that people are BAD at this ‘HTML thing.’ Improper tag nesting is rampant, and misspelled or misplaced element and attribute names happen all the time. It is very easy to make silly, casual mistakes — we all make them.”
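
Opera’s point about improper tag nesting can be made concrete with a small sketch. The Python below is not how MAMA or the W3C validator works. It is a simplified, hypothetical checker (`NestingChecker` and `find_nesting_errors` are names invented here) built on the standard library’s `html.parser`. It assumes every non-void element needs an explicit closing tag, which is stricter than HTML, where some end tags such as `</p>` and `</li>` may legally be omitted.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "param", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Records closing tags that do not match the most recently opened element."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open element names
        self.errors = []     # human-readable problem descriptions

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if not self.open_tags:
            self.errors.append(f"</{tag}> closed but nothing is open")
        elif self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append(
                f"</{tag}> closed while <{self.open_tags[-1]}> is still open")
            # Recover: if the tag is open further down, unwind the stack to it.
            if tag in self.open_tags:
                while self.open_tags.pop() != tag:
                    pass


def find_nesting_errors(markup):
    """Return a list of nesting problems found in the markup (simplified check)."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    errors = list(checker.errors)
    errors.extend(f"<{tag}> never closed" for tag in checker.open_tags)
    return errors
```

For example, `find_nesting_errors("<p><b>bold</p></b>")` reports two problems, while the correctly nested `<p><b>bold</b></p>` reports none.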

One problem, though, is that the tools people are using to create web pages also turn out ugly, invalid code. MAMA also looked at validation as it relates to web page editors and content management systems, and found that, with the exception of Apple’s iWeb, for which an impressive 81.91% of URLs pass validation, the results were generally dismal. Just 0.55% of pages made with Microsoft FrontPage are valid, according to MAMA, and just 3.44% of those made with Adobe Dreamweaver.

For content management systems, the results weren’t much better. WordPress pages were valid just 9.0% of the time, Typo pages 12.74%, and Joomla pages 6.45%. Google’s Blogger fared the worst, with only a paltry 0.30% of URLs passing validation.

There is a lot of interesting information to dig through in the full results and Opera is promising to release more results as part of a planned “long, multi-part saga.”

  • The code for the site I work with is an example of poor validation. One of my main goals in the redesign…if we ever get the redesign implemented…is 100% validation. Then the percentage will be 4.130000001%. One of my problems in meeting this goal is our prima donna, AWOL “re”designer, who embedded a profanity-laced tirade against validation in the CSS code itself. (!) (I won’t get into what happened when I objected.)

  • alex

    I think web browser developers have it the wrong way around: we should be FORCED to make valid web pages (invalid pages won’t show, or show error information), rather than them making browsers that compensate for our laziness.

  • that’s a neat story, max

  • Mike

    It isn’t the developers’ fault. It’s a mix of every browser handling design aspects differently, Internet Explorer completely sucking at everything, and the fact that like 20% of people still use IE6. We have no choice but to hack away at the code to make it usable. If every browser were the same, and supported code correctly, all websites would be forced to use valid code.

  • I guess I don’t think that every single web designer/developer should be forced to validate every single web page they produce. I’m from the school of thought where I figure a page should validate as well as it needs to serve its purpose. It can be very time consuming and sometimes very (unnecessarily) costly to make sure a page validates 100%. Time is money in a lot of cases. Why should I care if a site validates on every single page if it works fine in every browser, the content is accessible and the site is able to present what it needs to present without impeding usability? Especially if just making it ‘validate’ would take a bunch of extra hours and make no real noticeable difference in accessibility or usability? Validation is nice, and good in a lot of web site circumstances, but it shouldn’t be forced upon every single web page on the web.

  • zendak

    As mentioned in the article, the majority of tools such as desktop WYSIWYG editors or CMSes generate code that is primarily utilitarian instead of aiming at validity. As long as we tolerate that and refuse to use alternatives, it’s our own fault. There are ways: ditch WYSIWYG tools (time to actually learn HTML, oh my), and use CMSes that let you specify your own markup instead of forcing their garbage on you. The latter problem exists with many, if not most, of the “popular” open-source tools and widely used commercial systems. Time to take responsibility and push alternatives that do it right; they exist. This may involve getting over personal laziness or delving into company policies/politics, but hey, nothing’s for free. That is, if you even care, which I suspect is not the case for most “web developers” out there.

    The MAMA study findings released today, also revealed that ~50% of pages sporting “W3C Valid” buttons aren’t actually valid.

    Which is yet more evidence of the fact that “validation badges” are moronic. For a less polemic explanation, I strongly recommend all you proud badge-bearers read this fine pearl of wisdom.

  • =IceBurn=

    I’m proud to say that ALL my works are included in those 4.13% :)

  • I guess I don’t think that every single web designer/developer should be forced to validate every single web page they produce.

    Why not? Would you also accept magazine articles or novels with bad spelling and grammar in every other sentence?

    Validation is like spell-checking your markup. It’s a quality assurance step. If you don’t think your work is important enough for some QA, perhaps it shouldn’t be published at all?

  • What Tommy said!

  • Homie_187

    I’m surprised the percentage was so high. I would have expected it to be less than 1%.

  • Hopefully Opera will make this MAMA tool available, I’d love to try it out.

    I try to make all my sites 100% compliant but when clients start entering content using the CMS, errors are bound to happen. I try to educate clients to avoid the most common mistakes (pasting directly from Word etc) and once a year maybe give the content a validation spring clean.

    47 errors on the average page? A year ago I took over a website that had over 1,500 errors on a single page. Today it 100% validates. Gives me a warm glow inside.

  • Don’t really care. Aren’t W3C’s recommendations supposed to be just “guidelines” anyway? Usability > Validity. End users more important than HTML purists.

  • graedus_dave

    Ever run google.com through the W3C’s validator? It had 67 errors when I checked it on a lark last week.

  • Stephen

    One problem I run into is that our site uses a piece of search software that places its options in the tag, which it then reads and displays the results as directed. But since these made-up terms are not valid attributes for , our site will never be 100% valid.

    How many sites out there are perfectly valid in every way, except they contain code that would not meet W3C validation but are intended to be read by a different piece of software anyway?

  • Jim

    What a great marketing tactic on the part of Opera. They are trying to be known as the most standards compliant browser. They produce a compelling report on validation and presto: we have a buzz, brand association, and character association to that brand.

  • Spheriod

    Here’s a novel thought: the “standards” that we have to use…HTML and CSS…need to be revamped. I can’t tell you how many hours I spent looking for articles on line for free on how to get around all the mind boggling differences between Firefox and IE in their ways of attacking basic layout of DIVs.

    Why doesn’t someone develop a browser (or an add-on to a browser) that uses a NEW STANDARD? Why is the W3C the authority? Can’t a community put something better to the fore? The HTML standards are OUT OF DATE.

  • bsmbahamas

    I think web developers are caught in a catch-22, because the browser developers all try to be unique and better, but all they do is cause us to have to write code for more environments.

    We seriously need standards laid out for the average programmer to understand; the W3C reads like VCR instructions, in my opinion. At the same time we need browsers that will obey the standards and, I hate to say it, block sites that are broken, but provide ‘plain English’ error reporting so they can be fixed easily.

    Until then I’ll write halfway pages because it takes forever to learn code for 7 browsers.

  • bsmbahamas

    Virtually every piece of code has starting and ending tags or valid syntax, but if you ignore the syntax, or worse yet the browsers don’t punish developers who ignore standard syntax, valid sites will always be an illusion.

    I think the developers need to all agree on one standard and stop with all the proprietary browsers. Each works differently, and nobody has the time to learn to code for several platforms. Our code needs to be compact and portable, not a hundred lines long to detect the browser and environment and then adapt before it executes the meat of the code.

  • I think the developers need to all agree on one standard and stop with all the proprietary browsers

    Let’s stop pretending that developers actually have a choice here. Standards are all good, but in the end usability counts. Which usually means different hacks for different browsers. :-/

  • SonFishDesign.com

    My last client page leans heavily on CSS. The whole page is compliant except for the -moz-box-sizing hack. Enough said.

  • SonFishDesign.com

    This site has 5 errors in the html but validates css 2.1

  • Antz

    What a thing to wish for, Alex: browsers enforcing standards! It would certainly be valuable to assist certain developers who have not yet learned a correct way to code, and could possibly be our opportunity to ditch non-compliant browsers such as IE6.

    True enough, there will need to be an ability for browser manufacturers to register their proprietary commands to avoid such errors.

  • Shadow Caster

    I think perfectionists are the sort of people who really stress on validation of all their pages. In terms of how a search engine looks at it, it doesn’t matter if your page isn’t 100% XHTML 1.0 Strict valid – it won’t mark it down so SEO is unaffected. One of the few reasons why they should make it valid is so that it looks and behaves correctly in the 4 major browsers, but quite often it doesn’t even if it is valid and we have to employ little tweaks.

  • topdown

    What it comes down to is if you’re not going to use the tools set out for you, and check your work, then it’s your own fault when a customer complains about display issues in a particular browser. IE is no exception, and your code should not need hacks; you should just understand the CSS needed for the particular browser to display the page properly.

    If you are going to call yourself a developer and develop web pages, you need to know your field and the software the consumers will use to view it.

    There is simply no reason for bad or invalid code!

  • SimonPhoto

    “..It doesn’t matter if your page isn’t 100% XHTML 1.0 Strict valid..”

    Oh, yes, it does. XHTML, if served properly, will give nothing but an error if there is a single error in the markup. XHTML is an XML-based language, and therefore must be well-formed. HTML is SGML-based, and gives you much more leeway.

    That said, most of my pages don’t validate via the W3C. I follow the standards though, to the point where it is practical. In many cases, you have a choice: meeting your client’s expectations, providing unfettered access to all users, and following the standards. Choose two.

    Sorry guys, but strictly following standards is not a must-have. It’s a nice-to-have.

  • Anonymous

    I use iCAB as a browser. It is relative to this thread because iCAB has a little smiley in the bottom row that scowls whenever it shows a page which does not validate. A click will pop up a list of all the errors. Is that neat or what? Handy too!

    Now I realize that iCAB is a Macintosh-only product and that Kevin never recognizes it in his epistles. However, it is a very competent little browser and will parse just about everything. It worked on the Acid Test from the start. Very laudable from a shop which is essentially a one-man operation.

    Why should you validate? Because it makes your client appear to be lazy and sloppy if you do not. Of course who’s going to know? No one is going to read your source code.

    Except for us iCAB users. 8^)

  • Except for us iCAB users. 8^)

    Yepp.. All 8 of you.. :P :)

  • yman

    Tabbed browsing, Quick Find, fraud protection, saved sessions, Speed Dial, notes and the trash make Opera a much better browser than it was. Its significant speed allows you to spend more time online. I installed it, it works very well, and I got it from here: Opera Browser
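
The distinction one commenter draws above, that XHTML served as XML must be well-formed while HTML parsing is more forgiving, can be illustrated with a minimal Python sketch. It uses only the standard library; `is_well_formed_xml`, `TextCollector` and `lenient_text` are hypothetical names chosen here. It checks XML well-formedness only, not validity against an XHTML DTD, and it stands in for browser behaviour rather than reproducing it.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_well_formed_xml(markup):
    """Return True if a strict XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


class TextCollector(HTMLParser):
    """Tolerant HTML parser that gathers text and does not raise on bad nesting."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def lenient_text(markup):
    """Extract the text content even when the tags are improperly nested."""
    collector = TextCollector()
    collector.feed(markup)
    collector.close()
    return "".join(collector.chunks)


good = "<p>Hello <b>world</b></p>"
bad = "<p>Hello <b>world</p></b>"  # </p> closes before </b>

print(is_well_formed_xml(good))  # strict parser accepts the nested markup
print(is_well_formed_xml(bad))   # strict parser rejects the misnested markup
print(lenient_text(bad))         # tolerant parser still yields the text
```

The strict parser rejects the second fragment outright because of a single misplaced closing tag, while the tolerant parser still recovers the text, which is the leeway the comment describes.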