Opera: Just 4.13% of Web’s Code is Valid

By Josh Catone | News

A new study from Opera finds that the overwhelming majority of web sites don’t adequately support web standards. The good news is that more web sites are valid today than in previous studies. The bad news is that just 4.13% of the URLs in the study’s sample, which comprised over 3.5 million web pages, passed the W3C validator.
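To put the headline percentage in absolute terms, here is a rough back-of-the-envelope calculation in Python. The sample is described only as "over 3.5 million" pages, so the round figure of 3,500,000 used below is an assumption and the results are approximate lower bounds.

```python
# Figures quoted in the article. The sample is "over 3.5 million" pages,
# so 3_500_000 is an assumed round lower bound, not the exact sample size.
SAMPLE_SIZE = 3_500_000
PASS_RATE = 4.13 / 100

valid_pages = round(SAMPLE_SIZE * PASS_RATE)
invalid_pages = SAMPLE_SIZE - valid_pages

print(f"pages passing validation: ~{valid_pages:,}")
print(f"pages failing validation: ~{invalid_pages:,}")
```

Under that assumption, roughly 144,550 pages passed and more than 3.3 million did not.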

The results came from Opera’s “Metadata Analysis and Mining Application” (MAMA), a search engine that “indexes the markup, style, scripting and the technology used while creating Web pages.” Opera engineers and data miners can then ask the search engine questions like, “how many sites use CSS?” (the answer is 80.4%) or, “how many markup errors does the average site have?” (it’s 47).

Opera is vague about what it will do with this tool, other than continue to analyze the data and post additional findings, but a press release hints at the possibility of making it publicly available to web developers.

“MAMA will help Web developers find examples of usage of features and functions, look at trends and gather data to justify technology to their clients or managers,” wrote Opera. “This will also encourage standards bodies to take into account developers’ suggestions about what is happening on the Web in reality and will eventually raise the quality and interoperability of specifications, the Web and browsers.”

Ars Technica, which posted an analysis of the findings today, also seems to think that the search tool Opera writes about will be made public.

The MAMA study findings released today also revealed that ~50% of pages sporting “W3C Valid” buttons aren’t actually valid. The reason, surmises Opera, is that many developers find it hard to keep pages valid and to keep up with evolving standards. The key takeaway, says Opera, “is that people are BAD at this ‘HTML thing.’ Improper tag nesting is rampant, and misspelled or misplaced element and attribute names happen all the time. It is very easy to make silly, casual mistakes – we all make them.”
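As an illustration of the "improper tag nesting" Opera mentions, the following minimal Python sketch flags closing tags that don't match the innermost open element. It is not how MAMA or the W3C validator works: `NestingChecker` and `find_nesting_problems` are hypothetical names, the check uses only the standard library's `html.parser`, and it is stricter than HTML itself, which allows some end tags (such as `</li>` or `</p>`) to be omitted.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they are not tracked on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Hypothetical helper: records end tags that do not close the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open element names
        self.problems = []    # (closing tag seen, tag that was expected or None)

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            return
        expected = self.open_tags[-1] if self.open_tags else None
        self.problems.append((tag, expected))
        # Recover by unwinding the stack to the matching open tag, if there is one.
        if tag in self.open_tags:
            while self.open_tags.pop() != tag:
                pass


def find_nesting_problems(markup):
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems


# </b> arrives while <i> is still open, then </i> arrives while only <p> is open.
print(find_nesting_problems("<p><b>bold <i>both</b> italic</i></p>"))
print(find_nesting_problems("<p><b>fine</b></p>"))
```

A stack is the natural fit here because well-nested markup closes elements in the reverse of the order in which they were opened.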

One problem, though, is that the tools people use to create web pages also turn out ugly, invalid code. MAMA also looked at validation as it related to web page editors and content management systems, and found that the results were generally dismal. The one exception was Apple’s iWeb, for which an impressive 81.91% of URLs pass validation. Just 0.55% of pages made with Microsoft FrontPage are valid, according to MAMA, and just 3.44% of those made with Adobe Dreamweaver.

For content management systems, the results weren’t much better. WordPress pages were valid just 9.0% of the time, Typo pages 12.74%, and Joomla pages 6.45%. Google’s Blogger fared the worst, with only a paltry 0.30% of URLs passing validation.

There is a lot of interesting information to dig through in the full results, and Opera is promising to release more results as part of a planned “long, multi-part saga.”

Josh Catone

Josh Catone is the Lead Blogger at SitePoint. Prior to working at SP, he was the Lead Writer at ReadWriteWeb.


{ 28 comments }

yman November 8, 2008 at 8:06 pm

Tabbed browsing, Quick Find, fraud protection, saved sessions, Speed Dial, notes and the trash make Opera a better browser than it was. Its speed lets you spend more time online. I installed it and it works very well.

Srirangan October 24, 2008 at 6:50 pm

Except for us iCAB users. 8^)

Yepp.. All 8 of you.. :P :)

Anonymous October 24, 2008 at 5:41 pm

I use iCAB as a browser. It is relevant to this thread because iCAB has a little smiley in the bottom row that scowls whenever it shows a page which does not validate. A click will pop up a list of all the errors. Is that neat or what? Handy too!

Now I realize that iCAB is a Macintosh-only product and that Kevin never recognizes it in his epistles. However, it is a very competent little browser and will parse just about everything. It worked on the Acid Test from the start. Very laudable from a shop which is essentially a one-man operation.

Why should you validate? Because it makes your client appear to be lazy and sloppy if you do not. Of course who’s going to know? No one is going to read your source code.

Except for us iCAB users. 8^)

SimonPhoto October 23, 2008 at 10:22 pm

“..It doesn’t matter if your page isn’t 100% XHTML 1.0 Strict valid..”

Oh, yes, it does. XHTML, if served properly, will give nothing but an error if there is a single error in the markup. XHTML is an XML-based language, and therefore must be well-formed. HTML is SGML-based, and gives you much more leeway.

That said, most of my pages don’t validate via the W3C. I follow the standards though, to the point where it is practical. In many cases, you have a choice: meeting your client’s expectations, providing unfettered access to all users, and following the standards. Choose two.

Sorry guys, but strictly following standards is not a must-have. It’s a nice-to-have.
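The contrast SimonPhoto draws between XML-based XHTML and more forgiving HTML can be sketched in Python using only the standard library. This is an illustration under simplifying assumptions, not a model of any browser: `is_well_formed_xml`, `TagCollector`, and `html_start_tags` are hypothetical names, the XML side uses `xml.etree.ElementTree`, and the HTML side uses `html.parser`, which is a lenient tokenizer and not a browser's error-recovery algorithm.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# The same mis-nested fragment is handed to both parsers.
MARKUP = "<p><b>bold <i>both</b> italic</i></p>"


def is_well_formed_xml(markup):
    """Return True if an XML parser accepts the markup, False if it raises ParseError."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Hypothetical helper: collects start tags to show the HTML tokenizer keeps going."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


print("accepted as XML:", is_well_formed_xml(MARKUP))
print("start tags seen by the HTML tokenizer:", html_start_tags(MARKUP))
```

The XML parser rejects the whole fragment over a single mismatched tag, while the HTML tokenizer reads through it without complaint, which is the leeway the comment describes.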

topdown October 23, 2008 at 7:43 am

What it comes down to is if you’re not going to use the tools set out for you, and check your work, then it’s your own fault when a customer complains about display issues in a particular browser. IE is no exception, and your code should not need hacks; you should just understand the needed CSS for the particular browser to display the page properly.

If you are going to call yourself a developer and develop web pages, you need to know your field and the software the consumers will use to view it.

There is simply no reason for bad or non-valid code!

Shadow Caster October 23, 2008 at 6:35 am

I think perfectionists are the sort of people who really stress on validation of all their pages. In terms of how a search engine looks at it, it doesn’t matter if your page isn’t 100% XHTML 1.0 Strict valid – it won’t mark it down so SEO is unaffected. One of the few reasons why they should make it valid is so that it looks and behaves correctly in the 4 major browsers, but quite often it doesn’t even if it is valid and we have to employ little tweaks.

Antz October 22, 2008 at 6:37 am

What a thing to wish for Alex, browsers supporting enforcing standards! It would certainly be valuable to assist certain developers who have not yet learned a correct way to code, and could possibly be our opportunity to ditch non-compliant browsers such as IE6.

True enough, there will need to be an ability for browser manufacturers to register their proprietary commands to avoid such errors.

SonFishDesign.com October 22, 2008 at 4:44 am

This site has 5 errors in the html but validates css 2.1

SonFishDesign.com October 22, 2008 at 4:34 am

My last client page leans heavily on CSS. The whole page is compliant except for the -moz-box-sizing hack. Enough said.

Srirangan October 22, 2008 at 4:12 am

I think the developers need to all agree on one standard and stop with all the proprietary browsers

Let’s stop pretending that developers actually have a choice here. Standards are all good, but in the end usability counts. Which usually means different hacks for different browsers. :-/

bsmbahamas October 22, 2008 at 4:04 am

virtually every piece of code has starting and ending tags or valid syntax, but if you ignore the syntax, or worse yet the browsers don’t punish developers that ignore standard syntax, valid sites will always be an illusion.

I think the developers need to all agree on one standard and stop with all the proprietary browsers. Each works differently, and nobody has the time to learn to code for several platforms. Our code needs to be compact and portable – not a hundred lines long to detect the browser and environment and then adapt before it executes the meat of the code.

bsmbahamas October 22, 2008 at 3:59 am

I think webdevelopers are caught in a catch-22, because the browser developers all try to be unique and better but all they do is cause us to have to write code for more environments.

We seriously need standards laid out for the average programmer to understand (the W3C reads like vcr instructions in my opinion), and at the same time we need browsers that will obey the standards and i hate to say it – block sites that are broken, but provide ‘plain english’ error reporting so they can be fixed easily.

until then i’ll write halfway pages because it takes forever to learn code for 7 browsers.

Spheriod October 21, 2008 at 11:52 pm

Here’s a novel thought: the “standards” that we have to use…HTML and CSS…need to be revamped. I can’t tell you how many hours I spent looking online for free articles on how to get around all the mind-boggling differences between Firefox and IE in their ways of attacking basic layout of DIVs.

Why doesn’t someone develop a browser (or an add-on to a browser) that uses a NEW STANDARD? Why is the W3C the authority? Can’t a community put something better to the fore? The HTML standards are OUT OF DATE.

Jim October 21, 2008 at 10:05 am

What a great marketing tactic on the part of Opera. They are trying to be known as the most standards compliant browser. They produce a compelling report on validation and presto: we have a buzz, brand association, and character association to that brand.

Stephen October 21, 2008 at 4:11 am

One problem I run into is that our site uses a piece of search software that places its options in the tag, which it then reads and displays the results as directed. But since these made-up terms are not valid attributes for , our site will never be 100% valid.

How many sites out there are perfectly valid in every way, except they contain code that would not meet W3C validation but are intended to be read by a different piece of software anyway?

graedus_dave October 21, 2008 at 2:48 am

Ever run google.com through the W3C’s validator? It had 67 errors when I checked it on a lark last week.

Srirangan October 20, 2008 at 8:29 pm

Don’t really care. Aren’t W3C’s recommendations supposed to be just “guidelines” anyway? Usability > Validity. End users more important than HTML purists.

jasonking October 20, 2008 at 1:05 pm

Hopefully Opera will make this MAMA tool available, I’d love to try it out.

I try to make all my sites 100% compliant but when clients start entering content using the CMS, errors are bound to happen. I try to educate clients to avoid the most common mistakes (pasting directly from Word etc) and once a year maybe give the content a validation spring clean.

47 errors on the average page? A year ago I took over a website that had over 1,500 errors on a single page. Today it 100% validates. Gives me a warm glow inside.

Homie_187 October 19, 2008 at 7:50 am

I’m surprised the percentage was so high. I would have expected it to be less than 1%.

Black Max October 19, 2008 at 5:01 am

What Tommy said!

AutisticCuckoo October 19, 2008 at 1:54 am

I guess I don’t think that every single web designer/developer should be forced to validate every single web page they produce.

Why not? Would you also accept magazine articles or novels with bad spelling and grammar in every other sentence?

Validation is like spell-checking your markup. It’s a quality assurance step. If you don’t think your work is important enough for some QA, perhaps it shouldn’t be published at all?

=IceBurn= October 18, 2008 at 7:09 am

I’m proud to say that ALL my works are included in those 4.13% :)

zendak October 18, 2008 at 4:28 am

As mentioned in the article, the majority of tools such as desktop WYSIWYG editors or CMSes generate code that is primarily utilitarian instead of aiming at validity. As long as we tolerate that and refuse to use alternatives, it’s our own fault. There are ways: Ditch WYSIWYG tools (time to actually learn HTML, oh my), use CMSes that let you specify your own markup instead of forcing their garbage on you. The latter problem exists with many, if not most, of the “popular” open-source tools and widely used commercial systems. Time to take responsibility and push alternatives that do it right; they exist. This may involve getting over personal laziness or delving into company policies/politics, but hey, nothing’s for free. That is, if you even care, which I suspect is not the case for most “web developers” out there.

The MAMA study findings released today, also revealed that ~50% of pages sporting “W3C Valid” buttons aren’t actually valid.

Which is yet more evidence of the fact that “validation badges” are moronic. For a less polemic explanation, I strongly recommend all you proud badge-bearers read this fine pearl of wisdom.

lukemeister October 18, 2008 at 3:49 am

I guess I don’t think that every single web designer/developer should be forced to validate every single web page they produce. I’m from the school of thought where I figure a page should validate as well as it needs to in order to serve its purpose. It can be very time consuming and sometimes very (unnecessarily) costly to make sure a page validates 100%. Time is money in a lot of cases. Why should I care if a site validates on every single page if it works fine in every browser, the content is accessible and the site is able to present what it needs to present without impeding usability? Especially if just making it ‘validate’ would take a bunch of extra hours and make no real noticeable difference in accessibility or usability? Validation is nice, and good in a lot of web site circumstances, but it shouldn’t be forced upon every single web page on the web.

Mike October 18, 2008 at 2:56 am

It isn’t the developers’ fault. It’s a mix of every browser handling design aspects differently, Internet Explorer completely sucking at everything, and the fact that like 20% of people still use IE6. We have no choice but to hack away at the code to make it usable. If every browser were the same, and supported code correctly, all websites would be forced to use valid code.

XtrEM3 October 18, 2008 at 1:34 am

that’s a neat story, max

alex October 17, 2008 at 9:08 pm

I think web browser developers have it the wrong way around: we should be FORCED to make valid web pages (invalid pages won’t show, or show error information), not them making browsers that compensate for our laziness.

Black Max October 17, 2008 at 12:21 pm

The code for the site I work with is an example of poor validation. One of my main goals in the redesign (if we ever get the redesign implemented) is 100% validation. Then the percentage will be 4.130000001%. One of my problems in meeting this goal is our prima donna, AWOL “re”designer, who embedded a profanity-laced tirade against validation in the CSS code itself. (!) (I won’t get into what happened when I objected.)
