Does Your Text to Code Ratio Matter?

Contributing Editor

A page’s text to code ratio is a measure of the quantity of content compared to the structure. For example, assume your page is 1,000 bytes. If 700 bytes are used for HTML tags and embedded CSS or JavaScript, 300 bytes would be readable content. You would have a text ratio of 3:7 or percentage of 30%. There are many tools to help you assess the figure — one of the easiest to use is DOM Monster.
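The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not how DOM Monster or any other tool calculates its figure. It assumes "text" means character data outside `<script>` and `<style>` elements with whitespace collapsed, that sizes are measured in UTF-8 bytes, and the names `VisibleTextCollector` and `text_to_code_ratio` are made up for this example.

```python
from html.parser import HTMLParser


class VisibleTextCollector(HTMLParser):
    """Collects character data found outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than zero while inside script/style
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def text_to_code_ratio(html):
    """Return text bytes divided by total page bytes (0.0 for an empty page)."""
    collector = VisibleTextCollector()
    collector.feed(html)
    collector.close()
    # Collapse runs of whitespace so source indentation is not counted as text.
    text = " ".join(" ".join(collector.chunks).split())
    text_bytes = len(text.encode("utf-8"))
    total_bytes = len(html.encode("utf-8"))
    return text_bytes / total_bytes if total_bytes else 0.0
```

For the 1,000-byte page described above, 300 bytes of text would give 300 / 1000 = 0.3, or 30%. Note that this sketch also counts the `<title>` text as content, which a real tool might treat differently.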

But is there an ideal text to code ratio?

Search Engine Optimization

Some SEO experts claim that higher text ratios improve search engine positions. I’ve even seen 42% stated as a magical perfect percentage which will boost your Google PageRank. However, I suspect that number’s been plucked from thin air by an SEO snake oil “specialist” with a Hitchhiker’s Guide to the Galaxy obsession.

Personally, I don’t believe the text to code ratio has a significant impact on SEO. That said, Google is widely reported to ignore content beyond roughly the first 100KB, so larger pages could benefit from a higher text ratio. However, if you’re exceeding 100KB, I’d suggest that splitting the document into several more focused pages would be a more constructive SEO exercise.

Code Efficiency

In general, it’s best practice to use as little code as possible. Unnecessary tags add page weight, slow downloads and make browser rendering less efficient. They also make your code harder to maintain.

If your pages are lightweight and use clean, semantic HTML with external CSS and JavaScript files, your text to code ratio will naturally rise. There are a few exceptions:

  • Shorter pages will have a low text ratio because you require a minimum number of code elements to create a valid HTML document.
  • Media-heavy pages, such as a gallery with images or videos, typically have low text ratio.
  • Flash or Ajax-powered web applications may not have any content in the page source whatsoever, even though the content is still there once the application runs.

In general though, a text to code ratio which exceeds 50% is achievable on most content pages.

So should we, as conscientious developers, measure our text to code ratios?

If you’re already using good coding techniques, there’s little need to bother. You could use it as a factor when measuring efficiency since a good developer’s pages will generally have a higher text to code ratio. However, ratios should never be considered in isolation. After all, a page scoring 25% which works everywhere is usually better than one which scores 50% but fails in most browsers.

Do you consider your pages’ text to code ratios? Do you use them to evaluate your own performance?

  • InfraredWebSolutions

    I don’t ever consider this. The only time it even trickles into the back of my mind is when, as you said, I encounter a HUGE page over 100kb…. this very rarely happens though.

    Everything I do on the web is SEO centric, and to be honest, even if you were to consider this it would be WAY down the list of things. There are FAR more important aspects to a page like its load time, the keyword density, proper coding techniques, and let’s not forget the most important aspect of all….. CONTENT!

    The ONLY way I can possibly see higher text ratios improving SEO is naturally, through content. Obviously if you have more words on your page, you can more easily adjust your keyword density as you see fit, along with targeting more keywords. You also most likely, although this is not always the case, have better content with a few paragraphs vs a few sentences. Again, CONTENT = KING! On-page SEO accounts for a mere 10% of all SEO… it’s all about off-page SEO which is driven by your content.

  • Anonymous

    I do not think the text to content ratio (even if there is such a thing) matters. What matters is the content quality and coding standards.
    Benny
    Macronimous.com

    • http://www.optimalworks.net/ Craig Buckler

      You should achieve a high text to code ratio with good quality content and decent coding standards. So it could be used to prove you’re doing that (to a certain extent, anyway).

  • Cromulent

    The SEO industry is 99% myth and old wives’ tales and 1% fact (numbers taken from the “SEO Industry Guide to Statistics”).

    • http://www.optimalworks.net/ Craig Buckler

      And remember that 98.5% of statistics are made up.

  • AgmLauncher

    Are you sure that Google still limits the crawl to 100KB? I thought that was a myth. My site has pages that are 1MB in size (due to graphics), but also have 50KB worth of CSS and god knows how much code.

    This isn’t due to poor coding, it’s just because the pages contain a lot of complex information, and the site itself is very complex.

    Despite that, Google has had no problem crawling my pages and I’m well ranked for all the keywords I want to be ranked for.

    • http://www.optimalworks.net/ Craig Buckler

      As far as I’m aware, most search engines have an indexing limit. Many SEO experts say it’s around 100Kb for Google, but some report more. Only Google will know for sure.

      Large pages will still be indexed, but the content beyond that limit may not be.

  • Stormrider

    Seems WordPress doesn’t like my new forum password, and isn’t letting me login to post. Doesn’t inspire much confidence in the developers of WordPress!

    Anyway, for the first part, the ratio is 3:7, not 3:10!

    • http://www.optimalworks.net/ Craig Buckler

      Your message has appeared!

      Whoops — sorry about the ratio. It’s been updated.

      • http://www.cemerson.co.uk Stormrider

        My message appeared but I had to post it as an ‘anonymous’ user!

        For some reason I am still logged in at home; it’s just at work where I can’t log in (although I can log in to the forum).

    • http://www.deathshadow.com deathshadow60

      Like winning the 2008 pwnie for M4ss PWnage did? Like their primary skin coders not even understanding the basics of how to use classes or how full URL’s are a total waste of bandwidth does?

      Confidence, yeah, that’s a word one associates with turdpress — NOT.

  • http://www.apcooper.co.uk AndrewCooper

    Hey Craig, very interesting post! I understand where you’re coming from because I’ve thought about it a few times in the past but it hasn’t really made me change anything because I know I use good coding practices and what not so I don’t lose any sleep at night thinking about text to code ratios ;)

    However, it certainly makes me think about performance. Every time I develop a Web page I’m always keeping an eye out for the amount of code compared to the amount of content on the page. When I’m looking at the source code of other Web pages I also like to skim and scan the page to see what their ratio is like. I generally think to myself that if there is minimal code then I’m on the good side, but there are times when I look at the source code of other Web pages and see it littered with HTML and CSS and know that it can’t be good for performance!

    Andrew Cooper

  • Virtual Labz

    I greatly rely on the text to code ratio. Other things matter too, but this one is an important aspect, so don’t miss it out.

  • http://www.deathshadow.com deathshadow60

    The term is “Code to content ratio”, or at least in terms of markup to content. I find it amazing how said term has fallen into disuse the past decade… but then I’m the guy who still knows what a K-LoC is.

    I have a rule of thumb calculation on HTML to its content that is usually pretty spot on…

    1.5K+content size*1.5+200 bytes per ‘object type element’

    Object type elements being OBJECT, EMBED, IMG, or for the HTML5 idiocy AUDIO and VIDEO.

    If your code is larger than that calculation gives you, the markup is probably total trash.

    CSS is harder to scale since you might be using a larger CSS file to pre-cache the appearance of sub-pages instead of doling out billions of little files — but as a rule I find that if you have more than 32k of CSS or more than two files per “media type”, it’s entirely likely the CSS is rubbish.

    THEN you have javascript, and this is where the real idiocy crops up these days with people bloating out their pages with HUNDREDS of K of “Javascript for NOTHING”. This can mostly be blamed on “gee ain’t it neat” animated garbage which does NOTHING but annoy the end user the second time they visit the site, but as much blame goes in the laps of the “framework bandwagon” lunacy… steaming piles of manure like Jquery uncompressed being by itself almost the same size I usually set as the ideal size for a single page on my sites — that’s HTML+CSS+IMAGES+SCRIPTS!!!. After all, there’s a reason Dan Schulz used to say “the only thing you can learn from Jquery is how NOT to program javascript” — and no, that wasn’t meant as a compliment.

    DOM Monster’s numbers are cute, but I kinda wish it would list the CtC without counting images towards it. That would be the real litmus test… its “percentages” make ZERO sense compared to doing a cut/paste of my text and comparing to the “document size” in the web dev toolbar (under FF 3.5 since that’s another feature broken in 3.6… my god FF’s web dev stuff is aging like milk).

    While I can’t speak to the SEO side (generally most bloated markup is ignored by engines anyways, so long as you have semantic tags on the correct items bloated extra markup should have no effect) it’s simply a matter of efficiency and ease of maintenance in the future.

    … after all as I’ve said a billion times, the less code you use, the less there is to break.
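deathshadow60's rule of thumb above can be sketched as a small Python check. This is one reading of the comment, not a definitive implementation: it assumes "1.5K" means 1.5 × 1024 bytes, that the result is a ceiling on the size of the whole HTML document (markup plus content), and that all sizes are in bytes. The function names are hypothetical.

```python
BASE_OVERHEAD_BYTES = 1536  # assumption: "1.5K" read as 1.5 * 1024 bytes
BYTES_PER_OBJECT_ELEMENT = 200  # OBJECT, EMBED, IMG, AUDIO, VIDEO


def markup_budget_bytes(content_bytes, object_elements=0):
    """Largest HTML size the rule of thumb allows for this much content."""
    return (
        BASE_OVERHEAD_BYTES
        + content_bytes * 1.5
        + BYTES_PER_OBJECT_ELEMENT * object_elements
    )


def markup_is_bloated(html_bytes, content_bytes, object_elements=0):
    """True when the document is larger than the rule-of-thumb budget."""
    return html_bytes > markup_budget_bytes(content_bytes, object_elements)
```

Under these assumptions, a page with 2,000 bytes of content and three images gets a budget of 1536 + 3000 + 600 = 5,136 bytes, so a 6,000-byte document would be flagged and a 5,000-byte one would not.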

  • de la Cruz

    I am not sure where I stand on this so-called “code ratio”. It seems that a lot of PHP “developers” try to write the fewest lines of code possible, as if to reduce the execution time. However, those who do parallel programming or know data structures well will know that the length of code has almost nothing to do with its performance, i.e. the fastest algorithms are usually very long but do their job very well and fast.

    PS: This is how I feel when I read people’s PHP code http://pastebin.com/VTVrmRNb . PHP people claim it’s faster than using helper functions and what not. To which I say “I don’t care, I don’t know what the hell you are trying to do”. :(

    • http://www.optimalworks.net/ Craig Buckler

      This article discusses HTML code to content ratios — using minimal mark-up will always be the best solution.

      Code such as PHP is another matter. In general, the shorter and simpler the application, the better. However, you should also take other considerations into account such as extendability, reuse, maintenance, etc.

  • Powers

    I think there are more important things to worry about than this “text to code ratio” thingy thing. If you are a decent programmer (not even an expert) you will know how to minimize your markup…HTML/CSS. I don’t even know if it has much to do with optimization but I minimize my coding due to time efficiency and readability. Usually when I write a chunk of code I review it even if it is 100% working with not the slightest of errors and check how I can reduce it. If you follow this practice you will naturally have a good ratio, even if it’s time consuming at first it will eventually save you time in future projects.

  • cadav3r

    I think that a higher text/code ratio is better.