Information Silos are Bad for the Web

Josh Catone

Last week we wrote that if the future of computing lay in closed web platforms like Facebook’s and the iPhone’s, it would be bad for users and the web at large. The same thing could certainly be said for web sites as well — closed information silos are bad for the web.

Today, Tim O’Reilly wondered if self-linking is on its way to becoming the norm across the web. O’Reilly pointed to media sites like TechCrunch and the New York Times, which link outside their own properties less and less often. TechCrunch, for example, links to a site it owns called CrunchBase for further information about the companies it covers.

As far as I know, gadget blog Engadget pioneered the internal linking strategy that many blogs now use, whereby links almost always refer to other content on the site rather than send users out to the rest of the web. In this post about Google’s “Free the Airwaves” initiative, for example, Engadget links to a tag search on its own site for “whitespace” and to three previous posts (two of them linked from organization names, rather than linking directly to those organizations). The only link to Google’s actual blog post about the subject, arguably the most important link, appears at the very bottom of the post as a tiny “Read” link.

Internal linking is seen at large media companies as well, and even at search engines. Google, for example, was recently seen giving its Merchant Search home loan comparison site special treatment in the UK. It tried something similar in the US a couple of years ago, called “Google Tips,” which pushed links to Google content first and foremost. (Tips was pulled just a few weeks after it debuted.)

The reason for this type of internal linking is easy to fathom: the more you can keep people on your site, the more money you can make off them by serving them advertising. But it can also be dangerous for the web, as it leads to the creation of information silos that decrease the utility of the web for users by exposing them to only one point of view and data source.

“When this trend spreads (and I say ‘when’, not ‘if’), this will be a tax on the utility of the web that must be counterbalanced by the utility of the intervening pages,” says O’Reilly. “If they are really good, with lots of useful, curated data that you wouldn’t easily find elsewhere, this may be an acceptable tax. In fact, they may even be beneficial, and a real way to increase the value of the site to its readers. If they are purely designed to capture additional clicks, they will be a degradation of the web’s fundamental currency, much like the black hat search engine pages that construct link farms out of search engine results.”

O’Reilly lays out two rules for anyone to consider before linking to previous content or specially made content on their own site:

  1. Ensure that no more than 50% of the links on any page are to yourself. (Even this number may be too high.)
  2. Ensure that the pages you create at those destinations are truly more valuable to your readers than any other external link you might provide.
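
The first rule is easy to check mechanically. The following is a minimal sketch, not anything O’Reilly proposed: it assumes a link counts as “to yourself” when its host exactly matches the host of the page it appears on, so a sibling property on another domain (CrunchBase, from TechCrunch’s point of view) would be counted as external unless you extend the comparison. The names `LinkCollector` and `self_link_share` are made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def self_link_share(html, page_url):
    """Return the fraction of a page's web links that point at its own host.

    Assumption: "self" means an exact hostname match with page_url.
    Non-web links (mailto:, javascript:, and so on) are ignored.
    """
    collector = LinkCollector()
    collector.feed(html)

    own_host = urlparse(page_url).hostname
    internal = 0
    total = 0
    for href in collector.hrefs:
        # Resolve relative links such as "/tag/whitespace" against the page URL.
        parsed = urlparse(urljoin(page_url, href))
        if parsed.scheme not in ("http", "https"):
            continue
        total += 1
        if parsed.hostname == own_host:
            internal += 1

    # A page with no web links trivially satisfies the rule.
    return internal / total if total else 0.0


sample = """
<p><a href="/tag/whitespace">whitespace</a>
<a href="http://example.com/2008/08/earlier-post">earlier coverage</a>
<a href="http://other.example.org/announcement">Read</a>
<a href="mailto:tips@example.com">Tip us</a></p>
"""

share = self_link_share(sample, "http://example.com/2008/08/post")
print(f"{share:.0%} of links are internal; over the 50% limit: {share > 0.5}")
```

A check like this only covers the first rule. Whether the destination pages are more valuable than an external link, the second rule, remains an editorial judgment.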

My own link policy for blog posts at SitePoint (adapted from the policy in place at ReadWriteWeb while I was there) is something like this:

  1. Always link out to as many sources as you can to provide the reader with further reading and to strengthen your argument.
  2. Only link internally if the article provides the context you are looking for (i.e., don’t link to yourself solely to give your content a link — only do it when it makes sense).
  3. Never link to an article about a company rather than the company itself. Readers who click on a company name don’t want to be brought to previous coverage; they want to get to the company’s web site, so get them there. (For larger companies, like Google, or companies that are mentioned only in passing and aren’t a focus of the post, I may forgo linking altogether.)

Any site that deals in information should consider how much internal linking it does. Information silos are bad for the web, and web sites should link to each other as much as possible, rather than just to themselves.