Salvaging links from WMT “Crawl Errors” list

When someone links to your website but makes a typo while doing it, those broken inbound links show up in Google Webmaster Tools under Crawl Errors as “Not Found”. Often they are easy to salvage by just adding a 301 redirect in the htaccess file.
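A minimal sketch of that kind of salvage rule (the page names here are hypothetical, just to illustrate):

```apache
# Hypothetical example: a linking site typed /abuot.html instead of /about.html.
# A simple mod_alias 301 sends visitors (and the link's SEO value) to the real page.
Redirect 301 /abuot.html http://www.mydomain.com/about.html
```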

But sometimes the typo is really weird - and that’s what I need your help with.

If it is something easy, like they just lost the last part of the URL ( such as ), then I fix it in htaccess this way:

RewriteCond %{HTTP_HOST} ^mydomain\.com$ [OR]
RewriteCond %{HTTP_HOST} ^www\.mydomain\.com$
RewriteRule ^pagenam$ http://www.mydomain.com/pagename.html [R=301,L]

But what about when the last part of the URL is really screwed up? Especially with non-text characters, like these:

How is the htaccess Rewrite Rule typed to deal with these oddballs?

Your help is greatly appreciated. Thanks!


I think writing a rule for every conceivable typo is impossible, and even attempting it would bloat your htaccess file.

Couldn’t you do something like

# PAGE (folder) NOT FOUND
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ /notreal [L]

ErrorDocument 404 /error.php?error=404

where “/notreal” is a path you know doesn’t, and won’t ever, exist,
and “error.php” is a page that explains the error and offers some help, e.g. a site map, a site-search input, etc.

Hello Mittineague,

I do get your point, and I don’t intend to redirect them ALL. Fortunately we do not have a big pile of these, but some of these broken links come from good sites and I would like to salvage them for SEO purposes.

So it would really help to know how to deal with these broken links that have typos in the URL, especially non-text characters at the very end (after the .html).

Can you provide some tips on those?
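For what it’s worth, one way to handle trailing junk after the .html (a sketch, assuming the real page is pagename.html as in the earlier example): escape any regex metacharacters in the pattern with a backslash, or use a catch-all like .+ to mop up whatever garbage follows the real filename.

```apache
# Match the real page name followed by any unwanted trailing characters.
# mod_rewrite compares against the already-URL-decoded path, so a link
# ending in %22 arrives here as a literal " character.
RewriteRule ^pagename\.html.+$ http://www.mydomain.com/pagename.html [R=301,L]

# Or, to target one specific oddball ending (here a stray closing
# parenthesis), escape that character in the pattern:
RewriteRule ^pagename\.html\)$ http://www.mydomain.com/pagename.html [R=301,L]
```

One rule per salvageable page keeps this manageable: you only write patterns for the handful of broken links from sites worth salvaging, not for every conceivable typo.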