  1. #1

    Salvaging links from WMT “Crawl Errors” list

    When someone links to your website but makes a typo in the URL, the broken inbound link shows up in the Crawl Errors section of Google Webmaster Tools as “Not Found”. These links are often easy to salvage by adding a 301 redirect to the .htaccess file.

    But sometimes the typo is really weird - and that's what I need your help with.

    If it is something easy, such as a URL that lost its last few characters (for example www.mydomain.com/pagenam), then I fix it in .htaccess this way:

    RewriteCond %{HTTP_HOST} ^mydomain\.com$ [OR]
    RewriteCond %{HTTP_HOST} ^www\.mydomain\.com$
    RewriteRule ^pagenam$ "http\:\/\/www\.mydomain\.com\/pagename\.html" [R=301,L]
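
    For comparison, the same redirect can probably be written more compactly. This is only a sketch, assuming the two host conditions differ by nothing but the optional www. prefix. As far as I know, the target of a RewriteRule is a plain string rather than a regular expression, so the backslash escapes there are not required.

    ```apache
    # Match mydomain.com with or without "www." (NC also ignores letter case,
    # which is slightly looser than the two original conditions)
    RewriteCond %{HTTP_HOST} ^(www\.)?mydomain\.com$ [NC]
    # The pattern is a regex; the target URL is an ordinary string
    RewriteRule ^pagenam$ http://www.mydomain.com/pagename.html [R=301,L]
    ```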

    But what about when the last part of the URL is really screwed up, especially with non-text characters? For example:
    www.mydomain.com/pagename.htmlsale
    www.mydomain.com/pagename.htmlhttp://
    www.mydomain.com/pagename.html%22
    www.mydomain.com/pagename.html/

    How should the .htaccess RewriteRule be written to deal with these oddballs?

    Your help is greatly appreciated. Thanks!

    Greg

  2. #2
    Mittineague
    I think writing a rule for every conceivable typo is impossible, and even attempting it would add bloat to your .htaccess file.

    Couldn't you do something like this:
    Code:
    # PAGE (folder) NOT FOUND
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-d
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
    # Send any request that is neither a file nor a directory to a path that does not exist
    RewriteRule ^.*$ /notreal [L]
    
    ErrorDocument 404 /error.php?error=404
    where "/notreal" is a path you know doesn't exist and never will,
    and "error.php" is a page that reports the error and offers some help, e.g. a site map, a site search box, etc.

  3. #3
    Hello Mittineague,

    I do get your point, and don't intend to redirect them ALL. Fortunately we do not have a big pile of these. But some of these broken links are from good sites and I would like to salvage them for SEO purposes.

    So it would really help to know how to deal with these broken links that have typos in the URL, especially non-text characters at the very end (after the .html).

    Can you provide some tips on those?

    Thanks!
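
    A possible starting point for the trailing-junk URLs listed in the first post, offered as an untested sketch rather than a definitive answer. It assumes the rules sit in the .htaccess file in the document root, that RewriteEngine is already on, and that mod_rewrite matches the pattern against the URL-decoded path (so %22 is seen as a literal double quote) with doubled slashes collapsed. The second rule is a hypothetical generalization that is not from the thread, and the host conditions from the first post can be placed in front of either rule if needed.

    ```apache
    # Anything that starts with pagename.html and has at least one extra
    # character after it (sale, http:/, ", /) goes to the real page.
    # ".+" needs one or more characters, so the correct URL itself
    # does not match and cannot loop.
    RewriteRule ^pagename\.html.+$ http://www.mydomain.com/pagename.html [R=301,L]

    # Hypothetical catch-all for any .html page with trailing junk:
    # redirect only when the part up to ".html" is a real file.
    RewriteCond %{DOCUMENT_ROOT}/$1 -f
    RewriteRule ^(.+\.html).+$ http://www.mydomain.com/$1 [R=301,L]
    ```

    The trailing-slash case (pagename.html/) may depend on how the server handles PATH_INFO, so it is worth checking each of the four URLs separately. Testing with R=302 first and switching to R=301 once the results look right avoids browsers caching a wrong permanent redirect.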

