Thread: Web crawl errors

  1. #1
    Web Enthusiast
    Join Date
    Jul 2000
    Location
    Western Massachusetts, USA
    Posts
    1,329
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Web crawl errors

    I am getting a huge number of crawl errors from Google Webmaster tools, for example,

    Furthermore, my favorite link checker, Link Sleuth, only crawls the first page.

    The site seems to have generated errors from the beginning.

    Here is my .htaccess file:

    Code:
    #suPHP_ConfigPath /home/berkshi1
    DirectoryIndex index.php
    
    #to redirect urls from old to new urls
    Redirect 301 /pages/services.htm http://berkshiredentist.com/patient-services/
    Redirect 301 /pages/home.htm http://berkshiredentist.com/
    Redirect 301 /pages/library.htm http://berkshiredentist.com/patient-library/
    Redirect 301 /pages/contact.htm http://berkshiredentist.com/contact-us/
    Redirect 301 /pages/about.htm http://berkshiredentist.com/about-us/
    Redirect 301 /pages/links.htm http://berkshiredentist.com/61-2/
    Redirect 301 /pages/gallery.htm http://berkshiredentist.com/photo-gallery/
    Redirect 301 /pages/forms.htm http://berkshiredentist.com/new-patient-forms/
    Redirect 301 /pages/patient_library/veneers.htm http://berkshiredentist.com/porcelain-veneers/
    Redirect 301 /pages/patient_library/gumdisease.htm http://berkshiredentist.com/gum-disease-serious-but-treatable/
    Redirect 301 /pages/patient_library/diagnodent.htm http://berkshiredentist.com/diagnodent-laser-reflection-finds-imperfection-aids-in-correction/
    Redirect 301 /pages/patient_library/airabrasion.htm http://berkshiredentist.com/patient-library/
    Redirect 301 /pages/patient_library/implants.htm http://berkshiredentist.com/implants-permanent-secure-tooth-replacements/
    Redirect 301 /pages/patient_library/digitalxray.htm http://berkshiredentist.com/digital-x-rays/
    Redirect 301 /pages/special.htm http://berkshiredentist.com/
    Redirect 301 /pages/services.htm http://berkshiredentist.com/
    Redirect 301 /pages/faqs.htm http://berkshiredentist.com/
    
    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress
    What am I missing?
    Paul C.
    ClickBasics
    http://www.clickbasics.com

  2. #2
    SitePoint Wizard bronze trophy C. Ankerstjerne's Avatar
    Join Date
    Jan 2004
    Location
    The Kingdom of Denmark
    Posts
    2,692
    Mentioned
    7 Post(s)
    Tagged
    0 Thread(s)
    The links in your source look fine (although your source code does look a bit crowded). Which type of crawl errors are you getting?
    Christian Ankerstjerne
    <p<strong<abbr/HTML/ 4 teh win</>
    <>In Soviet Russia, website codes you!

  3. #3
    Mouse catcher silver trophy
    SitePoint Award Recipient Stevie D's Avatar
    Join Date
    Mar 2006
    Location
    Yorkshire, UK
    Posts
    5,105
    Mentioned
    66 Post(s)
    Tagged
    1 Thread(s)
    Jesus wept ... and that was before he looked at the source code of a Wordpress page.

    For some reason, there's a bunch of Javascript that seems to be screwing up your outbound links:
    Code:
    <a title="About Cosmetic Dentistry" href="http://www.aboutcosmeticdentistry.com/" onclick="javascript:_gaq.push(['_trackPageview','/yoast-ga/outbound-article/www.aboutcosmeticdentistry.com']);">About Cosmetic Dentistry</a>
    It looks like bots are trying to follow the path in the script snippet as well as, or instead of, the actual link destination. That path is only there to track what people have clicked on, and it resolves to a 404 error rather than to a real page, which would explain why Googlebot is having trouble.
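
To illustrate the theory above, here is a minimal sketch of how a crawler that scans raw markup for URL-like strings could end up requesting the tracking path. This is an assumption about crawler behaviour, not a description of how Googlebot actually works; `findPathsInHandlers` is a hypothetical helper written for this example, and the site root is taken from the .htaccess in the first post.

```javascript
// Hypothetical helper: pulls root-relative paths out of inline event-handler
// attributes, the way a naive crawler scanning raw markup might.
function findPathsInHandlers(html) {
  const paths = [];
  // Match on*="..." attributes (double-quoted values only, for brevity).
  const handlerRe = /\son\w+\s*=\s*"([^"]*)"/gi;
  let handler;
  while ((handler = handlerRe.exec(html)) !== null) {
    // Inside the handler body, collect single-quoted strings that start with "/".
    const pathRe = /'(\/[^']*)'/g;
    let match;
    while ((match = pathRe.exec(handler[1])) !== null) {
      paths.push(match[1]);
    }
  }
  return paths;
}

// The link markup quoted in the post above.
const snippet =
  '<a title="About Cosmetic Dentistry" href="http://www.aboutcosmeticdentistry.com/" ' +
  'onclick="javascript:_gaq.push([\'_trackPageview\',' +
  '\'/yoast-ga/outbound-article/www.aboutcosmeticdentistry.com\']);">About Cosmetic Dentistry</a>';

const found = findPathsInHandlers(snippet);

// Resolve each path against the site root, as a crawler would for a relative link.
const resolved = found.map((p) => new URL(p, 'http://berkshiredentist.com/').href);

console.log(resolved);
// → [ 'http://berkshiredentist.com/yoast-ga/outbound-article/www.aboutcosmeticdentistry.com' ]
```

No page exists at that resolved address, so a bot requesting it would get a 404, which is consistent with the errors being reported.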

  4. #4
    Web Enthusiast
    Very interesting about the Javascript code. It is not in the content HTML.

    HTML Code:
    <strong><a title="About Cosmetic Dentistry" href="http://www.aboutcosmeticdentistry.com/">About Cosmetic Dentistry</a></strong> - Learn about procedures, view photos and find out how you can beautify your smile with cosmetic dentistry.
    Turns out the Wordpress Google Analytics program was set to track outbound links. I disabled that, and now the Javascript is gone in the source view.

    Many thanks for pointing me in the right direction.
    Paul C.
    ClickBasics
    http://www.clickbasics.com

  5. #5
    SitePoint Member
    Join Date
    Oct 2012
    Posts
    1
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    How do you find the Wordpress Google Analytics program setting to disable it? I am having the same problem.

  6. #6
    Web Enthusiast
    Frankly, it's been so long, I don't remember. I googled "google analytics event tracking outbound links" and found a whole bunch of articles, including this one:

    https://developers.google.com/analyt...ntTrackerGuide

    Hope that helps.
    Paul C.
    ClickBasics
    http://www.clickbasics.com
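
For anyone who still wants outbound-link tracking without inline `onclick` attributes in the markup, the event-tracking approach mentioned above could look roughly like the sketch below. It assumes the legacy `ga.js` queue (`_gaq`) seen earlier in this thread is loaded on the page; the category and action labels, and the helper names `outboundEvent` and `isOutbound`, are made up for this example. It is not the plugin's own implementation.

```javascript
// Builds the legacy ga.js command for an outbound-link event.
// 'Outbound Links' and 'Click' are placeholder labels chosen for this sketch.
function outboundEvent(href) {
  return ['_trackEvent', 'Outbound Links', 'Click', href];
}

// Returns true when an absolute URL points at a different host than the site.
function isOutbound(href, siteHost) {
  try {
    return new URL(href).hostname !== siteHost;
  } catch (e) {
    return false; // relative or malformed URLs are treated as internal
  }
}

// Browser-only wiring: one delegated listener, so the links themselves
// carry nothing but a plain href.
if (typeof document !== 'undefined') {
  document.addEventListener('click', function (event) {
    var link = event.target.closest ? event.target.closest('a[href]') : null;
    if (link && isOutbound(link.href, location.hostname)) {
      window._gaq = window._gaq || [];
      window._gaq.push(outboundEvent(link.href));
    }
  });
}
```

Because the tracking data is sent as an event rather than written into each link as a path, the page source contains no extra URL-like strings for a bot to pick up. Note that the hostname comparison is exact, so `www.` and non-`www.` variants of the same site would count as different hosts unless normalised first.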
