  1. #1
    SitePoint Wizard
    Join Date: May 2002
    Posts: 1,370

    XML sitemap - Google indexing less than submitted

    About a month back, I updated a sitemap that G has been indexing in full for a couple of years now.

    In June, 230 pages were submitted. G indexed 229 of them for several weeks until recently, when the sitemap info in Webmaster Tools started showing a message to the effect of "Oops, something is wrong here... we are looking into this".

    Now when I go in, 228 pages are indexed with no error message, as if all is well and fine.

    How do I find the pages being left out, and what should I do about them?

  2. #2
    SitePoint Mentor
    Mikl
    Join Date: Dec 2011
    Location: Edinburgh, Scotland
    Posts: 1,608

    Just because a page is referenced in the sitemap, that doesn't mean it will always appear in Google's index. The page might be blocked, for example by robots.txt or a noindex meta tag. Or it could be that the crawler was unable to access it for some reason.
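    For example, a robots.txt rule blocks crawling and a noindex meta tag blocks indexing -- the path here is just a placeholder:

        # robots.txt -- stops crawlers from fetching anything under /private/
        User-agent: *
        Disallow: /private/

        <!-- in the page's <head>: the page can be crawled, but Google is told not to index it -->
        <meta name="robots" content="noindex">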

    Have you checked "Crawl Errors" and "Blocked URLs" in Webmaster Tools?

    Mike

  3. #3
    SitePoint Wizard
    Join Date: May 2002
    Posts: 1,370

    Quote Originally Posted by Mikl
    Have you checked "Crawl Errors" and "Blocked URLs" in Webmaster Tools?

    Mike
    Thanks for the reply, Mike.

    No blocked URLs, but there are crawl errors -- it looks like a couple of external links chopped my URLs. I didn't know this could be why the grand master isn't indexing those two pages...

    I'm wondering if G is changing the way they report 404s.

    About 2 1/2 years ago the site was redone. I set up the custom 404 page for a list of old pages I felt shouldn't be 301'd, and only redirected the ones that most closely related to the new pages, in an effort not to overload the site with redirects. Then, well over a year later, G decided to start resurrecting the pages that had naturally fallen out of its index and reporting them as 404s, as if that's not what it wants to see. Maybe I should go back and do all 301s?
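    (If I do go back to 301s, I'm assuming it's just a line per old page in .htaccess -- the file names below are made up:)

        # .htaccess (Apache): permanently redirect an old URL to its closest new page
        Redirect 301 /old-page.html http://www.example.com/new-page.html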

    Also, I thought there might be some sort of penalty for overusing 301s.

  4. #4
    SitePoint Mentor
    Mikl
    Join Date: Dec 2011
    Location: Edinburgh, Scotland
    Posts: 1,608

    Datadriven,

    An incorrect external link won't prevent any pages from being indexed. Typically, these are links where the domain name is correct, but either the filename is wrong, or it's got some spurious characters tagged on the end. Google will report these in Webmaster Tools, but they won't do you any harm.

    Quote Originally Posted by Datadriven
    I'm wondering if G is changing the way they report 404s
    No, not that I've heard of. I'd be very surprised if they had.

    Quote Originally Posted by Datadriven
    Also, I thought there might be some sort of penalty for overusing 301s
    Again, no. I've never heard of such a penalty, and it would be surprising if 301s did you any harm.

    If you're sure you haven't intentionally blocked any pages (on many sites, things like contact pages or privacy policies are intentionally blocked), your only option is to try to figure out which pages aren't indexed, which will be laborious and time-consuming. Alternatively, you can just not worry about it. You say you've got 228 out of 230 pages indexed. Chances are that's enough to bring you the traffic you want.
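    If you do decide to dig, you can at least script the drudge work: pull every URL out of the sitemap and flag the ones that come back as errors, redirects or noindex. It won't tell you what Google has actually indexed, but it catches the usual suspects. A rough sketch in Python (the sitemap address is a placeholder):

        # List sitemap URLs that return an error, redirect elsewhere, or contain "noindex".
        import urllib.error
        import urllib.request
        import xml.etree.ElementTree as ET

        SITEMAP_URL = "http://www.example.com/sitemap.xml"  # placeholder
        NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

        def sitemap_urls(sitemap):
            # Fetch the sitemap and pull out every <loc> entry.
            with urllib.request.urlopen(sitemap) as resp:
                tree = ET.parse(resp)
            return [loc.text.strip() for loc in tree.findall(".//sm:loc", NS)]

        def check(url):
            # Return a short problem description, or None if the page looks fine.
            req = urllib.request.Request(url, headers={"User-Agent": "sitemap-check"})
            try:
                with urllib.request.urlopen(req) as resp:
                    if resp.geturl() != url:
                        return "redirects to " + resp.geturl()
                    body = resp.read().decode("utf-8", errors="ignore").lower()
                    if "noindex" in body:  # crude check for a robots noindex tag
                        return "contains noindex"
                    return None
            except urllib.error.HTTPError as err:
                return "HTTP " + str(err.code)
            except urllib.error.URLError as err:
                return "unreachable: " + str(err.reason)

        for page in sitemap_urls(SITEMAP_URL):
            problem = check(page)
            if problem:
                print(page, "->", problem)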

    Mike

