  1. #1
    SitePoint Enthusiast phoebe2's Avatar
    Join Date
    Jan 2005
    Location
    Anchorage
    Posts
    90

    Question Robots.txt not working

    I noticed that Google has indexed and shows all of my .pdf documents on my website.

    I didn't want them to be accessed this way, so I put them in a special folder called:
    /graph-paper/
    and told my robots.txt file this:
    User-agent: *
    Disallow: /graph-paper/
    Did I do it wrong?

  2. #2
    CSS & JS/DOM Adept bronze trophy
    Join Date
    Mar 2005
    Location
    USA
    Posts
    5,482
    It may take a few weeks for those PDFs to be removed from Google's index.
    We miss you, Dan Schulz.
    Learn CSS. | X/HTML Validator | CSS validator
    Dynamic Site Solutions
    Code for Firefox, Chrome, Safari, & Opera, then add fixes for IE, not vice versa.

  3. #3
    SitePoint Zealot
    Join Date
    Mar 2005
    Posts
    140
    You can get indexed pages removed from the index within 24 hours. Google has instructions for this.
    You still need the robots.txt as above, plus a few other things.

  4. #4
    CSS & JS/DOM Adept bronze trophy
    Join Date
    Mar 2005
    Location
    USA
    Posts
    5,482

  5. #5
    Robert Wellock silver trophybronze trophy xhtmlcoder's Avatar
    Join Date
    Apr 2002
    Location
    A Maze of Twisty Little Passages
    Posts
    6,316
    I assume you have the robots.txt in your root. I also expect you have hyperlinks to the PDF files from an indexed page, which could be the reason they got picked up.

    User-agent: *
    Disallow: /graph-paper/

    That is fine, assuming you have a directory with that name. Perhaps you could also use the following type of meta element on the pages that link to that directory:
    <meta name="robots" content="noindex, nofollow" />
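
If you want to double-check both pieces locally, here is a rough sketch using only Python's standard library. It assumes the rules from the thread are passed in as text, and the example.com URLs and the grid.pdf file name are made up for illustration. It only shows how a standards-following parser reads the rule and the meta element; it says nothing about what a search engine has already indexed.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# The rules quoted in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /graph-paper/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> element."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: list[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def robots_meta(html: str) -> list[str]:
    """Return the robots meta directives found in an HTML snippet."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    # Hypothetical URLs; swap in your own domain and file names.
    print(is_allowed(ROBOTS_TXT, "https://example.com/graph-paper/grid.pdf"))
    print(is_allowed(ROBOTS_TXT, "https://example.com/index.html"))
    print(robots_meta('<meta name="robots" content="noindex, nofollow" />'))
```

The first check should come back blocked and the second allowed if the Disallow line is doing what you expect.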

