  1. #1
    SitePoint Enthusiast
    Join Date
    Apr 2005
    Posts
    51
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)

    Keeping robots out of virtual domain subdirectory

    I'm hosting two sites on a shared hosting provider. One lives in the base-level public html directory, the other is in a directory immediately inside that base level directory (virtual site, virtual directory). There are no relative links between the two sites - any links between the two are absolute.

    What I want is for robots NOT to crawl into the secondary site (the inner directory) FROM the main site, but instead to reach it directly via its own URL. I'd prefer not to have pages indexed like:

    http://www.mainsite.com/secondarysite/index.html

    but rather

    http://www.secondarysite.com/index.html

    That make sense?


    Can I just enter a line in the main site's robots.txt to exclude robots from /secondarysite/, and then have a robots.txt file inside the secondarysite folder to deal with its own incoming crawls? I figure this isn't really a "second" robots.txt file, since it's in its own domain. Robots should still arrive independently at the secondarysite folder, regardless of settings made in the main site's robots file.

    Right?

    - Bob
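
The setup Bob describes can be checked offline with Python's standard `urllib.robotparser`. This is only a minimal sketch: the two robots.txt bodies below are assumed examples, not files from the thread, and `ExampleBot` is a made-up crawler name. It relies on the fact that crawlers request robots.txt from the root of each host, so the file stored in the secondarysite folder is the one they see as http://www.secondarysite.com/robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of http://www.mainsite.com/robots.txt (hypothetical example).
MAIN_ROBOTS = """\
User-agent: *
Disallow: /secondarysite/
"""

# Assumed contents of the robots.txt stored inside the secondarysite folder,
# which crawlers fetch as http://www.secondarysite.com/robots.txt.
# An empty Disallow value means nothing is blocked.
SECONDARY_ROBOTS = """\
User-agent: *
Disallow:
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


main_rules = build_parser(MAIN_ROBOTS)
secondary_rules = build_parser(SECONDARY_ROBOTS)

# A crawler on the main host is told to stay out of the inner directory...
blocked_via_main = not main_rules.can_fetch(
    "ExampleBot", "http://www.mainsite.com/secondarysite/index.html"
)
# ...but may still crawl the rest of the main site...
main_home_ok = main_rules.can_fetch("ExampleBot", "http://www.mainsite.com/index.html")
# ...and the secondary domain is governed only by its own rules.
secondary_ok = secondary_rules.can_fetch(
    "ExampleBot", "http://www.secondarysite.com/index.html"
)

print(blocked_via_main, main_home_ok, secondary_ok)
```

Note that robots.txt is advisory: well-behaved crawlers honor it, but it does not physically prevent access to the directory.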

  2. #2
    SitePoint Enthusiast
    Join Date
    Jun 2005
    Posts
    61
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Just some thoughts off the top of my head:

    Firstly, if nothing on the primary site links to the /secondarysite/ path (the only links between the two sites being absolute links to the secondary domain), then robots shouldn't discover that directory through the primary site.

    Secondly, just to cover all bases, why not use a .htaccess file to set up a permanent (301) redirect to the secondary site for every page requested through the subdirectory?

    For example, http://www.mainsite.com/secondarysite/index.html would automatically resolve to http://www.secondarysite.com/index.html, and likewise for every other page.
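
The URL mapping such a permanent redirect would perform can be modeled in a few lines of Python. This is a sketch of the mapping only, not the .htaccess rule itself (on Apache it would typically be written with mod_rewrite or a similar directive); the function name `redirect_target` and the constants are hypothetical, and the host names are the placeholder ones used in the thread.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Placeholder hosts and path taken from the thread's example URLs.
MAIN_HOSTS = {"www.mainsite.com", "mainsite.com"}
SUBDIR = "/secondarysite/"
TARGET_HOST = "www.secondarysite.com"


def redirect_target(url: str) -> Optional[str]:
    """Return the URL a 301 redirect should point to, or None if no redirect applies."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Only requests that reach the inner directory through the main host are redirected.
    if host not in MAIN_HOSTS or not parts.path.startswith(SUBDIR):
        return None
    # Drop the subdirectory prefix and keep the rest of the path and the query string.
    new_path = "/" + parts.path[len(SUBDIR):]
    return urlunsplit((parts.scheme, TARGET_HOST, new_path, parts.query, parts.fragment))


print(redirect_target("http://www.mainsite.com/secondarysite/index.html"))
print(redirect_target("http://www.secondarysite.com/index.html"))
```

Restricting the rule to the main host matters: requests that already arrive via the secondary domain must not be redirected again, or they would loop. Also keep in mind that a crawler only sees a redirect on URLs it is allowed to fetch, so if /secondarysite/ is also disallowed in the main site's robots.txt, compliant robots will not follow it.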

  3. #3
    SitePoint Enthusiast
    Join Date
    Aug 2006
    Posts
    46
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    You can just put a robots.txt file in the second site and block all search engine spiders from that folder's files, but then, of course, it will never rank for anything.
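
To see why a blanket block would keep the second site out of the rankings, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt body is an assumed example and `ExampleBot` is a made-up crawler name. Because the file in that folder is what crawlers receive as http://www.secondarysite.com/robots.txt, a `Disallow: /` rule there applies to every page on the secondary domain.

```python
from urllib.robotparser import RobotFileParser

# Assumed blanket-block robots.txt placed in the second site's folder (hypothetical).
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

block_all_rules = RobotFileParser()
block_all_rules.parse(BLOCK_ALL.splitlines())

# Every path on the secondary domain is off limits to compliant crawlers.
pages = [
    "http://www.secondarysite.com/",
    "http://www.secondarysite.com/index.html",
]
results = [block_all_rules.can_fetch("ExampleBot", page) for page in pages]
print(results)
```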