  1. #26
    Error 404: Life not found silver trophybronze trophy
    Join Date
    Dec 2007
    Location
    UK Nr Manchester
    Posts
    3,460
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by stymiee View Post
    mod_rewrite won't affect that. Either the content is duplicated or it isn't. The URL won't be a factor in determining that.
    I am confused all over the place.........

    In the SitePoint SEM kit, written by the owners of this forum, it states quite clearly that dynamic URLs with parameters that can have more than one value create the risk of a bot creating multiple versions of the same page by following all permutations of the dynamic links.

    (Section: Duplicate Content, p104, Advanced SEO and SE-friendly design)

    So the content doesn't exist until the spider creates it on the fly when it tries to crawl every link. I can't delete content that isn't there. I can only use mod_rewrite or the htaccess file to change the URLs to more SE-friendly versions by stopping the spider from trying to index multiple parameter-value pages.

    Right?
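
    As a rough illustration of the permutation problem described above, here is a minimal Python sketch that lists every URL a crawler could reach for one listing. The parameter names and values (`sort`, `dir`, `view`) are hypothetical and not taken from any real site or from the SEM kit.

```python
from itertools import product
from urllib.parse import urlencode


def crawlable_variants(base, params):
    """Return every URL formed by combining one value per parameter."""
    names = list(params)
    return [
        base + "?" + urlencode(dict(zip(names, combo)))
        for combo in product(*(params[name] for name in names))
    ]


# Hypothetical parameters for a single category listing.
variants = crawlable_variants(
    "/index.php",
    {"sort": ["price", "weight", "name"], "dir": ["asc", "desc"], "view": ["list", "grid"]},
)
print(len(variants))  # 3 * 2 * 2 = 12 URLs
```

    Three small parameters already give twelve addresses, and a spider that follows every link may request all of them.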

  2. #27
    SitePoint Wizard bronze trophy bigalreturns's Avatar
    Join Date
    Mar 2006
    Posts
    1,295
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    It's up to you as a webmaster to ensure that duplicate content doesn't exist within your site. This can happen with or without rewritten urls.
    If you're using dynamic URLs, index.php?id=1 may be the same as index.php?id=2, and they'll be seen as dupes. But using rewritten URLs, such as index/1 and index/2, won't eliminate the problem whatsoever.
    Having said all this, I don't think duplicate content is the issue it was once thought to be. Google have stated that if they see duplicate content, they make a decision (based on some combination of age, links etc.) as to which page was the "original", and which is the "duplicate". The dupe gets thrown in supplemental, the original is indexed normally, with no negative effects.
    The only issue really is if people start linking to the page Google sees as the duplicate, as these are more or less wasted to you. So, duplicate content in itself won't hurt you, but it can mean it's more difficult to achieve better rankings, as links are shared between different versions of the same page, some of which won't be counted.
    In summary, eliminate duplicate content, but don't rely on rewritten URLs to do this for you!
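
    To show why the URL style makes no difference, here is a minimal Python sketch that groups pages by a fingerprint of their content. It only catches exact matches after whitespace and case normalization; it is not how any search engine actually detects duplicates, and the sample pages are made up.

```python
import hashlib
from collections import defaultdict


def group_duplicates(pages):
    """Group URLs whose bodies are identical after whitespace/case normalization."""
    groups = defaultdict(list)
    for url, body in pages.items():
        normalized = " ".join(body.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Keep only fingerprints shared by more than one URL.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Made-up pages: a dynamic URL and a rewritten URL serving the same body.
pages = {
    "index.php?id=1": "<h1>Widgets</h1> <p>Blue widget</p>",
    "index/1": "<h1>Widgets</h1>\n<p>Blue widget</p>",
    "index/2": "<h1>Gadgets</h1> <p>Red gadget</p>",
}
print(group_duplicates(pages))  # [['index.php?id=1', 'index/1']]
```

    The rewritten address lands in the same group as the dynamic one because only the body is compared.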


  3. #28
    Error 404: Life not found silver trophybronze trophy
    Join Date
    Dec 2007
    Location
    UK Nr Manchester
    Posts
    3,460
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by bigalreturns View Post
    It's up to you as a webmaster to ensure that duplicate content doesn't exist within your site. This can happen with or without rewritten urls.
    If you're using dynamic URLs, index.php?id=1 may be the same as index.php?id=2, and they'll be seen as dupes. But using rewritten URLs, such as index/1 and index/2, won't eliminate the problem whatsoever.
    Having said all this, I don't think duplicate content is the issue it was once thought to be. Google have stated that if they see duplicate content, they make a decision (based on some combination of age, links etc.) as to which page was the "original", and which is the "duplicate". The dupe gets thrown in supplemental, the original is indexed normally, with no negative effects.
    The only issue really is if people start linking to the page Google sees as the duplicate, as these are more or less wasted to you. So, duplicate content in itself won't hurt you, but it can mean it's more difficult to achieve better rankings, as links are shared between different versions of the same page, some of which won't be counted.
    In summary, eliminate duplicate content, but don't rely on rewritten URLs to do this for you!
    Great info, thanks.

    So if someone sorted a category, say by price, then linked to that URL, and someone else did the same but sorted that category by weight before linking to it, Google might see the two links to two different URLs but see that the content is the same and ignore them both?

    I was thinking along the lines of using mod_rewrite to eliminate URL combinations which could create a dup content issue altogether and only show the same URL no matter how that category is sorted, or at least tell the spider not to index them using instructions in the htaccess file.

    That way, people could sort and backlink as much as they wanted and the URL would always be the same, right?
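
    A minimal sketch of the "same URL no matter how it is sorted" idea, written in Python rather than as mod_rewrite rules. The parameter names `cat`, `sort`, and `dir` and the example.com host are assumptions for illustration; the real software's parameters may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed: parameters that only change presentation, not which items are listed.
PRESENTATION_PARAMS = {"sort", "dir"}


def canonical_url(url):
    """Drop presentation-only parameters and put the rest in a fixed order."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_PARAMS
    ]
    kept.sort()
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


by_price = canonical_url("http://example.com/index.php?cat=7&sort=price")
by_weight = canonical_url("http://example.com/index.php?sort=weight&cat=7")
print(by_price)               # http://example.com/index.php?cat=7
print(by_price == by_weight)  # True
```

    Both sorted views collapse to one address, which is the address the site would want backlinks to end up on.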

  4. #29
    He's No Good To Me Dead silver trophybronze trophy stymiee's Avatar
    Join Date
    Feb 2003
    Location
    Slave I
    Posts
    23,424
    Mentioned
    2 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by JJMcClure View Post
    In the Sitepoint SEM kit, written by the owners of this forum,
    Actually it is written by someone who is not a SitePoint employee (at least that I know of).

    Quote Originally Posted by JJMcClure View Post
    it states quite clearly that dynamic URLs with parameters that can have more than one value create the risk of a bot creating multiple versions of the same page by following all permutations of the dynamic links.
    That's why you have to be sure to prevent that from happening and prevent the duplicate content from being an issue. That's a development issue and not really a query string issue. The query string isn't really scrutinized by the search engines. It's the content on the page that matters.

  5. #30
    Error 404: Life not found silver trophybronze trophy
    Join Date
    Dec 2007
    Location
    UK Nr Manchester
    Posts
    3,460
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by stymiee View Post
    That's why you have to be sure to prevent that from happening and prevent the duplicate content from being an issue. That's a development issue and not really a query string issue. The query string isn't really scrutinized by the search engines. It's the content on the page that matters.
    Sorry if I'm going in circles here but isn't that what mod_rewrite and the htaccess file can be used for? To simplify query strings?

    For example, I have a dynamic site that needs SEOing. It's a client's, actually, who's agreed to let me have a play because I'm doing it for free, and it uses out-of-the-box directory software.

    The URLs have several parameters many of which can have more than one value, and the guys who wrote the software say that they have no SEO advice to give although in Jan a mod_rewrite capable version is coming out.

    I'll probably use the htaccess file to write rules that will prevent the spiders from trying to index pages that would just be duplicate content.

    Is that a correct application of what you're saying, that I'm avoiding the issue of duplicate content in the first place?
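
    One way to picture such rules is as a function that decides, per URL, whether spiders should be asked to index the page. The Python sketch below returns a value for the `X-Robots-Tag` response header; the marker parameters (`sort`, `dir`, `view`) are assumptions, and how the header gets attached (server configuration or application code) is left open.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed: parameter names whose presence marks a page as a re-ordered copy.
DUPLICATE_MARKERS = {"sort", "dir", "view"}


def robots_directive(url):
    """Return an X-Robots-Tag value: noindex for re-ordered copies, index otherwise."""
    query = dict(parse_qsl(urlsplit(url).query))
    if query.keys() & DUPLICATE_MARKERS:
        return "noindex, follow"
    return "index, follow"


print(robots_directive("/index.php?module=company&pId=102&start=0"))
print(robots_directive("/index.php?module=company&pId=102&sort=price"))
```

    The first URL stays indexable and the second is marked noindex, while links on both are still followed.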

  6. #31
    He's No Good To Me Dead silver trophybronze trophy stymiee's Avatar
    Join Date
    Feb 2003
    Location
    Slave I
    Posts
    23,424
    Mentioned
    2 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by JJMcClure View Post
    Sorry if I'm going in circles here but isn't that what mod_rewrite and the htaccess file can be used for? To simplify query strings?
    It can simplify query strings but it won't stop the duplicate content which is the primary issue. Making the query string look different doesn't change the ultimate results.

    Quote Originally Posted by JJMcClure View Post
    I'll probably use the htaccess file to write rules that will prevent the spiders from trying to index pages that would just be duplicate content.

    Is that a correct application of what you're saying, I'm avoiding the issue of duplicate content in the first place?
    That is exactly what you want to do, although ideally the software wouldn't create duplicate content at all. Even blocking the duplicate content still doesn't stop the "holes" from being in your directory, which is not a good thing to have and ultimately hurts you.

  7. #32
    Error 404: Life not found silver trophybronze trophy
    Join Date
    Dec 2007
    Location
    UK Nr Manchester
    Posts
    3,460
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by stymiee View Post
    That is exactly what you want to do, although ideally the software wouldn't create duplicate content at all. Even blocking the duplicate content still doesn't stop the "holes" from being in your directory, which is not a good thing to have and ultimately hurts you.
    Great, thanks. Don't know how to do that yet but I'll figure it out I'm sure. I can't change the scripts unfortunately, they're proprietary, but after Jan I will be able to use mod_rewrite.

    Here's an actual URL from the site for a page that got indexed on the 5th and then dropped out of the index on the 10th.....

    xxxxxxxxx.com/index.php?module=company&pId=102&start=0

    I have no idea why this would get indexed and then removed so quickly, do you?

  8. #33
    He's No Good To Me Dead silver trophybronze trophy stymiee's Avatar
    Join Date
    Feb 2003
    Location
    Slave I
    Posts
    23,424
    Mentioned
    2 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by JJMcClure View Post
    Great, thanks. Don't know how to do that yet but I'll figure it out I'm sure. I can't change the scripts unfortunately, they're proprietary, but after Jan I will be able to use mod_rewrite.

    Here's an actual URL from the site for a page that got indexed on the 5th and then dropped out of the index on the 10th.....

    xxxxxxxxx.com/index.php?module=company&pId=102&start=0

    I have no idea why this would get indexed and then removed so quickly, do you?
    1) Low PR

    2) If it is a newer site it will take time for Google to fully crawl and index the site which will affect how much content is found in their index.

  9. #34
    SitePoint Evangelist
    Join Date
    Nov 2007
    Posts
    472
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    If the site is old and you have traffic, don't rewrite, because cached pages will show broken links. If it's new, you can, but it's of no use for SEO.

  10. #35
    Error 404: Life not found silver trophybronze trophy
    Join Date
    Dec 2007
    Location
    UK Nr Manchester
    Posts
    3,460
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by stymiee View Post
    1) Low PR

    2) If it is a newer site it will take time for Google to fully crawl and index the site which will affect how much content is found in their index.
    You think that page got indexed then de-indexed because of PR? If the PR is low, which it will be because it's a new site, why did it get indexed in the first place?

    At this point I'm still watching to see how the bots react to it now that they can find it. I haven't changed anything about the link structure yet.

  11. #36
    He's No Good To Me Dead silver trophybronze trophy stymiee's Avatar
    Join Date
    Feb 2003
    Location
    Slave I
    Posts
    23,424
    Mentioned
    2 Post(s)
    Tagged
    1 Thread(s)
    Quote Originally Posted by JJMcClure View Post
    You think that page got indexed then de-indexed because of PR? If the PR is low, which it will be because it's a new site, why did it get indexed in the first place?
    PR fluctuates although you can't see it. Plus it is possible other factors affect it as well. But increasing the PR of that page will surely solve the problem regardless of the cause (unless there are other issues like duplicate content).

  12. #37
    Error 404: Life not found silver trophybronze trophy
    Join Date
    Dec 2007
    Location
    UK Nr Manchester
    Posts
    3,460
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    Quote Originally Posted by stymiee View Post
    PR fluctuates although you can't see it. Plus it is possible other factors affect it as well. But increasing the PR of that page will surely solve the problem regardless of the cause (unless there are other issues like duplicate content).
    Cheers, need to take that one away and worry it for a bit...

  13. #38
    SitePoint Addict
    Join Date
    Nov 2007
    Location
    California
    Posts
    357
    Mentioned
    0 Post(s)
    Tagged
    0 Thread(s)
    This happened to me too. I just got more backlinks to my site, did directory submissions and all, and my site was back in the index in less than a week.

