My question is how to avoid being penalized by Google for “duplicate content”.
This relates to my Subsection page which lists hundreds of “article summaries”.
To make things more manageable, I added Sorting and Pagination, and so now things look like this…
www.debbie.com/finance/economy/?sortname=by-date&sortdir=desc&page=1
www.debbie.com/finance/economy/?sortname=by-date&sortdir=desc&page=2
www.debbie.com/finance/economy/?sortname=by-date&sortdir=desc&page=3
www.debbie.com/finance/economy/?sortname=by-date&sortdir=desc&page=4
www.debbie.com/finance/economy/?sortname=by-date&sortdir=desc&page=5
To address the “duplicate content” issue, I have taken these steps so far…
Step #1: Changed the URL from a directory structure style…
www.debbie.com/finance/economy/by-date/desc/5
to a URL with a Query String…
www.debbie.com/finance/economy/?sortname=by-date&sortdir=desc&page=5
Step #2: Used PHP to dynamically add rel=“prev” and rel=“next” link tags to each page…
<!-- Page Relationships -->
<link rel='prev' href='http://local.debbie/finance/economy/?sortname=by-date&sortdir=desc&page=2'>
<link rel='next' href='http://local.debbie/finance/economy/?sortname=by-date&sortdir=desc&page=4'>
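The logic behind Step #2 can be sketched roughly as follows. This is only an illustration, written in Python rather than the PHP used on the site; the function name `pagination_links` and the `last_page` argument are hypothetical, and the sketch assumes the sort values are simple slugs like `by-date` and `desc`.

```python
from urllib.parse import urlencode


def pagination_links(base_url, sortname, sortdir, page, last_page):
    """Build the rel='prev' / rel='next' <link> tags for one page of a series.

    Hypothetical helper for illustration; `last_page` is assumed to be known
    from the total number of article summaries.
    """

    def page_url(n):
        # Keep the same sort settings and change only the page number.
        query = urlencode({"sortname": sortname, "sortdir": sortdir, "page": n})
        return f"{base_url}?{query}"

    tags = []
    if page > 1:  # the first page has no previous page
        tags.append(f"<link rel='prev' href='{page_url(page - 1)}'>")
    if page < last_page:  # the last page has no next page
        tags.append(f"<link rel='next' href='{page_url(page + 1)}'>")
    return tags


if __name__ == "__main__":
    # Page 3 of an assumed 5-page series, matching the tags shown above.
    for tag in pagination_links(
        "http://local.debbie/finance/economy/", "by-date", "desc", 3, 5
    ):
        print(tag)
```

Note that the first page gets only a rel='next' tag and the last page only a rel='prev' tag.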
So far, so good…
However, where I am confused is this…
What needs to be done so that Google doesn’t penalize me when a page is similar (or the same) because of Sorting?
Originally, I was going to try to implement rel=canonical, but things get rather tricky when you consider that these pages also include pagination!
According to Google’s “5 common mistakes with rel=canonical” post, you should NOT point rel=canonical at the first page of a paginated series from the other pages in that series.
As far as I can tell, using rel=canonical would cancel out using rel=“prev” and rel=“next” in my particular situation.
The best idea that I can come up with is to NOT use rel=canonical, and instead use the URL Parameters setting in Google Webmaster Tools to tell Googlebot to ignore the sortname and sortdir parameters.
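To make the intent concrete: ignoring sortname and sortdir would mean that every sorted variant of a given page is treated as the same URL. The sketch below is only a mental model of that effect, in Python, with a hypothetical helper name `strip_sort_params`; it is an assumption for illustration, not a description of how Googlebot actually processes URLs.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The two parameters that only change the ordering of the summaries.
IGNORED_PARAMS = {"sortname", "sortdir"}


def strip_sort_params(url):
    """Return the URL with the sort parameters removed and all others kept."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    print(strip_sort_params(
        "http://www.debbie.com/finance/economy/?sortname=by-date&sortdir=desc&page=5"
    ))
```

Under this model the page parameter survives, so each page in the series stays distinct while the different sort orders collapse together.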
What do you think?
Sincerely,
Debbie