OK I'm confused. I need rel="canonical" Headers in htaccess pointing to PDF

So I have an exact PDF and HTML duplicate for each file. I want to say the PDF should be the preferred indexed file in the search results, i.e. the PDF should be canonical. Right? All the examples I find seem to be saying the opposite, but I don’t know 100%, so I can’t confirm one way or the other. Thanks for any light. FYI I have 100 PDFs, so htaccess headers are probably best.

Apply rel="canonical" to PDFs

<Files download.pdf>
	Header add Link '<http://www.tomanthony.co.uk/httest/pdf-download.html>; rel="canonical"'
</Files>

OK would it look like this in the htaccess if I wanted to say ignore the HTML and prefer the PDF? If so, do I need this bit… Files “abc-name.html”? And if so, why is it just a relative link to nowhere? Why not absolute? Hopefully someone knows all these answers, because I’m not putting this in my htaccess until I know 100% what it’s doing.

<Files "abc-name.html">
	Header add Link '<http://www.site.com/folder1/folder2/abc-name.pdf>; rel="canonical"'
</Files>

lol even comparing these two examples there are coding differences. I have not been able to find two pages on the web that show the same thing.

I wouldn’t consider an HTML page and a PDF file to be the same. AFAIK, canonical is for when a page can be accessed by different paths, e.g.

.../2015/december/blue-sprockets.php 
.../archived/blue-sprockets.php 
.../category/blue/blue-sprockets.php 
.../tag/sprockets/blue-sprockets.php 
.../blue-sprockets.php 

i.e. “…/blue-sprockets.php” would be the canonical.

Hello. I’m no expert, but I think duplicate content (as in word for word in this case) is a case for canonical. Here is the site in question: http://goo.gl/Ye8dJs. Expand the folders and let me know what you think, i.e. whether canonical should be used. I’m OK either way. Just thought I’d go the extra SEO step this time.

I wouldn’t worry about doing it for any supposed SEO reasons, it isn’t like you’ll get a penalty if you don’t. They are different files. Would you actually want to risk having some user agents use one and not the other regardless of what the HTTP request was for? Much easier to simply take down the one you don’t want.

IMHO, you could better spend your time writing the next bit of content.

If you want the PDF to appear in search results as the original version of the content, the HTML page should carry a canonical tag that points to the PDF version. For example, the head of your HTML page would include a tag like the one below:

<link rel="canonical" href="http://www.abc.com/examplefile.pdf" />
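Since the question mentions around 100 files, the same hint can probably be sent as an HTTP header from one htaccess rule instead of editing each page. This is only a rough sketch: it assumes mod_setenvif and mod_headers are enabled, that every name.html has a name.pdf next to it at the same path, and that www.site.com stands in for the real host. CANONICAL_PDF is just an illustrative variable name.

```apache
# Sketch only: assumes mod_setenvif and mod_headers are available and that
# each /path/name.html has a matching /path/name.pdf (hypothetical layout).

# For any request ending in .html, store the matching PDF path in an
# environment variable (CANONICAL_PDF is an illustrative name).
SetEnvIf Request_URI "^(.+)\.html$" CANONICAL_PDF=$1.pdf

# Send the Link header only when the variable is set, i.e. for .html requests.
# www.site.com is a placeholder for the real host name.
Header set Link "<http://www.site.com%{CANONICAL_PDF}e>; rel=\"canonical\"" env=CANONICAL_PDF
```

It would be worth checking the response headers for one HTML page (for example in the browser's network tab) before relying on this, and keep in mind that search engines treat canonical as a hint, not a command.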
