My web designer has put a PDF option on all my web pages, and each one links to a separate web page with exactly the same content in printable format. He assures me that Google will recognise this for what it is and not penalise me for duplicate content, but I am not convinced. See the example with the PDF button in the right-hand corner of the page (I'm not allowed to post links yet): www.enduroturf.com.au/brisbane Does anyone know the answer to this, or whether there is a way for me to verify whether Google sees this as duplicate content?
First, know that I have seen sites that serve both HTML and PDF versions and seem to do fine. In my opinion, robots.txt isn't necessarily the best option or the only option. You can also canonicalize all print-friendly versions so that Google knows exactly which version is the original source of the article. A robots.txt file blocking PDFs would look something like this:
User-agent: Googlebot
Disallow: /*.pdf$
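If you go the canonical route, one way to check that a print-friendly page really points back at the main page is to look for its canonical link tag. The following is only a rough sketch using Python's standard library. The sample markup, the `http://` scheme on the URL, and the names `CanonicalFinder` and `find_canonical` are assumptions made for illustration, not something taken from the site itself.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one is kept.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical print-friendly page that points back at the main HTML page.
print_page = """
<html><head>
<link rel="canonical" href="http://www.enduroturf.com.au/brisbane">
</head><body>Printable version</body></html>
"""

print(find_canonical(print_page))
```

If this returns nothing for a print-friendly page, that page is not declaring a canonical version. For real .pdf files there is no HTML head to put the tag in, so the canonical would have to be sent as an HTTP Link header instead.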
Thanks Brad. There is lots of great info there that I will go over, as I am noticing that in some circumstances Google is presenting the PDF page over the HTML, or presenting both options, which concerns me a bit. I have discussed taking down the PDFs altogether with my web guy, but he seems to think this would negatively impact my page count?