I knew it was coming... we have a nice website with core content plus text and explanations around it. 90% of the site is in pure HTML for SEO purposes. So far 100% of the site is in English. I have been asked to modify the site to make it accessible in French and Spanish: each page has to be duplicated twice with no change to the core content (English), BUT everything else (section names, commands, introduction, footer, etc.) has to be in French or Spanish. Is there a way to do that so that the site is still SEO friendly? I'm worried it will trigger duplicate content problems with Google because the core content will be the same on 3 pages... BTW I cannot create a subdomain, I cannot translate the core content into those two languages, I cannot use the localization functions in PHP, etc. Is Googlebot able to index/cache the pages differently if I just declare the page as being in another language? Not sure how they handle those hybrid pages (English + one other language)...
What you could do is use your robots.txt to stop search engines from spidering specific areas of your site. Let's say your French pages are in the directory "french" and your Spanish pages are in "spanish" in your domain root. Then you would use the following code in your robots.txt (which has to be located in the domain root too, e.g. www.site.com/robots.txt) to block these directories from being spidered:

    User-agent: *
    Disallow: /french/
    Disallow: /spanish/

Google won't spider those directories then, so you won't have to worry about duplicate content. Of course you can play around with it to show the introductory pages to the robots. You will find a lot more details and examples on Google, just search for "robots.txt".
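If you want to check what those rules block before uploading the file, here is a minimal sketch using Python's standard `urllib.robotparser`. The helper name `is_crawlable` and the sample paths are just made up for illustration, and this only shows how a standards-following parser reads the rules, not how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested above, as they would appear in www.site.com/robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /french/
Disallow: /spanish/
"""


def is_crawlable(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if `agent` may fetch `path` under the given rules.

    Hypothetical helper for illustration; the host name is the example
    one used in this thread.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.site.com" + path)


if __name__ == "__main__":
    for path in ("/index.html", "/french/index.html", "/spanish/index.html"):
        print(path, is_crawlable(path))
```

Everything under /french/ and /spanish/ comes back as blocked while the English pages in the root stay crawlable.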
I have thought about that. But the problem is that it will prevent all the foreign-language pages from being indexed... And for example I'd like to make the French pages available on google.fr etc. The only solution I've found so far is to make the English portion of the pages available as an image only. I think I will end up writing a script to convert the English text into an SVG and then put the SVG on the page with the foreign content. It will reduce the accuracy of the indexing of course...
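For what it's worth, the conversion script could look something like this rough sketch in Python. The function name `text_to_svg`, the font settings and the pixel-width estimate are all assumptions for illustration. Note that it writes the text as SVG `<text>` elements, so the words are still present in the file; whether a crawler treats them as page content is not something this sketch settles.

```python
import textwrap
from xml.sax.saxutils import escape


def text_to_svg(text, width_chars=60, font_size=16, line_height=22):
    """Wrap plain text and return it as an SVG document string.

    Hypothetical helper: the layout values are rough guesses, not tuned
    to any real page design.
    """
    lines = textwrap.wrap(text, width=width_chars) or [""]
    # Rough width estimate: a monospace glyph is about 0.6 em wide.
    width_px = int(width_chars * font_size * 0.6)
    height_px = line_height * len(lines) + line_height // 2
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width_px}" height="{height_px}">'
    ]
    for i, line in enumerate(lines, start=1):
        # escape() protects &, < and > so the output stays valid XML.
        parts.append(
            f'  <text x="0" y="{i * line_height}" font-family="monospace" '
            f'font-size="{font_size}">{escape(line)}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(text_to_svg("Core content in English goes here."))
```

The output can be saved to a file and referenced from the French and Spanish pages with an ordinary image tag.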