Hey everyone,

If I wanted to allow robots only to the index.html page on one of my sites (and disallow them to all the other pages on that site), then would I put the robots.txt file in the public_html directory and do this...

User-agent: *
Allow: /index.html
Disallow: /

Or would I put the robots.txt file *above* the public_html directory and do this...

User-agent: *
Allow: public_html/index.html
Disallow: public_html/
The bots don't see the folder structure of your hosting account; they only see the URL structure of your domain. public_html maps to the domain root, so the file has to live inside public_html (where it's reachable at yourdomain.com/robots.txt) and the paths in it are written relative to the root. So it would be the first option.
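As a quick sanity check, Python's standard-library robots.txt parser can be fed those exact rules and queried the way a crawler would. This is just a sketch against a hypothetical domain (example.com); the parser applies the first rule whose path prefix matches the requested URL.

```python
from urllib.robotparser import RobotFileParser

# The rules from the first option, as they would appear at
# https://example.com/robots.txt (example.com is a placeholder).
robots_txt = """\
User-agent: *
Allow: /index.html
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# index.html matches the Allow rule first, everything else falls
# through to Disallow: /
print(rp.can_fetch("*", "https://example.com/index.html"))  # True
print(rp.can_fetch("*", "https://example.com/about.html"))  # False
print(rp.can_fetch("*", "https://example.com/"))            # False
```

Note that Allow: comes before Disallow: / here; some crawlers (like Google) match by longest path rather than file order, but the result is the same for these rules.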
The first option is best: disallow all other pages and allow only the index page. You can also do the same from each page's head section with a robots meta tag, using noindex, nofollow on the other pages and index, follow on index.html.
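For reference, a sketch of what those meta tags would look like in each page's head section (the index, follow value is the default, so it can also just be left out on index.html):

```
<!-- On every page except index.html -->
<meta name="robots" content="noindex, nofollow">

<!-- On index.html (optional, since this is the default) -->
<meta name="robots" content="index, follow">
```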