I'd like to work on a site, ONLINE, and keep it from being indexed until it's ready. How do I do this? I need just the home page/root indexed but nothing else. I'd like to avoid adding the META "noindex" tags to every page if possible. Thank you! When I'm ready, I'd just like to remove whatever feature I'll be adding to do this, WITHOUT being penalized in any way by the bots.
Use a robots.txt file and block any web pages you don't want indexed. You should block the entire site until it's complete; this includes the index page. Trust me, this is for good reasons.
1. In Notepad, create a file called "robots.txt".
2. Add the code below to it:
    User-agent: *
    Disallow: /
3. Upload the robots.txt file to the root directory where your web pages go.
When you want the site to go live, remove the / after "Disallow:" so the line reads just "Disallow:", or delete the file altogether. Read more on robots.txt at www.robotstxt.org
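If you want to double-check both versions of the file before uploading, here is a rough sketch in Python using the standard library's urllib.robotparser. The is_allowed helper and the example.com URLs are made up for illustration, and real crawlers may interpret rules slightly differently than this parser does.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# While the site is under construction: block everything.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# When the site goes live: an empty Disallow line blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    # example.com is a placeholder; substitute your own pages.
    for name, rules in (("under construction", BLOCK_ALL), ("live", ALLOW_ALL)):
        for page in ("http://example.com/", "http://example.com/about.html"):
            print(name, page, is_allowed(rules, page))
```

Note that "Disallow: /" also covers the home page, which matches the advice above to block the whole site, index page included.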
Looks like www.robotstxt.org is unreachable today. You can also try http://en.wikipedia.org/wiki/Robots.txt for simple examples.
Just have a dummy index.html page until your site is ready. The dummy index page should not have any links to your other pages. You can give your real index page a different name, say index1.html. With this approach you can test your website until you are satisfied, then publish it just by renaming the real top-level page to index.html. And if your site is already indexed with the dummy page, you may get the benefit of a shorter stay in the Google sandbox.
If no one needs to see it except you and whoever you permit, use an .htaccess file to restrict access. http://www.htaccess-guide.com/
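For what it's worth, here is a minimal Python sketch that writes out the kind of .htaccess text usually used for password protection. It assumes an Apache server with HTTP Basic authentication and a password file you have already created (for example with Apache's htpasswd tool). The render_htaccess helper, the realm text, and the /home/example/.htpasswd path are hypothetical, so adjust them to your host.

```python
def render_htaccess(password_file: str, realm: str = "Site under construction") -> str:
    """Return .htaccess text that asks visitors for a username and password."""
    lines = [
        "AuthType Basic",                   # use HTTP Basic authentication
        f'AuthName "{realm}"',              # text shown in the login prompt
        f"AuthUserFile {password_file}",    # hypothetical path to the password file
        "Require valid-user",               # any user listed in that file may enter
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file contents; save them as ".htaccess" in the site root.
    print(render_htaccess("/home/example/.htpasswd"), end="")
```

When the site is ready, removing the .htaccess file (or these four lines from it) opens it up again.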