What is the best way to remove old pages from the Google index? I have a big site in the Co-op that I needed to remove a lot of pages from, and now it won't validate. The pages I removed were all .php; the rest of my files, which are still live, are .html. Should I:
(a) do a redirect of all the old .php pages to the home page in .htaccess,
(b) submit lots of removal requests to Google, or
(c) wait for the spider to come looking for the old pages and drop them naturally?
Or something else? A rough sketch of what I mean by (a) is below.
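In case it helps, this is roughly what I have in mind for option (a), assuming the site runs on Apache with mod_alias enabled (just a sketch, not something I've tested on this site yet):

    # .htaccess: option (a) - permanently redirect every old .php URL to the home page
    RedirectMatch 301 \.php$ /

    # or, instead of redirecting, answer with 410 Gone so crawlers know the pages are dead
    # RedirectMatch gone \.php$

The first rule sends a 301 for any URL ending in .php; the commented-out alternative returns a 410 so the spider sees the pages as intentionally removed rather than moved.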
Whether you use the meta-tag or the robots.txt method at the link Shawn provided, be aware that both are only temporary - they last 180 days. After that period you need to resubmit the pages or the robots.txt file to get another 180-day exclusion, though I have found in the past that resubmitting pages didn't work. I have yet to find a way of permanently removing pages from Google's index. I've tried 301-redirecting pages, returning 410 (Gone) headers, and of course robots.txt and meta-tags. The pages in question don't show up in natural SERPs (as opposed to site:domain.com searches), but I guess that won't help if you're trying to validate for the co-op.

cheers,
John
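For anyone following along, these are the methods I'm referring to, in sketch form (the paths and filenames are just placeholders, not from the original poster's site):

    <!-- per-page method: meta tag in the <head> of each page you want dropped -->
    <meta name="robots" content="noindex">

    # sitewide method: robots.txt rule blocking Googlebot from an example path
    User-agent: Googlebot
    Disallow: /old-stuff/

    <?php
    // 410 (Gone) header sent from an old .php page, before any other output
    header("HTTP/1.0 410 Gone");
    exit;
    ?>

The meta tag and robots.txt entries are the ones subject to the 180-day limit when submitted through the removal tool; the 410 header is what I meant by returning Gone from the pages themselves.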