Hello there. The Google crawler has indexed pages from my back office! They were in a directory called /affiliate in the site's main directory. Should this folder have been inside the /private directory? And what can I do to fix this without getting penalized by Google for loads of pages disappearing? Do I need to customize the 404 page? Should I contact Google to have that directory removed? What about robots.txt or .htaccess? Thanks!
Disallow that folder in your robots.txt, and you can use Google's URL removal console to take the pages out of the index.
http://services.google.com:8882/urlconsole/controller?cmd=reload&lastcmd=login Just be careful and double-check each page you submit for removal.
Just don't link to those pages and disallow Google from accessing the folder, and they should drop out of the index after a while.
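For the disallow rule mentioned in the replies above, here is a rough sketch of what the robots.txt entry might look like, plus a quick way to check it with Python's standard `urllib.robotparser` before uploading. The /affiliate/ folder comes from the question; the page names `stats.html` and `index.html` and the helper `is_allowed` are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: block every crawler from the /affiliate/ folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /affiliate/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Hypothetical page names, only to exercise the rule.
    for path in ("/affiliate/stats.html", "/index.html"):
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", path))
```

Keep in mind this only tells well-behaved crawlers to stay out; it does not stop anyone from opening the pages directly.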
If the directory shows up in log files that are public on other sites, it can still be crawled. Using the robots.txt file or, more specifically, setting an htpasswd on the directory is the best way to go (I'd recommend htpasswd or another login function).
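As a rough illustration of the htpasswd route, assuming the site runs on Apache with .htaccess overrides enabled, the small Python helper below just prints the Basic auth directives that would go in an .htaccess file inside /affiliate. The helper name `build_htaccess`, the realm text, and the password file path are all placeholders; the password file itself would be created separately with Apache's `htpasswd` utility and is best kept outside the web root.

```python
def build_htaccess(auth_user_file: str, realm: str = "Restricted") -> str:
    """Build .htaccess text that requires a valid login for the directory."""
    lines = [
        "AuthType Basic",
        f'AuthName "{realm}"',
        f"AuthUserFile {auth_user_file}",  # path to the file made with htpasswd
        "Require valid-user",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical path; adjust to wherever the password file actually lives.
    print(build_htaccess("/home/example/.htpasswd"), end="")
```

With a login in place, crawlers get an authorization error instead of the pages, so it does not matter whether the URLs leak through public log files.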