If I remove the index.html in a directory and then link to the directory from one of my main pages, will Googlebot go to all the pages linked in that directory page, even though there are no direct links to those pages anywhere on my actual site? I.e., I link to mysite.com/archive/, which has no index.html in it, so it just displays the Apache screen that lists all the other .html pages and files in the directory. Will Googlebot go to those orphan pages? Thanks a lot.
If there are pages which you don't want Google to follow, I suggest you add them to your robots.txt file.
No, I do want them to follow it, but I am wondering if Google will follow it. I don't want to risk them not being indexed. I thought it might be a problem since the pages don't have a direct link path; they only appear in a directory listing that is linked to.
I believe they'll be indexed regardless of whether the index.html page is in place, but you should double-check that statement.
I think it will index them, but why don't you just put up a simple page with links to all the URLs in the folder, just to be sure?
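For an automatically generated folder, a link page like that could be built by a small script rather than by hand. The following is only a rough sketch: the function name `build_link_page`, the choice to list only `.html` files, and the `http://mysite.com/archive/` base URL in the usage note are assumptions for illustration, not anything from the original setup.

```python
from html import escape
from pathlib import Path
from urllib.parse import quote


def build_link_page(directory, base_url):
    """Return an HTML page linking to every .html file in `directory`.

    `base_url` is the public URL of the folder and is assumed to end in "/".
    index.html is skipped so the page never links to itself.
    """
    names = sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix == ".html" and p.name != "index.html"
    )
    # One list item per file; names are URL-quoted in the href and
    # HTML-escaped wherever they appear in the markup.
    items = "\n".join(
        f'<li><a href="{escape(base_url + quote(name))}">{escape(name)}</a></li>'
        for name in names
    )
    return f"<html><body>\n<ul>\n{items}\n</ul>\n</body></html>\n"
```

Calling `build_link_page("archive", "http://mysite.com/archive/")` returns the page as a string, which could then be saved wherever the site links to it and regenerated whenever the archive changes.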
Well, it would be really cumbersome since it's an automatically created archive folder. But it doesn't matter now. I got links to them. Thanks for your help though, guys.
If they were indexed/visited before, then Googlebot will keep visiting the pages. To tell Googlebot not to visit anymore: 1) use robots.txt, or 2) use the robots meta tag.
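To make the two options concrete, here is a minimal sketch that checks a sample robots.txt with Python's standard `urllib.robotparser`. The rules, the `/archive/` path, and the `is_allowed` helper are assumptions for illustration; the meta tag is shown only as a string, since it belongs in each page's `<head>`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps Googlebot out of /archive/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /archive/
"""

# Option 2: the robots meta tag, placed in the <head> of each page.
META_TAG = '<meta name="robots" content="noindex, nofollow">'


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these sample rules, `is_allowed(ROBOTS_TXT, "Googlebot", "http://mysite.com/archive/page1.html")` should come back false, while a URL outside `/archive/` should come back true. Note that robots.txt controls crawling, while the meta tag is only seen if the page can still be crawled.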