robot.txt help


ashiezai
May 31st 2005, 8:30 am
Hi there, I'm running a link exchange directory and I think it is being hit by the duplicate content filter.

Currently, the directory generated by the script I use (Duncan Carver's LMA) has been dropped by Google.

Basically, the URL looks like this:
http://www.xxx.com/directory/Alternative/index.html

But when I ran a site: command in Google, I found that the indexed page is
http://www.xxx.com/directory/Alternative/ (without index.html)
The problem is that without the index.html the page is empty, so all I have indexed are empty pages, and they finally got hit by the dup filter.

I've checked that all of the links in the pages generated by the script end with index.html, so I don't know how the version without index.html got indexed :confused:

Is there any way to stop Googlebot from crawling the pages without index.html, using robots.txt or .htaccess?

Thanks in advance for any help.

noppid
May 31st 2005, 11:16 am
Use robots.txt, not robot.txt, and yes, you can limit access there.
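
For what it's worth, here is a minimal sketch of such a rule. It assumes the crawler honors the `*` and `$` pattern extensions. Google documents them, but the original robots.txt convention only did plain prefix matching, so other bots may ignore the rule. The /directory/ path is taken from the example URL in the first post.

```
# robots.txt, placed at the site root (http://www.xxx.com/robots.txt)
User-agent: Googlebot
# "*" matches any run of characters, "$" anchors the end of the URL,
# so this blocks /directory/Alternative/ but not /directory/Alternative/index.html
Disallow: /directory/*/$
```

Keep in mind this only stops crawling. URLs that are already indexed may take a while to drop out.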

ashiezai
May 31st 2005, 6:47 pm
But I don't know the code. Can anyone help me?

I've searched for it but couldn't find anything.

I want directory/a/index.html to be indexed but not directory/a/

None of the tutorials I found do this.

Is it possible for me to 301 redirect them?
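
A 301 along these lines might do it. This is a rough sketch that assumes an Apache server with mod_rewrite enabled and an .htaccess file sitting in the site's document root. Both are assumptions, since the server setup isn't stated here.

```
# .htaccess in the document root (assumes Apache with mod_rewrite enabled)
RewriteEngine On
# Permanently redirect /directory/<category>/ to /directory/<category>/index.html.
# Requests that already end in index.html do not match the pattern, so there is no loop.
RewriteRule ^directory/([^/]+)/$ /directory/$1/index.html [R=301,L]
```

With a rule like this, a request for /directory/Alternative/ should get a permanent redirect to /directory/Alternative/index.html, so only the index.html version is left to be indexed.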