robots.txt

gayc Well-Known Member

#1

Hi

Please excuse what is probably going to be a naive question but I have never used robots.txt before.

A few times I have noticed Google requesting a robots.txt (which I don't have) and then leaving, i.e. it never looked at any other pages.

So I decided to have a robots.txt (see below).

One thing I would like to do is exclude several directories that are all called 'data', e.g.

holidays/data/
uk/data/
swimming/data/

Can I exclude them all in one line, or do I risk other directories inside holidays, swimming or uk being excluded as well?

e.g. Disallow: /*/data/

Any advice is very welcome.

Thanks, Gay

User-agent: *
Disallow: /cgi-bin/
Disallow: /_borders/
Disallow: /_derived/
Disallow: /_fpclass/
Disallow: /_overlay/
Disallow: /_private/
Disallow: /_themes/
Disallow: /_vti_bin/
Disallow: /_vti_cnf/
Disallow: /_vti_log/
Disallow: /_vti_map/
Disallow: /_vti_pvt/
Disallow: /_vti_txt/
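
One way to sanity-check a file like the one above before uploading it is Python's standard urllib.robotparser module. This is only a rough sketch: it uses a shortened copy of the rules, www.example.com is a placeholder host, and build_parser is a made-up helper name. Note that this parser does plain prefix matching, so as far as I know it will not tell you anything about how a wildcard line would behave.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules from the post (assumption: the rest behave the same way).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /_private/
Disallow: /_vti_bin/
"""


def build_parser(text):
    """Hypothetical helper: load robots.txt rules from a string instead of a URL."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# www.example.com is a placeholder, not the real site.
for url in (
    "http://www.example.com/cgi-bin/form.pl",
    "http://www.example.com/_private/notes.htm",
    "http://www.example.com/holidays/index.html",
):
    verdict = "allowed" if parser.can_fetch("*", url) else "blocked"
    print(url, "->", verdict)
```

Anything not listed under a Disallow line stays crawlable, so ordinary pages such as holidays/index.html are unaffected by these rules.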

gayc, Jan 27, 2007 IP

kh7 Peon

#2

Hi - I have not used robots.txt yet either, so I can't help you with that.

I do know that Google will index your site in its own time (unfortunately). It will not index your site any faster because you have a robots.txt file. It already knows your site exists, or it would not have asked for your robots.txt file. That's a start. You probably know the refrain: get links, get visitors, get links.

kh7, Jan 27, 2007 IP

Diether Peon

#3

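Google understands wildcards, so I think you can do it.
You can also upload an empty index.html file into those directories. That stops the server from showing a directory listing, so Google and the other search engines cannot discover the files through it (files that are linked directly from elsewhere can still be crawled).
Wildcards are not part of the original robots.txt standard, and as far as I know Google is the only SE that supports them (I'm not 100% sure of this, though).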
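
To get a feel for what a line like Disallow: /*/data/ would and would not catch, here is a small sketch of wildcard matching. It assumes the semantics Google documents for its crawler: the rule is matched from the start of the path, * stands for any run of characters (including slashes), and a trailing $ anchors the end. The function names are made up for illustration and the example paths are hypothetical; other crawlers may treat the pattern differently.

```python
import re


def robots_pattern_to_regex(pattern):
    """Hypothetical helper: turn a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and put ".*" wherever the pattern had "*".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, pattern):
    """True if the path is covered by the pattern (prefix match from the start)."""
    return robots_pattern_to_regex(pattern).match(path) is not None


RULE = "/*/data/"

# Example paths are invented; only the directory names come from the question.
for path in (
    "/holidays/data/prices.html",
    "/uk/data/",
    "/swimming/data/times.csv",
    "/holidays/index.html",
    "/holidays/photos/beach.jpg",
    "/uk/london/data/map.html",
    "/data/top-level.html",
):
    print(path, "->", "blocked" if is_disallowed(path, RULE) else "allowed")
```

Under these assumptions the rule only touches paths that contain a /data/ directory below some other directory, so the rest of holidays, uk and swimming stays open. It also catches deeper data directories such as /uk/london/data/, but not a top-level /data/.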

Diether, Jan 27, 2007 IP

kh7 Peon

#4

As long as there are no links to those directories, Google isn't likely to index them anyhow. Even if they were indexed, they would be unlikely to rank.

kh7, Jan 27, 2007 IP

sqeeze Peon

#5

www.robotstxt.org - everything you need to know about robots.txt, with a complete list of robots that you may allow or disallow.

sqeeze, Jan 27, 2007 IP

gayc Well-Known Member

#6

Many thanks everyone.

I will start with something simple.

gayc, Jan 28, 2007 IP
