Is there any difference between (1) and (2) if, in my robots.txt, I put either

1)
User-agent: Mediapartners-Google
Disallow:

or

2)
User-agent: *
Disallow:
wealthmastery, why do you want to ban the Mediapartners-Google bot from your pages? It helps you by crawling all of your pages that have AdSense implemented.
Ooopsss... My initial intention is that I WANT them to crawl my pages. So what should be the syntax? As below?

1)
User-agent: Mediapartners-Google
Allow:

2)
User-agent: *
Allow:

And what if it is

User-agent: *
Disallow: /abc/

?? I am really confused about the syntax. When is it 'Allow'? When is it 'Disallow'?
He knows much more than me; I just have empty robots.txt files on every domain. http://www.robotstxt.org/wc/robots.html
That is where I got it from. His one is

# Allow all
User-agent: *
Disallow:

Which means it bans all the bots? That is exactly the same as my initial post, option (2).
That syntax is for disallowing them all; otherwise they'll crawl your site. If you just want to allow them all to come, leave the whole robots.txt file empty (but create one).
Googlebot: http://www.google.com/support/webmasters/bin/answer.py?answer=40364&topic=8846
MSNBot: http://search.msn.com/docs/siteowner.aspx?t=SEARCH_WEBMASTER_REF_RestrictAccessToSite.htm
Yahoo: http://help.yahoo.com/help/us/ysearch/slurp/
Memo: http://3w.ezer.com/robots/robots.txt/disallow.asp
Thanks guys.

(1) Let me get this right. WITHOUT a "/" after "Disallow:", it means ALLOWING the bot to crawl? From the Google page http://www.google.com/support/webmasters/bin/answer.py?answer=40364&topic=8846:

"Allowing Googlebot
If you want to block access to all bots other than the Googlebot, you can use the following syntax:

User-agent: *
Disallow: /

User-agent: Googlebot
Disallow:"

(2) So back to my original post... I actually ALLOW them to crawl, as there is NO "/" after "Disallow:"? Am I correct?

(3) If we specify just one type of bot without specifying the others, BY DEFAULT it means ALLOWING the OTHER bots to crawl, right? For example, if my robots.txt is only:

User-agent: Googlebot-Image
Disallow: /

Does that mean I BAN only Google from crawling my images BUT allow other bots?
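In other words, assuming I have understood correctly, my question (3) file should behave exactly like this more explicit hypothetical version:

# Block only Google's image crawler from the whole site
User-agent: Googlebot-Image
Disallow: /

# Every other robot: an empty Disallow means "disallow nothing",
# i.e. allow everything (which is also the default when a robot
# matches no record at all)
User-agent: *
Disallow: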
There is "error" on the error log, if I do not create robots.txt. Ok I can create an empty one. But that is not the point. The point here is I am confused about the syntax. That is why I asked Thanks and I hope I get some straight answers to my post. This is my robots.txt ---- User-agent: Mediapartners-Google Disallow: User-agent: OmniExplorer_Bot Disallow: / User-agent: FreeFind Disallow: / ----
Yeah, it looks like there are different schools of thought about robots.txt:

1)
User-agent: *
Disallow:
may be seen as an error and be interpreted as
User-agent: *
Disallow: /

2)
User-agent: *
Allow:
is only used by Googlebot, and the rest of the bots do not recognise it.

3) It is better not to have a robots.txt at all (but then how do you deal with error notifications like "robots.txt is not found" in the error log?).

Gosh... this thing can sometimes drive one crazy...
Hi wealthmastery,

You got a lot of confusing answers here!

No, all robots understand the syntax you are using. It is correct.

I never use "Allow". Most robots do not support the "Allow" directive, and the ones that support it do not agree on its exact meaning.

A site without robots.txt is fine for search engines, but it fills your error log. A completely empty robots.txt is a good solution. It has exactly the same meaning as this:

User-agent: *
Disallow:

In case of doubt, refer to the "official" standard used by all serious robot designers: the Robots Exclusion Standard (1994 edition). It is not a nice web page, but it contains everything you might want to know about the standard.

Jean-Luc
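To illustrate what that standard describes (my own sketch, not taken from the standard's text): a robots.txt file is a set of records separated by blank lines, each record made of one or more User-agent lines followed by one or more Disallow lines, and a robot that matches no record is allowed everything. For example:

# One record can name several robots: keep both of these out completely
User-agent: OmniExplorer_Bot
User-agent: FreeFind
Disallow: /

# Everyone else may crawl everything
User-agent: *
Disallow: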
If you do not use

User-agent: Googlebot
Disallow: /

then it is not necessary to list

User-agent: Mediapartners-Google
Disallow:
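For example, the explicit Mediapartners-Google record only earns its keep in a setup like this hypothetical one, where a robot obeys its own record instead of the "*" record, so the AdSense crawler is still let in while everything else is blocked:

# Block all robots by default
User-agent: *
Disallow: /

# The AdSense crawler matches this more specific record,
# so it may still crawl despite the blanket block above
User-agent: Mediapartners-Google
Disallow: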