Review my robots.txt

risoknop Peon

Messages: 914

Likes Received: 24

Best Answers: 0

Trophy Points: 0

#1

Please review my robots.txt protocol file and tell me if it's good or bad.

http://seowebtips.com/robots.txt

Is this suitable for a WordPress blog? Should I add anything, or does it need modifications?

NOTE: I've allowed the Google image bot to crawl only the /images/ directory, because that's where all the images on my site are stored.

Thanks in advance.

risoknop, May 20, 2008

manish.chauhan Well-Known Member

Messages: 1,682

Likes Received: 35

Best Answers: 0

Trophy Points: 110

#2

User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/

You have allowed all bots to crawl your website except for the four directories you disallow. In other words, all search engines will crawl your site, but they will not crawl these four directories.
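One way to sanity-check this reading is with Python's standard-library urllib.robotparser. This is only a minimal sketch, assuming the wildcard group is exactly the five lines quoted in this post; the bot name ExampleBot and the sample paths are made up for illustration, and real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The wildcard group quoted in this post, parsed from a string (no network access).
WILDCARD_RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
"""

wildcard_parser = RobotFileParser()
wildcard_parser.parse(WILDCARD_RULES.splitlines())


def wildcard_allows(path, agent="ExampleBot"):
    """Return True if the (hypothetical) bot may fetch the given path."""
    return wildcard_parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Sample paths are illustrative, not taken from the real site.
    for path in ("/", "/images/logo.png", "/wp-admin/", "/wp-content/themes/style.css"):
        print(path, wildcard_allows(path))
```

Paths under the four listed directories should come back as blocked, and everything else as allowed.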

User-Agent: Mediapartners-Google
Allow: /

User-Agent: Adsbot-Google
Allow: /

User-Agent: Googlebot-Image
Allow: /images/

User-Agent: Googlebot-Mobile
Allow: /

Here you are allowing these bots individually, which you do not need to do: even without these individual entries, the bots will still crawl your website, because you have already allowed all bots in the initial instructions.
So if you remove the lines above from robots.txt, crawling will still work the way you want.
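To see what the per-bot groups actually change, here is a rough check with Python's urllib.robotparser. It assumes the file consists of just the two blocks quoted in this post, and the sample paths are invented for illustration. It reflects how this one parser reads the rules, not a guarantee about any particular crawler.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the file from the two blocks quoted in this post.
ORIGINAL_ROBOTS = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/

User-Agent: Mediapartners-Google
Allow: /

User-Agent: Adsbot-Google
Allow: /

User-Agent: Googlebot-Image
Allow: /images/

User-Agent: Googlebot-Mobile
Allow: /
"""

original_parser = RobotFileParser()
original_parser.parse(ORIGINAL_ROBOTS.splitlines())


def image_bot_allowed(path):
    """Return True if Googlebot-Image may fetch the path under these rules."""
    return original_parser.can_fetch("Googlebot-Image", path)


if __name__ == "__main__":
    # Hypothetical paths: one inside /images/, two outside it.
    for path in ("/images/logo.png", "/2008/05/some-post/", "/wp-admin/"):
        print(path, image_bot_allowed(path))
```

Because the Googlebot-Image group contains only an Allow line, this parser reports pages outside /images/ as fetchable too, so the group does not restrict the image bot to that folder. One caveat: a crawler that matches its own named group is generally expected to follow only that group, so the Disallow lines under `*` may not apply to these four bots while their groups are present.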

manish.chauhan, May 20, 2008

risoknop Peon

#3

So basically these lines are useless?

User-Agent: Mediapartners-Google
Allow: /

User-Agent: Adsbot-Google
Allow: /

User-Agent: Googlebot-Image
Allow: /images/

User-Agent: Googlebot-Mobile
Allow: /

I specifically wanted to keep the Google image bot from crawling the whole site, because all my images are stored in one folder (/images/)...

risoknop, May 20, 2008

ravi72194 Banned

Messages: 270

Likes Received: 4

Best Answers: 0

Trophy Points: 0

#4

=======================
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
=======================

no need for this line

Sitemap: http://seowebtips.com/sitemap.xml

ravi72194, May 20, 2008

risoknop Peon

#5

ravi72194 said:

=======================
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
=======================

no need for this line

Sitemap: http://seowebtips.com/sitemap.xml

Are you sure? Isn't it recommended to have the location of your sitemap in robots.txt?

risoknop, May 20, 2008

manish.chauhan Well-Known Member

#6

ravi72194 said:

no need for this line

Sitemap: http://seowebtips.com/sitemap.xml

I don't agree with ravi. It is important to put your sitemap in robots.txt for autodiscovery: whenever a crawler comes to your website, it first reads your robots.txt, and if it finds the sitemap URL there, it can crawl all your pages effectively.
For more info, you can check:
http://googlewebmastercentral.blogspot.com/2007/04/whats-new-with-sitemapsorg.html
http://www.sitemaps.org/protocol.php#submit_robots
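As a small illustration of autodiscovery, Python's standard-library parser can read the Sitemap line straight out of robots.txt. This is a sketch that assumes Python 3.8 or newer (where `site_maps()` is available) and uses a shortened, made-up rule set around the Sitemap line quoted in this thread.

```python
from urllib.robotparser import RobotFileParser

# Shortened, assumed robots.txt; only the Sitemap line is taken from the thread.
SITEMAP_ROBOTS = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: http://seowebtips.com/sitemap.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_ROBOTS.splitlines())

# site_maps() returns a list of sitemap URLs, or None if the file lists none.
discovered_sitemaps = sitemap_parser.site_maps()

if __name__ == "__main__":
    print(discovered_sitemaps)
```

A crawler that supports the sitemaps.org convention can pick up the sitemap location the same way, without the site owner submitting it separately.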

manish.chauhan, May 20, 2008

manish.chauhan Well-Known Member

#7

risoknop said:

I specifically wanted to keep the Google image bot from crawling the whole site, because all my images are stored in one folder (/images/)...

To block the Google image bot from the rest of your website, you can add these lines to your robots.txt:

User-Agent: Googlebot-Image
Allow: /images/
Disallow: /
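If you want to check these three lines before uploading them, a quick test with Python's urllib.robotparser might look like this. It is a sketch only: the sample paths and the ExampleBot name are hypothetical, and the result shows how this parser reads the rules rather than how Google's crawler is guaranteed to behave.

```python
from urllib.robotparser import RobotFileParser

# The three lines suggested in this post, parsed from a string.
IMAGE_BOT_RULES = """\
User-Agent: Googlebot-Image
Allow: /images/
Disallow: /
"""

image_parser = RobotFileParser()
image_parser.parse(IMAGE_BOT_RULES.splitlines())


def image_bot_can_fetch(path):
    """Return True if Googlebot-Image may fetch the path under these rules."""
    return image_parser.can_fetch("Googlebot-Image", path)


if __name__ == "__main__":
    # Hypothetical paths: one inside /images/, one elsewhere on the site.
    for path in ("/images/logo.png", "/about/"):
        print(path, image_bot_can_fetch(path))
```

The image bot is allowed into /images/ and kept out of everything else, while bots that are not named in this group are unaffected by it.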

manish.chauhan, May 20, 2008
