Please review my robots.txt file and tell me whether it's good or bad: http://seowebtips.com/robots.txt Is it suitable for a WordPress blog? Should I add anything, or does it need modifications? NOTE: I've allowed the Google image bot to crawl only the /images/ directory, because that's where all the images on my site are stored. Thanks in advance.
You have allowed all bots to crawl your website, except for the four directories that you disallow. That means every search engine will crawl your site but skip those 4 directories. You are then also allowing some bots individually, which you do not need to do: even without those individual entries, these bots would still crawl your website, because you have already allowed all bots in the initial instructions. So if you remove those lines from robots.txt, it will work the same way you want...
So basically these lines are useless? I specifically wanted to keep the Google image bot from crawling the whole site, because all my images are stored in one folder (/images/)...
=======================
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
=======================
No need for this line:
Sitemap: http://seowebtips.com/sitemap.xml
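If you want to check what those four Disallow rules do before uploading the file, here is a rough sketch using Python's standard urllib.robotparser module. The two URL paths being tested are made up for illustration, and Python's parser is only an approximation of how a real search engine bot reads the file.

```python
from urllib.robotparser import RobotFileParser

# The catch-all group from the post above.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(bot, path):
    """Return True if `bot` may fetch `path` on the site."""
    return parser.can_fetch(bot, "http://seowebtips.com" + path)


# Hypothetical paths, chosen only to exercise the rules.
blocked = allowed("Googlebot", "/wp-admin/options.php")
open_page = allowed("Googlebot", "/some-post/")
print(blocked, open_page)  # False True
```

Any bot name gives the same result here, since the file has only the "*" group.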
I don't agree with ravi... It is important to put your sitemap in robots.txt for autodiscovery. Whenever a crawler comes to your website, it first reads your robots.txt; if it finds the sitemap URL there, it can crawl all your pages effectively. For more info you can check:
http://googlewebmastercentral.blogspot.com/2007/04/whats-new-with-sitemapsorg.html
http://www.sitemaps.org/protocol.php#submit_robots
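To illustrate the autodiscovery idea, here is a minimal sketch of how a client can pick the Sitemap line out of robots.txt. It uses Python's urllib.robotparser (the site_maps() method needs Python 3.8 or later) and parses the file contents from a string instead of fetching them. It only shows that the line is machine-readable, not how any particular search engine handles it.

```python
from urllib.robotparser import RobotFileParser

# robots.txt contents as discussed in this thread, with the Sitemap line kept.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/

Sitemap: http://seowebtips.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the list of Sitemap URLs found, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)  # ['http://seowebtips.com/sitemap.xml']
```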
To block the Google image bot from the rest of your website, you can add these lines to your robots.txt:
User-Agent: Googlebot-Image
Allow: /images/
Disallow: /
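You can sanity-check that group with a small sketch like the one below, again using Python's urllib.robotparser. The "*" group and the test paths (logo.png, /about/) are assumptions added for the example. Note that Python's parser applies the first matching rule in file order, while Google documents a most-specific-match rule; with Allow: /images/ listed before Disallow: /, both approaches give the same answer.

```python
from urllib.robotparser import RobotFileParser

# Googlebot-Image group from the post, plus an assumed catch-all group.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Allow: /images/
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

SITE = "http://seowebtips.com"

# Hypothetical paths used only to exercise the rules.
image_ok = parser.can_fetch("Googlebot-Image", SITE + "/images/logo.png")
page_ok = parser.can_fetch("Googlebot-Image", SITE + "/about/")
other_ok = parser.can_fetch("SomeOtherBot", SITE + "/about/")
print(image_ok, page_ok, other_ok)  # True False True
```

The image bot gets into /images/ only, and bots without their own group still fall back to the "*" rules.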