Review my robots.txt

Discussion in 'robots.txt' started by risoknop, May 20, 2008.

  1. #1
    Please review my robots.txt file and tell me whether it's good or bad.

    http://seowebtips.com/robots.txt

    Is this suitable for a WordPress blog? Should I add anything, or does it need modifications?

    NOTE: I've allowed Google's image bot to crawl only the /images/ directory, because that's where all the images on my site are stored ;)

    Thanks in advance.
     
    risoknop, May 20, 2008 IP
  2. manish.chauhan

    manish.chauhan Well-Known Member

    #2
    You have allowed all bots to crawl your website except for the four directories you disallow. That means all search engines will crawl your site, but they will not crawl those four directories.

    You are also allowing some bots individually, which you don't need to do: even without those individual entries, these bots would still crawl your website, because you have already allowed all bots in the initial rules.
    So if you remove those lines from robots.txt, it will work the same way you want... :)
     
    manish.chauhan, May 20, 2008 IP
  3. risoknop

    risoknop Peon

    #3
    So basically these lines are useless?

    I specifically wanted to keep Google's image bot from crawling the whole site, because all my images are stored in one folder (/images/)...
     
    risoknop, May 20, 2008 IP
  4. ravi72194

    ravi72194 Banned

    #4
    =======================
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /wp-admin/
    Disallow: /wp-includes/
    Disallow: /wp-content/
    =======================

    There is no need for this line:

    Sitemap: http://seowebtips.com/sitemap.xml
     
    ravi72194, May 20, 2008 IP
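
For anyone who wants to double-check what the "User-agent: *" block quoted above actually does, here is a minimal sketch using Python's standard urllib.robotparser module. The rules are copied from the post; the bot name "ExampleBot" and the sample paths are made up for illustration and are not taken from the real site.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the post above.
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
"""


def check_paths(rules, agent, paths):
    """Return {path: True/False} for whether `agent` may fetch each path."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return {path: parser.can_fetch(agent, path) for path in paths}


# Hypothetical bot name and paths, chosen only to exercise the rules.
SAMPLE_PATHS = [
    "/",
    "/2008/05/some-post/",
    "/images/logo.png",
    "/wp-admin/index.php",
    "/wp-content/uploads/logo.png",
]

result = check_paths(RULES, "ExampleBot", SAMPLE_PATHS)
for path, ok in result.items():
    print(f"{path}: {'allowed' if ok else 'blocked'}")
```

Note that anything under /wp-content/ is blocked by these rules, so they only suit a site whose images really do live elsewhere (here, /images/).
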
  5. risoknop

    risoknop Peon

    #5
    Are you sure? Isn't it recommended to have the location of your sitemap in robots.txt?
     
    risoknop, May 20, 2008 IP
  6. manish.chauhan

    manish.chauhan Well-Known Member

    #6
    I don't agree with ravi. It is important to put your sitemap in robots.txt for autodiscovery: whenever a crawler comes to your website, it first reads your robots.txt, and if it finds the sitemap URL there, it can crawl all your pages effectively.
    For more info you can check:
    http://googlewebmastercentral.blogspot.com/2007/04/whats-new-with-sitemapsorg.html
    http://www.sitemaps.org/protocol.php#submit_robots
     
    manish.chauhan, May 20, 2008 IP
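
To illustrate the autodiscovery idea described above, here is a rough sketch of how a crawler-side script might pull the Sitemap URL out of a robots.txt file. It assumes Python 3.8 or later (where RobotFileParser.site_maps() is available). The robots.txt text below is a shortened stand-in built from lines quoted in this thread, not the real file, and discover_sitemaps is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Shortened stand-in for the site's robots.txt (assumption, not the real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: http://seowebtips.com/sitemap.xml
"""


def discover_sitemaps(robots_txt):
    """Return the list of Sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


sitemaps = discover_sitemaps(ROBOTS_TXT)
print(sitemaps)
```

The Sitemap line is independent of the User-agent groups, so it can sit anywhere in the file and still be picked up.
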
  7. manish.chauhan

    manish.chauhan Well-Known Member

    #7
    To block Google's image bot from the rest of your website, you can add these lines to your robots.txt:

    User-Agent: Googlebot-Image
    Allow: /images/
    Disallow: /
     
    manish.chauhan, May 20, 2008 IP
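
As a quick sanity check of the Googlebot-Image rules suggested above, the sketch below combines them with the general "User-agent: *" block quoted earlier in the thread and asks Python's urllib.robotparser how each bot would be treated. The combined file is an assumption about how the final robots.txt might look, and the sample paths are made up. Also keep in mind that Python's parser applies the first matching rule in a group, while Google documents a longest-match rule; with Allow: /images/ listed before Disallow: / both approaches should agree here.

```python
from urllib.robotparser import RobotFileParser

# Assumed final file: the general block plus the Googlebot-Image group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/

User-Agent: Googlebot-Image
Allow: /images/
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def verdict(agent, path):
    """Return 'allowed' or 'blocked' for the given bot and path."""
    return "allowed" if parser.can_fetch(agent, path) else "blocked"


# Hypothetical paths, chosen only to exercise each rule.
CHECKS = [
    ("Googlebot-Image", "/images/logo.png"),
    ("Googlebot-Image", "/2008/05/some-post/"),
    ("Googlebot", "/2008/05/some-post/"),
    ("Googlebot", "/wp-admin/"),
]

for agent, path in CHECKS:
    print(f"{agent:16} {path:24} {verdict(agent, path)}")
```

A bot that has its own group follows only that group, which is why Googlebot-Image is kept out of everything except /images/ while other bots still fall back to the "User-agent: *" rules.
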