How do I create a robots.txt file?

Discussion in 'robots.txt' started by crappy-ownage, Nov 7, 2008.

  1. #1
Create a plain-text file named robots.txt and put it in the root of your web site (e.g. example.com/robots.txt). The syntax is very limited and easy to understand. The first part specifies which robot the rules apply to.
    
    User-agent: BotName
    Code (markup):
    Replace BotName with the robot name in question. To address all of them, simply use an asterisk.

    User-agent: *
    Code (markup):
    The second part tells the robot in question not to enter certain parts of your web site.

    Disallow: /cgi-bin/
    Code (markup):
    In this example, any path on our site starting with the string /cgi-bin/ is declared off limits. Multiple paths can be excluded per robot by using several Disallow lines.

    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /temp/
    Disallow: /private
    Code (markup):
    This robots.txt file would apply to all bots and instruct them to stay out of directories /cgi-bin/ and /temp/.

    It also tells them any path/URL on your site starting with /private (files and directories) is off limits.

    To declare your entire website off limits to BotName, use the example shown below.
    
    User-agent: BotName
    Disallow: /
    Code (markup):
For a generic robots.txt file that welcomes every robot and restricts nothing, use this sample.

    User-agent: *
    Disallow:
    Code (markup):
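One more point worth adding to the examples above: records are separated by blank lines, and a crawler obeys the most specific User-agent group that matches it, falling back to the * group. A sketch combining a bot-specific rule with a catch-all (BotName is a placeholder, as before):

    User-agent: BotName
    Disallow: /

    User-agent: *
    Disallow: /cgi-bin/
    Code (markup):
Here BotName is banned from the whole site, while every other robot is only kept out of /cgi-bin/. A robot that matches the BotName group ignores the * group entirely, so rules are not combined across groups.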
     
    crappy-ownage, Nov 7, 2008 IP
  2. manish.chauhan

    #2
Nice information, thanks. :)
     
    manish.chauhan, Jan 20, 2009 IP
  3. udayns

    #3
Really useful information for a webmaster.
     
    udayns, Jan 28, 2009 IP
  4. BlueChipSEO

    #4
I don't see any mention of wildcard entries or regular expressions for dynamic URLs in this post. I blogged about how to take advantage of wildcard entries in robots.txt here: bluechipseo.com/2009/01/how-to-use-wildcard-entries-in-your.html
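To fill that gap in the thread itself: the original robots.txt standard has no wildcards in Disallow paths, but the major engines (Google, Yahoo, Bing) extend it with * (match any sequence of characters) and $ (anchor to the end of the URL). A sketch, assuming your dynamic URLs carry a sessionid query parameter:

    User-agent: *
    Disallow: /*?sessionid=
    Disallow: /*.pdf$
    Code (markup):
The first line blocks any URL containing ?sessionid=, and the second blocks URLs ending in .pdf. Robots that don't support these extensions may treat the lines literally, so verify the rules in the search engines' webmaster tools before relying on them.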
     
    BlueChipSEO, Jan 30, 2009 IP
  5. AndyCrow

    #5
Thanks. Just gotta work out how to disallow access to the pages listed in the robots.txt.
If nothing else stops people from viewing the disallowed pages, you're better off with no robots.txt at all: the file is publicly readable, so it hands anyone a list of exactly the paths you'd rather keep hidden, and attackers routinely read it for that reason. Protect sensitive pages with real access control, not robots.txt.
     
    AndyCrow, Feb 15, 2009 IP
  6. jamesjame

    #6
    Thanks for Sharing Information.
     
    jamesjame, Mar 24, 2009 IP
  7. terek

    #7
    COOL.
    Thank you very much!
     
    terek, Mar 25, 2009 IP
  8. JPetrillo

    #8
Thank you... I think I need one for my forum, right? Should you have one if you run a vBulletin forum?
     
    JPetrillo, Apr 15, 2009 IP
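For anyone with the same question: a vBulletin forum does usually benefit from a robots.txt that keeps crawlers away from its utility scripts, which generate endless near-duplicate pages. A sketch only — the paths below are typical vBulletin script names, not verified against any particular installation, and they assume the forum lives at the site root:

    User-agent: *
    Disallow: /search.php
    Disallow: /memberlist.php
    Disallow: /calendar.php
    Disallow: /printthread.php
    Code (markup):
Adjust the paths to match where your forum is installed (e.g. /forum/search.php if it lives in /forum/).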
  9. linkdealer

    #9
You can find some more information at seocrazy.blogspot.com/2008/04/robotstxt-stop.html
     
    linkdealer, Jun 9, 2009 IP