Robots.txt file help?!?!

Discussion in 'Search Engine Optimization' started by William2009, Aug 17, 2009.

  1. #1
    Does this robots.txt file:
    User-agent: *
    Allow: /.
    Allow: /

    allow all crawlers to crawl my site, or does it stop them?
     
    William2009, Aug 17, 2009
  2. maineexista

    maineexista Peon

    Messages:
    317
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #2
    User-agent: *
    Disallow:
    Disallow: /cgi-bin/


    You can simply use this, or you can go to Google Webmaster Tools and generate your own robots.txt ;)

    Hooooray !
     
    maineexista, Aug 17, 2009
  3. William2009

    William2009 Guest

    Messages:
    276
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    I used Google's generator and it gave me the one I posted!
     
    William2009, Aug 17, 2009
  4. Canonical

    Canonical Well-Known Member

    Messages:
    2,223
    Likes Received:
    141
    Best Answers:
    0
    Trophy Points:
    110
    #4
    If you want to allow ALL crawlers to crawl your entire site, then you don't even need a robots.txt unless you want to include a sitemap.xml for discovery by the bots. If you do insist on having one even though you want all crawlers to crawl your entire site, then, as maineexista said, you only need:

     
    Canonical, Aug 17, 2009
  5. William2009

    #5
    Would you say it's better to have a robots.txt file?
    Or are there certain robots I should want to stop?
     
    William2009, Aug 17, 2009
  6. SmallPotatoes

    SmallPotatoes Peon

    Messages:
    1,321
    Likes Received:
    41
    Best Answers:
    0
    Trophy Points:
    0
    #6
    It is the same as having no robots.txt file at all.
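
    For anyone who wants to check this locally, here is a minimal sketch using Python's standard urllib.robotparser. It parses the file from post #1 and compares it with an empty rule set, which stands in here for "no robots.txt at all". The bot name and sample paths are made up for illustration, and Python's parser applies the first matching rule, so real crawlers may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The file from post #1, copied as posted.
POSTED_ROBOTS_TXT = """\
User-agent: *
Allow: /.
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


posted = build_parser(POSTED_ROBOTS_TXT)
empty = build_parser("")  # assumption: an empty rule set models a missing file

# Hypothetical bot name and paths, chosen only for illustration.
SAMPLE_PATHS = ["/", "/index.html", "/images/logo.png", "/.hidden/file"]

for path in SAMPLE_PATHS:
    # Print whether each rule set lets the bot fetch the path.
    print(path, posted.can_fetch("ExampleBot", path), empty.can_fetch("ExampleBot", path))
```

    Both columns should agree for every path, which is the sense in which the posted file changes nothing.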
     
    SmallPotatoes, Aug 17, 2009
  7. Canonical

    #7
    Like I said, you really only need one if 1) there are certain pages you DO NOT want crawled, or 2) you want to specify a sitemap.xml in your robots.txt so bots can auto-discover it.

    Otherwise, having one serves no purpose. The bots that you DO NOT want crawling your site likely ignore it anyway.
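
    As a rough illustration of those two cases, the sketch below parses a robots.txt that blocks one directory and advertises a sitemap, again with Python's urllib.robotparser. The /private/ path, the example.com URL, and the bot name are placeholders and not taken from anyone's real site. site_maps() needs Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file: one blocked directory plus a sitemap reference.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler would skip anything under /private/ ...
print(parser.can_fetch("ExampleBot", "/private/report.html"))

# ... but remain free to fetch everything else.
print(parser.can_fetch("ExampleBot", "/index.html"))

# Sitemap URLs listed in the file (None if there are none).
print(parser.site_maps())
```

    Keep in mind this only models a crawler that chooses to obey the file; nothing in robots.txt enforces the rules.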
     
    Canonical, Aug 17, 2009
  8. Dan Schulz

    Dan Schulz Peon

    Messages:
    6,032
    Likes Received:
    437
    Best Answers:
    0
    Trophy Points:
    0
    #8
    Not entirely true - raise your hand if you like seeing your site's server logs cluttered with 404 errors due to a missing robots.txt file every time it gets requested. At the very least, having one will prevent those pointless errors and clean up the log file.

    Oh, and as far as bots are concerned, you mean like Yahoo! Slurp? :eek:
     
    Dan Schulz, Aug 17, 2009
  9. sydzapp

    sydzapp Well-Known Member

    Messages:
    703
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    100
    #9
    I guess he meant malware bots :rolleyes:
     
    sydzapp, Aug 17, 2009
  10. Dan Schulz

    #10
    I know, but one of my friends happens to manage a very large online gaming community site for a niche tabletop game publisher, and Yahoo! was constantly hammering his server last year (he has his own dedicated server). He eventually traced it down to Yahoo! Slurp, then added a Disallow directive which the bot ignored.

    So he got ticked off, said "screw it" and banned the entire IP block. (If you don't believe me, you can ask him here on Digital Point. His username is deathshadow.)
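
    The post does not say how the ban was set up (firewall, server config, or something else), so the following is only a sketch of the underlying idea: checking whether a client address falls inside a blocked range, using Python's ipaddress module. The range shown is a reserved documentation block (RFC 5737), not a real Yahoo! range, and is_blocked is a made-up helper name.

```python
import ipaddress

# Hypothetical blocked range: a reserved documentation block, NOT a real crawler range.
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


print(is_blocked("203.0.113.45"))  # inside the blocked /24
print(is_blocked("198.51.100.7"))  # outside it
```

    In practice this kind of block usually lives in the firewall or web server rather than in application code, since it has to work even when the bot ignores robots.txt.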
     
    Dan Schulz, Aug 17, 2009