Robots.txt

Discussion in 'robots.txt' started by CWN, Jul 19, 2006.

  1. #1
    I currently do not have a robots.txt, but feel I should have one. I would like to accept all search engines, except perhaps Google Images. I also have one folder, "adminpanel", that I do not want crawled. How would I write that? Is there anything else I should keep in mind?

    Thanks very much

    CWN
     
    CWN, Jul 19, 2006 IP
  2. jaguar-archie2006

    jaguar-archie2006 Banned

    #2
    Disallow: /the-folder/

    Is that your problem?
     
    jaguar-archie2006, Jul 19, 2006 IP
  3. Smyrl

    Smyrl Tomato Republic Staff

    #3
    Smyrl, Jul 19, 2006 IP
  4. banless

    banless Peon

    #4
    Copy and paste the code below into your robots.txt file:

    User-agent: *
    Disallow: /adminpanel/
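
    A quick way to sanity-check a rule like this before uploading it is Python's standard-library urllib.robotparser. This is only a sketch: the bot name "ExampleBot" and the example.com URLs are made up for illustration, and real crawlers may interpret edge cases differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /adminpanel/
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical bot name and URLs, used only to exercise the rules.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/adminpanel/login.php"))
print(is_allowed(ROBOTS_TXT, "ExampleBot", "http://www.example.com/index.html"))
```

    The first call should report that the admin page is off limits, the second that the home page is fine.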
     
    banless, Jul 19, 2006 IP
  5. catanich

    catanich Peon

    #5
    Hi

    It is easy. It must be a plain-text file saved in the site's root directory under the exact name "robots.txt".

    A good example is Google's own: http://www.google.com/robots.txt

    or look at mine. I've added some examples to it for you.

    ------------------------------------

    # Robots.txt file created by 6/21/06
    # For domain: http://www.catanich.com

    # All other robots will spider the domain
    User-agent: *              # applies to all bots
    Disallow: /_               # do not crawl any path starting with "/_"
    Disallow: /catanich.html   # do not crawl the file catanich.html
    Disallow: /catanich/       # do not crawl the directory catanich

    -------------------------------
    Note: the # is the start of a robots.txt comment

    But a quick Google Search on robots.txt will help you as well.

    Try http://www.javascriptkit.com/howto/robots.shtml for a quick overview. It was the first one I came to that made sense.
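
    The file-versus-directory rules in the example can also be tried out locally. The sketch below uses Python's standard-library urllib.robotparser, which does plain prefix matching; the test paths such as /catanich-news.html are hypothetical, and the underscore rule is left out to keep the sketch short.

```python
from urllib.robotparser import RobotFileParser

# Trimmed copy of the example file; text after "#" is ignored as a comment.
ROBOTS_TXT = """\
# Example rules
User-agent: *              # applies to all bots
Disallow: /catanich.html   # one file
Disallow: /catanich/       # one directory
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str) -> bool:
    """Check a path for a hypothetical crawler called ExampleBot."""
    return parser.can_fetch("ExampleBot", path)


# Hypothetical paths chosen to show how prefix matching behaves.
for path in ["/catanich.html", "/catanich/about.html",
             "/catanich-news.html", "/catanich"]:
    print(path, "->", "allowed" if allowed(path) else "blocked")
```

    Note the trailing slash: a rule for /catanich/ covers everything inside the directory, but in this parser it does not cover the bare path /catanich.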

    Good luck

    Jim Catanich
    www.catanich.com
     
    catanich, Jul 19, 2006 IP
  6. mdvaldosta

    mdvaldosta Peon

    #6
    If anything, you should at the very least have a blank robots.txt file
     
    mdvaldosta, Jul 19, 2006 IP
  7. CWN

    CWN Banned

    #7
    Thanks everyone, very helpful! :D
     
    CWN, Jul 19, 2006 IP
  8. wrmineo

    wrmineo Peon

    #8
    Whether it's an open-door robots.txt or a restrictive one, it's still smart to have one ... without it, many bots will 'unfairly' register 404s looking for it ;)
     
    wrmineo, Jul 19, 2006 IP
  9. CWN

    CWN Banned

    #9
    Open-door robots.txt? Does that mean allow everything? Does anyone have a robots.txt that allows everything? This is what I am looking for :)
     
    CWN, Jul 19, 2006 IP
  10. Smyrl

    Smyrl Tomato Republic Staff

    #10
    User-agent: *
    Disallow:
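
    One character separates "allow everything" from "block everything" here, so it can be worth checking which one you have. A minimal sketch using Python's standard-library urllib.robotparser; the bot name and path are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# An empty Disallow value allows everything ...
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# ... while a single slash blocks the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def can_crawl(robots_txt: str, path: str) -> bool:
    """Return True if a hypothetical ExampleBot may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("ExampleBot", path)


print("empty Disallow:", can_crawl(ALLOW_ALL, "/any/page.html"))
print("Disallow: /   :", can_crawl(BLOCK_ALL, "/any/page.html"))
```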
     
    Smyrl, Jul 19, 2006 IP
  11. wrmineo

    wrmineo Peon

    #11
    Yes, that's an open-door file, as I call it.

    However, it's very easy to restrict some folders, which you might want to do, for example:

    User-agent: *
    Disallow: /test/
    Disallow: /cgi-bin/
    Disallow: /cp/
     
    wrmineo, Jul 19, 2006 IP
  12. CWN

    CWN Banned

    #12
    You guys ROCK! Thanks very much!
     
    CWN, Jul 19, 2006 IP
  13. CWN

    CWN Banned

    #13
    This is my robots.txt ...

    User-agent: *
    Disallow: /adminpanel/
    Disallow: /cgi-bin/

    It seems simple - would this be considered "efficient"? I think it suits my requirements
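
    The opening question also mentioned possibly keeping Google's image crawler out. One way that might look is a separate record for the Googlebot-Image user-agent placed ahead of the general rules. The sketch below is an assumption about how such a file could be laid out, not something posted in this thread, and it is checked with Python's standard-library urllib.robotparser; the /photos/cat.jpg path is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed layout: one record for Google's image crawler, one for everyone else.
# A blank line separates the two records.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow: /adminpanel/
Disallow: /cgi-bin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def check(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    return parser.can_fetch(agent, path)


# Hypothetical paths, used only to exercise both records.
for agent in ("Googlebot-Image", "Googlebot"):
    for path in ("/photos/cat.jpg", "/adminpanel/"):
        print(agent, path, "->", "allowed" if check(agent, path) else "blocked")
```

    With this layout the image crawler is kept out of the whole site, while other bots are only kept out of the two listed folders.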
     
    CWN, Jul 19, 2006 IP
  14. wrmineo

    wrmineo Peon

    #14

    Good move on keeping bots out of your admin and cgi-bin folders. Yes, it seems efficient, and it will stop the unnecessary 404s that get logged when bots can't find a robots.txt file.

    The next thing you need is a favicon ;)
     
    wrmineo, Jul 20, 2006 IP