How to create a robots.txt file? How to stop search engines from indexing files in cgi-bin?

Discussion in 'robots.txt' started by poseidon, Jan 23, 2006.

  1. #1
    Hi,

    I am new to robots.txt. What exactly does it do, and how do I use it? Also, I don't want search engines to crawl and index my cgi-bin directory. How can I do that?

    Some code would be really helpful.

    Regards.
     
    poseidon, Jan 23, 2006 IP
  2. Cristian Mezei

    Cristian Mezei Notable Member

    #2
    You should read this.
     
    Cristian Mezei, Jan 24, 2006 IP
  3. Jean-Luc

    Jean-Luc Peon

    #3
    Use this robots.txt:
    User-agent: *
    Disallow: /cgi-bin/
    Jean-Luc
     
    Jean-Luc, Jan 24, 2006 IP
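
One way to sanity-check the two rules from post #3 before uploading them is Python's standard `urllib.robotparser` module. This is only a minimal sketch: the helper name `is_allowed`, the user agent `ExampleBot`, and the sample URLs are assumptions made for illustration, not anything from the thread.

```python
from urllib.robotparser import RobotFileParser

# The rules suggested in post #3.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the given rules let user_agent fetch url.

    "ExampleBot" is a hypothetical crawler name used only for this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse the rules from memory, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "http://www.your-site.com/cgi-bin/form.pl"))  # False
    print(is_allowed(ROBOTS_TXT, "http://www.your-site.com/index.html"))       # True
```

Note that this only shows how a well-behaved crawler should interpret the rules; robots.txt is advisory and does not enforce anything by itself.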
  4. poseidon

    poseidon Banned

    #4
    So all I have to do is create a robots.txt file containing those two lines, is that right?
     
    poseidon, Jan 24, 2006 IP
  5. Jean-Luc

    Jean-Luc Peon

    #5
    Exactly. Make sure you upload robots.txt to the right directory, which is the root of your site. You should be able to view it at www.your-site.com/robots.txt.

    Jean-Luc
     
    Jean-Luc, Jan 24, 2006 IP
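
The point about the "right directory" can be illustrated with a small sketch: crawlers look for the file at the root of the host, whatever page they start from. The helper name `robots_url` and the sample page URL are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the URL where a crawler would look for robots.txt for this page.

    Only the scheme and host are kept; the path is always /robots.txt.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # Prints http://www.your-site.com/robots.txt
    print(robots_url("http://www.your-site.com/some/dir/page.html?x=1"))
```

A robots.txt uploaded to a subdirectory such as /some/dir/ would therefore never be consulted.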
  6. noiprox

    noiprox Peon

    #6
    I have read the tutorial and have a great robots.txt file, but when I run a sitemap generator, it still indexes the pages I want to disallow.


    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /tmp/
    Disallow: /cache/
    Disallow: /class/
    Disallow: /images/
    Disallow: /include/
    Disallow: /install/
    Disallow: /kernel/
    Disallow: /language/
    Disallow: /templates_c/
    Disallow: /themes/
    Disallow: /uploads/
    This is an example. The file is called robots.txt and is in the site root.

    Any thoughts?
     
    noiprox, Jan 25, 2006 IP
  7. GoGlobal

    GoGlobal Peon

    #7
    Yeah,
    It's good, but what is the most important use of the robots.txt file?
     
    GoGlobal, Jan 27, 2009 IP
  8. manish.chauhan

    manish.chauhan Well-Known Member

    #8
    A sitemap generator tool typically doesn't read robots.txt. It collects all the page URLs it finds and puts them into a single file, so you have to remove the disallowed pages from that file manually.
     
    manish.chauhan, Jan 27, 2009 IP
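
Instead of removing the pages by hand, the list of URLs a generator produces could be filtered against the robots.txt rules. The following is a rough sketch using Python's standard `urllib.robotparser`; the function name `filter_sitemap_urls`, the shortened rule set, and the sample URLs are assumptions for illustration, and real generators may offer their own exclusion settings.

```python
from urllib.robotparser import RobotFileParser

# A shortened version of the rules from post #6.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /images/
"""


def filter_sitemap_urls(robots_txt, urls, user_agent="*"):
    """Keep only the URLs that the robots.txt rules allow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # rules are parsed from the string, no network access
    return [url for url in urls if parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    # Hypothetical output of a sitemap generator.
    collected = [
        "http://www.your-site.com/",
        "http://www.your-site.com/images/logo.png",
        "http://www.your-site.com/articles/robots.html",
        "http://www.your-site.com/tmp/draft.html",
    ]
    for url in filter_sitemap_urls(ROBOTS_TXT, collected):
        print(url)
    # http://www.your-site.com/
    # http://www.your-site.com/articles/robots.html
```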
  9. udayns

    udayns Peon

    #9
    udayns, Jan 28, 2009 IP
  10. manish.chauhan

    manish.chauhan Well-Known Member

    #10
    manish.chauhan, Jan 28, 2009 IP
  11. NickR25

    NickR25 Peon

    #11
    Don't put those pages in your sitemap. Otherwise Google Webmaster Tools (if you use it) will report an error that some URLs in your sitemap are restricted by robots.txt.
     
    NickR25, Feb 1, 2009 IP
  12. sriraj46

    sriraj46 Peon

    #12
    How would one put robots.txt in a sitemap? First of all, should the robots.txt file be included in the sitemap at all? I guess not. Correct me if I'm wrong.
     
    sriraj46, Feb 2, 2009 IP
  13. manish.chauhan

    manish.chauhan Well-Known Member

    #13
    You can reference sitemap.xml in robots.txt. For more information, see
    http://www.sitemaps.org/protocol.php#submit_robots
     
    manish.chauhan, Feb 2, 2009 IP
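
As a rough illustration of the sitemaps.org approach, a robots.txt can carry a `Sitemap:` line with the full URL of the sitemap. The sketch below reads such a line back with `RobotFileParser.site_maps()`, which needs Python 3.8 or later; the sitemap URL and the helper name `sitemaps_in` are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# robots.txt that both blocks /cgi-bin/ and points crawlers at a sitemap.
# The sitemap URL is a hypothetical example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/

Sitemap: http://www.your-site.com/sitemap.xml
"""


def sitemaps_in(robots_txt: str) -> list:
    """Return the sitemap URLs declared in robots.txt, or an empty list."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(sitemaps_in(ROBOTS_TXT))  # ['http://www.your-site.com/sitemap.xml']
```

The reference goes in this direction only: robots.txt points to the sitemap, and the sitemap does not need to list robots.txt.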
  14. shailendra

    shailendra Peon

    #14
    How can you say that this is crap?
     
    shailendra, Feb 4, 2009 IP
  15. manish.chauhan

    manish.chauhan Well-Known Member

    #15
    Just because it only offers general instructions that one can get anywhere else. When it comes to specific points, such as the use of regular expressions, it doesn't provide solid information.
     
    manish.chauhan, Feb 4, 2009 IP
  16. ggmittal

    ggmittal Guest

    #16

    Yes, I agree. robots.txt does need to be included in the sitemap.
     
    ggmittal, Feb 17, 2009 IP
  17. proson

    proson Well-Known Member

    #17
    Hi Manish, I guess you want to exclude certain files?


    If you want to exclude a certain file, why don't you use a robots meta tag on that page instead?

    Why complicate things when you can do it easily?
     
    proson, Feb 17, 2009 IP
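
For the robots meta tag route, the page itself carries a tag such as `<meta name="robots" content="noindex">` in its head. The sketch below checks a page for that tag with Python's standard `html.parser`; the class name `MetaRobotsParser`, the helper `has_noindex`, and the sample page are assumptions for illustration.

```python
from html.parser import HTMLParser


class MetaRobotsParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, nofollow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Hypothetical page that should stay out of search results.
    page = (
        '<html><head><meta name="robots" content="noindex, nofollow"></head>'
        "<body>Private page</body></html>"
    )
    print(has_noindex(page))  # True
```

One thing to keep in mind: a crawler can only see the meta tag if it is allowed to fetch the page, so a page that is also disallowed in robots.txt will not have its meta tag read.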
  18. DareDevils

    DareDevils Active Member

    #18
    Yes, this is the one.
     
    DareDevils, Feb 23, 2009 IP
  19. infomalaya

    infomalaya Banned

    #19
    Thanks for the tips!
     
    infomalaya, Mar 30, 2009 IP
  20. 3drendering

    3drendering Peon

    #20
    Hey Shailendra,

    Manish is absolutely right.

    Why do you oppose him?

    Manish is right.
     
    3drendering, Apr 9, 2009 IP