How to protect our robots.txt?

Discussion in 'robots.txt' started by teohcl, Mar 28, 2010.

  1. #1
    Just wondering, can we protect our robots.txt from being accessed by others? Any idea how to do that? Please let me know if you know how. Thanks.
     
    teohcl, Mar 28, 2010 IP
  2. katarina

    #2
    But why would you want to protect robots.txt? Does anything happen if we don't protect it?
     
    katarina, Mar 30, 2010 IP
  3. bavington

    #3
    You cannot protect your robots.txt file. Any technique that hides it will also keep the search engines from picking it up. It is also worth bearing in mind that robots.txt is often the first place website hackers look, because it lists the paths you don't want crawled.

    If you have admin areas that you don't want cached or seen by the public, use .htaccess files to password-protect those directories.
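
    A minimal sketch of that, assuming Apache with basic authentication available; the realm name and the /path/to/.htpasswd location are placeholders, not values from this thread. It would go in an .htaccess file inside the directory you want to protect:

    ```apache
    # Ask for a username and password before serving anything in this directory
    AuthType Basic
    AuthName "Admin area"
    # Hypothetical path: keep the password file outside the web root
    AuthUserFile /path/to/.htpasswd
    Require valid-user
    ```

    The password file itself can be created with Apache's htpasswd tool, e.g. htpasswd -c /path/to/.htpasswd someuser (someuser is just an example name).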
     
    bavington, Mar 31, 2010 IP
  4. Zirkos

    #4
    It is possible (for example using .htaccess) to allow only certain IPs or hosts to see your robots.txt. But you'll have to research exactly which ones you want to allow. :) It's probably better to follow bavington's advice.
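
    For what it's worth, a rough sketch of the host-based variant, using Apache 2.2 style access directives. googlebot.com is only an example of a host you might decide to allow, so check each search engine's own documentation before relying on it:

    ```apache
    <Files robots.txt>
    Order Deny,Allow
    Deny from all
    # A partial domain name matches any client whose hostname ends in it.
    # Apache checks the name with a double reverse DNS lookup.
    Allow from googlebot.com
    </Files>
    ```

    Keep in mind that the DNS lookups add some latency to each request for the file.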
     
    Zirkos, Mar 31, 2010 IP
  5. D3xter

    #5
    You can use .htaccess to deny access to robots.txt to everyone except the search engine bots (User-Agent: Googlebot, Yahoo! Slurp).
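
    Something along these lines might work as a starting point, assuming mod_setenvif is enabled and Apache 2.2 style access directives; the variable name allowed_bot is made up for this example:

    ```apache
    # Flag requests whose User-Agent header contains a known crawler name
    SetEnvIfNoCase User-Agent "Googlebot" allowed_bot
    SetEnvIfNoCase User-Agent "Slurp" allowed_bot

    <Files robots.txt>
    Order Deny,Allow
    Deny from all
    # Only requests flagged above get through
    Allow from env=allowed_bot
    </Files>
    ```

    Anyone can send a fake User-Agent header, so this only keeps out casual visitors.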
     
    D3xter, Apr 7, 2010 IP
  6. niliven

    #6
    The purpose of the "robots.txt" file is to instruct search engine web bots or spiders as to which content should be indexed and which content should be avoided. There are three important tips that can help you to gain the maximum benefits from using this file in the right way on your server.
     
    niliven, Apr 19, 2010 IP
  7. chauhanmanish

    #7
    You can do that by blocking robots.txt for everyone except search engine bots through .htaccess. In practice this approach is not very feasible, as Google doesn't provide a list of its crawlers' IPs.

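    # Apache 2.2 access-control syntax: deny everyone, then let listed addresses through
    <Files robots.txt>
    Order Deny,Allow
    Deny from All
    # Replace the placeholder below with the crawler IPs you want to allow
    Allow from *input bots ip*
    </Files>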
     
    chauhanmanish, Apr 21, 2010 IP