Is it normal that /robots.txt is the most downloaded item in server stats?

Discussion in 'robots.txt' started by TheSyndicate, Mar 29, 2012.

  1. #1
    Is it normal for /robots.txt to be the most downloaded item in the server stats? On many of my domains, robots.txt shows the most hits.

    TheSyndicate, Mar 29, 2012
  2. SoftCloud

    SoftCloud Well-Known Member

    #2
    Yeah, it's normal. I think every time a search bot visits your site, it re-downloads the file to check whether anything has changed. It's the same for me too.

    SoftCloud, Apr 5, 2012
  3. henrywilliams

    henrywilliams Peon

    #3
    Yes, it is, because every time a search engine visits your site it downloads robots.txt first and then starts crawling your site according to the instructions you have provided in robots.txt.

    henrywilliams, Apr 6, 2012
  4. TheSyndicate

    TheSyndicate Prominent Member

    #4
    Can't they just download it once? It adds up to a lot of bandwidth when they download it so many times.

    TheSyndicate, Apr 6, 2012
  5. ryan_uk

    ryan_uk Illustrious Member

    #5
    It's really not much bandwidth. Robots.txt is very small (just text), unless of course you have a lot of instructions in there? If you updated robots.txt and it wasn't downloaded again, the bots could end up doing something you don't want them to, which is why they keep downloading it. I don't know about other bots, but Googlebot will download it once every 24 hours.

    ryan_uk, Apr 7, 2012
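
To put a rough number on the bandwidth point, here is a minimal sketch of the arithmetic. The function name and the figures in the usage note (a 500-byte file fetched 200 times a day) are hypothetical, not taken from anyone's stats in this thread, and HTTP header overhead is ignored.

```python
def robots_bandwidth_bytes(file_size_bytes, fetches_per_day, days=30):
    """Estimate the bytes spent serving robots.txt over a period.

    Counts only the response body; HTTP headers add a little
    on top of this for every request.
    """
    return file_size_bytes * fetches_per_day * days


# Hypothetical figures: a 500-byte file fetched 200 times a day for 30 days.
monthly_bytes = robots_bandwidth_bytes(500, 200)
print(f"{monthly_bytes / 1_000_000:.1f} MB per month")
```

With those assumed figures the total is a few megabytes per month, so a high hit count for robots.txt does not by itself mean a high bandwidth cost.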
  6. TheSyndicate

    TheSyndicate Prominent Member

    #6
    On some of my domains, robots.txt is the most downloaded item of all?

    TheSyndicate, Apr 7, 2012
  7. ryan_uk

    ryan_uk Illustrious Member

    #7
    So that just means you get a lot of robots visiting you. Have you checked AWStats or similar? It's good for giving you an idea of bot visits. You could always block any you don't want visiting (using .htaccess, based on their user-agent).

    ryan_uk, Apr 7, 2012
  8. TheSyndicate

    TheSyndicate Prominent Member

    #8
    Yes, I get a lot, sometimes way too many! They have even brought the server down at times. Most of the bandwidth goes to robots, not visitors.

    TheSyndicate, Apr 8, 2012
  9. ryan_uk

    ryan_uk Illustrious Member

    #9
    You need to analyse your log files with AWStats, Webalizer or similar, and figure out which bots are visiting a lot. It might not even be classed as a bot; it could be a particular user-agent (or IP) that comes up a lot and looks a bit unusual, not a normal browser one.

    ryan_uk, Apr 8, 2012
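
If AWStats or Webalizer is not to hand, the same kind of per-user-agent tally can be sketched in a few lines of Python. This is only a rough sketch: it assumes the server writes the common "combined" access log format, and the names `LOG_LINE` and `summarize` are made up for illustration.

```python
import re
from collections import Counter

# Assumed format: Apache/Nginx "combined" access log, e.g.
# 1.2.3.4 - - [08/Apr/2012:10:00:00 +0000] "GET /robots.txt HTTP/1.1" 200 120 "-" "SomeBot/1.0"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines, path=None):
    """Count hits and response bytes per user-agent.

    If `path` is given (for example "/robots.txt"), only requests
    for exactly that path are counted. Unparseable lines are skipped.
    """
    hits = Counter()
    bytes_sent = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        if path is not None and match.group("path") != path:
            continue
        agent = match.group("agent")
        hits[agent] += 1
        size = match.group("size")
        # A size of "-" means no body was sent (e.g. a 304 response).
        bytes_sent[agent] += 0 if size == "-" else int(size)
    return hits, bytes_sent
```

Calling `summarize(open("access.log"), "/robots.txt")` (the log path is a placeholder) and then `hits.most_common(10)` would show which user-agents request robots.txt most often. Running it again with `path=None` shows which ones account for the most traffic overall, which is the more useful figure when a server is being overloaded.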
  10. TheSyndicate

    TheSyndicate Prominent Member

    #10
    Yeah, I have done that and blocked some bad ones, but mainly it is Baidu or Google that keep loading robots.txt.

    TheSyndicate, Apr 8, 2012
  11. ryan_uk

    ryan_uk Illustrious Member

    #11
    Google will keep doing it every 24 hours (unless you've used .htaccess to change the robots.txt expiry time). I don't know about Baidu, but I would suspect it's the same. That file being requested a few times per day won't cause a server to go down. There's something else wrong with your site or web host.

    ryan_uk, Apr 9, 2012
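
The "expiry time" here presumably reaches crawlers as the `max-age` directive of the `Cache-Control` response header. Google's robots.txt documentation describes a cache lifetime of generally up to 24 hours that can be adjusted based on that header; other crawlers may ignore it. As a minimal sketch for checking what a server currently sends, the hypothetical helper below pulls `max-age` out of a header value that you would read from the response to a request for /robots.txt.

```python
def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None.

    `cache_control` is the raw header value, for example
    "public, max-age=86400". Directive names are case-insensitive.
    """
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.strip().lower() == "max-age":
            value = value.strip().strip('"')
            if value.isdigit():
                return int(value)
    return None
```

For example, `max_age_seconds("public, max-age=86400")` gives 86400 seconds, which is 24 hours. A result of `None` means the header carries no usable `max-age`, so the crawler falls back to its own default.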
  12. TheSyndicate

    TheSyndicate Prominent Member

    #12
    No, robots.txt won't cause the server to go down, but it is kind of depressing to see all the bandwidth wasted on a robots file. You say you can change this in .htaccess? Can you do it server-wide on a dedicated server? What should you set it to, one month? I hardly ever change robots.txt.

    TheSyndicate, Apr 9, 2012