Please advise on robots.txt file

Discussion in 'HTML & Website Design' started by 1812, Jan 30, 2012.

  1. #1
    One of my client's sites is using far too much bandwidth on a regular basis. The hosting company tells me this is due to unusually heavy crawling by search engine spiders, and that we need to set up a robots.txt file to limit it. I have absolutely no idea how to do this. Can anyone advise in very, very simple terms, please? I obviously don't want to block the search engines, just limit how much they crawl. A lot of unidentified bots are hitting the site, but Google is also giving it far too much attention.

    Thanks in advance
     
    1812, Jan 30, 2012
  2. cossa

    #2
    User-agent: *
    Allow: /directory-to-allow/
    Disallow: /directory-to-deny/

    That's it.
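
    A quick way to see how a crawler would read these rules is Python's standard urllib.robotparser. This is only a sketch: ExampleBot and the page paths are made-up names for illustration, and the directory names are the placeholders from the rules above.

```python
from urllib.robotparser import RobotFileParser

# Rules from the post above; the directory names are placeholders.
ROBOTS_TXT = """\
User-agent: *
Allow: /directory-to-allow/
Disallow: /directory-to-deny/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if the rules let `agent` fetch `path`."""
    return parser.can_fetch(agent, path)


for path in ("/directory-to-allow/page.html",
             "/directory-to-deny/page.html",
             "/anything-else/"):
    print(path, is_allowed(path))
```

    Note that a path matching no rule is still allowed, so these lines only keep robots out of the directories you actually list.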
     
    cossa, Jan 30, 2012
  3. 1812

    #3
    Thank you for that. So how would I disallow these 'unidentified robots'? And is there a way to reduce how often Google crawls the site, for example?
     
    1812, Jan 30, 2012
  4. popartns

    #4
    cossa explained how to allow or block directories and files for robots. You can find a list of the robots that matter most to your website and then give each of them its own User-agent entry that allows it. There are also a few online tools for creating robots.txt; just google them.
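
    To make the allow-list idea concrete, here is a sketch that lets two named crawlers in and asks every other robot to stay out, checked with Python's standard urllib.robotparser. Googlebot and bingbot are just example choices and SomeUnknownBot is a made-up name. Keep in mind that robots.txt is only a request, so bots that ignore it will not be stopped by this.

```python
from urllib.robotparser import RobotFileParser

# Example allow-list: an empty Disallow means "nothing is off limits".
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: bingbot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Named crawlers match their own group; everyone else falls through to "*".
google_ok = parser.can_fetch("Googlebot", "/page.html")
bing_ok = parser.can_fetch("bingbot", "/page.html")
unknown_ok = parser.can_fetch("SomeUnknownBot", "/page.html")

print(google_ok, bing_ok, unknown_ok)
```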
     
    popartns, Jan 31, 2012
  5. WebCare||360

    #5
    Buddy, you can slow down Googlebot's crawl rate if you want to, and you can restrict the unidentified bots by using .htaccess.
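
    On the crawl-rate part: some crawlers (Bing, for example) read a Crawl-delay line in robots.txt, while Googlebot ignores that directive and its crawl rate has to be set in Google's webmaster tools instead. A minimal sketch, assuming a 10-second delay picked purely for illustration, parsed with Python's standard urllib.robotparser:

```python
from urllib.robotparser import RobotFileParser

# Example rules; the 10-second value is an arbitrary illustration.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Seconds a compliant crawler is asked to wait between requests.
delay = parser.crawl_delay("bingbot")
print(delay)
```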
     
    WebCare||360, Jan 31, 2012