One of my client's sites is using far too much bandwidth on a regular basis. The hosting company tells me this is down to unusually heavy attention from search engine spider bots, and that we need to set up a robots.txt file to limit it. I have absolutely no idea how to do this. Can anyone advise in very, very simple terms please? Obviously I don't want to block the search engines altogether, just limit them. There are a lot of unidentified bots hitting the site, but Google is also giving it far too much attention. Thanks in advance.
Thank you for that. So how would I disallow these 'unidentified robots'? And is there a way to lessen the amount of attention Google gives the site, for example?
cossa explained how to filter directories and files for robots. You can find a list of the robots that matter most to your site and allow only those. There are also a few online tools for generating a robots.txt file; just Google them.
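For example, a minimal robots.txt (a plain text file uploaded to the site's root, e.g. example.com/robots.txt) could block everything by default and then allow in only the bots you care about. The directory names below are just placeholders, so swap in whatever is actually eating your bandwidth:

    # Bots not listed below get nothing
    User-agent: *
    Disallow: /

    # Googlebot and Bingbot may crawl the site, but stay out of the heavy folders
    User-agent: Googlebot
    User-agent: Bingbot
    Disallow: /images/
    Disallow: /downloads/

Bear in mind robots.txt is only a request. Well-behaved crawlers honour it, but rogue bots will ignore it completely.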
Buddy, you can slow down Googlebot's crawl rate if you want to, and as for the unidentified bots, you can block them outright using .htaccess.
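A rough sketch of both, with made-up bot names you would replace with whatever shows up in your server logs. Some crawlers (Bing and Yandex, for instance) honour a Crawl-delay line in robots.txt, though as far as I know Googlebot ignores it and its crawl rate is adjusted in Google Webmaster Tools instead:

    # Ask Bingbot to wait 10 seconds between requests
    User-agent: Bingbot
    Crawl-delay: 10

And in .htaccess (Apache with mod_rewrite enabled) you can refuse requests from unwanted user agents:

    RewriteEngine On
    # Return 403 Forbidden to these user agents (example names only)
    RewriteCond %{HTTP_USER_AGENT} (BadBot|EvilScraper|GrabberBot) [NC]
    RewriteRule .* - [F,L]

Bots that fake a normal browser user agent will slip past this, but it usually catches the worst offenders.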