If you would like to block unwanted user agents that scrape your site, or bots that you don't want accessing your site, add the following statements to your httpd.conf file. Below I have added a big list of user agents that I deny access to; you may want to edit the list depending on your requirements. Hope this helps.
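For example, something along these lines (the user-agent names here are only placeholders for whatever bots you want to deny, and the syntax assumes Apache 2.2-style mod_setenvif and mod_authz_host directives; on Apache 2.4 you would use "Require not env bad_bot" instead):

    # Flag any request whose User-Agent matches a scraper you want to keep out
    SetEnvIfNoCase User-Agent "HTTrack"     bad_bot
    SetEnvIfNoCase User-Agent "WebCopier"   bad_bot
    SetEnvIfNoCase User-Agent "EmailSiphon" bad_bot

    <Directory "/var/www/html">
        # Serve everyone except the flagged bots
        Order Allow,Deny
        Allow from all
        Deny from env=bad_bot
    </Directory>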
If I'm going to use a bot to scrape a site, I set the user agent to a valid IE user agent. These types of lists don't stop much of anything and just add extra processing time on your web server. If someone is scraping your site, you're better off blocking their IP at the network level. I used to do that with my Windows 2000 server: I just routed IPs to never-never land using Windows' built-in functions for that sort of thing. It's a lot more efficient than making Apache do it. Linux can also block IPs at the network level. You could even have your site keep track of what IPs are downloading what and auto-block an IP if you felt so inclined.
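A rough sketch of that null-route trick (203.0.113.45 stands in for the offending IP, and 192.168.0.250 is just meant to be an unused address on your LAN; the iptables line is the Linux equivalent):

    rem Windows: persistently route the scraper's traffic to a dead gateway
    route -p add 203.0.113.45 mask 255.255.255.255 192.168.0.250

    # Linux: drop the address at the firewall instead
    iptables -A INPUT -s 203.0.113.45 -j DROP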
I think this can be done in robots.txt as well; I saw a robots.txt generator that did something to that effect.
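Something like the snippet below is all it takes, though keep in mind robots.txt is only a request: it stops nothing except bots that actually read and honor it (the bot name is just an example):

    User-agent: BadBot
    Disallow: /

    User-agent: *
    Disallow: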
I agree with you, although finding the IP and blocking it manually takes time and effort. I am looking for a way to do it automatically, meaning if someone consumes too much bandwidth, stop them from using the site altogether, and I couldn't find a better way to do it. Let me know if you have a better approach.
Apache 2, I believe, has bandwidth throttling (via add-on modules). I also design my sites so everything (except images and JS) goes through index.php, even downloads. So if I feel the need, I can log per-IP usage and issue the Windows command to reroute an IP automatically if it uses more bandwidth per day than allowed.
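A rough sketch of what that looks like in index.php (the 200 MB quota, the log file path and the 192.168.0.250 dead-gateway address are all made-up values, and the route command is the same Windows null-route trick mentioned earlier; on Linux you would shell out to iptables instead):

    <?php
    // Per-IP daily bandwidth accounting for an index.php front controller.
    $quotaBytes = 200 * 1024 * 1024;                      // daily allowance per IP
    $ip         = $_SERVER['REMOTE_ADDR'];
    $logFile    = '/tmp/ip_usage_' . date('Ymd') . '.dat';

    // Load today's counters (a serialized array of ip => bytes).
    $usage = file_exists($logFile) ? unserialize(file_get_contents($logFile)) : array();

    if (isset($usage[$ip]) && $usage[$ip] > $quotaBytes) {
        // Over quota: null-route the offender and refuse the request.
        exec('route -p add ' . escapeshellarg($ip) . ' mask 255.255.255.255 192.168.0.250');
        header('HTTP/1.1 403 Forbidden');
        exit;
    }

    // Buffer the page so we can count how many bytes this request sends.
    ob_start();
    // ... normal index.php routing / file serving goes here ...
    $bytesSent = ob_get_length();
    ob_end_flush();

    // Record the usage for the next request to check against.
    $usage[$ip] = (isset($usage[$ip]) ? $usage[$ip] : 0) + $bytesSent;
    file_put_contents($logFile, serialize($usage));
    ?>

There's no file locking in there, so under real traffic you'd want flock() or a database, but it shows the idea.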