I just don't want to make Google go away or anything like that. But some unknown engines are eating all my bandwidth (in one day, one "crawl" took 4.5 GB!). Give me some advice, please.
If there is a specific spider doing it, block it. There isn't any way to allow something, so it's best to block the ones you don't want. You can do this in robots.txt. http://sitemaps.blogspot.com/2006/02/using-robotstxt-file.html
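A minimal sketch of what that looks like (the bot names here are placeholders; swap in the actual user-agent names you see in your logs):

    User-agent: SomeBadBot
    Disallow: /

    User-agent: AnotherBadBot
    Disallow: /

    User-agent: *
    Disallow:

The last record leaves everything open for all other crawlers, so Google is unaffected.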
There's no way to. Robots.txt is a blacklist-style blocking method. You can only tell it what to block, not what to allow.
I will code myself a solution, similar to one I read about somewhere: I will put a small image (1 pixel) that links to a certain URL, and the link will be nofollow. A normal user will not see it and Google won't follow it. Each time someone goes to that URL they will be banned and an email will be sent to me, so I can verify whether to leave the ban in place or remove it. I will see if it works well.
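For anyone wanting to try the same trap, here is a rough sketch of one way it could be done. The trap URL, file paths, and email address are all placeholders, and you'd probably also want to Disallow the trap URL in robots.txt so well-behaved bots never touch it. The hidden link might look like:

    <a href="/trap/" rel="nofollow"><img src="pixel.gif" width="1" height="1" alt=""></a>

And a hypothetical CGI handler for the trap URL, which records the visitor's IP in a blocklist and emails the admin for review:

    #!/usr/bin/env python
    # Sketch only: logs the trap visitor's IP to a blocklist file and
    # emails the admin. BLOCKLIST and ADMIN are placeholder values.
    import os
    import smtplib
    from email.mime.text import MIMEText

    BLOCKLIST = "/var/www/banned_ips.txt"  # file your ban logic reads
    ADMIN = "you@example.com"              # where the alert goes

    ip = os.environ.get("REMOTE_ADDR", "unknown")
    agent = os.environ.get("HTTP_USER_AGENT", "unknown")

    # Append the offending IP to the blocklist
    with open(BLOCKLIST, "a") as f:
        f.write(ip + "\n")

    # Notify the admin so the ban can be reviewed later
    msg = MIMEText("Trap hit by %s (%s)" % (ip, agent))
    msg["Subject"] = "Bot trap triggered"
    msg["From"] = ADMIN
    msg["To"] = ADMIN
    s = smtplib.SMTP("localhost")
    s.sendmail(ADMIN, [ADMIN], msg.as_string())
    s.quit()

    # Minimal CGI response
    print("Content-Type: text/plain")
    print("")
    print("Access denied.")

Something else (an .htaccess rule, a firewall script, etc.) would then have to actually enforce the ban against the IPs in that file.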
As a general rule, robots.txt is a WISH list and NO way to CONTROL: the spiders/bots listed in robots.txt are in no way forced to follow the rules specified there.

If you wish to CONTROL access on a per-bot (agent) basis, then you use .htaccess, as shown in the example below. This way you allow all agents (bots) and block access only for specific listed bots. This is full, true control, as it works for all bots whether or not they want to obey the robots.txt rules! If you google for "bad bots", you will find lists of bots that are either used for non-friendly purposes or just use your resources without any benefit for you.

Blocking on an agent-by-agent basis gives you two benefits:

- all are welcome by default, hence NO accidental access denial for an unknown but real SE bot
- all those that YOU find abusing your resources are added on a one-by-one basis, based on your own access_log records.

If you look at your daily/weekly access stats and see a new bot looping or abusing your resources, just add that new bot with a new line to your "stay out" list in your .htaccess file. If a single SE bot has GB of daily traffic, it is usually a NEW SE with looping bots. If that new SE is of any value, you may take the time to inform that SE with a few lines from your access_log file; or, if you consider that SE to be useless for your real/human traffic, you may simply add it to the deny-access list.
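A sketch of such an .htaccess blocklist using Apache's SetEnvIfNoCase (the bot names below are examples only; build your own list from your access_log):

    # Flag abusive bots by their User-Agent string (names are examples)
    SetEnvIfNoCase User-Agent "BadBot" bad_bot
    SetEnvIfNoCase User-Agent "EvilCrawler" bad_bot
    SetEnvIfNoCase User-Agent "LoopingSpider" bad_bot

    # Allow everyone else, deny only the flagged agents
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bot

To ban a newly spotted bot, you just append one more SetEnvIfNoCase line with its user-agent string; everything not listed keeps full access.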