Looks like bots don't care about the HTTPS redirect in my .htaccess and keep crawling the HTTP version of the site. Blocking them via the .htaccess only works for the HTTPS version. Any ideas how to block them from accessing either version of the site, period?
That sounds like a problem in the .htaccess rather than the bots overriding anything. Can you ask them nicely in the robots.txt?
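For what it's worth, here is a rough sketch of how the .htaccess could be ordered so the block applies before the redirect. It assumes mod_rewrite is enabled, that the same .htaccess is read for both HTTP and HTTPS requests, and that the bots can be matched by user agent. "BadBot" and "EvilScraper" are made-up placeholder names, not real crawlers:

```apache
RewriteEngine On

# 1. Refuse matching user agents first, whatever the scheme.
#    "BadBot" and "EvilScraper" are hypothetical placeholders.
RewriteCond %{HTTP_USER_AGENT} (BadBot|EvilScraper) [NC]
RewriteRule ^ - [F]

# 2. Only then send the remaining plain-HTTP requests to HTTPS.
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```

If the redirect rule comes first, a bot requesting an HTTP URL gets the 301 and never reaches the block rule, which may be why the block only seems to work on HTTPS. If the HTTP and HTTPS sites are served from different directories or virtual hosts, the block would need to be present in both.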
One option could be to implement a CAPTCHA on the website to prevent bots from accessing it. Another solution may be to use a bot management tool or service to automatically block known bots. Also, regularly monitoring website traffic and manually blocking suspicious IP addresses could help prevent unwanted bot access.
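As an illustration of the manual IP blocking mentioned above, something like the following could go in the .htaccess. This is only a sketch, assuming Apache 2.4 (mod_authz_core and mod_authz_host) and that the server allows authorization directives in .htaccess. The addresses are reserved documentation ranges used as placeholders, not real offenders:

```apache
# Allow everyone except the listed addresses.
# 203.0.113.45 and 198.51.100.0/24 are placeholder values.
<RequireAll>
    Require all granted
    Require not ip 203.0.113.45
    Require not ip 198.51.100.0/24
</RequireAll>
```

Authorization rules like these do not depend on the request scheme, so they should apply to HTTP and HTTPS alike as long as both are served from the directory containing this .htaccess.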