Hi, I want to block SemrushBot using the .htaccess file. How do I do it? It's just using my bandwidth, and the info it collects is being used by my competition to outrank me in Google, etc. I want to "completely" block it out. Not even my 1 KB favicon should be served to it... Thanks
I suggest .htaccess and PHP rather than robots.txt as the only solution. Robots.txt might work, but not all bots follow it to the letter, so to be safe follow the link from @qwikad.com and try .htaccess. Just a precaution though: please make backups of everything before you start messing about, especially if you are not confident or not a programmer; things can get real ugly real soon.
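For what it's worth, the robots.txt version would just be the two lines below. Semrush says its bot obeys robots.txt, but since plenty of other bots don't, treat this as an extra layer on top of the .htaccess rules, not a replacement:

User-agent: SemrushBot
Disallow: /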
I ended up doing this. But @qwikad.com, is there a reason to use BrowserMatchNoCase like in the link you gave? Should I uncomment the BrowserMatchNoCase line as well?

.htaccess rule:

Options -Indexes +FollowSymLinks
RewriteEngine on
# BrowserMatchNoCase SemrushBot bad_bot
SetEnvIfNoCase User-Agent "SemrushBot" bad_user
SetEnvIfNoCase User-Agent "semrush" bad_user
Deny from env=bad_user

Thanks. This thing is trying to access my site so often that PHP & MySQLi are giving up...
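One thing I'm not sure about after reading around: Deny from env= is apparently the old Apache 2.2 syntax, and on Apache 2.4 it only works when mod_access_compat is loaded. In case that's the problem, my understanding is the 2.4-native equivalent of the block above is roughly this (can anyone confirm?):

SetEnvIfNoCase User-Agent "SemrushBot" bad_user
SetEnvIfNoCase User-Agent "semrush" bad_user
<RequireAll>
    # allow everyone except requests where bad_user got set above
    Require all granted
    Require not env bad_user
</RequireAll>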
It's still coming in using the IPs below. I can see in cPanel that 13945 bytes of data were served to it (per hit):

46.229.164.98
46.229.168.72
46.229.168.71

I ended up doing this, but I need a more permanent solution, like blocking that whole range of IPs. Anyone please post that solution for blocking that IP range completely.

<?php
if(preg_match('/Semrush/is', $_SERVER['HTTP_USER_AGENT'])){
    header('HTTP/1.0 403 Forbidden');
    exit; // stop here, otherwise the rest of the page still gets sent after the 403 header
}
?>
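Something like the lines below is what I had in mind for the IP range part, if anyone can confirm the syntax. I'm only guessing the /24 ranges from the three IPs above; Semrush probably uses more ranges and they can change, so treat these as placeholders:

<IfModule mod_authz_core.c>
    # Apache 2.4 syntax
    <RequireAll>
        Require all granted
        Require not ip 46.229.164.0/24
        Require not ip 46.229.168.0/24
    </RequireAll>
</IfModule>
<IfModule !mod_authz_core.c>
    # Apache 2.2 syntax
    Order Allow,Deny
    Allow from all
    Deny from 46.229.164.0/24
    Deny from 46.229.168.0/24
</IfModule>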
I wish I could read that article. However, securepubads.g.doubleclick.net keeps hijacking that site, rendering me a blank white page. Even after hours of waiting for securepubads.g.doubleclick.net to finish doing whatever it is doing, I am still looking at a blank white hijacked page. I can't even view the source to read it that way. Unfortunately, doubleclick.net hijacks EVERY page that some idiot has installed doubleclick.net on, so I get lots of blank white 100% CPU hogs as a result.
@mmerlinn It's the same code I posted here, the .htaccess one (2 posts above this one). The link uses the "BrowserMatchNoCase" line instead of "SetEnvIfNoCase"; the SetEnvIfNoCase lines are from a Stack Overflow answer. If you want to use BrowserMatchNoCase, just remove the "#" sign from the beginning of that line. I think you can use both SetEnvIfNoCase and BrowserMatchNoCase at the same time, but I'm not sure about it... Anyway, only the PHP solution seems to be working for me...
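Looking at the Apache mod_setenvif docs now, it seems BrowserMatchNoCase is just shorthand for SetEnvIfNoCase matched against the User-Agent header, so these two lines should be interchangeable (and fine to use together):

BrowserMatchNoCase "SemrushBot" bad_user
SetEnvIfNoCase User-Agent "SemrushBot" bad_user

# Either one sets the bad_user variable that the Deny line checks:
Deny from env=bad_user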
Had the same problem, and emailed them explaining that they were crawling 60 to 70 thousand pages a day on my site. Fairly quickly, I got an email back from a nice lady who said she had just added my domains to their no crawl list. Haven't had a problem since.
Hello, I have a question: in the PHP code, which is correct?

1) if(preg_match('/Semrush/is
2) if(preg_match('/Semrush/i

Should it be with "i" only, or with "is"? I ask because the link below says it should be only "i". https://www.blackhatworld.com/seo/block-semrush.838057/ Please confirm.
@JEET were you able to block SemrushBot, or are you still facing this issue? If you managed to solve it, please share with us.
Generally, we don't recommend preventing bot attacks via .htaccess, but for this purpose you can add the code below to your .htaccess file:

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} semrush [NC]
RewriteRule .* - [F,L]
</IfModule>
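If you ever need to block more than one crawler with the same rule, the pattern extends with the OR flag. The AhrefsBot and MJ12bot lines below are only examples of other user-agent strings; swap in whatever you actually want to block:

<IfModule mod_rewrite.c>
RewriteEngine On
# each condition matches a bot name anywhere in the User-Agent, case-insensitively
RewriteCond %{HTTP_USER_AGENT} semrush [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ahrefsbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} mj12bot [NC]
# any match gets a 403 Forbidden for every URL
RewriteRule .* - [F,L]
</IfModule>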
@uday_yadav2 I blocked it using PHP code:

if(preg_match('/Semrush/is', $_SERVER['HTTP_USER_AGENT'])){
    header('HTTP/1.0 403 Forbidden');
    exit; // stop here so nothing else is served to the bot
}

@santiago1745 I don't know regex that much, but as far as I understand, "s" only changes how the dot (.) matches (it lets . match newlines), while "i" makes the match case-insensitive. Since there is no dot in this pattern, the "s" does nothing here, so "i" alone is fine:

2) if(preg_match('/Semrush/i
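If you want to double-check, here is a quick test you could run. The user-agent string below is only an illustration, not an exact copy of what the bot sends:

<?php
// Sanity check: /Semrush/i matches regardless of case.
$ua = 'Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)';

var_dump(preg_match('/Semrush/i', $ua)); // int(1) - matched
var_dump(preg_match('/semrush/i', $ua)); // int(1) - still matched, thanks to the "i" flag
?>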
Hello, I want to review my site and improve it so that it provides good content about making money online. Please take a look and advise me on the errors I should correct. Website link below... https://mrte-ch.blogspot.com