how to block SemrushBot

Discussion in 'Site & Server Administration' started by JEET, Apr 18, 2017.

  1. #1
    Hi,
    I want to block SemrushBot using an .htaccess file.
    How do I do it?

    It's just using my bandwidth, and the info it collects is being used by my competition to outrank me in Google etc. Huh...
    I want to block it out "completely". Not even my 1 KB favicon...
    Thanks
     
    JEET, Apr 18, 2017 IP
  2. mmerlinn

    mmerlinn Notable Member

    Messages:
    1,939
    Likes Received:
    229
    Best Answers:
    6
    Trophy Points:
    240
    #2
    mmerlinn, Apr 18, 2017 IP
  3. qwikad.com

    qwikad.com Illustrious Member Affiliate Manager

    Messages:
    5,683
    Likes Received:
    1,055
    Best Answers:
    20
    Trophy Points:
    400
    #3
    qwikad.com, Apr 18, 2017 IP
    JEET and Mehdi.b like this.
  4. Mehdi.b

    Mehdi.b Active Member

    Messages:
    354
    Likes Received:
    50
    Best Answers:
    0
    Trophy Points:
    65
    #4
    I suggest .htaccess and PHP rather than robots.txt as the only solution. robots.txt might work, but not all bots follow it to the letter, so to be safe follow the link from @qwikad.com and try .htaccess.
    Just a precaution, though: please make backups of everything before you start messing about with things if you are not confident or not a programmer. Things can get real ugly real soon.
     
    Mehdi.b, Apr 18, 2017 IP
  5. JEET

    JEET Well-Known Member

    Messages:
    2,187
    Likes Received:
    100
    Best Answers:
    1
    Trophy Points:
    185
    #5
    I ended up doing this. But @qwikad.com, is there a reason to use BrowserMatchNoCase, like in the link you gave?
    Should I uncomment the BrowserMatchNoCase line as well?


    .htaccess rule:

    Options -Indexes +FollowSymLinks
    RewriteEngine on

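    # Alternative from the linked article. Note that it sets bad_bot, not bad_user,
    # so it would also need a matching "Deny from env=bad_bot" line.
    # BrowserMatchNoCase SemrushBot bad_bot

    # Flag any request whose User-Agent contains "semrush" (case-insensitive).
    # The first pattern is redundant, since "semrush" already matches "SemrushBot".
    SetEnvIfNoCase User-Agent "SemrushBot" bad_user
    SetEnvIfNoCase User-Agent "semrush" bad_user
    # Apache 2.2-style access control: refuse flagged requests.
    Deny from env=bad_user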

    Thanks :)
    This thing is trying to access my site so often that PHP and MySQLi are giving up...
     
    JEET, Apr 18, 2017 IP
  6. JEET

    #6
    It's still coming in, using the IPs below. I can see in cPanel that 13,945 bytes of data were served to it per hit.

    46.229.164.98
    46.229.168.72
    46.229.168.71

    I ended up doing this, but I need a more permanent solution, such as blocking that whole range of IPs.
    Could anyone please post a way to block that IP range completely?

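    <?php
    // Reject any request whose User-Agent contains "semrush" (case-insensitive).
    $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    if (preg_match('/semrush/i', $ua)) {
        header('HTTP/1.1 403 Forbidden');
        exit; // stop here so the rest of the page is not generated or sent
    }
    ?>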
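
    For the IP-range request above, a minimal .htaccess sketch in the same Apache 2.2-style syntax as the rule in post #5. The /20 block is an assumption: it is only the smallest single CIDR range that contains the three addresses listed above, not a confirmed list of SemrushBot addresses, and it would also block any unrelated host in that range.

    ```apache
    # Assumption: 46.229.160.0/20 (46.229.160.0 to 46.229.175.255) is simply the
    # smallest CIDR block covering 46.229.164.98, 46.229.168.71 and 46.229.168.72.
    # Allow everyone first, then refuse anything that matches a Deny line.
    Order Allow,Deny
    Allow from all
    Deny from 46.229.160.0/20
    ```

    On Apache 2.4 these directives need mod_access_compat; the native equivalent would be a RequireAll block containing "Require all granted" and "Require not ip 46.229.160.0/20".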
     
    JEET, Apr 19, 2017 IP
  7. mmerlinn

    #7
    I wish I could read that article. However, securepubads.g.doubleclick.net keeps hijacking that site, leaving me with a blank white page. Even after hours of waiting for securepubads.g.doubleclick.net to finish whatever it is doing, I am still looking at a blank white hijacked page. I can't even view the source to read it that way. Unfortunately, doubleclick.net hijacks EVERY page that some idiot has installed it on, so I get lots of blank white 100% CPU hogs as a result.
     
    mmerlinn, Apr 19, 2017 IP
  8. JEET

    #8
    @mmerlinn It's the same code I posted here, the .htaccess one in post #5. The link uses the "BrowserMatchNoCase" line instead of "SetEnvIfNoCase".
    The SetEnvIfNoCase lines are from the Stack Overflow website.
    If you want to use BrowserMatchNoCase, just remove the "#" sign from the beginning. :)
    I think you can use both SetEnvIfNoCase and BrowserMatchNoCase at the same time, but I'm not sure about it...

    Anyway, only the PHP solution seems to be working for me...
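
    On using both directives: in Apache's mod_setenvif, BrowserMatchNoCase is shorthand for SetEnvIfNoCase applied to the User-Agent header, so the two lines below should flag the same requests and either one alone should be enough. A sketch, reusing the bad_user variable name from the rule in post #5 and assuming Apache 2.2-style access directives:

    ```apache
    # Either line alone marks requests whose User-Agent contains "semrush" (any case).
    BrowserMatchNoCase semrush bad_user
    SetEnvIfNoCase User-Agent semrush bad_user

    # Allow everyone except requests carrying the bad_user flag.
    Order Allow,Deny
    Allow from all
    Deny from env=bad_user
    ```

    What matters is that the variable named in the Deny line matches the one the matching directive sets.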
     
    JEET, Apr 19, 2017 IP
  9. RobertEV

    RobertEV Greenhorn

    Messages:
    1
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    11
    #9
    Had the same problem, and emailed them explaining that they were crawling 60 to 70 thousand pages a day on my site.
    Fairly quickly, I got an email back from a nice lady who said she had just added my domains to their no crawl list.
    Haven't had a problem since.
     
    RobertEV, May 9, 2017 IP
  10. santiago1745

    santiago1745 Peon

    Messages:
    1
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #10
    Last edited by a moderator: Dec 31, 2018
    santiago1745, Dec 31, 2018 IP