how to block SemrushBot

Discussion in 'Site & Server Administration' started by JEET, Apr 18, 2017.

  1. #1
    Hi,
    I want to block SemrushBot using my .htaccess file.
    How do I do it?

    It's just using up my bandwidth, and the info it collects is being used by my competition to outrank me in Google etc. Huh...
    I want to block it out "completely". Not even my 1 KB favicon...
    Thanks
     
    JEET, Apr 18, 2017 IP
  2. mmerlinn

    mmerlinn Prominent Member

    Messages:
    3,197
    Likes Received:
    818
    Best Answers:
    7
    Trophy Points:
    320
    #2
    mmerlinn, Apr 18, 2017 IP
  3. qwikad.com

    qwikad.com Illustrious Member Affiliate Manager

    Messages:
    7,151
    Likes Received:
    1,656
    Best Answers:
    29
    Trophy Points:
    475
    #3
    qwikad.com, Apr 18, 2017 IP
  4. Mehdi.b

    Mehdi.b Active Member

    Messages:
    353
    Likes Received:
    50
    Best Answers:
    0
    Trophy Points:
    65
    #4
    I suggest using .htaccess and PHP rather than relying on robots.txt as the only solution. robots.txt might work, but not all bots follow it to the letter, so to be safe follow the link from @qwikad.com and try .htaccess.
    Just a precaution though: please make backups of everything before you start messing about with things if you are not confident or not a programmer. Things can get real ugly real soon.
     
    Mehdi.b, Apr 18, 2017 IP
  5. JEET

    JEET Notable Member

    Messages:
    3,825
    Likes Received:
    502
    Best Answers:
    19
    Trophy Points:
    265
    #5
    I ended up doing this. But @qwikad.com, is there a reason to use BrowserMatchNoCase like in the link you gave?
    Should I uncomment the BrowserMatchNoCase line as well?


    .htaccess rule:

    Options -Indexes +FollowSymLinks
    RewriteEngine on

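    # Alternative from the linked article, left disabled. Note that it sets bad_bot, not bad_user.
    # BrowserMatchNoCase SemrushBot bad_bot

    # Flag requests whose User-Agent contains "semrush" (case-insensitive, so the first line is redundant)
    SetEnvIfNoCase User-Agent "SemrushBot" bad_user
    SetEnvIfNoCase User-Agent "semrush" bad_user
    # Refuse flagged requests (Apache 2.2 syntax; Apache 2.4 needs mod_access_compat)
    Deny from env=bad_user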

    Thanks :)
    This thing is trying to access my site so often that php & mysqli are giving up...
     
    JEET, Apr 18, 2017 IP
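
    The "Deny from env=" line above uses Apache 2.2 access-control syntax. Below is a minimal sketch of the same rule in Apache 2.4 syntax. It assumes mod_setenvif and mod_authz_core are loaded and that the host's AllowOverride setting permits FileInfo and AuthConfig directives in .htaccess, none of which the thread confirms.

    ```apache
    # Assumption: Apache 2.4+, with mod_setenvif and mod_authz_core enabled.
    # Flag any request whose User-Agent contains "semrush" (case-insensitive).
    SetEnvIfNoCase User-Agent "semrush" bad_user

    # Allow everyone except requests carrying the bad_user flag.
    <RequireAll>
        Require all granted
        Require not env bad_user
    </RequireAll>
    ```

    A blocked request should get a 403 Forbidden response. One way to check is to request a page with a user agent string containing "SemrushBot" (for example with curl's -A option) and look at the status code.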
  6. JEET

    #6
    It's still coming in from the IPs below. I can see in cPanel that 13945 bytes of data were served to it (per hit).

    46.229.164.98
    46.229.168.72
    46.229.168.71

    I ended up doing this, but I need a more permanent solution, like blocking that whole range of IPs.
    Can anyone please post a way to block that IP range completely?

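    <?php
    // Reject any request whose User-Agent mentions Semrush.
    if(isset($_SERVER['HTTP_USER_AGENT']) && preg_match('/Semrush/is', $_SERVER['HTTP_USER_AGENT'])){
        header('HTTP/1.0 403 Forbidden');
        exit; // stop here so the page body is not generated or sent
    }
    ?>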
     
    JEET, Apr 19, 2017 IP
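
    For the IP-range request above, here is a sketch in the same Apache 2.2 style as the earlier "Deny from" rule. The two /24 ranges are an assumption inferred only from the three addresses listed in this post. The crawler may well use other addresses, so the user-agent check is still worth keeping alongside it. On Apache 2.4 these directives need mod_access_compat.

    ```apache
    # Assumption: the two /24 ranges below are guessed from the three IPs
    # in the post, not taken from any published list of crawler addresses.
    Order Allow,Deny
    Allow from all
    Deny from 46.229.164.0/24
    Deny from 46.229.168.0/24
    ```

    With "Order Allow,Deny", a request that matches both an Allow and a Deny line is denied, so everyone is let in except the listed ranges.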
  7. mmerlinn

    #7
    I wish I could read that article. However, securepubads.g.doubleclick.net keeps hijacking that site, rendering me a blank white page. Even after hours of waiting for securepubads.g.doubleclick.net to finish doing whatever it is doing, I am still looking at a blank white hijacked page. I can't even view the source to read it that way. Unfortunately, doubleclick.net hijacks EVERY page that some idiot has installed doubleclick.net on, so I get lots of blank white 100% CPU hogs as a result.
     
    mmerlinn, Apr 19, 2017 IP
  8. JEET

    #8
    @mmerlinn It's the same code I posted here, the htaccess one (2 posts above this one). The link uses the "BrowserMatchNoCase" line instead of "SetEnvIfNoCase".
    The SetEnvIfNoCase lines are from the Stack Overflow website.
    If you want to use BrowserMatchNoCase, just remove the "#" sign from the beginning. :)
    I think you can use both SetEnvIfNoCase and BrowserMatchNoCase at the same time, but I'm not sure about it...

    Anyways, only the PHP solution seems to be working for me...
     
    JEET, Apr 19, 2017 IP
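
    On the SetEnvIfNoCase versus BrowserMatchNoCase question: per the Apache mod_setenvif documentation, BrowserMatchNoCase is shorthand for SetEnvIfNoCase applied to the User-Agent header, so using both is harmless but redundant. One thing to watch is sketched below. The commented-out line in the earlier rule sets "bad_bot", while the Deny line tests "bad_user", so uncommenting it as-is would not block anything on its own. The variable name here is simply carried over from that rule.

    ```apache
    # These two lines are equivalent; either one is enough.
    BrowserMatchNoCase semrush bad_user
    SetEnvIfNoCase User-Agent "semrush" bad_user

    # The variable tested here must match the one set above.
    Deny from env=bad_user
    ```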
  9. RobertEV

    RobertEV Greenhorn

    Messages:
    1
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    11
    #9
    Had the same problem, and emailed them explaining that they were crawling 60 to 70 thousand pages a day on my site.
    Fairly quickly, I got an email back from a nice lady who said she had just added my domains to their no crawl list.
    Haven't had a problem since.
     
    RobertEV, May 9, 2017 IP
  10. santiago1745

    santiago1745 Peon

    Messages:
    1
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #10
    Last edited by a moderator: Dec 31, 2018
    santiago1745, Dec 31, 2018 IP
  11. uday_yadav2

    uday_yadav2 Well-Known Member

    Messages:
    143
    Likes Received:
    11
    Best Answers:
    0
    Trophy Points:
    108
    #11
    @JEET were you able to block SemrushBot, or are you still facing this issue?
    If you were able to solve it, please share with us.
     
    uday_yadav2, Apr 2, 2021 IP
  12. monovm

    monovm Active Member

    Messages:
    29
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    73
    #12
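    Generally, we don’t recommend preventing bot attacks via .htaccess, but for this purpose, you can add the code below to your .htaccess file.
    <IfModule mod_rewrite.c>
    # The rewrite engine must be on for the rule to run
    RewriteEngine On
    # Return 403 Forbidden when the User-Agent contains "semrush" (case-insensitive)
    RewriteCond %{HTTP_USER_AGENT} semrush [NC]
    RewriteRule .* - [F,L]
    </IfModule>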
     
    monovm, Aug 7, 2022 IP
  13. JEET

    #13
    @uday_yadav2 I blocked it using php code.
    if(preg_match('/Semrush/is', $_SERVER['HTTP_USER_AGENT'])){
        header('HTTP/1.0 403 Forbidden');
        exit; // stop so no page content is sent after the 403
    }

    @santiago1745
    I don't know regex that much, but the "s" modifier only makes "." match newlines as well. Since there is no "." in the pattern, it has no effect here, so "i" can be used on its own without a problem.
    2) if(preg_match('/Semrush/i
     
    JEET, Aug 13, 2022 IP
  14. Zankou

    Zankou Peon

    Messages:
    2
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    1
    #14
    Hello, My dear, I want to check my site and modify it to provide good content about profit from the Internet on my site. I want to review it and advise me to correct the errors on it. Website link below...

    https://mrte-ch.blogspot.com
     
    Zankou, Dec 10, 2022 IP