Googlebot is gobbling all my bandwidth!

Discussion in 'Search Engine Optimization' started by babrees, Mar 29, 2014.

  1. #1
    Argh! I don't want to put Googlebot off, but AWStats shows Googlebot has eaten 43.88 GB this month.

    Does anybody have any suggestions or ideas on what I can do to reduce this without harming my standing with Google?
     
    babrees, Mar 29, 2014 IP
  2. Aurelius

    #2
    1,550 megabytes of bandwidth a day from Googlebot? That's crazy. Are you sure? Look at your server logs and double-check that it's really Google. Then see if there's anything you can lock the bots out of that won't hurt your rankings. How many scripts do you have running? That's where the problem could be.
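
One way to do the log check suggested here is to total the bytes served per client IP for requests whose user agent claims to be Googlebot. The following is a minimal sketch, assuming the server writes Apache's combined log format. The function name `bytes_by_ip` and the sample log lines are made up for illustration, and requests containing escaped quotes are not handled.

```python
import re
from collections import Counter

# Apache "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?P<bytes>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)


def bytes_by_ip(lines, agent_substring="Googlebot"):
    """Sum response bytes per client IP for requests whose user agent
    contains agent_substring (case-insensitive)."""
    totals = Counter()
    needle = agent_substring.lower()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in combined format
        if needle not in match.group("agent").lower():
            continue  # some other client
        size = match.group("bytes")
        # Apache logs "-" when no body was sent (e.g. 304 responses).
        totals[match.group("ip")] += 0 if size == "-" else int(size)
    return totals


# Illustrative sample lines (hypothetical traffic, not from the thread).
SAMPLE = [
    '66.249.66.1 - - [29/Mar/2014:10:00:00 +0000] "GET /a HTTP/1.1" 200 1500 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [29/Mar/2014:10:00:05 +0000] "GET /b HTTP/1.1" 304 - '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [29/Mar/2014:10:00:07 +0000] "GET /c HTTP/1.1" 200 900 '
    '"-" "Googlebot/2.1"',
    '198.51.100.7 - - [29/Mar/2014:10:00:09 +0000] "GET /d HTTP/1.1" 200 4000 '
    '"-" "Mozilla/5.0 (Windows NT 6.1)"',
]

if __name__ == "__main__":
    for ip, total in bytes_by_ip(SAMPLE).most_common():
        print(ip, total)
```

To run it against a real log, pass an open file instead of `SAMPLE`. An IP near the top of the list that does not belong to Google is a likely impostor.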

    Aurelius
     
    Aurelius, Mar 30, 2014 IP
  3. kalseo

    #3
    Most likely it's not the real thing. Anybody can name a bot "Googlebot" and eat up your bandwidth.
    You can block bad bots through the .htaccess file.
    Here we go:

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto: [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
    RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
    RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
    RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
    RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
    RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
    RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
    RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
    RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
    RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
    RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
    RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
    RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
    RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
    RewriteCond %{HTTP_USER_AGENT} ^Zeus
    ## Note: The final RewriteCond must NOT use the [OR] flag.

    ## Return 403 Forbidden error.
    RewriteRule .* - [F,L]
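
A user-agent list like this cannot catch a bot that calls itself Googlebot, because the header is whatever the client chooses to send. Google's documented check is a reverse DNS lookup on the requesting IP, confirming the hostname is under googlebot.com or google.com, followed by a forward lookup that must return the same IP. Below is a minimal sketch of that check. The function name `is_real_googlebot` and its injectable lookup parameters are assumptions made for illustration, and the default forward lookup handles IPv4 only.

```python
import socket

# Hostname suffixes Google documents for its crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a Google crawler hostname
    and that hostname resolves back to the same ip."""
    # Default to real DNS; callers (and tests) may pass in fakes.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip)
    except OSError:
        return False  # no PTR record at all

    if not host.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False  # PTR points somewhere other than Google

    try:
        # A spoofed PTR record fails here: the name will not map back.
        return ip in forward_lookup(host)
    except OSError:
        return False
```

Calling `is_real_googlebot("66.249.66.1")` with no extra arguments performs live DNS queries, so results should be cached rather than looked up on every request.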

    Hope this will help

    Cheers
    Kal
     
    kalseo, Mar 30, 2014 IP
  4. babrees

    #4
    Thanks folks. It's driving me mad. It's been going on for a couple of months now. I thought I had fixed it, as it slowed down for a couple of weeks, but it's started up again.

    For scripts, the front end runs WordPress with just a couple of plugins. The back end runs PriceTapestry to import affiliate product links. Hundreds of products are imported; however, I run the same setup on other sites with no problem whatsoever.

    Attached is a copy of my AWStats report, which shows it. For my analytics I use a script that doesn't show bots, so I have to check AWStats for those.
     

    babrees, Mar 30, 2014 IP
  5. kalseo

    #5
    Why not give Cloudflare a shot?
     
    kalseo, Mar 30, 2014 IP
  6. babrees

    #6
    Thanks Kalseo. I added the Wordfence Security WordPress plugin last week, blocked a number of IPs, and set it to block fake Googlebots, but I'm frightened to touch anything Google!!

    I'll add your list to my .htaccess and give Cloudflare a look. I hadn't heard of them before.
     
    babrees, Mar 30, 2014 IP
  7. damoncloudflare

    #7
    Have you simply tried adjusting your crawl rate at Google as well?
     
    damoncloudflare, May 1, 2014 IP
  8. iwebsocial

    #8
    Yes! Wordfence is the best option for blocking fake Googlebots. I also use it on my blog. Great plugin!
     
    iwebsocial, May 2, 2014 IP