I can't stop Baidu, please help!

Discussion in 'robots.txt' started by web_master, Dec 27, 2011.

  1. #1
    My robots.txt is:

    User-agent: *
    Allow: /
    Disallow: /help/
    Disallow: /search/
    Disallow: /stats/
    Disallow: /calendar/
    Disallow: /reminder/
    Disallow: /login/

    User-agent: Baiduspider
    User-agent: Baiduspider-video
    User-agent: Baiduspider-image
    Disallow: /
    It looks like Baidu is ignoring robots.txt and still crawling my forum. I have 10-20 visitors and over 50 Baidu IPs. I don't want it; I don't want visitors from China anyway.

    How do I stop the Baidu spider? Is my robots.txt right?

    Thanks
     
    web_master, Dec 27, 2011 IP
  2. kokopelli

    #2
    I think your syntax may be the problem: you don't normally need "Allow", and it's safer to give each user-agent its own group. Try:

    
    User-agent: *
    Disallow: /help/
    Disallow: /search/
    Disallow: /stats/
    Disallow: /calendar/
    Disallow: /reminder/
    Disallow: /login/
    
    
    User-agent: Baiduspider
    Disallow: /
    
    User-agent: Baiduspider-video
    Disallow: /
    
    User-agent: Baiduspider-image
    Disallow: /
    
    BTW, Baidu actually uses several more spiders/bots to crawl different types of content (and you'd have to block them all, if required):

    Baiduspider-image crawls images
    Baiduspider-mobile crawls mobile search content
    Baiduspider-video crawls videos
    Baiduspider-news crawls news content
    Baiduspider-favo crawls bookmarks
    Baiduspider-sfkr crawls Baidu PPC/ads
    Baiduspider-cpro crawls Baidu’s contextual advertising network
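
    With this many user-agents, the file can be generated and checked rather than typed by hand. The following is a minimal Python sketch, not something from the thread: `build_robots_txt` and `is_allowed` are hypothetical helper names, and the check uses Python's standard-library parser, which only shows how that parser reads the file, not how Baidu itself behaves.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the list above.
BAIDU_AGENTS = [
    "Baiduspider",
    "Baiduspider-image",
    "Baiduspider-mobile",
    "Baiduspider-video",
    "Baiduspider-news",
    "Baiduspider-favo",
    "Baiduspider-sfkr",
    "Baiduspider-cpro",
]

# Paths from the original robots.txt that stay off-limits to every crawler.
PRIVATE_PATHS = ["/help/", "/search/", "/stats/", "/calendar/", "/reminder/", "/login/"]


def build_robots_txt(blocked_agents, private_paths):
    """Return robots.txt text: one 'Disallow: /' group per blocked agent,
    followed by a catch-all group that only hides the private paths."""
    lines = []
    for agent in blocked_agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in private_paths]
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, agent, url):
    """Ask Python's standard-library parser whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots_txt = build_robots_txt(BAIDU_AGENTS, PRIVATE_PATHS)
print(robots_txt)
```

    For example, `is_allowed(robots_txt, "Baiduspider-news", "http://example.com/forum/")` returns `False`, while the same call with `"Googlebot"` returns `True`. A crawler that ignores robots.txt will of course ignore this file too.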


    If that doesn't work, you'll have to block Baidu via .htaccess, or at the server level if you have admin privileges. (Just Google it)
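
    The same idea can also be applied inside the application instead of in the web server config. This is only a rough Python sketch, assuming the site runs behind a WSGI server (a typical forum would more likely do this in .htaccess): `block_baidu`, `forum_app` and `status_for` are made-up names for illustration, and the match is a plain case-insensitive substring test on the User-Agent header.

```python
from wsgiref.util import setup_testing_defaults

BLOCKED_UA_SUBSTRING = "baiduspider"  # matched case-insensitively


def block_baidu(app):
    """Wrap a WSGI app so requests whose User-Agent mentions Baiduspider get 403."""
    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "").lower()
        if BLOCKED_UA_SUBSTRING in user_agent:
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Anything else is passed through to the real application untouched.
        return app(environ, start_response)
    return middleware


def forum_app(environ, start_response):
    """Hypothetical stand-in for the real forum application."""
    body = b"Welcome to the forum\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def status_for(user_agent):
    """Run one fake request through the wrapped app and return its status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    seen = []
    block_baidu(forum_app)(environ, lambda status, headers: seen.append(status))
    return seen[0]
```

    Blocking by User-Agent only stops bots that identify themselves honestly; anything that fakes its header needs an IP-based rule instead.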

    Here's a good resource: http://www.robotstxt.org
     
    kokopelli, Dec 28, 2011 IP
  3. scylla

    #3
    Or just IP-ban the Baidu bots.
     
    scylla, Jan 1, 2012 IP
  4. sarahk

    #4
    Or just "don't sweat the small stuff"

    Does it really matter?
    Your site will be indexed weekly by hundreds of spiders from sites you won't be able to identify easily (I once had a site documenting them). You will get a better result for your business if you focus on productive tasks and ignore the rogue spiders.
     
    sarahk, Jan 2, 2012 IP
  5. codingtalks

    #5
    Why do you need to stop the Baidu bot? :eek:

    Do you have a bandwidth problem? :p
     
    codingtalks, Jan 3, 2012 IP
  6. Bohra

    #6
    Actually, the bot you're having problems with is probably not really Baidu. It's just fooling people by claiming to be Baidu; it's a forum crawler that is collecting all your data.
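
    One way to tell a real Baidu crawler from an impostor is a reverse DNS lookup on the visiting IP, followed by a forward lookup to confirm it. The Python sketch below is an illustration under one stated assumption: that genuine Baiduspider hosts reverse-resolve to names under `baidu.com` or `baidu.jp`. `is_genuine_baiduspider` is a hypothetical helper, and its two lookup parameters exist only so the logic can be tried without network access.

```python
import socket

# Assumption: genuine Baiduspider hosts reverse-resolve to these domains.
BAIDU_DOMAINS = (".baidu.com", ".baidu.jp")


def is_genuine_baiduspider(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a Baidu hostname that resolves back to ip.

    The two lookup parameters are injectable so the check can be tried offline;
    by default they use the system resolver.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip).lower().rstrip(".")
        if not hostname.endswith(BAIDU_DOMAINS):
            return False
        # Forward-confirm: a spoofed PTR record will not resolve back to the same IP.
        return ip in forward_lookup(hostname)
    except OSError:
        # No PTR record or a DNS failure: treat the visitor as unverified.
        return False
```

    Running `is_genuine_baiduspider(ip)` over the IPs in the access log that claim to be Baiduspider would show how many of them are actually what they say they are.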
     
    Bohra, Jan 7, 2012 IP
  7. WFgirls

    #7
    Thank you...
    It's very wonderful :p
     
    WFgirls, Jan 14, 2012 IP
  8. RoundShots

    #8
    If you don't want visitors from China, then I suggest you ban the IP ranges in your .htaccess file. :)
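
    A short Python sketch of range-based blocking follows, for checking which log entries a ban would catch and for printing the matching rules. The ranges are placeholders from the reserved documentation address blocks, not real Baidu or Chinese ranges, the helper names are made up, and the generated rules assume Apache 2.4 `Require` syntax.

```python
import ipaddress

# Placeholder ranges from the documentation-only address blocks (RFC 5737);
# substitute real ranges taken from your own access logs or a GeoIP list.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(ip, ranges=BLOCKED_RANGES):
    """Return True if ip falls inside any blocked CIDR range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ranges)


def htaccess_rules(ranges=BLOCKED_RANGES):
    """Render the ranges as an Apache 2.4 block that allows everyone else."""
    lines = ["<RequireAll>", "    Require all granted"]
    lines += [f"    Require not ip {network}" for network in ranges]
    lines.append("</RequireAll>")
    return "\n".join(lines)


print(htaccess_rules())
```

    Checking a few real visitor IPs with `is_blocked` before deploying the rules helps avoid locking out wanted traffic by accident.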
     
    RoundShots, Sep 11, 2015 IP