How to block all bots except Google, MSN, and Yahoo

Discussion in 'robots.txt' started by danieloffice, Jan 9, 2008.

  1. #1
    Hi,

    I need help.

    In robots.txt, how do I exclude all bots except Google, MSN, and Yahoo?


    Thanks
     
    danieloffice, Jan 9, 2008 IP
  2. SwapsRulez

    #2
    You have to allow Google, MSN, and Yahoo to crawl the whole site and disallow all other bots, starting from the root.

    Here is the sample code:

    User-agent: google
    Disallow:
    
    User-agent: yahoo
    Disallow:
    
    User-agent: msn
    Disallow:
    
    User-agent: *
    Disallow: /
    Code (markup):

    The code above matches a robot by checking whether the name on each User-agent line is a substring of that robot's name, so these rules should work fine. Peace!
     
    SwapsRulez, Jan 12, 2008 IP
  3. sajidmm

    #3
    Wrong info.

    I know about Google. It should be:

    User-agent: googlebot
    Disallow:

    I am not sure about the Yahoo and MSN bots.
     
    sajidmm, Mar 5, 2008 IP
  4. lhughes33309

    #4
    Hi,

    The only problem with relying on the robots.txt file is that many bots don't follow the rules and are there simply to scrape the site. You need a script (in Perl, for example) that serves your HTML to ordinary users and to MSN, Google, and Yahoo, but traps the other bots and either returns a 500 server error or redirects them somewhere else.
    I have written many scripts to do just that.
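
    A minimal sketch of such a script, written in Python rather than Perl. The token lists, the `classify` helper, and the WSGI `app` are illustrative assumptions, not anything from the scripts mentioned above; real crawler User-Agent strings vary and should be checked against each search engine's documentation.

```python
# Assumed substrings of the User-Agent header for the three allowed crawlers.
ALLOWED_BOT_TOKENS = ("googlebot", "slurp", "msnbot")
# Assumed substrings that mark a request as coming from some other robot.
GENERIC_BOT_MARKERS = ("bot", "crawler", "spider", "scraper")


def classify(user_agent):
    """Return 'allowed-bot', 'blocked-bot', or 'user' for a User-Agent string."""
    ua = user_agent.lower()
    # Check the allow list first, because "googlebot" also contains "bot".
    if any(token in ua for token in ALLOWED_BOT_TOKENS):
        return "allowed-bot"
    if any(marker in ua for marker in GENERIC_BOT_MARKERS):
        return "blocked-bot"
    return "user"


def app(environ, start_response):
    """WSGI app: serve HTML to users and allowed bots, a 500 to other bots."""
    kind = classify(environ.get("HTTP_USER_AGENT", ""))
    if kind == "blocked-bot":
        start_response("500 Internal Server Error",
                       [("Content-Type", "text/plain")])
        return [b"Server error"]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html><body>Welcome</body></html>"]
```

    Because the User-Agent header is supplied by the client, a scraper can copy a search engine's string, so a check like this only stops bots that identify themselves honestly.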

    Thanks,
    lhughes33309
     
    lhughes33309, Mar 21, 2008 IP
  5. manish.chauhan

    #5
    manish.chauhan, Apr 6, 2008 IP
  6. al-zabir

    #6
    I think this will help everyone: :)

    User-agent: *
    Disallow: /
    
    User-agent: Googlebot
    Allow: /
    
    User-agent: Yahoo-slurp
    Disallow: 
    
    User-agent: Msnbot
    Disallow:
    Code (markup):
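
    This thread shows the catch-all `User-agent: *` group both first and last. As a rough check, the sketch below feeds both orderings to Python's standard `urllib.robotparser` and asks which agents may fetch a page. The `allowed` helper, the `SomeScraper` agent name, and the `/index.html` path are made up for illustration, and other parsers (including the search engines' own) may behave differently.

```python
from urllib.robotparser import RobotFileParser

# Catch-all group first, as in the post above.
RULES_STAR_FIRST = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: Yahoo-slurp
Disallow:

User-agent: Msnbot
Disallow:
"""

# Same groups with the catch-all moved to the end.
RULES_STAR_LAST = """\
User-agent: Googlebot
Allow: /

User-agent: Yahoo-slurp
Disallow:

User-agent: Msnbot
Disallow:

User-agent: *
Disallow: /
"""


def allowed(rules, agent, url="/index.html"):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


for label, rules in (("star first", RULES_STAR_FIRST),
                     ("star last", RULES_STAR_LAST)):
    for agent in ("Googlebot", "msnbot", "SomeScraper"):
        print(label, agent, allowed(rules, agent))
```

    With this parser, a group that names the agent takes precedence over the `*` group wherever the `*` group appears, so both orderings give the same answers.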
     
    al-zabir, Jul 11, 2008 IP
  7. rozane

    #7
    It should be:
    User-agent: Googlebot
    Disallow:



     
    rozane, Sep 11, 2008 IP
  8. ggmittal

    #8
    You want to disallow everything other than Google, Yahoo, and MSN:

    User-agent: *
    Disallow: /

    User-agent: Googlebot
    Allow: /

    User-agent: Yahoo-slurp
    Allow: /

    User-agent: Msnbot
    Allow: /



    Don't you think this is the right one?
     
    ggmittal, Feb 13, 2009 IP
  9. jik34

    #9
    You can also use .htaccess to deny the IP addresses of the robots that are bugging you.
     
    jik34, Feb 14, 2009 IP
  10. ggmittal

    #10
    Right. With .htaccess you can redirect links to any URL you want. This option works when you have a problem with multiple links.
     
    ggmittal, Feb 15, 2009 IP
  11. jik34

    #11
    Yes, but you can also block a bot by IP address in .htaccess.

    For example, if the bot has the IP 66.249.71.xxx

    The directives in .htaccess will be:

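    <Limit GET POST>
    # With "allow,deny", Allow is evaluated first and a matching Deny wins.
    order allow,deny
    allow from all
    # A partial address covers every IP in 66.249.71.xxx.
    deny from 66.249.71
    </Limit>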
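
    To see which addresses a rule on 66.249.71.xxx covers, here is a small Python sketch using the standard `ipaddress` module. Writing the range as the network `66.249.71.0/24` and the `is_blocked` helper are assumptions made for illustration; Apache does its own matching.

```python
import ipaddress

# The 66.249.71.xxx range from the post, written as a /24 network (assumption).
BLOCKED_NETWORKS = [ipaddress.ip_network("66.249.71.0/24")]


def is_blocked(client_ip):
    """Return True if client_ip falls inside any blocked network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in BLOCKED_NETWORKS)


print(is_blocked("66.249.71.15"))   # inside the range
print(is_blocked("66.249.72.15"))   # neighbouring range, not covered
```

    It is worth looking up who owns a range before denying it, since a whole block may belong to a crawler you want to keep.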
     
    jik34, Feb 15, 2009 IP
  12. miccy

    #12
    Thanks jik34, I will do that on my Japanese AV blog.
     
    miccy, Feb 23, 2009 IP
  13. DareDevils

    #13
    This one is correct:

    User-agent: *
    Disallow: /
    
    User-agent: Googlebot
    Disallow:
    
    User-agent: Yahoo-slurp
    Disallow: 
    
    User-agent: Msnbot
    Disallow:
    PHP:
     
    DareDevils, Feb 23, 2009 IP
  14. kiranak

    #14
    Are there any bad "bots" that I can avoid explicitly?
     
    kiranak, Mar 1, 2009 IP
  15. shailendra

    #15
    Absolutely correct.
     
    shailendra, Mar 2, 2009 IP
  16. munchausen

    #16
    Here is a version that is correct, compact, and more complete:

    User-agent: Googlebot
    User-agent: Slurp
    User-agent: msnbot 
    User-agent: Mediapartners-Google*
    User-agent: Googlebot-Image 
    User-agent: Yahoo-MMCrawler
    Disallow: 
    
    User-agent: *
    Disallow: /
    
    Code (markup):
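
    The compact form above lists several User-agent lines over one shared rule. As a hedged check, this sketch parses that layout with Python's standard `urllib.robotparser`; the `may_crawl` helper and the `SomeScraper` name are illustrative, and the `Mediapartners-Google*` line is left out because a trailing `*` in an agent name may not be treated as a wildcard by every parser.

```python
from urllib.robotparser import RobotFileParser

# Several User-agent lines sharing a single empty Disallow (allow everything).
GROUPED_RULES = """\
User-agent: Googlebot
User-agent: Slurp
User-agent: msnbot
User-agent: Googlebot-Image
User-agent: Yahoo-MMCrawler
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(GROUPED_RULES.splitlines())


def may_crawl(agent, url="/"):
    """Report whether `agent` may fetch `url` under GROUPED_RULES."""
    return parser.can_fetch(agent, url)


for agent in ("Googlebot", "Slurp", "msnbot", "Yahoo-MMCrawler", "SomeScraper"):
    print(agent, may_crawl(agent))
```

    Every agent named in the group shares the empty Disallow line, and anything else falls through to the `*` group.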
     
    munchausen, Mar 16, 2009 IP
  17. stizzard

    #17
    Thanks, just what I was looking for.
     
    stizzard, Mar 20, 2009 IP
  18. kmpoaquests

    #18
    Robot Meta generator

    www.submitcorner.com/Tools/Robots
     
    kmpoaquests, Mar 20, 2009 IP
  19. jamesjame

    #19
    User-agent: *
    Disallow: /

    User-agent: Googlebot
    Disallow:

    User-agent: Yahoo-slurp
    Disallow:

    User-agent: Msnbot
    Disallow:

    This one blocks all bots except Google, MSN, and Yahoo.
     
    jamesjame, Mar 24, 2009 IP
  20. unknownpray

    #20
    You can use robots.txt, but that's not guaranteed to work because some bots might not honor it. A more reliable approach is to detect the user agent and redirect those bots programmatically.
     
    unknownpray, Mar 21, 2010 IP