How do I block agents/bots?

Discussion in 'Search Engine Optimization' started by darrens, Jul 17, 2007.

  1. #1
    Hi,

    Can anyone tell me the correct way to block a few specific pages on my site from being indexed/spidered by search engine bots?
    I would like to do this via .htaccess.

    If you could show me an example of a .htaccess file that blocks the following pages, I can then cut/paste/edit it into my own .htaccess file.

    details.php
    contact.php
    contact_us.php

    thanks in advance.
     
    darrens, Jul 17, 2007 IP
  2. mann3r

    mann3r Peon

    Messages:
    1,416
    Likes Received:
    100
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Why not block them in robots.txt?
     
    mann3r, Jul 17, 2007 IP
  3. jazz7620

    jazz7620 Banned

    Messages:
    357
    Likes Received:
    12
    Best Answers:
    0
    Trophy Points:
    0
    #3
    use robots.txt
     
    jazz7620, Jul 17, 2007 IP
  4. darrens

    darrens Peon

    Messages:
    808
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #4
    OK, how do I write the robots.txt rules so that bots ignore these pages?
    details.php
    contact.php
    contact_us.php
     
    darrens, Jul 17, 2007 IP
  5. mann3r

    #5
    User-agent: *
    Disallow: /details.php
    Disallow: /contact.php
    Disallow: /contact_us.php

    Please note that the paths are case-sensitive. If a file is located in a folder, add the folder before it, like /folder/filetoblock.php.
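
    Since the original question asked for a .htaccess approach, here is a minimal sketch of one alternative. It assumes Apache with mod_headers enabled and an AllowOverride setting that permits these directives in .htaccess. It sends an X-Robots-Tag response header for the three pages, which asks compliant search engines not to index them. A crawler only sees this header if it is allowed to fetch the page, so this is an alternative to the robots.txt Disallow lines for the same URLs, not something to combine with them.

    ```apache
    # Sketch only: needs mod_headers. The <IfModule> guard skips the block
    # (instead of causing a 500 error) when the module is not loaded.
    <IfModule mod_headers.c>
        # Match the three pages by file name in any directory this .htaccess covers.
        <FilesMatch "^(details|contact|contact_us)\.php$">
            Header set X-Robots-Tag "noindex, nofollow"
        </FilesMatch>
    </IfModule>
    ```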
     
    mann3r, Jul 17, 2007 IP
  6. darrens

    #6
    thank you ... :)
     
    darrens, Jul 17, 2007 IP
  7. mann3r

    #7
    no problem, my pleasure to help:)
     
    mann3r, Jul 17, 2007 IP
  8. uniqueasitis

    uniqueasitis Peon

    Messages:
    661
    Likes Received:
    29
    Best Answers:
    0
    Trophy Points:
    0
    #8
    By the way, not all spider bots respect robots.txt.
     
    uniqueasitis, Jul 17, 2007 IP
  9. mann3r

    #9
    This is true: only the large players obey robots.txt. If some of the crawlers are so-called spam bots, just block them via .htaccess.
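
    Blocking a bot by its user-agent string in .htaccess could look roughly like the sketch below. It assumes Apache 2.4 with mod_setenvif and mod_authz_core available; "ExampleBadBot" and the variable name bad_bot are placeholders, not real crawler names. It only helps against bots that actually identify themselves in the User-Agent header.

    ```apache
    # Tag any request whose User-Agent contains the placeholder string
    # (case-insensitive match). "ExampleBadBot" is hypothetical.
    SetEnvIfNoCase User-Agent "ExampleBadBot" bad_bot

    # Allow everyone except requests tagged above; those get a 403 response.
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
    ```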
     
    mann3r, Jul 17, 2007 IP
  10. ajsa52

    ajsa52 Well-Known Member

    Messages:
    3,426
    Likes Received:
    125
    Best Answers:
    0
    Trophy Points:
    160
    #10
    Yes, you'll need to block those spam bots via .htaccess, but most spam bots do not send their own "bot name" in the User-Agent field. Usually they send the User-Agent of the latest versions of Internet Explorer or Firefox, so you'll need to block by IP address rather than by user-agent name.
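
    A minimal sketch of blocking by IP in .htaccess, assuming Apache 2.4 (mod_authz_core and mod_authz_host). The addresses are placeholders from the reserved documentation ranges; replace them with the addresses that show up in your own access logs.

    ```apache
    # Allow everyone except the listed addresses, which get a 403 response.
    <RequireAll>
        Require all granted
        # Single address (placeholder)
        Require not ip 192.0.2.10
        # Whole range in CIDR notation (placeholder)
        Require not ip 198.51.100.0/24
    </RequireAll>

    # On the older Apache 2.2, the equivalent would be:
    #   Order Allow,Deny
    #   Allow from all
    #   Deny from 192.0.2.10
    ```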
     
    ajsa52, Jul 17, 2007 IP