How to stop Googlebot from spidering this page?

Discussion in 'robots.txt' started by paladin2, Jun 25, 2007.

  1. #1
    Hi Guys and Girls!

    Do you have any idea what I should write in my robots.txt file in order to prevent Google from spidering a few hundred pages that look like this:

    http:// domain dot com/?e=a&subcat=Business&l=A

    The query-string parameters are e, subcat, and l.

    I read that this is correct:

    Disallow: /*?

    Is that right?


    Also, if Google has already crawled a few of these pages, is there any way to tell Google via Google Sitemaps to throw them away?

    Thanks.
     
    paladin2, Jun 25, 2007 IP
  2. DeViAnThans3

    DeViAnThans3 Peon

    Messages:
    785
    Likes Received:
    83
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Put this in your robots.txt
    User-agent: Googlebot
    Disallow: /
    User-agent: *
    Allow: /
    Code (markup):
    Put that in your root directory. It should work. Note that this disallows Googlebot from all pages, not just the ones with query strings. Hope this helps.
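    If you want to sanity-check rules like the above before uploading them, you can evaluate them locally with Python's standard urllib.robotparser (a quick sketch; the file contents below are just the example from this post, and domain.example is a placeholder):

    ```python
    from urllib.robotparser import RobotFileParser

    # The robots.txt contents from this post, as a list of lines.
    rules = """\
    User-agent: Googlebot
    Disallow: /

    User-agent: *
    Allow: /
    """.splitlines()

    rp = RobotFileParser()
    rp.parse(rules)

    # Googlebot is blocked from everything, including the parameterized URLs.
    print(rp.can_fetch("Googlebot", "http://domain.example/?e=a&subcat=Business&l=A"))  # False

    # Every other crawler is still allowed.
    print(rp.can_fetch("SomeOtherBot", "http://domain.example/page.html"))  # True
    ```

    Keep in mind that urllib.robotparser implements only the original robots.txt rules, so it will not understand the * wildcards inside Disallow paths discussed elsewhere in this thread.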
     
    DeViAnThans3, Jun 26, 2007 IP
  3. abedelrahman

    abedelrahman Peon

    Messages:
    213
    Likes Received:
    2
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Yes, you need to put it in your root directory.
     
    abedelrahman, Jun 27, 2007 IP
  4. trichnosis

    trichnosis Prominent Member

    Messages:
    13,785
    Likes Received:
    333
    Best Answers:
    0
    Trophy Points:
    300
    #4
    Google supports * characters. Adding

    Disallow: /*subcat=*

    will stop Google from accessing it. This line stops Google from accessing any URL that includes subcat= .

    Or adding

    Disallow: /*l=*

    will stop Google from accessing URLs that include l= .

    This can be used for Google and Yahoo, but not MSN.
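    For reference, Google's wildcard extension treats * as "any run of characters" and a trailing $ as an end-of-URL anchor; the base robots.txt standard has neither. A rough Python sketch of that matching (my own regex-based approximation, not Google's exact algorithm):

    ```python
    import re

    def google_rule_matches(rule: str, path: str) -> bool:
        """Approximate Google-style wildcard matching for a robots.txt rule.

        '*' matches any run of characters; a trailing '$' anchors the end.
        Rules always match from the start of the URL path.
        """
        anchored = rule.endswith("$")
        pattern = re.escape(rule.rstrip("$")).replace(r"\*", ".*")
        pattern = "^" + pattern + ("$" if anchored else "")
        return re.search(pattern, path) is not None

    # The rules from this post, against the URL from the original question.
    print(google_rule_matches("/*subcat=*", "/?e=a&subcat=Business&l=A"))  # True
    print(google_rule_matches("/*l=*", "/?e=a&subcat=Business&l=A"))       # True
    print(google_rule_matches("/*subcat=*", "/plain-page.html"))           # False
    ```

    So either rule would keep Google away from the parameterized URLs while leaving ordinary pages crawlable.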
     
    trichnosis, Jul 3, 2007 IP