Block Google Spider/Bot

Discussion in 'Programming' started by movidalatina, Feb 5, 2008.

  1. #1
    Hello! Does anyone know how I can stop
    the Google spider bot from indexing
    my site? The whole site. I know there must
    be a file named robots.txt, but what should
    go in there?
     
    movidalatina, Feb 5, 2008
  2. Alffy

    #2
    Put this in your robots.txt:

    User-agent: Googlebot
    Disallow: /

    You can also add:

    <meta name="robots" content="noindex, nofollow" />
    <meta name="robots" content="noarchive" />

    to the <head> section of your web pages.
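
One way to sanity-check rules like these before uploading them is to run them through Python's standard-library robots.txt parser. This is only a sketch: the parser is a stand-in for a real crawler and may not behave exactly like Googlebot, and the `is_allowed` helper, the `MISSPELLED` sample, and the example.com URL are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules as they would appear in robots.txt (note the hyphen in "User-agent").
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""

# Hypothetical sample with the field name misspelled, for comparison.
MISSPELLED = """\
UserAgent: Googlebot
Disallow: /
"""

PAGE = "http://example.com/page.html"  # placeholder URL


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed(ROBOTS_TXT, "Googlebot", PAGE))     # Googlebot is blocked
    print(is_allowed(ROBOTS_TXT, "SomeOtherBot", PAGE))  # other bots are not
    print(is_allowed(MISSPELLED, "Googlebot", PAGE))     # unknown field is ignored
```

With this parser, the misspelled field name is silently skipped, so the Disallow line never attaches to any user agent and nothing is blocked.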
     
    Alffy, Feb 5, 2008
  3. WebGeek182

    #3
    Those two lines should be combined into one line:
    <meta name="robots" content="noindex,nofollow,noarchive" />

    You don't want to have multiple meta robots tags (technically there should only be one) because not all spiders interpret them the same way, and some will only read one of the lines.
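
If a site already has pages with several robots meta tags, the merge can be scripted. Here is a rough sketch using Python's built-in HTML parser. The `RobotsMetaCollector` class and `merged_robots_meta` function are hypothetical names invented for this example, and it only builds the combined tag; it does not rewrite the page.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<meta ... />) are routed here as well.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        for directive in (attr.get("content") or "").split(","):
            directive = directive.strip().lower()
            # Keep first-seen order and skip duplicates.
            if directive and directive not in self.directives:
                self.directives.append(directive)


def merged_robots_meta(html: str) -> str:
    """Return one meta robots tag carrying all directives found in html."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    collector.close()
    return '<meta name="robots" content="%s" />' % ",".join(collector.directives)


if __name__ == "__main__":
    page_head = (
        '<meta name="robots" content="noindex, nofollow" />\n'
        '<meta name="robots" content="noarchive" />\n'
    )
    print(merged_robots_meta(page_head))
```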
     
    WebGeek182, Feb 5, 2008
  4. Webray

    #4
    ... or tell Google to go fly a kite with something like this:

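    # Block Googlebot - later dude
    # (Apache comments need their own line; no ^ anchor, since the UA usually starts with "Mozilla/5.0")
    SetEnvIf User-Agent Googlebot stealthed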

    <Limit GET POST >
    order allow,deny
    allow from all
    deny from env=stealthed
    </Limit>


    This goes in .htaccess only.
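
Whether a SetEnvIf pattern catches Googlebot depends on the exact User-Agent string the crawler sends. The sketch below tries an anchored and an unanchored pattern against two Googlebot User-Agent strings. It uses Python's `re` module as a stand-in for Apache's regex engine, which is an assumption that holds for patterns this simple, and the `matches` helper is made up for this example.

```python
import re

# Two User-Agent strings used by Googlebot.
UA_LEGACY = "Googlebot/2.1 (+http://www.google.com/bot.html)"
UA_MOZILLA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def matches(pattern: str, user_agent: str) -> bool:
    """Unanchored search, like SetEnvIf: match anywhere unless the pattern says otherwise."""
    return re.search(pattern, user_agent) is not None


if __name__ == "__main__":
    for pattern in ("^Googlebot", "Googlebot"):
        for ua in (UA_LEGACY, UA_MOZILLA):
            print(f"{pattern!r:14} {matches(pattern, ua)!s:6} {ua}")
```

The anchored pattern only matches the string that begins with "Googlebot", so requests sent with the "Mozilla/5.0 (compatible; Googlebot/2.1; ...)" form would not be caught by it.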
     
    Webray, Feb 5, 2008
  5. movidalatina

    #5
    Excellent. I'll try these.
    Thanks fellas.
     
    movidalatina, Feb 5, 2008