How can I block the Google search spider but still allow AdSense?

Discussion in 'robots.txt' started by explorer, Jun 11, 2006.

  1. #1
    Could anyone tell me the best way of

    a. blocking Google’s search spiders from individual pages, while
    b. still allowing the Google AdSense bots to see the page content and place contextual ads correctly?


    Background

    Two of my sites deal with redwidgets and bluewidgets.

    redwidget.com is loved by all Search Engines.
    bluewidget.com is sandboxed by Google and loved by Yahoo and MSN.

    I am going to move some of the sandboxed material from bluewidget.com onto redwidget.com, then block Googlebot from those pages on redwidget.com so that there are no duplicate-content concerns.

    Thank you.
     
    explorer, Jun 11, 2006
  2. tflight

    #2
    The spider for AdSense has a different name than the search spider. So you can use robots.txt to block Googlebot from certain pages and the AdSense spider will still be allowed to go there unless you specifically deny it.

    User-agent: Googlebot
    Allow: /
    Disallow: /nogooglesearch.html
    Disallow: /nogooglesearch2.html

    If you subscribe to the Google Sitemaps tool (even if you don't have a sitemap to submit) you can test your robots.txt file against different pages on your site and different Google spiders to make sure it is working how you intended.
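
A rough local check along the same lines can be sketched with Python's standard urllib.robotparser. This is only an approximation and not Google's own tester. The AdSense crawler token Mediapartners-Google is not named in this thread, and can_crawl is a hypothetical helper. The Allow: / line is left out here because Python's parser applies the first matching rule in file order rather than the most specific one.

```python
from urllib.robotparser import RobotFileParser

# Rules modelled on the example above. "Allow: /" is omitted because
# urllib.robotparser stops at the first rule that matches a path.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglesearch.html
Disallow: /nogooglesearch2.html
"""


def can_crawl(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if user_agent may fetch path (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Mediapartners-Google is assumed to be the AdSense crawler's token.
    for agent in ("Googlebot", "Mediapartners-Google"):
        for path in ("/nogooglesearch.html", "/index.html"):
            print(f"{agent:22} {path:24} allowed={can_crawl(agent, path)}")
```

Since no record in this file names the AdSense crawler, it falls through to the default, which is to allow everything.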
     
    tflight, Jun 11, 2006
  3. explorer

    #3
    Thanks tflight.

    I've decided not to block googlebot because of this:

    http://www.mattcutts.com/blog/crawl-caching-proxy/

    Google can reuse content fetched by its mediabot in the search engine results. Although this may not present a danger right now, it may in the future.

    I see, as a result of your answer rather than the original question, this thread has been moved from adsense to the robots.txt forum. Perhaps it's not the best place for it.
     
    explorer, Jun 11, 2006
  4. Jean-Luc

    #4
    Hi,

    You can use this in the pages that should not be indexed by the Google search engine:
    <meta name="googlebot" content="noindex,nofollow">
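
To confirm that a page really carries the tag, something like the following Python sketch could be used. It relies only on the standard html.parser module. The class RobotsMetaParser and the function blocks_google_indexing are hypothetical names, and the check covers only the generic robots and the googlebot meta names. Note that Googlebot has to be able to fetch the page to see the tag at all, so the page should not also be blocked in robots.txt.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # meta name -> set of lower-cased directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            found = {d.strip().lower() for d in content.split(",") if d.strip()}
            self.directives.setdefault(name, set()).update(found)


def blocks_google_indexing(html: str) -> bool:
    """Return True if the page asks Googlebot (or all robots) not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives.get("googlebot", set()) | parser.directives.get("robots", set())
    return "noindex" in found or "none" in found


if __name__ == "__main__":
    page = '<html><head><meta name="googlebot" content="noindex,nofollow"></head></html>'
    print(blocks_google_indexing(page))
```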
    Jean-Luc
     
    Jean-Luc, Jun 12, 2006