robots.txt file

Discussion in 'AdSense' started by oceanside monkey, Dec 2, 2008.

  1. #1
    I got an email from adsense saying this:

    While reviewing your ad implementation, we noticed that your robots.txt file is currently preventing our AdSense crawler from reaching a significant number of pages with ads in your account.
    
    In order to serve targeted, paid ads to your sites, our crawler needs to visit your sites’ pages to determine their content. Please update your robots.txt file to allow the AdSense crawler to access all pages showing Google ads. You can allow the AdSense crawler access to your sites by adding the following lines to your robots.txt file:
    
    User-agent: Mediapartners-Google*
    Disallow:
    
    Thanks for helping enable us to serve the most relevant ads to your sites.  Please note that in the future, if we can't crawl some of your pages, we may disable ad serving to those pages.
    
    For more information, please visit: https://www.google.com/adsense/support/bin/answer.py?answer=37091
    
    We appreciate your understanding.
    
    Yours sincerely,
    
    The Google AdSense Team
    Code (markup):

    I updated my robots.txt file, the following are the first four lines:

    User-agent: Mediapartners-Google*
    User-agent: *
    Disallow:
    Disallow: /cgi-bin/
    Code (markup):
    I'm not too familiar with the syntax of robots.txt files. Are the first three lines correct syntax? Are there any redundancies, or do you have suggestions? Please let me know if there are errors. Thanks.
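
One way to look at the question is to see how the four lines group together. Below is a minimal sketch, assuming the usual convention that consecutive User-agent lines share the rules that follow them. The helper name `parse_groups` is hypothetical, it only understands User-agent, Allow and Disallow lines, and real crawlers differ in how they resolve conflicting rules inside one group.

```python
def parse_groups(text):
    """Split robots.txt text into (user_agents, rules) groups.

    Hypothetical helper for illustration only.
    """
    groups = []
    agents, rules = [], []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if rules:  # a User-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules = [], []
            agents.append(value)
        elif field in ("allow", "disallow") and agents:
            rules.append((field, value))
    if agents:
        groups.append((agents, rules))
    return groups


posted = """\
User-agent: Mediapartners-Google*
User-agent: *
Disallow:
Disallow: /cgi-bin/
"""

for agents, rules in parse_groups(posted):
    print(agents, rules)
```

Under that reading the posted lines form a single group, so both Disallow lines are attached to the AdSense crawler and to every other robot alike, rather than one rule per robot.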
     
    oceanside monkey, Dec 2, 2008 IP
  2. eLeSlash

    eLeSlash Active Member

    #2
    In my opinion it's OK. That's how my robots.txt is.
     
    eLeSlash, Dec 2, 2008 IP
  3. HarriL

    HarriL Member

    #3
    I just received this same message. Weird, I have no robots.txt on my sites and always get targeted ads on them. How many of you have received this? It would be nice to know if they are sending this to everyone.
     
    HarriL, Dec 2, 2008 IP
  4. JohanBru

    JohanBru Member

    #4
    Shouldn't it be:
    User-agent: Mediapartners-Google*
    Disallow:

    User-agent: *
    Disallow: /cgi-bin/
    Code (markup):
    So you pair up each User-agent with what it is allowed/disallowed?

    There is some useful info about the email you got at http://forums.digitalpoint.com/showthread.php?t=1134946
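
The pairing idea can be checked locally. This is a rough sketch using Python's standard `urllib.robotparser`; the crawler name `SomeOtherBot`, the sample paths and the wrapper `can_fetch` are made up for illustration. The trailing `*` from the email is left off the token here because, as far as I can tell, this particular parser compares the name literally, so the result says nothing definitive about how Google's own crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# One record per crawler, separated by a blank line. The trailing "*" from
# the e-mail is omitted because this parser matches the name literally.
ROBOTS_TXT = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /cgi-bin/
"""


def can_fetch(agent, path, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


for agent in ("Mediapartners-Google", "SomeOtherBot"):
    for path in ("/index.html", "/cgi-bin/form.pl"):
        print(agent, path, can_fetch(agent, path))
```

With the records split this way, the AdSense crawler's record has no restrictions, while the catch-all record keeps other robots out of /cgi-bin/.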
     
    JohanBru, Dec 2, 2008 IP
  5. config_error

    config_error Well-Known Member

    #5
    Allow a single robot:
    User-agent: Google
    Disallow:

    User-agent: *
    Disallow: /

    Allow all robots:

    User-agent: *
    Disallow:
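
If you maintain several of these policies, the two patterns above can be generated rather than typed by hand. A small sketch, assuming a hypothetical helper `render_robots` that takes a mapping of user-agent to disallowed paths; an empty list is written as an empty Disallow line, which means nothing is blocked for that robot.

```python
def render_robots(policies):
    """Build robots.txt text from {user_agent: [disallowed paths]}.

    Hypothetical helper; an empty list means "nothing is disallowed".
    """
    records = []
    for agent, paths in policies.items():
        lines = [f"User-agent: {agent}"]
        # An empty Disallow value allows everything for that robot.
        lines += [f"Disallow: {p}" for p in paths] or ["Disallow:"]
        records.append("\n".join(lines))
    # Records are separated by a blank line.
    return "\n\n".join(records) + "\n"


single_robot = render_robots({"Google": [], "*": ["/"]})
all_robots = render_robots({"*": []})
print(single_robot)
print(all_robots)
```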
     
    config_error, Dec 2, 2008 IP
  6. malaysian explorer

    malaysian explorer Peon

    #6
    I have a test page which is live on my website, and I do not wish it to be spidered. What's the code to use so that the page is not? Thanks.
     
    malaysian explorer, Dec 2, 2008 IP
  7. config_error

    config_error Well-Known Member

    #7
    Here is the robots.txt code to disallow specific pages:

    User-agent: *
    Disallow: /test/test.html
    Disallow: /articles/mark_sheldon_wong.html
    Disallow: /downloads.html

    Hope this answers your question.
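
Before uploading rules like these, it can help to confirm which pages they actually block. A minimal sketch with Python's standard `urllib.robotparser`; the function name `blocked`, the agent name `AnyBot` and the extra sample paths are assumptions for illustration, and the result reflects this parser's behaviour rather than any particular search engine's.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /test/test.html
Disallow: /articles/mark_sheldon_wong.html
Disallow: /downloads.html
"""


def blocked(paths, robots_txt=RULES, agent="AnyBot"):
    """Return the subset of `paths` that `agent` is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(agent, p)]


print(blocked(["/index.html", "/test/test.html", "/test/other.html", "/downloads.html"]))
```

Disallow values are matched as path prefixes, so a single line such as `Disallow: /test/` would cover every page in that directory instead of listing each file.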
     
    config_error, Dec 2, 2008 IP