Don't want linked page listed in Google

Discussion in 'robots.txt' started by MicahCarrick, Jun 21, 2006.

  1. #1
    Everyone keeps pointing me to robots.txt, and I've read through the spec, but I still don't understand what I can do.

    I already have a robots.txt file in which the subfolder 'cart' is disallowed. In the cart folder there is a file named update.php, a PHP script that updates a shopping cart and then redirects to the originating page. So pages throughout the website might have links to update.php with GET variables:

    /cart/update.php?action=add&item=2
    /cart/update.php?action=add&item=5
    etc.

    So, even though robots.txt prevents /cart/ from being "crawled", the links to update.php from pages outside of /cart/ are causing those URLs to show up in Google results. There is now a Google result for adding every item (there are thousands) to the cart. How can I omit these results, the ones with update.php?

    - Micah
     
    MicahCarrick, Jun 21, 2006
  2. mdvaldosta

    #2
    Disallowing /cart/ should disallow everything under it as well, since Disallow rules match by URL prefix. Disallowing /cart/update.php would likewise cover every URL that starts with it (all the update.php?... URLs). You could also add rel="nofollow" to all links pointing to update.php if you like (but that won't help when other people link to them).
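
    A quick way to check how a prefix rule like this is interpreted is Python's standard urllib.robotparser. The sketch below assumes a robots.txt with a single Disallow: /cart/ rule (the thread does not show the actual file) and uses a hypothetical helper, is_crawlable. Keep in mind that it only answers whether a compliant crawler may fetch a URL, not whether the URL can still be listed because of links found on other pages, which is the situation described in the first post.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents; the real file is not shown in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
"""


def is_crawlable(path, user_agent="*"):
    """Return True if a compliant crawler may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/cart/update.php?action=add&item=2", "/index.php"):
        print(path, "->", "allowed" if is_crawlable(path) else "blocked")
```

    Because the rule is a prefix match, every /cart/update.php?... URL is reported as blocked without needing a separate line for update.php.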
     
    mdvaldosta, Jun 21, 2006
  3. Jean-Luc

    #3
    You could add this line
    <meta name="robots" content="noindex,nofollow">
    within the <head>...</head> section of the pages you do not want to see in the index.
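
    To confirm that a page really carries the tag, something like the following sketch could be used. It relies only on Python's standard html.parser; the class RobotsMetaParser and the function robots_directives are hypothetical names chosen for illustration. Note that a crawler can only honor this tag on pages it is allowed to fetch, so it has no effect on URLs that robots.txt already blocks.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = ('<html><head><meta name="robots" content="noindex,nofollow">'
            '</head><body></body></html>')
    print(sorted(robots_directives(page)))
```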

    Jean-Luc
     
    Jean-Luc, Jun 21, 2006