Blocking query strings in URLs

Discussion in 'robots.txt' started by Dave Hybrid, Jun 1, 2007.

  1. #1
    Hey guys,

    I have a small problem.

    I have a site where I'm blocking some PHP files, for example:

    example.php

    I have this entered in robots.txt and the page isn't being indexed, but the query string versions are, like...

    example.php?u=4545

    What do I put in my robots.txt to stop all variations of this file?

    Someone suggested simply /example, but I assume that would also affect a directory?

    Please help

    Thanks.
     
    Dave Hybrid, Jun 1, 2007
  2. Jean-Luc

    Jean-Luc Peon

    Messages:
    601
    Likes Received:
    30
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Hi,

    Code (markup):
    User-agent: *
    Disallow: /example.php

    This robots.txt disallows access to:
    - /example.php
    - /example.php?u=4545
    - /example.php_followed_by_any_characters
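
The prefix behaviour listed above can be checked locally. This is a minimal sketch using Python's standard urllib.robotparser; the host example.com, the extra test paths and the helper name is_allowed are assumptions made for illustration, and it only shows how this one parser reads the rule, not how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /example.php
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if `agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # example.com is a placeholder host; only the path and query matter here.
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    for path in (
        "/example.php",
        "/example.php?u=4545",
        "/example.php_followed_by_any_characters",
        "/example",
        "/other.php",
    ):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

The first three paths start with /example.php and come back blocked, while /example and /other.php stay allowed, since the rule is compared against the start of the path plus query string.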

    Jean-Luc
     
    Jean-Luc, Jun 1, 2007
  3. Dave Hybrid

    Dave Hybrid Active Member

    Messages:
    997
    Likes Received:
    21
    Best Answers:
    0
    Trophy Points:
    83
    #3
    That works in Google, but not in Yahoo.

    Yahoo is not indexing example.php, but it is indexing example.php?u=222.
     
    Dave Hybrid, Jun 1, 2007
  4. Jean-Luc

    #4
    Do you mean that Yahoo does not even respect the most basic robots.txt directives? I can hardly believe that.

    Jean-Luc
     
    Jean-Luc, Jun 1, 2007
  5. Dave Hybrid

    #5
    Why would I lie?

    It respects the block in the robots.txt: it blocks example.php.

    Query strings are not in the robots.txt so why should it respect them?
     
    Dave Hybrid, Jun 1, 2007
  6. Jean-Luc

    #6
    Dave,

    The Robots Exclusion Protocol is very clear: a Disallow value is a path prefix, so any URL that starts with that value is not to be retrieved.
    Yahoo claims it respects this 1994 standard. Of course, you might have found a bug in Yahoo's algorithm.

    Jean-Luc
     
    Jean-Luc, Jun 1, 2007
  7. agnivo007

    agnivo007 Peon

    Messages:
    4,290
    Likes Received:
    289
    Best Answers:
    0
    Trophy Points:
    0
    #7
    Blocks all query string URLs:
    Disallow: /*?*

    Blocks all example.php query string URLs:
    Disallow: /example.php?*
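
To see what these wildcard rules would match, here is a rough sketch that turns a rule into a regular expression. It assumes the common extension semantics where * matches any run of characters and a trailing $ anchors the end of the URL; wildcards are not part of the original 1994 standard, so crawlers that do not implement the extension may treat them differently. The function names rule_to_regex and is_blocked are made up for this example.

```python
import re


def rule_to_regex(rule):
    """Compile a wildcard Disallow value into a regex (assumed semantics)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each "*" match any run of characters.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    return re.compile(pattern)


def is_blocked(rule, path):
    """Return True if `path` (path plus query string) matches `rule`."""
    # re.match anchors at the start only, which mirrors prefix matching.
    return rule_to_regex(rule).match(path) is not None


if __name__ == "__main__":
    for rule in ("/*?*", "/example.php?*"):
        for path in ("/example.php", "/example.php?u=222", "/other.php?u=1"):
            state = "blocked" if is_blocked(rule, path) else "allowed"
            print(rule, path, state)
```

Under these assumptions neither rule matches the bare /example.php, because both require a literal ? in the URL, so the plain Disallow: /example.php line is still needed for the page itself.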
     
    agnivo007, Jun 1, 2007
  8. Jean-Luc

    #8
    You never read the specification. :confused:

    Jean-Luc
     
    Jean-Luc, Jun 1, 2007
  9. Dave Hybrid

    #9
    That's great, I will block them all, as all URLs are rewritten anyway.

    Thanks.
     
    Dave Hybrid, Jun 1, 2007