Using Mod ReWrite to delete HTTP arguments

Discussion in 'Apache' started by Will.Spencer, Feb 27, 2007.

  1. #1
    I have several thousand pages which people (i.e. alien robots) call with unexpected and undesired arguments.

    I would like to automatically strip these arguments and redirect such requests to the plain URL.

    In other words, I would like this:
    
    http://www.example.com/page.html?bah=blah&what=ever
    
    To become this:
    
    http://www.example.com/page.html
    
    Does anyone have an idea as to how to accomplish this?
     
    Will.Spencer, Feb 27, 2007
  2. Nintendo

    Nintendo ♬ King of da Wackos ♬

    Messages:
    12,890
    Likes Received:
    1,064
    Best Answers:
    0
    Trophy Points:
    430
    #2
    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^(.*).html(.*)$ http://www.domain.com/$1.html [R=301,L]
     
    Nintendo, Feb 27, 2007
    Will.Spencer likes this.
  3. Will.Spencer

    Will.Spencer NetBuilder

    Messages:
    14,789
    Likes Received:
    1,040
    Best Answers:
    0
    Trophy Points:
    375
    #3
    When I use that, I get this:
    
    The page isn't redirecting properly
    Firefox has detected that the server is redirecting the request for this address in a way that will never complete.
    
    But, it looks like a start in the right direction. I'm still reading through it to try to understand the syntax.
     
    Will.Spencer, Feb 28, 2007
  4. Will.Spencer

    #4
    Ahh... I have another RewriteRule in a subdirectory which is causing the trouble.

    Unfortunately, I can't change that rule (easily). I need to change the script which uses that rule.

    Thanks for the assist Nintendo!
     
    Will.Spencer, Feb 28, 2007
  5. rodney88

    rodney88 Guest

    Messages:
    480
    Likes Received:
    37
    Best Answers:
    0
    Trophy Points:
    0
    #5
    That rewrite rule simply takes any request for an HTML page and redirects it back to itself, so it's no surprise it's causing an infinite loop. Also, the query string is handled separately from the requested path (i.e. the HTML page) and is passed through unchanged by default. So even without the infinite redirects, the rule would make no difference to the ?bah=blah&what=ever part.

    For instance, for http://www.example.com/page.html?bah=blah&what=ever, and assuming an htaccess file at www.example.com/.htaccess, the string that the RewriteRule pattern is matched against is only page.html. If you want to access the query string (everything after the ?), you need a RewriteCond.

    Basically you want a RewriteCond to check whether there is a query string, then a RewriteRule that overwrites the query string. So something along the lines of:
    RewriteCond %{QUERY_STRING} !^$
    RewriteRule ^(.*)\.html$ /$1.html? [R=301,L]
    By including a ? in the rewrite destination, we are effectively replacing the query string of bah=blah&what=ever with a blank string, i.e. we remove it.
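
    As a slightly fuller sketch of the rule above, here is how it might sit in a document-root .htaccess, with comments on why it does not loop. The commented-out variant uses the QSD (Query String Discard) flag, which exists only in Apache 2.4 and later, so it postdates this thread. Treat both as a starting point under those assumptions, not a drop-in ruleset.

    ```apache
    RewriteEngine on

    # Fire only when a query string is present. After the redirect the
    # query string is empty, so the condition fails and there is no loop.
    RewriteCond %{QUERY_STRING} !^$
    # The trailing "?" replaces the original query string with an empty one.
    RewriteRule ^(.*)\.html$ /$1.html? [R=301,L]

    # Alternative for Apache 2.4+ only: QSD discards the query string,
    # so the trailing "?" is not needed.
    # RewriteCond %{QUERY_STRING} !^$
    # RewriteRule ^(.*)\.html$ /$1.html [R=301,L,QSD]
    ```

    With either form, a request for /page.html?bah=blah&what=ever should receive a 301 to /page.html, and a request for /page.html on its own is left alone.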

    Just an FYI: if you ever want to add to the query string in the destination without overwriting it (e.g. /folder/pie.php?cheese=nice to /index.php?requested=folder/pie.php&cheese=nice), you just need to add the QSA flag (Query String Append).
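
    A minimal sketch of that QSA case, assuming a document-root .htaccess and a front controller at /index.php (both taken from the example URLs above, not from any real setup). The extra RewriteCond is an assumption added here so that index.php is not rewritten onto itself.

    ```apache
    RewriteEngine on

    # Skip the front controller itself to avoid rewriting it in a loop.
    RewriteCond %{REQUEST_URI} !^/index\.php$
    # $1 is the requested path, e.g. folder/pie.php.
    # QSA appends the original query string (cheese=nice) after the
    # query string given in the substitution.
    RewriteRule ^(.*\.php)$ /index.php?requested=$1 [QSA,L]
    ```

    Without QSA, the substitution's own query string would replace the original one, and cheese=nice would be lost.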
     
    rodney88, Feb 28, 2007
    Will.Spencer likes this.
  6. Will.Spencer

    #6
    Aha! After many errors (putting the rules in the wrong place, putting the rules in the wrong order, not coding the subdirectory name into the replacement string, etc.), the whole ruleset is now working!

    Thanks rodney88! :)

    I should still rework the script itself, instead of working around it, but that's a task for another day.
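
    For anyone hitting the same errors, here is a rough sketch of what the subdirectory version might look like. The directory name "subdir" and the placement of the script's own rule are hypothetical, since the actual ruleset is not shown in this thread. It relies on two documented mod_rewrite behaviours: in a per-directory .htaccess the pattern is matched against the path relative to that directory, and a subdirectory's own rewrite rules replace the parent's rather than adding to them unless RewriteOptions Inherit is set.

    ```apache
    # Hypothetical file: /subdir/.htaccess
    RewriteEngine on

    # Strip the query string first, before any other rule in this file.
    RewriteCond %{QUERY_STRING} !^$
    # The pattern sees "page.html" (relative to /subdir/), so the
    # directory name has to be written into the replacement string.
    RewriteRule ^(.*)\.html$ /subdir/$1.html? [R=301,L]

    # The script's existing RewriteRule (not shown in the thread)
    # would follow here.
    ```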
     
    Will.Spencer, Mar 1, 2007