FAQ: mod_rewrite, 301 redirects, and optimizing Apache.

Discussion in 'Apache' started by Nintendo, Jul 30, 2005.

  1. #1
    [size=+4]mod_rewrite[/size]

    [size=+2]Introduction.[/size]

    Welcome to mod_rewrite, the Swiss Army Knife of URL manipulation! Despite the tons of examples and docs, mod_rewrite is voodoo!

    This module uses a rule-based rewriting engine (based on a regular-expression parser) to rewrite requested URLs on the fly. It supports an unlimited number of rules and an unlimited number of attached rule conditions for each rule to provide a really flexible and powerful URL manipulation mechanism. The URL manipulations can depend on various tests, for instance server variables, environment variables, HTTP headers, time stamps and even external database lookups in various formats can be used to achieve a really granular URL matching.

    This module operates on the full URLs (including the path-info part) both in per-server context (httpd.conf) and per-directory context (.htaccess) and can even generate query-string parts on result. The rewritten result can lead to internal sub-processing, external request redirection or even to an internal proxy throughput.

    This module was invented and originally written in April 1996. [1]


    [size=+2]How to change your URLs from dynamic to search engine friendly static URLs using mod_rewrite.[/size]


    Start with one example of the dynamic URL and the static URL you want it to become. For example

    http://www.domain.com/cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon
    and
    http://www.domain.com/store/Nintendo/4867635/Pokemon.html

    Now that you got both URLs, make a domain.com/.htaccess file starting with...

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^

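    Depending on the server, you might not need the first two lines. mod_rewrite in .htaccess only needs FollowSymLinks to be enabled; Options +Indexes turns on directory listings and is not required for rewriting.

    Right after RewriteRule ^ enter the static URL, then a $, a space, and then the original URL (without the domain part for both URLs).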

    You now got...

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^store/Nintendo/4867635/Pokemon.html$ cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon

    In the first URL (the static one), wherever the URL will change, replace that part with a (.*) (Nintendo, 4867635 and Pokemon in the example above).

    Then after .html add a $ and add a \ before the .html
    If you have a hyphen (-) in the new static URL, add a \ before the hyphen, for example...

    RewriteRule ^store\-(.*)\-(.*)\.html$ cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon

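    The \ before the dot makes it match a literal dot instead of any character. A hyphen outside square brackets is already literal, so escaping it is optional, but if you leave the \ out you might get an Internal Server Error message, depending on the server's Apache version.

    Now in the second URL (the original script URL), wherever the URL changes, replace the first changing value with $1, the next with $2 and so on. Then add an [L] at the very end, with a space before the [L].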

    You now got...

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^store/(.*)/(.*)/(.*)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]

    Save the .htaccess file and upload it at domain.com/.htaccess and your static URLs will now work.
    http://www.domain.com/store/Nintendo/4867635/Pokemon.html
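
    For anyone who wants to see what the three-variable rule does to a URL, here's a rough sketch in Python. It's only an illustration: Python's re module stands in for Apache's regex engine, and rewrite() is a made-up helper name, not anything Apache provides. The pattern has no leading slash because in .htaccess the rule sees the path without it.

```python
import re

# Stand-in for the three-variable store rule: each (.*) is one capture,
# and the captures are dropped into the script URL in order.
PATTERN = re.compile(r"^store/(.*)/(.*)/(.*)\.html$")
TARGET = "cgi-bin/store.cgi?section={0}&id={1}&item={2}"


def rewrite(path):
    """Return the script URL for a static path, or None if the rule does not match."""
    match = PATTERN.match(path)
    if match is None:
        return None
    return TARGET.format(*match.groups())


print(rewrite("store/Nintendo/4867635/Pokemon.html"))
# cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon
```

    A path that doesn't fit the pattern (say about.html) comes back as None here, which is the sketch's way of saying the rule would leave that request alone.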

    Here are some other examples...

    http://www.domain.com/cgi-bin/store.cgi?section=Nintendo&id=4867635
    RewriteRule ^store/(.*)/(.*)\.html$ cgi-bin/store.cgi?section=$1&id=$2 [L]

    http://www.domain.com/cgi-bin/store.cgi?section=Nintendo
    RewriteRule ^store/(.*)\.html$ cgi-bin/store.cgi?section=$1 [L]

    http://www.domain.com/cgi-bin/store.cgi
    RewriteRule ^index\.html$ cgi-bin/store.cgi [L]

    In this last example domain.com will show the index of the script. If the page shows nothing, try

    RewriteRule ^$ cgi-bin/store.cgi [L]


    With all the examples combined, you got...

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^store/(.*)/(.*)/(.*)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
    RewriteRule ^store/(.*)/(.*)\.html$ cgi-bin/store.cgi?section=$1&id=$2 [L]
    RewriteRule ^store/(.*)\.html$ cgi-bin/store.cgi?section=$1 [L]
    RewriteRule ^index\.html$ cgi-bin/store.cgi [L]

    Notice the order. If you list it as...

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^index\.html$ cgi-bin/store.cgi [L]
    RewriteRule ^store/(.*)\.html$ cgi-bin/store.cgi?section=$1 [L]
    RewriteRule ^store/(.*)/(.*)\.html$ cgi-bin/store.cgi?section=$1&id=$2 [L]
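    RewriteRule ^store/(.*)/(.*)/(.*)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]

    then the wrong rule fires and it won't work! (.*) also matches slashes, so ^store/(.*)\.html$ swallows store/Nintendo/4867635/Pokemon.html whole, and [L] stops the other rules from being tried. List the line with the most variables first, then the second most and so on.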
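
    Here's a small Python sketch of why the order matters. It's a simulation under assumptions: Python's re module plays the part of Apache's regex engine, "first match wins" imitates the [L] flag, and first_match() is a hypothetical helper written just for this example.

```python
import re

# (pattern, target) pairs standing in for the store RewriteRules.
RULES_GOOD = [
    (r"^store/(.*)/(.*)/(.*)\.html$", r"cgi-bin/store.cgi?section=\1&id=\2&item=\3"),
    (r"^store/(.*)/(.*)\.html$", r"cgi-bin/store.cgi?section=\1&id=\2"),
    (r"^store/(.*)\.html$", r"cgi-bin/store.cgi?section=\1"),
]
# Same rules, fewest variables first.
RULES_BAD = list(reversed(RULES_GOOD))


def first_match(rules, path):
    """Apply the first rule whose pattern matches, then stop (like [L])."""
    for pattern, target in rules:
        if re.match(pattern, path):
            return re.sub(pattern, target, path)
    return None


path = "store/Nintendo/4867635/Pokemon.html"
print(first_match(RULES_GOOD, path))
# cgi-bin/store.cgi?section=Nintendo&id=4867635&item=Pokemon
print(first_match(RULES_BAD, path))
# cgi-bin/store.cgi?section=Nintendo/4867635/Pokemon
```

    In the second case the one-variable rule grabs the whole path as the section. Writing ([^/]+) instead of (.*) would keep a capture from crossing a slash, which should make the order much less touchy.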

    [size=+2]Can I have the .htaccess in a directory?[/size]

    Yes.

    In the above example, for having it at domain.com/store/.htaccess, change the code to...

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /store/
    RewriteRule ^index\.html$ /cgi-bin/store.cgi [L]
    RewriteRule ^(.*)/(.*)/(.*)\.html$ /cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
    RewriteRule ^(.*)/(.*)\.html$ /cgi-bin/store.cgi?section=$1&id=$2 [L]
    RewriteRule ^(.*)\.html$ /cgi-bin/store.cgi?section=$1 [L]

    You moved store/ up to the RewriteBase line and added / before cgi-bin. If the script was in /store/store.cgi you would have had store/ instead of cgi-bin/ and then just got rid of it, to look like...

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /store/
    RewriteRule ^index\.html$ store.cgi [L]
    RewriteRule ^(.*)/(.*)/(.*)\.html$ store.cgi?section=$1&id=$2&item=$3 [L]
    RewriteRule ^(.*)/(.*)\.html$ store.cgi?section=$1&id=$2 [L]
    RewriteRule ^(.*)\.html$ store.cgi?section=$1 [L]

    The URL to the index of the store will be domain.com/store/

    [size=+2]Ack!!! Now it's messing up the rest of my site.[/size]

    If you have domain.com/index.html for example, make sure your mod_rewrited URLs use another extension, like .htm or .shtml.

    [size=+2]The original script URLs don't have the product name in the URL. Can I add the product name to the URL?[/size]

    Yes! If you can change the script to put the product names in the URL, or edit the links to link to them, yes you can. Here's an example. Notice there are two (.*)'s and no $2.

    RewriteRule ^(.*)/(.*)\.html$ cgi-bin/file.cgi?Item=$1 [L]

    Just edit the script links, or the links in the static page, to link to domain.com/whatever/PRODUCT_NAME.html, so the product name shows up where the last (.*) is in the .htaccess code.

    [size=+2]But how can I get rid of special characters or spaces?[/size]

    For Perl, you can do search and replaces, for example...

    $value =~ s/ /_/g;
    $value =~ s/\?//g;
    or
    $value =~ s/[^\w\d\-_. ]//g;

    which gets rid of almost everything but letters and numbers (it also keeps hyphens, underscores, dots and spaces). Just make sure it only changes the URL and not the content. As for PHP or ASP, I don't know how to do it there.
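
    For comparison, here's roughly the same clean-up sketched in Python. This is just one way to do it, not the method any particular script uses; clean_for_url() is a made-up name, and unlike the last Perl line it turns spaces into underscores first instead of keeping them.

```python
import re


def clean_for_url(value):
    """Spaces become underscores, then anything odd is dropped."""
    value = value.replace(" ", "_")
    # Keep ASCII letters, digits, underscore, hyphen and dot; remove the rest.
    return re.sub(r"[^\w\-.]", "", value, flags=re.ASCII)


print(clean_for_url("Pokemon: Gold & Silver?"))
# Pokemon_Gold__Silver
```

    Note the double underscore where the & used to be. If that bothers you, a second substitution could collapse repeated underscores.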

    [size=+2]Can I rewrite a sub-domain to a directory?[/size]

    Yes. Here's the code mnemtsas came up with...

    xxxxx.domain.com

    to

    www.domain.com/XXXXX/

    RewriteCond %{HTTP_HOST} ^(www\.)?xxxxx\.domain\.com [NC]
    RewriteCond %{REQUEST_URI} !^/XXXXX/.*
    RewriteRule ^(.*) /XXXXX/$1 [L]
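
    A note on the host pattern, with a quick Python check (again only a simulation, Python's re module standing in for Apache's regex engine, and the host names are placeholders). Square brackets make a character class, so a pattern such as [www\.]* means "any run of w and dot characters", while the group (www\.)? means "an optional www. in front".

```python
import re

# Character class: any run of "w" and "." characters.
LOOSE = re.compile(r"^[www\.]*xxxxx\.domain\.com", re.IGNORECASE)
# Optional group: either nothing or exactly "www." in front.
STRICT = re.compile(r"^(www\.)?xxxxx\.domain\.com", re.IGNORECASE)

for host in ("xxxxx.domain.com", "www.xxxxx.domain.com", "w..w.xxxxx.domain.com"):
    print(host, bool(LOOSE.match(host)), bool(STRICT.match(host)))
# xxxxx.domain.com True True
# www.xxxxx.domain.com True True
# w..w.xxxxx.domain.com True False
```

    Both forms accept the two hosts you care about; the group form just refuses the junk ones.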

    [size=+2]Does .htaccess increase server load?[/size]

    I have yet to ever see it increase server load on my dedicated server. IMO, that's just a rumor. I got about 30 domains with about 54 lines in the domain.com/.htaccess file and have yet to ever see it affect the server. The only effect I've ever got is getting GoogleBombed (Google chomping away at the static URLs so much that the server almost crashes or does crash!!!). Don't panic. This is why you have static URLs, to help search engines crawl your site.

    If you ever see high server loads or a slow server, try optimizing Apache.

    [size=+2]How do I optimize Apache?[/size]

    You need root shell access to the actual server (telnet or SSH).

    Edit your httpd.conf file.

    Here are the best settings I've found.

    Timeout 50
    KeepAlive On
    MaxKeepAliveRequests 120
    KeepAliveTimeout 10
    MinSpareServers 10
    MaxSpareServers 20
    StartServers 16
    MaxClients 125
    MaxRequestsPerChild 5000

    and then restart Apache. Even when I have massively HIGH server loads, the sites are fast. Once I had the server load above 100, which is EXTREMELY high, and the static pages loaded as if nothing was high!!

    Don't ask me how to do it. If you don't know what you're doing, don't mess with it. Ask your web host. Mess up and your sites can 'die' until it gets fixed! For example, simply pressing return can crash your sites until you go back and undo the return, getting it back to how it was before.

    [size=+2]How can I do a 301 redirect?[/size]

    At domain.com/.htaccess

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^whatever/(.*)$ http://www.domain.com/$1/ [R=301,L]
    or
    RewriteRule ^index\.htm$ http://www.domain.com/ [R=301,L]

    The second example only changes one URL.

    (.*) and $1 work the same way here as in the rewrites above, so you can easily change a lot of URLs with one line. The only change is the R=301 (Redirect 301) flag, which sends the browser to the new URL instead of quietly serving the page from the old one.
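
    To make the difference concrete, here's a tiny Python sketch of the first redirect rule. It's only a model under assumptions: redirect_for() is a made-up helper, and it returns the status code and target URL that the browser would be sent rather than actually sending anything.

```python
import re


def redirect_for(path):
    """Model of: ^whatever/(.*)$ -> http://www.domain.com/$1/ with R=301."""
    match = re.match(r"^whatever/(.*)$", path)
    if match is None:
        return None  # rule does not apply, no redirect
    return 301, "http://www.domain.com/" + match.group(1) + "/"


print(redirect_for("whatever/nintendo"))
# (301, 'http://www.domain.com/nintendo/')
```

    With a plain rewrite the visitor never sees the new address; with the 301 the address bar (and the search engine's index) ends up on the new URL.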

    [size=+2]Conclusion.[/size]

    Yes, mod_rewrite is voodoo, and it may look hard to learn, but it's not that hard. When I first tried to figure it out, I spent a day over at apache.org and hardly got anywhere (hence there is only one link there as the source to the introduction). I then posted over on the Amazon Associate board, someone gave me a few lines of code, I changed it a little and within a day I had a completely search engine friendly Amazon store using MrRat's script and my mod_rewrite hack, which, as you may know by now, completely revolutionized the Amazon AWS industry, until it drove Google insane! mod_rewrite rocks. If you got any URLs that have ?, =, or &, do mod_rewrite!
     
    Nintendo, Jul 30, 2005 IP
  2. aboyd

    aboyd Well-Known Member

    #2
    Wow, that's pretty nicely done. I'm off to visit your AWS link. Thanks!

    -Tony
     
    aboyd, Jul 30, 2005 IP
  3. Nintendo

    Nintendo ♬ King of da Wackos ♬

    #3
    Thanks. I couldn't get it any bigger if I tried to. The max a post can be is 10K, and the size of the post is exactly 10,000 characters!!! I had to knock off about 100 characters. :D:D:D

    It's also my first FAQ!
     
    Nintendo, Jul 30, 2005 IP
  4. expat

    expat Stranger from a far land

    #4
    Good post, just two remarks.

    (.*) allows zero or more of any character (letters, numbers, anything), so it could be exploited.

    ([A-Z]+) requires one or more capital letters
    ([a-zA-Z]+) requires one or more letters, upper or lower case, etc.

    Also try to avoid %20 (space), as regex sometimes has problems with it. An underscore or dash solves this; to allow these in the patterns above, use

    ([a-zA-Z_]+) or ([a-zA-Z-]+)
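
    A quick way to compare these patterns is to run a few names through each. The sketch below does that in Python; it's an approximation (Python's re module standing in for Apache's regex engine), and the candidate names and the accepted() helper are invented for the example.

```python
import re

CANDIDATES = ["Cheese", "", "blue-cheese", "blue cheese", "../etc/passwd"]
PATTERNS = {
    "(.*)": r"^(.*)\.html$",
    "([A-Z]+)": r"^([A-Z]+)\.html$",
    "([a-zA-Z]+)": r"^([a-zA-Z]+)\.html$",
    "([a-zA-Z-]+)": r"^([a-zA-Z-]+)\.html$",
}


def accepted(pattern, name):
    """True if name + '.html' would match the rule's pattern."""
    return re.match(pattern, name + ".html") is not None


for label, pattern in PATTERNS.items():
    print(label, [name for name in CANDIDATES if accepted(pattern, name)])
# (.*) ['Cheese', '', 'blue-cheese', 'blue cheese', '../etc/passwd']
# ([A-Z]+) []
# ([a-zA-Z]+) ['Cheese']
# ([a-zA-Z-]+) ['Cheese', 'blue-cheese']
```

    Note that ([A-Z]+) rejects even Cheese, because of the lower-case letters, unless the rule is made case-insensitive.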


    Also, using mod_rewrite may screw up finding your CSS and pictures, because relative links get resolved against the static URL's directory.
    To avoid this I use an HTML base tag:

    <head>
    <!-- … -->
    <base href="http://www.mydomain.com/" />
    <!-- … other head tags … -->
    </head>
    As a template for multiple sites, or for a development environment and a live one, I use PHP:

    <head>
    <!-- … -->
    <base href="http://<?php echo $_SERVER['HTTP_HOST'] . "/" ?>" />
    <!-- … other head tags … -->
    </head>


    Expat
     
    expat, Jul 30, 2005 IP
  5. THT

    THT Peon

    #5
    I can never get my head round these bloody things.... great guide...

    Could someone explain this....

    http://www.domain.com/index.php?cat=Cheese
    should be:
    http://www.domain.com/Cheese.html
    as a static URL...

    What's wrong with this:

    Options +Indexes
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^(.*)\.html$ index.php?cat=$1 [L]

    Thanks
     
    THT, Jul 30, 2005 IP
  6. Nintendo

    Nintendo ♬ King of da Wackos ♬

    #6
    Would

    RewriteRule ^section/([A-Z]+)$ cgi-bin/file.cgi?whatever=$1 [L]

    be correct for section/ANYTHING?

    As soon as I changed everything from (.*) to ([A-Z]+), it generated a dead link.

    When I change it to
    RewriteRule ^section/([a-zA-Z-]+)$ cgi-bin/file.cgi?SearchIndex=$1 [L]
    some kinds of links work.

    That's why I've always used (.*), since that's the only way I've got everything to work, and any time I see people get mod_rewrite codes with those other marks, quite often their reply is 'It doesn't work'!

    What do you get when you try that? Internal Server Error?? It looks correct.
     
    Nintendo, Jul 30, 2005 IP
  7. expat

    expat Stranger from a far land

    #7
    You will have a problem when cat is empty:
    mydomain.com/.html is not very valid.

    Use ([a-zA-Z]+)

    cheers
    Expat
     
    expat, Jul 30, 2005 IP
  8. expat

    expat Stranger from a far land

    #8
    ...RewriteRule ^section/([a-zA-Z-]+)$ cgi-bin/file.cgi?SearchIndex=$1 [L]...


    Depends a bit on how the original links look. If you have %20 (blank) in them, or any weird characters or numbers, stay with (.*).

    When I design I avoid numbers, _, - etc., or I keep the - to have dashed URLs: rather than /dir1/dir2/ etc. I like page-dir1-dir2.htm

    Expat
     
    expat, Jul 30, 2005 IP
  9. kalius

    kalius Peon

    #9
    Very good nintendo, Thanks.
    Still I cannot explain the weirdness of .htaccess on my windows setup.
     
    kalius, Jul 30, 2005 IP
  10. Nintendo

    Nintendo ♬ King of da Wackos ♬

    #10
    Arg!!! The script I use has everything. It's an Amazon script, so you got over a million different product names, and all the categories... I got the Perl search and replace getting rid of the special characters, though they can have both - and _ and numbers. So the (.*) might be the only option there.

    I don't think Windows can have mod_rewrite. You got some other way to change the URLs and I don't know anything about that. ISS Rewrite.
     
    Nintendo, Jul 30, 2005 IP
  11. kalius

    kalius Peon

    #11
    On my computer I have Apache + PHP on Windows XP. I can get mod_rewrite to work, but the .htaccess needs some extra stuff in it, no explanation for it..
     
    kalius, Jul 30, 2005 IP
  12. Nintendo

    Nintendo ♬ King of da Wackos ♬

    #12
    Is ([0-9a-zA-Z.'_\-]+) secure? After letting some extra characters in, all the URLs started working.

    ([^.]+) also works, and ([0-9a-z.'_\-]+) does when it ends with [NC].
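
    The two patterns are not equally strict, which a short Python check can show. Same caveats as any simulation: Python's re module is standing in for Apache's regex engine, and the sample strings are made up.

```python
import re

EXPLICIT = re.compile(r"^([0-9a-zA-Z.'_\-]+)$")  # only the listed characters
NOT_DOT = re.compile(r"^([^.]+)$")               # anything except a dot

samples = ["Pokemon_Gold-2", "Zelda's.Quest", "a/b", "two words"]
for text in samples:
    print(text, bool(EXPLICIT.match(text)), bool(NOT_DOT.match(text)))
# Pokemon_Gold-2 True True
# Zelda's.Quest True False
# a/b False True
# two words False True
```

    So ([^.]+) still lets slashes and spaces through, while the explicit list only lets in what you named.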
     
    Nintendo, Jul 30, 2005 IP
  13. vprp

    vprp Peon

    #13
    Hi, I applied this to my page and it works well since I can now go to the html file. The only problem is, when I access my site, it still goes to the php file. It doesn't automatically go to html files. The only way I can access the html files is by typing the exact url. How do I do this?
     
    vprp, Aug 2, 2005 IP
  14. Nintendo

    Nintendo ♬ King of da Wackos ♬

    #14
    You have to edit the script to change the link URLs! All mod_rewrite does is make the static URLs work.
     
    Nintendo, Aug 2, 2005 IP
  15. vprp

    vprp Peon

    #15
    What happens if you have two URLs with the same number of variables?
     
    vprp, Aug 3, 2005 IP
  16. Nintendo

    Nintendo ♬ King of da Wackos ♬

    #16
    RewriteRule ^store/(.*)/(.*)/(.*)\.html$ cgi-bin/store.cgi?section=$1&id=$2&item=$3 [L]
    RewriteRule ^shop/(.*)/(.*)/(.*)\.html$ cgi-bin/store.cgi?category=$1&id=$2&product=$3 [L]
    RewriteRule ^deals/(.*)/(.*)/(.*)\.html$ cgi-bin/store.cgi?savings=$1&number=$2&cost=$3 [L]

    Give each of them different URLs (store, shop, and deals directories in this example).
     
    Nintendo, Aug 3, 2005 IP
  17. expat

    expat Stranger from a far land

    #17
    Like Nintendo said, mod_rewrite just changes the acceptable appearance.

    A rewrite of mypage.htm?var=1&var=2 to mypage/1/2.htm or mypage-1-2.htm

    doesn't change anything, it just makes it "static", so
    mypage.htm?var=1&var=2 in a link will still work!!
    To make it SEO friendly you then have to change all these links in your site from mypage.htm?var=1 to mypage-1.htm or whatever your rewrite uses.

    So be aware these pages do not disappear and are still usable if addressed or linked to.

    Also there is no need to change endings: .php, .htm, .html and .asp are all treated the same, so there is no mileage in changing from .php to .htm

    Also hosts are normally set up to hunt for index.xxx as the initial or home page; some also check home.xxx etc.

    So if they find index.php that's what they use, and this is after any rewrite.

    A last word: if you embark on mod_rewrite and make your pages appear static to gain indexing, make the title and description vary (use a script). There is no mileage in having 200 static pages all with the same title and description.

    Expat
     
    expat, Aug 3, 2005 IP
  18. Nintendo

    Nintendo ♬ King of da Wackos ♬

    #18
    Here's how to find out if you have mod_rewrite on the server.

    At domain.com/.htaccess have

    XBitHack Full
    Options +FollowSymlinks
    RewriteEngine on
    RewriteBase /
    RewriteRule ^index\.page$ index.html [L]

    (If the index file isn't index.html, change that to whatever it is.)

    Then go to domain.com/index.page
    If the index page shows up then you got mod_rewrite.
     
    Nintendo, Aug 4, 2005 IP
  19. Liminal

    Liminal Peon

    #19
    Hey Nintendo,

    Great tutorial!

    I do some 301s on my site but never bothered with

    Options +Indexes
    Options +FollowSymlinks

    Is that necessary at all? Note: my site is configured as virtual host in apache and I edit via .htaccess and have no idea what's in httpd.conf

    Cheers,
    James
     
    Liminal, Aug 12, 2005 IP
  20. Nintendo

    Nintendo ♬ King of da Wackos ♬

    #20
    If it works without it, then keep it off. Some servers, like mine, required them to use mod_rewrite. And if anything .htaccess-wise causes an Internal Server Error, taking them off can also be something to try.
     
    Nintendo, Aug 12, 2005 IP