htaccess Remove Multiple Characters from URL

Discussion in 'Apache' started by m0nk3y, Dec 10, 2012.

  1. #1
    Hi Guys

    Hopefully someone can see what I'm doing wrong, but here's the story...

    My current site URLs are auto-generated by the ecommerce software from the product and category names, so if a product or category name includes a non-alphanumeric character, it ends up percent-encoded in the URL, which is a pain.

    EG: mysite.com/Shop/Furniture-Set-Large-Table%2C-4-Chairs.html
    Code (markup):
    I am moving to a new ecommerce solution, which also auto-generates the URLs from the product name but is clever enough to remove all non-alphanumeric characters. It also converts the URL to lowercase, and I have found an htaccess solution for redirecting uppercase to lowercase. The new URLs also lack the 'Shop' part, which I have likewise solved via htaccess.

    EG: mysite.com/furniture-set-large-table-4-chairs.html
    Code (markup):
    To remove the 'Shop' part:

    RedirectMatch 301 ^/Shop/(.*)$ http://www.mysite.com/$1
    Code (markup):
    To replace uppercase with lowercase to prevent a 404 error:

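    # The "lc" map must be defined in the server or virtual-host config
    # (RewriteMap lc int:tolower); RewriteMap is not allowed in .htaccess.
    # Redirect any URL containing an uppercase letter, skipping static files.
    RewriteCond %{REQUEST_URI} [A-Z]
    RewriteCond %{REQUEST_FILENAME} !\.(?:png|gif|ico|swf|jpg|jpeg|js|css|php|pdf)$
    RewriteRule (.*) ${lc:http://www.mysite.com/$1} [R=301,L]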
    Code (markup):
    These both work perfectly.

    So I need an htaccess rule, or possibly several, to remove these encoded characters from the URL. I don't need to replace them with anything, just remove them: the old software creates the URL as "Table%2C-4-Chairs", so only the %2C needs to be removed.

    I need to remove certain character encodings from the URL, such as:

    comma (%2C), apostrophe (%27), colon (%3A), etc.

    Can anyone advise a suitable htaccess rule for this?
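
    For reference, one possible starting point is sketched below. It is untested and is not a confirmed solution. It assumes the .htaccess file sits in the document root, and it relies on mod_rewrite matching the RewriteRule pattern against the already-decoded path, so %2C, %27 and %3A appear to the pattern as a literal comma, apostrophe and colon. The character class is only an example and would need extending for other characters.

    RewriteEngine On
    # Untested sketch: the pattern sees the %-decoded path, so , ' and : are matched literally.
    # Each pass strips one such character and issues a 301; a URL with several
    # of them is cleaned up over a short chain of redirects.
    RewriteRule ^(.*)[,':](.*)$ /$1$2 [R=301,L]
    Code (markup):
    How this interacts with the 'Shop' and lowercase redirects above would still need checking, since the order in which the rules fire affects how many redirects a single old URL goes through.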

    Thanks in advance.
     
    m0nk3y, Dec 10, 2012