Tutorial: Changing CMS | Redirect old URLs to a sub-folder without affecting new URLs

Discussion in 'Site & Server Administration' started by wtg, Jan 28, 2012.

  1. #1
    Scenario: I was using WordPress as the CMS for one of my websites and wanted to switch to a different CMS. At the same time, I didn't want to lose the 2000+ pages indexed in Google or the traffic those pages bring in.

    Motive of this post: Someone in a similar situation can take advantage of it.


    Difficulties: We can easily move the WordPress install to a subfolder such as archive and then write a global redirect rule that sends www.domain.com/page-url to www.domain.com/archive/page-url. But then every new page on the new CMS follows the same rule, so www.domain.com/new-cms-page redirects to /archive/new-cms-page (which does not exist).
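
    To make the difference concrete, here is a toy Python model of the two strategies. It is only an illustration and not how Apache evaluates rules; the paths in OLD_PATHS and both function names are made up for this sketch.

```python
# Toy model only: this does not reproduce Apache's rule matching.
OLD_PATHS = {"/pageurl1", "/pageurl2"}  # hypothetical old WordPress paths


def global_rule(path):
    """Catch-all strategy: push every path outside /archive into /archive."""
    if path.startswith("/archive/"):
        return None  # already in the archive, nothing to do
    return "/archive" + path


def per_url_rules(path):
    """Explicit strategy: redirect only paths known to be old pages."""
    if path in OLD_PATHS:
        return "/archive" + path
    return None  # homepage and new CMS pages are left alone


if __name__ == "__main__":
    for path in ["/pageurl1", "/new-cms-page", "/"]:
        print(path, "| global:", global_rule(path), "| per-URL:", per_url_rules(path))
```

    The catch-all version sends /new-cms-page and the homepage into the archive as well, which is exactly the problem described above; the explicit list leaves them untouched.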

    Expectation: Move WordPress to a new folder in such a way that all old URLs redirect to the new folder, while the homepage and the URLs created by the new CMS do not redirect.
    I figured out this can be achieved by adding a redirect rule for each individual URL (2000+ URLs).

    Expected output for .htaccess:

    redirect 301 /pageurl1 http://www.domain.com/archive/pageurl1
    redirect 301 /pageurl2 http://www.domain.com/archive/pageurl2
    redirect 301 /pageurl3 http://www.domain.com/archive/pageurl3




    Writing a redirect rule by hand for each URL would take ages for this website. I figured out a quick way to do it; let's see how I created an .htaccess file with 2212 redirect rules.

    1. Download sitemap.xml to C:\Temp
    2. Open CMD and switch to C:\Temp
       a. Start -> Run
       b. cmd {Enter}
       c. cd /d C:\Temp
    3. sitemap.xml contains all the URLs, but with extra information. Let's get just the URLs.
       a. Type the command below in CMD exactly as shown and press Enter ("type" is part of the command)
       b. type sitemap.xml | find /I "www." > urls.txt
       c. It will output a text file, urls.txt, with the list of URLs of your website.
    4. The URL list will now contain the URLs, but with something extra before and after each one, like this

    <loc>http://www.yourdomain.com/sample-url-2006</loc>
    5. We need to remove this extra markup.
    6. Open urls.txt in Notepad, press Ctrl+H (Find and Replace) and replace <loc> with nothing; do the same for </loc>.
    7. Now we have a clean list of just the URLs, like this
    http://www.yourdomain.com/sample-url-2006

    8. We are ready to write the redirect rules, but first we have to remove the domain from the URLs.
    9. Replace http://www.yourdomain.com with "Redirect 301 " (without the quotes, keeping the space at the end).
    10. This strips your domain name from each URL and puts the first part of the redirect rule in its place, so each line now reads like Redirect 301 /sample-url-2006
    11. Now copy everything and paste it into the first column (column A) of an Excel worksheet.
    12. Go back to Notepad and replace "Redirect 301 " with http://www.yourdomain.com/archive (this makes the second part of the redirect rule).
    13. Copy everything and paste it into column B of the Excel worksheet.
    14. Now go to cell C1 and type this (the " " puts the required space between the two parts)
    =CONCATENATE(A1," ",B1)
    15. Cell C1 will show the first and second parts of our redirect rule combined, like this
    redirect 301 /pageurl http://www.domain.com/archive/pageurl

    16. Copy cell C1, select the remaining rows of column C and paste.
    17. Column C now holds a redirect rule for every URL.
    18. Copy column C and paste it into the .htaccess file at the root of the domain (append it at the end). Delete any rule whose path is just / (the homepage entry from the sitemap); Redirect matches by path prefix, so that rule would redirect every URL on the site.
    19. Move your WordPress install to the archive folder (the folder name can be anything).
    20. Note that moving a WordPress install needs additional steps, which are not covered in this tutorial.
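
    If you would rather not go through Notepad and Excel, the same steps can be sketched as a short Python script. This is not the method I used above, just a minimal alternative under a few assumptions: the sitemap uses the standard sitemaps.org namespace, the folder is called archive, and the function name rules_from_sitemap and the sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Assumption: the sitemap declares the standard sitemaps.org namespace.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def rules_from_sitemap(xml_text, archive="archive"):
    """Return one 'Redirect 301' line per <loc> URL, skipping the homepage."""
    rules = []
    seen = set()
    root = ET.fromstring(xml_text)
    for loc in root.iter("{%s}loc" % SITEMAP_NS):
        if not loc.text:
            continue
        parts = urlsplit(loc.text.strip())
        path = parts.path
        if path in ("", "/") or path in seen:
            # A rule for "/" would match every URL on the site, so skip it.
            continue
        seen.add(path)
        target = "%s://%s/%s%s" % (parts.scheme, parts.netloc, archive, path)
        rules.append("Redirect 301 %s %s" % (path, target))
    return rules


# Hypothetical two-entry sitemap used only to demonstrate the function.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.yourdomain.com/</loc></url>
  <url><loc>http://www.yourdomain.com/sample-url-2006</loc></url>
</urlset>"""

if __name__ == "__main__":
    # Prints: Redirect 301 /sample-url-2006 http://www.yourdomain.com/archive/sample-url-2006
    print("\n".join(rules_from_sitemap(SAMPLE)))
```

    To run it on a real site, read your downloaded sitemap.xml into a string, pass it to the function and append the printed lines to .htaccess. As with the manual method, try it on a test install first.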

    In the end, visitors coming from search engines to old URLs are redirected to the archived pages, and visitors looking for your homepage land on your new homepage.


    I tried to explain it as well as I could. If you still have a hard time understanding this, try it on a test install. I don't mind replying to a PM if you run into issues in a similar situation.
     