2 questions: how long can file be? regular expressions?

Discussion in 'Apache' started by yulsa, Aug 28, 2010.

  1. #1
    Hello, looking for a redirects specialist. I have two questions:

    1. How long can .htaccess be? Our platform provider changed naming structures of our stories, so I can't do a simple mod-rewrite. But redirecting each article would literally be thousands of redirects. Will that slow down the server? Is there a limit on how many redirects there can be?

    2. Does redirecting always have to use regular expressions? I have a few pages that need to be redirected that look similar to this: www.domain.com/SOMETHING?i=1116423256281&b=1116423256281&t=/Default/gateway&xref= and I need to redirect it to the home page, www.domain.com. What would be the right code?

    Thank you for any suggestions.
     
    yulsa, Aug 28, 2010 IP
  2. tolra

    tolra Active Member

    Messages:
    515
    Likes Received:
    36
    Best Answers:
    1
    Trophy Points:
    80
    #2
    1. It might be better to write a PHP script that is called when a file is not found, converts the old name to the new name, and then issues a 301. That way it only runs for missing pages, and Apache doesn't have to load 1000s of lines of .htaccess. It might also save hours of rule writing.

    
    RewriteEngine On
    # Only act on requests that match no existing file or directory
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule ^(.*)$ redirectoldtonew.php [L]
    
    The above invokes redirectoldtonew.php if a request for a non-existent file or directory happens, which should be the case for your old pages.

    However, this does depend on the way your script works, its rewrite rules, the old URL patterns and the new patterns. You don't provide enough information for me to suggest much more than this.

    Either way, I'd look for a different solution than 1000s of rewrites in a .htaccess.

    2. You can use %{QUERY_STRING} in a RewriteCond to match the query string, assuming you need it to decide whether to redirect to the home page rather than to SOMETHING.
     
    tolra, Aug 28, 2010 IP
  3. yulsa

    yulsa Peon

    Messages:
    19
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Hi Tolra. Thank you for your responses.

    1. I'm not sure what you mean about the PHP. We use a proprietary CMS that isn't PHP-based. Basically, here is the problem: the developers used to export all the stories on our site into a folder and name them according to a specific nomenclature: /folder/subfolder1/subfolder2/article_name_based_on_title_eng.html

    Now they have changed the system and are going to pull that old folder down. The stories are now in a whole new folder, sometimes not following the same subfolder structure (though often they do). The problem is that they also changed the names, so the same story as above will now reside at /en/subfolder1/subfolder2/article_name_shorter_storyID.html

    At this point I have only thought of two redirecting options. One is mod_rewrite at the folder level: then all folders will redirect with a 301, but all the stories will be 404 errors (I really don't want to do that, because we are literally talking about thousands of pages). Alternatively, I could ask our programmers to create a list, built from each storyID, that gives its old address and its new address, and then redirect old to new for each story. But that really spells thousands of redirects in .htaccess. Will that affect server performance?

    2. Could you explain what you mean? "SOMETHING" is always the same word -- name of the software we run on.
     
    yulsa, Aug 28, 2010 IP
  4. tolra

    #4
    1. It doesn't matter what language you write it in; you just create a script to map the old URLs to the new ones. I only said PHP because that's the most common.

    The script can read the database, so it can be coded with an understanding of the old URL format, look up the data and then redirect to the new format.

    2. As SOMETHING is always the same word then:

    If you always want to send SOMETHING to the home page no matter what the query string is, then just rewrite ignoring the query, e.g.
    
    RewriteRule ^SOMETHING$ http://yoursite.com/? [R=301,L]
    
    If you only want to send to the home page when the query string matches some pattern, but not in every case, then you need to use %{QUERY_STRING} with RewriteCond.
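
    As a rough sketch of that second case, assume the redirect should only fire when the query string carries the t=/Default/gateway parameter from the example URL. Which parameter actually matters is a guess here, and yoursite.com is a placeholder:

    ```apache
    RewriteEngine On
    # Assumption: only requests whose query string contains t=/Default/gateway
    # should go to the home page; adjust the pattern to the parameter you care about.
    RewriteCond %{QUERY_STRING} (^|&)t=/Default/gateway(&|$)
    # The trailing ? on the target drops the original query string from the redirect.
    RewriteRule ^SOMETHING$ http://yoursite.com/? [R=301,L]
    ```

    Requests for SOMETHING with any other query string fall through this rule untouched, so they can be handled separately.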
     
    tolra, Aug 29, 2010 IP
  5. yulsa

    #5
    Tolra:

    I'm sorry if my questions are coming off as dumb -- I only have a very basic idea of programming/servers, so I'm not fully understanding the solution you're offering. But I want to understand it, because it's the first new alternative anyone has suggested in months (and I really need to wrap my head around it if I'm going to ask our developers to implement it).

    Can you please explain to me where exactly the script would reside? How would it do 301 redirects (for SEO purposes) and would it affect .htaccess at all? I thought you can only do 301 redirects from .htaccess or server config file. Are you saying that there are other ways to redirect? I want to avoid robots finding thousands of 404 errors.

    I really appreciate your suggestions.
     
    yulsa, Aug 29, 2010 IP
  6. tolra

    #6
    I don't know how your script is configured in .htaccess, how its database is structured, etc., so I can't be specific.

    From the examples you gave me of the URLs it looks like all the new documents sit under an en folder, therefore you can rewrite en/.* to be sent to the script that creates the new documents.

    You can now add to the .htaccess so that a request for a file that doesn't exist invokes something. Taking your old URL example: when someone tries to access /folder/subfolder1/subfolder2/article_name_based_on_title_eng.html, it doesn't start with en/, so it's not sent to the new script, and as it doesn't exist on the disk you can catch it with:

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteRule ^(.*)$ redirectoldtonew.php [L]

    That is, if file not found run the script redirectoldtonew.php (I'm using PHP as that's what I know, the language is unimportant).

    So at this point you've modified .htaccess, new URLs are working, and old ones are complaining that redirectoldtonew.php is not found.

    Now you create the redirectoldtonew.php script. In PHP you can send a header back to the browser, e.g. a 301 with the new location, or a 404, or whatever you need.

    So in the script you check that the URL matches the old URL form and that the article exists in the database. If it doesn't, you return a 404 header. Otherwise you calculate the new location of the article using the same rules and database information as the main script, and once you have this you send a 301 header along with the URL you just calculated.

    The result is that new URLs just work, old URLs with no corresponding new article return a 404, and old URLs whose articles do have a new location get a 301 redirect.

    You can do the same with a custom 404 error document as a script rather than looking for non-existent files.
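
    A minimal sketch of that custom 404 variant, assuming the script sits in the site root as /redirectoldtonew.php (the placeholder name from the earlier example) and that the host allows ErrorDocument in .htaccess. With a local path, Apache runs the script internally and passes the originally requested path in the REDIRECT_URL environment variable, so the script can look up the old URL and answer with a 301 or a 404:

    ```apache
    # Hand every request that would otherwise 404 to the mapping script.
    # The path must be local (start with /) so Apache runs the script internally
    # instead of sending the browser an external redirect to it.
    ErrorDocument 404 /redirectoldtonew.php
    ```

    The script still has to set the response status itself; if it sends nothing, the request stays a 404.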
     
    tolra, Aug 29, 2010 IP
  7. yulsa

    #7
    Thank you so much!

    I'm going to show this to our programmers on Monday and see if they can implement this on the site. Really appreciate your help.
     
    yulsa, Aug 29, 2010 IP