URL rewrite for German characters

Discussion in 'Apache' started by frenchi1966, Nov 20, 2008.

  1. #1
    Hi,

    Can you help me rewrite German characters in URLs as follows:

    ä => ae
    Ä => Ae
    ö => oe
    Ö => Oe
    ü => ue
    Ü => Üe
    ß => ss

    Currently, if somebody types one of these characters, the system only returns a "_" for it.

    Example:
    now: /Wien_n_he_Schloss_Sch_nbrunn.html

    then: /Wien_naehe_Schloss_Schoenbrunn.html
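
    The mapping above (with Ü => Ue) could be sketched in Python roughly as follows. This is only an illustration under assumptions: the function name `transliterate_german` is hypothetical, and it presumes the replacement happens in the script or CMS that generates the URL, before the "_" substitution takes place.

```python
import unicodedata

# Mapping taken from the list in this thread (with the corrected Ü => Ue).
UMLAUT_MAP = {
    "ä": "ae", "Ä": "Ae",
    "ö": "oe", "Ö": "Oe",
    "ü": "ue", "Ü": "Ue",
    "ß": "ss",
}
_TABLE = str.maketrans(UMLAUT_MAP)


def transliterate_german(text: str) -> str:
    """Replace German umlauts and ß with their ASCII digraphs."""
    # Normalize to NFC first so that "o" + combining diaeresis is
    # treated the same as the precomposed "ö".
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(_TABLE)
```

    Calling `transliterate_german("Wien_nähe_Schloss_Schönbrunn.html")` would then yield the desired form of the example URL.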


    Please advise what I have to do to resolve this issue, and whether I have to put it at the top.

    Thanks and best regards,

    Manfred
     
    frenchi1966, Nov 20, 2008
  2. frenchi1966

    #2
    Typo in my post above; it should read:

    Ü => Ue
     
    frenchi1966, Nov 20, 2008
  3. hans

    #3
    1. From an SEO/SEF point of view, your solution looks unfortunate.
    2. Go strictly UTF-8 across all German (or other Latin-script language) texts.
    3. Look at what Google does with proper UTF-8 (on the server PLUS in the text editor!): it effectively drops the umlaut, so ö > o, ü > u, etc., and in Google search results that gives the SAME SERPs with or without umlauts!

    With your proposed rewrite solution above, however, you lose the SEF aspect of your URL completely.

    Finally, there is NO reason why you should have ANY umlauts at all in your URLs. If such URLs are written automatically by others or by scripts/CMS, then strip the umlaut mark from each letter of your UTF-8 URL text to get plain ASCII URLs. That makes it all much simpler and more stable for the coming years, for most current major search engines and all current minor ones!

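    Note that in precomposed UTF-8 an umlaut is two non-ASCII bytes (ö is 0xC3 0xB6), so you cannot simply cut off a byte. Only in decomposed Unicode (NFD) does the standard ASCII letter (a, o or u) come first, followed by a combining mark (U+0308) that puts the umlaut on top of it; normalize to NFD first and then drop that mark.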
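
    A minimal Python sketch of stripping umlauts down to their base letters, offered as an illustration and not as what Google or any CMS actually does. It works on Unicode code points and not on raw bytes: after NFD normalization, ö becomes o followed by the combining diaeresis U+0308, which can then be dropped. The function name `strip_diacritics` is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop combining marks so that ö > o, ü > u, ä > a."""
    # NFD splits a precomposed letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for base characters and a
    # non-zero combining class for marks such as U+0308.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

    ß has no decomposition, so this sketch leaves it unchanged; it would need an explicit replacement (for example "ss") if the result has to be pure ASCII.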

    The key to success with Latin-script languages is strict UTF-8 across the whole server system / Apache AND your offline PC, editors and authoring software!
     
    hans, Nov 20, 2008