Other languages in the database

Discussion in 'MySQL' started by decepti0n, Mar 11, 2008.

  1. #1
    I'm building a site that will store posts in other languages in the database, and I'm wondering about character encoding issues.

    Normally I just leave the default "collation" setting, which is usually some latin variant, but if I have, say, German or Spanish or even Japanese characters in the db, how can I make sure they're stored correctly? The data also needs to be fully searchable by speakers of those languages.

    Thanks for any tips
     
    decepti0n, Mar 11, 2008
  2. hans

    hans Well-Known Member

    Messages:
    2,923
    Likes Received:
    126
    Best Answers:
    1
    Trophy Points:
    173
    #2
    Currently my blog uses de, es and en.
    The charset is of course UTF-8, in MySQL AS WELL AS on the ENTIRE SITE.

    Zero problems with special characters. In flat files I additionally use ru, bg and other languages, with the same setup as above.
    For multilingual sites, a correct and strict configuration is of course important, starting from:
    your text/HTML editors
    the Apache default charset
    the files themselves (meta tags for language and charset)
    All of these should have ONE and the same charset configuration. In my case EVERYTHING is strictly UTF-8.
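
    To illustrate why every layer has to agree on one charset, here is a minimal Python sketch (not MySQL itself, and the sample string is just an assumption for the demo). It shows that UTF-8 bytes read back as latin-1 raise no error but come out silently garbled, and that latin-1 cannot store Japanese text at all.

```python
# Sketch: what happens when one layer disagrees about the charset.
# The sample text is an arbitrary example, not data from a real site.
text = "Grüße, señor, 日本語"

utf8_bytes = text.encode("utf-8")

# Every layer agrees on UTF-8: the text survives the round trip.
same_charset = utf8_bytes.decode("utf-8")

# One layer wrongly assumes latin-1: no exception is raised,
# but the result is garbled text ("mojibake").
garbled = utf8_bytes.decode("latin-1")
print(garbled)

# The damage is reversible only if you know which wrong charset was used.
repaired = garbled.encode("latin-1").decode("utf-8")

# The other direction fails outright: latin-1 has no Japanese characters.
try:
    text.encode("latin-1")
    latin1_can_store_it = True
except UnicodeEncodeError:
    latin1_can_store_it = False
```

    The silent failure is the dangerous case: nothing crashes, so a mismatch between the editor, Apache, the meta tags and the database can go unnoticed until someone reads the page.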
     
    hans, Mar 11, 2008
  3. decepti0n

    decepti0n Peon

    Messages:
    519
    Likes Received:
    16
    Best Answers:
    0
    Trophy Points:
    0
    #3
    Would UTF-8 cover all characters for virtually all languages?
     
    decepti0n, Mar 11, 2008
  4. hans

    #4
    I am NOT sure about that, because "ALL" languages includes Chinese, other Asian, Arabic, South-East Asian and many more different character sets. I am almost sure that SOME very unusual languages may require special treatment!

    However, on my desktop, fully configured in UTF-8, I can view "most" Asian languages, incl. Korean, Japanese, Chinese, ... just by using a clean, full UTF-8 configuration PLUS installing a bunch of foreign/Asian fonts.

    At the very least, such a UTF-8 server configuration covers ALL EU languages as well as the Cyrillic charset and many others.
    You would have to test case by case OR google for more details if you want "ALL" languages.
    Nothing to worry about there: if you really have THAT much traffic, you will also be making tens of thousands of $ per month in AdSense revenue to pay for solving the problem.
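
    For the case-by-case check, a small Python sketch like the one below may help. The sample words and the two helper functions are made up for illustration. UTF-8 itself can encode any Unicode character; the practical limit to watch for, as far as I know, is that MySQL's `utf8` charset stores at most 3 bytes per character, so characters that need 4 bytes (some rare CJK characters, for example) require the `utf8mb4` charset available in newer MySQL versions.

```python
# Sketch: check that sample text round-trips through UTF-8 and how many
# bytes its widest character needs. Samples and helper names are
# illustrative assumptions only.
samples = {
    "German": "Straße",
    "Spanish": "mañana",
    "Russian": "привет",
    "Arabic": "مرحبا",
    "Chinese": "中文",
    "Japanese": "日本語",
    "Korean": "한국어",
    "Rare CJK": "\U00020BB7",  # a character outside the 3-byte range
}

def utf8_round_trips(text: str) -> bool:
    """True if encoding to UTF-8 and decoding again gives the same text."""
    return text.encode("utf-8").decode("utf-8") == text

def max_bytes_per_char(text: str) -> int:
    """Largest number of UTF-8 bytes used by any single character."""
    return max(len(ch.encode("utf-8")) for ch in text)

for language, text in samples.items():
    width = max_bytes_per_char(text)
    note = "needs a 4-byte charset" if width > 3 else "fits in 3 bytes"
    print(f"{language}: round trip {utf8_round_trips(text)}, {width} bytes max, {note}")
```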
     
    hans, Mar 11, 2008
  5. decepti0n

    #5
    I'll probably do it case by case: add a few that'll definitely work (closest to English characters), then experiment with Chinese etc., adding as I go. Thanks for the tips.
     
    decepti0n, Mar 11, 2008
  6. hans

    #6
    With ALL EU languages, UTF-8 will work perfectly.

    For Asian languages (Chinese, etc.), maybe run some tests with a separate MySQL db first, before risking problems with your main db.

    You may find more help in a MySQL forum aimed at foreign-language sites, from webmasters who actually run db's with exotic charsets. I am sure they know how to solve it.
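
    The separate-database test could look roughly like the Python sketch below. Note the assumptions: it uses Python's built-in `sqlite3` with an in-memory database purely as a runnable stand-in for a scratch MySQL db, and the table name `posts_test` and the sample strings are made up. Against a real MySQL server you would run the same insert, read-back and search steps through your usual client, with the connection charset set to UTF-8.

```python
import sqlite3

# Sample posts in several scripts (illustrative data only).
samples = ["Straße", "mañana", "привет", "中文", "日本語"]

# Throwaway in-memory database: a stand-in for a separate test db,
# so the main database is never touched.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts_test (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO posts_test (body) VALUES (?)",
    [(s,) for s in samples],
)

# 1. Storage check: what comes back must equal what went in.
stored = [row[0] for row in conn.execute(
    "SELECT body FROM posts_test ORDER BY id")]
round_trip_ok = stored == samples

# 2. Search check: a substring search in the foreign script must match.
hits = [row[0] for row in conn.execute(
    "SELECT body FROM posts_test WHERE body LIKE ?", ("%本%",))]

conn.close()

print("round trip ok:", round_trip_ok)
print("search hits:", hits)
```

    If either check fails on the scratch db, the charset settings can be fixed there before any real posts are affected.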
     
    hans, Mar 11, 2008