Some of my databases store HTML that can contain all kinds of unusual characters. The inserts are fully automated, so I can't change the characters before they go in. What would be the best collation to use for my MySQL databases? The default collation is "latin1_swedish_ci" for some reason. I changed a few to "utf8_bin" as a test, because the web pages are encoded as:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

I need a collation that can handle a wide range of characters. I don't know much about this, so I'm just wondering which one is best for HTML. I see a lot of sites that handle the encoding better than mine with the same content and charset=utf-8, so it must be the collation in their database that is set differently.

Also, when inserting HTML into the database I use mysql_real_escape_string($html_content_to_be_inserted). But the newlines (\n) either don't make it into the database or get altered in some way, so the output ends up all on one line and looks like a mess. If the \n characters were stored intact, I could simply use PHP's nl2br function to put a <br /> tag before each one so that it displays correctly.
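For reference, here is a rough sketch of how the character set and collation described above could be inspected and changed from the MySQL side. The table name `pages` is hypothetical, and utf8_general_ci is only one possible choice of collation. It also assumes the existing rows were stored correctly as latin1; if UTF-8 bytes were already written into latin1 columns, a plain conversion can double-encode them, so try it on a copy first.

```sql
-- Hypothetical table name `pages`; substitute your own.

-- Show the character sets used by the server and by the current connection.
SHOW VARIABLES LIKE 'character_set%';

-- Show the collation currently assigned to each column of the table.
SHOW FULL COLUMNS FROM pages;

-- Convert the table's text columns to UTF-8 (back up the table first).
ALTER TABLE pages CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;

-- Run once per connection, before any INSERT or SELECT,
-- so that client and server agree on the encoding.
SET NAMES utf8;
```

The collation only decides how strings are compared and sorted. Which characters can be stored depends on the character set of the column and of the connection, so a missing SET NAMES (or its equivalent in the client library) is often the cause of garbled output even when the columns are already utf8.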
Oops, I didn't realize the UTF-8 part is case-sensitive. What happens if it's written as "utf-8"? Also, what's the difference between utf8_collation_ci, utf8_unicode_ci, and utf8_general_ci?
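One way to explore the collation question directly is to ask the server which utf8 collations it has and compare the same strings under two of them. This is only a sketch; it assumes the client really sends the literals as UTF-8, which is why it starts with SET NAMES.

```sql
-- Make sure the string literals below are sent and interpreted as UTF-8.
SET NAMES utf8;

-- List the utf8 collations this server actually provides.
SHOW COLLATION LIKE 'utf8%';

-- Compare the same pair of strings under two different collations.
SELECT 'ß' = 'ss' COLLATE utf8_general_ci AS general_ci,
       'ß' = 'ss' COLLATE utf8_unicode_ci AS unicode_ci;
```

According to the MySQL manual, utf8_unicode_ci follows the Unicode Collation Algorithm and treats ß as equal to ss, while utf8_general_ci is a simpler and faster collation that compares character by character and treats ß as equal to s, so the two columns should differ. Any name that does not appear in the SHOW COLLATION output is not a collation that server supports.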