I'm moving servers. The platforms are nearly identical in terms of Apache, PHP and MySQL versions. However, all apostrophes, pound sterling signs, quotation marks, copyright and Celsius signs etc. become question marks. Does anyone know where the character set is configured and where I need to look? Otherwise I'll have to go through all the content and replace each of them with HTML character codes, which is probably better long term, but it's such a pain on 2500+ pages.
You need the character set to be either iso-8859-1 or UTF-8. Try both and see which one works. A search and replace is easy enough to do on that...
I'll try that, thanks. BTW these are the relevant lines being generated by the CMS:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html dir="LTR" lang="en">
[...]
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Does the xml schema have anything to do with all this as well? (Trying the char set now.)
It's this you need to change...
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
I don't see any XML Schema (only Doctype, HTML tag, and Charset).
Yeah, doctype is what I meant. Cheers mate! I get the same ?s with:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
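For readers following along, here is a minimal Python sketch of why a charset label that disagrees with the stored bytes produces junk characters. It assumes the page bytes are ISO-8859-1, which the thread suggests but does not confirm at this point; `render` and the sample string are hypothetical names made up for the illustration, not part of osCommerce or Apache.

```python
# Illustration only: the same bytes read under two different charset labels.
# Assumption: the stored page content is ISO-8859-1 (not confirmed here).

def render(raw: bytes, declared_charset: str) -> str:
    """Decode bytes under the declared charset, substituting U+FFFD
    for any byte sequence that is invalid in that charset."""
    return raw.decode(declared_charset, errors="replace")

# A pound sign and an accented letter, stored as single Latin-1 bytes.
latin1_bytes = "£5 café".encode("iso-8859-1")

as_latin1 = render(latin1_bytes, "iso-8859-1")  # label matches the bytes
as_utf8 = render(latin1_bytes, "utf-8")         # label does not match

print(as_latin1)
print(as_utf8)
```

Under the matching label the text comes back intact. Under the UTF-8 label the two non-ASCII bytes are invalid sequences and are replaced, which many browsers draw as a question mark or a question mark in a diamond.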
Yes and I'm not any wiser because of it...
# Specify a default charset for all pages sent out. This is
# always a good idea and opens the door for future internationalisation
# of your web site, should you ever want it. Specifying it as
# a default does little harm; as the standard dictates that a page
# is in iso-8859-1 (latin1) unless specified otherwise i.e. you
# are merely stating the obvious. There are also some security
# reasons in browsers, related to javascript and URL parsing
# which encourage you to always set a default char set.
#
AddDefaultCharset UTF-8
#
# Commonly used filename extensions to character sets. You probably
# want to avoid clashes with the language extensions, unless you
# are good at carefully testing your setup after each change.
# See http://www.iana.org/assignments/character-sets for the
# official list of charset names and their respective RFCs
#
AddCharset ISO-8859-1 .iso8859-1 .latin1
AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
AddCharset ISO-8859-3 .iso8859-3 .latin3
AddCharset ISO-8859-4 .iso8859-4 .latin4
AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
AddCharset ISO-2022-JP .iso2022-jp .jis
AddCharset ISO-2022-KR .iso2022-kr .kis
AddCharset ISO-2022-CN .iso2022-cn .cis
AddCharset Big5 .Big5 .big5
# For russian, more than one charset is used (depends on client, mostly):
AddCharset WINDOWS-1251 .cp-1251 .win-1251
AddCharset CP866 .cp866
AddCharset KOI8-r .koi8-r .koi8-ru
AddCharset KOI8-ru .koi8-uk .ua
AddCharset ISO-10646-UCS-2 .ucs2
AddCharset ISO-10646-UCS-4 .ucs4
AddCharset UTF-8 .utf8
# The set below does not map to a specific (iso) standard
# but works on a fairly wide range of browsers. Note that
# capitalization actually matters (it should not, but it
# does for some browsers).
#
# See http://www.iana.org/assignments/character-sets
# for a list of sorts. But browsers support few.
#
AddCharset GB2312 .gb2312 .gb
AddCharset utf-7 .utf7
AddCharset utf-8 .utf8
AddCharset big5 .big5 .b5
AddCharset EUC-TW .euc-tw
AddCharset EUC-JP .euc-jp
AddCharset EUC-KR .euc-kr
AddCharset shift_jis .sjis
I also see this:
AddLanguage da .dk
AddLanguage nl .nl
AddLanguage en .en
[more]
Maybe English should be moved up?
Meh - Give it a go... I doubt it will change anything. Is iso-8859-1 in that list (I imagine it is).... : |
Yes it is, but with ISO in caps. I may have to try that, maybe it's anal about capitalization. Just stumbled on another osCommerce setting:
@setlocale(LC_TIME, 'en_UK.ISO_8859-1');
Will have to play with that as well to line it up with the charset.
On RedHat, setlocale seems to expect en_UK without the charset after it. And it seems I'm a total timewasting muppet. I was changing all these things and reloading the page, only to realize this now: when my colleague entered the data, such as the word "it's", back when the charset was wrong, it was stored in the DB as "It?s". So no matter how many times I refreshed the browser, it would serve up a ? nonetheless. Gotta do a couple more checks, but it seems like the locale sussed it. So for other people moving from FreeBSD to RedHat, instead of:
@setlocale(LC_TIME, 'en_UK.ISO_8859-1');
use:
@setlocale(LC_TIME, 'en_UK');
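A small Python sketch of why refreshing could never fix those rows. It shows one plausible mechanism only: a typographic apostrophe being converted into a charset that has no such character, with a literal question mark written in its place. Whether this is exactly what happened in this database is an assumption; `store_as_latin1` is a hypothetical helper for the illustration.

```python
# Illustration only: a lossy conversion at write time bakes "?" into the data.
# Assumption: the text contained a curly apostrophe (U+2019), which
# ISO-8859-1 cannot represent.

def store_as_latin1(text: str) -> bytes:
    """Convert text to ISO-8859-1, writing '?' for anything unrepresentable."""
    return text.encode("iso-8859-1", errors="replace")

stored = store_as_latin1("It\u2019s")

# Reading the row back under any charset label yields the same result,
# because the stored byte really is an ASCII question mark now.
read_back_latin1 = stored.decode("iso-8859-1")
read_back_utf8 = stored.decode("utf-8")

print(stored, read_back_latin1, read_back_utf8)
```

Once the substitution has happened, the original character is gone from the stored data, so only re-entering or search-and-replacing the affected content brings it back.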
The saga continues... I now also have it posted at the osCommerce forums here. What's left is an issue with e-mail and PDF. I can get the site to display everything just fine. The pound sign shows as a pound sign, but only when it's stored in the database as "Â£" without the quotes, including the weird A. Current locale is set as en_GB.ISO_8859-1 and charset as iso-8859-1. I tried many, many combos of the two. Typing locale -a on the command line gives me, amongst many others, these:
en_GB
en_GB.iso885915
en_GB.utf8
So how should I format my locale and charset if I see no underscores or hyphens on the server?
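The "weird A" in front of the pound sign is the usual signature of UTF-8 bytes being read as ISO-8859-1. A minimal Python sketch of that round trip follows; the function names are hypothetical and this is an illustration of the byte-level effect, not a fix for the osCommerce e-mail or PDF code.

```python
# Illustration only: UTF-8 bytes for a pound sign, misread as ISO-8859-1.

def misread_utf8_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as if they were ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")

def repair_latin1_misread(text: str) -> str:
    """Undo the misreading: recover the original bytes, decode as UTF-8."""
    return text.encode("iso-8859-1").decode("utf-8")

garbled = misread_utf8_as_latin1("£")
restored = repair_latin1_misread(garbled)

print(garbled, restored)
```

In UTF-8 the pound sign is two bytes, and in ISO-8859-1 those same two bytes are a capital A with a circumflex followed by a pound sign. A page that only looks right with the extra character stored suggests that some layer is treating the data as UTF-8 while another treats it as ISO-8859-1.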
Just had a reply from the host: they were forcing it in httpd.conf as UTF-8. Switched that back to ISO whatever and it works. Thanks for checking anyway! This has been such a waste of time, and all due to this one bloody setting. Gotta hate computers.
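This also explains why editing the meta tag earlier made no difference: when the server sends a charset in the HTTP Content-Type header, browsers generally prefer it over the one in the page's meta tag. A simplified Python sketch of that precedence is below; `header_charset` and `effective_charset` are hypothetical helpers, and real browsers apply further rules (such as byte order marks) that are left out here.

```python
# Simplified sketch: the HTTP header's charset wins over the <meta> charset.
from email.message import Message
from typing import Optional

def header_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None if absent

def effective_charset(content_type: str, meta_charset: str) -> str:
    """Use the header's charset when present, else fall back to the meta tag."""
    return header_charset(content_type) or meta_charset.lower()

# Server forces UTF-8 in the header; the page's meta tag says iso-8859-1.
forced = effective_charset("text/html; charset=UTF-8", "iso-8859-1")

# Server sends no charset; the meta tag decides.
unforced = effective_charset("text/html", "ISO-8859-1")

print(forced, unforced)
```

With a default charset forced by the server, the first case applies to every page, whatever the CMS writes into the markup.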
How would I set the autoindex to use UTF-8? I don't use HTML files; I use the default index. Some of the descriptions that are necessary for the file listings have special characters. Thank you, ~Bob