Friends, I have a website. My pages don't stay in UTF-8 encoding but switch to Western ISO 8859-1 encoding. Because of that, all sorts of garbage characters appear on the site. I have done everything I could, but could not solve the problem. Please help... My pages are at www.architectjaved.com/earthquakes/ Please advise what I can do. Thank you
1. First of all, you need to validate your HTML pages; see http://validator.w3.org/check?verbose=1&uri=http://www.architectjaved.com/earthquakes/ Add a doctype declaration on line 1 (one) and correct any other errors until the page is fully valid.
2. Since you have your character encoding defined in the meta tag as UTF-8, <meta http-equiv="content-type" content="text/html; charset=UTF-8">, you also NEED to make sure these files ARE CREATED in UTF-8! It is unacceptable to CREATE a file using an editor that is configured for ISO-8859-1 and then serve it as UTF-8 UNLESS you have CONVERTED the file to the charset declared online! See my small howto, "convert iso-8859-1 charset files to utf-8". The procedure applies to ALL conversions - just adjust the command line for the desired source and destination charset. Each file MUST be converted exactly ONCE - if you accidentally repeat the conversion, every non-ASCII character gets encoded a second time and you get a data salad that is painful to clean up!
3. As you can see in the W3C validation output from 1), you have a character encoding mismatch: "The character encoding specified in the HTTP header (iso-8859-1) is different from the value in the <meta> element (utf-8)." You are running Apache/2.0.40 on Red Hat Linux, hence you need to make sure the default charset in your Apache config matches your meta tag charset. If your meta tags say UTF-8, then either in your default Apache server config (on my SuSE Linux server this would be in /etc/apache2/default-server.conf) OR in your userspace .htaccess you need a line like: AddDefaultCharset utf-8 - and also make sure ALL the text or HTML editors you use are properly configured for THE ONE charset encoding you actually want to use online.
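For illustration, here is a minimal Python sketch of the "convert exactly once" rule from point 2. It is not the howto referred to above (which uses a command-line tool); the function name `latin1_to_utf8_once` is made up for this example. The guard is a heuristic and an assumption on my part: bytes that already decode as UTF-8 are left alone, which is right in most cases, although a few ISO-8859-1 byte sequences happen to be valid UTF-8 by accident.

```python
def latin1_to_utf8_once(raw: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8, but only if they are not UTF-8 already."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat the input as ISO-8859-1 and convert it.
        return raw.decode("iso-8859-1").encode("utf-8")
    # Already valid UTF-8 (plain ASCII included): leave it untouched.
    return raw


original = "café".encode("iso-8859-1")   # file as saved by an ISO-8859-1 editor
once = latin1_to_utf8_once(original)      # proper single conversion
twice = latin1_to_utf8_once(once)         # guarded: nothing happens the second time

# What an unguarded second conversion would produce (the "data salad").
double_encoded = once.decode("iso-8859-1").encode("utf-8")
```

Here `twice` equals `once`, while `double_encoded` is a longer byte string that a UTF-8 browser shows as two junk characters in place of the single accented letter.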
Thank you, Sir, for an excellent answer. You are a real expert. I have updated the .htaccess file. The site looked fine on my local computer, but all sorts of garbage text appeared once it was uploaded. Should I contact my webhost? It will take a long time; he just doesn't respond in time. If there is anything else I can do on my part to correct the problem, please tell me.
If the webhost guy's in Tahiti or something, then just for the meantime you can change your local page's encoding to ISO 8859-1, so at least the majority of the page looks fine (unless the page really is in, like, Japanese or something), and re-upload it. Then at least your page matches what the host is using until he can change it. The only way I could get my pages to change was to open a new file in my text editor, copy and paste my HTML (with the new encoding in the meta tag), and then save as ISO 8859-1 instead of my default UTF-8.
1. You gave no URL to YOUR site - hence a precise reply / solution for YOU is impossible - everything else would be guesswork.
2. If the HTML pages display correctly on your local machine and errors occur ONLY on the remote www host - verify that your remote server's charset (.htaccess default charset) is EQUAL to the charset you use offline - and make sure the meta tag charset definition is equal to both as well. ALL systems, offline as well as online, should use ONE charset definition to avoid char-salad in foreign-language (Latin and Asian) content! Changing meta tags OR .htaccess ONLY gives correct results if the file itself was CREATED correctly. Example: if you CREATE a file using tools/editors in ISO-8859-1, your entire local system happens to be configured in ISO-8859-1, and your server runs UTF-8, then changing the server from UTF-8 to ISO-8859-1 most likely solves the problem. However, IF you create files using ISO-8859-1 tools/editors and your server and all files' meta tags say UTF-8 - then you first need to convert every file, as in the link in the post above, from its original charset to the actual online charset. A simple change of the .htaccess charset definition OR meta tag does NO conversion - it just changes the way existing files are served/displayed, NOT the way they are CREATED.
On my multilingual site I switched many years ago from ISO-8859-1 to UTF-8 - strictly EVERYTHING, i.e. file system, all remote Apache default configuration, my entire local host default configuration, all local tools and editors, system-wide - and of course, to have faith in such a local configuration, I run all local machines on LINUX as well. I would never trust the integrity of WIN tools. Give me your URL and I may have a look - also tell me what you use on localhost and on the remote server, to facilitate precise replies.
P.S.: If you mean the site in your profile, it looks fine in my browser - of course I use Firefox, correctly configured, because there is always a possibility that the browser in use has a WRONG charset configuration ... a setting that typically can be changed by its user. P.P.S.: BTW, it never hurts your own soul to get a kick (motivation) to create clean HTML content or clean up existing problems first: http://validator.w3.org/check?verbose=1&uri=http://pinoybusiness.org/
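To make the "no conversion" point concrete, here is a small Python sketch. The helper `render` and the sample word are invented for this illustration; the helper simply mimics a browser decoding the stored bytes with whatever charset the server or meta tag declares. The bytes on disk never change, only their interpretation does.

```python
def render(stored: bytes, declared_charset: str) -> str:
    """Decode stored bytes the way a browser would for the declared charset."""
    return stored.decode(declared_charset, errors="replace")


word = "Zürich"
saved_as_utf8 = word.encode("utf-8")          # file created by a UTF-8 editor
saved_as_latin1 = word.encode("iso-8859-1")   # file created by an ISO-8859-1 editor

# Declared charset matches how the file was created: text displays correctly.
ok = render(saved_as_utf8, "utf-8")

# UTF-8 file served as ISO-8859-1: the two-byte letter shows up as two junk characters.
garbled = render(saved_as_utf8, "iso-8859-1")

# ISO-8859-1 file served as UTF-8: the lone high byte is invalid and gets replaced.
broken = render(saved_as_latin1, "utf-8")
```

Only the matching pairs (UTF-8 file with UTF-8 declaration, ISO-8859-1 file with ISO-8859-1 declaration) give back the original word, which is why either the file has to be converted or the declaration has to follow the file.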
The validator gave the message: Character Encoding mismatch! The character encoding specified in the HTTP header (iso-8859-1) is different from the value in the <meta> element (utf-8). I will use the value from the HTTP header (iso-8859-1) for this validation. The problem is with the webhost. Unfortunately, he is not ready to remove the default character encoding from the Apache server. Some dumbo thinks that it may affect other websites hosted on his server... These people don't know the basics. Why don't they let publishers decide the character encoding of their pages? Hell, I won't renew that contract next year...
Your host is ABSOLUTELY right: changing the default server-wide would totally screw up ALL the other websites on his server, and as a result all the other site owners might want to kick ass ... your host loves to live and enjoy life, hence he is a SMART host to do as he does! It's your job to redefine the default needed for YOUR site in your own user-space .htaccess! The line is the same for .htaccess as it is for the Apache default config, i.e.: AddDefaultCharset utf-8
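Once the .htaccess line is in place, the same comparison the validator reported can be repeated by hand. The following is a rough Python sketch with hypothetical helper names; it only parses a Content-Type header value and an HTML string that you supply (for example copied from your browser's page-info dialog), and the regular expressions are simplifications, not a full HTML parser.

```python
import re
from typing import Optional


def charset_from_header(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    match = re.search(r"charset=([\w-]+)", content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_meta(html: str) -> Optional[str]:
    """Extract the charset declared in the first matching <meta> tag."""
    match = re.search(r"<meta[^>]+charset=[\"']?([\w-]+)", html, re.IGNORECASE)
    return match.group(1).lower() if match else None


def has_mismatch(content_type: str, html: str) -> bool:
    """True when both charsets are declared and they differ."""
    header, meta = charset_from_header(content_type), charset_from_meta(html)
    return header is not None and meta is not None and header != meta


page = '<meta http-equiv="content-type" content="text/html; charset=UTF-8">'
before_fix = has_mismatch("text/html; charset=ISO-8859-1", page)
after_fix = has_mismatch("text/html; charset=utf-8", page)
```

With the header value quoted by the validator, `before_fix` is true; once the server sends `charset=utf-8`, `after_fix` is false.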
Just out of curiosity, do you have a server-side programming language installed and enabled on the server by any chance (like PHP for example)? If you do, you can write a script that will serve your pages with the character encoding you want. To the best of my knowledge, this won't affect the other sites on the host's server, since it would only affect YOUR pages.
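As a sketch of that idea: the post mentions PHP, but the same principle can be shown in Python as a tiny WSGI application. This is an illustration only. Whether your host supports Python or WSGI at all is an assumption, and `make_app` is a made-up name. The point is simply that the script sends its own Content-Type header with an explicit charset for the pages it serves.

```python
def make_app(html_text: str):
    """Build a WSGI app that serves html_text with an explicit UTF-8 charset."""
    body = html_text.encode("utf-8")  # the bytes sent must really be UTF-8

    def app(environ, start_response):
        headers = [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return app


# Minimal local check without a web server: call the app the way a server would.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


app = make_app("<p>Zürich</p>")
response_body = b"".join(app({}, fake_start_response))
```

To try it in a browser, the app could be passed to `wsgiref.simple_server.make_server` from the standard library.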
I'm with Dan on using a CGI or SSI to rewrite it for you - but I would suggest doing what I've done, which ended ALL of the character set encoding headaches. Restrict yourself to 7-bit ASCII and use ENTITIES for non-ASCII characters. Then the character set really doesn't matter, since ISO-8859-1 and UTF-8 (like most charsets you'd meet on the web) ALL have the base 7-bit ASCII available as lowest common denominator. UTF-8 - and people thought this was a good idea WHY exactly? (Maybe it's just because all my sites are US English - but I don't get it... but then I'm old school.)
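The ASCII-plus-entities approach is easy to sketch in Python, because the standard `xmlcharrefreplace` error handler turns every non-ASCII character into a numeric character reference. The function name below is made up for the example, and it emits numeric references rather than named entities such as `&uuml;`.

```python
import html


def to_ascii_with_entities(text: str) -> bytes:
    """Encode text as 7-bit ASCII, replacing other characters with &#NNN; references."""
    return text.encode("ascii", errors="xmlcharrefreplace")


encoded = to_ascii_with_entities("Zürich")

# The result reads the same whichever of the two charsets the server declares.
as_latin1 = encoded.decode("iso-8859-1")
as_utf8 = encoded.decode("utf-8")

# A browser resolves the reference back to the original character.
restored = html.unescape(as_utf8)
```

The trade-off is that the source becomes hard to read for text that is mostly non-ASCII, which is the case the multilingual posters in this thread care about.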
UTF-8 is for all those who have unusual languages or MULTI-lingual sites/pages. The cleanest and most efficient modern way to publish global content in any language without problems is to offer UTF-8 encoding. Having everything site-wide in UTF-8, including the file system and all else, makes it easier to have files OTHER than HTML properly displayed in ANY language/characters, such as help files or config files containing text to be included by SSI in HTML pages. I have at least 7 languages and multiple character sets, incl. es/fr/de/bg/ru/en/cn/it, properly displayed in drop-down menus, HTML, MySQL and all other places where the earlier ISO-8859-1 was insufficient and ASCII would die and cause disaster output.
There is NO need for any scripting. .htaccess is the place to set your site-internal default charset: just as the server owner defines a server-wide charset default, that default is overridden by your own .htaccess. As long as you have plain English and nothing else, ASCII may be fine; as soon as you have a forum or any other international activity, internationalization is required to avoid problems and simplify all publishing. There might always be workarounds - but a fully clean site-wide UTF-8, or if you OWN a server (as I am happy to do) a fully server-wide UTF-8, makes life easiest without any workaround at all - provided that your FULL offline desktop/coding environment, incl. ALL editors, is fully configured for UTF-8 too.
A difference between what you have on your offline desktop and your online environment may cause disaster, hence it is always best to have exactly the same on both ends - desktop + server - and better still to have the same OS on both ends to avoid any compatibility conflict. Years ago a friend of mine and I had extensive character conflicts when publishing in foreign languages (bg/ru), until he figured out how to properly configure his desktop. Now all is fine. Hence solving such conflicts should always start with your OWN PC at home before screwing and twisting www stuff.
Content should be CREATED in an environment with the same charset configuration as the publishing www server! WHY go multilingual, and WHO? Many of us have been multilingual since high school or so. Many have sites with at least partially graphical content, like wallpapers, ecards, photos, downloads, etc. Translating the small amount of text around these graphical or download-oriented pages is little work compared to the NEW customer range you may expand into! I recently expanded my ecards+wallpapers into fr/es and now have a few k unique visitors/day MORE than before, with just a little extra work, from Mexico down through all of South America, plus ES/FR as well. Hence IF you are multilingual - USE it - AdSense $ will love you and vice versa.
AdSense won't love you if you're multi-lingual. (Off topic - Seriously, where does this garbage come from anyway?) Besides, I think Tommy Olsson said it best in his HTML FAQ over at SitePoint. I won't copy/paste it due to copyright issues, but it's right there in the first post. http://www.sitepoint.com/forums/showthread.php?t=428205
AdSense LOVES ME A LOT MORE since I went multilingual, because all my major publishing languages come from large global population regions/country groups and from countries where Google does offer AdWords services - how else could I make a very solid, steadily increasing xxxx$/month?! Of course, if you happen to publish in a language currently NOT served by AdSense, then you have visitors with no interest in the ads appearing, or who do NOT understand the language of the ads. But AdSense/AdWords is steadily expanding into new countries. Example: friends of mine with native languages bg/ru started a website in both languages a year ago; back then there was no AdSense in bg, only in ru. Now, months later, AdWords/AdSense services are offered for both languages in those countries - hence a new market for those who are first in a particular language. Just as a reminder of the native popularity of some languages: fr extends widely across Africa, es into most of Central and South America. Each of fr/es may total a larger number of native speakers than native en, which is basically limited to US/UK/part of CA and AU, totaling maybe some 400 million native en speakers - a number that is easily matched by the global population speaking each of fr/es.
No, it doesn't. It just means that you have access to a larger customer base than you would normally have otherwise. AdSense is just a program. It has no emotion. Therefore it cannot love you.
So use another ad network that targets those visitors.
That is because Google is entering that particular market and wants to compete in it. Its goal is to make money, not make you money. The fact that its goals happen to coincide with yours is just a plus, nothing more.
Yes, I know that. Again, it's market forces at work, not AdSense loving one person more than another. Google is trying to make money. If it can use you to make a mint while giving you a couple bucks in the process (or if you're good enough, give you a mint while it makes a fortune) it will do so without hesitation.