Hi, I have pages like this: domain.com/tag/%D8%B7%D8 Do search engines read them and translate them into the real language?
Well yeah, I just wonder whether the search engines automatically convert them to the real language or whether I should do that myself.
lol, if you cannot read %D8%B7%D8 then how can you expect the spiders to? They aren't as intelligent as we are, so you should do all the editing yourself to be on the safe side.
Jesus, you wanna kill me? hehe. I can't edit millions of pages. But a lot of people tell me that SEs can understand it; after all, they even gave the same results.
What you have there are percent-encoded (URL-encoded) URLs. In your case the encoding seems to come from encoding non-ASCII characters as UTF-8 bytes. If that is indeed the case, then the URLs and the encoding are fully correct. I have just posted the first draft of my sitemaps FAQ item about percent-encoding / URL encoding.
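A minimal Python sketch of that round trip, for anyone who wants to see it. The tag "طب" is a made-up example (the URL in the first post is cut off, so it is not the original tag), and the sketch assumes the site encodes tags as UTF-8.

```python
from urllib.parse import quote, unquote

# Hypothetical tag, chosen for illustration; assumes the site uses UTF-8.
tag = "طب"

# Each non-ASCII character becomes its UTF-8 bytes, written as %XX.
encoded = quote(tag)
print(encoded)  # %D8%B7%D8%A8

# Decoding reverses the process and gives back the readable form.
decoded = unquote(encoded)
print(decoded == tag)  # True

# The fragment quoted in the first post stops in the middle of a character,
# so its last byte cannot be decoded and comes back as U+FFFD.
print(unquote("%D8%B7%D8") == "\u0637\ufffd")  # True
```

So the page does not need to be edited: the encoded form is simply how the same characters travel inside a URL.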
It's an error if you do not URL-encode URLs in sitemaps / on the webserver. (Assuming that you have URLs that need to be encoded, e.g. if you use non-English characters.) Quote from official sitemaps.org protocol website: I will add that to my sitemap generator FAQ
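As a rough sketch of what that means for a sitemap entry, the Python below percent-encodes a URL and then entity-escapes it for XML. The function name sitemap_loc and the example tag and query string are assumptions made for illustration; non-ASCII host names are not handled here.

```python
from urllib.parse import quote, urlsplit, urlunsplit
from xml.sax.saxutils import escape


def sitemap_loc(url: str) -> str:
    """Return a <loc> element with the URL percent-encoded and XML-escaped."""
    parts = urlsplit(url)
    # Percent-encode non-ASCII characters in the path and query as UTF-8.
    # "%" is kept safe so URLs that are already encoded are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    encoded = urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))
    # Entity-escape characters such as "&" so the XML stays well-formed.
    return "<loc>" + escape(encoded) + "</loc>"


# Hypothetical URL for illustration.
print(sitemap_loc("http://domain.com/tag/طب?page=1&sort=new"))
# <loc>http://domain.com/tag/%D8%B7%D8%A8?page=1&amp;sort=new</loc>
```

Keeping "%" in the safe set is a design choice for this sketch: it lets you run the function over a mix of raw and already-encoded URLs, at the cost of not encoding a literal percent sign.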
Escape codes and URL-encoded strings are read very well by spiders. In fact, if your browser can read the URL string and display the results from the server, then a spider can too! Spiders work like a stripped-down browser when fetching remote content, but instead of rendering the result, they just analyze it. Take a look at http://www.google.com/support/webmasters/bin/answer.py?answer=35769, and if you are not sure whether your URLs are spider-readable, try pasting them into lynx. If it reads them, then spiders will. And this is not because humans are less intelligent! Also, if you want to know why a URL won't be followed by spiders, have a look at http://www.webrickco.com/buildsitemap.php. It will generate a sitemap, display in red every URL that is not accessible, and, if you click on one of those lines, state the reason why it cannot be followed.
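If lynx is not at hand, the same kind of check can be sketched in a few lines of Python. This is only an approximation of what a crawler does, not how any particular search engine works; the function name check_url and the user agent string are made up for the example, and the URL passed in is expected to be percent-encoded already.

```python
from urllib.error import HTTPError, URLError
from urllib.request import Request, urlopen


def check_url(url, opener=urlopen):
    """Fetch a URL the way a simple crawler might and report what happened."""
    # Hypothetical user agent string, for illustration only.
    request = Request(url, headers={"User-Agent": "example-checker/0.1"})
    try:
        with opener(request, timeout=10) as response:
            return f"OK ({response.status})"
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return f"HTTP error {err.code}"
    except URLError as err:
        # No usable answer at all (DNS failure, refused connection, ...).
        return f"unreachable: {err.reason}"
```

Calling check_url on one of your encoded tag URLs should report "OK (200)" if the page is reachable; the opener parameter exists only so the function can be tried out without a network connection.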