what to do with those (���) that keep appearing?

Discussion in 'Search Engine Optimization' started by abuzant, Jan 2, 2007.

  1. #1
    Hello,

    In some cases, a strange � sign appears in syndicated content. How can we get rid of it?

    I tried UTF-8, since this could have been a character from another charset, but that didn't help. Also, what effect does this have on search engines?

    Thanks, Ruslan
     
    abuzant, Jan 2, 2007 IP
  2. cellularnews

    cellularnews Peon

    Messages:
    246
    Likes Received:
    6
    Best Answers:
    0
    Trophy Points:
    0
    #2
    It means that the web browser cannot work out what the character is: the bytes it received don't match the encoding it was told to expect.
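
    For illustration, here is a minimal Python sketch (not from the original thread) of one common way the � sign arises: bytes written in one encoding get decoded as another. The sample string and the choice of Latin-1 are assumptions made for the demo.

```python
# Hypothetical sample: text that a feed delivered as Latin-1 bytes.
raw = "café".encode("latin-1")  # b'caf\xe9'

# Decoding those bytes as UTF-8 fails on 0xE9, so the decoder substitutes
# U+FFFD (the replacement character), which browsers draw as the � sign.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the bytes were actually written in works.
right = raw.decode("latin-1")

print(repr(wrong))
print(repr(right))
```

    Once the replacement character has been stored, the original byte is gone, so the fix has to happen where the bytes are first decoded.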

    I tend to take the content in from the news feed as-is; the script that uploads it to the database then scans the file for characters I know to be problematic and replaces them with Unicode character references.
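
    A minimal sketch of that kind of replacement in Python, assuming the feed text has already been decoded into a proper string. The function name `to_char_refs` is hypothetical, and rather than a hand-kept list of known problem characters, this version swaps every non-ASCII character using Python's built-in `xmlcharrefreplace` error handler.

```python
def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    # "xmlcharrefreplace" turns each character that ASCII cannot represent into
    # a decimal reference such as &#233;, leaving plain ASCII untouched.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_char_refs("café, naïve"))
```

    The output is pure ASCII, so it displays the same whatever charset the page declares. A � that is already in the stored text is itself a character (U+FFFD) and would only be turned into a reference, not recovered.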
     
    cellularnews, Jan 2, 2007 IP
  3. abuzant

    abuzant Well-Known Member

    Messages:
    956
    Likes Received:
    45
    Best Answers:
    0
    Trophy Points:
    140
    #3
    Well, do you have any clue what exactly to replace, and how, in such a pattern? :D
     
    abuzant, Jan 2, 2007 IP
  4. netprophet

    netprophet Banned

    Messages:
    288
    Likes Received:
    1
    Best Answers:
    0
    Trophy Points:
    0
    #4
    There are two possibilities:
    1) Change your encoding from UTF-8 to ASCII.
    2) Or try changing the HTML code of your web page, and check it in a browser before uploading.

    If it's still not solved, give me your site's URL.
     
    netprophet, Jan 2, 2007 IP
  5. kh7

    kh7 Peon

    Messages:
    2,715
    Likes Received:
    109
    Best Answers:
    0
    Trophy Points:
    0
    #5
    You need a character encoding that fits the language. We can't help you with that unless you tell us which language it is.
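
    As a rough sketch of what matching the encoding means in practice (in Python, with assumptions: the function name `decode_feed` and the fallback to Windows-1252 are illustrative choices, not something confirmed in this thread), decode each feed with the encoding it declares and only guess when that fails.

```python
def decode_feed(raw: bytes, declared: str = "utf-8") -> str:
    """Decode feed bytes using the declared encoding, with a hedged fallback."""
    try:
        return raw.decode(declared)
    except (UnicodeDecodeError, LookupError):
        # Assumption: many Western-language feeds mislabelled as UTF-8 are
        # really Windows-1252. Other languages need other fallbacks.
        # errors="replace" keeps the function from raising on the few bytes
        # that Windows-1252 leaves undefined.
        return raw.decode("cp1252", errors="replace")


print(decode_feed("Привет".encode("koi8-r"), declared="koi8-r"))
```

    For a multilingual site, the declared encoding would come from each feed's own XML declaration or HTTP header, which is why the language (or at least the source charset) matters.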
     
    kh7, Jan 2, 2007 IP
  6. abuzant

    #6
    1- Bad call, IMHO: you cannot use ASCII if you know your site is multilingual, as it will never show anything but English letters.

    2- Maybe better. I can make some changes if you give me any clues. Checking before uploading is not an option, as my website is as big as Technorati itself and has something like 10K feeds syndicated every minute. I can't control (check, etc.) each of those.

    At the moment, I am digging deeper by playing with the XML parser itself. But what I discovered is that the XML parser gets correct data, and what it saves to the cache (local filesystem) is also good. The problem is in the rendering process. :mad:
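
    If the cached data is fine and only the output is wrong, one thing worth checking is whether the charset the page declares matches the bytes actually sent. A minimal Python sketch of that idea, with hypothetical names (`render_page`, the HTML template); the real site's rendering code is not shown in this thread.

```python
import html


def render_page(title: str, body_text: str) -> tuple[dict[str, str], bytes]:
    """Return (headers, body) where the declared charset matches the bytes."""
    page = (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>"
        f"<body><p>{html.escape(body_text)}</p></body></html>"
    )
    headers = {"Content-Type": "text/html; charset=utf-8"}
    # Encode with the same charset that the header and meta tag announce.
    return headers, page.encode("utf-8")


headers, body = render_page("Feeds", "Привет, café")
print(headers["Content-Type"])
```

    If the header says one charset and the template is encoded in another, the browser decodes with the wrong table and the � signs show up even though the cache is correct.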

    Regarding the URL you asked for, I promise you will be the first to know about it (within 12 hours). The site is not officially released yet; I'm still preparing PR :D

    Thanks, and still waiting for clues.
    Ruslan
     
    abuzant, Jan 2, 2007 IP
  7. abuzant

    #7
    abuzant, Jan 2, 2007 IP