w3 Validator "Sorry! This document can not be checked."

Discussion in 'HTML & Website Design' started by LazyD, May 4, 2007.

  1. #1
    So, I'm trying to validate one of my sites, and it seems the majority of the pages are spitting out this error, except that the numbers change...

    From what I can tell, it's flagging random lines: some of the lines it picks are blank, some have divs in them, etc.

    I can't find any non-standard characters or "bytes" on those lines.

    Does anyone know what causes this kind of thing? I suspect this may have some bearing on my SERPs, so I would really like to fix this issue...
     
    LazyD, May 4, 2007 IP
  2. Felu

    Felu Peon

    Messages:
    1,680
    Likes Received:
    124
    Best Answers:
    0
    Trophy Points:
    0
    #2
    Could you please link to the page you are trying to check? It's just because of some special character that you copied and pasted from somewhere.
     
    Felu, May 4, 2007 IP
  3. LazyD

    LazyD Peon

    Messages:
    425
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #3
    LazyD, May 4, 2007 IP
  4. hans

    hans Well-Known Member

    Messages:
    2,923
    Likes Received:
    126
    Best Answers:
    1
    Trophy Points:
    173
    #4
    Validator error messages like this occur if you have used more than one editor to create or modify a page and the editors had different charset configurations,

    or

    if you copied and pasted text into your page from another page or website with a different charset.

    Either way,

    since you have the lines where these invalid characters occur:

    open that file in a CLEAN editor
    LOOK at those lines - the TOP one listed in the error message always first !!!

    Then delete the words and RETYPE the same text in your clean editor - NO copy / paste !!!

    Even BLANKS might be different if they were pasted instead of typed.

    Always start with the TOP listed error first,
    then revalidate to see if the validation result changes.

    Make sure you have the correct charset definition in all meta tags.

    A site name / URL would have helped a LOT !!!
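
For anyone who would rather locate the offending lines with a script than by eye, here is a minimal Python sketch. It assumes the page is supposed to be UTF-8 and that you have the raw bytes of the saved source; the function name and the sample bytes are made up for illustration and are not taken from the site discussed here.

```python
def find_invalid_utf8_lines(data: bytes) -> list:
    """Return (line_number, raw_line) for every line that is not valid UTF-8."""
    bad = []
    # Splitting on the newline byte is safe: 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((number, line))
    return bad


# Hypothetical sample: line 2 contains a lone 0xAE byte, which is invalid UTF-8.
sample = b"<p>Cards</p>\n<a>Discover\xae Student Platinum Card</a>\n<p>ok</p>\n"

for number, line in find_invalid_utf8_lines(sample):
    print(number, line)
```

To check a real page, read the file in binary mode (`open(path, "rb").read()`) and pass the result in, so that no decoding happens before the check.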
     
    hans, May 4, 2007 IP
  5. LazyD

    #5
    I just tried retyping one of the lines that shows up as an error, and it didn't fix anything...

    I tried removing that line and the ones above and below it, and that just changed which lines are reported as errors...
     
    LazyD, May 4, 2007 IP
  6. hans

    #6
    What you think are spaces are NOT spaces but invisible wrong characters.

    Example: when I look at your page with a proper professional editor, I see this anchor text on the link:

    Discover� Student Platinum Card

    After the word Discover you have a character in the wrong charset, which is displayed as ? because it is invalid.

    At first glance, all or most of them seem to be the same,
    hence that might be text you copied instead of typed into your pages ...

    In the above sample case,
    DELETE until it looks like

    Discovetudent Platinum Card
    then retype until it looks like
    Discover Student Platinum Card

    and repeat for every case listed,
    for example:

    MasterCard� from Chase
    0% 6�Months
    Discover� Student Tropical Beach Card
    0% 6�Months 16.99%
    0% 6�Months 18.24%
    Discover� Student Monogram Card
    ...
    etc

    Did you or did you NOT just copy foreign content instead of creating unique content?

    PS
    These kinds of errors are so FATAL to bots that you have to find each error yourself rather than relying on the line numbers in the error message !!!
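
To illustrate why such a character shows up as a question mark: a small Python sketch, assuming purely for the example that the stray byte is 0xAE (the registered sign in Windows-1252). The thread never establishes the real byte value, so treat the sample bytes as hypothetical.

```python
# Hypothetical bytes: the text is ASCII except for one single-byte symbol.
raw = b"Discover\xae Student Platinum Card"

# Read as UTF-8, the lone 0xAE byte is invalid, so a lenient decoder
# substitutes U+FFFD, the "replacement character" drawn as a question mark.
as_utf8 = raw.decode("utf-8", errors="replace")

# Read as Windows-1252, the very same byte is a perfectly good symbol.
as_cp1252 = raw.decode("cp1252")

print(as_utf8)
print(as_cp1252)
```

The bytes are identical in both cases; only the charset used to interpret them differs, which is why the character looks fine in one program and broken in another.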
     
    hans, May 4, 2007 IP
  7. Katy

    Katy Moderator Staff

    Messages:
    3,490
    Likes Received:
    513
    Best Answers:
    7
    Trophy Points:
    355
    #7
    Ah, the validator is down at the moment, but try taking out this line:

    <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />

    and revalidate. It should give you an error saying that the charset is missing; then try this one instead:

    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
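
A possible explanation of why switching the declared charset can make the "can not be checked" message go away, sketched in Python: a single-byte charset such as ISO-8859-1 assigns a character to every byte value, so decoding never fails, whereas UTF-8 rejects many byte sequences. This only shows that decoding succeeds, not that the characters come out right. The helper name is made up for the example.

```python
def decodes_as(data: bytes, charset: str) -> bool:
    """Return True if the bytes can be decoded with the given charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


every_byte = bytes(range(256))  # all 256 possible byte values

print(decodes_as(every_byte, "iso-8859-1"))
print(decodes_as(every_byte, "utf-8"))
```

So changing the meta tag may let the validator read the page, but if the data was not really ISO-8859-1 the symbols can still display incorrectly.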

    Good luck!
     
    Katy, May 4, 2007 IP
  8. LazyD

    #8
    Actually, the content was in XLS (Excel) format, then saved as CSV (comma delimited), and then inserted into a MySQL database using a PHP script.

    I'm sure there could have been some issue with the conversion of the XLS file. I'm not sure how you identify a clean or professional editor. Since I'm at work and we use Macs, can you recommend an editor for Mac? Currently I'm using one called TextWrangler, and I don't see those messed-up chars...

    Edit -

    I just saved the View Source of my page, opened it in TextWrangler with UTF-8 encoding, and I'm seeing the errors you're talking about now.
     
    LazyD, May 4, 2007 IP
  9. hans

    #9
    Sorry, I don't know any Mac editors.
    Since the web mostly runs on Linux, I run Linux and use Quanta Plus as my editor.

    But for the future, when converting files:
    KNOW the charset of the initial file,
    then CONVERT the file.
    Renaming the charset is NO conversion !!

    There are special tools that convert all the text of a file from one charset into another.

    Before the validator can get through such files and give full validation output, you need to manually replace ALL of these illegal characters, down to the very last one, and then revalidate.

    The entire page apparently is UTF-8, with the exception of those space-looking special characters, which appear even in my Firefox browser as a black question mark.

    Ultimately you may have to find a reliable procedure for creating clean text rather than converting already-converted text.
    Even MySQL may one day screw up all your data if you mess things up too much.
    Your MySQL database also needs to use the same charset as your page, text, etc.
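
The difference between converting and merely relabeling can be sketched in a few lines of Python. The source charset (Windows-1252) and the non-breaking space in the sample are assumptions for illustration only; the actual charset of the data in this thread is not known.

```python
def convert(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Decode with the charset the bytes were written in, then re-encode."""
    return data.decode(source).encode(target)


# Hypothetical input: a non-breaking space stored as the single byte 0xA0.
original = "0% 6\u00a0Months".encode("cp1252")

converted = convert(original, "cp1252")
print(original)
print(converted)

# Relabeling only: same bytes, different claimed charset. This fails.
try:
    original.decode("utf-8")
    relabel_works = True
except UnicodeDecodeError:
    relabel_works = False
print(relabel_works)
```

A real conversion changes the bytes (0xA0 becomes the two bytes 0xC2 0xA0), while relabeling leaves them untouched and therefore invalid under the new charset.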
     
    hans, May 4, 2007 IP
  10. LazyD

    #10
    Truth be told, I've never had an issue with database-driven sites before; this is the first time I've had an issue with data types being wrong....

    I'm not seeing the question marks in my browser, probably because I changed the encoding type of the page to UTF-8.

    Now I really need to figure out if I can get PHP to identify the invalid characters and remove them.
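
The idea of identifying and removing invalid characters can be sketched as follows. This is Python rather than PHP, and it assumes the target charset is UTF-8; the function name and sample bytes are hypothetical. Note that dropping bytes also throws away whatever symbol they were meant to be, so converting from the real source charset is usually the better fix when that charset is known.

```python
def drop_invalid_utf8(data: bytes) -> str:
    """Decode as UTF-8, silently discarding any bytes that are not valid."""
    return data.decode("utf-8", errors="ignore")


# Hypothetical sample with one stray single-byte symbol after "Discover".
sample = b"Discover\xae Student Platinum Card"

cleaned = drop_invalid_utf8(sample)
print(cleaned)
```

Using `errors="replace"` instead of `errors="ignore"` would keep a visible marker where each invalid byte was, which makes it easier to review what got lost.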
     
    LazyD, May 4, 2007 IP
  11. hans

    #11
    :)
    And when the job is done successfully, you may remove the "Lazy" in front of the "D" in your nick - or better, do it now to prevent self-hypnosis and remove the laziness before doing your job ...
    Good luck.

    P.S.

    One procedure would be to first find out which charset the original text was written in,
    by
    finding the desktop/PC on which the original doc was created and looking at that software's charset configuration.

    It appears that you have 2 or 3 special characters that now appear as empty blanks.

    One of these special chars is a trademark symbol, in the blanks after the credit card names.

    Using my usual conversion procedure, the original HTML text appears to be neither UTF-8 nor ISO-8859-1 nor Windows-1252.

    But since you run a Mac: if the doc was also created on a Mac, it may be the Mac default charset !!??

    Anyway,
    when you finally figure out which charset it is,
    you have to correct the original full MySQL db accordingly.
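
One way to narrow down the original charset is to decode a suspicious snippet with several candidates and see which result reads sensibly. A minimal Python sketch follows; the candidate list and the sample byte 0xA8 (the registered sign in Mac OS Roman) are assumptions for illustration, not values taken from the page in question.

```python
CANDIDATES = ["utf-8", "cp1252", "iso-8859-1", "mac_roman"]


def try_decodings(data: bytes, candidates=CANDIDATES) -> dict:
    """Map each candidate charset to the decoded text, or None if it fails."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical snippet: "Discover" followed by a single non-ASCII byte.
sample = b"Discover\xa8"

for name, text in try_decodings(sample).items():
    print(name, repr(text))
```

A human still has to judge the output: several charsets may decode without error, but only the right one produces the symbol that makes sense after a card name.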
     
    hans, May 4, 2007 IP
  12. LazyD

    #12
    Well, the document was originally created by my affiliate, so I'm not sure what they are using.

    The last time I updated the database via the XLS > CSV document was on my Windows PC.

    I just made a new one on my Mac, saved it as a CSV, uploaded it, and got different chars, but in the same places. I went ahead and did a MySQL REPLACE on the 3 different chars I found, and that has eliminated 99% of the errors. However, I'm still getting one error that doesn't seem to show a strange char or anything. How do I go about finding that one?
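
For a leftover character that does not look strange on screen, it can help to list every non-ASCII or control character with its position and Unicode name. A rough Python sketch, assuming the text has already been decoded successfully; the function name and the sample string with a non-breaking space are hypothetical.

```python
import unicodedata


def list_suspicious(text: str) -> list:
    """Return (index, code point, name) for non-ASCII and control characters."""
    found = []
    for index, char in enumerate(text):
        is_control = ord(char) < 32 and char not in "\t\n\r"
        if ord(char) > 126 or is_control:
            # Control characters have no Unicode name, hence the default.
            name = unicodedata.name(char, "UNNAMED")
            found.append((index, f"U+{ord(char):04X}", name))
    return found


# Hypothetical sample: the gap between "6" and "Months" is not a normal space.
sample = "0% 6\u00a0Months"

for entry in list_suspicious(sample):
    print(entry)
```

Characters such as the non-breaking space look identical to a normal space in most editors, so a listing like this is often the quickest way to spot them.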
     
    LazyD, May 4, 2007 IP
  13. VimF

    VimF Well-Known Member

    Messages:
    307
    Likes Received:
    27
    Best Answers:
    0
    Trophy Points:
    118
    #13
    I've just checked this with the validator:
    getcredit365.net/credit-cards/student.php

    There are 182 errors; most of them are easy to fix:

    + Use <br /> instead of <br>.
    + Missing " (quotation marks).
    + Don't use <b> and target=.
    + An ID should only be used once per page. If you need it more than once, use CLASS.

    Hope that helps.
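
The duplicate-ID point in the list above can also be checked with a short script. Here is a minimal sketch using Python's standard `html.parser`; the class and function names and the sample markup are invented for the example, and a validator remains the authoritative check.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Counts how often each id attribute value appears in a document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <br id="x" />.
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html: str) -> list:
    """Return the id values that occur more than once, sorted."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    return sorted(i for i, count in collector.ids.items() if count > 1)


# Hypothetical markup with one repeated id.
sample = '<div id="card">a</div><div id="card">b</div><div id="footer"></div>'
print(duplicate_ids(sample))
```

Any id it reports is a candidate for being turned into a class.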
     
    VimF, May 4, 2007 IP
  14. hans

    #14
    As said above, these are easy to fix; every error message is explained in the validator.

    You may also look at your template and make corrections where needed.

    Important:

    Always start with the top errors first,
    as a single error at the top of the page may cause additional errors all the way down the page.
     
    hans, May 4, 2007 IP
  15. fireshark

    fireshark Peon

    Messages:
    190
    Likes Received:
    3
    Best Answers:
    0
    Trophy Points:
    0
    #15
    Re: the Mac editor question some posts above - try Coda, from Panic.
     
    fireshark, May 5, 2007 IP
  16. LazyD

    #16
    Yeah, I know about the errors in the validator; however, before this, the page couldn't even be checked for validation.
     
    LazyD, May 5, 2007 IP