I am trying to SEO my existing website, and I have read that errors in the HTML are bad. So I checked my index page with www.validator.w3.org and found loads of errors, which I have mostly cleaned up, but I still get 'This page is not Valid HTML 4.0 Transitional!' As for the meta content type tag, the validator quite likes this one:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
but not this one:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
I have only got another 18 pages to do, but I'm wondering if there is any point, because while I was at the validator website I checked a few other sites, and they are full of errors, e.g.:
www.pagetutor.com (Google PR6): This page is not Valid HTML 4.01 Strict! Result: Failed validation, 40 Errors
www.tesco.com (Google PR7): This page is not Valid XHTML 1.0 Strict! Result: Failed validation, 16 Errors. No Character Encoding Found! Falling back to UTF-8.
http://www.plumbworld.co.uk (PR4): This page is not Valid -//IETF//DTD HTML//EN! Unable to determine parse mode! 265 errors
So I was wondering, is there any point?
No, of course there is no point! Why make it easy for your site visitors to see your content exactly how you have prepared it, let alone trying to make sure those pesky ROBOTS from the Search engines are able to spider your content?
I see your point about the other websites, but just because they're big corporations doesn't mean they are role models in web development. Validation is useful, especially for eliminating most causes of inconsistencies between browsers.
Mike, in my opinion, if you're a programmer or a designer offering services, I would highly recommend validating your HTML, your CSS, your sitemap and your XML sitemap. Google likes it as well; you can verify it from a Google account in Google Webmaster Tools, and it will also help the Google crawler get your site indexed. That said, I have seen 10,000s of sites that are not valid and still have great rankings.
Yes, there is a point. But it's not for search engines; it's for the health of your Web site as well as your own sanity (not to mention the benefit of the people using your site). Valid pages tend to weigh less (byte-wise), have fewer rendering issues, and can also be easier to maintain. Oh, and the "I'm trying to SEO my Web site" scares me to no end (I'm very active on the SEO board here at DP, just ask around if you don't believe me). You should be building your Web sites for the people who will be using them, not for the search engines. If you need solid, valuable and real SEO advice, check out the first link in my signature.
Thanks for the replies. I am simply the 'webmaster' for our own small business selling our own product on the web. The site has a PR4, Google Webmaster Tools reports zero problems crawling it, it looks fine across IE and Firefox, and for most (not all) of our main search terms we appear on Google page 1. But the validator shows loads of errors. I just wondered how perfect we need to be, that's all.
I'd validate the HTML and CSS anyway, and then address any usability and accessibility issues that are present so that when people do get to the Web site, they're actually able to access the content they're looking for and use the site as best they can.
Great article, Dan, a mass of good advice there. I have validated my index page except for 4 'margin' attribute conflicts which I do not understand. I marked the page Transitional 4.0 but the validator says it ain't. I am also declaring the content type as utf-8, although my other pages are currently iso-8859-1. I have validated the CSS fine. Thing is, I constructed my site by learning a bit here, a bit there, by trial and error, and by copying others that work, so it's a bit of a mishmash really. Consequently I have no idea which convention I should aim for in validating it all: 1.0 transitional, 4.0 strict, html, xhtml, pork and beans etc. mean very little to me, nor do utf-8 and 8859. As I'm short of time for all this, is there somewhere, or some way, to get a simple dummy-level rundown on which HTML convention and doctype I should aim for, bearing in mind I'm not starting from scratch but trying to validate an existing site?
Validation is important for one obvious reason: passing it means your code is VALID, failing it means it is NOT. For too long web designers have gotten a free ride, with their buggy, bloated crap code still working despite being so bug-ridden you'd need to drop a Raid factory on it to 'fix it'. I still say that browsers SHOULD have treated all HTML the way the XML specification is worded: when you encounter an error, STOP rendering the page and throw a huge error message; do NOT just keep plodding along as if nothing is wrong. Because frankly, if you cannot be bothered to write valid HTML, the users shouldn't bother visiting your site.
As to choosing a doctype and character set, it really is up to you. A good rule of thumb is that new pages, be they HTML or XHTML, should be done in a strict doctype so you are not using tags that have been dropped from the specification. The choice of HTML or XHTML depends on how you feel about structure, and on whether you plan on allowing the page to be parsed not only by conventional browsers but also by an XML parser (most HTML pages, with their unclosed tags, will just stop when they hit what XML considers an error). I prefer XHTML 1.0 Strict because it mandates closing tags and declaring certain things explicitly (though I disagree with some of it, I disagree with XHTML less than I do HTML).
For character set the choice is muddier. If you restrict yourself to the 7-bit ASCII set and use entities for non-ASCII characters, it doesn't matter which of the 'big three' you choose. If you make extensive use of Latin characters not in the ASCII set, either ISO-8859-1 or UTF-8 is the better choice (STAY AWAY FROM WINDOWS-1252), and if you plan on a multilingual site, UTF-8 is the best choice, though getting it working properly is fraught with problems for newbies.
Let's take a look at the worst of the lot, plumbworld.co.uk:
Tabs before the doctype, IE would be in quirks mode...
Entire document tabbed in once for no good reason; that's nothing more than bloat.
Generic HTML 3.2 doctype; kiss off the code on this page ever validating, as there's more of 1997 to it than 2007.
JavaScripted menu. Again, this is 2007, not 1997.
Table-based layout for a two-column fixed-width page. That's just a waste of code.
For all the indenting, the indenting makes no sense.
Use of line breaks, inlined CSS and presentational markup. That all needs to go.
Divs set to align center inside a TD. WHY? Wasted code.
Half the tables and associated tags are not closed properly.
Loaded down with a bunch of JavaScript that doesn't actually DO anything.
I'm just surprised I'm not seeing spacer .gifs. All in all I'd say that's 17k of HTML doing the job of 4-6k, and I'd have to throw it all out and start over to even approach 'well written'.
Seriously, if you don't know what's wrong with this:
<table width="144" border="0" cellspacing="0" cellpadding="0">
<form name="searchbox" action="/search/1523" method="get" onSubmit="return checksearch(this)"><tr><td class="TableText" bgcolor="#ffffff">
then you need to back away from the keyboard and take up macramé weaving.
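For comparison, here is a minimal sketch of how that search box could be marked up as valid HTML 4.01 Strict. It is only an illustration under a few assumptions: the file is saved as UTF-8, the field name `q`, the label text and the CSS rule are made up for the example, and the original `checksearch` script is left out. The doctype sits on the very first line with nothing before it, the charset is declared before the title, and the width and background colour move out of the markup into CSS. No table is needed at all; in the quoted snippet the form sits directly between `<table>` and `<tr>`, which HTML 4 does not allow.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Search box example</title>
<style type="text/css">
/* Hypothetical rule replacing width="144" and bgcolor="#ffffff" */
#searchbox { width: 144px; background: #fff; }
</style>
</head>
<body>
<!-- In Strict, a form may only contain block-level content, hence the div -->
<form id="searchbox" action="/search/1523" method="get">
<div>
<label for="q">Search</label>
<input type="text" name="q" id="q">
<input type="submit" value="Go">
</div>
</form>
</body>
</html>
```

If the rest of an existing site is still saved as ISO-8859-1, the same skeleton should work with `charset=iso-8859-1` in the meta tag, as long as the declared charset matches how the files are actually saved.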