W3C-compliance - always necessary?

Discussion in 'Search Engine Optimization' started by northpointaiki, Dec 5, 2005.

  1. #1
    This relates to another thread, but since for some reason I could not post a further reply there ("no live links and signatures until after 10 posts"), and this is really a separate question, I wanted to start it up here.

    Background: I have a new site, www.a1-outdoors.com, which is ranking well in MSN, and my web logs show regular crawling from Google, MSN, and others. I am indexed in Google, my Google sitemap (and now my Yahoo sitemap) is working (it shows as "OK"), and my links are returning for many SEs. However, no data is available when I go to the Google Sitemaps page and look for stats on queries, errors, etc. Only the indexed pages - which for Google, as yet, means only my home page (using selfpromotion.com, I've recently multi-submitted more pages) - show up in the index stats. Hans helpfully pointed me to the W3C's validation tool and indicated that the host of errors found there may be prohibitive with respect to Googlebot's effective crawling of my site.

    I have another website, www.aikido-marquette.com, which contains a ton of W3C errors (22, including a missing DOCTYPE opening line) and which is not valid according to the W3C's tool. However, the site ranks quite nicely on both Google and Yahoo - first page for both, for my selected keyword.

    Now, with respect to www.a1-outdoors.com: when I ran the W3C validation tool, a ton of errors - 41 - came up. My question is this:

    Hans (and others):

    I obviously do have W3C errors on www.a1-outdoors.com. These pages and their CSS were created from a website template company, and I know next to nothing about more substantial coding. I am also very pleased with this template company generally. My coding abilities fade to nothing when looking at the W3C report, and I am afraid that the more I monkey around, the more I may screw up what was essentially a simple, template-driven site.

    My question is this - just how important are these W3C coding errors in causing crawler aborts? From my limited understanding, and based on the Aikido site, it seems they are not always a problem - that site ranks well despite plenty of errors. I ask for the reasons stated above - I am new to this and am afraid that in "correcting" the errors, I may screw things up and make them worse. But if it is a must-do, then I will investigate and do it.

    Hans, and others - thoughts on the critical value of W3C error-free coding?

    Many thanks.
     
    northpointaiki, Dec 5, 2005 IP
  2. wrmineo

    wrmineo Peon

    #2
    For me, the biggest benefit of being W3C validated is not necessarily optimal SEO efficacy; rather, it ensures that there are no obvious "stoppers" to a bot being able to quickly, cleanly, and effectively crawl your site - which does affect SEO, since the spider has to be able to read your content.
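
    For example, something as simple as a comment that never gets closed can hide everything after it from a strict parser. A made-up snippet, not from your site:

    <!-- promo banner starts here
    <div class="content">
      <p>Everything after the unclosed comment above, including this
         paragraph, can be swallowed as comment text.</p>
    </div>
    Code (markup):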

    I can't think of a single instance where validating html code to be W3C compliant would hurt a site's standings.
     
    wrmineo, Dec 5, 2005 IP
  3. Web Gazelle

    Web Gazelle Well-Known Member

    #3
    I don't know if it is necessary, but they may tell you it is over at WebProWorld.
     
    Web Gazelle, Dec 5, 2005 IP
  4. JamieC

    JamieC Well-Known Member

    #4
    Thanks to the major browsers being extremely lenient when it comes to UI code for websites (i.e. HTML and XHTML), W3C validation is certainly not necessary to have a successful website - how about this for an example.

    Having said that, it certainly won't hurt you to write W3C compliant code, and I would strongly encourage you to do so. As wrmineo says, at the very least it will probably mean that your site gets crawled more regularly. Seeing if your site validates should be the first step if you have any rendering issues during the design phase, and the last step before you deploy :).
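
    For reference, a bare-bones page that passes the validator looks something like this - just a sketch using HTML 4.01 Transitional, so adapt the doctype to whatever your template actually uses:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
    <html>
    <head>
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
      <title>Example page</title>
    </head>
    <body>
      <p>Content goes here.</p>
    </body>
    </html>
    Code (markup):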
     
    JamieC, Dec 5, 2005 IP
  5. northpointaiki

    northpointaiki Guest

    #5
    W.R. - thanks. My question is actually the other side of the coin - would not having fully W3C-compliant pages hurt Google's crawling (as opposed to whether ensuring you do have fully compliant pages would hurt your crawl)? Based on my Aikido site I wondered, and as I seem to have a host of errors on my new site (beyond my current ability to fix them), I am wondering whether compliance is truly necessary for Google to crawl effectively.
     
    northpointaiki, Dec 5, 2005 IP
  6. northpointaiki

    northpointaiki Guest

    #6
    Thank you all - of course, it just makes sense. I am just a bit lost as to how to go about correcting these errors. A bit overwhelmed, but you are all (incl. Hans) of course right, and it should be done. Many thanks.

    Paul

    PS: Jamie, just saw the website you linked to as W3C non-compliant. Can't be a successful site - Google Japan - who are they? Must be a mom-and-pop outfit... :)
     
    northpointaiki, Dec 5, 2005 IP
  7. wrmineo

    wrmineo Peon

    #7
    Again, it could go either way IMO.

    You could have non-compliant code that doesn't necessarily stop Google or other engines from indexing your site.

    However, you could have an invalid bit of code that stops the bot from being able to do its job, like invalid JavaScript.
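
    A contrived example of the sort of thing I mean - an inline script that is never closed, so a parser can end up swallowing whatever follows it:

    <script type="text/javascript">
      document.write("Bookmark this page");
      // The closing script tag is missing, so a parser may treat the rest
      // of the page as script data rather than as content and links.
    <p>This paragraph and any links below it may never be seen as HTML.</p>
    Code (markup):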

    This is why I say it's best to have W3C valid code to be on the safe side.
     
    wrmineo, Dec 5, 2005 IP
  8. minstrel

    minstrel Illustrious Member

    #8
    As wrm says, it's a matter of what the W3C "error" is. For them it's a pass-fail thing. In the real world it's not.

    For example, W3C validation will balk at certain "deprecated" elements which actually work just fine and which do not concern any current browsers for a moment. If the W3C validator is churning out those types of "errors", feel free to ignore them.

    If they're telling you about errors like unclosed tags or improper nesting and such, those you do need to worry about. Even if it doesn't confuse the spiders (though it often will), your page is going to look like hell in some browsers.
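
    Roughly the difference I mean (made-up snippets; whether the first one even gets flagged depends on your doctype):

    <!-- Deprecated but harmless: browsers and spiders handle this fine -->
    <font face="Arial" size="2">Ski packages on sale now</font>

    <!-- A real problem: overlapping, improperly nested tags -->
    <b><a href="specials.html">See our <i>winter</b> specials</i></a>
    Code (markup):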

    I usually use the DW validation feature or a stand-alone HTML checker.
     
    minstrel, Dec 5, 2005 IP
    wrmineo likes this.
  9. northpointaiki

    northpointaiki Guest

    #9
    I just feel like I wasted $60 on NetMechanic's toolbox. Navigating the site's features, including getting to my repaired page, is not easy, and the "repaired" page is an absolute mess in MSN and Mozilla. I am doing this site to make money, not waste it, and I am deeply disappointed.

    Any offer of help re: cleaning up code, or telling me whether I have just gotten NetMechanic's toolbox dead wrong (and I'm just a loser), would be appreciated. We lost a restaurant venture recently, and wasted money is not a possibility we can afford to entertain.

    Thanks.

    Paul
     
    northpointaiki, Dec 5, 2005 IP
  10. walshy

    walshy Banned

    #10
    Paul, I personally think that having valid code will help Google and other bots crawl your site. Our newly launched website has Googlebot constantly looking at it - it's a shame we have hardly any content for it to look at :D But this tells me that there is something in having valid code.

    http://www.building-site.co.uk

    We don't really have many backlinks; we just seem to have Google bait! The only thing I can really put it down to is "code to content ratio" - if you take a look at the source code of any of our sites you'll notice how lightweight it is. I may be completely wrong, but perhaps bots like this because it's easy to distinguish the actual content from the code.

    All our sites use valid XHTML markup and CSS for layout; perhaps this is what Google is finding attractive.
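
    By lightweight I mean something along these lines - a rough sketch, with the presentational detail pushed out to a stylesheet:

    <!-- Heavyweight: nested layout tables and inline presentation -->
    <table width="100%" cellpadding="0" cellspacing="0" border="0">
      <tr><td><font face="Arial" size="2"><b>Winter ski deals</b></font></td></tr>
    </table>

    <!-- Lightweight: semantic markup, with the styling handled in CSS -->
    <h2>Winter ski deals</h2>
    Code (markup):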

    If you get really stuck, as I'm a nice guy I'll take a look at making one of the template pages validate for you; perhaps this will help you out. PM me and I'll give you my email address.
     
    walshy, Dec 5, 2005 IP
  11. minstrel

    minstrel Illustrious Member

    #11
    "Valid code", yes, but not necessarily "W3C validated code".

    It's not the same thing. "Valid code" means no errors in coding. "W3C validated code" means you conform to the W3C recommendations (note: I did NOT say standards - they have been recommendations since 1999 or so - why aren't they "standards" yet? - good question, IMO).

    As noted above, the W3C recommendations flag certain elements and attributes (like <b>, <i> and target="_blank") as "deprecated" or disallowed, when in fact they work perfectly well in ALL browsers, and spiders don't have an issue with them for a moment.
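
    For instance, the very same link is reported as an error under a Strict doctype but passes under Transitional (a throwaway example, not anyone's real markup):

    <!-- Fails under XHTML 1.0 Strict: target is not in the Strict DTD -->
    <!-- Passes under XHTML 1.0 Transitional -->
    <a href="http://www.example.com/" target="_blank">Opens in a new window</a>
    Code (markup):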

    As for the code-to-content ratio idea, that MAY have a small effect, depending on the page, but spiders are pretty good at finding what they need to find - that's what they are built to do.

    Despite what you may read at WebProWorld, I have never seen any evidence at all that W3C validation itself makes any difference to rankings.
     
    minstrel, Dec 5, 2005 IP
  12. walshy

    walshy Banned

    #12
    I totally agree, Minstrel - the W3C validation results should only be taken as a recommendation. Minor code infractions, like using an element they deem deprecated, aren't a problem for anyone, be it bot or browser.

    The point I was making is that having valid, clean, lightweight code may, in my experience, be helping me attract the bots to feed on my very limited content. This is complete speculation on my part and may warrant some further investigation.

    The doctype declaration has something to do with the validation process: XHTML Strict will throw up many more errors than XHTML Transitional, and Transitional will throw up more than HTML 4.0.

    I suppose the key is to match the doctype to the page's code structure and make sure there are no glaring errors, like misaligned end tags, that could potentially hamper a bot spidering the site.
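
    For reference, these are the declarations the validator keys off - pick whichever one actually matches your markup:

    <!-- HTML 4.01 Transitional -->
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">

    <!-- XHTML 1.0 Transitional -->
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

    <!-- XHTML 1.0 Strict -->
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    Code (markup):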
     
    walshy, Dec 5, 2005 IP
  13. minstrel

    minstrel Illustrious Member

    #13
    Absolutely. No argument whatsoever with that, Walshy.
     
    minstrel, Dec 5, 2005 IP
  14. northpointaiki

    northpointaiki Guest

    #14
    Minstrel, Walshy, Hans - my gratitude.

    Hans was quite helpful in aiding the removal of the DOCTYPE problem.

    Many of the A element errors I am getting from the W3C validator are on the order of:

    Error Line 209, column 72: document type does not allow element "A" here.

    ...ck-1850606-10372948" target="_blank" >Get up to 70% off on skis, snowboards,

    The element named above was found in a context where it is not allowed. This could mean that you have incorrectly nested elements -- such as a "style" element in the "body" section instead of inside "head" -- or two elements that overlap (which is not allowed).

    One common cause for this error is the use of XHTML syntax in HTML documents. Due to HTML's rules of implicitly closed elements, this error can create cascading effects. For instance, using XHTML's "self-closing" tags for "meta" and "link" in the "head" section of a HTML document may cause the parser to infer the end of the "head" section and the beginning of the "body" section (where "link" and "meta" are not allowed; hence the reported error).



    - with the error in question being the "document type does not allow element "A" here" message above. I have re-downloaded the links from the appropriate sites and made sure I have cleaned up all the code around that area (I know that when I cut and pasted the links in the past, I may have done so inside some pre-existing code). Still, the errors keep coming back.
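
    If I follow the explanation, one way to produce that sort of message by pasting inside pre-existing code would be nesting one link inside another - a made-up fragment, not my actual affiliate code:

    <!-- An affiliate link pasted inside an existing link: "A" may not be nested inside "A" -->
    <a href="old-page.html">Ski deals
      <a href="http://www.example.com/track?id=12345" target="_blank">Get up to 70% off on skis</a>
    </a>

    <!-- Corrected: the old anchor is closed (or removed) before the new one starts -->
    <a href="http://www.example.com/track?id=12345" target="_blank">Get up to 70% off on skis</a>
    Code (markup):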

    Any thoughts continue to be appreciated.

    Paul

    PS: I should add that many of the W3C errors appear to be due to image redirects - i.e., the affiliate link points to an image source file, which in fact looks like a hyperlink (for example, www.sportsmansguide.com). There are no ALT attributes in the image source code. Is this an accessibility issue rather than a crawling issue for Googlebot and others?
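
    The kind of thing I mean looks roughly like this - the URLs are placeholders, not my real affiliate code:

    <!-- As generated: the img has no alt attribute, which the validator flags -->
    <a href="http://www.example.com/track?id=12345" target="_blank"><img
        src="http://www.example.com/banner.gif" width="468" height="60" border="0"></a>

    <!-- With an alt attribute added for accessibility (and validation) -->
    <a href="http://www.example.com/track?id=12345" target="_blank"><img
        src="http://www.example.com/banner.gif" width="468" height="60" border="0"
        alt="Sportsman's Guide - outdoor gear deals"></a>
    Code (markup):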
     
    northpointaiki, Dec 5, 2005 IP
    GTech likes this.
  15. wrmineo

    wrmineo Peon

    #15
    It may be a matter of an improperly closed or an unclosed tag.

    <!-- Unclosed: the missing end tag is what trips the validator -->
    <a href="http://www.olsens.freshdames.com" target="_blank">Mary-Kate and Ashley Olsen aren't identical.

    <!-- Properly closed -->
    <a href="http://www.olsens.freshdames.com" target="_blank">Mary-Kate and Ashley Olsen</a> aren't identical.
    Code (markup):
    Post the URL of the page and the error in question if you want - someone will have the exact answer you're looking for, no doubt.
     
    wrmineo, Dec 5, 2005 IP
  16. walshy

    walshy Banned

    #16
    Ideally the link should take this form, with the accessibility title attribute:

    <a href="(whatever the url is)" title="Anchor Text Title" target="_blank">Anchor Text</a>

    This is an example without the title attribute:

    <a href="(whatever the url is)" target="_blank">Anchor Text</a>

    (note where the tag opens with <a ...> and closes with </a>)

    As wrmineo quite rightly suggested, it could be the <a> tag not being closed. In your example you didn't paste the closing </a>. This is what might be causing the validation error, and it could be causing Googlebot to "throw a wobbly" (London slang).

    Hope this helps. I wouldn't worry too much about affiliate code not validating - if you tighten up your end tags, then IMO you don't have much to worry about.
     
    walshy, Dec 5, 2005 IP
  17. northpointaiki

    northpointaiki Guest

    #17
    Hi all - thanks to you, and, I must say, to a torrent of help from this board's Hans, I think I am on the right road. My index page is down to 3 errors (from 41), and I am following a similar pattern with all the other pages. Most of the errors stem from the affiliate links, which appear to be coded poorly.

    The others are due to a "&" in end tags - document.write("</center&nbsp;>") - and to the JavaScript in a "bookmark this page" include. I am learning a good deal. I owe all of you, most especially Hans, a great deal.
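
    The sort of change involved looks roughly like this - a simplified sketch with placeholder markup, not the actual include:

    <script type="text/javascript">
      // A literal "</" inside an inline script can be read as the end of the
      // script element, so the slash is escaped in the string below
      document.write("<\/center>");
    </script>

    <!-- Separately, bare ampersands in markup (e.g. in affiliate URLs) should
         be written as &amp; to keep the validator happy -->
    <a href="http://www.example.com/track?id=123&amp;page=skis">Ski deals</a>
    Code (markup):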

    Paul
     
    northpointaiki, Dec 5, 2005 IP