ok, i won't dispute that. ohhhhh? this sounds like a challenge michael. i've always enjoyed watching you joust with other users here, because you tend to be arrogant and unnecessarily combative just like jhmattern tends to get supercilious and condescending, resulting in a lot of unnecessary antagonism which clouds your judgment and hinders your ability to win the argument cleanly, even though most of the time you are right. ok, michael, then why don't you define this thing called "topic" or "theme" for me. pick one theme, any theme, and we'll take it from there. nick
Thank you. Moi...? (adjusts halo) No, my judgement does not get clouded unless I haven't slept. No, what mostly hinders my ability to win is the other person's inability to see that they have lost. I often have to take the Gore approach and back down, or risk looking stupid trying to get someone to understand something they just can't. (Arrogant enough?) Here, play with this a bit: http://labs.google.com/sets Edit - and this: https://adwords.google.com/select/KeywordToolExternal Also, I invite you to actually try the experiment I laid out earlier. You mentioned (albeit incorrectly spelled) the Turing test. Are you familiar with neural networks? -Michael
what this set toy from g labs proves is that, if you give it five keywords, it can cough up a list of words that are statistically likely to be found on the same site, page, or in the same paragraph, or sentence. this is very easy for google to do because all it needs to do is build a relational database of incidence frequency. it doesn't prove anything about relevancy. if i have a page with keyword "kitty litter" and three score keywords not featured in google's set, it does not mean google has the authority to deem my page irrelevant, because all it means is that my page sits statistically towards the long-tail end of the graph. moreover, most web pages don't have a meta tag with "set" and 5 csv keywords clearly indicating the topic. i stand by my postulation that this thing called relevancy is but a phantom as far as a search engine is concerned. nick
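To make the "incidence frequency" idea in the post above concrete, here is a minimal sketch of expanding a seed keyword into a set using nothing but page-level co-occurrence counts. It illustrates the poster's description only; it is not a claim about how Google Sets is actually implemented. The corpus, the function names `build_cooccurrence` and `expand_set`, and the `top_n` parameter are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(pages):
    """Count how often each pair of keywords appears on the same page."""
    counts = Counter()
    for keywords in pages:
        # Sort so each unordered pair is always stored under one key.
        for a, b in combinations(sorted(set(keywords)), 2):
            counts[(a, b)] += 1
    return counts


def expand_set(seeds, counts, top_n=5):
    """Rank non-seed keywords by total co-occurrence with the seed keywords."""
    seeds = set(seeds)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in seeds and b not in seeds:
            scores[b] += n
        elif b in seeds and a not in seeds:
            scores[a] += n
    return [word for word, _ in scores.most_common(top_n)]


# Hypothetical toy corpus: each page is just a list of its keywords.
pages = [
    ["kitty litter", "cat food", "scratching post"],
    ["kitty litter", "cat food", "litter box"],
    ["cat food", "litter box", "dog food"],
    ["stock market", "bonds", "dividends"],
]
counts = build_cooccurrence(pages)
print(expand_set(["kitty litter"], counts))
```

In this toy corpus "cat food" ranks first because it shares two pages with "kitty litter", while "stock market" never shows up because it shares none. That is the long-tail point from the post: a keyword outside the expanded set is merely rare alongside the seed in the sample, which by itself says nothing about whether a page is relevant.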
i think the way to think about relevancy vis-a-vis search engines, specifically google, is like this: relevancy is genuine interest vs blatant seo-driven linking. in other words, even if i have a site that never ever mentioned the keyword, if i come across it and i think it's a good thing, and link to it, that's genuine interest, even though it's off-topic... in other words the link anchor text is from a "set" that has never appeared on my site before. now this is ok if the motivation is genuine, it is relevant even though the keyword stats indicate otherwise (like the case with the stock market analyst linking to a cat litter company). but if the link is due to a link exchange, or purchased link, or a link begged from a friend or relative for seo purposes, that is a link lacking in relevancy. now, the problem is that something called "intent" is awfully hard to figure out, because it resides in people's hearts. what i would do if i were google is trawl all the seo forums (like this one), vacuum up the links in people's signatures, and automatically downgrade them if their inbound links come from pages without overlapping sets. but that's as far as i would go. anything else would be presumptuous, and possibly wrong, and i would not want to make such a mistake - nor would google, i think...
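The "downgrade links from pages without overlapping sets" idea in the post above could be sketched roughly as follows. This is only an illustration under stated assumptions: Jaccard similarity is swapped in here as one simple way to measure set overlap, and the `threshold` value, the function names, and the example domains are all hypothetical. Nothing here is known to reflect what Google does.

```python
def jaccard(a, b):
    """Jaccard similarity of two keyword collections (0.0 when both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def flag_off_set_links(target_keywords, inbound_pages, threshold=0.1):
    """Return the linking pages whose keyword overlap with the target is below threshold."""
    return [
        url
        for url, keywords in inbound_pages.items()
        if jaccard(target_keywords, keywords) < threshold
    ]


# Hypothetical data: the target site's keywords and two pages linking to it.
target = ["kitty litter", "cat food", "litter box"]
inbound = {
    "petblog.example": ["cat food", "litter box", "cat toys"],
    "stockanalyst.example": ["stock market", "bonds", "dividends"],
}
print(flag_off_set_links(target, inbound))
```

The stock analyst's page gets flagged because it shares no keywords with the target, which is exactly the case the post warns about: the link may be perfectly genuine, so a flag like this is a statistical hint about overlap and cannot reveal intent.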
Well, what your statement proves is that you're more interested in disagreeing than in actually looking at it and thinking about what you are seeing. Without getting into the whole "is it worth pursuing" argument, it should be obvious that it can indeed be determined. I mean, we're not talking about a machine looking at the data and saying "I think, therefore I am, and I think this is relevant to that"... but the patterns, as determined by historical and existing linking patterns, are there to be seen. Hell, with control over a large enough sampling, relevancy, in the eyes of a search engine, can even be created from scratch. -Michael
yeah, the adwords tool... it's basically the same as the other one. it's about expanding a set of keywords into a larger list based on statistical likelihood of appearance. so this is enough to define a topic? if i type in keywords A - E, and google spits out a list, and my link anchor isn't on the list, does that mean my link is off-topic michael?
That's actually not what I was saying, and those tools were more to illustrate that something that could be called relevancy exists, or at least something close enough to it that it could be used as part of the ranking algorithm, if they so chose. Remember, I wasn't taking the side that relevancy is important, just that it's within reason to say that it is possible (or at least that a reasonable facsimile of it is). -Michael
no, what you can come up with is a group of words that are likely to be near each other. i wouldn't call that relevancy *chortles* unless you are talking about a very primitive search engine that gets it wrong a lot of the time. something microsoft would code, for example. i suppoooooose i might concede that in some circumstances googs might award you a wee bonus point for having a link that falls into the same set with a 99% match. but google is nowhere near being able to downgrade yo ass because your link is outside the predicted set, because it's just that, a prediction that could well be wrong.
Actually, no, I wasn't talking about word proximity, and that's not how the Labs tool works either. -Michael
check MATE buddy, because that is precisely what it ain't. unless of course your disclaimer (reasonable facsimile) is along the lines of "well it might be... in some cases... somewhat... mostly correct... " googs is a loooooong way off from nailing relevancy, or topic, or theme, because there ain't no clearcut boundary or definition. all topics spill over into other ones, there are concentric circles of intermeshing circles, and if site A's keywords are all over the map, it just means that your statistical model isn't good enough. unless, of course, someone is building links artificially. that's the problem. you just don't know!
pray dear sir, enlighten this humble journeyman; how, then, does google labs work, if not incidence frequency + proximity?
Better yet, why don't you convince yourself that my being bored with discussing this with you actually equates with you being right, and you just go to bed happy in the comfort that you won. -Michael
ah yes, when all else fails, cook up that little mix of personal insult and unabashed retreat that you have practically trademarked, leaving those of inferior intellect and a lesser sense of self-worth puzzled and angered beyond the ability to see the white banner flapping in the wind. very well, i will take your suggestion and do just that. i bid you good night. till we next meet michael. nick
This thread has gotten a lot of posts since I last posted and I haven't read them all yet. monosodium: Who is Matt Cutts? Who does he work for? He will only tell you what Google wants you to know. If we know too much we can manipulate and "cheat" in the SERPs, and helping us do that is not in his interest. So Matt Cutts might be a good source on some subjects but not all. About relevancy, it's not that hard to understand that it actually does exist. Just like mvandemar mentioned, they use artificial neural networks to find patterns and detect relevancy. The only problem with this is that it needs training and a lot of information to do that. Guess whether the Internet is the perfect place for this? Then maybe they are fine-tuning it manually, but I don't know if that is even necessary.