Hi all! I'm not new to the online marketing field, but SEO/SEM (outside of PPC) is a topic I've stayed only moderately abreast of because it's just not my bread and butter (that may sound odd if you're ensconced in SEO/SEM, but it's a big field out here). As such, when I come to re-educate myself from expert forums like DP, I usually stumble upon something I have a question about, so this time I thought I'd ask.

I picked up a tip about VIPS from a thread here and followed the topic through several posts across the web, including Microsoft's original paper on it. Coming from the world of technology as much as the world of marketing, the concept isn't hard to grasp at a 30,000-foot level. What is hard to grasp is how it's being implemented. I can't seem to find any information about how to tune a page specifically to cater to VIPS-friendly content analysis. I posted this in the Google section because the previous discussion was here, but it could apply to any engine that analyzes content.

To what degree does Google value the VIPS model in its analysis, and, if it does to a large degree, how does its analysis decide what the content frame is on a given page? CSS-based page design makes it nearly impossible to assign a weight to a given div based on position alone, so I guessed it must have something to do with either common naming conventions or with which div holds the most seemingly relevant content (though the latter wouldn't be VIPS, it would just be semantic content analysis). That's just educated guessing, though; I have no idea what the clever mischief-makers at Google do much of the time.

Furthermore, once the basic chunks of a page are sussed out, how is analysis weighted for the various parts of the page? How are pages penalized for having heterogeneous designs? These questions certainly wouldn't have definite answers and may not have any answers, but they are all questions that popped into my head based on what I've read. Does anyone have any light they can shed on the topic? It has piqued my interest. Thanks!
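To make my naming-convention guess concrete, here's the rough kind of heuristic I'm imagining. To be clear, this is pure speculation on my part: the id/class names and the weights below are my own invention, not anything Google or Microsoft has published.

```python
# Purely speculative sketch of the "common naming conventions" guess above.
# The names and weights are invented for illustration only.
from html.parser import HTMLParser

# Hypothetical importance weights keyed on common div naming conventions.
BLOCK_WEIGHTS = {
    "content": 1.0, "main": 1.0, "article": 0.9,
    "sidebar": 0.4, "nav": 0.2, "menu": 0.2,
    "header": 0.2, "footer": 0.1,
}

class BlockGuesser(HTMLParser):
    """Guess a block's importance from its id/class attribute alone."""
    def __init__(self):
        super().__init__()
        self.guesses = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attr_map = dict(attrs)
        label = (attr_map.get("id") or attr_map.get("class") or "").lower()
        for name, weight in BLOCK_WEIGHTS.items():
            if name in label:
                self.guesses.append((label, weight))
                break

parser = BlockGuesser()
parser.feed('<div id="header"></div><div id="content"></div><div id="footer"></div>')
print(parser.guesses)  # [('header', 0.2), ('content', 1.0), ('footer', 0.1)]
```

Whether any engine actually leans on div names like this, versus rendering the page and judging position visually, is exactly what I'm asking about.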
Voodoo Inspired Page Sorting. But seriously, it stands for Vision-based Page Segmentation. Here's the original MS paper on it: http://research.microsoft.com/research/pubs/view.aspx?tr_id=690 For a basic overview, it's probably just as informative to Google "VIPS + SEO" and read the scant blog postings. EDIT: The link above isn't clickable because I apparently don't meet the forum's 10-post qualification yet, so you'll have to copy and paste. Not my fault.
Hi there... VIPS is simply the MSN version of what's also called 'block-level analysis', so VIPS is an MSN-centric term... sorry about that. Another common term is 'page segmentation'. There are more than a few Google patents dealing with page segmentation in a variety of directions.

While it can be used for a variety of things during the indexing and retrieval process, the more interesting use of note is in link profiles. As the valuation of backlinks (and even internal link structures) becomes more sophisticated, this tool helps in a simple manner (at this point at least).

SIMPLIFIED: the algos can divide the page into 'segments'. For our purposes we shall have:

Header
Side panel
Content
Footer

From some stuff I have read, the valuation of links in some forms of BLA runs like this, from least valued to most:

Footer (lots o' SPAM and footer nav)
Header (much the same as above)
Side panel (slight valuation increase)
MAIN CONTENT - these are considered 'editorial links' and have the highest valuations

So the main effect of page segmentation (or BLA) can be seen in link profile development; see the sketch below. Once again, other aspects can be used for filtering duplicate content and more... that's the basic idea. Getting anywhere?
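To make that ordering concrete, here's a toy sketch. The segments and the least-to-most ordering follow what I described above, but the actual multipliers are numbers I made up; no engine publishes its real ones.

```python
# Toy illustration of segment-based link valuation (BLA / page segmentation).
# The ordering matches the post above; the multipliers are invented.
SEGMENT_LINK_VALUE = {
    "footer":  0.1,   # lots of spam and footer nav: least valued
    "header":  0.2,   # much the same as the footer
    "sidebar": 0.4,   # slight valuation increase
    "content": 1.0,   # 'editorial' links: highest valued
}

def link_value(segment: str, base_value: float = 1.0) -> float:
    """Scale a link's base value by the segment it appears in."""
    return base_value * SEGMENT_LINK_VALUE.get(segment, 0.5)

# The same link is worth far more in the main content than in the footer.
print(link_value("content"))  # 1.0
print(link_value("footer"))   # 0.1
```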
Right, most of that I knew from all the stuff I read before posting. What I don't understand - if my post wasn't clear enough - is how they decide what is what and how best to structure your content to make sure that the right part of the page gets indexed as intended. Is it important to call your content div 'id="content"' and such? Can the engine value more than one section if more than one contains what appears to be primary content? You see what I'm getting at? The info about backlinks is interesting though, I hadn't considered that.
Well, ultimately there's not much to worry about content-wise... if anything, it's an effort by the SEs to ensure they GET the proper content indexed. The SE with the best results wins (in theory)... so they dearly WANT to index your site. This methodology would allow the spiders to look beyond the early code (header and left panel) and ensure the content is properly indexed.

Your considerations can come into play via internal linking structures, via editorial links - in essence telling the SEs which pages on your site you value more than others. You also have the external linking (link profile) implications mentioned earlier. It is an implementation within the algos and the indexing/retrieval process, not an SEO tool... lol. Regardless, understanding it is a good idea; the concepts have been picking up speed the last few years and are worth learning about. Much of SEO learning is about absorbing information that becomes your theories through osmosis, not direct tutorial.
I understand everything you're saying, but it's not technically coherent enough to make much sense. There's certainly nothing here worth understanding in practical terms if there's no expected idea of the implementation that can be elucidated. I know it's not an SEO tool; my question is purely about how an engine like Google's is believed to parse the code on a page before it attaches a visually-related value to it, so that it can THEN make VIPS-related judgements. After that, I can make a judgement about how it values content based on the expected norms laid out by the original spec. Of course they would value primary content more than the other sections, and of course that would factor into the value of links found in that content; that doesn't need to be explained to anyone who has read the spec, because it's the entire point of the model, and it certainly doesn't need to be re-explained to me twice in my own thread after I mentioned having read all about it.

Anyway, without understanding its implementation, there are waayyyyy too many assumptions in your post about what would be valued in reality based on theory. In short, I don't need to be told what VIPS is intended to do as a philosophical exercise; I need to know how the engines implement it as a technical process, and it doesn't sound like you can help me there.

Sorry to be cranky, but I was very detailed from word one about what I'm asking, and for some reason you have repeatedly ignored my very detailed posts and instead answered other, less complicated questions that aren't very difficult to answer. I've no doubt you have a great deal of knowledge about SEO and all other manner of things that I am a moron about, but if you can't answer specific questions about the hows and whys of technical application of theory, particularly when I'm super-clear about the questions and have a decent understanding of the fundamentals behind them, then please don't simply re-state the theory as a non-answer. It's not helpful, and after the second time it's actually quite patronizing, to be honest.
Then you need to shake yer head a bit, laddy... because unless there is a G engineer or another running around spilling secrets, a great deal IS speculation based upon the available information and making one's own judgements. So get to work reading and make your OWN decisions. I won't bother you a third time... and grab the Windows source code and the McDonald's secret sauce while you're at it -- we aren't being handed everything in the SEO game, unfortunately...
I have a page on my website about VIPS, but it is more of the same of what you already know; I discuss the theory for people who have never heard of it.

I honestly believe that the answer to your question may be a matter of looking at a website as a whole, rather than at a page as a disjointed piece of the puzzle. When the search engines want to know what part of a page is content and what is not, it is nearly impossible to tell from looking at the code of a single page; there are just too many factors to consider. However, if the search engines look at one page as a comparative model between it and the other pages of the same website, then they should be able to find the truth of the matter. Most of us tend to use the same template from page to page, and although we might change up the content that appears in a sidebar from page to page, there are usually minor pieces of that sidebar that carry a common format or theme, perhaps even some similar links. Of course, this is just more theory, but if the search engines look at a page within the context of the whole site, implementation of VIPS should be pretty straightforward for them; a sketch of the idea follows.

Having said that, I have never seen any indication from any search engine company, or from their chatter, that they are in fact pursuing VIPS or segmentation as an integral part of their algorithms. MSN and Google both have patents either approved or pending on this basis, but Google has many patents that are not implemented in its current algos. Until I see evidence coming from within the search engine beltway that they are utilizing the concept now, I personally treat it as something we should consider for the future, without giving it much concern just yet.

Bill Platt
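To show what that whole-site comparison might look like, here is a deliberately naive sketch: blocks whose text repeats across most pages of a site are probably template (header, nav, footer), while blocks unique to one page are probably the real content. The crude regex "segmentation" and the 0.6 threshold are illustrative guesses only, not how any engine is known to do it.

```python
# Naive sketch of template detection by cross-page comparison.
import re
from collections import Counter

def blocks(html: str) -> list[str]:
    """Very crude 'segmentation': split on div boundaries, keep the text."""
    parts = re.split(r"</?div[^>]*>", html)
    return [re.sub(r"<[^>]+>", " ", p).strip() for p in parts if p.strip()]

def classify_site(pages: list[str], threshold: float = 0.6):
    """Label each block of each page as 'template' or 'content'."""
    counts = Counter(b for page in pages for b in set(blocks(page)))
    labeled = []
    for page in pages:
        for b in blocks(page):
            kind = "template" if counts[b] / len(pages) >= threshold else "content"
            labeled.append((kind, b))
    return labeled

pages = [
    "<div>My Site</div><div>Article about cats</div><div>Copyright 2007</div>",
    "<div>My Site</div><div>Article about dogs</div><div>Copyright 2007</div>",
]
for kind, text in classify_site(pages):
    print(kind, "->", text)
# 'My Site' and 'Copyright 2007' repeat on every page -> template;
# the unique article text on each page -> content.
```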
Bill, thanks! That makes a lot of sense. Comparative analysis within the confines of one site would be the most obvious model and I hadn't even thought of that. I also appreciate your comments on current vs. future implementation, because things can be so murky that I worry whether I'm missing something that I should be programming for. Thanks again for the sharp commentary, it's refreshing.
Well, the footer/menu links should be easy to distinguish anyhow: those are the links that aren't embedded in running text, are mostly internal to the site, and are shared in common with a lot of other pages. If I were a Google engineer, I would make those a first candidate for less weight IF there are too many external links there. But I have a good indication that they still count: I had straight HTML-coded affiliate links in my menus, without rel="nofollow" on them, and those pages got less Google exposure than expected, so I concluded that the pages they linked to got ranked instead. Once I put rel="nofollow" on those links, the pages the links were on ranked far better.
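If you wanted to play with that heuristic yourself, a rough sketch might look like the following. All three signals come from my description above, but every threshold is a guess on my part, nothing official.

```python
# Rough sketch: flag a link as navigation/footer if it has little
# surrounding prose, points inside the site, and recurs on many pages.
# All thresholds are illustrative guesses.
from urllib.parse import urlparse

def looks_like_nav_link(href: str, surrounding_text: str,
                        pages_seen_on: int, total_pages: int,
                        site_host: str) -> bool:
    internal = (urlparse(href).netloc or site_host) == site_host
    bare = len(surrounding_text.split()) < 5        # not embedded in prose
    repeated = pages_seen_on / total_pages > 0.8    # appears almost everywhere
    return internal and bare and repeated

# A menu link: internal, no surrounding prose, present on every page.
print(looks_like_nav_link("/about", "", 50, 50, "example.com"))  # True
# An editorial link: external, embedded in a sentence, on one page only.
print(looks_like_nav_link("http://other.com/x",
                          "as discussed in this excellent article on", 1, 50,
                          "example.com"))                        # False
```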