Building contextual web page analysis algorithms a la AdSense et al.

Discussion in 'Programming' started by disgust, Oct 27, 2007.

  1. #1
    I'm looking for literature on how to go about this, but I'm a bit lost on what to even search for.

    It seems pointless to reinvent the wheel, and I'm sure there's literature out there on how a program like this should be structured.

    I'm assuming Bayesian probability would be used to compare the likelihood of a word [or phrase] occurring on any page on the internet against the number of times it appears on a specific page. I understand that much.
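
    A rough sketch of that comparison, just to make the idea concrete. It is not how AdSense or any real system works; it scores each word with a smoothed log-ratio of its frequency on the page to its frequency in a background corpus, which is a simple stand-in for the Bayesian comparison described above. The names `score_terms`, `BACKGROUND` and `PAGE`, and the tiny background counts, are all made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical background counts standing in for "any page on the internet".
BACKGROUND = {"the": 1000, "a": 800, "and": 700, "is": 500,
              "for": 400, "review": 5, "camera": 2, "lens": 1}

# Hypothetical page text to analyze.
PAGE = ("The camera review: the lens is sharp and the camera is light. "
        "A camera for travel.")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_terms(page_text, background_counts, top_n=5):
    """Return the top_n words that are most over-represented on the page."""
    words = tokenize(page_text)
    page_counts = Counter(words)
    page_total = len(words)
    background_total = sum(background_counts.values())
    # Add-one smoothing so words missing from the background get a
    # small non-zero probability instead of causing a division by zero.
    vocab_size = len(background_counts) + 1
    scores = {}
    for word, count in page_counts.items():
        p_page = count / page_total
        p_background = (background_counts.get(word, 0) + 1) / (
            background_total + vocab_size)
        # Weight the log-ratio by the count so repeated words rank higher.
        scores[word] = count * math.log(p_page / p_background)
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    print(score_terms(PAGE, BACKGROUND))
```

    With this toy data, "camera" ranks first and common words like "the" fall out of the top five. Counting single words this way is one pass over the page text; the sketch does not attempt phrases.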

    But to do that, wouldn't you need to analyze every word [and combination of words] found on the page? That doesn't seem practical for obvious reasons.

    I'm not even asking for anyone to lay out a solution, but if you could point me in the right direction on where to find answers to a problem like this, it'd be much appreciated! :)
     
    disgust, Oct 27, 2007