:::The Core Of How Google Works:::

Discussion in 'Google' started by PRBot.Com, Sep 19, 2004.

  1. #1
    PRBot.Com, Sep 19, 2004 IP
  2. flawebworks

    flawebworks Tech Services

    #2
    I'll be saving that one for a rainy day...

    Provided the power stays on, of course. Lately you never know.
     
    flawebworks, Sep 19, 2004 IP
  3. nohaber

    nohaber Well-Known Member

    #3
    1. Publishing papers on a site without permission is illegal.
    2. The Google paper is great, but a lot of additional stuff has been added on top of it since (like duplicate-document detection, LocalRank, etc.).
    3. I like the idea of the site. 99.9999% of all "experts" are con artists. There are no experts or professionals or whatever nonsense. Most SEOs have no programming or algorithmic background and can in no way make even educated guesses about search engines.
    The only true SEO experts are the guys who've spent years studying search engine papers, researching and programming. And the best of those work at Google, Yahoo, etc.
    Creating an SEO industry out of people with no knowledge is a nice way to talk nonsense and get paid, though. :D
     
    nohaber, Sep 20, 2004 IP
  4. Mel

    Mel Peon

    #4
    If you have some concrete indication that Google has added duplicate detection or LocalRank to the Core of Google, I would be very interested in seeing it.

    IMO, if Google's duplicate content patent has in fact been implemented, it is not as part of the core ranking system but as a standalone program that is run across Google's database whenever they feel a need for it, as they do with many other standalone programs.

    I really doubt that LocalRank has in fact been implemented as part of Google's ranking algo, though it is possible that they have implemented it and set the constants such that it is so minor a factor as to be unnoticeable.

    Sorry, nohaber, but I can't agree that the only SEO experts are those who have spent years studying search engine papers, programming and researching.

    IMO, if you can consistently get top rankings with good traffic for your clients, then you have passed the SEO expert test, and it is not rocket science.
     
    Mel, Sep 20, 2004 IP
  5. john_loch

    john_loch Rodent Slayer

    #5
    I'd say that sums it up, at least where piecemeal sites are concerned.

    There is of course the next level, where SEO practitioners plan, build and manage large distributed networks... but at the end of the day I still believe it comes down to a few basic principles (per engine).

    Consistency is definitely a good thing :)
     
    john_loch, Sep 20, 2004 IP
  6. Mel

    Mel Peon

    #6
    Yes, John, but the question is whether these large distributed networks of sites, which are built only for ranking purposes (and they are, no matter what fancy window dressing they try to use), are ethical, and whether they might be subject to penalties.
     
    Mel, Sep 20, 2004 IP
  7. fluke

    fluke Guest

    #7
    As Mel says, if you can consistently get good rankings with good traffic, then that is what determines whether you are an SEO expert or not.
    However, I think it's the guys who spend all their time studying the white papers, researching and experimenting who are most likely to ensure your site carries on getting good rankings after each algo change.
    By the way, I read the original paper ("Anatomy of a Large-Scale..." blah blah). I noticed people are mentioning LocalRank etc. Is this that paper, but edited by other people?

    Because if it is, couldn't they get in a lot of trouble?
     
    fluke, Sep 20, 2004 IP
  8. Mel

    Mel Peon

    #8
    LocalRank is a different topic; it has been patented, so you can find it here.
     
    Mel, Sep 20, 2004 IP
  9. nohaber

    nohaber Well-Known Member

    #9
    That's bullshit. If you consistently get top rankings, it only means that. Anyone can learn to put keywords here and there. Everyone knows that lifting weights and eating lots of protein builds muscle, but not every muscular man is a muscle-building expert.

    An SEO expert is someone who knows the inner workings of search engines.

    Consider two people: someone who (for example) has worked at Google and knows the precise algorithm, and someone who has 20 high-PR sites. The guy from Google does not have money and PR sites, so while he is really an expert because he knows Google, he can't get his site to #1 for "search engine optimization", while the other guy will outrank him simply by pointing links from his high-PR sites at it. The second guy has leverage, although he is no expert.

    Do you think the guys who top Google for "search engine optimization" are the best experts? Or are they the richest, buying links from every high-PR webmaster site?

    An SEO expert is someone who understands the fine points of search engines, and that is only possible with a programming/algorithms background.

    I am no expert. You are no expert. No one here is an SEO expert. Those who think they are experts are full of ****.

    1. Google has the best duplicate detection of all engines. You rarely see duplicate content in the SERPs, and the instances you do see are the ones that bypass the detection algorithm. For a human it is easy to spot duplicate content, but adding a sentence here and there and changing something on the page may be enough to bypass the detection algorithm.
    In fact, the two biggest problems Google faced after they went online were:
    1) the growth of the web
    2) spam

    1) they solved by first developing their distributed network architecture and then developing the Google operating system;
    2) the duplicate detection is one of the best spam detectors.

    Here's a quiz for you, Mel. Read the dup-content patents and tell me: what are the *minimal* steps you need to take to bypass the duplicate document detection? How many SEO experts can actually understand things like hashing, union-find, etc.?
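    For anyone wondering what hashing has to do with it, here is a rough sketch of shingle-style near-duplicate detection. It is only an illustration of the general technique, not the algorithm from the patents; the function names and threshold are made up.

    ```python
    # Illustrative shingle hashing, not the patented algorithm.
    from hashlib import md5

    def shingles(text, k=8):
        """Hashed k-word shingles of a document."""
        words = text.lower().split()
        return {
            md5(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
            for i in range(max(len(words) - k + 1, 1))
        }

    def resemblance(doc_a, doc_b, k=8):
        """Jaccard overlap of the shingle sets; close to 1.0 means near-duplicate."""
        a, b = shingles(doc_a, k), shingles(doc_b, k)
        return len(a & b) / len(a | b) if a | b else 0.0

    # Adding a sentence here and there barely changes the shingle set, so the
    # pair still scores far above a plausible threshold such as 0.9.
    page1 = "lorem ipsum dolor sit amet consectetur adipiscing elit " * 30
    page2 = page1 + "one extra sentence tacked on the end"
    print(resemblance(page1, page2))
    ```

    Clustering everything that scores above the threshold is where a union-find structure comes in.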

    The duplicate document detection is part of the crawling function. The query-specific duplicate detection is part of the searching facility. It is not used to rank, but to exclude pages from the results.

    LocalRank either improves the relevancy or it does not. LocalRank delays query response time, and no sane programmer would put it to work if it did not improve the relevancy. So it is either heavily used, or not used at all. Your "minor factor" does not make any sense.
    IMO, LocalRank is used. I have noticed it in the keywords I compete for with my dieting software site.
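    To make the response-time point concrete, here is a very simplified sketch of what a LocalRank-style pass has to do for each query, assuming the rough scheme in the reranking patent (only the top results are examined, and only links between them from different hosts count). The data structures and field names are invented for the example.

    ```python
    # Simplified LocalRank-style pass over the top results for one query.
    from urllib.parse import urlparse

    def local_scores(results, links):
        """results: list of {'url': ..., 'old_score': ...} for the top hits.
        links: set of (source_url, target_url) pairs among those hits."""
        by_url = {r["url"]: r for r in results}
        scores = {}
        for target in results:
            target_host = urlparse(target["url"]).netloc
            contributions = [
                by_url[source]["old_score"]
                for (source, dest) in links
                if dest == target["url"]
                and source in by_url
                and urlparse(source).netloc != target_host  # skip same-host links
            ]
            scores[target["url"]] = sum(contributions)
        return scores

    # Two results on other hosts pointing at page-a lift its LocalScore.
    hits = [
        {"url": "http://a.example/page-a", "old_score": 0.9},
        {"url": "http://b.example/page-b", "old_score": 0.7},
        {"url": "http://c.example/page-c", "old_score": 0.5},
    ]
    links = {
        ("http://b.example/page-b", "http://a.example/page-a"),
        ("http://c.example/page-c", "http://a.example/page-a"),
    }
    print(local_scores(hits, links))
    ```

    All of that link inspection happens at query time, on top of the normal scoring, which is the extra cost being argued about.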

    A large distributed network of high-PR sites is a definite advantage. It gives leverage for faster ranking. But it does not make someone an SEO expert. Just a "PR-rich" person. :cool:
     
    nohaber, Sep 20, 2004 IP
  10. Jenny Barclay

    Jenny Barclay Peon

    #10
    1. Hristo, excellent post

    2. "If you have some concrete indications that Google has added duplicate detection"

    Masses. Duplicates dropped from the SERPs and given PR0.

    Not mine, I might add.
     
    Jenny Barclay, Sep 20, 2004 IP
  11. Mel

    Mel Peon

    #11
    Sorry, but IMO that person would be a search engine expert, not a search engine optimization expert. A good search engine optimization expert also has to understand page design, user behavior, keyword selection, copywriting, etc.

    Yes, he is a Google expert, but he may not be an MSN expert, and in fact he may not be an optimization expert either.

    Well, I can only speak from personal experience, and that is that I have never paid for or traded a link in my life, but I still top the Google rankings for my keywords.

    You are entitled to think as you like, but not all here may share your opinion, even if they are too modest to label themselves as experts.

    Sorry, but I am not interested in bypassing the Google duplicate detection system.

    Well, yes, I agree that it is not a ranking program, but I believe it would be really stupid to implement it during the crawling process (even if the bots could be upgraded to that extent, which I doubt) when it is so much easier and faster to run it across the database; that's simple engineering and common sense. But if you can come up with a reference that shows it is in fact done during crawling, then I too will crawl.

    If LocalRank were in fact heavily used in the ranking system, it would filter out all internal links, and I can assure you that internal anchor links are still a ranking factor.

    Since you seem to be aware of the LocalRank algo, you would know that it is an addition to the "old rank" system and that the equation suggested for its use in the patent applies constants to both the old rank and LocalRank portions. The choice of constants allows either old rank or LocalRank to have anywhere from a minimal to a maximum effect on the final rankings.
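    For reference, the combination suggested in the patent is (roughly, and worth checking against the actual text) NewScore(x) = (a + LocalScore(x)/MaxLS) * (b + OldScore(x)/MaxOS). A toy calculation with invented numbers shows how the constants dial the LocalRank contribution up or down:

    ```python
    # Rough form of the score combination suggested in the LocalRank patent:
    #   NewScore(x) = (a + LocalScore(x)/MaxLS) * (b + OldScore(x)/MaxOS)
    # The numbers below are invented; the point is only that the constant 'a'
    # controls how much a difference in LocalScore can move the final score.

    def new_score(local, old, max_ls, max_os, a, b):
        return (a + local / max_ls) * (b + old / max_os)

    # Large 'a': a LocalScore spread from 0.2 to 1.0 barely changes the result.
    print(new_score(0.2, 0.9, 1.0, 1.0, a=10.0, b=0.1))   # ~10.2
    print(new_score(1.0, 0.9, 1.0, 1.0, a=10.0, b=0.1))   # ~11.0
    # Small 'a': the same spread changes the result several-fold.
    print(new_score(0.2, 0.9, 1.0, 1.0, a=0.1, b=0.1))    # ~0.3
    print(new_score(1.0, 0.9, 1.0, 1.0, a=0.1, b=0.1))    # ~1.1
    ```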

    This is the first thing we seem to agree on.
    :)
     
    Mel, Sep 20, 2004 IP
  12. Mel

    Mel Peon

    #12
    Hi Jenny, the entire quote goes:
    (emphasis mine)

    I know that Google is using very good duplicate detection, but IMO it is not part of the Core of Google or the ranking algo. I am of the opinion that it is a stand-alone program run on an as-needed basis over the Google database. The reason I believe this is that it is not necessary to check every page for duplicates on every query, and doing so would stretch Google's half-second response time to perhaps one or two minutes.
     
    Mel, Sep 20, 2004 IP
  13. nohaber

    nohaber Well-Known Member

    #13
    Crawling is not only fetching pages. It is a lot of other things. Detecting duplicate documents is important so that you don't recrawl dup-docs and don't waste CPU time and space on indexing them. If you need a reference, just read the damn patent :D
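    As a sketch of that crawl-side use (an assumption-level illustration, not a description of Google's crawler): fingerprint each fetched page and skip the ones already seen, so duplicates never reach the indexer.

    ```python
    # Sketch only: fingerprint fetched pages and drop exact duplicates so they
    # never cost index space or another crawl slot.
    from hashlib import sha1

    seen_fingerprints = set()

    def handle_fetched_page(url, html, index_fn):
        fingerprint = sha1(html.encode("utf-8")).hexdigest()
        if fingerprint in seen_fingerprints:
            return False                   # duplicate content: stop it here
        seen_fingerprints.add(fingerprint)
        index_fn(url, html)                # only novel content gets indexed
        return True
    ```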

    How would it "filter out" all internal links??? Internal links are counted in the OldScore. The LocalScore has nothing directly to do with internal links. A page with a high OldScore passes a high LocalScore on to the documents it links to, so internal links are used indirectly.

    You misunderstood me. You don't want to put up something that increases response time unless it *substantially* improves the SERPs. If it is a *minor* factor, it can't do any good. Why waste CPU time on something *minor*? That's bad algorithm design.


    How do you define the Core of Google?
    It is NOT necessary to check EVERY page for every query unless the programmer is stupid :D You need a reference? Read the damn patents. :D

    Here's a snippet from Google's API reference:

    "The <filter> parameter causes Google to filter out some of the results for a given search. This is done to enhance the user experience on Google.com, but for your application, you may prefer to turn filtering off in order to get the full set of search results.

    When enabled, filtering takes the following actions:

    Near-Duplicate Content Filter = If multiple search results contain identical titles and snippets, then only one of the documents is returned.
    Host Crowding = If multiple results come from the same Web host, then only the first two are returned. "

    (emphasis mine)

    Why do you think Google returns fewer than 1000 results for most queries? Because of query-specific dup-doc and host-crowding detection. And no, it does not take one or two minutes :cool:
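    Those two filters are simple to picture. A minimal sketch, assuming each result carries a title, snippet and URL (the field names are invented):

    ```python
    # Sketch of the two filters quoted above: one result per identical
    # (title, snippet) pair, and at most two results per host.
    from collections import defaultdict
    from urllib.parse import urlparse

    def filter_results(results):
        seen_pairs = set()
        per_host = defaultdict(int)
        kept = []
        for result in results:
            pair = (result["title"], result["snippet"])
            host = urlparse(result["url"]).netloc
            if pair in seen_pairs:        # near-duplicate content filter
                continue
            if per_host[host] >= 2:       # host crowding
                continue
            seen_pairs.add(pair)
            per_host[host] += 1
            kept.append(result)
        return kept
    ```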

    btw. what is your programming/algorithmic background?
     
    nohaber, Sep 20, 2004 IP
  14. PRBot.Com

    PRBot.Com Guest

    #14
    I have the proof.

    I submitted PRBot.Com and it was indexed a week later.

    Then I submitted OpenPress.Com and it was indexed a week later.

    3 days later OpenPress.Com was removed.

    Even though the sites do NOT have identical content, the software and templates are identical, and I guess that was enough to get one of the sites banned.

    As for that paper, I have received permission from the university to publish it, provided I did not edit anything out or add anything in.
     
    PRBot.Com, Sep 20, 2004 IP
  15. PRBot.Com

    PRBot.Com Guest

    #15
    Exactly! To be a "search engine expert" you have to have:

    A background in mathematical statistics and analysis,
    Knowledge of databases,
    Algorithm design, testing and implementation,
    A deep understanding of the law of averages,
    Extensive Boolean algebra as it applies to statistical comparison,
    An understanding of server hardware architectures,
    Pattern analysis,

    and a bunch more I can't think of right now because I've been up for two days SEOing a site.
     
    PRBot.Com, Sep 20, 2004 IP
  16. fluke

    fluke Guest

    #16

    Cheers, Mel. (I printed this paper (LocalRank) off a few weeks ago and still haven't got round to reading it all!) What I was saying, though, was: if the paper in the original post is "The Anatomy of a Large-Scale Hypertextual Web Search Engine" by Brin and Page, but has had other things added to it by other people, couldn't they get in trouble for doing so? Or is it clearly stated that material has been added? Sorry, I didn't make that very clear; I haven't got time to look at things properly at the minute.

    Edit: erm, I just had a really quick look at it and it's totally different from what I thought it was. I'm confused.
     
    fluke, Sep 20, 2004 IP
  17. PRBot.Com

    PRBot.Com Guest

    #17
    NOTHING has been added to it. The paper was co-authored and it is exactly as it is on the University website.
     
    PRBot.Com, Sep 20, 2004 IP
  18. nohaber

    nohaber Well-Known Member

    #18
    Great. Can you tell me whom you e-mailed/called etc.? I want to put it on my site.
    10x
     
    nohaber, Sep 20, 2004 IP
  19. PRBot.Com

    PRBot.Com Guest

    #19
    E-Mail:
    Phone: +1-650-723-2911
    Fax: +1-650-723-2353
    US Mail: Polya Hall Room 252, 255 Panama Street, Stanford University, Stanford, CA 94305-4136


    The lady's name is Lynne
     
    PRBot.Com, Sep 20, 2004 IP
  20. nohaber

    nohaber Well-Known Member

    #20
    10x PRBot.
     
    nohaber, Sep 20, 2004 IP