getting better rankings for deeper pages

Discussion in 'Search Engine Optimization' started by fluke, Aug 19, 2004.

  1. Owlcroft

    Owlcroft Peon

    Messages:
    645
    Likes Received:
    34
    Best Answers:
    0
    Trophy Points:
    0
    #21
    There are 570,534 "topic" pages, plus an uncounted number of image pages; each topic page is, of course, about a different topic. Each topic page is more or less "self-optimizing" for its topic (which is used as the title and the major part of the h1 header), but I can see no way to make use of that so far as getting links might go, except, as I said, to comb through for sites linked as resources on each topic and ask them.

    Right now, the site has, of course, a zero PR, even with several PR 5 and 6 pages pointing at it, and the Backlink Tracker shows 1 page in the site. Obviously, that will change in time. I suppose, indeed, that in the depths of time, G will necessarily eventually index the entire site, at which time its front-page PR ought to be pretty interesting.

    (If anyone wants to provide a link to it--and I, at least, think it really is a useful resource--I have now added a "supporting sites Honor Roll" reciprocal-links page just off the front page.)

    The site is organized with a list of, at present, 58 topmost divisions set forth on the front page, each a link to one of 58 mid-level directory pages in the root; those 58 each point to 100 bottom-level directory pages in a subdirectory; and each of those 100 (which thus total 5,800) points to 100 actual topics (each served, of course, by php).
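    The fan-out described above can be checked with a few lines of arithmetic. This is only a sketch of the structure as stated in the post; the function and constant names are made up for illustration.

```python
# Fan-out of the three-level directory structure described in the post.
TOP_DIVISIONS = 58         # mid-level directory pages linked from the front page
PAGES_PER_DIVISION = 100   # bottom-level directory pages under each mid-level page
TOPICS_PER_PAGE = 100      # topic links on each bottom-level directory page


def hierarchy_counts(top, per_division, per_page):
    """Return (bottom-level directory pages, topic slots) for the hierarchy."""
    bottom_pages = top * per_division
    topic_slots = bottom_pages * per_page
    return bottom_pages, topic_slots


bottom_pages, topic_slots = hierarchy_counts(
    TOP_DIVISIONS, PAGES_PER_DIVISION, TOPICS_PER_PAGE
)
print(bottom_pages)  # 5800
print(topic_slots)   # 580000
```

    So the 570,534 topic pages fit inside 580,000 available slots, and every topic sits three clicks from the front page (front page, mid-level page, bottom-level page, topic).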

    The 58 mid-level pages have been indexed in G's cache for a week now (despite the currently listed page count of 1), but none of the 5,800 lower-level pages is listed yet (much less any of the topic pages). I am trying the technique of putting a link to the first of the 5,800 on one of my other-site PR 5 link pages, to see if I can get the ball rolling, but, at least in the few days since I put it up, no indexing.

    I still need promotional ideas.
     
    Owlcroft, Aug 19, 2004 IP
  2. Dominic

    Dominic Well-Known Member

    Messages:
    1,725
    Likes Received:
    121
    Best Answers:
    0
    Trophy Points:
    185
    #22
    I don't mind it when you are cranky compar - I just hope you realise these forums can be pretty intimidating at times, despite how friendly we all are most of the time - it's the level of knowledge that can be intimidating. It's easy to worry that if you write something people will call you a goose.

    Personally I only worry people read my posts and lose faith in the value of a university degree when they see how many spellin mistakes I make. :eek:
     
    Dominic, Aug 19, 2004 IP
  3. compar

    compar Peon

    Messages:
    2,705
    Likes Received:
    169
    Best Answers:
    0
    Trophy Points:
    0
    #23
    Owlcroft,

    That sounds like an interesting site. Why did you compile it and what are your long term goals for it?

    I have a plan for building a 100,000 plus page site, but I know exactly what I want to do with it and how much money I want to make from it.
     
    compar, Aug 19, 2004 IP
  4. Dominic

    Dominic Well-Known Member

    Messages:
    1,725
    Likes Received:
    121
    Best Answers:
    0
    Trophy Points:
    185
    #24
    Owlcroft - good golly miss molly! What a site!

    I suggest a ski mask and a sawn off to take with you into the googleplex - that's the only way you'll get it done in a hurry!

    It's only fair to say I think we will all want to follow your progress with this one - how would the search engine treat a site like that?

    AH - got it.

    Invent a toolbar for bloggers (or a blogging program that lets you do it) called 'OmniLink' or something like that, where they can highlight a word in their blog (as they are writing it) and click a button on the toolbar and it inserts a link to a relevant page in omniknow. (catch my drift?)

    Geez I give away the big ideas don't I :D
     
    Dominic, Aug 19, 2004 IP
  5. Owlcroft

    Owlcroft Peon

    Messages:
    645
    Likes Received:
    34
    Best Answers:
    0
    Trophy Points:
    0
    #25
    I have an idea, but not an exact one, what I hope for in the long term, which is some modest but reasonable AdSense income, perhaps with the side bonus of a little meaningful PR that I could throw at some other sites.

    I was pondering how to make a little without, in effect, just buying lottery tickets and praying--by which I mean trying to make a site on a particular topic and hope that it is the one that catches fire. (I am long since convinced that simply making an excellent site on a topic means zunt in and of itself.) The not-terribly-original thought occurred to me that a sufficiently large site, if not simple trash (as we see a lot of these days), could provide some modest traffic and AdSense impression counts simply as a numbers game.

    The next issue was how to compile a really large site that isn't trash. It's not enough to stay away from those despicable practices of pasting search results up as "pages", or overtly stealing content. Sure, Wikipedia--which is open-source--is a good start, but Heaven knows there are enough sites that are Wikipedia with the serial number filed off (well, not filed off, that's not allowed, but pretty well covered over); just doing another of those is no value added. But then I said to myself, self, what's the other big and really useful free resource out there? So I married Wikipedia to dmoz to provide both a topic essay (which often has links, but usually just a few) and a long list of high-quality pertinent links. As Chico Marx often said, "Now, you gotta sumpin'."

    To effect the marriage seamlessly and smoothly was not a picnic in the park, and I am not sure yet that all the glitches are handled (I think so, but I can scarcely hand-check a half million pages!), but I am sufficiently pleased with the result, which, to my eyes anyway, has a cleaner, more pleasing look than either of the sources.

    In time, I may try to tie other public-domain or open-source resources into it as well, but for now, my chiefest task is simply to get people to find the thing. Google will eventually index all the pages. Even without any links other than the site's own internals, at least some--even if the more arcane only--will turn up in people's searches. That will, I hope, lead them to come back and look at other pages, and perhaps mention it or even link to it. It is very, very clearly a long-term project, but I am hoping that some day it can at least pay my medical-insurance bills, which have exploded so hideously in recent years as to postpone my retirement.

    But I can use any promotional ideas anyone has about hastening the indexing and/or linking-to of the actual content pages.

    Incidentally, if it does eventually, however slowly, get indexed, even with few links it ought to have some non-trivial front-page PR, what with 560,000 pages pointing at it. Anyone who wants to help that process, while yet getting aboard early for reciprocal linking, can check the supporting-sites "Honor Roll" page. I'm begging more than promising, but, as I said in the topic header, I think this is a potential win-win in the sense that, though I may be biased, I think it really is a nice, clean, useful general resource any site can link to for its own worth, and not some cobbled-up flycatcher of a "site".

    (But I could use feedback on that, too.)
     
    Owlcroft, Aug 19, 2004 IP
  6. Dominic

    Dominic Well-Known Member

    Messages:
    1,725
    Likes Received:
    121
    Best Answers:
    0
    Trophy Points:
    185
    #26
    see my pm re links from our sites
     
    Dominic, Aug 19, 2004 IP
  7. nohaber

    nohaber Well-Known Member

    Messages:
    276
    Likes Received:
    18
    Best Answers:
    0
    Trophy Points:
    138
    #27
    When I say PR update, I always mean an internal PR update at Google, not the toolbar.

    Compar,
    your examples are based on 2 wrong assumptions:
    1) the PR on the toolbar may have nothing to do with the PR that's used in the ranking process
    2) you don't consider things like LocalRank

    Your examples don't prove that anchor text is more important than PR and that PR is not important. A PR4 page can outrank a PR7 page if the PR4 page has lots of IBLs and the PR7 has none. That's because the PR7's IR score will be low, BUT..

    Consider the classic query "search engine optimization".
    Every competitor has IBLs, on page optimization etc.etc. everything you can think of.
    Now who will be #1? The one with the highest PR and LocalRank.

    The IBLs have diminishing marginal returns. The more you have, the less they add to your rankings. After having a couple of hundred, all the additional ones mean almost nothing.

    Now consider PR. The sky is the limit with PR and LocalRank.

    My diet software site has more IBLs with proper anchor text than any of my competitors. And I am still #2 for diet software, #9 for fitness software.

    That's because of 2 simple reasons. They have more PR and more LocalRank. I am the only one without dmoz listing. And because the dmoz page ranks high on fitness software, all my competitors get high LocalRank and beat me. With diet software I am way better because they don't have LocalRank.

    Every time the DMOZ page increases its rankings, I fall because it pushes my competitors' LocalRank (which is based on PageRank + IBLs).

    A low PR page can outrank higher PR pages with the help of LocalRank. Just one link from a high PR page that is well placed for the keywords can make a low PR page kick out the higher PR ones. That's because LocalRank modifies PR, and you don't have a toolbar for it.

    The whole IBLs vs PR debate is outdated and totally meaningless. Take a look at recent Google patents. They talk about PR, the pictures attached to the patents show PR...

    Back to the "search engine optimization" where everyone optimizes everything :). The #1 site is #1 because of 2 simple reasons:
    1) it has brutal PR
    2) it buys links from other well ranked pages for "search engine optimization" like isedb.com

    It's that simple. They buy PR + LocalRank. When you make a PR4 page outrank them I'll take my words back :)
     
    nohaber, Aug 20, 2004 IP
  8. Dominic

    Dominic Well-Known Member

    Messages:
    1,725
    Likes Received:
    121
    Best Answers:
    0
    Trophy Points:
    185
    #28
    Yay - someone else who believes local rank works. I'm with you on that point.
     
    Dominic, Aug 20, 2004 IP
  9. Owlcroft

    Owlcroft Peon

    Messages:
    645
    Likes Received:
    34
    Best Answers:
    0
    Trophy Points:
    0
    #29
    Would one of you elaborate some on "Local Rank"?
     
    Owlcroft, Aug 21, 2004 IP
  10. nohaber

    nohaber Well-Known Member

    Messages:
    276
    Likes Received:
    18
    Best Answers:
    0
    Trophy Points:
    138
    #30
    nohaber, Aug 21, 2004 IP
  11. compar

    compar Peon

    Messages:
    2,705
    Likes Received:
    169
    Best Answers:
    0
    Trophy Points:
    0
    #31
    I don't feel like writing a long protracted answer, but I'm very familiar with Local Rank and it has absolutely nothing to do with PageRank. So how do you know it isn't anchor text, which provides the relevance ranking, and Local Rank, which boosts the relevance ranking, that are the drivers? That still works just fine without PageRank, which is in the end nothing but a mathematical construct that can't possibly measure relevance.

    A page's PageRank is simply a number calculated from the PR value of every page that links to it, internal or external. It is just a number. Numbers can't possibly judge relevance. Google's mission in life is to present relevant pages for terms searched on. How can they possibly do this with PageRank? It is just a number, not a relevance measurement.
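    To make the "it is just a number" point concrete, here is a minimal sketch of the iterative calculation, assuming the commonly cited normalized form of the Brin and Page formula with a damping factor of 0.85. The damping value, the tiny link graph, and the page names are illustrative assumptions, not anything Google has confirmed about its production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute a PageRank-style score for each page.

    links maps a page name to the list of pages it links to.
    Only the link graph is used; page text and anchor text never appear.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no outbound links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: three pages link to "home", "home" links to two.
links = {"home": ["a", "b"], "a": ["home"], "b": ["home"], "c": ["home"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

    Nothing in the function ever reads a keyword, which is the sense in which the score is independent of what the pages are about.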
     
    compar, Aug 21, 2004 IP
  12. nohaber

    nohaber Well-Known Member

    Messages:
    276
    Likes Received:
    18
    Best Answers:
    0
    Trophy Points:
    138
    #32
    compar, I don't know where you get this LocalRank information :) The only source that mentions LocalRank is the 2 Google patents.

    OldScore is based on PageRank and the IR Score
    LocalRank is based on the OldScore.

    So if OldScore is influenced by PageRank and LocalRank is made from OldScore :) then...

    It's all in the patent.

    No one has ever said that PageRank is a relevance measurement. It is an importance measurement. When you have 100000 relevant pages, you want to push the more important ones to the top, right? :D
     
    nohaber, Aug 21, 2004 IP
  13. fluke

    fluke Guest

    Messages:
    209
    Likes Received:
    0
    Best Answers:
    0
    Trophy Points:
    0
    #33
    not sure i understand what you mean there

    thank you

    it wasn't artificial.
    i am just trying to smooth over questions that i feel may well be felt by some as "o god not that question again" and " what the hell are you on about" questions.
    maybe not so much here but the forums i came before here i got answered with impatience and curtness to several of my inquiries.

    anyhow thanks for your answers anyway - starting to make things clearer
     
    fluke, Aug 21, 2004 IP
  14. compar

    compar Peon

    Messages:
    2,705
    Likes Received:
    169
    Best Answers:
    0
    Trophy Points:
    0
    #34
    Nohaber,

    I just did a couple of searches on the Patent Document and the term Local Rank does not appear anywhere in the document. The term used is Local Score.

    Nor does the term PageRank appear anywhere in the document.

    So your claim that Oldscore x by PageRank = Local Rank is completely unsupported. What the document says is "calculating the refined relevance scores as

    NewScore(x)=(a+LocalScore(x)/MaxLS)(b+OldScore(x)/MaxOS)

    where NewScore is the calculated refined relevance score value, a and b are predetermined constants, MaxLS is equal to the maximum of the calculated local score values, MaxOS is equal to the maximum of the calculated initial relevance score values, and LocalScore(x) refers to the local score value. "
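    As a worked example of the quoted formula, the sketch below plugs in made-up numbers. The patent text only says that a and b are "predetermined constants", so a = b = 1 is an assumption here, as are the document names and score values.

```python
def new_scores(old_scores, local_scores, a=1.0, b=1.0):
    """Apply NewScore(x) = (a + LocalScore(x)/MaxLS) * (b + OldScore(x)/MaxOS).

    old_scores and local_scores map each document to its score.
    a and b stand in for the patent's unspecified constants (assumed values).
    """
    max_os = max(old_scores.values())
    max_ls = max(local_scores.values())
    result = {}
    for doc in old_scores:
        # Guard against an all-zero score set to avoid dividing by zero.
        ls_part = local_scores[doc] / max_ls if max_ls else 0.0
        os_part = old_scores[doc] / max_os if max_os else 0.0
        result[doc] = (a + ls_part) * (b + os_part)
    return result


# Hypothetical scores: x starts ahead on OldScore, y has more local support.
old = {"x": 10.0, "y": 8.0}
local = {"x": 1.0, "y": 4.0}
refined = new_scores(old, local)
print(refined)
```

    With these numbers x scores (1 + 0.25)(1 + 1) = 2.5 and y scores (1 + 1)(1 + 0.8) = 3.6, so y overtakes x even though its OldScore is lower.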

    Where is the reference to PageRank?
     
    compar, Aug 21, 2004 IP
  15. Owlcroft

    Owlcroft Peon

    Messages:
    645
    Likes Received:
    34
    Best Answers:
    0
    Trophy Points:
    0
    #35
    It's there, but rather indirect. But where it is, is amidst some generally interesting text. The--dare I say it?--relevant extracts are, I reckon, these (with some emphases added for further comment):

    The initial rank value corresponds to a calculated relevance of the document. There are a number of suitable ranking algorithms known in the art. One of which is described in the article by Brin and Page, as mentioned in the Background of the Invention section of this disclosure. Alternatively, the functions of main ranking component 123 and document locator 121 may be combined so that document locator 121 produces a set of relevant documents each having rank values. In this situation, the rank values may be generated based on the relative position of the user's search terms in the returned documents. For example, documents may have their rank value based on the proximity of the search terms in the document (documents with the search terms close together are given higher rank values) or on the number of occurrences of the search term (e.g., a document that repeatedly uses a search term is given a higher rank value).

    [...]

    The initial rankings, for each document, x, in the returned set of relevant documents, is referred to herein as OldScores(x).

    [...]

    The LocalScore for each document x is based on the relative support for that document from other documents in the initial set (the computation of LocalScore is described in more detail below with reference to FIG. 3). Documents linked to by a large number of other documents in the initial set (i.e., documents with high relative support), will have a high LocalScore.

    [...]

    Accordingly, re-ranking component 124 removes documents . . . that have the same host as document x. (Act 302). More specifically, let IP3(x) denote the first three octets of the IP (Internet Protocol) address of document x (i.e., the IP subnet). If IP3(x)=IP3(y), document y is removed from B(y).

    [...]

    On occasion, multiple different hosts may be similar enough to one another to be considered the same host for purposes of Acts 301 and 302. For example, one host may be a "mirror" site for a different primary host and thus contain the same documents as the primary host. Additionally, a host site may be affiliated with another site, and thus contain the same or nearly the same documents. Similar or affiliated hosts may be determined through a manual search or by an automated web search that compares the contents at different hosts. Documents from such similar or affiliated hosts may be removed . . . .

    [emphases added, but grammar left in its atrocious original]
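
    The filtering and "relative support" ideas in those extracts can be sketched as below. This is a deliberate simplification: it only counts in-set supporters after dropping those on the same IP subnet, whereas the patent's own computation (its FIG. 3) is not quoted in this thread. All document names, IP addresses, and function names are hypothetical.

```python
def ip3(ip_address):
    """Return the first three octets of a dotted IPv4 address (the 'IP subnet')."""
    return ".".join(ip_address.split(".")[:3])


def simple_local_scores(initial_set, links_to, ip_of):
    """Count, for each document, the other in-set documents that link to it.

    Supporters sharing the target's first three IP octets are ignored,
    mirroring the same-host removal described in the extract.
    """
    scores = {}
    for x in initial_set:
        supporters = {
            y
            for y in initial_set
            if y != x
            and x in links_to.get(y, ())
            and ip3(ip_of[y]) != ip3(ip_of[x])
        }
        scores[x] = len(supporters)
    return scores


# Hypothetical initial result set for one query.
ip_of = {"A": "10.0.0.1", "B": "10.0.0.2", "C": "10.0.1.5", "D": "192.168.5.9"}
links_to = {"A": ["C"], "B": ["A"], "C": ["A"], "D": ["A", "C"]}
scores = simple_local_scores(["A", "B", "C", "D"], links_to, ip_of)
print(scores)
```

    B's link to A is discarded because both share the 10.0.0 subnet, so A ends up with two supporters (C and D) rather than three.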

    As Arte Johnson used to say, Veeery interesting . . . .

    To me, the most interesting extracts are, first, "linked to", which, at least by implication, seems to say that the belief in outbound links giving "authority" weight is unfounded (as a rule patent claims throw in the kitchen sink when describing methodology, to make their scope as wide as possible)--and second, "Alternatively . . . In this situation", which, to me, at least suggests that on-page factors (and those are in themselves interesting examples of what on-page factors G may consider) may be considered an alternative to the PageRank method of assigning OldScores.

    It's also quite noteworthy, I think, that the discussion of the removal from the link pool of same-site and similar-site documents is provided for in this "afterburner"--which I'd say strongly implies that they are not removed or devalued in the initial scoring process ("OldScore"), many theories otherwise notwithstanding.

    There are some other details of interest, and it's worth the effort of slogging through the legalese casting and horrid grammar and style of the thing to get to the salient information. Food for thought, that's for sure . . . .

    As is the burning question: Is this thing deployed yet?
     
    Owlcroft, Aug 21, 2004 IP
  16. nohaber

    nohaber Well-Known Member

    Messages:
    276
    Likes Received:
    18
    Best Answers:
    0
    Trophy Points:
    138
    #36
    compar, it's my fault that I use localscore and localrank interchangeably as synonyms. After all, LocalScore is the right term used by Google.
    Owlcroft has provided some of the pieces that imply PageRank. Note that PageRank is used in the other patents and that Google writes about the original google paper as "the advanced ranking method".
     
    nohaber, Aug 21, 2004 IP
  17. Dominic

    Dominic Well-Known Member

    Messages:
    1,725
    Likes Received:
    121
    Best Answers:
    0
    Trophy Points:
    185
    #37
    I'm buying that it is in use because it's so simple and makes common sense.

    If two sites want to rank #1 for 'Britney Spears'

    The first site has the top 100 search results for 'Britney Spears' all linking to it with the anchor text 'Britney Spears'

    The second site has the top 100 search results for 'Nelson Mandela' all linking to it with the anchor text 'Britney Spears'

    Then my money will be on the first site with the benefit of 'LocalScore' even if the sites about Nelson Mandela were all 50% more powerful PR pages.

    It's a very good way to cut through what is now link-clutter and produce the best serps.

    In terms of relevance - what the doc clarifies is that relevant means found in the same set of serps.

    Hence the power of articles like the ones in compar's infopool and online newspaper articles - they have a better chance of appearing somewhere in the actual search results for the targeted phrase. And the power of owlcroft's new project.
     
    Dominic, Aug 21, 2004 IP
  18. compar

    compar Peon

    Messages:
    2,705
    Likes Received:
    169
    Best Answers:
    0
    Trophy Points:
    0
    #38
    Owlcroft and Nohaber,

    I completely fail to see any veiled reference to PageRank in the section quoted from the patent. The section refers to a "rank" but the description of the components of that "rank" doesn't sound anything like the components of PageRank.

    There is the first section you put in bold:
    That section is talking about relevance assessment based on the proximity and the number of keywords in the web page. PageRank has nothing to do with this.

    I think you guys are seeing things that just are not there. The use of the word "rank" does not mean, or imply, PageRank.

    LocalScore, and I agree everyone calls it LocalRank, is a measure of relevance. You can't measure relevance from PageRank. It's just a damn number. It has absolutely no relationship to the content of the pages involved, nor to the anchor text of the links involved.

    PageRank is thematically and semantically independent. You get PageRank from a link that says "Click here". You get PageRank from a link on an empty page, assuming the page had any PageRank to pass on.

    PageRank does not measure relevance, and cannot be used in any meaningful relevance calculation. PageRank is not a component of LocalRank!
     
    compar, Aug 22, 2004 IP
  19. nadlay

    nadlay Guest

    Messages:
    306
    Likes Received:
    4
    Best Answers:
    0
    Trophy Points:
    0
    #39
    Just because Google has patents does not mean that their search algorithm is necessarily based on those patents.

    Their algorithm has changed over time, BUT they haven't filed a new patent application every time the algorithm changed.

    The only thing you can infer from the patents is that the algorithm PROBABLY incorporates certain elements of the various patents in varying degrees.
     
    nadlay, Aug 22, 2004 IP
  20. Owlcroft

    Owlcroft Peon

    Messages:
    645
    Likes Received:
    34
    Best Answers:
    0
    Trophy Points:
    0
    #40
    I think I quoted too much material; it was all interesting (to me, anyway), but it may have obscured some things.

    The initial rank value corresponds to a calculated relevance of the document. There are a number of suitable ranking algorithms known in the art. One of which is described in the article by Brin and Page, as mentioned in the Background of the Invention section of this disclosure.
    Slightly restating that, and turning it into something closer to English:

    The OldScore corresponds to a calculated relevance of the document. There are a number of suitable ranking algorithms, one of which is PageRank.
    That translation explicitly equates "a ranking algorithm described in the article by Brin and Page" with "PageRank", but I think that is clearly correct.

    LocalScore is indeed not determined by PageRank, save in that the initial list to be Locally Scored is, apparently, obtained by PageRank. (Though the LocalScore technique looks like a pretty close cousin of PageRank in concept.)

    You're preaching in the wrong church. I, at least, could scarcely agree more. But it's not up to me, or to you, or to any of us, it's up to The Mighty Minds at Google. And, to my eye, a close reading of that patent's text, as I outlined earlier, seems to support the view that their initial ranking determination relies very heavily on sheer PageRank. Indeed, the whole point of LocalScore seems to be to try to extract some value from what is, by implication, a fairly valueless original listing, in turn implying that the pre-LocalScore methodology (which may still be all they use--we can't know) is not too good.

    (Me, I'd use stronger language, but I'm just trying to interpret what they say in terms of their context.)

    Does LocalScore help sort out the sorry mess that using "links are votes" as a strategy necessarily makes? In my opinion, Yes and No. Just as with PageRank itself, it will help in the short run. The shortness of that run will be--again, as it was for PR--the response time of SEO people, a span that is now much shorter than it was when Google was a-birthing. In quite short order--if it is widely believed that LocalScore has indeed been deployed--SEO will just mutate to specializing on buying/selling/trading PR from pages in the same category. That will not, as The Lads wishfully think, improve the quality of SERPs, it will in fact only exacerbate the problems ("problems" from the point of view of the ordinary search user and the amateur "webmaster", not necessarily of commercial webmasters or SEO experts); if it's hard now for a site that cannot beg, borrow, or steal artificial outside PR to rank, it will be exponentially harder with LocalScore, as the higher rankers within a topic will almost all be those who have bought their way in.

    Till it is recognized that Google is simply a Rube Goldberg machine, with every "improvement" one more typically Goldbergian gadget, this nonsense will be perpetuated.
     
    Owlcroft, Aug 22, 2004 IP