Hi Shaun, How do you determine pagecount? I know it sounds like a stupid question, but here's why I ask: if you're using the 'site:' cmd on G, the numbers fall considerably short of the mark, i.e. [site:digitalpoint.com] yields 91,600, while appending a query, [site:digitalpoint.com digitalpoint], yields 170,000 (almost double the number from the typical site: cmd use). I know that my site sees a significant improvement (a touch closer to reality) when using an arg. How can these differences be addressed? You may already have addressed this somehow - it's just that the differences are site related, not a standard reduction/increase in numbers across the board, and I obviously want to ensure that what I bring to the network is properly assessed. I'll bet I wrote all this for nothing, and the answer will be an all too obvious one liner.
Thx a389951l, but the thread doesn't address my question. It simply states that there are variations between the API results and G web. The same (above) can be applied to the API as well. It's primarily about the addition of the query arg. What the other thread *did* remind me of was the weighting cap, which seems to be 22,500. Just the same, I'd still like to know whether the above is considered/addressed. Cheers JL
The variations between the API and www.google.com are relatively similar (percentage-wise) from site to site. So yeah.. even though the API returns lower results than www.google.com, it returns lower results for everyone, and the weight is just relative to everyone else on the network.
I think I understand what you're saying (the percentage differences are essentially the same for all sites via the API and Google web). But what I'm really asking is this: if I use a keyword with the site: cmd (as in my original post), the number of pages found is far higher than without the keyword, regardless of whether this is done via the Google search page or the API. Given that we all have our own specific keywords that we know will return a higher count, I suppose I'm asking whether there is any plan to allow an appended keyword, set via the user account, for site: queries, *which I presume you're using to check the number of indexed pages* (you call it 'pages in URL' in the keyword tracker). The percentages change quite significantly across sites, so unlike the differential between web and API, it can't really be generically applied. Of course this may be entirely inconsequential and a waste of space to implement - but I see it as a way to get more accurate results from G (especially for those with low page counts, who would really notice it).
The search query you use has got nothing to do with page count. It's just like any other search but limited to that site. That's not a good page count at all.
Sure. I'll agree with that. So what am I missing.. How do you determine the number of pages currently in G's index for a specific site ?
You can't. G's count might or might not be accurate, no one knows. But to keep it fair this count can be used, since it seems like everybody is affected by its error by the same percentage. Unless Shawn writes his own spider for this, the Network has to stick with what is publicly available...
I'll agree that the results are never accurate, but I'm yet to see a *better* way to get closer to the true count. I'm also yet to see G overstate the numbers. So, unless I'm mistaken, my original approach is more effective than the standard site: approach. The objective here is to afford ppl the opportunity to get as close to the real numbers as possible. Not to disadvantage anyone. And if it's implemented at acct level then it's publicly available for all to use. There's absolutely no need to deploy a crawler, nor is it implied. It's just a matter of minor coding, and an extra field in some db (well, theoretically anyway - I don't presume to know the back end). Just because not everyone's going to use it doesn't mean it won't be leveraged by those who want to, nor does it give anyone an unfair advantage. As I said in my previous posts, Shaun may be querying pagecount another way, and this entire discussion may be moot.
But your approach doesn't work because I can think of plenty of sites which don't have a certain keyword plastered on each page. DP might have digitalpoint on each page but I'm convinced that 95% of websites / domains will NOT have one single, measurable keyword on all, or even most, pages. So it doesn't work that way IMO. I agree it might get closer to the real number but adding variables doesn't make it a transparent weighting system.
And remember... If everyone indeed is affected similarly then you don't need the real figure because the end result will be the same.
There are plenty of sites that don't appear in G too, but does that mean we shouldn't use it? I think I've already pointed out that it can improve things - it's up to end users to determine whether they care for it. *** Oh and one more thing. It's not necessarily dependent on keywords being present on each page. Anyway, Shaun will chime in sooner or later.
Yes. So you are essentially asking for a way for people to individually have a higher weight if they take the time to do so? Sorry, not going to happen. The pages in URL returned are pretty accurate (as far as percentages go), so the only way to make something like that work is if EVERYONE entered a "keyword" and then we would be back to the same percentages. It wouldn't really matter what the absolute number is because the weighting is determined as relative to everyone else, not an absolute number. An easy way to do it without the user needing to enter anything would be to make a query with part of the domain in it (for example digitalpoint in my case). But again, it's not going to matter at all, because the weight is relative to everyone else. If everyone's weight goes up by pretty much the same percentage, it's pointless.
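To make the "relative to everyone else" argument concrete, here is a minimal sketch. It assumes weight is simply each site's share of the network's total page count, which is a simplification and not necessarily how the Co-op computes it. Only the digitalpoint.com figure comes from this thread; the other sites, their counts, and the 1.8 factor are made up for illustration.

```python
def relative_weights(page_counts):
    """Return each site's share of the network total (shares sum to 1)."""
    total = sum(page_counts.values())
    return {site: count / total for site, count in page_counts.items()}

# Hypothetical counts; only the digitalpoint.com figure is from the thread.
counts = {
    "digitalpoint.com": 91_600,
    "site-a.example": 12_000,
    "site-b.example": 3_400,
}

# Assumption: every site's count rises by the same factor.
scaled = {site: count * 1.8 for site, count in counts.items()}

before = relative_weights(counts)
after = relative_weights(scaled)
for site in counts:
    print(f"{site}: {before[site]:.4f} -> {after[site]:.4f}")
```

Under that assumption the shares come out the same before and after, which is the point being made: a uniform increase changes the absolute numbers but not anyone's relative weight.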
Thx Shaun. I'm not convinced that the effects would remain consistent across all sites, (otherwise I wouldn't have bothered proposing it in the first place) - not that it matters. At the end of the day, I'm sure there are other ways to improve the coop (stats etc) that would prove far more beneficial for all, and make better use of the time you have for it. Thx for answering the post, it was getting a bit long winded
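The disagreement comes down to whether the keyword effect is uniform across sites. A rough sketch of the non-uniform case, again assuming weight is a plain share of total page count (a simplification): the two digitalpoint.com figures are from the first post, while the other sites and their counts are hypothetical.

```python
def relative_weights(page_counts):
    """Return each site's share of the network total (shares sum to 1)."""
    total = sum(page_counts.values())
    return {site: count / total for site, count in page_counts.items()}

# Plain site: counts. Only digitalpoint.com is from the thread.
plain = {
    "digitalpoint.com": 91_600,
    "site-a.example": 12_000,
    "site-b.example": 3_400,
}

# site: plus keyword. digitalpoint.com is from the thread; the others are
# hypothetical and deliberately gain little or nothing.
with_keyword = {
    "digitalpoint.com": 170_000,
    "site-a.example": 13_000,
    "site-b.example": 3_400,
}

plain_shares = relative_weights(plain)
keyword_shares = relative_weights(with_keyword)
for site in plain:
    print(f"{site}: {plain_shares[site]:.4f} -> {keyword_shares[site]:.4f}")
```

With these made-up numbers the shares do shift, so the outcome depends entirely on whether real sites gain by similar percentages, which is exactly what the two sides here disagree about.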
So, we know PR plays a role in the BW of each site. Do you have any idea when the new PR figures will factor into the BW figures in the Co-op?
I would guess the next time the base weights are updated... but that's just my guess. Although I did see Shawn mention in another thread that the only time the PR is updated (using the API/keyword tool) is when backlinks are updated. If the API is being used here, maybe the PR change will be reflected in our base weights after the next backlink update?