Hi, I'm new to this site. I have a couple of questions about Googlebot. I am developing a website on a domain name I bought about a week ago. I didn't know it when I bought it, but the domain name was previously owned, and I discovered it had a PR4. I got scared because I had read that deleted domains are sometimes penalized. Anyway, Googlebot has been furiously spidering all the pages of the new site. Yesterday, the new site was in the Google index. Today, after much spidering of several pages, the site is no longer in the index. Something similar also happened to one of my older sites after I moved it to a new host and redesigned it. One day, the new site title and cache were in the index. The next day, the old site title and cache were there! The day after that, the new site was back, and the new design has stayed in the index since. Any clues on why these things happen? Another question: why does Googlebot come to my site, spider one or two pages, then leave and come back two hours later to spider two more pages? Why doesn't it do it all at once?
I can't answer the first part of your post, but I'll try the second. My understanding is that Googlebot harvests links and puts them all in a big "to do" list. So while it's tackling URL #1, it might be adding the links it finds there as, say, #37890. It then goes on to #2 and adds the links it finds there, and so on. As you can see, #37890 is still a long way from being checked. It also means that if you have a lot of links on one page of another site, that page can trigger a number of your pages being checked. Then every so often it rechecks all the pages it knows about. This used to be the job of "deepbot", which would do a total site crawl; nowadays Freshbot/Googlebot does the lot. This means that internal and external linking not only increases your PR, it also increases the frequency of indexing. The other reason it might not crawl everything in one go is that it would slow your site down for human visitors. Bots that do total crawls frequently find themselves banned.
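To make the "to do" list idea concrete, here is a minimal sketch in Python of a first-in, first-out crawl queue that is worked through a couple of pages per visit. It is only an illustration of the explanation above, not how Googlebot is actually implemented; the link graph `LINKS`, the function `crawl_in_batches`, and the `batch_size` parameter are all made up for the example.

```python
from collections import deque

# Hypothetical toy link graph: each page maps to the links found on it.
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/c"],
    "/b": ["/c", "/d"],
    "/c": [],
    "/d": ["/"],
}


def crawl_in_batches(start, links, batch_size):
    """Visit pages in FIFO order, at most batch_size pages per visit.

    Newly discovered links go to the back of the queue, so a page found
    late waits until everything queued before it has been fetched.
    """
    frontier = deque([start])  # the "to do" list
    seen = {start}             # URLs already queued, to avoid repeats
    batches = []
    while frontier:
        batch = []
        # One "visit": fetch a few pages, then leave and come back later.
        for _ in range(min(batch_size, len(frontier))):
            url = frontier.popleft()
            batch.append(url)
            for link in links.get(url, []):
                if link not in seen:
                    seen.add(link)
                    frontier.append(link)
        batches.append(batch)
    return batches


for visit, pages in enumerate(crawl_in_batches("/", LINKS, 2), start=1):
    print(f"visit {visit}: {pages}")
```

Each inner list is one visit, which matches the "spider one or two pages, leave, come back later" pattern described in the question.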
Thanks for that helpful explanation. I understand why it doesn't do all the links at once now. You mentioned internal links. Do absolute links within a site from one page to another really help page rank? My site is going to be hundreds of pages with lots of internal links. That should work in my favor, right?
Yes, PR and backlinks are all calculated using internal links as well as external links. I think the day-to-day changes in whether your new site shows up in the index simply reflect differences between data centers.
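For anyone curious why internal links would count, here is a rough sketch of the textbook PageRank iteration in Python. This is the simplified published formula, not whatever Google actually runs in production. The function `pagerank`, the damping value of 0.85, and the page names in `SITE` are assumptions for illustration. Note that the calculation never looks at which site a link comes from, so internal and external links are treated the same way.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of rank.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical site: three internal pages plus one external page linking in.
SITE = {
    "mysite/home": ["mysite/about", "mysite/products"],
    "mysite/about": ["mysite/home"],
    "mysite/products": ["mysite/home"],
    "other.example/page": ["mysite/home"],
}

for page, score in sorted(pagerank(SITE).items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy graph the home page ends up with the highest score because it is linked from every other page, internal and external alike.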
About a year ago I read that Google was killing PR and linkage factors for domains that had gone to 'deleted' or changed ownership per whois. I'd never really bothered checking the accuracy of this one way or the other, until recently someone acquired an expiring domain that had a dmoz/yahoo listing and a handy PR to boot, at which point I recalled that story. Anyway, let me know what happens. I'll be interested to see if you retain your PR. Oh, PS: if your reference to 'absolute' internal links (vs relative ones) was deliberate, I don't think it matters which you use.
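On the absolute vs relative point, a quick way to see why it plausibly doesn't matter is that any client has to resolve a relative link against the page's own URL before fetching it, and the result is the same absolute URL. A small sketch using Python's standard `urllib.parse.urljoin`; the helper `resolve_link` and the example.com URLs are made up for illustration, and this only demonstrates URL resolution, not how Google weighs links.

```python
from urllib.parse import urljoin


def resolve_link(page_url, href):
    """Return the absolute URL that href points to when found on page_url."""
    return urljoin(page_url, href)


# Hypothetical page and two ways of writing the same internal link.
page = "http://www.example.com/articles/index.html"
relative_href = "../about.html"
absolute_href = "http://www.example.com/about.html"

print(resolve_link(page, relative_href))
print(resolve_link(page, absolute_href))
```

Both calls give the same address, so a crawler following either form arrives at the same page.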
Well it will be delisted from DMOZ automatically (at least for expired domains, not necessarily domains that change owners). - Shawn
I've heard from various sources that Google now finds these "expired and repurchased" domains and will drop the backlinks and PR on them. However, logically, this still doesn't make sense. If people are still linking to the domain after one or two months because the linking sites haven't noticed the change, Google will pick up the backlinks again and award the site its original PR. hexed
Just got a newsletter with a link through to another forum: http://www.cre8asiteforums.com/viewtopic.php?t=8690 RustyBrick hangs here too doesn't he?
Hmm... so is there a difference between expired domain names and ones that have changed ownership? It looks like mine had gone unowned for at least a year. Over the past two days the PR has gone from 4 to 3, and I am sure it will go to 0. I don't care if the PR goes down. I didn't buy it knowing it had PR. I'm just worried that I will somehow be penalized or prevented from building my own PR. Why the heck would my new site on that domain be in the index and in Google's cache, and now not be in the index at all? John: My reference to absolute links was indeed deliberate. I was under the impression that it did matter?
Google has a huge network of computers. I heard someone claim today that it is the largest network in the world. They are not all running a single database or one set of results. In fact, I believe there are thought to be 56 different databases or data centers involved. McDar, who is a member of this forum, has a tool that allows you to see the individual results from all 56 data centers. You can find it here. One of the data centers has an IP that ends in 98. This is the data center that seems to be the earliest to reflect new updates as they work their way through the system. In any case, you can never be sure which data center you are hitting when you do a search. So the fact that you get different results may just be because you got results from two different data centers that were not in perfect sync.
Wow, awesome link! I found that the site is indeed there on several databases. Also, I saw that another new site I've started is there on all the servers. I hadn't bothered to check it before because it has no backlinks. How can my site be in the index if it has no backlinks? Is it because I have AdSense on it?
Hi Chiara, Google operates several different data centers and you may see different results from each at a given time. So, that may explain your ranking changing from day to day if your view of the page used a different data center.
backlinks are all calculated using internal links as well as external links. I think your day to day problem with your new sites index is simply reflecting the difference between data centers.
izhar_saifi, I'm amazed you dug this thread up and brought it back to life. The original poster probably isn't going to read it anymore XD