Didn't they become a fully-fledged domain registrar last year? I'm not sure what that means in terms of getting hold of Verisign/Nominet databases etc., but I can see the benefit to them if they can query the global DB of domain registrations.
Yes Tops, at the time some of us said that this was probably the only reason they became a domain registrar. As you say, it makes sense for them; I guess it also allows them to keep track of known spammers' websites. OMG, they wouldn't be THAT dirty, would they? lol
And even when you block them in robots.txt, I guess they will still drop by and have a look for internal use (document age, document changes, etc.). (OT: your sig link says Scrum V but the domain is scumv... Just mentioning it in case either is wrong...)
I registered a domain about 10 days ago and waited about 5 days to attach it to a site. I put dummy nameservers in for 4 days, then the correct ones for the last 1. I threw up a page with a no-robots meta tag, plus a robots.txt blocking everything. Within an hour I had Googlebot hitting the robots.txt file. I thought it was due to the Google toolbar. But now I have to wonder... I went ahead and unblocked the page, and it ranks #8 for a keyword I expect to be competitive before long. It's nothing more than a headline, logo, and subline. Gotta get that site up... I am getting a few error hits from a similar domain, and curiosity seekers.
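For anyone wanting to double-check what a block-all robots.txt actually tells a crawler, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URL and the is_allowed helper are made up for illustration, and this only shows how a well-behaved crawler would read the rules, not what Googlebot does internally. Keep in mind that a crawler has to request robots.txt itself to learn the rules, so a hit on that file alone is expected even when everything is blocked.

```python
from urllib.robotparser import RobotFileParser

# A "block everything" robots.txt, like the one described in the post.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# A robots.txt that only blocks one directory (hypothetical path).
BLOCK_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file contents as a list of lines; no network access needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not the site from the post.
    print(is_allowed(BLOCK_ALL, "Googlebot", "http://www.example.com/"))
    print(is_allowed(BLOCK_PRIVATE, "Googlebot", "http://www.example.com/"))
    print(is_allowed(BLOCK_PRIVATE, "Googlebot", "http://www.example.com/private/page.html"))
```

Feeding the rules in as text keeps the check offline, which is handy for testing a file before uploading it.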
I also have an example of a site that Google has indexed even though it has never had an external link pointing to it. This site has been in "construction" mode for a year or so (the customer has not worked on it to get it off the ground). It does have content as well as outbound links. Oddly enough, when I just checked for this site in Google, Google reported one link to the page! The link is from Google itself!!!!! The link Google reports looks like this: F2xxlETYqFsJ:www.thesite.com. This code, F2xxlETYqFsJ, is the Google checksum for the cached version of the page. However, when I attempt to look at that page, it is blank (???). Very odd indeed! Caryl. PS: I do not believe anyone with a Google toolbar has ever viewed the site.
I have a one-page site that has been up for about three months. It has NO links pointing to it. I HAVE been to the site with a browser that has the toolbar installed, AND it does show up using the site: command. I guess it could be from a whois also. Do we know if Google actually uses whois info at this time?
I remember this (although I'd forgotten who said it - thanks!). My recollection was that he specifically said Google could find "orphaned" pages but he wouldn't say how. To me the message was, if you don't want a page indexed, use a robots.txt file to specifically disallow it or use the "noindex" meta tag to do so.
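If you go the "noindex" meta tag route, it is worth confirming the tag is really in the page you are serving. Below is a rough sketch using Python's standard html.parser; the RobotsMetaParser class and has_noindex helper are hypothetical names, and it only looks for a plain meta tag named "robots", which is an assumption about how the page is marked up.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for part in content.split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def has_noindex(html: str) -> bool:
    """Return True if the page carries a robots meta tag containing noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(has_noindex(page))
```

One thing to remember: a crawler can only see the meta tag if it is allowed to fetch the page, so the tag and a robots.txt disallow for the same URL do not reinforce each other.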
So, not only does G put in a lot of work sandboxing sites, it works hard to find sites to sandbox. They must really like the sandbox idea.
Yes. ©2005 Google - Searching 8,058,044,651 web pages ©2005 Google - Sandboxing 32,058,044,651 web pages