Hello, what is Googlebot like? Is it a PHP script or a script written in another programming language? Is it one script or many scripts? I am not talking about the 100 (or so) algos that Google implements in their search engine. I am asking about the robot that visits my site and fetches my pages. What is it like?
I guess it's just a simple PHP script which is instructed to follow every available link out there. Well, that was the description I was given when I asked a similar question a few months ago.
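For anyone wondering what "follow every available link" means in practice, here is a minimal sketch in Python. It is not Googlebot's code, just an illustration of the idea: fetch a page, pull out its links, and queue any link not seen before. FAKE_WEB, LinkExtractor and crawl are made-up names, and the "web" is an in-memory dictionary so the example runs without a network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical stand-in for the web: URL -> HTML body.
FAKE_WEB = {
    "http://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.com/a": '<a href="/b">B</a>',
    "http://example.com/b": '<a href="/">home</a> <a href="/missing">gone</a>',
}


def crawl(start_url, fetch):
    """Breadth-first crawl: fetch a page, queue every link not seen before."""
    seen = {start_url}
    queue = deque([start_url])
    fetched = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page could not be fetched, skip it
            continue
        fetched.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return fetched


pages = crawl("http://example.com/", FAKE_WEB.get)
print(pages)
```

The seen set is what stops the crawler from looping forever when pages link back to each other. A real crawler would also need politeness delays, robots.txt handling and error handling, none of which is shown here.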
A PHP script? Don't make me laugh! If you bothered to look for the published paper about Google PageRank, it tells you all about the crawler. It was a Python script, and the paper reports a peak of over 100 pages per second using four crawlers. Pierce
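To show why a handful of parallel workers raises throughput, here is a rough sketch using a thread pool. It is an illustration under assumptions, not the crawler from the paper: fake_fetch is a hypothetical function that sleeps to imitate network latency, and the worker count of 4 is only an example value.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fake_fetch(url):
    """Hypothetical fetch: sleeps briefly to imitate network latency."""
    time.sleep(0.05)
    return url, f"<html>content of {url}</html>"


def fetch_all(urls, workers=4):
    """Fetch every URL using a fixed pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order even though fetches overlap
        return dict(pool.map(fake_fetch, urls))


urls = [f"http://example.com/page{i}" for i in range(20)]
results = fetch_all(urls, workers=4)
print(len(results))
```

Because each fetch spends most of its time waiting, the workers overlap their waits, so 20 pages take roughly a quarter of the time a single worker would need.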
I actually met the little beast.... http://blog.galway.ws/2007/05/13/google/i-met-a-miserable-little-searchbot-today.html
Actually, there are two bots, deepbot and freshbot: deepbot goes through all the links it finds, and freshbot searches for new content. That crawl rate was the case in 1998, but capacity is clearly much higher now. At the old rate it would take years and years to re-index all of Google's 10 billion+ page index.
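The "years and years" point is easy to check with back-of-the-envelope arithmetic. The sketch below uses only the two figures quoted in this thread (100 pages per second and a 10 billion page index); years_to_recrawl is a made-up helper name.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_recrawl(page_count, pages_per_second):
    """Time to fetch every page once at a constant rate, in years."""
    return page_count / pages_per_second / SECONDS_PER_YEAR


years = years_to_recrawl(10_000_000_000, 100)
print(round(years, 2))  # 3.17
```

So at 100 pages per second, one full pass over 10 billion pages would take a little over three years, which is why the 1998 rate cannot describe the current crawler.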
I find Googlebot to be like the pretty girl you dated who never really paid you much attention. Always talking on the phone or to someone else when you were out in public. Never replied to your letters when you served in the Peace Corps, but still went out with you when you got back. And like that pretty girl, I'm pretty sure that Googlebot has a future working in a grocery store checkout line, 60 pounds overweight and covered in too much makeup.
Yeah, that's from the start. There were 4 servers, I think? So that was 400 pages per second. I personally believe they have the capability to touch every domain on the internet every 20 minutes. Where is your source for the 2 types of bots (fresh/deep bots)? Pierce
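The numbers in this post can be sanity-checked the same way. A small sketch, with one loud assumption: the domain count of 120 million is a placeholder picked for round arithmetic, not a real statistic. required_fetch_rate is a made-up helper name.

```python
def combined_rate(servers, pages_per_second_each):
    """Total throughput if every server crawls at the same rate."""
    return servers * pages_per_second_each


def required_fetch_rate(domain_count, interval_minutes):
    """Fetches per second needed to touch every domain once per interval."""
    return domain_count / (interval_minutes * 60)


total = combined_rate(4, 100)
print(total)  # 400

# ASSUMPTION: 120 million domains is a hypothetical placeholder value.
needed = required_fetch_rate(120_000_000, 20)
print(needed)  # 100000.0
```

Under that assumed domain count, a 20-minute cycle would need about 100,000 fetches per second, far beyond 400, so the claim only holds if capacity has grown by orders of magnitude.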
I don't think there's a single Googlebot. There are probably hundreds of thousands running at any given time and feeding information into batch servers that process the pages into the Google database.
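The "many bots feeding batch servers" picture is a classic producer/consumer setup. Here is a minimal sketch of that pattern, assuming nothing about Google's real architecture: crawler_worker, batch_processor and run are hypothetical names, the pages are fake strings, and a plain dict stands in for the database.

```python
import queue
import threading


def crawler_worker(worker_id, urls, out_queue):
    """Hypothetical bot: 'fetches' its share of URLs and hands pages off."""
    for url in urls:
        out_queue.put((url, f"page body for {url} (worker {worker_id})"))


def batch_processor(in_queue, index, batch_size):
    """Drains the queue in batches and stores pages in a shared index."""
    batch = []
    while True:
        item = in_queue.get()
        if item is None:  # sentinel: no more pages are coming
            break
        batch.append(item)
        if len(batch) >= batch_size:
            index.update(batch)
            batch = []
    index.update(batch)  # flush whatever is left


def run(worker_count=8, urls_per_worker=5, batch_size=10):
    pages = queue.Queue()
    index = {}
    processor = threading.Thread(
        target=batch_processor, args=(pages, index, batch_size)
    )
    processor.start()
    workers = []
    for w in range(worker_count):
        urls = [f"http://example.com/{w}/{i}" for i in range(urls_per_worker)]
        t = threading.Thread(target=crawler_worker, args=(w, urls, pages))
        t.start()
        workers.append(t)
    for t in workers:
        t.join()
    pages.put(None)  # tell the processor that every worker has finished
    processor.join()
    return index


index = run()
print(len(index))  # 40
```

The queue decouples the two sides: bots never wait on the database, and the processor writes in batches instead of one page at a time.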