Hi, can anyone tell me how to write a script, or point me to an existing one, that acts as a bot and automatically copies links from Yahoo and Google into my database? I'm trying to build a search engine for just a few countries, such as Singapore and Malaysia. Does anyone have an idea how to do it or where to start? It's something like a web crawler.
Broad question, broad answer:
* Send the search query to Google with "site:my" appended (this restricts the results to Malaysian .my websites; "site:sg" does the same for Singapore).
* Grab the HTML page with the results from Google.
* Use regular expressions, or whatever method you prefer, to extract the results from the HTML page.
* Do the same for Yahoo.
* Join the results.
* Come up with an algorithm that determines relevance, so the results can be ordered nicely.
Every part of this can easily be done in PHP, but I doubt that scraping their engines with automated systems complies with Google's or Yahoo's TOS (at least without using their APIs).
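To make the extract, join and order steps a bit more concrete, here is a rough sketch in Python (the same steps map onto PHP). It does not contact any search engine; it only works on HTML you already have. The names `LinkCollector`, `extract_links`, `in_country` and `merge_results` are made up for this illustration, and the reciprocal-rank scoring is just one simple assumption about how "relevance" could be approximated, not a recommended ranking algorithm.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects absolute http(s) links from <a href="..."> tags, in page order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip relative links, javascript: links, and anchors without an href.
        if href and urlparse(href).scheme in ("http", "https"):
            self.links.append(href)


def extract_links(html):
    """Return the absolute links found in an HTML string, in order."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


def in_country(url, tld):
    """True if the URL's host ends in the given country TLD, e.g. 'my' or 'sg'."""
    host = urlparse(url).hostname or ""
    return host.endswith("." + tld)


def merge_results(*ranked_lists):
    """Join several ranked URL lists into one, best score first.

    Assumption: each list is ordered best-first, and a URL earns
    1/position from every list it appears in (reciprocal-rank scoring).
    """
    scores = {}
    for results in ranked_lists:
        seen = set()
        for position, url in enumerate(results, start=1):
            if url in seen:
                continue  # count a URL once per list
            seen.add(url)
            scores[url] = scores.get(url, 0.0) + 1.0 / position
    # sorted() is stable, so ties keep first-seen order.
    return sorted(scores, key=lambda url: -scores[url])
```

A URL that both engines return ends up ahead of one that only a single engine returns at the same position, which is about the simplest "relevance" rule you can start from.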
Basically, you want to build a scraper. Use the Snoopy class to open a session with whichever site you need, then write a regex that captures the results you're looking for. Not too hard, eh?
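For what the regex step plus the "into my database" part might look like, here is a small sketch. It is in Python rather than PHP/Snoopy and leaves the fetching out entirely, so it only shows capturing links from HTML you already hold and storing them. The pattern, the `links` table layout and the function names `capture_results` and `save_results` are all assumptions for illustration; real result pages will need a pattern tuned to their markup, and a regex is fragile against markup changes.

```python
import re
import sqlite3

# Assumed markup: <a ... href="http://...">title</a>. Only absolute
# http(s) links with a double-quoted href are captured.
RESULT_LINK = re.compile(
    r'<a\s[^>]*?href="(https?://[^"]+)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)
TAG = re.compile(r"<[^>]+>")


def capture_results(html):
    """Return (url, title) pairs for every matching link in the HTML."""
    rows = []
    for url, inner in RESULT_LINK.findall(html):
        title = TAG.sub("", inner).strip()  # drop nested tags such as <b>
        rows.append((url, title))
    return rows


def save_results(conn, rows):
    """Store (url, title) pairs; a URL that is already stored is skipped."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS links (url TEXT PRIMARY KEY, title TEXT)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO links (url, title) VALUES (?, ?)", rows
    )
    conn.commit()
```

With `conn = sqlite3.connect("links.db")` (the file name is hypothetical), calling `save_results(conn, capture_results(html))` repeatedly is safe, because the primary key on `url` keeps duplicates out.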