Hi, I am looking for a tutorial or an open-source script on how to make a spider. I want to learn how to pull things like content, images, and URLs from pages. Thanks in advance.
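This will help you.

PHP:
class spider {
    // This class grabs the content from the sites

    // cURL handle, created in get(). Declared here because dynamic
    // properties are deprecated as of PHP 8.2.
    private $curl;

    // Apply the shared cURL options to the current handle.
    function setup() {
        $cookieJar = 'cookies.txt';
        curl_setopt($this->curl, CURLOPT_COOKIEJAR, $cookieJar);    // write cookies here
        curl_setopt($this->curl, CURLOPT_COOKIEFILE, $cookieJar);   // read cookies from here
        curl_setopt($this->curl, CURLOPT_AUTOREFERER, true);        // set Referer on redirects
        curl_setopt($this->curl, CURLOPT_TIMEOUT, 30);              // whole request, in seconds
        curl_setopt($this->curl, CURLOPT_CONNECTTIMEOUT, 25);       // connect phase, in seconds
        curl_setopt($this->curl, CURLOPT_FOLLOWLOCATION, true);     // follow HTTP redirects
        curl_setopt($this->curl, CURLOPT_RETURNTRANSFER, true);     // return the body as a string
    }

    // Fetch $url and return the response body, or false on failure.
    function get($url) {
        $this->curl = curl_init($url);
        $this->setup();
        return $this->request();
    }

    function request() {
        return curl_exec($this->curl);
    }
}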
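The class above only downloads the page. To pull the URLs and images out of what it returns, something like the following might work as a starting point. It is a rough sketch, not part of the class above: the function name extractLinksAndImages is made up for illustration, and it assumes PHP's DOM extension is enabled.

```php
<?php
// Hypothetical helper (not part of the spider class above):
// collects the link and image URLs found in an HTML string.
function extractLinksAndImages(string $html): array
{
    // loadHTML() rejects an empty string, so bail out early.
    if (trim($html) === '') {
        return ['links' => [], 'images' => []];
    }

    $doc = new DOMDocument();

    // Real pages are rarely valid HTML; collect parser warnings
    // internally instead of printing them, then discard them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // href attribute of every <a> tag that has one
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $href = trim($a->getAttribute('href'));
        if ($href !== '') {
            $links[] = $href;
        }
    }

    // src attribute of every <img> tag that has one
    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $src = trim($img->getAttribute('src'));
        if ($src !== '') {
            $images[] = $src;
        }
    }

    // Drop duplicates and reindex from 0.
    return [
        'links'  => array_values(array_unique($links)),
        'images' => array_values(array_unique($images)),
    ];
}
```

You would pass it the string returned by get(), after checking that get() did not return false. The URLs come back exactly as written in the page, so relative ones (like /about.html) still need to be resolved against the page's address before the spider can follow them.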
I've used http://www.sphider.eu/ and found it pretty good. It is open source too, so you have the code to work with and can see how the author did it.